Re: getting around Ken Thompson's compiler Trojan

From: Alun Jones (alun@texis.com)
Date: 01/23/03


From: alun@texis.com (Alun Jones)
Date: Wed, 22 Jan 2003 23:13:28 GMT

In article <d06acab6.0301221311.396f70b1@posting.google.com>,
christopherlmarshall@yahoo.com (Chris Marshall) wrote:
>First, write a program to obfuscate the source code of any other program
>by randomizing the variable and function names, as well as the names
>of the source code files.
>
>Use the corrupted gcc to build the obfuscated gcc. The "am I compiling
>the compiler?" code in the corrupted gcc won't detect that it is
>compiling the compiler and will produce an uncorrupted binary of the
>compiler.

It is all going to depend on how the "am I compiling the compiler" test works.
The obfuscated source still gets tokenised and parsed like any other, and the
resulting object file likely contains exactly the same object code as the
un-obfuscated version would have produced, plus a symbol table for external
linking. If the test checks this object code, your obfuscations do nothing of
any use, because the test looks for the object code, not the source.
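
A test does not even need to reach the object code to be immune to renaming.
As a rough sketch only (the function shape_hash and its normalisation rule are
hypothetical, invented for this illustration, and say nothing about how a real
Trojan would be written), a test could fingerprint the source with every
identifier collapsed to the same placeholder:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical fingerprint: an FNV-1a-style hash over the source text,
   with whitespace dropped and every identifier or keyword collapsed to
   the single character 'I'. Names therefore cannot affect the result. */
unsigned long shape_hash(const char *src)
{
    unsigned long h = 2166136261UL;
    size_t i = 0;

    while (src[i] != '\0') {
        unsigned char c = (unsigned char)src[i];

        if (isspace(c)) {           /* layout is ignored */
            i++;
            continue;
        }
        if (isalpha(c) || c == '_') {
            /* skip the whole name, hash a placeholder instead */
            while (isalnum((unsigned char)src[i]) || src[i] == '_')
                i++;
            c = 'I';
        } else {
            i++;                    /* operators, digits, punctuation */
        }
        h = (h ^ c) * 16777619UL;
    }
    return h;
}
```

With this, shape_hash("y = y * 6;") and shape_hash("qz9 = qz9*6 ;") give the
same value, while changing the operator gives a different one. Renaming alone
gets you nowhere against a test of this kind.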

You'd have to have a fairly good idea as to what part of the object code was
being compared, and find some way to alter it significantly - such that the
optimiser, for instance, would not optimise it back to the original object
code. So, you'd probably have to have the code in question do the same
operation, but in a completely new way that the optimiser does not see as
equivalent.

For instance, let's say the original source code says:

y=y*6;

You could replace that with:

y=(y<<1) + (y<<2);

However, there are some environments where the latter is actually a faster
operation than the former, and the optimiser will already have made the same
replacement! So, you've got a battle on your hands to not only guess which
part to alter (or find some optimiser-defeating way of altering every line of
code), but also find a way to rewrite it that the optimiser will be incapable
of detecting.
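
To make the example concrete, here are the two forms as complete functions.
This is only a sketch: the names mul6_direct and mul6_shift are made up here,
and unsigned is used so that the shifts and any overflow are well defined.

```c
/* y*6 written the obvious way. */
unsigned mul6_direct(unsigned y)
{
    return y * 6u;
}

/* The same value built from shifts: 2y + 4y. Unsigned arithmetic wraps
   modulo 2^N, so this matches mul6_direct for every input. */
unsigned mul6_shift(unsigned y)
{
    return (y << 1) + (y << 2);
}
```

Because the two are equal for every input, an optimiser is free to emit the
same instructions for both. Whether a particular compiler does so depends on
the compiler and the target; compiling with something like gcc -O2 -S and
comparing the assembly output is one way to find out.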

Hand-compiling the compiler might be quicker.

Alun.
~~~~

[Please don't email posters, if a Usenet response is appropriate.]

-- 
Texas Imperial Software   | Try WFTPD, the Windows FTP Server. Find us at
1602 Harvest Moon Place   | http://www.wftpd.com or email alun@texis.com
Cedar Park TX 78613-1419  | VISA/MC accepted.  NT-based sites, be sure to
Fax/Voice +1(512)258-9858 | read details of WFTPD Pro for XP/2000/NT.

