[adacrypt] For Paulo (and anyone else bored enough)

(See the end of this message for a note about killfiles.)

You needn't wait for adacrypt to respond to your challenge. Find enclosed a Dropbox link pointing to two complete plaintext/ciphertext pairs, along with a third ciphertext, generated using adacrypt's self-published algorithm. These are based on my best understanding of the intended use of adacrypt's obfuscation system.

The primary source file used has the following digests:
md5 real_time_encryption_program_mark_2.adb
sha1 real_time_encryption_program_mark_2.adb
sha256 real_time_encryption_program_mark_2.adb
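
To compare your own copy of the file against digests like these, something along the following lines should do. This is a minimal sketch that assumes the GNU coreutils tools md5sum, sha1sum and sha256sum are installed; print_digests is a name made up for illustration and is not part of adacrypt's distribution.

```shell
# Hypothetical helper: print the md5, sha1 and sha256 digests of one file,
# one per line, as "<algorithm> <hash> <filename>".
# Assumes GNU coreutils (md5sum, sha1sum, sha256sum) are on the PATH.
print_digests() {
    for tool in md5sum sha1sum sha256sum; do
        # Reading from stdin makes each tool print "<hash>  -";
        # keep only the hash field.
        hash=$("$tool" < "$1" | cut -d ' ' -f 1)
        # Strip the "sum" suffix to get the algorithm name.
        printf '%s %s %s\n' "${tool%sum}" "$hash" "$1"
    done
}
```

Usage: `print_digests real_time_encryption_program_mark_2.adb`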

Other source files were drawn from the same ZIP file, available on request (unless adacrypt objects, but since he published it, I can't imagine he will).

The two "known" plaintexts (body-1.in, body-2.in) were produced using Samuel L. Ipsum and lipsum.com respectively. The third plaintext was produced by means I will not yet disclose, other than to confirm that it contains written English. All inputs are over 4 kB in length, before counting the embedded metadata.

The obfuscated outputs were created using this recipe:

$ for file in *.ads* *.ali*; do
> # gnat appears to be picky about caps, and to want all-lowercase
> # inputs. De-smash adacrypt's randomly-capitalized names.
> target=$(echo "$file" | tr A-Z a-z);
> mv "$file" "$target";
> done
$ rm *.ali *.o *.exe # eliminate untrusted binaries and rebuild
$ gnatmake real_time_encryption_program_mark_2.adb
…output elided…
$ ./real_time_encryption_program_mark_2 < body-1.in
$ ./real_time_encryption_program_mark_2 < body-2.in
$ ./real_time_encryption_program_mark_2 < body-3.in

Note that the plaintext files include extra cruft to satisfy adacrypt's in-band metadata protocol: each body-N.in file has the following structure:

line 1: output filename (body-N.out)
line 2…n: input text (7-bit clean ASCII text)
line n+1: single ~ character terminating input text
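
The layout above can be produced mechanically. The following is a minimal sketch, not the procedure actually used for the published files; make_input is a hypothetical name. It makes two assumptions not confirmed by adacrypt's code: that a ~ anywhere in the text might end input early (so such text is rejected), and that the sentinel must sit on a line of its own.

```shell
# Hypothetical helper: wrap a plaintext file in the three-part layout
# (output filename, text, "~" sentinel) and write the result to stdout.
#   $1 = output filename the program should write to (e.g. body-1.out)
#   $2 = file holding the plaintext
make_input() {
    out_name=$1
    plaintext=$2

    # Count bytes outside 0-127; the text must be 7-bit clean ASCII.
    high_bytes=$(LC_ALL=C tr -d '\000-\177' < "$plaintext" | wc -c | tr -d ' ')
    if [ "$high_bytes" -ne 0 ]; then
        echo "make_input: $plaintext is not 7-bit clean" >&2
        return 1
    fi

    # Assumption: a "~" inside the text could be taken for the sentinel.
    if grep -q '~' "$plaintext"; then
        echo "make_input: $plaintext contains a ~ character" >&2
        return 1
    fi

    printf '%s\n' "$out_name"      # line 1: output filename
    cat "$plaintext"               # lines 2..n: the text itself
    # If the text lacks a final newline, add one so "~" gets its own line.
    if [ -n "$(tail -c 1 "$plaintext")" ]; then
        printf '\n'
    fi
    printf '~\n'                   # line n+1: sentinel
}
```

Usage (lorem.txt is a placeholder name): `make_input body-1.out lorem.txt > body-1.in`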

I believe that the trailing ~ is included in the output despite being intended as a sentinel for the end of input. This may be a bug in adacrypt's implementation of his own algorithm.

adacrypt's code expects further input, not included in the obfuscated output, after the end of the main text. These additional inputs are meant to control the display of diagnostic information about the input and output. As the responses to these prompts are not included in the input files, real_time_encryption_program_mark_2 exits with an exception the first time an unsatisfied prompt is displayed. This appears to be safe; the output file is closed cleanly before any attempts to prompt for diagnostics.

It would help if anyone inclined to work on this data set could show enough detail in their break of this obfuscation algorithm to demonstrate that they did not need to use source files other than real_time_encryption_program_mark_2.adb. This file does not contain enough "secret" information to compromise the algorithm, so including references to that file in your break should be acceptable. Leave other files out, as they contain large tables of coefficients that comprise adacrypt's "keyset" and which I believe could compromise the system even if it were otherwise sound and were sensibly used.

If you'd prefer, I believe I can generate a new suite of "keyset" files, covering

* alices_digital_signature.adb
* alices_encryption_numbers.adb
* change_of_origin_i_coefficients.adb
* change_of_origin_j_coefficients.adb
* change_of_origin_k_coefficients.adb
* normal_vector_i_coefficients.adb
* normal_vector_j_coefficients.adb
* normal_vector_k_coefficients.adb
* normal_vector_multipliers.adb

with which to re-create the output files as above. This would ensure that you don't have any access to the bulk of the secrets in (my instance of) adacrypt's system. adacrypt has not seen fit to define a "keyset" generation method, so I make no guarantees about the correctness of any files I generate, but I believe I've "understood" enough of his monologues to put something together.

The ZIP: http://db.tt/RUGK8dIg
sha256 adacribs.zip

This link will remain valid for at least a month; after that, I may remove it to make room for other things.

Some interesting empirical properties:

* Each output file is just over 32 times the size of the corresponding input.

* Compressing the output files using `zip -9` shrinks them by 65%, consistently. This reduces the output files to "only" 11 times the size of the (uncompressed) inputs. (The inputs can be compressed by around 70% using the same approach.)

* Decrypting the output files again (using the included general_decryption_program_mark_2.adb) indicates that newlines may not survive round-trips through this system.
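
The size figures above are internally consistent: 32 × (1 − 0.65) = 11.2, which matches the "11 times" figure. Anyone wanting to reproduce the measurements could use something like the sketch below. The function names are made up for illustration, and gzip -9 stands in for `zip -9`: both use DEFLATE, but the container overhead differs, so the numbers will not match exactly.

```shell
# Hypothetical helper: print how many times larger file $2 is than file $1,
# to two decimal places.
size_ratio() {
    in_bytes=$(wc -c < "$1" | tr -d ' ')
    out_bytes=$(wc -c < "$2" | tr -d ' ')
    if [ "$in_bytes" -eq 0 ]; then
        echo "size_ratio: $1 is empty" >&2
        return 1
    fi
    awk -v i="$in_bytes" -v o="$out_bytes" 'BEGIN { printf "%.2f\n", o / i }'
}

# Hypothetical helper: print the percentage by which gzip -9 shrinks file $1.
# Note: gzip is a substitute here, not the zip tool named in the post.
shrink_percent() {
    raw=$(wc -c < "$1" | tr -d ' ')
    if [ "$raw" -eq 0 ]; then
        echo "shrink_percent: $1 is empty" >&2
        return 1
    fi
    packed=$(gzip -9 -c "$1" | wc -c | tr -d ' ')
    awk -v r="$raw" -v p="$packed" 'BEGIN { printf "%.1f\n", 100 * (1 - p / r) }'
}
```

Usage: `size_ratio body-1.in body-1.out` and `shrink_percent body-1.out`. Bear in mind that body-1.in includes the filename line and the sentinel, so the ratio is against the whole input file, metadata included.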


P.S.: I'll include the [adacrypt] Subject: marker if I post further messages on this subject, and I suggest others do the same. Please killfile this marker if you wish not to see any further postings from me on the subject. I'll also remain at this address for the foreseeable future, if you'd rather not see my posts at all.

I believe, having read adacrypt's code, that Paulo's understanding of the obfuscation algorithm is basically correct, and am interested in seeing him demonstrate his break. adacrypt's algorithm is firmly in the category of "kid sister" ciphers, so I'm mainly in this for my own amusement, but I'd probably learn something from seeing the break in action.

Oh, and Paulo? Please, stop responding to adacrypt with your canned message. If you can work out that he's either a crank or a fraud from first principles, so can everyone else you'd want to talk to; your warning is unnecessary and slightly crankish itself.