Re: Breaking Compression Codes

From: Vedat Hallac (mvhallac_at_crosswinds.net)
Date: 05/29/03


Date: Thu, 29 May 2003 20:58:08 +1000

On Wed, 28 May 2003 03:31:01 +0000, bil wrote:

> I am looking for information on decompressing data without having
> complete knowedge of how the data was originally compressed.
>
> I remember seeing a paper by a Mohtashemi awhile back dealing with
> decrypting Huffman coded files without having the tree. Anyone know of
> other work along these Mohtashemilines?

The paper you are after is:
D. W. Gillman, M. Mohtashemi, and R. L. Rivest,
"On breaking a huffman code,"
IEEE Transactions on Information Theory, vol. 42, pp. 972 976, May 1996

Citeseer also found a snippet that sounds very interesting:
"Motivated by the same problem, Fraenkel and Klein [8] have shown that the
problem of finding the encoding rule given both a sample of the source
stream (or original file) and the corresponding sample of the encoded file
is NP complete."

You might as well give up. :-) Just joking. Their setup and your case
might not be the same.

If you want to research further, you may want to start from

http://citeseer.nj.nec.com/fraenkel94complexity.html

The paper is "Complexity Aspects of Guessing Prefix Codes (1994)", by
Aviezri S. Fraenkel, Shmuel T. Klein. Maybe you'll hit something
interesting in the paper, or one of the citations.