Re: Does Base64 encoding before encryption make it easier to break?

From: John E. Hadstate (jh113355_at_hotmail.com)
Date: 12/29/03


Date: Mon, 29 Dec 2003 07:12:11 -0500


"Mok-Kong Shen" <mok-kong.shen@t-online.de> wrote in message
news:3FEF6DE4.8C7BA625@t-online.de...
>
>
> "Goh, Yong Kwang" wrote:
> >
> [snip]
> > But to me it may reduce the security because base 64 encoding reduces
> > the number of symbols (characters) used to represent the plaintext. In
> > original binary mode, there would be 256 combinations for each byte.
> > Whereas in base 64 encoding, there would be only 65 combinations in
> > use for each byte, thus my rationale is that it may make it easier for
> > the attacker to do some statistical cryptanalysis.
>
> But this (reversible) conversion doesn't involve any
> encryption key. So the two forms are simply equivalent
> 'representations' of the same thing and thus shouldn't
> inherently affect the statistical properties in them.

This may require a little more thought. First, a (reversible) keyed AES
encipherment of a block of 16 plaintext bytes is a "simply equivalent
representation of the same thing" and yet the statistical properties of the
ciphertext will differ wildly from those of the plaintext.

Second, the analysis of the statistical properties of Base-64 encoding must
show *zero* probability that one will ever see byte values outside of the
64-character alphabet (A-Z, a-z, 0-9, '+', and '/') and the '=' padding
character, regardless of the distribution of bit patterns in the source
material--another example of a change in statistical properties by an
unkeyed reversible conversion.
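
As a rough illustration of this second point, here is a minimal Python
sketch. It assumes the standard RFC 4648 alphabet as implemented by Python's
base64 module; the variable names are mine and purely illustrative. The input
contains every one of the 256 byte values, yet the encoded output never
leaves the 65-symbol set.

```python
import base64

# Standard Base-64 alphabet (RFC 4648) plus the '=' padding character.
ALPHABET = set(b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/")
ALLOWED = ALPHABET | {ord("=")}

# Illustrative input: every possible byte value, repeated four times.
raw = bytes(range(256)) * 4
encoded = base64.b64encode(raw)

raw_values = set(raw)          # distinct byte values in the source
encoded_values = set(encoded)  # distinct byte values after encoding

print("distinct values in source: ", len(raw_values))
print("distinct values in encoded:", len(encoded_values))
print("any value outside the set? ", not encoded_values <= ALLOWED)
```

Whatever the source distribution, the last line should report False: the
other 191 byte values have probability zero in the encoded stream.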

Third, Base-64 encoding can be considered to be a polygraphic,
polyalphabetic substitution cipher with a fixed, known key and a relatively
short period: each group of three input bytes maps to four output
characters, after which the pattern repeats.
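
To make the "substitution cipher" view concrete, here is a small Python
sketch (the helper name groups is my own, not part of any library). It shows
the same input byte 'A' turning into different output symbols depending on
its position within a three-byte group, and the whole pattern repeating every
four output characters.

```python
import base64

def groups(data: bytes) -> list:
    """Encode data and split the result into 4-character groups."""
    text = base64.b64encode(data).decode("ascii")
    return [text[i:i + 4] for i in range(0, len(text), 4)]

# Nine identical bytes: each 'A' encodes differently according to its
# position in the group, and the group itself repeats.
same_byte = groups(b"A" * 9)
print(same_byte)   # ['QUFB', 'QUFB', 'QUFB']

# Changing one byte in the middle group alters only that group's output.
changed = groups(b"AAA" + b"BAA" + b"AAA")
print(changed)     # ['QUFB', 'QkFB', 'QUFB']
```

Because the table and the grouping are fixed and public, the "key" of this
cipher is known to everyone, which is the point being made above.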

Base-64 encoding adds no entropy to the plaintext, but it does increase the
amount of plaintext available (four output characters for every three input
bytes) and provides a signature for cryptanalysis in the form of a
characteristic probability distribution. It is a "data armoring" technique
historically used to ensure that natively 8-bit character data can be routed
through legacy communication systems that were limited to processing 7-bit
characters.
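
The expansion and the 7-bit property are easy to check. The following Python
sketch is only an illustration; the function name encoded_length is
hypothetical, and the length formula assumes padded output with no line
breaks, which is what base64.b64encode produces.

```python
import base64

def encoded_length(n: int) -> int:
    """Length of padded Base-64 output for n input bytes (no line breaks)."""
    return 4 * ((n + 2) // 3)

# Illustrative payload: all 256 byte values, half of them above 0x7F.
payload = bytes(range(256))
armored = base64.b64encode(payload)

print(len(payload), "->", len(armored))            # 256 -> 344
print("fits in 7 bits:", max(armored) < 128)       # True
print("reversible:", base64.b64decode(armored) == payload)
```

The output is about one third longer than the input, every output byte fits
in 7 bits, and decoding recovers the original exactly.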


