Re: Foiling the known-plaintext attacks

Mok-Kong Shen wrote:
Bryan wrote:
  Mok-Kong Shen wrote:
The main point my mine is how could one deal with the "one equation but
two unknowns" issue. Mr. Maaatin, if you could answer that in place of
Mr. Olson, I am grateful to you just as well.

Both Maaartin and I already answered that "issue" in the Hill-cipher
thread. The attacker collects enough known plaintext that the
equations dominate the unknowns.

But istn't what you wrote entirely "vague"? What is exactly a measure
of "dominance"? How (i.e. through which varaibles involved) does there
come about a dominance?

What just I explained: the number of equations becomes larger than the
number of unknowns as the attacker sees more ciphertext with known
plaintext. The system is likely to be solvable as soon as the attacker
has as many equations as unknowns, but, as I just quoted Maaartin
explaining, the attacker typically wants lots of ciphertext.

It is necessary to start from plaintext and
ciphertext, via the equation that link them together, to trace the
exact path of the "influence" of plaintet and ciphertext on the
parameters of the PRNGs involved, isn't it?

So why have you not tried that already? What stops you other than your
unwillingness to put forth a serious effort?

I'll give you an example: One of the PRNG's I suggested you might look
at was a lagged Fibonacci generator. Knuth suggests one which he calls
an "additive generator" in AoCP section 3.2.2, with lags of 24 and 55.
It uses arithmetic mod 2**word_size, so it is convenient for
generating the 32-bit words of the Shen-Hill matrix. The PRNG is "not
too poor" in the sense that it passes statistical tests, so it meats
your stated requirements.

The generator has 55 words in its state, call them X_0 through X_54,
thus as the attacker we starts with 55 unknowns, and no equations. To
set up the first pair of matrices requires, if I'm reading the
proposal correctly, 16 outputs from the PRNG, called g_0 ... g_15. The
PRNG from Knuth produces these as:

g_0 = X_0 + X_31 mod 2**32
g_1 = X_1 + X_32 mod 2**32
g_15 = X_15 + X_46 mod 2**32

That gives us 16 new unknowns but also 16 equations, so
(number_of_equations - number_of_unknowns) is still 55.

The 16 matrix entries are used to encrypt 8 words of 32 bits, and
since the attacker has known plaintext, that gives him 8 more
equations with no more unknowns. So after the first block of 8 words,
(equations - unknowns) drops to 47.

To encrypt the next 8 words takes the next 16 outputs from the PRNG to
form Shen-Hill matrix entires, which gives us 16 new unknowns and 16
new equations. The ciphertext and known plaintext again give us 8 more
equations and no more unknowns, so after the second block of 8 words,
(equations - unknowns) is 39.

As we might expect, (equations - unknowns) drops by 8 for each 8-word
block of ciphertext with known plaintext. After 7 blocks, (equations -
unknowns) is -1, and the system is likely to be uniquely solvable.
After 100 blocks the equations outnumber the unknowns by 745, so the
system will definitely be over-constrained and solvable.

The equations describing the PRNG outputs are linear in the ring mod
2**32, and the operations in the the Shen-Hill matrix multiplication
are also mod 2**32, so all our equations are linear in same ring. The
attacker has nothing more to do than solve a system of simultaneous
linear equations.

Could you "formally"
demonstrate the "dominance" you meant?

Does there exist a cipher such that the number of unknowns is always
greater than the number of equations available to an attacker? Yes;
it's called the one-time pad. Your scheme is not such. Any cipher with
a bounded key size has a unicity distance, which is the amount of
ciphertext the attacker needs for the solution to be unique.

Mr. Shen, answers don't go away if you refuse to study them, and
they're not going to change just because you keep asking the same
questions over and over.

It would be entirely out of my fantasy to imagine my scheme could be
something to be compared to OTP.

Yet here you are deluding yourself into thinking the attacker can't
get enough equations to determine the unknowns.

But the word linearity "without" any
"qulifications" doesn't unconditionally equal to "solvability" or
"weakness in crypto", as I wrote previously.

Have you considered trying to learn about the subject before
attempting to teach about it? Mr. Shen, you've been proposing purely
linear cryptosystems for decades.

Every pupil learning
elementary math in schools knows that a linear equation having two
unknowns can't be solved or more exactly has no unique solution but
as a rule have a large number of feasible solutions.

That's why we explained to you, over and over, that the attacker uses
additional ciphertext / known plaintext. As still quoted above, "Any
cipher with a bounded key size has a unicity distance, which is the
amount of ciphertext the attacker needs for the solution to be
unique." Did you think you have some sort of exemption?

In the present
case with the equation C_i = g_0 + g_1*P_i  mod 2^32 g_0 and g_1
are the unknowns and, for a particular chosen value of g_0, there is
further an issue when P_i happens to be even. This state of affairs
is of course trivial and well-known. But the "triviality" doesn't
(unconditionally) translate to what one needs in the present context,
namely to obtain g_0 and g_1, or at the very least obtain frequency
distributions of g_0 and g_1 so as to somehow with these to attempt
to start predicting the parameters of the PRNGs. If there is not even
a dimmest indication of such paths to a crack, it doesn't "help" at all
simply to "claim" that the scheme is crakable. Certainly, anything
other than an OTP doesn't have perfect security and I would be among
the last persons to consider/imagine the scheme to be absolutely secure.
But the issue here is not a pure theoretical one but a practical one,
namely to look and find where there possibly may be "concrete" paths
(mathematical relations) that the analyst can "effectively" exploit:
As I said, I haven't yet seen any "concrete" indications at all in
this thread. (See also what I said concerning "dominance" above.)

Obviously, Mr. Shen, you did not see the concrete indications because
you did put forth a serious effort to understand the technical issues
people explained to you. You were offered many clues, both in this
thread and in the "Dynamic Hill cipher" thread. I suggested you start
by looking at a generator that is linear in the same field or ring as
the Shen-Hill matrix operations. We told you, repeatedly, that your
not-enough-equations argument fails as the attacker collects more
ciphertext with known plaintext.