# Re: VENONA Query; Kullback-Leibler Divergence

**From:** David Wagner (*daw_at_taverner.cs.berkeley.edu*)

**Date:** 08/18/05

**Next message:**chuckles: "Simple solution to the Diffie Hellman MITM attack"**Previous message:**Regis: "Re: md5 collisions and speeding tickets"**In reply to:**Jim Foyle: "VENONA Query; Kullback-Leibler Divergence"**Next in thread:**Jim Foyle: "Re: VENONA Query; Kullback-Leibler Divergence"**Reply:**Jim Foyle: "Re: VENONA Query; Kullback-Leibler Divergence"**Reply:**Douglas A. Gwyn: "Re: VENONA Query; Kullback-Leibler Divergence"**Messages sorted by:**[ date ] [ thread ] [ subject ] [ author ] [ attachment ]

Date: Thu, 18 Aug 2005 17:27:18 +0000 (UTC)

Jim Foyle wrote:

*>In NSA Monograph #4, on VENONA, Robert Benson says that in 1952 Condray
*

*>proposed attacking messages with unknown message beginnings using the work
*

*>of the Naval cryptanalyst and mathematician, Dr. Richard Leibler.
*

The KL divergence is a measure of how far apart two distributions are.

*>Can anybody tell me if this Divergence test, or some suitable modification
*

*>of it, could in fact be used to tell if two messages were likely
*

*>super-encrypted with the same one-time pad?
*

If there is some application of the KL Divergence to this problem,

perhaps Condray defined two distributions, argued that if they are

"close" then the two messages used the same one-time pad, and then

used KL divergence to measure "closeness". The question would be how

those distributions are defined, as a function of the observed ciphertexts

and anything else that was known to the cryptanalysts. I'd have to know

more about the system to speculate on that topic. Is a copy of the

monograph available anywhere?

Perhaps one approach might be as follows. Let D denote the distribution

obtained by xor-ing two random messages (if super-encryption is used, I use

the term "message" to denote the input to the one-time pad, i.e., the

result of the pre-encipherment). Given two ciphertexts c,c', one might

test whether c xor c' looks like it could have come from distribution D.

Notice that if c and c' were enciphered using the same one-time pad, then

c xor c' should be distributed according to D; if they were enciphered with

different one-time pads, then c xor c' should be uniform.

One might apply the KL divergence as follows. Let F denote the empirical

distribution of c xor c'. For instance, if each code group of each

message is determined to be processed independently of all others (by the

pre-encipherment), then we might define F by

F(g) = (# of times that code group g appears in c xor c') /

(# of code groups in c xor c').

Let D_0 denote the a priori distribution on a code group of a message,

and let D denote the distribution obtained by drawing two independent

samples from D_0 and xor-ing them. Then one might use KL(D,F) as a

measure of similarity between D and F; the smaller that KL(D,F) is, the

more likely that c and c' used the same one-time pad. Or, one might

compare KL(D,F) to KL(Uniform,F), and presume that c,c' used the same

one-time pad if the former is smaller, otherwise presume that c,c' used

different pads.

These are mere speculations. I have no idea whether this kind of

approach would work, or would be optimal, let alone whether it is what

Condray had in mind.

There has been some recent discussion of KL divergence and other such

measures here on this group by Doug Gwyn.

**Next message:**chuckles: "Simple solution to the Diffie Hellman MITM attack"**Previous message:**Regis: "Re: md5 collisions and speeding tickets"**In reply to:**Jim Foyle: "VENONA Query; Kullback-Leibler Divergence"**Next in thread:**Jim Foyle: "Re: VENONA Query; Kullback-Leibler Divergence"**Reply:**Jim Foyle: "Re: VENONA Query; Kullback-Leibler Divergence"**Reply:**Douglas A. Gwyn: "Re: VENONA Query; Kullback-Leibler Divergence"**Messages sorted by:**[ date ] [ thread ] [ subject ] [ author ] [ attachment ]