Re: new /dev/random
From: Ernst Lippe (ernstl-at-planet-dot-nl_at_ignore.this)
Date: 10/13/04
- Next message: Jean-Luc Cooke: "Re: new /dev/random"
- Previous message: Niclas Bäckman: "PBE with PKCD#5"
- In reply to: David Wagner: "Re: new /dev/random"
- Next in thread: David Wagner: "Re: new /dev/random"
- Reply: David Wagner: "Re: new /dev/random"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]
Date: Wed, 13 Oct 2004 13:52:34 +0200
On Tue, 12 Oct 2004 05:15:31 +0000, David Wagner wrote:
> David Wagner wrote:
>>D is (h1,h2,epsilon)-secure if,
>> Given k is drawn from any distribution with min-entropy h1, and x is
>> drawn independently from any distribution with min-entropy h2, then
>> D(k,x) is indistinguishable from uniform except with advantage at most
>> epsilon.
>
> Let's try the simplest thing I can think of. k and x are n bits long, and
> we want to generate just one uniformly distributed bit. The simplest
> construction I can imagine is D(k,x) = k.x (the dot-product).
>
> It turns out that one can't guarantee anything if h1 + h2 <= n. For
> instance, imagine if the distributions are chosen such that the top n/2
> bits of k are always zero, and the bottom n/2 bits of x are always zero;
> then D(k,x) is always zero. No joy.
>
> But what about the case where h1 and h2 are large? Say, h1 = h2 = 3n/4?
> Can one show that D(k,x) is fairly close to uniform, as long as n is large
> enough? It seems quite plausible that this might work. Does anyone know
> how to work out the details of the analysis?
This is certainly not a full analysis, but I believe that there are serious
problems with this approach when an adversary can select the input
distributions. I hope the following analysis is correct (but I
am pretty certain that it does contain some typo's).
Let n1 = bitlength(k) and n2 = bitlength(x) and m = bitlength(D(k,x)).
There are 2^(n1+n2) different inputs to D with max 2^m different possible
outputs. By the pidgeon hole principle there must be at least one output value
C that is a collision for at least 2^(n1+n2-m) possible input combinations.
Now we can select a set S with all <k,x> combinations such that D(k,x) =
C. Let Sk be the possible k values that occur in S and let Sx be the possible
x values that occur in S.
Now we choose a set Tk with 2^h1 possible k values and a set Ts with 2^h2
possible x values. The distributions for k and x will draw a uniformly
distributed sample from Tk and Tx, respectively. Denote the probability
that the output from D is equal to C by
Q = P(D(k,x) = C | k in Tk and x in Tx).
We have to distinguish four different cases:
CASE 1: 2^h1 >= |Sk| and 2^h2 >= |Sx|
We select some Tk such that it is an arbitrary superset of Sk and Tx such that
it is an arbitrary superset of Sx.
Q = |S| / (|Tk| * |Tx|) >= 2^(n1+n2-m-h1-h2)
When h1+h2 < n1+h2 we can find a distinguisher with
advantage 2^(n1+n2-h1-h2-m) - 2^-m.
CASE 2: 2^h1 < |Sk| and 2^h2 >= |Sx|
We select Tk as an arbitrary subset of Sk and Tx as an arbitrary superset of
Sx. For all k1 in Sk there exists at least one x1 in Sx such that D(k1,x1)=C,
therefore for all k2 in Tk P(D(k2,x2) = C | x2 in Tx) >= 2^-h2. So Q >= 2^-h2.
When h2 < m we can find a distinguisher with advantage 2^-h2 - 2^-m.
CASE 3: 2^h1 >= |Sk| and 2^h2 < |Sx|
We select Tk as an arbitrary superset of Sk and Tx as an arbitrary subset of
Sx. For all x1 in Sx there exists at least one k1 in Sk such that D(k1,x1)=C,
therefore for all x2 in Tx P(D(k2,x2) = C | k2 in Tk) >= 2^-h1. So Q >= 2^-h1.
When h1 < m we can find a distinguisher with advantage 2^-h1 - 2^-m.
CASE 4: 2^h1 < |Sk| and 2^h2 < |Sx|
In this case there are two subcases that depend on the magnitude
of h1 and h2.
When h1 < h2, first select Tk as an arbitrary subset of Sk,
then select Tx as a subset of Sx subject to the constraint
that for all k1 in Tk there exists at least one x1 in Sx
such that D(k1,x1) = C. Now for all k2 in Tk: P(D(k2,x2) = C | x2 in Tx)
>= 2^-h2, therefore Q >= 2^-h2.
When h1 > h2, we follow the mirror construction which leads to Q >= 2^-h1.
Combining both subcases we get Q >= 2^-max(h1,h2). So when max(h1,h2) < m
we can find a distinguisher with advantage 2^-max(h1,h2) - 2^-m.
I would expect that this scheme should work optimally when h1=n1. But then the
results of cases 1 and 3 are rather devastating. Essentially, they mean that
we cannot extract entropy from x, because in all realistic cases (e.g. h2 <
n2) either there is a distinguisher or we cannot extract more entropy than we
have supplied in k.
If I understand Paul correctly, he wants to find an information
theoretical definition that should work with all possible input distributions.
I don't expect that this is possible. The alternative that I
prefer is to define some measure on the possible input distributions, because
for any compression function there are only "few" pathological input
distributions. The other solution is the approach advocated by David
i.e. to use computational security, because an attacker cannot effectively
find collisions in D.
Ernst Lippe
- Next message: Jean-Luc Cooke: "Re: new /dev/random"
- Previous message: Niclas Bäckman: "PBE with PKCD#5"
- In reply to: David Wagner: "Re: new /dev/random"
- Next in thread: David Wagner: "Re: new /dev/random"
- Reply: David Wagner: "Re: new /dev/random"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]
Relevant Pages
|