Computational secure entropy extraction

From: Ernst Lippe (ernstl-at-planet-dot-nl_at_ignore.this)
Date: 10/27/04

Date: Wed, 27 Oct 2004 13:56:02 +0200

Somewhere hidden in the /dev/random thread was a nice discussion about
distilling entropy from an unknown distribution. One of the main topics was if
there existed some universal entropy distiller that could be used on all input
distributions. With a information theoretical definition of security such an
animal does not exist. But the situation becomes a lot more interesting when
we have two different and independently distributed inputs and use a
computational security approach.

One of the questions that David Wagner asked was if it was possible to show
that there existed a universal entropy distiller with two independent inputs
that was computationally secure. I believe that under certain conditions the
answer is "yes".

An m-bit distiller D(k,x) takes strings k,x of specified fixed length n1 and
n2 respectively as inputs and produces an m-bit string.

D is (h1,h2,t,epsilon)-secure if given that k is drawn from any distribution
with min-entropy h1, and x is drawn independently from any distribution with
min-entropy h2, no attack running in time at most t can distinguish D(k,x)
from uniform with advantage better than epsilon.

I will only consider the case where k contains maximum entropy, i.e. where
h1 = n1. In a certain sense this is the optimal case, if it is not possible to
securely extract entropy under these conditions "secure entropy distillation"
is certainly not possible.

The adversary can select some input distribution for x subject to the
constraint that its min-entropy is equal to h2. The adversary need only
consider nearly uniform distributions (that is a distribution where at most
one value has a probability that is different from 0 and 2^-h2). That is
because if there is an input distribution with a certain advantage that is not
nearly uniform it is always possible to find an equivalent distribution with
at least the same advantage that is nearly uniform. So the adversary
has to find some subset Tx of the possible x values that will maximize
his advantage. Because the min-entropy of the x values is h2, Tx
should contain 2^h2 elements.

For the entropy distiller D() we take a PRF. This implies when an attacker
wants to know t different outputs of D() he has to do at least O(t) work. It
seems probable that a PRF is the optimal computational secure entropy
distiller, since examining the outputs for a give set of inputs will give the
adversary no information about the inputs that have not been examined.

In t time an attacker can examine at most t different combinations of
k,x. In the rest of this analysis I will assume that the time t is very

The adversary selects some output value C and attempts to increase the
frequency of this value in the outputs. Let's define that a "hit" is the case
where the output of D() is equal to C.

When t is not very large compared to 2^m, an adversary might have some
advantage to select a specific value for C (e.g. the most common output value
that he found). But when t gets very large the variance of the fraction of C's
in the output will decrease with a factor of t^-1/2. When t is very large, as
we assumed, then all possible output values are equivalent, and the
adversary might just as well select C a priori.

Now if we take a <k,x> pair that has not been examined by the adversary, then
the probability of a hit is equal to 2^-m, because D is a PRF with 2^m
possible different outputs.

For a value x let k(x) be the number of keys that were examined for this x
and let h(x) be the number of hits that were found. Define the "bias" b(x)
= h(x) - 2^-m * k(x). So the bias b(x) is the difference between the
number of observed hits and the number of expected hits. For an x that has not
been examined by the adversary the value of b is zero.

When the value of the x input to D(k,x) happens to be equal to x, then the
expected probability of a hit is equal to 2^-m + 2^-h1 * b(x)

The adversary can select some algorithm to select the <x,k> pairs that he
wants to examine. (It turns out to be pretty difficult to analyze such
selection algorithms) But for all examination strategies there is a single
optimal method to select the x values for Tx. It should be obvious that the
optimal strategy of the adversary is to select 2^h2 values for x that have the
highest values for b(x).

Now when we define bmean as the average value of b(x) for x in Tx, then the
expected probability of a hit in the outputs of D() is equal to 2^-m + 2^-h1 *
bmean. So the advantage for the attacker is epsilon = 2^-h1 * bmean.

This can be used to give a simple lower bound on the time t, for a given
advantage epsilon. When some x element has bias b(x) then the adversary
must have found b(x) / (1 - 2^-m) hits for this x. So when the average
value of b over 2^h2 elements is bmean, an attacker must have found
at least bmean * 2^h2 / (1- 2^-m) hits. But in order to find
a single hit, the attacker must examine on average 2^m different inputs,
so the attacker will have examined at least bmean * 2^(h2+m) / (1- 2^-m)
different inputs. Because bmean = 2^h1 * epsilon, this gives
t >= epsilon * 2^(h1+h2+m) / (1- 2^-m)

This is only a crude lower bound, it is only a realistic lower bound
for small values of epsilon. I have analyzed several selection algorithms
and in all cases t increased much more rapidly than this formula would

Overall, when this analysis is correct, this is a very positive result. A PRF
can be used as a universal entropy distiller and it is effectively possible to
reduce the advantage of an adversary to arbitrary low levels by increasing the
entropy in k.

I have analyzed several different approaches to get better bounds for the
security. In general, this turns out to be rather complicated. Because
this post is already very long I have not included them, but if
there is any interest I could post some of my results.

Ernst Lippe

Relevant Pages

  • Re: new /dev/random
    ... > optimistic dreams for what one would like from an entropy distiller. ... SHA1 is a secure hash, though, which would also be very exciting. ... > to any such distribution yields pseudorandom outputs. ...
  • Re: behavior as mapping
    ... estimating a probability distribution, the distribution ... sequence with equal probability - since you have microsecond temporal ... reduction of the entropy Pto the entropy P ... If there were 4 genes we would need 2 bits of binding site info. ...
  • Re: behavior as mapping
    ... estimating a probability distribution, the distribution ... sequence with equal probability - since you have microsecond temporal ... reduction of the entropy Pto the entropy P ... If there were 4 genes we would need 2 bits of binding site info. ...
  • Re: new /dev/random
    ... >A c1,c2 entropy generator takes any input k and produces a string of ... >fixing any probability distribution it likes over those strings, ... >long as the entropy exceeds k. ... our mixer to do a good job with, we can then ask whether applying SHA1 ...
  • Re: new /dev/random
    ... There is still hope for an entropy distiller D ... anything else about the form of those two distributions. ... a way to distill one bit of uniform random output securely from this? ...