Re: random data

From: Mark Wooding (mdw_at_nsict.org)
Date: 07/31/03


Date: 31 Jul 2003 08:21:54 GMT

MacGregor K. Phillips <mkp@topsecretcrypto.com> wrote:

[re diehard...]

> After you have run all of the 18 tests on a file of random bytes, does
> the final p-value have any real meaning? It says for all the
> individual tests the smaller the p-value the better. So if the final
> p-value for all 18 tests is very small, is this good, bad, or does not
> have any real meaning in the grand scheme of things?

The `p-values' are computed from your input data in various complicated
ways. The idea is that they're computed in such a way that, if the
input bits are uniformly distributed, then the p-values are uniformly
distributed over the interval [0, 1). Suppose P is a p-value; then,
still assuming uniform input, we have

  Pr[P < x] = x and Pr[P > x] = 1 - x
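
A quick way to see this for yourself is to simulate it. What follows is
only a sketch, not one of diehard's actual tests: monobit_p_value is a
made-up toy statistic (count the 1 bits, standardise, take the normal
CDF), and Python's own generator stands in for `uniform input'.

```python
import math
import random

def monobit_p_value(bits, n):
    """Toy one-sided test (an assumption for illustration, not a diehard test):
    p = Phi(z), where z is the standardised count of 1 bits in an n-bit sample."""
    ones = bin(bits).count("1")
    z = (ones - n / 2) / math.sqrt(n / 4)
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def empirical_cdf(p_values, x):
    """Fraction of the observed p-values that fall below x."""
    return sum(p < x for p in p_values) / len(p_values)

rng = random.Random(2003)   # fixed seed so the run is repeatable
N_BITS = 4096
p_values = [monobit_p_value(rng.getrandbits(N_BITS), N_BITS)
            for _ in range(2000)]

# Under uniform input each estimate should land close to x itself.
for x in (0.1, 0.25, 0.5, 0.9):
    print(f"Pr[P < {x}] ~ {empirical_cdf(p_values, x):.3f}")
```

The printed estimates should sit near 0.1, 0.25, 0.5 and 0.9, give or
take sampling noise.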

The vaguely clever thing is that the p-values are /also/ computed in
such a way as to push them `outwards from the centre', towards 0 or 1,
if the input data is nonrandom. That's how you can tell the difference.
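
To illustrate the `pushing outwards', here is a rough sketch comparing
fair bits with slightly biased ones. The statistic (a standardised count
of 1 bits fed through the normal CDF) and the 53% bias are assumptions
picked for the demonstration; diehard's real tests are more elaborate.

```python
import math
import random

def ones_count_p_value(ones, n):
    """Toy statistic (hypothetical): normal CDF of the standardised count of 1 bits."""
    z = (ones - n / 2) / math.sqrt(n / 4)
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def sample_p_values(prob_one, n_bits, trials, rng):
    """One p-value per trial, each from n_bits bits that are 1 with probability prob_one."""
    out = []
    for _ in range(trials):
        ones = sum(rng.random() < prob_one for _ in range(n_bits))
        out.append(ones_count_p_value(ones, n_bits))
    return out

def fraction_in_tails(p_values, width=0.1):
    """Fraction of p-values lying within `width` of either end of [0, 1)."""
    return sum(p < width or p > 1 - width for p in p_values) / len(p_values)

rng = random.Random(31)     # fixed seed so the run is repeatable
fair = sample_p_values(0.50, 4096, 300, rng)
biased = sample_p_values(0.53, 4096, 300, rng)

print(f"fair bits:   {fraction_in_tails(fair):.2f} of p-values in the outer tails")
print(f"biased bits: {fraction_in_tails(biased):.2f} of p-values in the outer tails")
```

With fair bits about a fifth of the p-values should land in the outer
tenths, as uniformity predicts; with the biased bits nearly all of them
pile up near 1.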

So, no, you don't want `small' p-values: you want them scattered evenly
across the whole interval. If lots of your p-values are bunched up near
0 (or near 1), and the final one comes out really small, then you've got
a failure.
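
One way to turn `are they scattered evenly?' into a single number is a
Kolmogorov-Smirnov test of the collected p-values against the uniform
distribution. I believe diehard's overall figure is something along
these lines, but treat that as an assumption. The sketch below uses the
plain asymptotic formula, which is rough for a small number of
p-values, and the usual upper-tail convention in which a small result
means `not uniform'. The function names are my own.

```python
import math

def ks_uniform_statistic(p_values):
    """Largest gap between the empirical CDF of the p-values and the uniform CDF."""
    xs = sorted(p_values)
    n = len(xs)
    d_plus = max((i + 1) / n - x for i, x in enumerate(xs))
    d_minus = max(x - i / n for i, x in enumerate(xs))
    return max(d_plus, d_minus)

def ks_overall_p_value(p_values, terms=200):
    """Asymptotic Kolmogorov tail probability Q(lam), with lam = sqrt(n) * D.
    Only an approximation; it is crude when n is small."""
    n = len(p_values)
    lam = math.sqrt(n) * ks_uniform_statistic(p_values)
    total = sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2 * total))

# Evenly spread p-values versus p-values all bunched up near 0.
spread = [(i + 0.5) / 100 for i in range(100)]
bunched = [0.001 * i for i in range(1, 101)]

print("spread: ", ks_overall_p_value(spread))
print("bunched:", ks_overall_p_value(bunched))
```

The evenly spread set gives a result close to 1; the bunched set gives
one that is vanishingly small, which is the failure case.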

-- [mdw]