Re: diehard and ent results question

From: Douglas A. Gwyn (DAGwyn@null.net)
Date: 03/05/03


Date: Tue, 04 Mar 2003 20:27:44 -0500
From: "Douglas A. Gwyn" <DAGwyn@null.net>

Bryan Olson wrote:
> ... I view the chi-square statistic as much more
> like a variance than like an average.

Indeed, all similar tests measure (perhaps approximately)
what is known in the trade as a "bulge" in the distribution.

A point that should have been made earlier about the testing
of the distribution of 8-bit "byte" values from what is
presumably at heart a *bit* generator: That measures, not
as crisply as possible, in effect the bulge for just one of
the coefficients of the discrete Fourier transform of the
bit sequence. One source of confusion in this thread has
been concerning exactly *what* is being tested. If the real
question is whether the *bit* sequence appears to agree well
with the uncorrelated-uniform-random model, then the chunking
should be avoided. One thing I like about the "MDI" approach
is that *each available piece of data* contributes as much as
possible (theorem!) to the information in favor of (or
opposing) the hypothesis under test. While some other tests
measure essentially the same thing, it's not as obvious that
they do as good a job (even if they do). Anyway, switching
back and forth between different but related hypotheses just
makes it harder to determine *what* one ends up knowing.
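
For concreteness, here is a minimal sketch of the byte-chunked
Pearson chi-square statistic under discussion. It is only an
illustration, not the code used by diehard or ent; the function
name chi_square_bytes and the seeded sample are assumptions made
for the example.

```python
import random

def chi_square_bytes(data: bytes) -> float:
    """Pearson chi-square statistic of byte-value counts vs. a uniform model."""
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    expected = len(data) / 256          # same expected count for every value
    return sum((c - expected) ** 2 / expected for c in counts)

# Hypothetical sample: 65536 bytes from Python's built-in generator.
rng = random.Random(12345)
sample = bytes(rng.getrandbits(8) for _ in range(1 << 16))
stat = chi_square_bytes(sample)

# With 256 cells there are 255 degrees of freedom, so under the uniform
# model the statistic has mean 255 and standard deviation sqrt(510).
print(f"chi-square = {stat:.1f} (255 degrees of freedom)")
```

Note that the statistic depends only on how often each byte value
occurs, so any reordering of the same bytes gives the same result.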

Note: tests similar to Pearson's chi-square per this thread
are checking only the *first-order* statistics, not
correlations. Now, chi-square *can be* and *is* used to test
other hypotheses, but those tests must be designed to use the
data appropriately. Since the DFT is directly useful for
such higher-order correlation, it should be helpful here.
It turns out that the human eye is fairly good at spotting
patterns in the output of many PRNGs, when the bits are
arranged in a regular bitmap. Marsaglia appears to have been
among the pioneers of this technique; I think at least one of
the readers of this newsgroup was experimenting with this.
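
As a rough illustration of the difference between first-order
counts and correlations, the sketch below feeds a strictly
alternating bit sequence (a made-up worst case, not output from
any real generator) to a naive DFT. The count of ones is perfectly
balanced, yet all of the spectral power lands in a single
coefficient. The function name dft_power is an assumption for the
example, and the O(n^2) transform is only meant for short inputs.

```python
import cmath

def dft_power(bits):
    """Power |X_k|^2 for k = 0..n/2 of the bits mapped to +1/-1 (naive DFT)."""
    n = len(bits)
    x = [1.0 if b else -1.0 for b in bits]
    power = []
    for k in range(n // 2 + 1):
        s = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        power.append(abs(s) ** 2)
    return power

# Hypothetical input: 0101... has exactly as many ones as zeros,
# so a first-order count sees nothing wrong with it.
bits = [t % 2 for t in range(256)]
ones = sum(bits)

power = dft_power(bits)
peak = max(range(len(power)), key=power.__getitem__)

# For independent fair bits each |X_k|^2 averages about n;
# here the whole n^2 is concentrated at k = n/2.
print(f"ones = {ones} of {len(bits)}, peak at k = {peak}, power = {power[peak]:.0f}")
```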

Then there is "Maurer's Universal Test" (which recently has
been patched up in the literature). I don't know whether
anybody has used it extensively to check PRNGs. It would be
interesting to hear whether it is useful for this in practice.
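
For readers who have not seen it, the core statistic of Maurer's
test can be sketched in a few lines. This is a simplified
illustration under assumed parameters (block length L = 8 and an
initialization segment of Q = 2560 blocks); it omits the variance
correction and the p-value computation, and the function name
maurer_statistic is made up for the example.

```python
import math
import random

def maurer_statistic(bits, L=8, Q=2560):
    """Mean log2 distance between successive occurrences of each L-bit block."""
    n_blocks = len(bits) // L
    blocks = []
    for i in range(n_blocks):
        value = 0
        for b in bits[i * L:(i + 1) * L]:
            value = (value << 1) | b
        blocks.append(value)

    last_seen = [0] * (1 << L)             # 0 means "not seen yet"
    for i in range(Q):                      # initialization segment
        last_seen[blocks[i]] = i + 1

    total = 0.0
    for i in range(Q, n_blocks):            # test segment
        total += math.log2(i + 1 - last_seen[blocks[i]])
        last_seen[blocks[i]] = i + 1
    return total / (n_blocks - Q)

# Hypothetical input: bits from Python's built-in generator.
rng = random.Random(1)
bits = [rng.getrandbits(1) for _ in range(8 * (2560 + 20000))]
f_n = maurer_statistic(bits)

# For L = 8 the tabulated expectation under the random model is
# roughly 7.18; compressible data gives a noticeably smaller value.
print(f"f_n = {f_n:.4f}")
```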

