Re: How random is random?



On May 11, 12:12 pm, "Colin B." <cbi...@xxxxxxxxxxxxxxxxxxxxxxxxx>
wrote:

$ dd if=/dev/random of=<filename> bs=x count=y conv=sync.

Now assuming that we keep the filesize the same (i.e. x*y=constant),
the time to generate files goes up as count increases and bs decreases.
The interesting thing is that files created with low count and high bs...
- compress much better
- generate far fewer lines (as measured by wc -l)

Now since compress and gzip are apparently entropy-based algorithms, it
stands to reason (at least by me!) that the small-count file has less
entropy. The question is, what does this actually mean, and what are the
consequences of it?

'conv=sync' tells 'dd' that if it gets a short read from its input
then it should pad the record to the specified blocksize with zeroes
(NUL bytes). /dev/random can produce short reads if its entropy pool
gets depleted. If you examine the compressible output files I expect
you'll find that they contain lots of runs of zeroes, and those runs
of zeroes are highly compressible. The padding also contains no
newline bytes, which is why 'wc -l' counts far fewer lines.
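
The padding is easy to reproduce without /dev/random. The sketch below is only an illustration: it forces a short read by feeding 'dd' three bytes through a pipe, and the variable names 'padded' and 'plain' are made up for the example.

```shell
#!/bin/sh
# printf writes only 3 bytes and then closes the pipe, so dd's single
# read returns fewer bytes than bs=8 (a short read).

# With conv=sync the short block is padded with NUL bytes up to bs.
padded=$(printf 'abc' | dd bs=8 count=1 conv=sync 2>/dev/null | wc -c)

# Without conv=sync the short block is written out as it was read.
plain=$(printf 'abc' | dd bs=8 count=1 2>/dev/null | wc -c)

echo "with conv=sync:    $padded bytes"
echo "without conv=sync: $plain bytes"
```

With 'conv=sync' the output is 8 bytes long ("abc" followed by five NULs); without it, only the 3 bytes that were actually read come out.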

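This is also the reason why the large 'bs' causes the file to be
generated more quickly: every short read still yields a full block
of output, so most of the file is padding and far less data has to
come from /dev/random.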
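
To gauge how much of a given output file is padding, one rough check is to count its NUL bytes. This is a minimal sketch; 'count_zero_bytes' is a hypothetical helper written for this post, not part of 'dd' or any standard tool, and 'random.out' is a placeholder filename.

```shell
#!/bin/sh
# count_zero_bytes FILE: print the number of NUL (0x00) bytes in FILE.
# Hypothetical helper for illustration only.
count_zero_bytes() {
    total=$(wc -c < "$1")                    # all bytes in the file
    nonzero=$(tr -d '\000' < "$1" | wc -c)   # bytes left after deleting NULs
    echo $((total - nonzero))
}

# Example use (random.out is a placeholder name):
#   count_zero_bytes random.out
```

In uniformly random data only about 1 byte in 256 should be zero, so a count far above that fraction points to padding rather than data from /dev/random.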

OttoM.
__
ottomeister

Disclaimer: These are my opinions. I do not speak for my employer.
