Re: Efficient way to gen a random number in a particular range?

On 2009-08-11, Ertugrul Söylemez <es@xxxxxxxx> wrote:
MTGAP schrieb:
I assume Scott is doing this because modding by a power of 2 can be
efficiently implemented with the bitwise AND operation. But it's
faster (and simpler) to do it like this:

x = prng() % 40 + 10;

Bad method, because it's biased towards 0..15. Mod 41 is also bad,
because it's biased towards 0..36. Modulo operation is only good in
this case, if your module is a power of 2. In all other cases you
should use 64 bit integer operations and do it this way:

r <- prng
40*r / 2^32

This yields a number 0..39. Generally replacing 40 by n yields a number

That's still biased in favor of the numbers 2, 4, 7, 9, 12, 14, 17,
19, 22, 24, 27, 29, 32, 34, 37 and 39, assuming I did my math right.

Mind you, in this case the bias is only about one in a hundred
thousand, but then that's true of both methods.

Anyway, the fastest method that is completely unbiased, assuming the
underlying generator is so, is (in C code):

do { tmp = prng(); } while (tmp >= 4294967280);
x = 10 + tmp % 40;

The number 4294967280 is simply the largest multiple of 40 less than
or equal to 2**32. If you'd prefer to take the range as a parameter,
you can calculate it at runtime like this:

unsigned int random (unsigned int min, unsigned int max) {
unsigned int len, top, tmp;
len = max - min + 1; /* length of range */
if (len & (len-1)) { /* not a power of 2 */
top = PRNG_MAX - PRNG_MAX % len;
do { tmp = prng(); } while (tmp >= top);
return min + tmp % len;
} else if (len) { /* power of 2, max-min < PRNG_MAX */
return min + prng() % len;
} else return prng(); /* min == 0, max == PRNG_MAX */

This assumes that prng() returns uniformly distributed numbers between
0 and PRNG_MAX inclusive, where PRNG_MAX is one less than a power of
two, and that the low bits of its output are at least as random as any
others (which won't be the case for simple LCPRNGs like most C library
rand() implementations). If you're using a LCPRNG (which you really
shouldn't be doing, but some still do), use Ertugrul's method _after_
the rejection sampling, i.e. replace the line:

return min + tmp % len;
return min + ((uint64_t) tmp * len) / top;

and the line:

return min + prng() % len;
return min + ((uint64_t) prng() * len) / ((uint64_t) PRNG_MAX + 1);

(BTW, I spent quite some time tweaking that code, and I *believe* that
it's correct. However, I believed the same thing several times while
I was writing this before finding more bugs and odd corner cases, so I
make no guarantees. Use at your own risk, and preferably test first.)

Ilmari Karonen
To reply by e-mail, please replace ".invalid" with ".net" in address.