Random number generators discussion

Random number generators | www.agner.org

 
thread Multiplying uniform floats to obtain random ints - Nathan Kurz - 2015-11-15
last replythread Multiplying uniform floats to obtain random ints - Agner - 2015-11-16
last replythread Multiplying uniform floats to obtain random ints - Nathan Kurz - 2015-11-16
last reply Multiplying uniform floats to obtain random ints - Agner - 2015-11-17
 
Multiplying uniform floats to obtain random ints
Author:  Date: 2015-11-15 20:43
Since modulus is such a slow operation, it's tempting to create a random integer in the range [0,r) by generating a uniform random float, multiplying by the range, and then taking the floor. The R language, which is generally reputed to have good statistical practices, takes this approach: https://github.com/wch/r-source/blob/b156e3a711967f58131e23c1b1dc1ea90e2f0c43/src/main/unique.c#L1758

But I haven't been able to find much analysis about this approach. Your Random Vector paper (http://www.agner.org/random/theory/randomvector.pdf) says "Floating point calculation methods give the
same error because we are mapping an interval of length 2^32 to another interval of incommensurable length d", but I'm not sure this is what you are referring to.

For r < 2^32, is this a safe means of generating a random integer in range [0,r)?

1) Generate 52 random bits
2) Create random double [1.0,2.0)
3) Subtract one to have [0.0,1.0)
4) Multiply by r to obtain [0.0, r.0)
5) Take floor to obtain [0, r-1]

My instinct is that this is equitable other than the +/-1 difference in the number of uniform floats that map to each integer.

Are there other effects? Are there way to overcome this while avoiding an expensive modulus?

   
Multiplying uniform floats to obtain random ints
Author: Agner Date: 2015-11-16 14:01
Yes, this is what I am referring to. Your method still has the quantization error described in my paper.

If r is a power of 2, then you can simply use (C language:)
y = x & (r-1);
where x is random bits.

If r is not a power of 2, then you can use the method you describe, or this method which is faster:

y = ((uint32_t)x * (uint64_t)r) >> 32;

Both methods have the same quantization error. If you want exactness then use the rejection method described in my paper (which will be published in the next issue of Journal of Modern Applied Statistical Methods).

   
Multiplying uniform floats to obtain random ints
Author:  Date: 2015-11-16 21:45
> Your method still has the quantization error described in my paper

Could you expand on this a bit? I think converting from a random double [0,1) to a ranged integer [0,r-1] has a maximum quantization error of about 1/2^(53-log2(r)). The denominator is the number of floats that map to a particular integer, and the numerator is the maximum difference between any two integers. This would be be a much smaller quantization error than is present with the modulo approach without rejection. In some applications, this degree of bias would be acceptable, in others it wouldn't. I'd hesitate to call it the "same" error, but I see why you might. Is this accurate, or is there some larger quantization error in the floating point method that I'm missing?

> y = ((uint32_t)x * (uint64_t)r) >> 32;

Is there further explanation of the equivalence of this to the modulo approach in another one of your other papers?

> (which will be published in the next issue of Journal of Modern Applied Statistical Methods)

Congratulations!

   
Multiplying uniform floats to obtain random ints
Author: Agner Date: 2015-11-17 06:49
OK, I see your point. The higher resolution you have in the random number, the less quantization error. The example in my article is based on a resolution of 32 bits in x. If you use a double with a resolution of 52 (or 53) bits and r < 2^32 then you will have a very small quantization error, probably too small to be significant.

Alternatively, you may use a 64 bit x 64 bit -> 128 bit unsigned integer multiplication of x and r, and take the high part of the result. If you are running in 64 bit mode, use this method:
stackoverflow.com/questions/1541426/computing-high-64-bits-of-a-64x64-int-product-in-c

In 32-bit mode use this method:
blogs.msdn.com/b/oldnewthing/archive/2014/12/08/10578956.aspx