|
Wil Baden 2000-05-27
In 1997, Makoto Matsumoto and Takuji Nishimura developed the Mersenne Twister. The Mersenne Twister is "designed with consideration of the flaws of various existing generators," has a super-astronomical period of 2^19937 - 1, gives a sequence that is 623-dimensionally equidistributed, and "has passed many stringent tests, including the die-hard test of G. Marsaglia and the load test of P. Hellekalek and S. Wegenkittl." It is efficient in memory usage (typically using 2506 bytes of static data, and the code is quite short as well). It generates random numbers in batches of 624 at a time, so the caching and pipelining of modern systems is exploited. It is also divide- and mod-free.
And it is fast. In C, it's four times faster than the
inadequate Standard C Library function rand. Elko
Tchernev reports 100 million random numbers in 22 seconds
using SwiftForth.
For full information,
Mersenne TwisterIt is easy to port the C code to Standard Forth.
TEXTA Forth program for MT19937: Integer version (1999/10/28)
GENRAND generates one pseudorandom unsigned integer (32bit)
which is uniformly distributed among 0 to 2^32-1 for each
call. seed SGRENRAND sets initial values in the working
area of 624 words. Before GENRAND, seed SGENRAND must be
called once. (seed is any 32-bit integer.)
Coded in C by Takuji Nishimura, considering the suggestions by Topher Cooper and Marc Rieffel in July-Aug. 1997.
Converted to Standard Forth by Wil Baden, 2000-05-15.
C version copyright (C) 1997, 1999 Makoto Matsumoto and Takuji Nishimura. When you use this, send an email to: M. Matsumoto with an appropriate reference to your work.
REFERENCE
M. Matsumoto and T. Nishimura,
"Mersenne Twister: A 623-Dimensionally Equidistributed Uniform Pseudo-Random Number Generator",
ACM Transactions on Modeling and Computer Simulation, Vol. 8, No. 1, January 1998, pp 3--30.
DECIMAL
: PRIME ( n -- flag )
DUP 1 AND 0= IF 2 = EXIT THEN
1 ( n d)
BEGIN 2 +
2DUP DUP * U< IF 2DROP TRUE EXIT THEN
2DUP MOD 0=
UNTIL 2DROP ( )
FALSE ;
\ Period parameters
624 CONSTANT MTN \ "N" has been renamed "MTN".
397 CONSTANT MTM \ "M" has been renamed "MTM".
H# 9908B0DF CONSTANT Matrix-A \ Constant vector A
H# 80000000 CONSTANT Upper-Mask \ Most significant bits
H# 7FFFFFFF CONSTANT Lower-Mask \ Least significant bits
\ Tempering parameters
H# 9D2C5680 CONSTANT Tempering-Mask-B
H# EFC60000 CONSTANT Tempering-Mask-C
: Tempering-Shift-U 11 RSHIFT ;
: Tempering-Shift-S 7 LSHIFT ;
: Tempering-Shift-T 15 LSHIFT ;
: Tempering-Shift-L 18 RSHIFT ;
CREATE MT MTN CELLS ALLOT \ The array for the state vector.
CREATE MTI -1 ,
\ Unsigned MTI > MTN means MT[...] isn't initialized.
\ Initializing the array with a seed.
: SGENRAND ( seed -- )
MTN 0 DO
DUP H# FFFF0000 AND I 'th MT !
69069 * 1+
DUP H# FFFF0000 AND 16 RSHIFT I 'th MT @ OR I 'th MT !
69069 * 1+
LOOP DROP ( )
MTN MTI ! ;
Initialization by SGENRAND is an example. Theoretically,
there are 2^19937-1 possible states as an initial state. The
following function allows choosing any of 2^19937-1 states.
Essential bits in seed-array[] are the following 19937
bits: seed-array[0]&Upper-Mask, seed-array[1], ...,
seed-array[MTN-1]. seed-array[0]&Lower-Mask is discarded.
Theoretically, seed-array[0]&Upper-Mask, seed-array[1],
..., seed-array[MTN-1] can take any values except all zeros.
: LSGENRAND ( &seed-array -- )
\ The length of seed-array[] must be at least MTN cells.
MTN 0 DO
I 'th OVER @ I 'th MT !
LOOP DROP ( )
MTN MTI ! ;
: GENRAND ( -- u )
MTI @ MTN U< NOT IF \ Generate MTN words at one time.
MTI @ MTN = NOT IF \ If SGENRAND hasn't been called,
4357 SGENRAND \ a default initial seed is used.
THEN
\ {0...N-M-1}
MTN MTM - 0 DO
I 'th MT @ Upper-Mask AND
I 1+ 'th MT @ Lower-Mask AND OR ( y)
DUP 1 RSHIFT SWAP ( x y)
1 AND IF Matrix-A XOR THEN ( x)
I MTM + 'th MT @ XOR
I 'th MT ! ( )
LOOP
\ {N-M...N-2}
MTN 1- MTN MTM - DO
I 'th MT @ Upper-Mask AND
I 1+ 'th MT @ Lower-Mask AND OR ( y)
DUP 1 RSHIFT SWAP ( x y)
1 AND IF Matrix-A XOR THEN ( x)
I MTM + MTN - 'th MT @ XOR
I 'th MT ! ( )
LOOP
\ N-1, 0
MTN 1- 'th MT @ Upper-Mask AND
MT @ Lower-Mask AND OR ( y)
DUP 1 RSHIFT SWAP ( x y)
1 AND IF Matrix-A XOR THEN ( x)
MTM 1- 'th MT @ XOR
MTN 1- 'th MT ! ( )
0 MTI !
THEN
MTI @ 'th MT @ 1 MTI +! ( u)
DUP Tempering-Shift-U XOR
DUP Tempering-Shift-S Tempering-Mask-B AND XOR
DUP Tempering-Shift-T Tempering-Mask-C AND XOR
DUP Tempering-Shift-L XOR ;
The speed and staggering features make this a powerful candidate for Monte Carlo simulation and cryptography.
2^19937-1 is a big number, suitable for billions and billions of pseudo-random numbers. How big can be seen by writing it in decimal -- 6002 digits.
Program Text 3
\ `u GENRANDMAX` yields a uniform random integer < `u`.
: GENRANDMAX ( u -- n )
GENRAND UM* NIP ;
\ Mersenne Twister for Floating Point.
: FGENRAND ( F: -- 0. <= r <= 1. )
GENRAND 0 D>F 2.3283064370807974E-10 F*
;
: FGENRAND-1 ( F: -- 0. <= r < 1. )
GENRAND 0 D>F 2.3283064365386963E-10 F*
;
A Mersenne number is a number of the form 2^w-1. A Mersenne prime is a Mersenne number that is prime. A necessary but not sufficient condition for a Mersenne prime is that w is prime. Another condition will make it sufficient.
As of May 2000, there are 38 known Mersenne primes. 2^19937-1 is the 24th and has 6002 decimal digits. The 38th is 2^6972593-1 and I don't know how many decimal digits.
Finding Primes
\ Lukas-Lehmer Test for Mersenne Primes
: Lukas-Lehmer ( w -- flag )
DUP 2 < IF 1 = EXIT THEN
DUP 1 AND 0= IF 2 = EXIT THEN
DUP PRIME NOT IF DROP FALSE EXIT THEN
1 OVER LSHIFT 1- SWAP ( 2^w-1 w)
4 SWAP 1- 1 DO ( 2^w-1 u)
DUP UM* -2 M+ THIRD UM/MOD DROP
LOOP
NIP 0= ;
\ Mersenne primes in 32 bits.
MARKER ONCE
: RUN CR ." \ "
CR 32 2 DO
I Lukas-Lehmer IF 1 I LSHIFT 1- . THEN
LOOP ;
RUN ONCE
\ 3 7 31 127 8191 131071 524287 2147483647
2^61-1 and 2^89-1 are the next two Mersenne primes.
For cryptography, replace SGENRAND with a function using a
secure hash to fill MT.
COUNTER SGENRAND or uCOUNTER XOR SGENRAND is usually good
enough to initialize simple randomizing.
For best performance, use macros to define ancillary words.
: Tempering-Shift-U S" 11 RSHIFT " EVALUATE ; IMMEDIATE
: Tempering-Shift-S S" 7 LSHIFT " EVALUATE ; IMMEDIATE
: Tempering-Shift-T S" 15 LSHIFT " EVALUATE ; IMMEDIATE
: Tempering-Shift-L S" 18 RSHIFT " EVALUATE ; IMMEDIATE
How Mersenne Twister Works
One way to define a linear congruential series is to pick numbers m and p, where p is a prime, and m is a "generator" for it. A generator for a prime p is a number such that its powers modulo p yield all positive numbers less than p. Thus you can start with any positive number less than p, and continue multiplying by m to get all positive numbers less than p.
For example, 3 and 5 are the generators for 7.
MARKER Run-and-Forgotten
7 VALUE The-Prime
: Generator-Check ( -- )
The-Prime 1 DO CR
I BEGIN ( x)
DUP . DUP 1 >
WHILE
I * The-Prime MOD
REPEAT DROP ( )
LOOP ;
Generator-Check Run-and-Forgotten
1
2 4 1
3 2 6 4 5 1
4 2 1
5 4 6 2 3 1
6 1
In the Mersenne Twister, instead of numbers, we are working with vectors of 19937 bits. In this, the equivalent of a generator is a 19937 by 19937 matrix that can successively multiply any of the vectors that is not all 0 to obtain every vector that is not all 0. The arithmetic is done modulo 2. Addition is x + y mod 2 and multiplication is x * y mod 2. These are the same as x xor y and x and y.
In the program, Matrix-A is a value that can be used to form
such a matrix. The once-every-624-times part of the program
takes this value and does calculations that are equivalent to
multiplying by the full matrix.
This gives 2^19937-1 combinations of 19937 bits.
The matrix has the following form, but 19937 by 19937.
0 1 0 0 0 0
0 0 1 0 0 0
0 0 0 1 0 0
0 0 0 0 1 0
0 0 0 0 0 1
a b c d e f
The done-every-time part twists the output to obscure the algebraic connection between successive elements. It's equivalent to multiplying by an invertible 19937 by 19937 matrix.
Program Text 4MARKER ONCE : RUN MTI ON 1000 0 DO I 5 MOD 0= IF CR THEN GENRAND 12 U.R LOOP ; RUN ONCE