Linear congruential generator

Two modulo-9 LCGs show how different parameters lead to different cycle lengths. Each row shows the state evolving until it repeats. The top row shows a generator with m = 9, a = 2, c = 0, and a seed of 1, which produces a cycle of length 6. The second row is the same generator with a seed of 3, which produces a cycle of length 2. Using a = 4 and c = 1 (bottom row) gives a cycle length of 9 with any seed in [0, 8].

A linear congruential generator (LCG) is an algorithm that yields a sequence of pseudo–randomized numbers calculated with a discontinuous piecewise linear equation. The method represents one of the oldest and best–known pseudorandom number generator algorithms.[1] The theory behind them is relatively easy to understand, and they are easily implemented and fast, especially on computer hardware which can provide modulo arithmetic by storage–bit truncation.

The generator is defined by recurrence relation:

where is the sequence of pseudorandom values, and

— the "modulus"
— the "multiplier"
— the "increment"
— the "seed" or "start value"

are integer constants that specify the generator. If c = 0, the generator is often called a multiplicative congruential generator (MCG), or Lehmer RNG. If c  0, the method is called a mixed congruential generator.[2]:4-

Period length

A benefit of LCGs is that with appropriate choice of parameters, the period is known and long. Although not the only criterion, too short a period is a fatal flaw in a pseudorandom number generator.[3]

While LCGs are capable of producing pseudorandom numbers which can pass formal tests for randomness, this is extremely sensitive to the choice of the parameters m and a. For example, a = 1 and c = 1 produces a simple modulo-m counter, which has a long period, but is obviously non–random.

Historically, poor choices for a have led to ineffective implementations of LCGs. A particularly illustrative example of this is RANDU, which was widely used in the early 1970s and led to many results which are currently being questioned because of the use of this poor LCG.[4]

There are three common families of parameter choice:

m prime, c = 0

This is the original Lehmer RNG formulation. The period is m-1 if the multiplier a is chosen to be a primitive element of the integers modulo m. The initial state must be chosen between 1 and m-1.

One disadvantage of a prime modulus is that the modular reduction requires a double–width product and an explicit reduction step. Often a prime just less than a power of 2 is used (the Mersenne primes 231-1 and 261-1 are popular), so that the reduction modulo 2e - d can be computed as (ax  mod  2e) - d ax/2e. This must be followed by a conditional addition of m if the result is negative, but the number of additions is limited to ad/m, which can be easily limited to one if d is small.

If a double–width product is unavailable, and the multiplier is chosen carefully, Schrage's method[5] may be used. To do this, factor m  =  qa+r, i.e. q = ⌊m/a and r = m mod a. Then compute ax  mod  m = a(x mod q) - rx/q. Since x  mod  q < qm/a, the first term is strictly less than am/a = m. If a is chosen so that r  q (and thus r/q  1), then the second term is also: rx/qrx/q = x(r/q) ≤ x(1) = x < m. Thus, the difference will lie in the range [1-m, m-1], and can be reduced to [0, m-1] with a single conditional add.[6]

A second disadvantage is that it is awkward to convert the value 1  X < m to uniform random bits. If a prime just less than a power of 2 is used, sometimes the missing values are simply ignored.

m a power of 2, c=0

Choosing m to be a power of 2, most often m = 232 or m = 264, produces a particularly efficient LCG, because this allows the modulus operation to be computed by simply truncating the binary representation. In fact, the most significant bits are usually not computed at all. There are, however, disadvantages.

This form has maximal period m/4, achieved if a  3 or a  5 (mod 8). The initial state X0 must be odd, and the low three bits of X alternate between are not useful. It can be shown that this form is equivalent to a generator with a modulus a quarter the size and c ≠ 0.[2]

A more serious issue with the use of a power-of-two modulus is that the low bits have a shorter period than the high bits. The lowest–order bit of X never changes (X is always odd), and the next two bits alternate between two states. (If a  5 (mod 8), then bit 1 never changes and bit 2 alternates. If a  3 (mod 8), then bit 2 never changes and bit 1 alternates.) Bit 3 repeats with a period of 4, bit 4 has a period of 8, and so on. Only the most significant bit of X achieves the full period.

c≠0

When c≠0, correctly chosen parameters allow a period equal to m, for all seed values. This will occur if and only if:[2]:17—19

  1. and the offset are relatively prime,
  2. is divisible by all prime factors of ,
  3. is divisible by 4 if is divisible by 4.

These three requirements are referred to as the Hull–Dobell Theorem.[7][8]

This form may be used with any m. A power–of–2 modulus equal to a computer's word size is convenient and achieves the longest possible period (for any odd c and a 1 mod 4), but has the same problem mentioned above: only the most significant bit achieves the full period. Lower–order bit k repeats with a period of length 2⁺¹.

An odd (usually prime) modulus avoids this problem, but as mentioned previously cannot produce certain values.

The generator is not sensitive to the choice of c, as long as it is relatively prime to the modulus (e.g. if m is a power of 2, then c must be odd), so the value c=1 is commonly chosen. The series produced by other choices of c can be written as a simple function of the series when c=1.[2]:11

Parameters in common use

The following table lists the parameters of LCGs in common use, including built–in rand() functions in runtime libraries of various compilers.

Sourcemodulus
m
multiplier
a   
increment
c
output bits of seed in rand() or Random(L)
Numerical Recipes2³²16645251013904223
Borland C/C++2³²226954771bits 30..16 in rand(), 30..0 in lrand()
glibc (used by GCC)[9] 2³¹ 110351524512345bits 30..0
ANSI C: Watcom, Digital Mars, CodeWarrior, IBM VisualAge C/C++ [10]
C99, C11: Suggestion in the ISO/IEC 9899 [11]
2³¹110351524512345bits 30..16
Borland Delphi, Virtual Pascal2³²1347758131bits 63..32 of (seed * L)
Turbo Pascal2³²134775813 (0x8088405₁₆)1
Microsoft Visual/Quick C/C++2³²214013 (343FD₁₆)2531011 (269EC3₁₆)bits 30..16
Microsoft Visual Basic (6 and earlier)[12]2²⁴1140671485 (43FD43FD₁₆)12820163 (C39EC3₁₆)
RtlUniform from Native API[13]2³¹ - 1 2147483629 (7FFFFFED₁₆)2147483587 (7FFFFFC3₁₆)
Apple CarbonLib, C++11's minstd_rand0[14]2³¹ - 1168070see MINSTD
C++11's minstd_rand[14]2³¹ - 1482710see MINSTD
MMIX by Donald Knuth2⁶⁴63641362238467930051442695040888963407
Newlib, Musl2⁶⁴63641362238467930051bits 63...32
VMS's MTH$RANDOM,[15] old versions of glibc2³²69069 (10DCD₁₆)1
Java's java.util.Random, POSIX [ln]rand48, glibc [ln]rand48[_r]2⁴⁸25214903917 (5DEECE66D₁₆)11bits 47...16

random0[16][17][18][19][20]

If Xₙ is even then Xₙ₊₁ will be odd, and vice versa—the lowest bit oscillates at each step.

134456 = 2³7⁵812128411
POSIX[21] [jm]rand48, glibc [mj]rand48[_r]2⁴⁸25214903917 (5DEECE66D₁₆)11bits 47...15
POSIX [de]rand48, glibc [de]rand48[_r]2⁴⁸25214903917 (5DEECE66D₁₆)11bits 47...0
cc65[22]2²³65793 (10101₁₆)4282663 (415927₁₆)bits 22...8
cc652³²16843009 (1010101₁₆)826366247 (31415927₁₆)bits 31...16
Formerly common: RANDU [4]2³¹  655390

As shown above, LCGs do not always use all of the bits in the values they produce. For example, the Java implementation operates with 48–bit values at each iteration but returns only their 32 most significant bits. This is because the higher–order bits have longer periods than the lower–order bits (see below). LCGs that use this truncation technique produce statistically better values than those that do not. This is especially noticeable in scripts that use the mod operation to reduce range; modifying the random number mod 2 will lead to alternating 0 and 1 without truncation.

Advantages and disadvantages

LCGs are fast and require minimal memory (one modulo-m number, often 32 or 64 bits) to retain state. This makes them valuable for simulating multiple independent streams.

Hyperplanes of a linear congruential generator in three dimensions. This structure is what the spectral test measures.

Although LCGs have a few specific weaknesses, many of their flaws come from having too small a state. The fact that people have been lulled for so many years into using them with such small moduli can be seen as a testament to strength of the technique. A LCG with large enough state can pass even stringent statistical tests; a modulo-2 LCG which returns the high 32 bits passes TestU01's SmallCrush suite, and a 96–bit LCG passes the most stringent BigCrush suite.[23]

For a specific example, an ideal random number generator with 32 bits of output is expected (by the Birthday theorem) to begin duplicating earlier outputs after m 2¹⁶ results. Any PRNG whose output is its full, untruncated state will not produce duplicates until its full period elapses, an easily detectable statistical flaw. For related reasons, any PRNG should have a period longer than the square of the number of outputs required. Given modern computer speeds, this means a period of 2⁶⁴ for all but the least demanding applications, and longer for demanding simulations.

One flaw specific to LCGs is that, if used to choose points in an n–dimensional space, the points will lie on, at most, (n!⋅m)^1÷n hyperplanes (Marsaglia's Theorem, developed by George Marsaglia).[24] This is due to serial correlation between successive values of the sequence Xₙ. Carelessly chosen multipliers will usually have far fewer, widely spaced planes, which can lead to problems. The spectral test, which is a simple test of an LCG's quality, measures this spacing and allows a good multiplier to be chosen.

The plane spacing depends both on the modulus and the multiplier, a large enough modulus can reduce this distance below the resolution of double precision numbers. The choice of the multiplier becomes less important when the modulus is large. It is still necessary to calculate the spectral index and make sure that the multiplier is not a bad one, but purely probabilistically it becomes extremely unlikely to encounter a bad multiplier when the modulus is larger than about 2^64.

Another flaw specific to LCGs is the short period of the low–order bits when m is chosen to be a power of 2. This can be mitigated by using a modulus larger than the required output, and using the most significant bits of the state.

LCGs should be chosen very carefully for applications where high–quality randomness is critical. Any Monte–Carlo simulation should use an LCG with a modulus greater and preferably much greater than the cube root of the number of random samples which are required. This means, for example, that a (good) 32–bit LCG can be used to obtain about a thousand random numbers; a 64–bit LCG is good for about 2^21 random samples which is a little over two million, etc. For this reason, LCGs are in practice not suitable for large scale Monte–Carlo simulations.

LCGs are not intended, and must not be used, for cryptographic applications; use a cryptographically secure pseudorandom number generator for such applications. Nevertheless, for some applications LCGs may be a good option. For instance, in an embedded system, the amount of memory available is often severely limited. Similarly, in an environment such as a video game console taking a small number of high–order bits of an LCG may well suffice. The low–order bits of LCGs when m is a power of 2 should never be relied on for any degree of randomness whatsoever. Indeed, simply substituting 2 for the modulus term reveals that the low order bits go through very short cycles. In particular, any full–cycle LCG when m is a power of 2 will produce alternately odd and even results.

Sample Python code

The following is an implementation of an LCG in Python:

def lcg(modulus, a, c, seed):
    while True:
        seed = (a * seed + c) % modulus
        yield seed

Sample Free Pascal code

Free Pascal uses a Mersenne Twister as its default pseudo random number generator whereas Delphi uses a LCG. Here is a Delphi compatible example in Free Pascal based on the information in the table above. Given the same RandSeed value it generates the same sequence of random numbers as Delphi.

unit lcg_random;
{$ifdef fpc}{$mode delphi}{$endif}
interface

function LCGRandom: extended; overload;inline;
function LCGRandom(const range:longint):longint;overload;inline;

implementation
function IM:cardinal;inline;
begin
  RandSeed := RandSeed * 134775813  + 1;
  Result := RandSeed;
end;

function LCGRandom: extended; overload;inline;
begin
  Result := IM * 2.32830643653870e-10;
end;

function LCGRandom(const range:longint):longint;overload;inline;
begin
  Result := IM * range shr 32;
end;

Like all pseudorandom number generators, a LCG needs to store state and alter it each time it generates a new number. Multiple threads may access this state simultaneously causing a race condition. Implementations should use different state each with unique initialization for different threads to avoid equal sequences of random numbers on simultaneously executing threads.

LCG derivatives

There are several generators which are linear congruential generators in a different form, and thus the techniques used to analyze LCGs can be applied to them.

One method of producing a longer period is to sum the outputs of several LCGs of different periods having a large least common multiple; the Wichmann–Hill generator is an example of this form. (We would prefer them to be completely coprime, but a prime modulus implies an even period, so there must be a common factor of 2, at least.) This can be shown to be equivalent to a single LCG with a modulus equal to the product of the component LCG moduli.

Marsaglia's add–with–carry and subtract–with–borrow PRNGs with a word size of b=2ʷ and lags r and s (r > s) are equivalent to LCGs with a modulus of  ±  ± 1.[25][26]

Multiply–with–carry PRNGs with a multiplier of a are equivalent to LCGs with a large prime modulus of ab-1 and a power–of–2 multiplier b.

A permuted congruential generator begins with a power–of–2–modulus LCG and applies an output transformation to eliminate the short period problem in the low–order bits.

Comparison with other PRNGs

Generators based on linear recurrences (such as xorshift*) or on good avalanching functions (such as SplitMix64 ) outperform linear congruential generators even at small state sizes. Moreover, linear generators can generate very long sequences, and when suitably perturbed at the output, they pass strong statistical tests. Several examples can be found in the Xorshift family.

The Mersenne twister algorithm provides a very long period (2¹⁹⁹³⁷ - 1) and variate uniformity, but it fails some statistical tests.[27] A common Mersenne twister implementation uses an LCG to generate seed data.

Linear congruential generators have the problem that all of the bits in each number are usually not equally random. Either the modulus is a power of 2 and the least–significant bits are less random than the most–significant bits, or the modulus is not a power of 2 and the outputs are biased. A linear feedback shift register PRNG produces a stream of pseudo–random bits, each of which is truly pseudo–random,[28] and can be implemented with essentially the same amount of memory as a linear congruential generator, albeit with a bit more computation.

The linear feedback shift register has a strong relationship to linear congruential generators.[29] Given a few values in the sequence, some techniques can predict the following values in the sequence for not only linear congruent generators but any other polynomial congruent generator.[29]

See also

Notes

  1. "Linear Congruential Generators" by Joe Bolte, Wolfram Demonstrations Project.
  2. 1 2 3 4 Knuth, Donald (1997). Seminumerical Algorithms. The Art of Computer Programming. 2 (3rd ed.). Reading, MA: Addison-Wesley Professional.
  3. L'Ecuyer, Pierre (13 July 2017). Chan, W. K. V.; D’Ambrogio, A.; Zacharewicz, G.; Mustafee, N.; Wainer, G.; Page, E., eds. History of Uniform Random Number Generation (PDF). Proceedings of the 2017 Winter Simulation Conference (to appear). Las Vegas, United States. hal-01561551.
  4. 1 2 Press, William H.; et al. (1992). Numerical Recipes in Fortran 77: The Art of Scientific Computing (2nd ed.). ISBN 0-521-43064-X.
  5. Jain, Raj (9 July 2010). "Computer Systems Performance Analysis Chapter 26: Random–Number Generation" (PDF). pp. 19–20. Retrieved 2017-10-31.
  6. Fenerty, Paul (11 September 2006). "Schrage's Method". Retrieved 2017-10-31.
  7. Hull, T. E.; Dobell, A. R. (July 1962). "Random Number Generators" (PDF). SIAM Review. 4 (3): 230–254. doi:10.1137/1004061. Retrieved 2016-06-26.
  8. Severance, Frank (2001). System Modeling and Simulation. John Wiley & Sons, Ltd. p. 86. ISBN 0-471-49694-4.
  9. Implementation in glibc-2.26 release. See the code after the test for "TYPE_0"; the GNU C library's rand() in stdlib.h uses a simple (single state) linear congruential generator only in case that the state is declared as 8 bytes. If the state is larger (an array), the generator becomes an additive feedback generator (initialized using minstd_rand0) and the period increases. See the simplified code that reproduces the random sequence from this library.
  10. K. Entacher (21 August 1997). "A collection of selected pseudorandom number generators with linear structures". CiteSeerX 10.1.1.53.3686. Retrieved 16 June 2012.
  11. "Last public Committee Draft from April 12, 2011, page 346f" (PDF). Retrieved 21 Dec 2014.
  12. "How Visual Basic Generates Pseudo–Random Numbers for the RND Function". Microsoft Support. Microsoft. Retrieved 17 June 2011.
  13. In spite of documentation on MSDN, RtlUniform uses LCG, and not Lehmer's algorithm, implementations before Windows Vista are flawed, because the result of multiplication is cut to 32 bits, before modulo is applied
  14. 1 2 "ISO/IEC 14882:2011". ISO. 2 September 2011. Retrieved 3 September 2011.
  15. GNU Scientific Library: Other random number generators
  16. Stephen J. Chapman. "Example 6.4 - Random Number Generator". "MATLAB Programming for Engineers". 2015. p. 253-256.
  17. Stephen J. Chapman. "Example 6.4 - Random Number Generator". "MATLAB Programming with Applications for Engineers". 2012. p. 292-295.
  18. S. J. Chapman. random0. 2004.
  19. Stephen J. Chapman. "Introduction to Fortran 90/95". 1998. p. 322-324.
  20. Wu–ting Tsai. "'Module': A Major Feature of the Modern Fortran". p. 6-7.
  21. The Open Group Base Specifications Issue 7 IEEE Std 1003.1, 2013 Edition
  22. Cadot, Sidney. "rand.s". cc65. Retrieved 8 July 2016.
  23. O'Neill, Melissa E. (5 September 2014). PCG: A Family of Simple Fast Space–Efficient Statistically Good Algorithms for Random Number Generation (PDF) (Technical report). Harvey Mudd College. pp. 6–7. HMC-CS-2014-0905.
  24. Marsaglia, George (September 1968). "Random Numbers Fall Mainly in the Planes" (PDF). PNAS. 61 (1): 25–28. Bibcode:1968PNAS...61...25M. doi:10.1073/pnas.61.1.25.
  25. Tezuka, Shu; L’Ecuyer, Pierre (October 1993). On the Lattice Structure of the Add–with–Carry and Subtract–with–Borrow Random Number Generators (PDF). Workshop on Stochastic Numerics. Kyoto University.
  26. Tezuka, Shi; L'Ecuyer, Pierre (December 1992). Analysis of Add–with–Carry and Subtract–with–Borrow Generators (PDF). Proceedings of the 1992 Winter Simulation Conference. pp. 443–447.
  27. Matsumoto, Makoto; Nishimura, Takuji (January 1998). "Mersenne twister: a 623–dimensionally equidistributed uniform pseudo–random number generator" (PDF). ACM Transactions on Modeling and Computer Simulation. 8 (1): 3–30. doi:10.1145/272991.272995.
  28. Gershenfeld, Neil (1999). "Section 5.3.2: Linear Feedback". The Nature of Mathematical Modeling (First ed.). Cambridge University Press. p. 59. ISBN 0-521-57095-6.
  29. 1 2 RFC 4086 section 6.1.3 "Traditional Pseudo–random Sequences"

References

  • S.K. Park and K.W. Miller (1988). "Random Number Generators: Good Ones Are Hard To Find". Communications of the ACM. 31 (10): 1192–1201. doi:10.1145/63039.63042.
  • D. E. Knuth. The Art of Computer Programming, Volume 2: Seminumerical Algorithms, Third Edition. Addison–Wesley, 1997. ISBN 0-201-89684-2. Section 3.2.1: The Linear Congruential Method, pp. 10—26.
  • P. L'Ecuyer (1999). "Tables of Linear Congruential Generators of Different Sizes and Good Lattice Structure". Mathematics of Computation. 68 (225): 249–260. doi:10.1090/S0025-5718-99-00996-5.
  • Press, WH; Teukolsky, SA; Vetterling, WT; Flannery, BP (2007), "Section 7.1.1. Some History", Numerical Recipes: The Art of Scientific Computing (3rd ed.), New York: Cambridge University Press, ISBN 978-0-521-88068-8
  • Gentle, James E., (2003). Random Number Generation and Monte Carlo Methods, 2nd edition, Springer, ISBN 0-387-00178-6.
  • Joan Boyar (1989). "Inferring sequences produced by pseudo–random number generators". Journal of the ACM. 36 (1): 129–141. doi:10.1145/58562.59305. (in this paper, efficient algorithms are given for inferring sequences produced by certain pseudo–random number generators).
This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.