Middle-square method

Last updated
One iteration of the middle-square method, showing a 6-digit seed, which is then squared, and the resulting value has its middle 6 digits as the output value (and also as the next seed for the sequence). Middle-square method.svg
One iteration of the middle-square method, showing a 6-digit seed, which is then squared, and the resulting value has its middle 6 digits as the output value (and also as the next seed for the sequence).
Directed graph of all 100 2-digit pseudorandom numbers obtained using the middle-square method with n = 2. Middle square method 2 digits.svg
Directed graph of all 100 2-digit pseudorandom numbers obtained using the middle-square method with n = 2.

In mathematics and computer science, the middle-square method is a method of generating pseudorandom numbers. In practice it is a highly flawed method for many practical purposes, since its period is usually very short and it has some severe weaknesses; repeated enough times, the middle-square method will either begin repeatedly generating the same number or cycle to a previous number in the sequence and loop indefinitely.

Contents

History

In mathematics

The method was invented by John von Neumann, and was described by him at a conference in 1949. [1]

In the 1949 talk, Von Neumann quipped that "Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin." What he meant, he elaborated, was that there were no true "random numbers", just means to produce them, and "a strict arithmetic procedure", like the middle-square method, "is not such a method". Nevertheless, he found these methods hundreds of times faster than reading "truly" random numbers off punch cards, which had practical importance for his ENIAC work. He found the "destruction" of middle-square sequences to be a factor in their favor, because it could be easily detected: "one always fears the appearance of undetected short cycles". [1] Nicholas Metropolis reported sequences of 750,000 digits before "destruction" by means of using 38-bit numbers with the "middle-square" method. [2]

The book The Broken Dice by Ivar Ekeland gives an extended account of how the method was invented by a Franciscan friar known only as Brother Edvin sometime between 1240 and 1250. [3] Supposedly, the manuscript is now lost, but Jorge Luis Borges sent Ekeland a copy that he made at the Vatican Library.

Modifying the middle-square algorithm with a Weyl sequence improves period and randomness. [4] [5]

The method

To generate a sequence of n-digit pseudorandom numbers, an n-digit starting value is created and squared, producing a 2n-digit number. If the result has fewer than 2n digits, leading zeroes are added to compensate. The middle n digits of the result would be the next number in the sequence and returned as the result. This process is then repeated to generate more numbers.

The value of n must be even in order for the method to work  if the value of n is odd, then there will not necessarily be a uniquely defined "middle n-digits" to select from. Consider the following: If a 3-digit number is squared, it can yield a 6-digit number (e.g. 5402 = 291600). If there were to be middle 3 digits, that would leave 6 − 3 = 3 digits to be distributed to the left and right of the middle. It is impossible to evenly distribute these digits equally on both sides of the middle number, and therefore there are no "middle digits". It is acceptable to pad the seeds with zeros to the left in order to create an even valued n-digit number (e.g. 540  0540).

For a generator of n-digit numbers, the period can be no longer than 8n. If the middle n digits are all zeroes, the generator then outputs zeroes forever. If the first half of a number in the sequence is zeroes, the subsequent numbers will be decreasing to zero. While these runs of zero are easy to detect, they occur too frequently for this method to be of practical use. The middle-squared method can also get stuck on a number other than zero. For n = 4, this occurs with the values 0100, 2500, 3792, and 7600. Other seed values form very short repeating cycles, e.g., 0540 → 2916 → 5030 → 3009. These phenomena are even more obvious when n = 2, as none of the 100 possible seeds generates more than 14 iterations without reverting to 0, 10, 50, 60, or a 24 ↔ 57 loop.

Example implementation

Here, the algorithm is rendered in Python 3.11.

seed_number=int(input("Please enter a four-digit number:\n[####] "))number=seed_numberalready_seen=set()counter=0whilenumbernotinalready_seen:counter+=1already_seen.add(number)number=int(str(number*number).zfill(8)[2:6])# zfill adds padding of zeroesprint(f"#{counter}: {number}")print(f"We began with {seed_number} and"f" have repeated ourselves after {counter} steps"f" with {number}.")

Related Research Articles

A pseudorandom sequence of numbers is one that appears to be statistically random, despite having been produced by a completely deterministic and repeatable process. Simply put, the problem is that many of the sources of randomness available to humans rely on physical processes not readily available to computer programs.

A pseudorandom number generator (PRNG), also known as a deterministic random bit generator (DRBG), is an algorithm for generating a sequence of numbers whose properties approximate the properties of sequences of random numbers. The PRNG-generated sequence is not truly random, because it is completely determined by an initial value, called the PRNG's seed. Although sequences that are closer to truly random can be generated using hardware random number generators, pseudorandom number generators are important in practice for their speed in number generation and their reproducibility.

<span class="mw-page-title-main">Linear congruential generator</span> Algorithm for generating pseudo-randomized numbers

A linear congruential generator (LCG) is an algorithm that yields a sequence of pseudo-randomized numbers calculated with a discontinuous piecewise linear equation. The method represents one of the oldest and best-known pseudorandom number generator algorithms. The theory behind them is relatively easy to understand, and they are easily implemented and fast, especially on computer hardware which can provide modular arithmetic by storage-bit truncation.

The Mersenne Twister is a general-purpose pseudorandom number generator (PRNG) developed in 1997 by Makoto Matsumoto and Takuji Nishimura. Its name derives from the fact that its period length is chosen to be a Mersenne prime.

In computing, a linear-feedback shift register (LFSR) is a shift register whose input bit is a linear function of its previous state.

<span class="mw-page-title-main">Hardware random number generator</span> Cryptographic device

In computing, a hardware random number generator (HRNG), true random number generator (TRNG), non-deterministic random bit generator (NRBG), or physical random number generator is a device that generates random numbers from a physical process capable of producing entropy, unlike the pseudorandom number generator that utilizes a deterministic algorithm and non-physical nondeterministic random bit generators that do not include hardware dedicated to generation of entropy.

A cryptographically secure pseudorandom number generator (CSPRNG) or cryptographic pseudorandom number generator (CPRNG) is a pseudorandom number generator (PRNG) with properties that make it suitable for use in cryptography. It is also loosely known as a cryptographic random number generator (CRNG).

In mathematics, a low-discrepancy sequence is a sequence with the property that for all values of N, its subsequence x1, ..., xN has a low discrepancy.

The security of cryptographic systems depends on some secret data that is known to authorized persons but unknown and unpredictable to others. To achieve this unpredictability, some randomization is typically employed. Modern cryptographic protocols often require frequent generation of random quantities. Cryptographic attacks that subvert or exploit weaknesses in this process are known as random number generator attacks.

A random permutation is a random ordering of a set of objects, that is, a permutation-valued random variable. The use of random permutations is often fundamental to fields that use randomized algorithms such as coding theory, cryptography, and simulation. A good example of a random permutation is the shuffling of a deck of cards: this is ideally a random permutation of the 52 cards.

Inversive congruential generators are a type of nonlinear congruential pseudorandom number generator, which use the modular multiplicative inverse to generate the next number in a sequence. The standard formula for an inversive congruential generator, modulo some prime q is:

A random seed is a number used to initialize a pseudorandom number generator.

A numeric sequence is said to be statistically random when it contains no recognizable patterns or regularities; sequences such as the results of an ideal dice roll or the digits of π exhibit statistical randomness.

<span class="mw-page-title-main">Random number generation</span> Producing a sequence that cannot be predicted better than by random chance

Random number generation is a process by which, often by means of a random number generator (RNG), a sequence of numbers or symbols that cannot be reasonably predicted better than by random chance is generated. This means that the particular outcome sequence will contain some patterns detectable in hindsight but impossible to foresee. True random number generators can be hardware random-number generators (HRNGs), wherein each generation is a function of the current value of a physical environment's attribute that is constantly changing in a manner that is practically impossible to model. This would be in contrast to so-called "random number generations" done by pseudorandom number generators (PRNGs), which generate numbers that only look random but are in fact pre-determined—these generations can be reproduced simply by knowing the state of the PRNG.

<span class="mw-page-title-main">Fisher–Yates shuffle</span> Algorithm for generating a random permutation of a finite set

The Fisher–Yates shuffle is an algorithm for shuffling a finite sequence. The algorithm takes a list of all the elements of the sequence, and continually determines the next element in the shuffled sequence by randomly drawing an element from the list until no elements remain. The algorithm produces an unbiased permutation: every permutation is equally likely. The modern version of the algorithm takes time proportional to the number of items being shuffled and shuffles them in place.

<span class="mw-page-title-main">Randomness</span> Apparent lack of pattern or predictability in events

In common usage, randomness is the apparent or actual lack of definite pattern or predictability in information. A random sequence of events, symbols or steps often has no order and does not follow an intelligible pattern or combination. Individual random events are, by definition, unpredictable, but if there is a known probability distribution, the frequency of different outcomes over repeated events is predictable. For example, when throwing two dice, the outcome of any particular roll is unpredictable, but a sum of 7 will tend to occur twice as often as 4. In this view, randomness is not haphazardness; it is a measure of uncertainty of an outcome. Randomness applies to concepts of chance, probability, and information entropy.

Non-uniform random variate generation or pseudo-random number sampling is the numerical practice of generating pseudo-random numbers (PRN) that follow a given probability distribution. Methods are typically based on the availability of a uniformly distributed PRN generator. Computational algorithms are then used to manipulate a single random variate, X, or often several such variates, into a new random variate Y such that these values have the required distribution. The first methods were developed for Monte-Carlo simulations in the Manhattan project, published by John von Neumann in the early 1950s.

Wichmann–Hill is a pseudorandom number generator proposed in 1982 by Brian Wichmann and David Hill. It consists of three linear congruential generators with different prime moduli, each of which is used to produce a uniformly distributed number between 0 and 1. These are summed, modulo 1, to produce the result.

References

  1. 1 2 The 1949 papers were not reprinted until 1951. John von Neumann, “Various techniques used in connection with random digits”, in A. S. Householder, G. E. Forsythe, and H. H. Germond, eds., Monte Carlo Method, National Bureau of Standards Applied Mathematics Series, vol. 12 (Washington, D.C.: U.S. Government Printing Office, 1951): pp. 36–38.
  2. Donald E. Knuth, The art of computer programming, Vol. 2, Seminumerical algorithms, 2nd edn. (Reading, Mass.: Addison-Wesley, 1981), ch. 3, section 3.1.
  3. Ivar Ekeland (15 June 1996). The Broken Dice, and Other Mathematical Tales of Chance. University of Chicago Press. ISBN   978-0-226-19992-4.
  4. Kneusel, Ron (2018). Random Numbers and Computers (1 ed.). Springer. pp. 13–14.
  5. Widynski, Bernard (April 2017). "Middle-Square Weyl Sequence RNG". arXiv: 1704.00358 [cs.CR].

See also