Middle-square method

Last updated November 01, 2024

In mathematics and computer science, the middle-square method is a method of generating pseudorandom numbers. In practice it is a highly flawed method for many practical purposes, since its period is usually very short and it has some severe weaknesses; repeated enough times, the middle-square method will either begin repeatedly generating the same number or cycle to a previous number in the sequence and loop indefinitely.

History

In mathematics

The method was invented by John von Neumann, and was described by him at a conference in 1949.^[1]

In the 1949 talk, Von Neumann quipped that "Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin." What he meant, he elaborated, was that there were no true "random numbers", just means to produce them, and "a strict arithmetic procedure", like the middle-square method, "is not such a method". Nevertheless, he found these methods hundreds of times faster than reading "truly" random numbers off punch cards, which had practical importance for his ENIAC work. He found the "destruction" of middle-square sequences to be a factor in their favor, because it could be easily detected: "one always fears the appearance of undetected short cycles".^[1] Nicholas Metropolis reported sequences of 750,000 digits before "destruction" by means of using 38-bit numbers with the "middle-square" method.^[2]

The book The Broken Dice by Ivar Ekeland gives an extended account of how the method was invented by a Franciscan friar known only as Brother Edvin sometime between 1240 and 1250.^[3] Supposedly, the manuscript is now lost, but Jorge Luis Borges sent Ekeland a copy that he made at the Vatican Library.

Modifying the middle-square algorithm with a Weyl sequence improves period and randomness.^[4]^[5]

The method

To generate a sequence of n-digit pseudorandom numbers, an n-digit starting value is created and squared, producing a 2n-digit number. If the result has fewer than 2n digits, leading zeroes are added to compensate. The middle n digits of the result would be the next number in the sequence and returned as the result. This process is then repeated to generate more numbers.

The value of n must be even in order for the method to work – if the value of n is odd, then there will not necessarily be a uniquely defined "middle n-digits" to select from. Consider the following: If a 3-digit number is squared, it can yield a 6-digit number (e.g. 540² = 291600). If there were to be middle 3 digits, that would leave 6 − 3 = 3 digits to be distributed to the left and right of the middle. It is impossible to evenly distribute these digits equally on both sides of the middle number, and therefore there are no "middle digits". It is acceptable to pad the seeds with zeros to the left in order to create an even valued n-digit number (e.g. 540 → 0540).

For a generator of n-digit numbers, the period can be no longer than 8ⁿ. If the middle n digits are all zeroes, the generator then outputs zeroes forever. If the first half of a number in the sequence is zeroes, the subsequent numbers will be decreasing to zero. While these runs of zero are easy to detect, they occur too frequently for this method to be of practical use. The middle-squared method can also get stuck on a number other than zero. For n = 4, this occurs with the values 0100, 2500, 3792, and 7600. Other seed values form very short repeating cycles, e.g., 0540 → 2916 → 5030 → 3009. These phenomena are even more obvious when n = 2, as none of the 100 possible seeds generates more than 14 iterations without reverting to 0, 10, 50, 60, or a 24 ↔ 57 loop.

Example implementation

Here, the algorithm is rendered in Python 3.12.

seed_number=int(input("Please enter a four-digit number:\n[####] "))number=seed_numberalready_seen=set()counter=0whilenumbernotinalready_seen:counter+=1already_seen.add(number)number=int(str(number*number).zfill(8)[2:6])# zfill adds padding of zeroesprint(f"#{counter}: {number}")print(f"We began with {seed_number} and"f" have repeated ourselves after {counter} steps"f" with {number}.")

Related Research Articles

A pseudorandom sequence of numbers is one that appears to be statistically random, despite having been produced by a completely deterministic and repeatable process. Pseudorandom number generators are often used in computer programming, as traditional sources of randomness available to humans rely on physical processes not readily available to computer programs, although developments in hardware random number generator technology have challenged this.

A pseudorandom number generator (PRNG), also known as a deterministic random bit generator (DRBG), is an algorithm for generating a sequence of numbers whose properties approximate the properties of sequences of random numbers. The PRNG-generated sequence is not truly random, because it is completely determined by an initial value, called the PRNG's seed. Although sequences that are closer to truly random can be generated using hardware random number generators, pseudorandom number generators are important in practice for their speed in number generation and their reproducibility.

A linear congruential generator (LCG) is an algorithm that yields a sequence of pseudo-randomized numbers calculated with a discontinuous piecewise linear equation. The method represents one of the oldest and best-known pseudorandom number generator algorithms. The theory behind them is relatively easy to understand, and they are easily implemented and fast, especially on computer hardware which can provide modular arithmetic by storage-bit truncation.

The Mersenne Twister is a general-purpose pseudorandom number generator (PRNG) developed in 1997 by Makoto Matsumoto and Takuji Nishimura. Its name derives from the choice of a Mersenne prime as its period length.

In computing, a linear-feedback shift register (LFSR) is a shift register whose input bit is a linear function of its previous state.

In computing, a hardware random number generator (HRNG), true random number generator (TRNG), non-deterministic random bit generator (NRBG), or physical random number generator is a device that generates random numbers from a physical process capable of producing entropy, unlike the pseudorandom number generator that utilizes a deterministic algorithm and non-physical nondeterministic random bit generators that do not include hardware dedicated to generation of entropy.

A cryptographically secure pseudorandom number generator (CSPRNG) or cryptographic pseudorandom number generator (CPRNG) is a pseudorandom number generator (PRNG) with properties that make it suitable for use in cryptography. It is also referred to as a cryptographic random number generator (CRNG).

The security of cryptographic systems depends on some secret data that is known to authorized persons but unknown and unpredictable to others. To achieve this unpredictability, some randomization is typically employed. Modern cryptographic protocols often require frequent generation of random quantities. Cryptographic attacks that subvert or exploit weaknesses in this process are known as random number generator attacks.

A random permutation is a random ordering of a set of objects, that is, a permutation-valued random variable. The use of random permutations is common in games of chance and in randomized algorithms in coding theory, cryptography, and simulation. A good example of a random permutation is the fair shuffling of a standard deck of cards: this is ideally a random permutation of the 52 cards.

Inversive congruential generators are a type of nonlinear congruential pseudorandom number generator, which use the modular multiplicative inverse to generate the next number in a sequence. The standard formula for an inversive congruential generator, modulo some prime q is:

A random seed is a number used to initialize a pseudorandom number generator.

A numeric sequence is said to be statistically random when it contains no recognizable patterns or regularities; sequences such as the results of an ideal dice roll or the digits of π exhibit statistical randomness.

Random number generation is a process by which, often by means of a random number generator (RNG), a sequence of numbers or symbols that cannot be reasonably predicted better than by random chance is generated. This means that the particular outcome sequence will contain some patterns detectable in hindsight but impossible to foresee. True random number generators can be hardware random-number generators (HRNGs), wherein each generation is a function of the current value of a physical environment's attribute that is constantly changing in a manner that is practically impossible to model. This would be in contrast to so-called "random number generations" done by pseudorandom number generators (PRNGs), which generate numbers that only look random but are in fact predetermined—these generations can be reproduced simply by knowing the state of the PRNG.

The Fisher–Yates shuffle is an algorithm for shuffling a finite sequence. The algorithm takes a list of all the elements of the sequence, and continually determines the next element in the shuffled sequence by randomly drawing an element from the list until no elements remain. The algorithm produces an unbiased permutation: every permutation is equally likely. The modern version of the algorithm takes time proportional to the number of items being shuffled and shuffles them in place.

<span class="mw-page-title-main">Xorshift</span> Class of pseudorandom number generators

Xorshift random number generators, also called shift-register generators, are a class of pseudorandom number generators that were invented by George Marsaglia. They are a subset of linear-feedback shift registers (LFSRs) which allow a particularly efficient implementation in software without the excessive use of sparse polynomials. They generate the next number in their sequence by repeatedly taking the exclusive or of a number with a bit-shifted version of itself. This makes execution extremely efficient on modern computer architectures, but it does not benefit efficiency in a hardware implementation. Like all LFSRs, the parameters have to be chosen very carefully in order to achieve a long period.

In common usage, randomness is the apparent or actual lack of definite pattern or predictability in information. A random sequence of events, symbols or steps often has no order and does not follow an intelligible pattern or combination. Individual random events are, by definition, unpredictable, but if there is a known probability distribution, the frequency of different outcomes over repeated events is predictable. For example, when throwing two dice, the outcome of any particular roll is unpredictable, but a sum of 7 will tend to occur twice as often as 4. In this view, randomness is not haphazardness; it is a measure of uncertainty of an outcome. Randomness applies to concepts of chance, probability, and information entropy.

Non-uniform random variate generation or pseudo-random number sampling is the numerical practice of generating pseudo-random numbers (PRN) that follow a given probability distribution. Methods are typically based on the availability of a uniformly distributed PRN generator. Computational algorithms are then used to manipulate a single random variate, X, or often several such variates, into a new random variate Y such that these values have the required distribution. The first methods were developed for Monte-Carlo simulations in the Manhattan project, published by John von Neumann in the early 1950s.

Wichmann–Hill is a pseudorandom number generator proposed in 1982 by Brian Wichmann and David Hill. It consists of three linear congruential generators with different prime moduli, each of which is used to produce a uniformly distributed number between 0 and 1. These are summed, modulo 1, to produce the result.

References

1 2 The 1949 papers were not reprinted until 1951. John von Neumann, “Various techniques used in connection with random digits”, in A. S. Householder, G. E. Forsythe, and H. H. Germond, eds., Monte Carlo Method, National Bureau of Standards Applied Mathematics Series, vol. 12 (Washington, D.C.: U.S. Government Printing Office, 1951): pp. 36–38.
↑ Donald E. Knuth, The art of computer programming, Vol. 2, Seminumerical algorithms, 2nd edn. (Reading, Mass.: Addison-Wesley, 1981), ch. 3, section 3.1.
↑ Ivar Ekeland (15 June 1996). The Broken Dice, and Other Mathematical Tales of Chance. University of Chicago Press. ISBN 978-0-226-19992-4.
↑ Kneusel, Ron (2018). Random Numbers and Computers (1 ed.). Springer. pp. 13–14.
↑ Widynski, Bernard (April 2017). "Middle-Square Weyl Sequence RNG". arXiv: 1704.00358 [cs.CR].

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[vonneumann-1] 1 2 The 1949 papers were not reprinted until 1951. John von Neumann, “Various techniques used in connection with random digits”, in A. S. Householder, G. E. Forsythe, and H. H. Germond, eds., Monte Carlo Method, National Bureau of Standards Applied Mathematics Series, vol. 12 (Washington, D.C.: U.S. Government Printing Office, 1951): pp. 36–38.

[2] Donald E. Knuth, The art of computer programming, Vol. 2, Seminumerical algorithms, 2nd edn. (Reading, Mass.: Addison-Wesley, 1981), ch. 3, section 3.1.

[Ekeland1996-3] Ivar Ekeland (15 June 1996). The Broken Dice, and Other Mathematical Tales of Chance. University of Chicago Press. ISBN 978-0-226-19992-4.

[4] Kneusel, Ron (2018). Random Numbers and Computers (1 ed.). Springer. pp. 13–14.

[5] Widynski, Bernard (April 2017). "Middle-Square Weyl Sequence RNG". arXiv: 1704.00358 [cs.CR].

[1]

[2]

[3]

[4]

[5]