Infinite monkey theorem


A chimpanzee probably not writing Hamlet

The infinite monkey theorem states that a monkey hitting keys at random on a typewriter keyboard for an infinite amount of time will almost surely type any given text, including the complete works of William Shakespeare. In fact, the monkey would almost surely type every possible finite text an infinite number of times. The theorem can be generalized to state that any sequence of events that has a non-zero probability of happening will almost certainly occur an infinite number of times, given an infinite amount of time or a universe that is infinite in size.


In this context, "almost surely" is a mathematical term meaning the event happens with probability 1, and the "monkey" is not an actual monkey, but a metaphor for an abstract device that produces an endless random sequence of letters and symbols. Variants of the theorem include multiple and even infinitely many typists, and the target text varies between an entire library and a single sentence.

One of the earliest instances of the use of the "monkey metaphor" is that of French mathematician Émile Borel in 1913, [1] but the first instance may have been even earlier. Jorge Luis Borges traced the history of this idea from Aristotle's On Generation and Corruption and Cicero's De Natura Deorum (On the Nature of the Gods), through Blaise Pascal and Jonathan Swift, up to modern statements with their iconic simians and typewriters. [2] In the early 20th century, Borel and Arthur Eddington used the theorem to illustrate the timescales implicit in the foundations of statistical mechanics.

Solution

Direct proof

There is a straightforward proof of this theorem. As an introduction, recall that if two events are statistically independent, then the probability of both happening equals the product of the probabilities of each one happening independently. For example, if the chance of rain in Moscow on a particular day in the future is 0.4 and the chance of an earthquake in San Francisco on any particular day is 0.00003, then the chance of both happening on the same day is 0.4 × 0.00003 = 0.000012, assuming that they are indeed independent.

Consider the probability of typing the word banana on a typewriter with 50 keys. Suppose that the keys are pressed randomly and independently, meaning that each key has an equal chance of being pressed regardless of what keys had been pressed previously. The chance that the first letter typed is 'b' is 1/50, and the chance that the second letter typed is 'a' is also 1/50, and so on. Therefore, the probability of the first six letters spelling banana is:

(1/50) × (1/50) × (1/50) × (1/50) × (1/50) × (1/50) = (1/50)^6 = 1/15,625,000,000.

The result is less than one in 15 billion, but not zero.

From the above, the chance of not typing banana in a given block of 6 letters is 1 − (1/50)^6. Because each block is typed independently, the chance X_n of not typing banana in any of the first n blocks of 6 letters is:

X_n = (1 − (1/50)^6)^n

As n grows, X_n gets smaller. For n = 1 million, X_n is roughly 0.9999, but for n = 10 billion X_n is roughly 0.53 and for n = 100 billion it is roughly 0.0017. As n approaches infinity, the probability X_n approaches zero; that is, by making n large enough, X_n can be made as small as is desired, [3] and the chance of typing banana approaches 100%. [lower-alpha 1] Thus, the probability of the word banana appearing at some point in an infinite sequence of keystrokes is equal to one.
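
The figures quoted above can be checked numerically. The following is a minimal sketch, assuming the 50-key typewriter and six-letter blocks from the text; the function name `prob_no_match` is chosen here for illustration only.

```python
import math

KEYS = 50          # keys on the typewriter, as in the text
WORD = "banana"

# Probability that one block of six keystrokes spells the word.
p_block = (1 / KEYS) ** len(WORD)

def prob_no_match(n_blocks: int) -> float:
    """X_n = (1 - p)^n, evaluated through log1p to limit rounding error."""
    return math.exp(n_blocks * math.log1p(-p_block))

for n in (10**6, 10**10, 10**11):
    print(f"n = {n:.0e}: X_n ≈ {prob_no_match(n):.4f}")
```

Working with `log1p` rather than raising a number very close to 1 to a huge power is a design choice for numerical stability, not part of the proof itself.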

The same argument applies if we replace one monkey typing n consecutive blocks of text with n monkeys each typing one block (simultaneously and independently). In this case, X_n = (1 − (1/50)^6)^n is the probability that none of the first n monkeys types banana correctly on their first try. Therefore, at least one of infinitely many monkeys will (with probability equal to one) produce a text as quickly as it would be produced by a perfectly accurate human typist copying it from the original.

Infinite strings

This can be stated more generally and compactly in terms of strings, which are sequences of characters chosen from some finite alphabet:

  • Given an infinite string where each character is chosen uniformly at random, any given finite string almost surely occurs as a substring at some position.
  • Given an infinite sequence of infinite strings, where each character of each string is chosen uniformly at random, any given finite string almost surely occurs as a prefix of one of these strings.

Both follow easily from the second Borel–Cantelli lemma. For the second theorem, let E_k be the event that the kth string begins with the given text. Because this has some fixed nonzero probability p of occurring, the E_k are independent, and the sum below diverges,

\sum_{k=1}^{\infty} P(E_k) = \sum_{k=1}^{\infty} p = \infty,

the probability that infinitely many of the E_k occur is 1. The first theorem is shown similarly; one can divide the random string into nonoverlapping blocks matching the size of the desired text and make E_k the event where the kth block equals the desired string. [lower-alpha 2]
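
The first statement can be illustrated, though of course not proved, by simulation. The sketch below assumes a two-letter alphabet and the short target "abba" so that runs finish quickly; the helper `first_occurrence` and the cap `max_keys` are illustrative choices, not part of the theorem.

```python
import random
from typing import Optional

def first_occurrence(target: str, alphabet: str, rng: random.Random,
                     max_keys: int = 1_000_000) -> Optional[int]:
    """Type random keys until `target` appears; return keystrokes used, or None."""
    window = ""
    for typed in range(1, max_keys + 1):
        # Keep only the most recent len(target) characters.
        window = (window + rng.choice(alphabet))[-len(target):]
        if window == target:
            return typed
    return None

rng = random.Random(0)  # fixed seed so the run is repeatable
trials = [first_occurrence("abba", "ab", rng) for _ in range(1000)]
hits = [t for t in trials if t is not None]
print(f"{len(hits)} of {len(trials)} runs typed the target")
print(f"mean keystrokes: {sum(hits) / len(hits):.1f}")
```

Every run is expected to reach the target long before the cap; a longer target or a larger alphabet makes the waiting time grow exponentially, which is the point of the Probabilities section.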

Probabilities

However, for physically meaningful numbers of monkeys typing for physically meaningful lengths of time the results are reversed. If there were as many monkeys as there are atoms in the observable universe typing extremely fast for trillions of times the life of the universe, the probability of the monkeys replicating even a single page of Shakespeare is unfathomably small.

Ignoring punctuation, spacing, and capitalization, a monkey typing letters uniformly at random has a chance of one in 26 of correctly typing the first letter of Hamlet. It has a chance of one in 676 (26 × 26) of typing the first two letters. Because the probability shrinks exponentially, at 20 letters it already has only a chance of one in 26^{20} = 19,928,148,895,209,409,152,340,197,376 [lower-alpha 3] (almost 2 × 10^{28}). In the case of the entire text of Hamlet, the probabilities are so vanishingly small as to be inconceivable. The text of Hamlet contains approximately 130,000 letters. [lower-alpha 4] Thus, there is a probability of one in 3.4 × 10^{183,946} to get the text right at the first trial. The average number of letters that needs to be typed until the text appears is also 3.4 × 10^{183,946}, [lower-alpha 5] or including punctuation, 4.4 × 10^{360,783}. [lower-alpha 6]
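
Numbers of this size cannot be held in floating point, but their order of magnitude follows from logarithms. The sketch below assumes the round figure of 130,000 letters used in the text; `sci_notation` is a helper name introduced here for illustration.

```python
import math

def sci_notation(base: int, exponent: int) -> tuple[float, int]:
    """Return (mantissa, power) such that base**exponent ≈ mantissa × 10**power."""
    log10_value = exponent * math.log10(base)
    power = math.floor(log10_value)
    return 10 ** (log10_value - power), power

# Small cases can be computed exactly with Python's arbitrary-precision integers.
print(26 ** 20)

# The full-text case is only tractable through its logarithm.
mantissa, power = sci_notation(26, 130_000)
print(f"26^130000 ≈ {mantissa:.1f} × 10^{power}")
```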

Even if every proton in the observable universe (which is estimated at roughly 10^{80}) were a monkey with a typewriter, typing from the Big Bang until the end of the universe (when protons might no longer exist), they would still need a far greater amount of time – more than three hundred and sixty thousand orders of magnitude longer – to have even a 1 in 10^{500} chance of success. To put it another way, for a one in a trillion chance of success, there would need to be 10^{360,641} observable universes made of protonic monkeys. [lower-alpha 7] As Kittel and Kroemer put it in their textbook on thermodynamics, the field whose statistical foundations motivated the first known expositions of typing monkeys, [5] "The probability of Hamlet is therefore zero in any operational sense of an event ...", and the statement that the monkeys must eventually succeed "gives a misleading conclusion about very, very large numbers."

In fact, there is less than a one in a trillion chance that such a universe made of monkeys could type any particular document a mere 79 characters long. [lower-alpha 8]

Almost surely

The probability that an infinite randomly generated string of text will contain a particular finite substring is 1. However, this does not mean the substring's absence is "impossible", despite the absence having a prior probability of 0. For example, the immortal monkey could randomly type G as its first letter, G as its second, and G as every single letter, thereafter, producing an infinite string of Gs; at no point must the monkey be "compelled" to type anything else. (To assume otherwise implies the gambler's fallacy.) However long a randomly generated finite string is, there is a small but nonzero chance that it will turn out to consist of the same character repeated throughout; this chance approaches zero as the string's length approaches infinity. There is nothing special about such a monotonous sequence except that it is easy to describe; the same fact applies to any nameable specific sequence, such as "RGRGRG" repeated forever, or "a-b-aa-bb-aaa-bbb-...", or "Three, Six, Nine, Twelve…".

If the hypothetical monkey has a typewriter with 90 equally likely keys that include numerals and punctuation, then the first typed keys might be "3.14" (the first three digits of pi) with a probability of (1/90)^4, which is 1/65,610,000. Equally probable is any other string of four characters allowed by the typewriter, such as "GGGG", "mATh", or "q%8e". The probability that 100 randomly typed keys will consist of the first 99 digits of pi (including the separator key), or any other particular sequence of that length, is much lower: (1/90)^{100}. If the monkey's allotted length of text is infinite, the chance of typing only the digits of pi is 0, which is just as possible (mathematically probable) as typing nothing but Gs (also probability 0).

The same applies to the event of typing a particular version of Hamlet followed by endless copies of itself; or Hamlet immediately followed by all the digits of pi; these specific strings are equally infinite in length, they are not prohibited by the terms of the thought problem, and they each have a prior probability of 0. In fact, any particular infinite sequence the immortal monkey types will have had a prior probability of 0, even though the monkey must type something.

This is an extension of the principle that a finite string of random text has a lower and lower probability of being a particular string the longer it is (though all specific strings are equally unlikely). This probability approaches 0 as the string's length approaches infinity. Thus, the probability of the monkey typing an endlessly long string, such as all of the digits of pi in order, on a 90-key keyboard is (1/90)^∞, which equals 1/∞, which is essentially 0. At the same time, the probability that the sequence contains a particular subsequence (such as the word MONKEY, or the 12th through 999th digits of pi, or a version of the King James Bible) increases as the total string lengthens. This probability approaches 1 as the total string's length approaches infinity, and thus the original theorem is correct.

Correspondence between strings and numbers

In a simplification of the thought experiment, the monkey could have a typewriter with just two keys: 1 and 0. The infinitely long string thusly produced would correspond to the binary digits of a particular real number between 0 and 1. A countably infinite set of possible strings end in infinite repetitions, which means the corresponding real number is rational. Examples include the strings corresponding to one-third (010101...), five-sixths (11010101...) and five-eighths (1010000...). Only a subset of such real number strings (albeit a countably infinite subset) contains the entirety of Hamlet (assuming that the text is subjected to a numerical encoding, such as ASCII).
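
The three examples can be verified with exact rational arithmetic. This is a minimal sketch; `binary_to_fraction` is a name introduced here, and it assumes the string is given as a non-repeating prefix followed by a block that repeats forever.

```python
from fractions import Fraction

def binary_to_fraction(prefix: str, repeat: str = "") -> Fraction:
    """Value of 0.<prefix><repeat><repeat>... read as a binary expansion."""
    value = Fraction(int(prefix, 2), 2 ** len(prefix)) if prefix else Fraction(0)
    if repeat:
        # A block r of length k repeating forever contributes r / (2^k - 1),
        # shifted right past the non-repeating prefix.
        block = Fraction(int(repeat, 2), 2 ** len(repeat) - 1)
        value += block / 2 ** len(prefix)
    return value

print(binary_to_fraction("", "01"))     # 1/3
print(binary_to_fraction("11", "01"))   # 5/6
print(binary_to_fraction("101", "0"))   # 5/8
```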

Meanwhile, there is an uncountably infinite set of strings which do not end in such repetition; these correspond to the irrational numbers. These can be sorted into two uncountably infinite subsets: those which contain Hamlet and those which do not. However, the "largest" subset of all the real numbers is those which not only contain Hamlet, but which contain every other possible string of any length, and with equal distribution of such strings. These irrational numbers are called normal. Because almost all numbers are normal, almost all possible strings contain all possible finite substrings. Hence, the probability of the monkey typing a normal number is 1. The same principles apply regardless of the number of keys from which the monkey can choose; a 90-key keyboard can be seen as a generator of numbers written in base 90.

History

Statistical mechanics

One of the forms in which probabilists now know this theorem, with its "dactylographic" [i.e., typewriting] monkeys (French: singes dactylographes; the French word singe covers both monkeys and apes), appeared in Émile Borel's 1913 article "Mécanique Statique et Irréversibilité" (Static mechanics and irreversibility), [1] and in his book "Le Hasard" in 1914. [6] His "monkeys" are not actual monkeys; rather, they are a metaphor for an imaginary way to produce a large, random sequence of letters. Borel said that if a million monkeys typed ten hours a day, it was extremely unlikely that their output would exactly equal all the books of the richest libraries of the world; and yet, in comparison, it was even more unlikely that the laws of statistical mechanics would ever be violated, even briefly.

The physicist Arthur Eddington drew on Borel's image further in The Nature of the Physical World (1928), writing:

If I let my fingers wander idly over the keys of a typewriter it might happen that my screed made an intelligible sentence. If an army of monkeys were strumming on typewriters they might write all the books in the British Museum. The chance of their doing so is decidedly more favourable than the chance of the molecules returning to one half of the vessel. [7] [8]

These images invite the reader to consider the incredible improbability of a large but finite number of monkeys working for a large but finite amount of time producing a significant work and compare this with the even greater improbability of certain physical events. Any physical process that is even less likely than such monkeys' success is effectively impossible, and it may safely be said that such a process will never happen. [5] It is clear from the context that Eddington is not suggesting that the probability of this happening is worthy of serious consideration. On the contrary, it was a rhetorical illustration of the fact that below certain levels of probability, the term improbable is functionally equivalent to impossible.

Origins and "The Total Library"

In a 1939 essay entitled "The Total Library", Argentine writer Jorge Luis Borges traced the infinite-monkey concept back to Aristotle's Metaphysics. Explaining the views of Leucippus, who held that the world arose through the random combination of atoms, Aristotle notes that the atoms themselves are homogeneous and their possible arrangements only differ in shape, position and ordering. In On Generation and Corruption, the Greek philosopher compares this to the way that a tragedy and a comedy consist of the same "atoms", i.e., alphabetic characters. [9] Three centuries later, Cicero's De natura deorum (On the Nature of the Gods) argued against the Epicurean atomist worldview:

Is it possible for any man to behold these things, and yet imagine that certain solid and individual bodies move by their natural force and gravitation, and that a world so beautifully adorned was made by their fortuitous concourse? He who believes this may as well believe that if a great quantity of the one-and-twenty letters, composed either of gold or any other matter, were thrown upon the ground, they would fall into such order as legibly to form the Annals of Ennius. I doubt whether fortune could make a single verse of them. [10]

Borges follows the history of this argument through Blaise Pascal and Jonathan Swift, [11] then observes that in his own time, the vocabulary had changed. By 1939, the idiom was "that a half-dozen monkeys provided with typewriters would, in a few eternities, produce all the books in the British Museum." (To which Borges adds, "Strictly speaking, one immortal monkey would suffice.") Borges then imagines the contents of the Total Library which this enterprise would produce if carried to its fullest extreme:

Everything would be in its blind volumes. Everything: the detailed history of the future, Aeschylus' The Egyptians, the exact number of times that the waters of the Ganges have reflected the flight of a falcon, the secret and true name of Rome, the encyclopedia Novalis would have constructed, my dreams and half-dreams at dawn on August 14, 1934, the proof of Pierre Fermat's theorem, the unwritten chapters of Edwin Drood, those same chapters translated into the language spoken by the Garamantes, the paradoxes Berkeley invented concerning Time but didn't publish, Urizen's books of iron, the premature epiphanies of Stephen Dedalus, which would be meaningless before a cycle of a thousand years, the Gnostic Gospel of Basilides, the song the sirens sang, the complete catalog of the Library, the proof of the inaccuracy of that catalog. Everything: but for every sensible line or accurate fact there would be millions of meaningless cacophonies, verbal farragoes, and babblings. Everything: but all the generations of mankind could pass before the dizzying shelves – shelves that obliterate the day and on which chaos lies – ever reward them with a tolerable page. [12]

Borges' total library concept was the main theme of his widely read 1941 short story "The Library of Babel", which describes an unimaginably vast library consisting of interlocking hexagonal chambers, together containing every possible volume that could be composed from the letters of the alphabet and some punctuation characters.

Actual monkeys

In 2002, [13] lecturers and students from the University of Plymouth MediaLab Arts course used a £2,000 grant from the Arts Council to study the literary output of real monkeys. They left a computer keyboard in the enclosure of six Celebes crested macaques in Paignton Zoo in Devon, England from May 1 to June 22, with a radio link to broadcast the results on a website. [14]

Not only did the monkeys produce nothing but five total pages [15] largely consisting of the letter "S", [13] the lead male began striking the keyboard with a stone, and other monkeys followed by urinating and defecating on the machine. [16] Mike Phillips, director of the university's Institute of Digital Arts and Technology (i-DAT), said that the artist-funded project was primarily performance art, and they had learned "an awful lot" from it. He concluded that monkeys "are not random generators. They're more complex than that. ... They were quite interested in the screen, and they saw that when they typed a letter, something happened. There was a level of intention there." [14] [17]

Applications and criticisms

Evolution

Thomas Huxley is sometimes misattributed with proposing a variant of the theory in his debates with Samuel Wilberforce.

In his 1931 book The Mysterious Universe, Eddington's rival James Jeans attributed the monkey parable to a "Huxley", presumably meaning Thomas Henry Huxley. This attribution is incorrect. [18] Today, it is sometimes further reported that Huxley applied the example in a now-legendary debate over Charles Darwin's On the Origin of Species with the Anglican Bishop of Oxford, Samuel Wilberforce, held at a meeting of the British Association for the Advancement of Science at Oxford on 30 June 1860. This story suffers not only from a lack of evidence, but also from the fact that in 1860 the typewriter was not yet commercially available. [19]

Despite the original mix-up, monkey-and-typewriter arguments are now common in debates over evolution. As an example of Christian apologetics, Doug Powell argued that even if a monkey accidentally types the letters of Hamlet, it has failed to produce Hamlet because it lacked the intention to communicate. His parallel implication is that natural laws could not produce the information content in DNA. [20] A more common argument is represented by Reverend John F. MacArthur, who claimed that the genetic mutations necessary to produce a tapeworm from an amoeba are as unlikely as a monkey typing Hamlet's soliloquy, and hence the odds against the evolution of all life are impossible to overcome. [21]

Evolutionary biologist Richard Dawkins employs the typing monkey concept in his book The Blind Watchmaker to demonstrate the ability of natural selection to produce biological complexity out of random mutations. In a simulation experiment Dawkins has his weasel program produce the Hamlet phrase METHINKS IT IS LIKE A WEASEL, starting from a randomly typed parent, by "breeding" subsequent generations and always choosing the closest match from progeny that are copies of the parent, with random mutations. The chance of the target phrase appearing in a single step is extremely small, yet Dawkins showed that it could be produced rapidly (in about 40 generations) using cumulative selection of phrases. The random choices furnish raw material, while cumulative selection imparts information. As Dawkins acknowledges, however, the weasel program is an imperfect analogy for evolution, as "offspring" phrases were selected "according to the criterion of resemblance to a distant ideal target." In contrast, Dawkins affirms, evolution has no long-term plans and does not progress toward some distant goal (such as humans). The weasel program is instead meant to illustrate the difference between non-random cumulative selection, and random single-step selection. [22] In terms of the typing monkey analogy, this means that Romeo and Juliet could be produced relatively quickly if placed under the constraints of a nonrandom, Darwinian-type selection because the fitness function will tend to preserve in place any letters that happen to match the target text, improving each successive generation of typing monkeys.
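
The cumulative-selection idea described above can be sketched in a few lines. This is not Dawkins's original program: the brood size of 100, the per-character mutation rate of 0.05, the fixed seed, and the decision to keep the parent in the selection pool are all assumptions made here for illustration, and the number of generations needed depends on them.

```python
import random
import string

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = string.ascii_uppercase + " "

def score(candidate: str) -> int:
    """Number of positions that already match the target."""
    return sum(a == b for a, b in zip(candidate, TARGET))

def mutate(parent: str, rate: float, rng: random.Random) -> str:
    """Copy the parent, replacing each character with probability `rate`."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else ch
                   for ch in parent)

def weasel(offspring: int = 100, rate: float = 0.05, seed: int = 0,
           max_generations: int = 10_000) -> tuple[str, int]:
    rng = random.Random(seed)
    parent = "".join(rng.choice(ALPHABET) for _ in TARGET)
    for generation in range(max_generations):
        if parent == TARGET:
            return parent, generation
        brood = [mutate(parent, rate, rng) for _ in range(offspring)]
        # Assumption: the parent stays in the pool, so the score never drops.
        parent = max(brood + [parent], key=score)
    return parent, max_generations

phrase, generations = weasel()
print(f"reached {phrase!r} after {generations} generations")
```

Contrast this with single-step selection: a fresh random 28-character string matches the target with probability (1/27)^28, so an unselected monkey would essentially never succeed.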

A different avenue for exploring the analogy between evolution and an unconstrained monkey lies in the problem that the monkey types only one letter at a time, independently of the other letters. Hugh Petrie argues that a more sophisticated setup is required, in his case not for biological evolution but the evolution of ideas:

In order to get the proper analogy, we would have to equip the monkey with a more complex typewriter. It would have to include whole Elizabethan sentences and thoughts. It would have to include Elizabethan beliefs about human action patterns and the causes, Elizabethan morality and science, and linguistic patterns for expressing these. It would probably even have to include an account of the sorts of experiences which shaped Shakespeare's belief structure as a particular example of an Elizabethan. Then, perhaps, we might allow the monkey to play with such a typewriter and produce variants, but the impossibility of obtaining a Shakespearean play is no longer obvious. What is varied really does encapsulate a great deal of already-achieved knowledge. [23]

James W. Valentine, while admitting that the classic monkey's task is impossible, finds that there is a worthwhile analogy between written English and the metazoan genome in this other sense: both have "combinatorial, hierarchical structures" that greatly constrain the immense number of combinations at the alphabet level. [24]

Zipf's law

Zipf's law states that the frequency of a word is a power-law function of its frequency rank:

word frequency ∝ 1 / (word rank + a)^b

where a and b are real numbers. Assuming that a monkey is typing randomly, with fixed and nonzero probability of hitting each letter key or white space, then the text produced by the monkey follows Zipf's law. [25]
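
A small simulation gives a feel for this claim. The sketch below assumes a toy keyboard of four letters plus a space bar, all equally likely; the function name `monkey_words` and the sample size are illustrative. With equally likely keys the frequencies fall in steps (all words of the same length are about equally common), which only roughly traces a power law.

```python
import random
from collections import Counter

def monkey_words(n_keys: int, letters: str = "abcd", seed: int = 0) -> list[str]:
    """Type n_keys uniformly random keys (letters plus a space bar), split into words."""
    rng = random.Random(seed)
    keyboard = letters + " "
    text = "".join(rng.choice(keyboard) for _ in range(n_keys))
    return text.split()

counts = Counter(monkey_words(500_000))
ranked = counts.most_common()   # [(word, frequency), ...] in descending frequency
for rank in (1, 10, 100, 1000):
    word, freq = ranked[rank - 1]
    print(f"rank {rank:>4}: {word!r:>10} appears {freq} times")
```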

Literary theory

R. G. Collingwood argued in 1938 that art cannot be produced by accident, and wrote as a sarcastic aside to his critics,

... some ... have denied this proposition, pointing out that if a monkey played with a typewriter ... he would produce ... the complete text of Shakespeare. Any reader who has nothing to do can amuse himself by calculating how long it would take for the probability to be worth betting on. But the interest of the suggestion lies in the revelation of the mental state of a person who can identify the 'works' of Shakespeare with the series of letters printed on the pages of a book ... [26]

Nelson Goodman took the contrary position, illustrating his point along with Catherine Elgin by the example of Borges' "Pierre Menard, Author of the Quixote",

What Menard wrote is simply another inscription of the text. Any of us can do the same, as can printing presses and photocopiers. Indeed, we are told, if infinitely many monkeys ... one would eventually produce a replica of the text. That replica, we maintain, would be as much an instance of the work, Don Quixote, as Cervantes' manuscript, Menard's manuscript, and each copy of the book that ever has been or will be printed. [27]

In another writing, Goodman elaborates, "That the monkey may be supposed to have produced his copy randomly makes no difference. It is the same text, and it is open to all the same interpretations. ..." Gérard Genette dismisses Goodman's argument as begging the question. [28]

For Jorge J. E. Gracia, the question of the identity of texts leads to a different question, that of author. If a monkey is capable of typing Hamlet, despite having no intention of meaning and therefore disqualifying itself as an author, then it appears that texts do not require authors. Possible solutions include saying that whoever finds the text and identifies it as Hamlet is the author; or that Shakespeare is the author, the monkey his agent, and the finder merely a user of the text. These solutions have their own difficulties, in that the text appears to have a meaning separate from the other agents: What if the monkey operates before Shakespeare is born, or if Shakespeare is never born, or if no one ever finds the monkey's typescript? [29]

Random document generation

The theorem concerns a thought experiment which cannot be fully carried out in practice, since it is predicted to require prohibitive amounts of time and resources. Nonetheless, it has inspired efforts in finite random text generation.

One computer program run by Dan Oliver of Scottsdale, Arizona, according to an article in The New Yorker, came up with a result on 4 August 2004: After the group had worked for 42,162,500,000 billion billion monkey-years, one of the "monkeys" typed, "VALENTINE. Cease toIdor:eFLP0FRjWK78aXzVOwm)-‘;8.t" The first 19 letters of this sequence can be found in "The Two Gentlemen of Verona". Other teams have reproduced 18 characters from "Timon of Athens", 17 from "Troilus and Cressida", and 16 from "Richard II". [30]

A website entitled The Monkey Shakespeare Simulator, launched on 1 July 2003, contained a Java applet that simulated a large population of monkeys typing randomly, with the stated intention of seeing how long it takes the virtual monkeys to produce a complete Shakespearean play from beginning to end. For example, it produced this partial line from Henry IV, Part 2, reporting that it took "2,737,850 million billion billion billion monkey-years" to reach 24 matching characters:

RUMOUR. Open your ears; 9r"5j5&?OWTY Z0d

Due to processing power limitations, the program used a probabilistic model (by using a random number generator or RNG) instead of actually generating random text and comparing it to Shakespeare. When the simulator "detected a match" (that is, the RNG generated a certain value or a value within a certain range), the simulator simulated the match by generating matched text. [31]

Testing of random-number generators

Questions about the statistics describing how often an ideal monkey is expected to type certain strings translate into practical tests for random-number generators; these range from the simple to the "quite sophisticated". Computer-science professors George Marsaglia and Arif Zaman report that they used to call one such category of tests "overlapping m-tuple tests" in lectures, since they concern overlapping m-tuples of successive elements in a random sequence. But they found that calling them "monkey tests" helped to motivate the idea with students. They published a report on the class of tests and their results for various RNGs in 1993. [32]
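
The idea of looking at overlapping m-tuples can be sketched as follows. This is not a reproduction of Marsaglia and Zaman's tests: the four-symbol alphabet, the tuple length of 3, the sample size, and the helper names are all assumptions made here, and a real test would compare the counts against their expected statistical distribution, a step omitted in this sketch.

```python
import random
from collections import Counter

def overlapping_tuple_counts(symbols: list[int], m: int) -> Counter:
    """Count every overlapping run of m successive symbols."""
    return Counter(tuple(symbols[i:i + m]) for i in range(len(symbols) - m + 1))

def missing_words(symbols: list[int], alphabet_size: int, m: int) -> int:
    """How many of the alphabet_size**m possible m-letter words never appear."""
    return alphabet_size ** m - len(overlapping_tuple_counts(symbols, m))

rng = random.Random(0)
good = [rng.randrange(4) for _ in range(2_000)]
bad = [i % 4 for i in range(2_000)]   # a deliberately poor source: 0, 1, 2, 3, 0, 1, ...

print("missing 3-letter words, random source:", missing_words(good, 4, 3))
print("missing 3-letter words, cyclic source:", missing_words(bad, 4, 3))
```

The cyclic source only ever produces four distinct triples, so most of the 64 possible "words" never show up, whereas a sound generator should leave almost none missing at this sample size.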

The infinite monkey theorem and its associated imagery are considered a popular and proverbial illustration of the mathematics of probability, widely known to the general public because of its transmission through popular culture rather than through formal education. [lower-alpha 9] This is helped by the innate humor of the image of literal monkeys rattling away on a set of typewriters, which is a popular visual gag.

A quotation attributed [33] [34] to a 1996 speech by Robert Wilensky stated, "We've heard that a million monkeys at a million keyboards could produce the complete works of Shakespeare; now, thanks to the Internet, we know that is not true."

The enduring, widespread popularity of the theorem was noted in the introduction to a 2001 paper, "Monkeys, Typewriters and Networks: The Internet in the Light of the Theory of Accidental Excellence". [35] In 2002, an article in The Washington Post said, "Plenty of people have had fun with the famous notion that an infinite number of monkeys with an infinite number of typewriters and an infinite amount of time could eventually write the works of Shakespeare". [36] In 2003, the previously mentioned Arts Council funded experiment involving real monkeys and a computer keyboard received widespread press coverage. [13] In 2007, the theorem was listed by Wired magazine in a list of eight classic thought experiments. [37]

American playwright David Ives' short one-act play Words, Words, Words, from the collection All in the Timing, pokes fun at the concept of the infinite monkey theorem.

In 2015, Balanced Software released Monkey Typewriter on the Microsoft Store. [38] The software generates random text using the Infinite Monkey theorem string formula and searches the generated text for user-entered phrases. However, the software should not be considered a true-to-life representation of the theory; it is more a practical presentation of the theory than a scientific model of how to randomly generate text.


Notes

  1. This shows that the probability of typing "banana" in one of the predefined non-overlapping blocks of six letters tends to 1. In addition the word may appear across two blocks, so the estimate given is conservative.
  2. The first theorem is proven by a similar if more indirect route in Gut (2005). [4]
  3. Nearly 20 octillion
  4. Using the Hamlet text "from gutenberg.org", there are 132,680 alphabetical letters and 199,749 characters overall.
  5. For any required string of 130,000 letters from the set 'a'-'z', the average number of letters that needs to be typed until the string appears is (rounded) 3.4 × 10^{183,946}, except in the case that all letters of the required string are equal, in which case the value is about 4% more, 3.6 × 10^{183,946}. In that case failure to have the correct string starting from a particular position reduces by about 4% the probability of a correct string starting from the next position (i.e., for overlapping positions the events of having the correct string are not independent; in this case there is a positive correlation between the two successes, so the chance of success after a failure is smaller than the chance of success in general). The figure 3.4 × 10^{183,946} is derived from n = 26^{130,000} by taking the logarithm of both sides: log10(n) = 130,000 × log10(26) = 183,946.5352, therefore n = 10^{0.5352} × 10^{183,946} = 3.429 × 10^{183,946}.
  6. 26 letters × 2 for capitalisation, plus 12 punctuation characters = 64 keys; 64^{199,749}, evaluated via 199,749 × log10(64), gives 4.4 × 10^{360,783} (this is generous as it assumes capital letters are separate keys, as opposed to a key combination, which makes the problem vastly harder).
  7. There are ~10^{80} protons in the observable universe. Assume the monkeys write for 10^{38} years (10^{20} years is when all stellar remnants will have either been ejected from their galaxies or fallen into black holes, 10^{38} years is when all but 0.1% of protons have decayed). Assuming the monkeys type non-stop at a ridiculous 400 words per minute (the world record is 216 WPM for a single minute), that is about 2,000 characters per minute (Shakespeare's average word length is a bit under 5 letters). There are about half a million minutes in a year, so each monkey types about a billion characters per year. This gives a total of 10^{80} × 10^{38} × 10^9 = 10^{127} letters typed – which is still zero in comparison to 10^{360,783}. For a one in a trillion chance, multiply the letters typed by a trillion: 10^{127} × 10^{15} = 10^{142}. 10^{360,783}/10^{142} = 10^{360,641}.
  8. As explained at "More monkeys" (archived from the original on 16 October 2009; retrieved 4 December 2013), the problem can be approximated further: 10^145/log10(64) = 78.9 characters.
  9. Examples of the theorem being referred to as proverbial include: Schooler, Jonathan W.; Dougal, Sonya (1999). "Why creativity is not like the proverbial typing monkey". Psychological Inquiry. 10 (4); and Koestler, Arthur (1972). The Case of the Midwife Toad. New York. p. 30. "Neo-Darwinism does indeed carry the nineteenth-century brand of materialism to its extreme limits to the proverbial monkey at the typewriter, hitting by pure chance on the proper keys to produce a Shakespeare sonnet." The latter is sourced from "Parable of the Monkeys", a collection of historical references to the theorem in various formats.
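
The arithmetic in note 5 can be checked with a short Python sketch. The function names here are illustrative and do not come from the cited sources. `expected_wait` assumes independent, uniformly random keystrokes and uses the standard overlap formula for pattern waiting times: it adds (alphabet size)^k for every length k at which the pattern's prefix equals its suffix. It is run on three-letter strings only, since the full 130,000-letter case is handled through logarithms.

```python
import math


def expected_wait(pattern: str, alphabet_size: int) -> int:
    """Expected number of uniform random keystrokes until `pattern` first appears.

    Overlap formula: sum alphabet_size**k over every k for which the
    length-k prefix of the pattern equals its length-k suffix.
    """
    n = len(pattern)
    return sum(
        alphabet_size ** k
        for k in range(1, n + 1)
        if pattern[:k] == pattern[n - k:]
    )


def power_in_scientific(base: int, exponent: int) -> tuple[float, int]:
    """Return (mantissa, power_of_ten) such that base**exponent ≈ mantissa * 10**power_of_ten."""
    log10_value = exponent * math.log10(base)
    power_of_ten = math.floor(log10_value)
    mantissa = 10 ** (log10_value - power_of_ten)
    return mantissa, power_of_ten


# 26^130000 written as mantissa × 10^power, via logarithms as in note 5.
mantissa, power = power_in_scientific(26, 130_000)
print(f"26^130000 ≈ {mantissa:.2f} × 10^{power}")

# A string with no self-overlap versus a string of identical letters.
no_overlap = expected_wait("abc", 26)   # 26^3
all_equal = expected_wait("aaa", 26)    # 26 + 26^2 + 26^3
print(no_overlap, all_equal, all_equal / no_overlap)
```

In this sketch the all-equal string takes about 4% longer on average than the string with no self-overlap, which matches the ratio between the two figures quoted in note 5.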

Related Research Articles

<span class="mw-page-title-main">Kolmogorov complexity</span> Measure of algorithmic complexity

In algorithmic information theory, the Kolmogorov complexity of an object, such as a piece of text, is the length of a shortest computer program that produces the object as output. It is a measure of the computational resources needed to specify the object, and is also known as algorithmic complexity, Solomonoff–Kolmogorov–Chaitin complexity, program-size complexity, descriptive complexity, or algorithmic entropy. It is named after Andrey Kolmogorov, who first published on the subject in 1963, and is a generalization of classical information theory.

In the computer science subfield of algorithmic information theory, a Chaitin constant or halting probability is a real number that, informally speaking, represents the probability that a randomly constructed program will halt. These numbers are formed from a construction due to Gregory Chaitin.

<span class="mw-page-title-main">Expected value</span> Average value of a random variable

In probability theory, the expected value is a generalization of the weighted average. Informally, the expected value is the arithmetic mean of the possible values a random variable can take, weighted by the probability of those outcomes. Since it is obtained through arithmetic, the expected value sometimes may not even be included in the sample data set; it is not the value you would "expect" to get in reality.

In probability theory, the Borel–Cantelli lemma is a theorem about sequences of events. In general, it is a result in measure theory. It is named after Émile Borel and Francesco Paolo Cantelli, who gave statement to the lemma in the first decades of the 20th century. A related result, sometimes called the second Borel–Cantelli lemma, is a partial converse of the first Borel–Cantelli lemma. The lemma states that, under certain conditions, an event will have probability of either zero or one. Accordingly, it is the best-known of a class of similar theorems, known as zero-one laws. Other examples include Kolmogorov's zero–one law and the Hewitt–Savage zero–one law.

The concept of a random sequence is essential in probability theory and statistics. The concept generally relies on the notion of a sequence of random variables and many statistical discussions begin with the words "let X1,...,Xn be independent random variables...". Yet as D. H. Lehmer stated in 1951: "A random sequence is a vague notion... in which each term is unpredictable to the uninitiated and whose digits pass a certain number of tests traditional with statisticians".

In probability theory, there exist several different notions of convergence of sequences of random variables. The different notions of convergence capture different properties about the sequence, with some notions of convergence being stronger than others. For example, convergence in distribution tells us about the limit distribution of a sequence of random variables. This is a weaker notion than convergence in probability, which tells us about the value a random variable will take, rather than just the distribution.

<span class="mw-page-title-main">Bernoulli process</span> Random process of binary (boolean) random variables

In probability and statistics, a Bernoulli process is a finite or infinite sequence of binary random variables, so it is a discrete-time stochastic process that takes only two values, canonically 0 and 1. The component Bernoulli variables Xi are identically distributed and independent. Prosaically, a Bernoulli process is a repeated coin flipping, possibly with an unfair coin. Every variable Xi in the sequence is associated with a Bernoulli trial or experiment. They all have the same Bernoulli distribution. Much of what can be said about the Bernoulli process can also be generalized to more than two outcomes; this generalization is known as the Bernoulli scheme.

<span class="mw-page-title-main">Law of large numbers</span> Averages of repeated trials converge to the expected value

In probability theory, the law of large numbers (LLN) is a mathematical theorem that states that the average of the results obtained from a large number of independent and identical random samples converges to the true value, if it exists. More formally, the LLN states that given a sample of independent and identically distributed values, the sample mean converges to the true mean.

In probability theory, de Finetti's theorem states that exchangeable observations are conditionally independent relative to some latent variable. An epistemic probability distribution could then be assigned to this variable. It is named in honor of Bruno de Finetti.

In probability theory, Kolmogorov's zero–one law, named in honor of Andrey Nikolaevich Kolmogorov, specifies that a certain type of event, namely a tail event of independent σ-algebras, will either almost surely happen or almost surely not happen; that is, the probability of such an event occurring is zero or one.

<span class="mw-page-title-main">The Library of Babel</span> Short story by Jorge Luis Borges

"The Library of Babel" is a short story by Argentine author and librarian Jorge Luis Borges (1899–1986), conceiving of a universe in the form of a vast library containing all possible 410-page books of a certain format and character set.

In mathematics, a real number is said to be simply normal in an integer base b if its infinite sequence of digits is distributed uniformly in the sense that each of the b digit values has the same natural density 1/b. A number is said to be normal in base b if, for every positive integer n, all possible strings n digits long have density b^(−n).

In probability theory, an event is said to happen almost surely if it happens with probability 1. In other words, the set of outcomes on which the event does not occur has probability 0, even though the set might not be empty. The concept is analogous to the concept of "almost everywhere" in measure theory. In probability experiments on a finite sample space with a non-zero probability for each outcome, there is no difference between almost surely and surely; however, this distinction becomes important when the sample space is an infinite set, because an infinite set can have non-empty subsets of probability 0.

<span class="mw-page-title-main">Weasel program</span>

The weasel program or Dawkins' weasel is a thought experiment and a variety of computer simulations illustrating it. Their aim is to demonstrate that the process that drives evolutionary systems—random variation combined with non-random cumulative selection—is different from pure chance.

Algorithmic information theory (AIT) is a branch of theoretical computer science that concerns itself with the relationship between computation and information of computably generated objects (as opposed to stochastically generated), such as strings or any other data structure. In other words, it is shown within algorithmic information theory that computational incompressibility "mimics" (except for a constant that only depends on the chosen universal programming language) the relations or inequalities found in information theory. According to Gregory Chaitin, it is "the result of putting Shannon's information theory and Turing's computability theory into a cocktail shaker and shaking vigorously."

Intuitively, an algorithmically random sequence is a sequence of binary digits that appears random to any algorithm running on a universal Turing machine. The notion can be applied analogously to sequences on any finite alphabet. Random sequences are key objects of study in algorithmic information theory.

The junkyard tornado, sometimes known as Hoyle's fallacy, is an argument against abiogenesis that uses a calculation of its probability based on false assumptions, likening it to the chance that "a tornado sweeping through a junk-yard might assemble a Boeing 747 from the materials therein" and comparing the chance of obtaining even a single functioning protein by chance combination of amino acids to a solar system full of blind men solving Rubik's Cubes simultaneously. It was used originally by English astronomer Fred Hoyle (1915–2001) in his book The Intelligent Universe, where he tried to apply statistics to evolution and the origin of life. Similar reasoning was advanced in Darwin's time, and indeed as long ago as Cicero in classical antiquity. While Hoyle himself was an atheist, the argument has since become a mainstay in the rejection of evolution by religious groups.

The Hewitt–Savage zero–one law is a theorem in probability theory, similar to Kolmogorov's zero–one law and the Borel–Cantelli lemma, that specifies that a certain type of event will either almost surely happen or almost surely not happen. It is sometimes known as the Savage-Hewitt law for symmetric events. It is named after Edwin Hewitt and Leonard Jimmie Savage.

In statistics, an exchangeable sequence of random variables is a sequence X1, X2, X3, ... whose joint probability distribution does not change when the positions in the sequence in which finitely many of them appear are altered. In other words, the joint distribution is invariant to finite permutation. Thus, for example the sequences

<span class="mw-page-title-main">Infinite monkey theorem in popular culture</span>

The infinite monkey theorem and its associated imagery are considered a popular and proverbial illustration of the mathematics of probability, widely known to the general public because of its transmission through popular culture rather than through the classroom.

References

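  1. Borel, Émile (1913). "La mécanique statique et l'irréversibilité". Journal de Physique Théorique et Appliquée (in French). 3 (1): 189–196. doi:10.1051/jphystap:019130030018900. ISSN 0368-3893. Concevons qu'on ait dressé un million de singes à frapper au hasard sur les touches d'une machine à écrire et que […] ces singes dactylographes travaillent avec ardeur dix heures par jour avec un million de machines à écrire de types variés. […] Au bout d'un an, [leurs] volumes se trouveraient renfermer la copie exacte des livres de toute nature et de toutes langues conservés dans les plus riches bibliothèques du monde. (In English: "Let us imagine that a million monkeys have been trained to strike the keys of a typewriter at random and that […] these typist monkeys work eagerly ten hours a day with a million typewriters of various kinds. […] After a year, [their] volumes would be found to contain the exact copy of the books of every kind and every language kept in the richest libraries of the world.")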
  2. Jorge Luis Borges, "The Total Library", 1939. Anthologized in Selected Non-fictions (1999). Edited by Eliot Weinberger. New York: Viking
  3. Isaac, Richard E. (1995). The Pleasures of Probability. New York: Springer. pp. 48–50. ISBN 0-387-94415-X. OCLC 610945749. Isaac generalizes this argument immediately to variable text and alphabet size; the common main conclusion is on page 50.
  4. Gut, Allan (2005). Probability: A Graduate Course. Springer. pp. 97–100. ISBN 0-387-22833-0.
  5. Kittel, Charles; Kroemer, Herbert (1980). Thermal Physics (2nd ed.). San Francisco: W.H. Freeman Company. p. 53. ISBN 0-7167-1088-9. OCLC 5171399.
  6. Borel, Émile (1914). Le hasard (in French). Paris: Félix Alcan. p. 164. Archived from the original on 2008-12-03.
  7. Arthur Eddington (1928). The Nature of the Physical World: The Gifford Lectures. New York: Macmillan. p. 72. ISBN 0-8414-3885-4.
  8. Eddington, Arthur. "Chapter IV: The Running-Down of the Universe". The Nature of the Physical World 1926–1927: The Gifford Lectures. Archived from the original on 2009-03-08. Retrieved 2012-01-22.
  9. Aristotle, Περὶ γενέσεως καὶ φθορᾶς (On Generation and Corruption), 315b14.
  10. Marcus Tullius Cicero, De natura deorum, 2.37. Translation from Cicero's Tusculan Disputations; Also, Treatises On The Nature Of The Gods, And On The Commonwealth, C. D. Yonge, principal translator, New York, Harper & Brothers Publishers, Franklin Square. (1877). Downloadable text.
  11. The English translation of "The Total Library" lists the title of Swift's essay as "Trivial Essay on the Faculties of the Soul." The appropriate reference is, instead: Swift, Jonathan, Temple Scott et al. "A Tritical Essay upon the Faculties of the Mind." The Prose Works of Jonathan Swift, Volume 1. London: G. Bell, 1897, pp. 291-296. Internet Archive
  12. Borges, Jorge Luis (August 1939). "La biblioteca total" [The Total Library]. Sur. No. 59. Republished in Selected Non-Fictions. Translated by Weinberger, Eliot. Penguin. 1999. ISBN 0-670-84947-2.
  13. "Notes towards the complete works of Shakespeare". vivaria.net. 2002. Archived from the original on 2007-07-16. Some press clippings.
  14. "No words to describe monkeys' play". BBC News. 2003-05-09. Retrieved 2009-07-25.
  15. "Notes Towards the Complete Works of Shakespeare" (PDF). Archived from the original (PDF) on 2009-03-18.
  16. K., Alfred (April 2013). "Finite Monkeys Don't Type: A story about the interpretations of probability". Alfred K. Archived from the original on 2022-03-31. Retrieved 2023-05-11.
  17. "Monkeys don't write Shakespeare". Wired News. Associated Press. 2003-05-09. Archived from the original on 2004-02-01. Retrieved 2007-03-02.
  18. Padmanabhan, Thanu (2005). "The dark side of astronomy". Nature. 435 (7038): 20–21. Bibcode:2005Natur.435...20P. doi:10.1038/435020a. Platt, Suzy (1993). Respectfully quoted: a dictionary of quotations. Barnes & Noble. pp. 388–389. ISBN 0-88029-768-9.
  19. Rescher, Nicholas (2006). Studies in the Philosophy of Science: A Counterfactual Perspective on Quantum Entanglement. Ontos Verlag. p. 103. ISBN 978-3-11-032646-8.
  20. Powell, Doug (2006). Holman Quicksource Guide to Christian Apologetics. Broadman & Holman. pp. 60, 63. ISBN 0-8054-9460-X.
  21. MacArthur, John (2003). Think Biblically!: Recovering a Christian Worldview. Crossway Books. pp. 78–79. ISBN 1-58134-412-0.
  22. Dawkins, Richard (1996). The Blind Watchmaker. W.W. Norton & Co. pp. 46–50. ISBN 0-393-31570-3.
  23. As quoted in Blachowicz, James (1998). Of Two Minds: Nature of Inquiry. SUNY Press. p. 109. ISBN 0-7914-3641-1.
  24. Valentine, James (2004). On the Origin of Phyla. University of Chicago Press. pp. 77–80. ISBN 0-226-84548-6.
  25. Conrad, B.; Mitzenmacher, M. (July 2004). "Power laws for monkeys typing randomly: the case of unequal probabilities". IEEE Transactions on Information Theory. 50 (7): 1403–1414. doi:10.1109/TIT.2004.830752. ISSN 1557-9654. S2CID 8913575.
  26. p. 126 of The Principles of Art, as summarized and quoted by Sclafani, Richard J. (1975). "The logical primitiveness of the concept of a work of art". British Journal of Aesthetics. 15 (1): 14. doi:10.1093/bjaesthetics/15.1.14.
  27. John, Eileen; Dominic Lopes, eds. (2004). The Philosophy of Literature: Contemporary and Classic Readings: An Anthology. Blackwell. p. 96. ISBN 1-4051-1208-5.
  28. Genette, Gérard (1997). The Work of Art: Immanence and Transcendence. Cornell UP. ISBN 0-8014-8272-0.
  29. Gracia, Jorge (1996). Texts: Ontological Status, Identity, Author, Audience. SUNY Press. pp. 1–2, 122–125. ISBN 0-7914-2901-6.
  30. Acocella, Joan (9 April 2007). "The typing life: How writers used to write". The New Yorker. A review of Wershler-Henry, Darren (2007). The Iron Whim: A fragmented history of typewriting. Cornell University Press.
  31. Inglis-Arkell, Esther (June 9, 2011). "The story of the Monkey Shakespeare Simulator Project". io9. gizmodo. Retrieved 24 February 2016.
  32. Marsaglia, George; Zaman, Arif (1993). "Monkey tests for random number generators". Computers & Mathematics with Applications. Elsevier, Oxford. 26 (9): 1–10. doi:10.1016/0898-1221(93)90001-C. ISSN 0898-1221.
  33. Susan Ratcliffe, ed. (2016), "Robert Wilensky 1951– American academic", Oxford Essential Quotations, Oxford University Press, in Mail on Sunday 16 February 1997 'Quotes of the Week'
  34. Lewis, Bob (1997-06-02). "It's time for some zoning laws in today's version of the Old West: the Web". Enterprise Computing, IS Survival Guide. InfoWorld. Vol. 19, no. 22. InfoWorld Media Group, Inc. p. 84. ISSN 0199-6649. May also be in "Bob Lewis's IS Survival Guide", published March 19, 1999, ISBN 978-0672314377.
  35. Hoffmann, Ute; Hofmann, Jeanette (2001). "Monkeys, Typewriters and Networks" (PDF). Wissenschaftszentrum Berlin für Sozialforschung gGmbH (WZB). Archived from the original (PDF) on 2008-05-13.
  36. Ringle, Ken (28 October 2002). "Hello? This is Bob". The Washington Post. p. C01. Archived from the original on 15 November 2002.
  37. Lorge, Greta (May 2007). "The best thought experiments: Schrödinger's cat, Borel's monkeys". Wired. Vol. 15, no. 6.
  38. "Monkey Typewriter". Microsoft Store Apps. Balanced Software. 2015-11-16. 9NBLGGH69FC8. Retrieved 2022-02-14.