Law of truly large numbers

The law of truly large numbers (a statistical adage), attributed to Persi Diaconis and Frederick Mosteller, states that with a large enough number of independent samples, any highly implausible (i.e. unlikely in any single sample, but with constant probability strictly greater than 0 in any sample) result is likely to be observed. [1] Because we never find it notable when likely events occur, we highlight unlikely events and notice them more. The law is often used to falsify different pseudo-scientific claims; as such, it is sometimes criticized by fringe scientists. [2] [3]

The law can be rephrased as "large numbers also deceive". More concretely, skeptic Penn Jillette has said, "Million-to-one odds happen eight times a day in New York" (population about 8,000,000). [4]
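Jillette's quip is plain arithmetic: if each of roughly 8,000,000 residents experiences about one qualifying "event" per day (an illustrative simplification, not a claim from the source), then million-to-one events are expected about eight times daily. A minimal sketch:

```python
# Expected number of "million-to-one" events per day in a city,
# assuming each resident experiences one qualifying event per day
# (an illustrative simplification).
population = 8_000_000
odds = 1_000_000  # "million-to-one"

expected_daily_hits = population / odds
print(expected_daily_hits)  # 8.0
```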

Examples

Graphs of probability P of not observing independent events each of probability 1/n after n Bernoulli trials, and 1 − P vs n. As n increases, the probability of a 1/n-chance event never appearing after n tries rapidly converges to 1/e.
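The 1/e limit in the caption follows from the standard result that (1 − 1/n)^n converges to e^(−1) as n grows. A quick numerical check, assuming independent Bernoulli trials:

```python
import math

# Probability that an event of per-trial probability 1/n never occurs
# in n independent trials: (1 - 1/n)**n, which converges to 1/e.
for n in (10, 100, 10_000, 1_000_000):
    p_never = (1 - 1 / n) ** n
    print(n, p_never)

print("1/e =", 1 / math.e)  # about 0.3679
```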

For a simplified example of the law, assume that a given event happens with probability 0.1% in a single trial. The probability that this so-called unlikely event does not happen in a single trial is then 99.9% (0.999).

For a sample of only 1,000 independent trials, however, the probability that the event does not happen even once is only 0.999^1000 ≈ 0.3677, or 36.77%. [5] The probability that the event does happen at least once in 1,000 trials is therefore 1 − 0.999^1000 ≈ 0.6323, or 63.23%. This means that this "unlikely event" has a probability of 63.23% of happening if 1,000 independent trials are conducted. If the number of trials is increased to 10,000, the probability of the event happening at least once rises to 1 − 0.999^10000 ≈ 0.99995, or 99.995%. In other words, a highly unlikely event becomes likely, and eventually near-certain, given enough independent trials.
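These figures are easy to reproduce: the probability of at least one occurrence in n independent trials of a per-trial probability p is 1 − (1 − p)^n. A minimal sketch:

```python
# Probability that an event with per-trial probability p occurs
# at least once in n independent trials.
def p_at_least_once(p, n):
    return 1 - (1 - p) ** n

print(round(p_at_least_once(0.001, 1_000), 4))   # 0.6323
print(round(p_at_least_once(0.001, 10_000), 5))  # 0.99995
```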

For an event X that occurs with a very low probability of 0.0000001% (i.e., 10^−9 in any single sample; see also almost never), taking 1,000,000,000 as a "truly large" number of independent samples gives a probability of at least one occurrence of X equal to 1 − 0.999999999^1000000000 ≈ 0.63 = 63%, while a number of independent samples equal to the size of the human population in 2021 (about 7,900,000,000) gives a probability of 1 − 0.999999999^7900000000 ≈ 0.9996 = 99.96%. [6]
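With a very small p and a very large n, computing (1 − p)^n directly can lose floating-point precision, so the usual trick is to work in log space with log1p and expm1. A sketch reproducing the figures above:

```python
import math

# Numerically stable evaluation of 1 - (1 - p)**n for tiny p and
# huge n, using log1p/expm1 to avoid catastrophic cancellation.
def p_at_least_once(p, n):
    return -math.expm1(n * math.log1p(-p))

p = 1e-9  # 0.0000001% per-sample probability
print(round(p_at_least_once(p, 1_000_000_000), 2))  # ~0.63
print(round(p_at_least_once(p, 7_900_000_000), 4))  # ~0.9996
```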

These calculations can be formalized in mathematical language as: "the probability of an unlikely event X happening in N independent trials can become arbitrarily near to 1, no matter how small the probability of the event X in one single trial is, provided that N is truly large." [7]

For the case where the probability of the unlikely event X is not a small constant but decreases as a function of N, see the graph above.

In high-availability systems, even very unlikely events have to be taken into consideration. In a series system, even when the probability of failure for each single element is very low, connecting many of them raises the probability that the whole system fails. To make system failure less probable, redundancy can be used: in such parallel systems, even highly unreliable redundant parts, connected in large numbers, raise the probability of the system not failing to the required high level. [8]
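The contrast between series and parallel systems can be sketched with independent component failures: a series system fails if any component fails, while a parallel (redundant) system fails only if all components fail. The component counts and failure rates below are illustrative assumptions:

```python
# Series system: fails if ANY of k independent components fails.
def series_failure(q, k):
    return 1 - (1 - q) ** k

# Parallel (redundant) system: fails only if ALL k components fail.
def parallel_failure(q, k):
    return q ** k

q = 1e-4   # a very reliable single component (0.01% failure chance)
print(series_failure(q, 10_000))    # ~0.63: large series systems fail often

q_bad = 0.5  # a highly unreliable component
print(parallel_failure(q_bad, 20))  # ~1e-6: redundancy rescues reliability
```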

In criticism of pseudoscience

The law comes up in criticism of pseudoscience and is sometimes called the Jeane Dixon effect (see also postdiction): the more predictions a psychic makes, the better the odds that one of them will "hit". Thus, if one comes true, the psychic expects us to forget the vast majority that did not happen (confirmation bias). [9]

Another similar manifestation of the law can be found in gambling, where gamblers tend to remember their wins and forget their losses, [10] even if the latter far outnumber the former (though, depending on the person, the opposite may also be true, when a gambler analyzes losses closely in order to fine-tune a playing system [11] ). Mikal Aasved links the law with "selective memory bias", which allows gamblers to mentally distance themselves from the consequences of their gambling [11] by holding an inflated view of their real winnings (or, in the opposite case, their losses – "selective memory bias in either direction").

Notes

  1. Everitt 2002
  2. Beitman, Bernard D. (15 Apr 2018), "Intrigued by the Low Probability of Synchronicities? Coincidence theorists and statisticians dispute the meaning of rare events", Psychology Today
  3. Sharon Hewitt Rawlette (2019), "Coincidence or Psi? The Epistemic Import of Spontaneous Cases of Purported Psi Identified Post-Verification", Journal of Scientific Exploration, Vol. 33, No. 1, pp. 9–42 [unreliable source?]
  4. Kida, Thomas E. (2006). Don't Believe Everything You Think: The 6 Basic Mistakes We Make in Thinking. Amherst, N.Y.: Prometheus Books. p. 97. ISBN 1615920056. OCLC 1019454221.
  5. Here another law of the "Improbability Principle" also acts: the "law of the probability lever", which is (according to David Hand) a kind of butterfly effect: a value "close" to 1 raised to a large power gives a "surprisingly" low value, even one close to zero if the power is larger still. This has philosophical implications and challenges theoretical models, but it does not render them useless. The evaluation and testing of a theoretical hypothesis, even one whose probability of correctness is close to 1, can rest on its falsifiability, a feature widely accepted as important for scientific inquiry, which is not meant to lead to dogmatic or absolute knowledge. See: statistical proof.
  6. Graphing calculator at Desmos
  7. Proof in: Elemér Elad Rosinger, (2016), "Quanta, Physicists, and Probabilities ... ?" page 28
  8. Reliability of Systems in Concise Reliability for Engineers, Jaroslav Menčík, 2016
  9. 1980, Austin Society to Oppose Pseudoscience (ASTOP) distributed by ICSA (former American Family Foundation) "Pseudoscience Fact Sheets, ASTOP: Psychic Detectives"
  10. Daniel Freeman, Jason Freeman, 2009, London, "Know Your Mind: Everyday Emotional and Psychological Problems and How to Overcome Them" p. 41
  11. Mikal Aasved, 2002, Illinois, The Psychodynamics and Psychology of Gambling: The Gambler's Mind, vol. I, p. 129

Related Research Articles

The gambler's fallacy, also known as the Monte Carlo fallacy or the fallacy of the maturity of chances, is the belief that, if an event has occurred less frequently than expected, it is more likely to happen again in the future. The fallacy is commonly associated with gambling, where it may be believed, for example, that the next dice roll is more than usually likely to be six because there have recently been fewer than the expected number of sixes.

The law of averages is the commonly held belief that a particular outcome or event will, over certain periods of time, occur at a frequency that is similar to its probability. Depending on context or application it can be considered a valid common-sense observation or a misunderstanding of probability. This notion can lead to the gambler's fallacy when one becomes convinced that a particular outcome must come soon simply because it has not occurred recently.

Event (probability theory) – In statistics and probability theory, set of outcomes to which a probability is assigned

In probability theory, an event is a set of outcomes of an experiment to which a probability is assigned. A single outcome may be an element of many different events, and different events in an experiment are usually not equally likely, since they may include very different groups of outcomes. An event consisting of only a single outcome is called an elementary event or an atomic event; that is, it is a singleton set. An event with more than one possible outcome is called a compound event. An event is said to occur if it contains the outcome of the experiment, and the probability that an event occurs is the probability that it contains that outcome. Every event defines a complementary event, namely the complementary set, and together these define a Bernoulli trial: did the event occur or not?

Shuffling – Procedure used to randomize a deck of playing cards

Shuffling is a procedure used to randomize a deck of playing cards to provide an element of chance in card games. Shuffling is often followed by a cut, to help ensure that the shuffler has not manipulated the outcome.

Probability theory – Branch of mathematics concerning probability

Probability theory or probability calculus is the branch of mathematics concerned with probability. Although there are several different probability interpretations, probability theory treats the concept in a rigorous mathematical manner by expressing it through a set of axioms. Typically these axioms formalise probability in terms of a probability space, which assigns a measure taking values between 0 and 1, termed the probability measure, to a set of outcomes called the sample space. Any specified subset of the sample space is called an event.

Infinite monkey theorem – Counterintuitive result in probability

The infinite monkey theorem states that a monkey hitting keys at random on a typewriter keyboard for an infinite amount of time will almost surely type any given text, including the complete works of William Shakespeare. In fact, the monkey would almost surely type every possible finite text an infinite number of times. The theorem can be generalized to state that any sequence of events that has a non-zero probability of happening will almost certainly occur an infinite number of times, given an infinite amount of time or a universe that is infinite in size.

Synchronicity is a concept introduced by analytical psychologist Carl Jung "to describe circumstances that appear meaningfully related yet lack a causal connection". Synchronicity experiences refer to one's subjective experience whereby coincidences between events in one's mind and the outside world may be causally unrelated, yet have another unknown connection. Jung held this was a healthy function of the mind, that can become harmful within psychosis.

Law of large numbers – Averages of repeated trials converge to the expected value

In probability theory, the law of large numbers (LLN) is a mathematical theorem that states that the average of the results obtained from a large number of independent random samples converges to the true value, if it exists. More formally, the LLN states that given a sample of independent and identically distributed values, the sample mean converges to the true mean.

In probability theory, odds provide a measure of the probability of a particular outcome. Odds are commonly used in gambling and statistics. For example, for an event that is 40% probable, one could say that the odds are "2 in 5", "2 to 3 in favor", or "3 to 2 against".

In probability theory, an event is said to happen almost surely if it happens with probability 1. In other words, the set of outcomes on which the event does not occur has probability 0, even though the set might not be empty. The concept is analogous to the concept of "almost everywhere" in measure theory. In probability experiments on a finite sample space with a non-zero probability for each outcome, there is no difference between almost surely and surely; however, this distinction becomes important when the sample space is an infinite set, because an infinite set can have non-empty subsets of probability 0.

Littlewood's law states that a person can expect to experience events with odds of one in a million at the rate of about one per month. It is named after the British mathematician John Edensor Littlewood.

Specified complexity is a creationist argument introduced by William Dembski, used by advocates to promote the pseudoscience of intelligent design. According to Dembski, the concept can formalize a property that singles out patterns that are both specified and complex, where in Dembski's terminology, a specified pattern is one that admits short descriptions, whereas a complex pattern is one that is unlikely to occur by chance. Proponents of intelligent design use specified complexity as one of their two main arguments, alongside irreducible complexity.

Coincidence – Concurrence of events with no connection

A coincidence is a remarkable concurrence of events or circumstances that have no apparent causal connection with one another. The perception of remarkable coincidences may lead to supernatural, occult, or paranormal claims, or it may lead to belief in fatalism, which is a doctrine that events will happen in the exact manner of a predetermined plan. In general, the perception of coincidence, for lack of more sophisticated explanations, can serve as a link to folk psychology and philosophy.

Innumeracy (book) – 1988 book by John Allen Paulos

Innumeracy: Mathematical Illiteracy and its Consequences is a 1988 book by mathematician John Allen Paulos about innumeracy as the mathematical equivalent of illiteracy: incompetence with numbers rather than words. Innumeracy is a problem with many otherwise educated and knowledgeable people. While many people would be ashamed to admit they are illiterate, there is very little shame in admitting innumeracy by saying things like "I'm a people person, not a numbers person", or "I always hated math", but Paulos challenges whether that widespread cultural excusing of innumeracy is truly worthy of acceptability.

The junkyard tornado, sometimes known as Hoyle's fallacy, is an argument against abiogenesis that calculates its probability from false assumptions, likening it to "a tornado sweeping through a junk-yard might assemble a Boeing 747 from the materials therein" and comparing the chance of obtaining even a single functioning protein by chance combination of amino acids to a solar system full of blind men solving Rubik's Cubes simultaneously. It was used originally by English astronomer Fred Hoyle (1915–2001) in his book The Intelligent Universe, where he tried to apply statistics to evolution and the origin of life. Similar reasoning was advanced in Darwin's time, and indeed as long ago as Cicero in classical antiquity. While Hoyle himself was an atheist, the argument has since become a mainstay in the rejection of evolution by religious groups.

The mathematics of gambling is a collection of probability applications encountered in games of chance, and can be included in game theory. From a mathematical point of view, games of chance are experiments generating various types of aleatory events, whose probabilities can be calculated using the properties of probability on a finite space of possibilities.

Randomness – Apparent lack of pattern or predictability in events

In common usage, randomness is the apparent or actual lack of definite pattern or predictability in information. A random sequence of events, symbols or steps often has no order and does not follow an intelligible pattern or combination. Individual random events are, by definition, unpredictable, but if there is a known probability distribution, the frequency of different outcomes over repeated events is predictable. For example, when throwing two dice, the outcome of any particular roll is unpredictable, but a sum of 7 will tend to occur twice as often as 4. In this view, randomness is not haphazardness; it is a measure of uncertainty of an outcome. Randomness applies to concepts of chance, probability, and information entropy.

Poisson distribution – Discrete probability distribution

In probability theory and statistics, the Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time if these events occur with a known constant mean rate and independently of the time since the last event. It can also be used for the number of events in other types of intervals than time, and in dimension greater than 1.

The look-elsewhere effect is a phenomenon in the statistical analysis of scientific experiments where an apparently statistically significant observation may have actually arisen by chance because of the sheer size of the parameter space to be searched.
