Simplicity theory

Simplicity theory is a cognitive theory that seeks to explain the attractiveness of situations or events to human minds. It is based on work by researchers including the British behavioural scientist Nick Chater, the Dutch computer scientist Paul Vitányi, and the artificial intelligence researchers Jean-Louis Dessalles (France) and Jürgen Schmidhuber (Germany). It claims that interesting situations appear simpler than expected to the observer.

Overview

Technically, simplicity corresponds to a drop in Kolmogorov complexity, which means that, for an observer, the shortest description of the situation is shorter than anticipated. For instance, the description of a consecutive lottery draw, such as 22-23-24-25-26-27, is significantly shorter than that of a typical one, such as 12-22-27-37-38-42. The former requires only one instantiation (the choice of a single number among all possible numbers in the lottery), whereas the latter requires six instantiations.
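
To make the comparison concrete, here is a minimal sketch in Python, assuming a toy encoding in which each freely chosen number in a hypothetical 49-ball lottery costs log2(49) bits and the small overhead of naming the "consecutive run" pattern is ignored; the function description_bits is purely illustrative, not part of the theory's formal apparatus.

```python
import math

BITS_PER_CHOICE = math.log2(49)  # cost of one free instantiation in a toy 49-ball lottery

def description_bits(draw):
    """Toy description length: a consecutive draw is pinned down by its starting
    number (one instantiation); an arbitrary draw needs all six numbers."""
    is_consecutive = all(b - a == 1 for a, b in zip(draw, draw[1:]))
    instantiations = 1 if is_consecutive else len(draw)
    return instantiations * BITS_PER_CHOICE

print(description_bits([22, 23, 24, 25, 26, 27]))  # about 5.6 bits
print(description_bits([12, 22, 27, 37, 38, 42]))  # about 33.7 bits
```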

Simplicity theory makes several quantitative predictions about the way distance, recency, prominence (of places or individuals), and atypicality influence interestingness.

Formalization

The basic concept of simplicity theory is unexpectedness, defined as the difference between expected complexity and observed complexity:

U = C_exp − C_obs

In most contexts, the expected complexity C_exp corresponds to generation complexity, which is the size of the smallest description of all the parameters that must be set in the "world" for the situation to exist. In the lottery example, generation complexity is identical for a consecutive draw and a typical draw (as long as no cheating is imagined) and amounts to six instantiations.
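
Under the same toy encoding as above, unexpectedness can be sketched as the gap between generation complexity (six free choices for any draw) and the shortest description available to the observer; the helper below is illustrative only.

```python
import math

BITS_PER_CHOICE = math.log2(49)  # same toy 49-ball encoding as above

def unexpectedness(observed_instantiations, generated_instantiations=6):
    """U = C_exp - C_obs, with both complexities counted as
    instantiations times bits per free choice."""
    c_exp = generated_instantiations * BITS_PER_CHOICE  # generation complexity
    c_obs = observed_instantiations * BITS_PER_CHOICE   # shortest description observed
    return c_exp - c_obs

print(unexpectedness(1))  # consecutive draw: U is about 28 bits
print(unexpectedness(6))  # typical draw: U = 0, nothing unexpected
```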

Simplicity theory avoids most criticisms addressed at Kolmogorov complexity by considering only descriptions that are available to a given observer (instead of any imaginable description). This amounts to saying that complexity, and thus unexpectedness, are observer-dependent. For instance, the typical draw 12-22-27-37-38-42 will appear very simple, even simpler than the consecutive one, to the person who played that combination.

Connection with probability

Unexpectedness U is linked to subjective probability p as follows:

p = 2^(−U)

The advantage of this formula is that subjective probability can be assessed without necessarily knowing the alternatives. Classical approaches to probability would consider all situations in the world as having virtually zero probability of having occurred, since each situation is complex and unique. Simplicity theory avoids this trap by considering that subjective improbability results only from a complexity drop.
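
As a rough illustration, the conversion from unexpectedness (measured in bits) to subjective probability can be sketched as follows; the U values are those produced by the toy lottery encoding above.

```python
def subjective_probability(u_bits):
    """Subjective probability from unexpectedness: p = 2 ** (-U)."""
    return 2.0 ** (-u_bits)

print(subjective_probability(0.0))    # typical draw: no complexity drop, no subjective improbability
print(subjective_probability(28.07))  # consecutive draw: p is about 3.5e-9
```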

Related Research Articles

Kolmogorov complexity: measure of algorithmic complexity

In algorithmic information theory, the Kolmogorov complexity of an object, such as a piece of text, is the length of a shortest computer program that produces the object as output. It is a measure of the computational resources needed to specify the object, and is also known as algorithmic complexity, Solomonoff–Kolmogorov–Chaitin complexity, program-size complexity, descriptive complexity, or algorithmic entropy. It is named after Andrey Kolmogorov, who first published on the subject in 1963.
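
Kolmogorov complexity itself is uncomputable, but the length of a compressed encoding gives a crude upper bound, which is how the idea is often illustrated in practice; the sketch below uses Python's zlib purely as a stand-in compressor.

```python
import os
import zlib

def complexity_upper_bound(data: bytes) -> int:
    """Crude upper bound on Kolmogorov complexity, in bits: the length of a
    zlib-compressed encoding (ignoring the fixed cost of the decompressor)."""
    return 8 * len(zlib.compress(data, 9))

print(complexity_upper_bound(b"ab" * 500))       # highly regular input: small bound
print(complexity_upper_bound(os.urandom(1000)))  # random bytes: bound near (or slightly above) 8000
```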

In the computer science subfield of algorithmic information theory, a Chaitin constant or halting probability is a real number that, informally speaking, represents the probability that a randomly constructed program will halt. These numbers are formed from a construction due to Gregory Chaitin.

Complexity characterises the behaviour of a system or model whose components interact in multiple ways and follow local rules, meaning there is no reasonable higher instruction to define the various possible interactions.

Probability: measure of the expectation that an event will occur or a statement is true

Probability is a numerical description of how likely an event is to occur or how likely it is that a proposition is true. Probability is a number between 0 and 1, where, roughly speaking, 0 indicates impossibility and 1 indicates certainty. The higher the probability of an event, the more likely it is that the event will occur. A simple example is the tossing of a fair (unbiased) coin. Since the coin is fair, the two outcomes are both equally probable; the probability of "heads" equals the probability of "tails"; and since no other outcomes are possible, the probability of either "heads" or "tails" is 1/2.

Occam's razor: philosophical principle of selecting the solution with the fewest assumptions

Occam's razor is the problem-solving principle that states that "Entities should not be multiplied without necessity." The idea is attributed to English Franciscan friar William of Ockham, a scholastic philosopher and theologian who used a preference for simplicity to defend the idea of divine miracles. It is sometimes paraphrased by a statement like "the simplest solution is most likely the right one". Occam's razor says that when presented with competing hypotheses that make the same predictions, one should select the solution with the fewest assumptions, and it is not meant to be a way of choosing between hypotheses that make different predictions.

Minimum message length (MML) is a Bayesian information-theoretic method for statistical model comparison and selection. It provides a formal information theory restatement of Occam's Razor: even when models are equal in their measure of fit-accuracy to the observed data, the one generating the most concise explanation of data is more likely to be correct. MML was invented by Chris Wallace, first appearing in the seminal paper "An information measure for classification". MML is intended not just as a theoretical construct, but as a technique that may be deployed in practice. It differs from the related concept of Kolmogorov complexity in that it does not require use of a Turing-complete language to model data.

The minimum description length (MDL) principle is a formalization of Occam's razor in which the best hypothesis for a given set of data is the one that leads to the best compression of the data. MDL was introduced by Jorma Rissanen in 1978. It is an important concept in information theory and computational learning theory.
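
The flavour of the principle can be sketched with a two-part code: the total description length is the number of bits needed to state a model plus the number of bits needed to encode the data given that model. The example below compares a "fair coin" model with a "biased coin" model whose bias is stated to a fixed, arbitrarily chosen precision; all constants are illustrative.

```python
import math

def code_length_fair(seq):
    """Model 1: fair coin. No parameter to state; each outcome costs 1 bit."""
    return len(seq)

def code_length_biased(seq, precision_bits=10):
    """Model 2: Bernoulli(p), with p stated to `precision_bits` bits and the
    data encoded at -log2(p) or -log2(1 - p) bits per symbol."""
    n, ones = len(seq), sum(seq)
    p = min(max(ones / n, 1e-9), 1 - 1e-9)  # guard against log(0)
    data_bits = -(ones * math.log2(p) + (n - ones) * math.log2(1 - p))
    return precision_bits + data_bits

sample = [1] * 80 + [0] * 20   # a visibly biased sequence
print(code_length_fair(sample))    # 100 bits
print(code_length_biased(sample))  # about 82.2 bits: the biased model compresses better
```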

Ray Solomonoff was the inventor of algorithmic probability, his General Theory of Inductive Inference, and was a founder of algorithmic information theory. He was an originator of the branch of artificial intelligence based on machine learning, prediction and probability. He circulated the first report on non-semantic machine learning in 1956.

Low-complexity art, first described by Jürgen Schmidhuber in 1997 and now established as a seminal topic within the larger field of computer science, is art that can be described by a short computer program.

Solomonoff's theory of inductive inference is Ray Solomonoff's mathematical formalization of Occam's razor. It explains observations of the world by the smallest computer program that outputs those observations. Solomonoff proved that this explanation is the most likely one, by assuming the world is generated by an unknown computer program. That is to say, the probability distribution over all computer programs that output the observations favors the shortest one.

In economics, game theory, and decision theory, the expected utility hypothesis, concerning people's preferences with regard to choices that have uncertain outcomes (gambles), states that the subjective value associated with an individual's gamble is the statistical expectation of that individual's valuations of the outcomes of that gamble, where these valuations may differ from the dollar value of those outcomes. Daniel Bernoulli's treatment of the St. Petersburg paradox in 1738 is considered the beginning of the hypothesis. The hypothesis has proven useful in explaining some popular choices that seem to contradict the expected value criterion, such as those that arise in the contexts of gambling and insurance.
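
The core formula is simply a probability-weighted sum of utilities, EU = sum of p_i * u(x_i). The sketch below applies it to a truncated version of the St. Petersburg gamble with a logarithmic utility function (Bernoulli's own proposal), contrasting it with the plain expected monetary value; the truncation depth is an arbitrary illustrative choice.

```python
import math

def expected_value(outcomes):
    """Expected monetary value: sum of probability * payoff."""
    return sum(p * x for p, x in outcomes)

def expected_utility(outcomes, utility=math.log):
    """Expected utility: sum of probability * u(payoff)."""
    return sum(p * utility(x) for p, x in outcomes)

# Truncated St. Petersburg gamble: payoff 2^k with probability 2^-k, k = 1..40.
gamble = [(0.5 ** k, 2.0 ** k) for k in range(1, 41)]

print(expected_value(gamble))    # 40.0, growing without bound as the truncation is extended
print(expected_utility(gamble))  # about 1.386, converging (the log-utility resolution)
```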

Cognitive complexity describes cognition along a simplicity-complexity axis. It is the subject of academic study in fields including personal construct psychology, organisational theory and human–computer interaction.

Algorithmic information theory (AIT) is a "merger of information theory and computer science" that concerns itself with the relationship between computation and information of computably generated objects, such as strings or any other data structure. In other words, it is shown within algorithmic information theory that computational incompressibility "mimics" the relations or inequalities found in information theory. According to Gregory Chaitin, it is "the result of putting Shannon's information theory and Turing's computability theory into a cocktail shaker and shaking vigorously."

Intuitively, an algorithmically random sequence is an infinite sequence of binary digits that appears random to any algorithm. The notion can be applied analogously to sequences on any finite alphabet. Random sequences are key objects of study in algorithmic information theory.

In 1973 Kolmogorov proposed a non-probabilistic approach to statistics and model selection. Let each datum be a finite binary string and a model be a finite set of binary strings. Consider model classes consisting of models of given maximal Kolmogorov complexity. The Kolmogorov structure function of an individual data string expresses the relation between the complexity-level constraint on a model class and the least log-cardinality of a model in the class containing the data. The structure function determines all stochastic properties of the individual data string: for every constrained model class it determines the individual best-fitting model in the class, irrespective of whether the true model is in the model class considered or not. In the classical case we talk about a set of data with a probability distribution, and the properties are those of the expectations. In contrast, here we deal with individual data strings and the properties of the individual string in question. In this setting, a property holds with certainty rather than with high probability as in the classical case. The Kolmogorov structure function precisely quantifies the goodness-of-fit of an individual model with respect to individual data.
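
In one common formulation, the structure function of a string x, with S ranging over finite sets of binary strings that contain x and K(S) denoting the Kolmogorov complexity of S, can be written as:

```latex
h_x(\alpha) \;=\; \min_{S \,\ni\, x,\; K(S) \le \alpha} \log_2 |S|
```

The quantity K(S) + log2 |S| is the length of the two-part description "the model S, then the index of x inside S", which is what the best-fitting model at each complexity level alpha trades off.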

Mathematical beauty: notion that some mathematicians may derive aesthetic pleasure from mathematics

Mathematical beauty is the aesthetic pleasure typically derived from the abstractness, purity, simplicity, depth or orderliness of mathematics. Mathematicians often express this pleasure by describing mathematics as beautiful. They might also describe mathematics as an art form or, at a minimum, as a creative activity. Comparisons are often made with music and poetry.

Information distance is the distance between two finite objects expressed as the number of bits in the shortest program that transforms one object into the other, or vice versa, on a universal computer. It is an extension of Kolmogorov complexity: the Kolmogorov complexity of a single finite object is the information in that object, while the information distance between a pair of finite objects is the minimum information required to go from one object to the other, or vice versa. Information distance was first defined and investigated on the basis of thermodynamic principles and subsequently achieved its final form. It is applied in the normalized compression distance and the normalized Google distance.
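
The normalized compression distance mentioned above replaces the uncomputable Kolmogorov complexity with the output length of a real compressor: NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)). A minimal sketch using zlib as the stand-in compressor:

```python
import os
import zlib

def c(data: bytes) -> int:
    """Compressed length in bytes, a computable stand-in for Kolmogorov complexity."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy, cxy = c(x), c(y), c(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

text = b"the quick brown fox jumps over the lazy dog " * 20
print(ncd(text, text))             # near 0: an object is close to itself
print(ncd(text, os.urandom(800)))  # closer to 1: unrelated content shares little structure
```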

Jean-Louis Dessalles is a French computer scientist and researcher in artificial intelligence and cognitive science, and a professor at Télécom ParisTech (Paris). He is best known for his contributions to simplicity theory and for his original theory of a possible political origin of language.

In mathematics, the incompressibility method is a proof method akin to the probabilistic method, the counting method, or the pigeonhole principle. To prove that an object in a certain class satisfies a certain property, select an object of that class that is incompressible. If it does not satisfy the property, it can be compressed by a computable coding. Since it can generally be proven that almost all objects in a given class are incompressible, the argument demonstrates that almost all objects in the class have the property involved. Selecting an incompressible object is not effective and cannot be done by a computer program. However, a simple counting argument usually shows that almost all objects of a given class can be compressed by only a few bits.
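
The counting argument behind "almost all objects are incompressible" is elementary: there are 2^n binary strings of length n but fewer than 2^(n-k) descriptions shorter than n - k bits, so at most a fraction 2^(-k) of the strings can be compressed by more than k bits. A small sketch of the bound:

```python
def incompressible_fraction(n: int, k: int) -> float:
    """Lower bound on the fraction of n-bit strings that cannot be described
    in fewer than n - k bits: at least 1 - 2**(-k), since only 2**(n - k) - 1
    descriptions shorter than n - k bits exist."""
    compressible_at_most = 2 ** (n - k) - 1
    return 1 - compressible_at_most / 2 ** n

for k in (1, 8, 20):
    print(k, incompressible_fraction(1000, k))  # approaches 1 quickly as k grows
```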
