Information fluctuation complexity is an information-theoretic quantity defined as the fluctuation of information about entropy. It is derivable from fluctuations in the predominance of order and chaos in a dynamic system and has been used as a measure of complexity in many diverse fields. It was introduced in a 1993 paper by Bates and Shepard. [1]
The information fluctuation complexity of a discrete dynamic system is a function of the probability distribution of its states when it is subject to random external input data. The purpose of driving the system with a rich information source such as a random number generator or a white noise signal is to probe the internal dynamics of the system in much the same way as a frequency-rich impulse is used in signal processing.
If a system has $N$ possible states and the state probabilities $p_i$ are known, then its information entropy is

$$\mathrm{H} = \sum_{i=1}^{N} p_i I_i = -\sum_{i=1}^{N} p_i \log p_i ,$$

where $I_i = -\log p_i$ is the information content of state $i$.

The information fluctuation complexity of the system is defined as the standard deviation or fluctuation of $I$ about its mean $\mathrm{H}$:

$$\sigma_I = \sqrt{\sum_{i=1}^{N} p_i \,(I_i - \mathrm{H})^2} = \sqrt{\sum_{i=1}^{N} p_i I_i^2 - \mathrm{H}^2}$$

or

$$\sigma_I = \sqrt{\sum_{i=1}^{N} p_i \log^2 p_i - \left( \sum_{i=1}^{N} p_i \log p_i \right)^{2}} .$$

The fluctuation of state information $\sigma_I$ is zero in a maximally disordered system with all $p_i = 1/N$; the system simply mimics its random inputs. $\sigma_I$ is also zero if the system is perfectly ordered with only one fixed state ($p_1 = 1$), regardless of the inputs. $\sigma_I$ is non-zero between these two extremes, when a mixture of higher-probability states and lower-probability states populates state space.
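For readers who want to experiment with these definitions, here is a minimal sketch in Python; the function name `entropy_and_fluctuation` and the example distributions are illustrative, and base-2 logarithms are assumed so that results come out in bits.

```python
import math

def entropy_and_fluctuation(probs, base=2.0):
    """Return (H, sigma_I) for a discrete probability distribution.

    H       = -sum_i p_i * log(p_i)                 (information entropy)
    sigma_I = sqrt(sum_i p_i * log(p_i)**2 - H**2)  (fluctuation of I about H)
    States with zero probability contribute nothing and are skipped.
    """
    log = lambda x: math.log(x, base)
    H = -sum(p * log(p) for p in probs if p > 0)
    second_moment = sum(p * log(p) ** 2 for p in probs if p > 0)
    sigma_I = math.sqrt(max(second_moment - H * H, 0.0))
    return H, sigma_I

# A maximally disordered distribution (all p_i = 1/N) has sigma_I = 0 ...
print(entropy_and_fluctuation([0.25] * 4))        # H = 2 bits, sigma_I = 0
# ... as does a perfectly ordered one (p_1 = 1) ...
print(entropy_and_fluctuation([1.0, 0.0, 0.0]))   # H = 0, sigma_I = 0
# ... while a mixture of common and rare states gives sigma_I > 0.
print(entropy_and_fluctuation([0.5, 0.25, 0.125, 0.125]))
```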
As a complex dynamic system evolves over time, how it transitions between states depends on external stimuli in an irregular way. At times it may be more sensitive to external stimuli (unstable) and at other times less sensitive (stable). When a given state has multiple possible next-states, external information determines which one will be next and the system gains this information by following a particular trajectory in state space. However, if several different states all lead to the same next-state, then upon entering the next-state the system loses information about which state preceded it. Thus, a complex system exhibits alternating information gain and loss as it evolves over time. This alternation or fluctuation of information is equivalent to remembering and forgetting — temporary information storage or memory — an essential feature of non-trivial computation.
The gain or loss of information associated with transitions between states can be related to state information. The net information gain $\Gamma_{ij}$ of a transition from state $i$ to state $j$ is the information gained when leaving state $i$ less the information lost when entering state $j$:

$$\Gamma_{ij} = -\log p_{i\rightarrow j} + \log p_{i\leftarrow j} .$$

Here $p_{i\rightarrow j}$ is the forward conditional probability that if the present state is $i$ then the next state will be $j$, and $p_{i\leftarrow j}$ is the reverse conditional probability that if the present state is $j$ then the previous state was $i$. The conditional probabilities are related to the transition probability $p_{ij}$, the probability that a transition from state $i$ to state $j$ occurs, by:

$$p_{ij} = p_i \, p_{i\rightarrow j} = p_{i\leftarrow j} \, p_j .$$

Eliminating the conditional probabilities:

$$\Gamma_{ij} = -\log (p_{ij}/p_i) + \log (p_{ij}/p_j) = \log p_i - \log p_j = I_j - I_i .$$
Therefore, the net information gained by the system as a result of the transition depends only on the increase in state information from the initial to the final state. It can be shown that this is true even for multiple consecutive transitions. [1]
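The identity $\Gamma_{ij} = I_j - I_i$ can be checked numerically. The sketch below assumes an arbitrary, hypothetical 3-state transition matrix and base-2 logarithms; it recovers the reverse conditional probabilities from $p_{ij} = p_i\,p_{i\rightarrow j} = p_{i\leftarrow j}\,p_j$ and confirms that the net gain equals the change in state information for every allowed transition.

```python
import math

# Hypothetical 3-state system: P[i][j] is the forward conditional
# probability p_{i->j}; any row-stochastic matrix serves for the check.
P = [[0.50, 0.50, 0.00],
     [0.25, 0.25, 0.50],
     [0.00, 1.00, 0.00]]

# Stationary state probabilities p_i, found by iterating p <- p P.
p = [1.0 / 3] * 3
for _ in range(1000):
    p = [sum(p[i] * P[i][j] for i in range(3)) for j in range(3)]

for i in range(3):
    for j in range(3):
        if P[i][j] == 0.0:
            continue                          # transition never occurs
        p_ij = p[i] * P[i][j]                 # transition probability
        reverse = p_ij / p[j]                 # p_{i<-j}, from p_ij = p_{i<-j} p_j
        gain = -math.log2(P[i][j]) + math.log2(reverse)    # net information gain
        delta_I = -math.log2(p[j]) - (-math.log2(p[i]))    # I_j - I_i
        assert abs(gain - delta_I) < 1e-9
print("Gamma_ij = I_j - I_i holds for every transition")
```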
$\Gamma_{ij} = I_j - I_i$ is reminiscent of the relation between force and potential energy. $I$ is like potential $\Phi$ and $\Gamma$ is like force $\mathbf{F}$ in $\mathbf{F} = \boldsymbol{\nabla}\Phi$. External information "pushes" a system "uphill" to a state of higher information potential to accomplish information storage, much like pushing a mass uphill to a state of higher gravitational potential stores energy. The amount of energy stored depends only on the final height, not on the path up the hill. Similarly, the amount of information stored does not depend on the transition path between an initial common state and a final rare state. Once a system reaches a rare state with high information potential, it may then "fall" back to a common state, losing previously stored information.
It may be useful to compute the standard deviation of $\Gamma$ about its mean (which is zero), namely the fluctuation of net information gain $\sigma_\Gamma$, [1] but $\sigma_I$ takes into account multi-transition memory loops in state space and therefore should be more indicative of the computational power of a system. Moreover, $\sigma_I$ is easier to apply because there can be many more transitions than states.
A dynamic system that is sensitive to external information (unstable) exhibits chaotic behavior, whereas one that is insensitive to external information (stable) exhibits orderly behavior. A complex system exhibits both behaviors, fluctuating between them in dynamic balance when subject to a rich information source. The degree of fluctuation is quantified by $\sigma_I$; it captures the alternation in the predominance of chaos and order in a complex system as it evolves over time.
Source: [2]
The rule 110 variant of the elementary cellular automaton has been proven to be capable of universal computation. The proof is based on the existence and interactions of cohesive and self-perpetuating cell patterns known as gliders, which are examples of emergent phenomena associated with complex systems and which imply the capability of groups of automaton cells to remember that a glider is passing through them. It is therefore to be expected that there will be memory loops in state space resulting from alternations of information gain and loss, instability and stability, chaos and order.
Consider a 3-cell group of adjacent automaton cells that obey rule 110: end-center-end. The next state of the center cell depends on the present state of itself and the end cells as specified by the rule:
3-cell group | 1-1-1 | 1-1-0 | 1-0-1 | 1-0-0 | 0-1-1 | 0-1-0 | 0-0-1 | 0-0-0 |
---|---|---|---|---|---|---|---|---|
next center cell | 0 | 1 | 1 | 0 | 1 | 1 | 1 | 0 |
To compute the information fluctuation complexity of this system, attach a driver cell to each end of the 3-cell group to provide random external stimuli like so, driver→end-center-end←driver, such that the rule can be applied to the two end cells. Next, determine what the next state will be for each possible present state and for each possible combination of driver cell contents, in order to determine the forward conditional probabilities.
The state diagram of this system is depicted below, with circles representing states and arrows representing transitions between states. The eight possible states of this system, 1-1-1 to 0-0-0, are labeled with the octal equivalent of the 3-bit contents of the 3-cell group: 7 to 0. The transition arrows are labeled with forward conditional probabilities. Notice that there is variability in the divergence and convergence of arrows corresponding to variability in gain and loss of information originating from the driver cells.
The forward conditional probabilities are determined by the proportion of possible driver cell contents that drive a particular transition. For example, for the four possible combinations of two driver cell contents, state 7 leads to states 5, 4, 1 and 0, and therefore $p_{7\rightarrow 5}$, $p_{7\rightarrow 4}$, $p_{7\rightarrow 1}$ and $p_{7\rightarrow 0}$ are each 1⁄4 or 25%. Similarly, state 0 leads to states 0, 1, 0 and 1, and therefore $p_{0\rightarrow 0}$ and $p_{0\rightarrow 1}$ are each 1⁄2 or 50%. And so forth.
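A short sketch of this bookkeeping, assuming the rule table given above and equally likely driver cell contents (the helper names `step` and `octal` are illustrative), is:

```python
from collections import defaultdict
from itertools import product

# Rule 110 lookup table from the text: (left, center, right) -> next center.
RULE_110 = {(1,1,1): 0, (1,1,0): 1, (1,0,1): 1, (1,0,0): 0,
            (0,1,1): 1, (0,1,0): 1, (0,0,1): 1, (0,0,0): 0}

def step(state, d_left, d_right):
    """Advance the 3-cell group (a, b, c) one generation with driver cells attached."""
    a, b, c = state
    return (RULE_110[(d_left, a, b)],
            RULE_110[(a, b, c)],
            RULE_110[(b, c, d_right)])

octal = lambda s: 4 * s[0] + 2 * s[1] + s[2]

# Each of the four driver combinations is equally likely, so each contributes 1/4
# to the forward conditional probability p_{i->j}.
forward = defaultdict(float)
for state in product((0, 1), repeat=3):
    for d_left, d_right in product((0, 1), repeat=2):
        forward[(octal(state), octal(step(state, d_left, d_right)))] += 0.25

for (i, j), prob in sorted(forward.items()):
    print(f"p({i} -> {j}) = {prob:.2f}")
# Among the printed values: p(7 -> 5), p(7 -> 4), p(7 -> 1), p(7 -> 0) are 0.25
# and p(0 -> 0), p(0 -> 1) are 0.50, matching the text.
```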
The state probabilities are related by

$$p_j = \sum_{i=0}^{7} p_i \, p_{i\rightarrow j} \quad \text{for } j = 0, \ldots, 7 \qquad \text{and} \qquad \sum_{i=0}^{7} p_i = 1 .$$
These linear algebraic equations can be solved for the state probabilities, with the following results: [2]
$p_0$ | $p_1$ | $p_2$ | $p_3$ | $p_4$ | $p_5$ | $p_6$ | $p_7$ |
---|---|---|---|---|---|---|---|
2/17 | 2/17 | 1/34 | 5/34 | 2/17 | 2/17 | 2/17 | 4/17 |
The information entropy and the complexity can then be computed from the state probabilities:

$$\mathrm{H} = -\sum_{i=0}^{7} p_i \log_2 p_i = 2.86 \text{ bits} \qquad \text{and} \qquad \sigma_I = 0.56 \text{ bits} .$$
Note that the maximum possible entropy for eight states is $\log_2 8 = 3$ bits, which is the case when all $p_i = 1/8$. Thus, rule 110 has a relatively high entropy or state utilization of $2.86/3 = 95\%$. However, this does not preclude a considerable fluctuation of state information about entropy and hence a considerable value of the complexity, whereas maximum entropy itself would preclude complexity.
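The following sketch reproduces this analytical result numerically. Instead of solving the linear equations symbolically, it builds the forward transition matrix as above and finds the stationary state probabilities by power iteration (an assumption of this sketch, not the method of [2]), then evaluates $\mathrm{H}$ and $\sigma_I$.

```python
import math
from itertools import product

RULE_110 = {(1,1,1): 0, (1,1,0): 1, (1,0,1): 1, (1,0,0): 0,
            (0,1,1): 1, (0,1,0): 1, (0,0,1): 1, (0,0,0): 0}

# States 0..7 in octal order: index k corresponds to the 3-bit pattern of k.
states = list(product((0, 1), repeat=3))

# Forward transition matrix P[i][j] = p_{i->j} for the driver-driven 3-cell group.
P = [[0.0] * 8 for _ in range(8)]
for i, (a, b, c) in enumerate(states):
    for dl, dr in product((0, 1), repeat=2):
        nxt = (RULE_110[(dl, a, b)], RULE_110[(a, b, c)], RULE_110[(b, c, dr)])
        P[i][states.index(nxt)] += 0.25

# Stationary state probabilities: iterate p <- p P until it settles.
p = [1.0 / 8] * 8
for _ in range(10_000):
    p = [sum(p[i] * P[i][j] for i in range(8)) for j in range(8)]

H = -sum(x * math.log2(x) for x in p if x > 0)
sigma_I = math.sqrt(sum(x * math.log2(x) ** 2 for x in p if x > 0) - H * H)
print([round(x, 4) for x in p])        # ~ [2/17, 2/17, 1/34, 5/34, 2/17, 2/17, 2/17, 4/17]
print(round(H, 2), round(sigma_I, 2))  # ~ 2.86 0.56
```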
An alternative method can be used to obtain the state probabilities when the analytical method used above is unfeasible. Simply drive the system at its inputs (the driver cells) with a random source for many generations and observe the state probabilities empirically. When this is done via computer simulation for 10 million generations the results are as follows: [2]
number of cells | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |
---|---|---|---|---|---|---|---|---|---|---|---|
$\mathrm{H}$ (bits) | 2.86 | 3.81 | 4.73 | 5.66 | 6.56 | 7.47 | 8.34 | 9.25 | 10.09 | 10.97 | 11.78 |
$\sigma_I$ (bits) | 0.56 | 0.65 | 0.72 | 0.73 | 0.79 | 0.81 | 0.89 | 0.90 | 1.00 | 1.01 | 1.15 |
$\sigma_I / \mathrm{H}$ | 0.20 | 0.17 | 0.15 | 0.13 | 0.12 | 0.11 | 0.11 | 0.10 | 0.10 | 0.09 | 0.10 |
Since both $\mathrm{H}$ and $\sigma_I$ increase with system size, their dimensionless ratio $\sigma_I / \mathrm{H}$, the relative information fluctuation complexity, is included to compare systems of different sizes. Notice that the empirical and analytical results agree for the 3-cell automaton and that the relative complexity levels off to about 0.10 by 10 cells.
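A sketch of this empirical method for the 3-cell case is given below; the generation count, random seed, and helper name `empirical_complexity` are arbitrary choices, and runs far shorter than the 10 million generations used in [2] will give correspondingly noisier estimates.

```python
import math
import random
from collections import Counter

RULE_110 = {(1,1,1): 0, (1,1,0): 1, (1,0,1): 1, (1,0,0): 0,
            (0,1,1): 1, (0,1,0): 1, (0,0,1): 1, (0,0,0): 0}

def empirical_complexity(n_cells, generations, seed=0):
    """Drive an n-cell rule 110 group with random driver cells at both ends
    and estimate H and sigma_I from the observed state frequencies."""
    rng = random.Random(seed)
    cells = [rng.randint(0, 1) for _ in range(n_cells)]
    counts = Counter()
    for _ in range(generations):
        padded = [rng.randint(0, 1)] + cells + [rng.randint(0, 1)]  # attach the drivers
        cells = [RULE_110[tuple(padded[k:k + 3])] for k in range(n_cells)]
        counts[tuple(cells)] += 1
    probs = [n / generations for n in counts.values()]
    H = -sum(q * math.log2(q) for q in probs)
    sigma = math.sqrt(max(sum(q * math.log2(q) ** 2 for q in probs) - H * H, 0.0))
    return H, sigma

H, sigma = empirical_complexity(n_cells=3, generations=100_000)
print(round(H, 2), round(sigma, 2), round(sigma / H, 2))  # roughly 2.86, 0.56, 0.20
```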
In the paper by Bates and Shepard, [1] $\sigma_I$ is computed for all elementary cellular automaton rules, and it was observed that the ones that exhibit slow-moving gliders and possibly stationary objects, as rule 110 does, are highly correlated with large values of $\sigma_I$. $\sigma_I$ can therefore be used as a filter to select candidate rules for universal computation, which is challenging to prove.
Although the derivation of the information fluctuation complexity formula is based on information fluctuations in dynamic systems, the formula depends only on state probabilities and therefore is also applicable to any probability distribution, including those derived from static images or text.
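As a small illustration, the same two formulas can be applied to, say, the character frequencies of a string; the text in this sketch is purely illustrative, and an image histogram or any other empirical distribution would work the same way.

```python
import math
from collections import Counter

text = "information fluctuation complexity"   # purely illustrative input
counts = Counter(text)
total = sum(counts.values())
probs = [n / total for n in counts.values()]

H = -sum(q * math.log2(q) for q in probs)
sigma_I = math.sqrt(sum(q * math.log2(q) ** 2 for q in probs) - H * H)
print(round(H, 3), round(sigma_I, 3))
```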
Over the years the original paper [1] has been referred to by researchers in many diverse fields: complexity theory, [3] complex systems science, [4] complex networks, [5] chaotic dynamics, [6] many-body localization entanglement, [7] environmental engineering, [8] ecological complexity, [9] ecological time-series analysis, [10] ecosystem sustainability, [11] air [12] and water [13] pollution, hydrological wavelet analysis, [14] soil water flow, [15] soil moisture, [16] headwater runoff, [17] groundwater depth, [18] air traffic control, [19] flow patterns [20] and flood events, [21] topology, [22] economics, [23] market forecasting of metal [24] and electricity [25] prices, health informatics, [26] human cognition, [27] human gait kinematics, [28] neurology, [29] EEG analysis, [30] education, [31] investing, [32] artificial life [33] and aesthetics. [34]