Additive Markov chain

In probability theory, an additive Markov chain is a Markov chain with an additive conditional probability function. Here the process is a discrete-time Markov chain of order m, and the transition probability to a state at the next time step is a sum of functions, each depending on the next state and on one of the m previous states.

Definition

An additive Markov chain of order m is a sequence of random variables X1, X2, X3, ..., possessing the following property: the probability that a random variable Xn has a certain value xn, under the condition that the values of all previous variables are fixed, depends on the values of the m previous variables only (Markov chain of order m), and the influence of the previous variables on the generated one is additive:

$$\Pr(X_n = x_n \mid X_{n-1} = x_{n-1}, X_{n-2} = x_{n-2}, \ldots, X_{n-m} = x_{n-m}) = \sum_{r=1}^{m} f(x_n, x_{n-r}, r).$$

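The additive structure means the conditional distribution of the next state is assembled term by term from pairwise contributions of the m previous states. The following minimal Python sketch illustrates this, assuming a hypothetical user-supplied pairwise function f(x_next, x_prev, r) chosen so that the resulting probabilities are non-negative and sum to one over the state space; it is an illustration of the definition above, not an implementation from the source.

```python
import numpy as np

def transition_probs(history, states, f, m):
    """Conditional distribution of the next state of an order-m additive
    Markov chain: Pr(X_n = x | history) = sum_{r=1}^{m} f(x, history[-r], r).

    `history` holds at least the last m states, `states` is the state space,
    and f(x_next, x_prev, r) is the contribution of the state r steps back.
    The f's are assumed to yield valid probabilities (non-negative, sum to 1).
    """
    return np.array([sum(f(x, history[-r], r) for r in range(1, m + 1))
                     for x in states])

def step(history, states, f, m, rng):
    """Sample the next state given the last m states."""
    probs = transition_probs(history, states, f, m)
    return rng.choice(states, p=probs)
```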
Binary case

A binary additive Markov chain is one whose state space consists of two values only, $X_n \in \{x_1, x_2\}$; for example, $X_n \in \{0, 1\}$. The conditional probability function of a binary additive Markov chain can be represented as

$$\Pr(X_n = 1 \mid X_{n-1} = x_{n-1}, \ldots, X_{n-m} = x_{n-m}) = \bar{X} + \sum_{r=1}^{m} F(r)\,(x_{n-r} - \bar{X}),$$

$$\Pr(X_n = 0 \mid X_{n-1} = x_{n-1}, \ldots, X_{n-m} = x_{n-m}) = 1 - \Pr(X_n = 1 \mid X_{n-1} = x_{n-1}, \ldots, X_{n-m} = x_{n-m}).$$

Here $\bar{X}$ is the probability of finding $X_n = 1$ in the sequence and $F(r)$ is referred to as the memory function. The value of $\bar{X}$ and the function $F(r)$ contain all the information about the correlation properties of the Markov chain.
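As a concrete illustration, here is a short Python sketch that generates a binary additive Markov chain directly from the conditional probability above. The exponentially decaying memory function used in the example is an assumption for illustration only; any F(r) keeping the probability inside [0, 1] would do.

```python
import numpy as np

def simulate_binary_chain(n_steps, m, x_bar, F, seed=0):
    """Generate a binary additive Markov chain of order m using
    Pr(X_n = 1 | past) = x_bar + sum_{r=1}^{m} F(r) * (x_{n-r} - x_bar).

    `F` maps the distance r = 1..m to the memory function value; it must be
    chosen so that the conditional probability stays within [0, 1].
    """
    rng = np.random.default_rng(seed)
    # Start from m i.i.d. values with mean x_bar.
    x = list(rng.random(m) < x_bar)
    for n in range(m, n_steps):
        p = x_bar + sum(F(r) * (x[n - r] - x_bar) for r in range(1, m + 1))
        x.append(rng.random() < p)
    return np.array(x, dtype=int)

# Example with a (hypothetical) exponentially decaying memory function.
chain = simulate_binary_chain(n_steps=100_000, m=10, x_bar=0.5,
                              F=lambda r: 0.3 * 0.5 ** (r - 1))
print(chain.mean())  # should be close to x_bar = 0.5
```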

Relation between the memory function and the correlation function

In the binary case, the correlation function between the variables $X_n$ and $X_k$ of the chain depends only on the distance $r = k - n$. It is defined as follows:

$$K(r) = \langle (X_n - \bar{X})(X_{n+r} - \bar{X}) \rangle = \langle X_n X_{n+r} \rangle - \bar{X}^2,$$

where the symbol $\langle \cdots \rangle$ denotes averaging over all $n$. By definition,

$$K(-r) = K(r), \qquad K(0) = \bar{X}(1 - \bar{X}).$$

There is a relation between the memory function and the correlation function of the binary additive Markov chain: [1]

$$K(r) = \sum_{s=1}^{m} F(s)\,K(r - s), \qquad r = 1, 2, \ldots$$
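This relation can be checked numerically: estimate K(r) from a simulated sequence and compare both sides. The sketch below is a rough illustration that reuses the chain, memory function F, and simulate_binary_chain helper from the previous example (assumptions of this illustration, not part of the source).

```python
import numpy as np

def correlation_function(x, r_max):
    """Empirical K(r) = <X_n X_{n+r}> - x_bar^2 for r = 0..r_max."""
    x = np.asarray(x, dtype=float)
    x_bar = x.mean()
    return np.array([np.mean(x[:len(x) - r] * x[r:]) - x_bar ** 2
                     for r in range(r_max + 1)])

# Reuse the memory function and simulator from the previous sketch.
m, F = 10, (lambda r: 0.3 * 0.5 ** (r - 1))
chain = simulate_binary_chain(n_steps=100_000, m=m, x_bar=0.5, F=F)
K = correlation_function(chain, r_max=2 * m)

# Check K(r) = sum_{s=1}^{m} F(s) K(r - s), using the symmetry K(-r) = K(r).
for r in range(1, m + 1):
    rhs = sum(F(s) * K[abs(r - s)] for s in range(1, m + 1))
    print(r, round(K[r], 5), round(rhs, 5))  # equal up to sampling noise
```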

Notes

  1. S. S. Melnyk, O. V. Usatenko, and V. A. Yampol’skii (2006). "Memory functions of the additive Markov chains: applications to complex dynamic systems". Physica A. 361 (2): 405–415. doi:10.1016/j.physa.2005.06.083.
