Markov chain mixing time

In probability theory, the mixing time of a Markov chain is the time until the Markov chain is "close" to its steady state distribution.

More precisely, a fundamental result about Markov chains is that a finite-state, irreducible, aperiodic chain has a unique stationary distribution π and, regardless of the initial state, the time-t distribution of the chain converges to π as t tends to infinity. Mixing time refers to any of several variant formalizations of the idea: how large must t be until the time-t distribution is approximately π? One variant, variation distance mixing time, is defined as the smallest t such that the total variation distance of probability measures falls below a prescribed threshold ε:

|Pr(X_t ∈ A | X_0 = x) − π(A)| ≤ ε

for all subsets A of states and all initial states x. This is the sense in which Dave Bayer and Persi Diaconis (1992) proved that the number of riffle shuffles needed to mix an ordinary 52-card deck is 7. Mathematical theory focuses on how mixing times change as a function of the size of the structure underlying the chain. For an n-card deck, the number of riffle shuffles needed grows as (3/2) log₂ n. The most developed theory concerns randomized algorithms for #P-complete algorithmic counting problems such as the number of graph colorings of a given n-vertex graph. Such problems can, for a sufficiently large number of colors, be answered using the Markov chain Monte Carlo method together with a proof that the mixing time grows only as n log(n) (Jerrum 1995). This example and the shuffling example possess the rapid mixing property: the mixing time grows at most polynomially fast in log(number of states of the chain). Tools for proving rapid mixing include arguments based on conductance and the method of coupling. In broader uses of the Markov chain Monte Carlo method, rigorous justification of simulation results would require a theoretical bound on mixing time, and many interesting practical cases have resisted such theoretical analysis.
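As an illustrative sketch (not drawn from the cited results), the variation distance mixing time of a small chain can be estimated numerically: iterate the transition matrix, measure the total variation distance to π from every starting state, and report the first t at which the maximum falls below a chosen ε. The transition matrix and the threshold ε = 0.25 below are arbitrary example choices.

```python
import numpy as np

def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * np.abs(p - q).sum()

def mixing_time(P, eps=0.25, t_max=10_000):
    """Smallest t such that every starting state is within eps of pi in TV distance."""
    n = P.shape[0]
    # Stationary distribution: left eigenvector of P for eigenvalue 1.
    w, v = np.linalg.eig(P.T)
    pi = np.real(v[:, np.argmin(np.abs(w - 1))])
    pi = pi / pi.sum()
    dist = np.eye(n)  # row i = time-t distribution when started in state i
    for t in range(1, t_max + 1):
        dist = dist @ P
        if max(total_variation(dist[i], pi) for i in range(n)) <= eps:
            return t
    return None  # did not get within eps of pi in t_max steps

# Example: a lazy walk on three states.
P = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])
print(mixing_time(P))  # mixes in very few steps
```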

Related Research Articles

Shuffling

Shuffling is a procedure used to randomize a deck of playing cards to provide an element of chance in card games. Shuffling is often followed by a cut, to help ensure that the shuffler has not manipulated the outcome.

Markov chain

A Markov chain is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event.

Persi Diaconis

Persi Warren Diaconis is an American mathematician of Greek descent and former professional magician. He is the Mary V. Sunseri Professor of Statistics and Mathematics at Stanford University.

A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process with unobservable states.

In probability theory, de Finetti's theorem states that exchangeable observations are conditionally independent relative to some latent variable. An epistemic probability distribution could then be assigned to this variable. It is named in honor of Bruno de Finetti.

Random walk

A random walk is a mathematical object, known as a stochastic or random process, that describes a path that consists of a succession of random steps on some mathematical space such as the integers. An elementary example of a random walk is the random walk on the integer number line, ℤ, which starts at 0 and at each step moves +1 or −1 with equal probability. Other examples include the path traced by a molecule as it travels in a liquid or a gas, the search path of a foraging animal, the price of a fluctuating stock and the financial status of a gambler: all can be approximated by random walk models, even though they may not be truly random in reality. As illustrated by those examples, random walks have applications to engineering and many scientific fields including ecology, psychology, computer science, physics, chemistry, biology as well as economics. Random walks explain the observed behaviors of many processes in these fields, and thus serve as a fundamental model for the recorded stochastic activity. As a more mathematical application, the value of π can be approximated by the use of a random walk in an agent-based modeling environment. The term random walk was first introduced by Karl Pearson in 1905.
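A minimal simulation of the elementary example above (the function name and the step count are illustrative choices):

```python
import random

def simple_random_walk(steps, start=0):
    """Symmetric random walk on the integers: each step is +1 or -1 with equal probability."""
    position = start
    path = [position]
    for _ in range(steps):
        position += random.choice((+1, -1))
        path.append(position)
    return path

print(simple_random_walk(10))  # e.g. [0, 1, 0, -1, 0, 1, 2, 1, 2, 3, 2]
```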

In statistics, Markov chain Monte Carlo (MCMC) methods comprise a class of algorithms for sampling from a probability distribution. By constructing a Markov chain that has the desired distribution as its equilibrium distribution, one can obtain a sample of the desired distribution by recording states from the chain. The more steps that are included, the more closely the distribution of the sample matches the actual desired distribution. Various algorithms exist for constructing the Markov chain including the Metropolis–Hastings algorithm.
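As a hedged sketch of the idea, the random-walk Metropolis variant of Metropolis–Hastings below samples from a target known only up to a normalizing constant; the Gaussian proposal, its step size, and the standard-normal example target are assumptions made for illustration.

```python
import math
import random

def metropolis_hastings(log_target, step, x0, n_samples):
    """Random-walk Metropolis sampler for an unnormalized log density."""
    x = x0
    samples = []
    for _ in range(n_samples):
        candidate = x + random.gauss(0.0, step)  # symmetric Gaussian proposal
        # Accept with probability min(1, target(candidate) / target(x)).
        log_accept = min(0.0, log_target(candidate) - log_target(x))
        if random.random() < math.exp(log_accept):
            x = candidate
        samples.append(x)  # the recorded states form the (correlated) sample
    return samples

# Example: standard normal target, log density -x^2/2 up to a constant.
samples = metropolis_hastings(lambda x: -0.5 * x * x, 1.0, 0.0, 50_000)
print(sum(samples) / len(samples))  # should be near 0
```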

Graph coloring

In graph theory, graph coloring is a special case of graph labeling; it is an assignment of labels traditionally called "colors" to elements of a graph subject to certain constraints. In its simplest form, it is a way of coloring the vertices of a graph such that no two adjacent vertices are of the same color; this is called a vertex coloring. Similarly, an edge coloring assigns a color to each edge so that no two adjacent edges are of the same color, and a face coloring of a planar graph assigns a color to each face or region so that no two faces that share a boundary have the same color.
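For illustration only (the greedy heuristic below is not mentioned in the text and generally uses more colors than necessary), a vertex coloring satisfying the adjacency constraint can be produced by giving each vertex the smallest color unused among its already-colored neighbors:

```python
def greedy_coloring(adjacency):
    """Proper vertex coloring: adjacent vertices never share a color (not necessarily optimal)."""
    colors = {}
    for v in adjacency:
        taken = {colors[u] for u in adjacency[v] if u in colors}
        c = 0
        while c in taken:
            c += 1
        colors[v] = c
    return colors

# Example: a 4-cycle, which needs only two colors.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(greedy_coloring(cycle))  # {0: 0, 1: 1, 2: 0, 3: 1}
```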

A randomized algorithm is an algorithm that employs a degree of randomness as part of its logic. The algorithm typically uses uniformly random bits as an auxiliary input to guide its behavior, in the hope of achieving good performance in the "average case" over all possible choices of random bits. Formally, the algorithm's performance will be a random variable determined by the random bits; thus either the running time or the output (or both) is a random variable.

In statistics, the Freedman–Diaconis rule can be used to select the width of the bins to be used in a histogram. It is named after David A. Freedman and Persi Diaconis.
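The rule sets the bin width to 2·IQR·n^(−1/3); a small sketch (the normal sample used here is an arbitrary example):

```python
import numpy as np

def freedman_diaconis_bin_width(data):
    """Bin width 2 * IQR / n^(1/3) from the Freedman-Diaconis rule."""
    data = np.asarray(data)
    q75, q25 = np.percentile(data, [75, 25])
    return 2.0 * (q75 - q25) / len(data) ** (1.0 / 3.0)

sample = np.random.default_rng(0).normal(size=1000)
width = freedman_diaconis_bin_width(sample)
bins = int(np.ceil((sample.max() - sample.min()) / width))
print(width, bins)  # chosen bin width and resulting number of bins
```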

Tutte polynomial

The Tutte polynomial, also called the dichromate or the Tutte–Whitney polynomial, is a graph polynomial. It is a polynomial in two variables which plays an important role in graph theory. It is defined for every undirected graph and contains information about how the graph is connected. It is denoted by T_G(x, y).

In computer science, the longest increasing subsequence problem is to find a subsequence of a given sequence in which the subsequence's elements are in sorted order, lowest to highest, and in which the subsequence is as long as possible. This subsequence is not necessarily contiguous, or unique. Longest increasing subsequences are studied in the context of various disciplines related to mathematics, including algorithmics, random matrix theory, representation theory, and physics. The longest increasing subsequence problem is solvable in time O(n log n), where n denotes the length of the input sequence.
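A standard O(n log n) approach keeps, for each length, the smallest possible tail value of an increasing subsequence of that length; a sketch for the strictly increasing variant (the example sequence is illustrative):

```python
from bisect import bisect_left

def lis_length(sequence):
    """Length of the longest strictly increasing subsequence in O(n log n)."""
    tails = []  # tails[k] = smallest tail of any increasing subsequence of length k + 1
    for x in sequence:
        i = bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)   # x extends the longest subsequence found so far
        else:
            tails[i] = x      # x gives a smaller tail for subsequences of length i + 1
    return len(tails)

print(lis_length([0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15]))  # 6
```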

Bayesian inference of phylogeny uses a likelihood function to create a quantity called the posterior probability of trees using a model of evolution, based on some prior probabilities, producing the most likely phylogenetic tree for the given data. The Bayesian approach has become popular due to advances in computing speeds and the integration of Markov chain Monte Carlo (MCMC) algorithms. Bayesian inference has a number of applications in molecular phylogenetics and systematics.

Mark Richard Jerrum is a British computer scientist and computational theorist.

In probability theory, the uniformization method is a method to compute transient solutions of finite-state continuous-time Markov chains by approximating the process with a discrete-time Markov chain. The original chain is scaled by the fastest transition rate γ, so that transitions occur at the same rate in every state, hence the name uniformization. The method is simple to program and efficiently calculates an approximation to the transient distribution at a single point in time. The method was first introduced by Winfried Grassmann in 1977.
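A minimal sketch of the computation described above, assuming a generator matrix Q whose rows sum to zero; the two-state example rates are arbitrary. The transient distribution is the Poisson-weighted sum of powers of the uniformized transition matrix P = I + Q/γ.

```python
import math
import numpy as np

def uniformization(Q, p0, t, tol=1e-10):
    """Transient distribution p(t) of a CTMC with generator Q via uniformization."""
    gamma = max(-Q.diagonal())              # fastest transition rate
    P = np.eye(Q.shape[0]) + Q / gamma      # uniformized discrete-time chain
    term = np.asarray(p0, dtype=float)      # holds p0 @ P^k, updated in the loop
    weight = math.exp(-gamma * t)           # Poisson(gamma * t) probability of k = 0 jumps
    result = weight * term
    k, mass = 0, weight
    while 1.0 - mass > tol:                 # stop once the remaining Poisson mass is negligible
        k += 1
        term = term @ P
        weight *= gamma * t / k
        result += weight * term
        mass += weight
    return result

# Example: two states, rate 1 from state 0 to 1 and rate 2 back.
Q = np.array([[-1.0,  1.0],
              [ 2.0, -2.0]])
print(uniformization(Q, [1.0, 0.0], t=5.0))  # approaches the stationary [2/3, 1/3]
```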

Edgar Nelson Gilbert was an American mathematician and coding theorist, a longtime researcher at Bell Laboratories whose accomplishments include the Gilbert–Varshamov bound in coding theory, the Gilbert–Elliott model of bursty errors in signal transmission, and the Erdős–Rényi model for random graphs.

In the mathematics of permutations and the study of shuffling playing cards, a riffle shuffle permutation is one of the permutations of a set of n items that can be obtained by a single riffle shuffle, in which a sorted deck of n cards is cut into two packets and then the two packets are interleaved. Beginning with an ordered set, a riffle shuffle is defined mathematically as a permutation of this set containing 1 or 2 rising sequences. The only permutation with 1 rising sequence is the identity permutation.
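A small helper for counting rising sequences (the function and examples are illustrative, for permutations written on 0..n−1): the count is one more than the number of values whose successor appears to their left.

```python
def rising_sequences(perm):
    """Number of rising sequences of a permutation of 0..n-1."""
    position = {value: index for index, value in enumerate(perm)}
    # A new rising sequence starts whenever v + 1 already appeared to the left of v.
    breaks = sum(1 for v in range(len(perm) - 1) if position[v + 1] < position[v])
    return breaks + 1

print(rising_sequences([0, 3, 1, 4, 2, 5]))  # 2: runs 0,1,2 and 3,4,5, i.e. a riffle shuffle
print(rising_sequences([0, 1, 2, 3]))        # 1: the identity permutation
```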

In the mathematics of shuffling playing cards, the Gilbert–Shannon–Reeds model is a probability distribution on riffle shuffle permutations that has been reported to be a good match for experimentally observed outcomes of human shuffling, and that forms the basis for a recommendation that a deck of cards should be riffled seven times in order to thoroughly randomize it. It is named after the work of Edgar Gilbert, Claude Shannon, and J. Reeds, reported in a 1955 technical report by Gilbert and in a 1981 unpublished manuscript of Reeds.
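One common description of the model: cut the deck at a Binomial(n, 1/2) position, then repeatedly drop the next card from a packet chosen with probability proportional to its current size. A sketch under that description (the deck representation and the seven-shuffle loop are illustrative):

```python
import random

def gsr_riffle_shuffle(deck):
    """One Gilbert-Shannon-Reeds riffle shuffle of a list of cards."""
    n = len(deck)
    cut = sum(random.random() < 0.5 for _ in range(n))  # Binomial(n, 1/2) cut point
    left, right = list(deck[:cut]), list(deck[cut:])
    shuffled = []
    while left or right:
        # Drop from a packet with probability proportional to its remaining size.
        if random.random() * (len(left) + len(right)) < len(left):
            shuffled.append(left.pop(0))
        else:
            shuffled.append(right.pop(0))
    return shuffled

deck = list(range(52))
for _ in range(7):  # the recommended seven riffles for a 52-card deck
    deck = gsr_riffle_shuffle(deck)
print(deck[:10])
```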

Laurent Saloff-Coste is a French mathematician whose research is in analysis, probability theory, and geometric group theory.

In computational complexity theory, a planted clique or hidden clique in an undirected graph is a clique formed from another graph by selecting a subset of vertices and adding edges between each pair of vertices in the subset. The planted clique problem is the algorithmic problem of distinguishing random graphs from graphs that have a planted clique. This is a variation of the clique problem; it may be solved in quasi-polynomial time but is conjectured not to be solvable in polynomial time for intermediate values of the clique size. The conjecture that no polynomial time solution exists is called the planted clique conjecture; it has been used as a computational hardness assumption.
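A small generator for planted-clique instances, following the construction described above (the graph size, clique size, and edge probability 1/2 are example parameters):

```python
import itertools
import random

def plant_clique(n, k, p=0.5, seed=None):
    """Erdos-Renyi G(n, p) graph with a clique planted on k random vertices."""
    rng = random.Random(seed)
    edges = {frozenset(e) for e in itertools.combinations(range(n), 2)
             if rng.random() < p}
    clique = rng.sample(range(n), k)                                    # hidden vertex subset
    edges |= {frozenset(e) for e in itertools.combinations(clique, 2)}  # connect every pair in it
    return edges, sorted(clique)

edges, clique = plant_clique(n=50, k=10, seed=1)
print(len(edges), clique)
```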

References