Examples of Markov chains

This article contains examples of Markov chains and Markov processes in action.


All examples are in the countable state space. For an overview of Markov chains in general state space, see Markov chains on a measurable state space.


Board games played with dice

A game of snakes and ladders, or any other game whose moves are determined entirely by dice, is a Markov chain; indeed, it is an absorbing Markov chain. This is in contrast to card games such as blackjack, where the cards represent a 'memory' of the past moves. To see the difference, consider the probability of a certain event in the game. In the above-mentioned dice games, the only thing that matters is the current state of the board. The next state of the board depends on the current state and the next roll of the dice. It does not depend on how things got to their current state. In a game such as blackjack, a player can gain an advantage by remembering which cards have already been shown (and hence which cards are no longer in the deck), so the next state (or hand) of the game is not independent of the past states.
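The dependence on the current square alone can be made concrete with a toy board. The following is a minimal sketch, not a model of any real game: the board size, the single ladder and snake in `JUMPS`, and the rule that an overshooting roll is forfeited are all assumptions chosen for illustration.

```python
from fractions import Fraction

LAST = 9                      # assumed board: squares 0..9, square 9 ends the game
JUMPS = {2: 6, 8: 3}          # hypothetical ladder (2 -> 6) and snake (8 -> 3)
DIE = range(1, 7)             # fair six-sided die

def transition_matrix():
    """Return P, where P[i][j] is the chance of moving from square i to j in one turn."""
    size = LAST + 1
    P = [[Fraction(0)] * size for _ in range(size)]
    for i in range(size):
        if i == LAST:
            P[i][i] = Fraction(1)          # finished game: an absorbing state
            continue
        for roll in DIE:
            j = i + roll
            if j > LAST:
                j = i                      # assumed rule: an overshooting roll is forfeited
            else:
                j = JUMPS.get(j, j)        # follow a ladder or snake that starts on j
            P[i][j] += Fraction(1, len(DIE))
    return P

P = transition_matrix()
```

Each row of `P` is built from the current square and the die alone, which is exactly the Markov property; the last row keeps all of its probability on the final square, which is what makes the chain absorbing.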

Random walk Markov chains

A center-biased random walk

Consider a random walk on the number line where, at each step, the position (call it x) may change by +1 (to the right) or −1 (to the left) with probabilities:

\[
P_{\text{move left}} = \frac{1}{2} + \frac{1}{2}\left(\frac{x}{c + |x|}\right)
\]

\[
P_{\text{move right}} = 1 - P_{\text{move left}}
\]

(where c is a constant greater than 0)

For example, if the constant, c, equals 1, the probabilities of a move to the left at positions x = −2, −1, 0, 1, 2 are given by 1/6, 1/4, 1/2, 3/4, 5/6 respectively. The random walk has a centering effect that weakens as c increases.

Since the probabilities depend only on the current position (value of x) and not on any prior positions, this biased random walk satisfies the definition of a Markov chain.
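A short simulation can illustrate the centering effect. This sketch assumes the left-move probability has the form 1/2 + (1/2)·x/(c + |x|), which is consistent with a centering effect that weakens as c increases; the function names `p_left` and `walk` are illustrative, not from any library.

```python
from fractions import Fraction
import random

def p_left(x, c=1):
    """Probability of stepping left from position x (assumed formula, see lead-in)."""
    return Fraction(1, 2) + Fraction(1, 2) * Fraction(x) / (Fraction(c) + abs(x))

def walk(steps, c=1, start=0, seed=0):
    """Simulate the walk; each step uses only the current position x."""
    rng = random.Random(seed)
    x = start
    path = [x]
    for _ in range(steps):
        x += -1 if rng.random() < p_left(x, c) else 1
        path.append(x)
    return path

# With c = 1, the left-move probabilities at x = -2..2 are 1/6, 1/4, 1/2, 3/4, 5/6.
left_probs = [p_left(x) for x in (-2, -1, 0, 1, 2)]
```

Because `walk` consults only the current `x` when drawing the next step, the simulated sequence is a Markov chain by construction.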


Suppose that one starts with $10 and wagers $1 on each toss of a fair coin, continuing indefinitely or until all of the money is lost. If \(X_n\) represents the number of dollars one has after n tosses, with \(X_0 = 10\), then the sequence \(\{X_n : n \in \mathbb{N}\}\) is a Markov process. If one knows that one has $12 now, then one expects, with even odds, to have either $11 or $13 after the next toss. This guess is not improved by the added knowledge that one started with $10, then went up to $11, down to $10, up to $11, and then to $12. The fact that the guess is not improved by knowledge of earlier tosses showcases the Markov property, the memoryless property of a stochastic process. [1]
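The wagering process can be sketched as a simulation. The cap `max_tosses` is an assumption added only so that the function always returns; the process described in the text has no such limit.

```python
import random

def gamble(start=10, max_tosses=1000, seed=None):
    """Simulate $1 wagers on a fair coin until the money runs out or the cap is hit."""
    rng = random.Random(seed)
    fortune = start
    history = [fortune]
    for _ in range(max_tosses):
        if fortune == 0:
            break                          # all money lost: the game stops
        fortune += rng.choice((-1, 1))     # next value depends only on the current fortune
        history.append(fortune)
    return history

history = gamble(seed=1)
```

The update line never looks at `history`, only at `fortune`, which mirrors the observation that knowing the earlier tosses does not improve the guess about the next one.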

A model of language

This example came from Markov himself. [2] Markov chose 20,000 letters from Pushkin’s Eugene Onegin, classified them into vowels and consonants, and counted the transition probabilities. The stationary distribution is 43.2 percent vowels and 56.8 percent consonants, which is close to the actual count in the book. [3]
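The counting procedure can be sketched as follows. This is an illustration of the method only: it assumes English vowels (Markov worked with Russian text), treats every other letter as a consonant, and the function name is hypothetical.

```python
from collections import Counter

VOWELS = set("aeiou")   # assumption: English vowels, for illustration only

def transition_probabilities(text):
    """Estimate vowel/consonant transition probabilities from adjacent letters."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    kinds = ["V" if ch in VOWELS else "C" for ch in letters]
    counts = Counter(zip(kinds, kinds[1:]))        # count each adjacent pair of kinds
    probs = {}
    for a in "VC":
        total = counts[(a, "V")] + counts[(a, "C")]
        for b in "VC":
            probs[(a, b)] = counts[(a, b)] / total if total else 0.0
    return probs

probs = transition_probabilities("banana")
```

In "banana" the letters strictly alternate between consonant and vowel, so the estimated probability of a consonant following a vowel, and of a vowel following a consonant, is 1.0. A long text gives the less extreme estimates that a stationary distribution can then be computed from.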

A simple weather model

The probabilities of weather conditions (modeled as either rainy or sunny), given the weather on the preceding day, can be represented by a transition matrix: [4]

\[
P = \begin{bmatrix} 0.9 & 0.1 \\ 0.5 & 0.5 \end{bmatrix}
\]

The matrix P represents the weather model in which a sunny day is 90% likely to be followed by another sunny day, and a rainy day is 50% likely to be followed by another rainy day. [4] The columns can be labelled "sunny" and "rainy", and the rows can be labelled in the same order.

Figure: The above matrix as a graph.

\((P)_{ij}\) is the probability that, if a given day is of type i, it will be followed by a day of type j.

Notice that the rows of P sum to 1: this is because P is a stochastic matrix. [4]
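The row-sum property is easy to check in code. A minimal sketch, using the weather probabilities stated above (sunny to sunny 0.9, rainy to rainy 0.5); the helper name `is_stochastic` is illustrative.

```python
import math

P = [[0.9, 0.1],   # from sunny: P(sunny tomorrow), P(rainy tomorrow)
     [0.5, 0.5]]   # from rainy: P(sunny tomorrow), P(rainy tomorrow)

def is_stochastic(matrix, tol=1e-12):
    """True if every entry is non-negative and every row sums to 1."""
    return all(
        all(p >= 0 for p in row) and math.isclose(sum(row), 1.0, abs_tol=tol)
        for row in matrix
    )
```

Each row is a complete probability distribution over tomorrow's weather, which is why it must sum to 1.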

Predicting the weather

The weather on day 0 (today) is known to be sunny. This is represented by an initial state vector in which the "sunny" entry is 100%, and the "rainy" entry is 0%:

\[
\mathbf{x}^{(0)} = \begin{bmatrix} 1 & 0 \end{bmatrix}
\]

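The weather on day 1 (tomorrow) can be predicted by multiplying the state vector from day 0 by the transition matrix:

\[
\mathbf{x}^{(1)} = \mathbf{x}^{(0)} P
= \begin{bmatrix} 1 & 0 \end{bmatrix}
\begin{bmatrix} 0.9 & 0.1 \\ 0.5 & 0.5 \end{bmatrix}
= \begin{bmatrix} 0.9 & 0.1 \end{bmatrix}
\]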

Thus, there is a 90% chance that day 1 will also be sunny.

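The weather on day 2 (the day after tomorrow) can be predicted in the same way, from the state vector we computed for day 1:

\[
\mathbf{x}^{(2)} = \mathbf{x}^{(1)} P
= \begin{bmatrix} 0.9 & 0.1 \end{bmatrix}
\begin{bmatrix} 0.9 & 0.1 \\ 0.5 & 0.5 \end{bmatrix}
= \begin{bmatrix} 0.86 & 0.14 \end{bmatrix}
\]

General rules for day n are:

\[
\mathbf{x}^{(n)} = \mathbf{x}^{(n-1)} P
\]

\[
\mathbf{x}^{(n)} = \mathbf{x}^{(0)} P^{n}
\]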
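The day-by-day rule can be sketched in plain Python. The helper names `step` and `predict` are illustrative; the matrix and initial vector are those of the weather example (sunny today, sunny to sunny 0.9, rainy to rainy 0.5).

```python
def step(x, P):
    """One day forward: the row vector x multiplied by the matrix P."""
    return [sum(x[i] * P[i][j] for i in range(len(x))) for j in range(len(P[0]))]

def predict(x0, P, n):
    """State vector on day n, obtained by applying the one-day rule n times."""
    x = list(x0)
    for _ in range(n):
        x = step(x, P)
    return x

P = [[0.9, 0.1],
     [0.5, 0.5]]
x0 = [1.0, 0.0]              # day 0 is known to be sunny

day1 = predict(x0, P, 1)     # approximately [0.9, 0.1]
day2 = predict(x0, P, 2)     # approximately [0.86, 0.14]
```

Applying `step` repeatedly is the same as multiplying the initial vector by the n-th power of P, so both general rules give the same result.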

Steady state of the weather

In this example, predictions for the weather on more distant days change less and less on each subsequent day and tend towards a steady state vector. [5] This vector represents the probabilities of sunny and rainy weather on all days, and is independent of the initial weather. [5]

The steady state vector is defined as:

\[
\mathbf{q} = \lim_{n \to \infty} \mathbf{x}^{(n)}
\]

but converges to a strictly positive vector only if P is a regular transition matrix (that is, there is at least one \(P^n\) with all non-zero entries).

Since q is independent from initial conditions, it must be unchanged when transformed by P. [5] This makes it an eigenvector (with eigenvalue 1), and means it can be derived from P. [5]

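In layman's terms, the steady-state vector is the vector that, when we multiply it by P, we get the exact same vector back. [6] For the weather example, we can use this to set up a matrix equation:

\[
\begin{aligned}
\mathbf{q} P &= \mathbf{q} \\
\mathbf{q} (P - I) &= \mathbf{0} \\
\begin{bmatrix} q_1 & q_2 \end{bmatrix}
\begin{bmatrix} -0.1 & 0.1 \\ 0.5 & -0.5 \end{bmatrix}
&= \begin{bmatrix} 0 & 0 \end{bmatrix} \\
-0.1\, q_1 + 0.5\, q_2 &= 0
\end{aligned}
\]

and since \(q_1\) and \(q_2\) form a probability vector, we know that

\[
q_1 + q_2 = 1
\]

Solving this pair of simultaneous equations gives the steady state vector:

\[
\begin{bmatrix} q_1 & q_2 \end{bmatrix} = \begin{bmatrix} 0.833 & 0.167 \end{bmatrix}
\]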

In conclusion, in the long term about 83.3% of days are sunny. Not all Markov processes converge to a steady state vector. Convergence is guaranteed when the transition matrix is regular; otherwise, the state vectors may oscillate over time without converging.
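Both behaviours can be observed numerically. The sketch below uses repeated multiplication rather than solving the simultaneous equations, which is a swapped-in technique chosen for brevity; the tolerance, the iteration cap, and the two-state `FLIP` matrix are assumptions made for illustration.

```python
def step(x, P):
    """One step forward: the row vector x multiplied by the matrix P."""
    return [sum(x[i] * P[i][j] for i in range(len(x))) for j in range(len(P[0]))]

def steady_state(P, x0, tol=1e-12, max_iter=10_000):
    """Iterate x -> xP until it stops changing; return None if it never settles."""
    x = list(x0)
    for _ in range(max_iter):
        nxt = step(x, P)
        if max(abs(a - b) for a, b in zip(nxt, x)) < tol:
            return nxt
        x = nxt
    return None

WEATHER = [[0.9, 0.1],
           [0.5, 0.5]]      # regular: all entries are already non-zero
FLIP = [[0.0, 1.0],
        [1.0, 0.0]]         # hypothetical non-regular chain that swaps states every step

q = steady_state(WEATHER, [1.0, 0.0])       # approximately [0.8333, 0.1667]
no_limit = steady_state(FLIP, [1.0, 0.0])   # None: the vector alternates forever
```

Starting the weather chain from a rainy day instead gives the same `q`, in line with the steady state being independent of the initial weather.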

Stock market

Figure: A directed graph representing the probabilities of the possible states that a hypothetical stock market can exhibit. The matrix on the left shows how the probabilities corresponding to different states can be arranged in matrix form.

A state diagram for a simple example is shown in the figure on the right, using a directed graph to picture the state transitions. The states represent whether a hypothetical stock market is exhibiting a bull market, bear market, or stagnant market trend during a given week. According to the figure, a bull week is followed by another bull week 90% of the time, a bear week 7.5% of the time, and a stagnant week the other 2.5% of the time. Labeling the state space {1 = bull, 2 = bear, 3 = stagnant}, the transition matrix for this example is

\[
P = \begin{bmatrix}
0.9 & 0.075 & 0.025 \\
0.15 & 0.8 & 0.05 \\
0.25 & 0.25 & 0.5
\end{bmatrix}
\]

The distribution over states can be written as a stochastic row vector x with the relation \(x^{(n+1)} = x^{(n)} P\). So if at time n the system is in state \(x^{(n)}\), then three time periods later, at time n + 3, the distribution is

\[
x^{(n+3)} = x^{(n+2)} P = \left(x^{(n+1)} P\right) P = x^{(n+1)} P^{2} = \left(x^{(n)} P\right) P^{2} = x^{(n)} P^{3}
\]

In particular, if at time n the system is in state 2 (bear), then at time n + 3 the distribution is

\[
x^{(n+3)} = \begin{bmatrix} 0 & 1 & 0 \end{bmatrix} P^{3}
= \begin{bmatrix} 0.3575 & 0.56825 & 0.07425 \end{bmatrix}
\]

Using the transition matrix it is possible to calculate, for example, the long-term fraction of weeks during which the market is stagnant, or the average number of weeks it will take to go from a stagnant to a bull market. Using the transition probabilities, the steady-state probabilities indicate that 62.5% of weeks will be in a bull market, 31.25% of weeks will be in a bear market and 6.25% of weeks will be stagnant, since:

\[
\lim_{N \to \infty} P^{N} = \begin{bmatrix}
0.625 & 0.3125 & 0.0625 \\
0.625 & 0.3125 & 0.0625 \\
0.625 & 0.3125 & 0.0625
\end{bmatrix}
\]
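These figures can be checked with a few lines of Python. Only the bull row (0.9, 0.075, 0.025) is stated in the text; the bear and stagnant rows used here are assumptions taken to match the figure, and they do reproduce the stated steady-state percentages. The helpers `matmul` and `matpow` are illustrative names.

```python
def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def matpow(M, n):
    """M raised to the n-th power by repeated multiplication."""
    size = len(M)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, M)
    return result

# States: 0 = bull, 1 = bear, 2 = stagnant.
P = [[0.90, 0.075, 0.025],   # bull row, as stated in the text
     [0.15, 0.80, 0.05],     # bear row (assumed from the figure)
     [0.25, 0.25, 0.50]]     # stagnant row (assumed from the figure)

bear_after_3 = matpow(P, 3)[1]   # approximately [0.3575, 0.56825, 0.07425]
long_run = matpow(P, 200)        # every row approaches [0.625, 0.3125, 0.0625]
```

Starting in the bear state corresponds to the row vector (0, 1, 0), so the three-week distribution is simply the second row of the cubed matrix.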

A thorough development and many examples can be found in the on-line monograph Meyn & Tweedie 2005. [7]

A finite-state machine can be used as a representation of a Markov chain. Assuming a sequence of independent and identically distributed input signals (for example, symbols from a binary alphabet chosen by coin tosses), if the machine is in state y at time n, then the probability that it moves to state x at time n + 1 depends only on the current state.
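The construction can be sketched for a small machine. Everything specific here is hypothetical: a two-state parity machine (`DELTA`) driven by fair coin tosses (`INPUT_PROBS`). The function sums the probabilities of all input symbols that lead from state y to state x.

```python
from fractions import Fraction

# Hypothetical machine: tracks whether an even or odd number of 1s has been read.
DELTA = {
    ("even", 0): "even", ("even", 1): "odd",
    ("odd", 0): "odd",   ("odd", 1): "even",
}
INPUT_PROBS = {0: Fraction(1, 2), 1: Fraction(1, 2)}   # i.i.d. fair coin tosses

def induced_chain(delta, input_probs):
    """Transition probabilities P[y][x] of the Markov chain induced by random inputs."""
    states = sorted({state for state, _ in delta})
    P = {y: {x: Fraction(0) for x in states} for y in states}
    for (y, symbol), x in delta.items():
        P[y][x] += input_probs[symbol]     # add the chance of the symbol that causes y -> x
    return P

P = induced_chain(DELTA, INPUT_PROBS)
```

Because the inputs are independent and identically distributed, `P[y][x]` depends only on the current state y, so the sequence of machine states is a Markov chain.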


A birth–death process

If one pops one hundred kernels of popcorn in an oven, each kernel popping at an independent exponentially-distributed time, then this would be a continuous-time Markov process. If \(X_t\) denotes the number of kernels which have popped up to time t, the problem can be defined as finding the number of kernels that will pop in some later time. The only thing one needs to know is the number of kernels that have popped prior to the time t. It is not necessary to know when they popped, so knowing \(X_t\) for previous times t is not relevant.

The process described here is an approximation of a Poisson point process – Poisson processes are also Markov processes.
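The popcorn process can be simulated directly. This is a minimal sketch: the common rate of 1.0 pops per unit time and the function names are assumptions, since the text specifies neither.

```python
import bisect
import random

def pop_times(kernels=100, rate=1.0, seed=None):
    """Draw an independent exponential popping time for each kernel, sorted ascending."""
    rng = random.Random(seed)
    return sorted(rng.expovariate(rate) for _ in range(kernels))

def popped_by(times, t):
    """X_t: the number of kernels that have popped up to and including time t."""
    return bisect.bisect_right(times, t)

times = pop_times(seed=0)
counts = [popped_by(times, t) for t in (0.5, 1.0, 2.0, 4.0)]
```

Because exponential waiting times are memoryless, the kernels that have not yet popped at time t behave as if freshly started, which is why the count `popped_by(times, t)` is all that is needed to describe the future.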


  1. Øksendal, B. K. (2003). Stochastic Differential Equations: An Introduction with Applications (6th ed.). Berlin: Springer. ISBN 3540047581. OCLC 52203046.
  2. Markov, A. A. "An example of statistical analysis of the text of Eugene Onegin illustrating the association of trials into a chain." Bulletin de l'Académie Impériale des Sciences de St. Pétersbourg, ser. 6 (1913): 153–162.
  3. Grinstead and Snell’s Introduction to Probability, page 465.
  4. Van Kampen, N. G. (2007). Stochastic Processes in Physics and Chemistry. NL: North Holland Elsevier. pp. 73–95. ISBN 978-0-444-52965-7.
  5. Van Kampen, N. G. (2007). Stochastic Processes in Physics and Chemistry. NL: North Holland Elsevier. pp. 73–95. ISBN 978-0-444-52965-7.
  6. "Going steady (state) with Markov processes". Bloomington Tutors.
  7. S. P. Meyn and R. L. Tweedie, 2005. Markov Chains and Stochastic Stability.