Examples of Markov chains

This article contains examples of Markov chains and Markov processes in action.

All examples are in a countable state space. For an overview of Markov chains in general state space, see Markov chains on a measurable state space.

Discrete-time

Board games played with dice

A game of snakes and ladders, or any other game whose moves are determined entirely by dice, is a Markov chain, and in fact an absorbing Markov chain. This is in contrast to card games such as blackjack, where the cards represent a 'memory' of the past moves. To see the difference, consider the probability of a certain event in the game. In the above-mentioned dice games, the only thing that matters is the current state of the board. The next state of the board depends on the current state and the next roll of the dice. It doesn't depend on how things got to their current state. In a game such as blackjack, a player can gain an advantage by remembering which cards have already been shown (and hence which cards are no longer in the deck), so the next state (or hand) of the game is not independent of the past states.
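
To make the contrast concrete, here is a minimal Python sketch of a dice-driven board move. The board size and the snake/ladder squares in JUMPS are invented for illustration, not a standard board; the point is that the next square is computed from the current square and a fresh dice roll alone, never from the history of the game.

```python
import random

# Hypothetical snakes and ladders: squares from which a snake or ladder
# sends the player elsewhere. These positions are illustrative only.
JUMPS = {3: 22, 17: 4, 28: 45, 54: 19, 87: 94, 99: 7}
LAST_SQUARE = 100

def next_square(current: int) -> int:
    """One move: depends only on the current square and a fresh dice roll."""
    roll = random.randint(1, 6)
    target = current + roll
    if target > LAST_SQUARE:          # overshoot: stay put (one common house rule)
        target = current
    return JUMPS.get(target, target)  # follow a snake or ladder if one starts here

# Simulate one game; the full history is never consulted, only the current square.
square, turns = 0, 0
while square != LAST_SQUARE:
    square = next_square(square)
    turns += 1
print(f"Finished in {turns} turns")
```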

Random walk Markov chains

A center-biased random walk

Consider a random walk on the number line where, at each step, the position (call it x) may change by +1 (to the right) or −1 (to the left) with probabilities:

$$P_{\text{move left}} = \frac{1}{2} + \frac{1}{2}\left(\frac{x}{c+|x|}\right)$$

$$P_{\text{move right}} = \frac{1}{2} - \frac{1}{2}\left(\frac{x}{c+|x|}\right)$$

(where c is a constant greater than 0)

For example, if the constant, c, equals 1, the probabilities of a move to the left at positions x = −2, −1, 0, 1, 2 are given by $\tfrac{1}{6}, \tfrac{1}{4}, \tfrac{1}{2}, \tfrac{3}{4}, \tfrac{5}{6}$ respectively. The random walk has a centering effect that weakens as c increases.

Since the probabilities depend only on the current position (value of x) and not on any prior positions, this biased random walk satisfies the definition of a Markov chain.
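
A minimal Python simulation of this walk, with arbitrary choices for the number of steps and the starting point; the step function implements the move-left probability above:

```python
import random

def step(x: int, c: float = 1.0) -> int:
    """One step of the center-biased walk: P(move left) = 1/2 + (1/2) * x / (c + |x|).

    The probability depends only on the current position x, which is exactly
    the Markov property."""
    p_left = 0.5 + 0.5 * x / (c + abs(x))
    return x - 1 if random.random() < p_left else x + 1

# Run the walk for a while; positions stay clustered around 0 for small c.
x = 0
positions = []
for _ in range(10_000):
    x = step(x, c=1.0)
    positions.append(x)
print("mean position:", sum(positions) / len(positions))
print("max |position|:", max(abs(p) for p in positions))
```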

Gambling

Suppose that you start with $10, and you wager $1 on a fair coin toss repeatedly, indefinitely or until you lose all of your money. If $X_n$ represents the number of dollars you have after n tosses, with $X_0 = 10$, then the sequence $\{X_n : n \in \mathbb{N}\}$ is a Markov process. If I know that you have $12 now, then it would be expected that, with even odds, you will have either $11 or $13 after the next toss. This guess is not improved by the added knowledge that you started with $10, then went up to $11, down to $10, up to $11, and then to $12. The fact that the guess is not improved by knowledge of earlier tosses showcases the Markov property, the memoryless property of a stochastic process. [1] [2]
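
A short Python sketch of this fortune process (the run length is capped at an arbitrary 1,000 tosses for illustration); the distribution of the next fortune is computed from the current fortune alone:

```python
import random

def next_fortune(dollars: int) -> int:
    """One fair $1 coin-toss wager: the next fortune depends only on the current one."""
    return dollars + 1 if random.random() < 0.5 else dollars - 1

# One sample path: start with $10 and play until ruin (or a cap on tosses).
fortune, history = 10, [10]
while fortune > 0 and len(history) < 1_000:
    fortune = next_fortune(fortune)
    history.append(fortune)

# Whatever the path that led to $12, the conditional forecast is the same:
# $11 or $13, each with probability 1/2.
print(history[:20])
```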

A simple weather model

The probabilities of weather conditions (modeled as either rainy or sunny), given the weather on the preceding day, can be represented by a transition matrix: [3]

$$P = \begin{bmatrix} 0.9 & 0.1 \\ 0.5 & 0.5 \end{bmatrix}$$

The matrix P represents the weather model in which a sunny day is 90% likely to be followed by another sunny day, and a rainy day is 50% likely to be followed by another rainy day. [3] The columns can be labelled "sunny" and "rainy", and the rows can be labelled in the same order.

Figure: the above matrix represented as a transition graph.

$(P)_{ij}$ is the probability that, if a given day is of type i, it will be followed by a day of type j.

Notice that the rows of P sum to 1: this is because P is a stochastic matrix. [3]
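
As a quick sanity check, the matrix can be written out in Python/NumPy and verified to be stochastic; the array name P simply mirrors the notation above:

```python
import numpy as np

# Transition matrix: rows and columns ordered as (sunny, rainy).
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

# Every row of a stochastic matrix is a probability distribution over next states.
assert np.allclose(P.sum(axis=1), 1.0)
print(P)
```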

Predicting the weather

The weather on day 0 (today) is known to be sunny. This is represented by an initial state vector in which the "sunny" entry is 100%, and the "rainy" entry is 0%:

$$\mathbf{x}^{(0)} = \begin{bmatrix} 1 & 0 \end{bmatrix}$$

The weather on day 1 (tomorrow) can be predicted by multiplying the state vector from day 0 by the transition matrix:

$$\mathbf{x}^{(1)} = \mathbf{x}^{(0)} P = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} 0.9 & 0.1 \\ 0.5 & 0.5 \end{bmatrix} = \begin{bmatrix} 0.9 & 0.1 \end{bmatrix}$$

Thus, there is a 90% chance that day 1 will also be sunny.

The weather on day 2 (the day after tomorrow) can be predicted in the same way, from the state vector we computed for day 1:

$$\mathbf{x}^{(2)} = \mathbf{x}^{(1)} P = \begin{bmatrix} 0.9 & 0.1 \end{bmatrix} \begin{bmatrix} 0.9 & 0.1 \\ 0.5 & 0.5 \end{bmatrix} = \begin{bmatrix} 0.86 & 0.14 \end{bmatrix}$$

or

$$\mathbf{x}^{(2)} = \mathbf{x}^{(0)} P^2 = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} 0.9 & 0.1 \\ 0.5 & 0.5 \end{bmatrix}^2 = \begin{bmatrix} 0.86 & 0.14 \end{bmatrix}$$

General rules for day n are:

$$\mathbf{x}^{(n)} = \mathbf{x}^{(n-1)} P$$

$$\mathbf{x}^{(n)} = \mathbf{x}^{(0)} P^n$$

Steady state of the weather

In this example, predictions for the weather on more distant days change less and less on each subsequent day and tend towards a steady state vector. [4] This vector represents the probabilities of sunny and rainy weather on all days, and is independent of the initial weather. [4]

The steady state vector is defined as:

$$\mathbf{q} = \lim_{n \to \infty} \mathbf{x}^{(n)}$$

but this limit converges to a strictly positive vector only if P is a regular transition matrix (that is, there is at least one $P^n$ with all non-zero entries).

Since q is independent of initial conditions, it must be unchanged when transformed by P. [4] This makes it an eigenvector of P (a left eigenvector with eigenvalue 1, since the state vectors here are row vectors multiplied on the right by P), and means it can be derived from P. [4]

In layman's terms, the steady-state vector is the vector that, when multiplied by P, gives back exactly the same vector. [5] For the weather example, we can use this to set up a matrix equation:

$$\mathbf{q} P = \mathbf{q}, \qquad \begin{bmatrix} q_1 & q_2 \end{bmatrix} \begin{bmatrix} 0.9 & 0.1 \\ 0.5 & 0.5 \end{bmatrix} = \begin{bmatrix} q_1 & q_2 \end{bmatrix}$$

and since q is a probability vector we know that

$$q_1 + q_2 = 1$$

Solving this pair of simultaneous equations gives the steady state vector:

$$\mathbf{q} = \begin{bmatrix} q_1 & q_2 \end{bmatrix} = \begin{bmatrix} 5/6 & 1/6 \end{bmatrix} \approx \begin{bmatrix} 0.833 & 0.167 \end{bmatrix}$$

In conclusion, in the long term about 83.3% of days are sunny. It is important to realize that not all Markov processes have a steady state vector. In particular, the transition matrix must be regular. Otherwise, the state vectors will oscillate over time without converging.
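
The day-by-day predictions and the steady state above can be reproduced numerically. The sketch below (Python/NumPy, with an arbitrary 30-day horizon) iterates x^(n) = x^(n-1) P and also recovers q as a left eigenvector of P with eigenvalue 1:

```python
import numpy as np

P = np.array([[0.9, 0.1],   # sunny -> (sunny, rainy)
              [0.5, 0.5]])  # rainy -> (sunny, rainy)

x = np.array([1.0, 0.0])    # day 0: certainly sunny
for day in range(1, 31):
    x = x @ P               # x^(n) = x^(n-1) P
    if day in (1, 2, 30):
        print(f"day {day:2d}: sunny={x[0]:.4f}, rainy={x[1]:.4f}")

# Steady state as the left eigenvector of P with eigenvalue 1:
eigvals, eigvecs = np.linalg.eig(P.T)   # left eigenvectors of P = eigenvectors of P.T
q = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
q = q / q.sum()                         # normalise to a probability vector
print("steady state:", q)               # approximately [0.8333, 0.1667]
```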

Stock market

Figure: a directed graph representing the probabilities of the possible states of a hypothetical stock market; the accompanying matrix shows how these probabilities can be arranged in matrix form.

A state diagram for a simple example is shown in the figure on the right, using a directed graph to picture the state transitions. The states represent whether a hypothetical stock market is exhibiting a bull market, bear market, or stagnant market trend during a given week. According to the figure, a bull week is followed by another bull week 90% of the time, a bear week 7.5% of the time, and a stagnant week the other 2.5% of the time. Labeling the state space {1 = bull, 2 = bear, 3 = stagnant}, the transition matrix for this example is

$$P = \begin{bmatrix} 0.9 & 0.075 & 0.025 \\ 0.15 & 0.8 & 0.05 \\ 0.25 & 0.25 & 0.5 \end{bmatrix}$$

The distribution over states can be written as a stochastic row vector x with the relation $x^{(n+1)} = x^{(n)} P$. So if at time n the system is in state $x^{(n)}$, then three time periods later, at time n + 3 the distribution is

$$x^{(n+3)} = x^{(n)} P^3$$

Figure: Markov chain prediction over 3 discrete steps, based on the transition matrix above.

In particular, if at time n the system is in state 2 (bear), then at time n + 3 the distribution is

$$x^{(n+3)} = \begin{bmatrix} 0 & 1 & 0 \end{bmatrix} P^3 = \begin{bmatrix} 0.3575 & 0.56825 & 0.07425 \end{bmatrix}$$

Figure: Markov chain prediction over 50 discrete steps, using the same transition matrix.

Using the transition matrix it is possible to calculate, for example, the long-term fraction of weeks during which the market is stagnant, or the average number of weeks it will take to go from a stagnant to a bull market. Using the transition probabilities, the steady-state probabilities indicate that 62.5% of weeks will be in a bull market, 31.25% of weeks will be in a bear market and 6.25% of weeks will be stagnant, since: [6]

$$\lim_{N \to \infty} x^{(N)} = \begin{bmatrix} 0.625 & 0.3125 & 0.0625 \end{bmatrix}$$
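
The same arithmetic for the stock-market chain can be checked in a few lines of Python/NumPy; the script is only an illustrative sketch using the matrix above:

```python
import numpy as np

# States ordered as (bull, bear, stagnant).
P = np.array([[0.90, 0.075, 0.025],
              [0.15, 0.80,  0.05 ],
              [0.25, 0.25,  0.50 ]])

# Distribution three weeks after a bear week: x^(n+3) = [0, 1, 0] P^3
bear = np.array([0.0, 1.0, 0.0])
print("3 weeks after a bear week:", bear @ np.linalg.matrix_power(P, 3))

# After many weeks the distribution settles down regardless of the start.
print("50 weeks after a bear week:", bear @ np.linalg.matrix_power(P, 50))
# Expected limit: [0.625, 0.3125, 0.0625]
```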

A thorough development and many examples can be found in the on-line monograph Meyn & Tweedie 2005. [7]

A finite-state machine can be used as a representation of a Markov chain. Assuming a sequence of independent and identically distributed input signals (for example, symbols from a binary alphabet chosen by coin tosses), if the machine is in state y at time n, then the probability that it moves to state x at time n + 1 depends only on the current state.
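
A rough Python sketch of that correspondence, using an invented two-state machine driven by fair coin-toss inputs; the empirical state-to-state frequencies behave like the rows of a transition matrix:

```python
import random
from collections import Counter

# A made-up two-state machine: delta(state, symbol) -> next state.
def delta(state: str, symbol: int) -> str:
    if state == "A":
        return "B" if symbol == 1 else "A"
    else:  # state == "B"
        return "A" if symbol == 1 else "B"

# Drive the machine with i.i.d. fair coin tosses and count transitions.
counts = Counter()
state = "A"
for _ in range(100_000):
    symbol = random.randint(0, 1)   # i.i.d. input symbol
    nxt = delta(state, symbol)
    counts[(state, nxt)] += 1
    state = nxt

# Empirical transition probabilities out of each state (about 0.5 each here).
for s in ("A", "B"):
    total = counts[(s, "A")] + counts[(s, "B")]
    print(s, "->", {t: counts[(s, t)] / total for t in ("A", "B")})
```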

Continuous-time

A birth–death process

If one pops one hundred kernels of popcorn in an oven, each kernel popping at an independent exponentially distributed time, then this would be a continuous-time Markov process. If $X_t$ denotes the number of kernels which have popped up to time t, the problem can be defined as finding the distribution of the number of kernels that will have popped by some later time. The only thing one needs to know is the number of kernels that have popped prior to the time t. It is not necessary to know when they popped, so knowing $X_s$ for earlier times s < t is not relevant.

The process described here is an approximation of a Poisson point process – Poisson processes are also Markov processes.
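
A small Monte-Carlo sketch of the popcorn process in Python, assuming an arbitrary popping rate of 1 per unit time: each kernel pops at an independent exponential time, and X_t (the count of popped kernels) evolves as a continuous-time Markov, pure-birth process.

```python
import random

RATE = 1.0        # popping rate per kernel (arbitrary illustrative value)
KERNELS = 100

# Draw an independent exponential popping time for each kernel.
pop_times = sorted(random.expovariate(RATE) for _ in range(KERNELS))

def popped_by(t: float) -> int:
    """X_t: number of kernels that have popped by time t."""
    return sum(1 for s in pop_times if s <= t)

# Knowing X_t is enough to describe the future: each still-unpopped kernel
# has a memoryless (exponential) remaining time to pop.
for t in (0.5, 1.0, 2.0, 4.0):
    print(f"t={t}: popped {popped_by(t)} kernels")
```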

References

  1. Øksendal, B. K. (2003). Stochastic Differential Equations: An Introduction with Applications (6th ed.). Berlin: Springer. ISBN 3540047581. OCLC 52203046.
  2. Gagniuc, Paul A. (2017). Markov Chains: From Theory to Implementation and Experimentation. Hoboken, NJ: John Wiley & Sons. pp. 1–235. ISBN 978-1-119-38755-8.
  3. Van Kampen, N. G. (2007). Stochastic Processes in Physics and Chemistry. NL: North Holland Elsevier. pp. 73–95. ISBN 978-0-444-52965-7.
  4. Van Kampen, N. G. (2007). Stochastic Processes in Physics and Chemistry. NL: North Holland Elsevier. pp. 73–95. ISBN 978-0-444-52965-7.
  5. "Going steady (state) with Markov processes". Bloomington Tutors.
  6. Gagniuc, Paul A. (2017). Markov Chains: From Theory to Implementation and Experimentation. Hoboken, NJ: John Wiley & Sons. pp. 131–163. ISBN 9781119387572. OCLC 982373850.
  7. Meyn, S. P. & Tweedie, R. L. (2005). Markov Chains and Stochastic Stability. Archived 2013-09-03 at the Wayback Machine.