Absorbing Markov chain

A (finite) drunkard's walk is an example of an absorbing Markov chain.

In the mathematical theory of probability, an absorbing Markov chain is a Markov chain in which every state can reach an absorbing state. An absorbing state is a state that, once entered, cannot be left.


Like general Markov chains, there can be continuous-time absorbing Markov chains with an infinite state space. However, this article concentrates on the discrete-time discrete-state-space case.

Formal definition

A Markov chain is an absorbing chain if [1] [2]

  1. there is at least one absorbing state and
  2. it is possible to go from any state to at least one absorbing state in a finite number of steps.

In an absorbing Markov chain, a state that is not absorbing is called transient.

Canonical form

Let an absorbing Markov chain with transition matrix P have t transient states and r absorbing states. The rows of P represent source states, while the columns represent destination states. If the states are ordered so that the transient states come first, P has the canonical form

P = \begin{pmatrix} Q & R \\ \mathbf{0} & I_r \end{pmatrix},

where Q is a t-by-t matrix, R is a nonzero t-by-r matrix, 0 is an r-by-t zero matrix, and I_r is the r-by-r identity matrix. Thus, Q describes the probability of transitioning from some transient state to another, while R describes the probability of transitioning from some transient state to some absorbing state.

The probability of transitioning from i to j in exactly k steps is the (i,j)-entry of P^k, which is computed further below. When only transient states are considered, this probability is found in the upper-left t-by-t block of P^k and equals the (i,j)-entry of Q^k.
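For illustration, the following NumPy sketch (using a small made-up chain, not one from the article) extracts the Q and R blocks from a canonical-form matrix and checks that the upper-left block of P^k equals Q^k:

```python
import numpy as np

# Hypothetical absorbing chain with t = 2 transient states and r = 1 absorbing
# state, already ordered in canonical form (transient states first).
P = np.array([
    [0.5, 0.3, 0.2],   # transient state 0
    [0.1, 0.6, 0.3],   # transient state 1
    [0.0, 0.0, 1.0],   # absorbing state
])
t, r = 2, 1

Q = P[:t, :t]          # transient -> transient block
R = P[:t, t:]          # transient -> absorbing block

# The upper-left t-by-t block of P^k equals Q^k.
k = 5
Pk = np.linalg.matrix_power(P, k)
assert np.allclose(Pk[:t, :t], np.linalg.matrix_power(Q, k))
```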

Fundamental matrix

Expected number of visits to a transient state

A basic property of an absorbing Markov chain is the expected number of visits to a transient state j starting from a transient state i (before being absorbed). This is given by the (i, j)-entry of the so-called fundamental matrix N, obtained by summing Q^k for all k (from 0 to ∞). It can be proven that

N = \sum_{k=0}^{\infty} Q^k = (I_t - Q)^{-1},

where I_t is the t-by-t identity matrix. This formula is the matrix analogue of the geometric series of scalars, \sum_{k=0}^{\infty} q^k = \frac{1}{1-q}.
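A minimal NumPy sketch of this computation, assuming a hypothetical 2-by-2 block Q:

```python
import numpy as np

# Transient-to-transient block Q of a hypothetical absorbing chain.
Q = np.array([[0.5, 0.3],
              [0.1, 0.6]])
t = Q.shape[0]

# Fundamental matrix: N = (I_t - Q)^{-1}, i.e. the sum of Q^k over k >= 0.
N = np.linalg.inv(np.eye(t) - Q)
# N[i, j] is the expected number of visits to transient state j,
# starting from transient state i, before absorption.
```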

With the matrix N in hand, other properties of the Markov chain are easy to obtain.[2]

Expected number of steps before being absorbed

The expected number of steps before being absorbed in any absorbing state, when starting in transient state i, can be computed as a sum over transient states. The value is given by the ith entry of the vector

t = N \mathbf{1},

where 1 is a length-t column vector whose entries are all 1.
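Continuing with the same hypothetical Q, the expected-step vector is obtained as follows:

```python
import numpy as np

Q = np.array([[0.5, 0.3],
              [0.1, 0.6]])
N = np.linalg.inv(np.eye(2) - Q)

# t = N 1: expected number of steps before absorption from each transient state.
t_vec = N @ np.ones(2)
```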

Absorbing probabilities

By induction,

P^k = \begin{pmatrix} Q^k & (I_t + Q + \cdots + Q^{k-1}) R \\ \mathbf{0} & I_r \end{pmatrix}.

The probability of eventually being absorbed in the absorbing state j when starting from transient state i is given by the (i,j)-entry of the matrix

B = N R.

The number of columns of this matrix equals the number of absorbing states r.

An approximation of those probabilities can also be obtained directly from the (i,j)-entry of P^k for a large enough value of k, when i is the index of a transient state and j the index of an absorbing state. This is because

\lim_{k \to \infty} P^k = \begin{pmatrix} \mathbf{0} & N R \\ \mathbf{0} & I_r \end{pmatrix}.
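A sketch of both computations, again with a made-up canonical-form chain (two transient and two absorbing states):

```python
import numpy as np

# Hypothetical canonical-form chain with 2 transient and 2 absorbing states.
Q = np.array([[0.5, 0.3],
              [0.1, 0.6]])
R = np.array([[0.1, 0.1],
              [0.2, 0.1]])
N = np.linalg.inv(np.eye(2) - Q)

# B[i, j]: probability of eventual absorption in absorbing state j from transient state i.
B = N @ R
assert np.allclose(B.sum(axis=1), 1.0)   # absorption is certain, so each row sums to 1

# Check against the upper-right block of P^k for a large k.
P = np.block([[Q, R],
              [np.zeros((2, 2)), np.eye(2)]])
Pk = np.linalg.matrix_power(P, 200)
assert np.allclose(Pk[:2, 2:], B)
```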

Transient visiting probabilities

The probability of visiting transient state j when starting at a transient state i is the (i,j)-entry of the matrix

H = (N - I_t)\, N_{dg}^{-1},

where N_dg is the diagonal matrix with the same diagonal as N.
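As a sketch, with the same hypothetical Q as above:

```python
import numpy as np

Q = np.array([[0.5, 0.3],
              [0.1, 0.6]])
N = np.linalg.inv(np.eye(2) - Q)

# H[i, j]: probability of ever visiting transient state j when starting from i.
N_dg = np.diag(np.diag(N))                 # diagonal of N as a diagonal matrix
H = (N - np.eye(2)) @ np.linalg.inv(N_dg)
```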

Variance on number of transient visits

The variance on the number of visits to a transient state j when starting at a transient state i (before being absorbed) is the (i,j)-entry of the matrix

N_2 = N (2 N_{dg} - I_t) - N_{sq},

where N_dg is the diagonal matrix with the same diagonal as N and N_sq is the Hadamard product of N with itself (i.e. each entry of N is squared).
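A corresponding sketch of this formula:

```python
import numpy as np

Q = np.array([[0.5, 0.3],
              [0.1, 0.6]])
I = np.eye(2)
N = np.linalg.inv(I - Q)

# Variance of the number of visits: N2 = N (2 N_dg - I) - N_sq,
# where N_sq is the entrywise (Hadamard) square of N.
N_dg = np.diag(np.diag(N))
N_sq = N * N
N2 = N @ (2 * N_dg - I) - N_sq
```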

Variance on number of steps

The variance on the number of steps before being absorbed when starting in transient state i is the ith entry of the vector

(2N - I_t)\, t - t_{sq},

where t_sq is the Hadamard product of t with itself (i.e., as with N_sq, each entry of t is squared).
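And likewise for the variance of the number of steps, continuing the same hypothetical example:

```python
import numpy as np

Q = np.array([[0.5, 0.3],
              [0.1, 0.6]])
I = np.eye(2)
N = np.linalg.inv(I - Q)
t_vec = N @ np.ones(2)

# Variance of the number of steps before absorption: (2N - I) t - t_sq.
var_steps = (2 * N - I) @ t_vec - t_vec * t_vec
```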

Examples

String generation

Consider the process of repeatedly flipping a fair coin until the sequence (heads, tails, heads) appears. This process is modeled by an absorbing Markov chain with transition matrix

P = \begin{pmatrix} 1/2 & 1/2 & 0 & 0 \\ 0 & 1/2 & 1/2 & 0 \\ 1/2 & 0 & 0 & 1/2 \\ 0 & 0 & 0 & 1 \end{pmatrix}.

A Markov chain with 4 states for the string generation problem.

The first state represents the empty string, the second state the string "H", the third state the string "HT", and the fourth state the string "HTH". Although in reality, the coin flips cease after the string "HTH" is generated, the perspective of the absorbing Markov chain is that the process has transitioned into the absorbing state representing the string "HTH" and, therefore, cannot leave.

For this absorbing Markov chain, the fundamental matrix is

N = (I - Q)^{-1} = \begin{pmatrix} 1/2 & -1/2 & 0 \\ 0 & 1/2 & -1/2 \\ -1/2 & 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} 4 & 4 & 2 \\ 2 & 4 & 2 \\ 2 & 2 & 2 \end{pmatrix}.

The expected number of steps starting from each of the transient states is

t = N \mathbf{1} = \begin{pmatrix} 10 \\ 8 \\ 6 \end{pmatrix}.

Therefore, the expected number of coin flips before observing the sequence (heads, tails, heads) is 10, the entry for the state representing the empty string.
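These values are easy to verify numerically. A minimal NumPy check (not part of the original presentation) using the transition matrix above:

```python
import numpy as np

# Transition matrix for the coin-flip chain; states: "", "H", "HT", "HTH".
P = np.array([
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.5, 0.0, 0.0, 0.5],
    [0.0, 0.0, 0.0, 1.0],
])
Q = P[:3, :3]

N = np.linalg.inv(np.eye(3) - Q)     # fundamental matrix
t_vec = N @ np.ones(3)               # expected flips from each transient state

print(N)       # [[4. 4. 2.] [2. 4. 2.] [2. 2. 2.]]
print(t_vec)   # [10.  8.  6.]  -> 10 expected flips from the empty string
```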

The cumulative probability of finishing a game of Snakes and Ladders by turn N.

Games of chance

Games based entirely on chance can be modeled by an absorbing Markov chain. A classic example of this is the ancient Indian board game Snakes and Ladders. The graph above[3] plots the probability mass in the lone absorbing state that represents the final square as the transition matrix is raised to larger and larger powers. To determine the expected number of turns to complete the game, compute the vector t as described above and examine the entry for the starting square, which is approximately 39.2.
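The construction can be sketched in code. The board below is a deliberately tiny, made-up example (one ladder, one snake, a three-outcome spinner), not the 100-square board analysed in the cited article; it only shows how the transition matrix and the vector t are assembled:

```python
import numpy as np

SIZE = 9                      # hypothetical 9-square board; square 9 is the goal
jumps = {3: 7, 8: 2}          # hypothetical ladder 3 -> 7 and snake 8 -> 2
DIE = 3                       # hypothetical spinner with outcomes 1, 2, 3

P = np.zeros((SIZE + 1, SIZE + 1))          # states 0 (start) .. 9 (absorbing)
for s in range(SIZE):
    for roll in range(1, DIE + 1):
        target = s + roll
        if target > SIZE:
            target = s                      # overshoot: stay put (one common house rule)
        target = jumps.get(target, target)  # apply snake/ladder if any
        P[s, target] += 1 / DIE
P[SIZE, SIZE] = 1.0                         # final square is absorbing

Q = P[:SIZE, :SIZE]
N = np.linalg.inv(np.eye(SIZE) - Q)
expected_turns = (N @ np.ones(SIZE))[0]     # expected turns from the start square
```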

Infectious disease testing

Infectious disease testing, either of blood products or in medical clinics, is often taught as an example of an absorbing Markov chain.[4] The public U.S. Centers for Disease Control and Prevention (CDC) model for HIV and for hepatitis B, for example,[5] illustrates how absorbing states can represent either the detection of disease or the loss of the individual from the system before detection.

In the standard CDC model, the Markov chain has five states: a state in which the individual is uninfected; a state in which the individual is infected but the virus is not yet detectable; a state in which the virus is detectable; and absorbing states of having quit/been lost from the clinic, or of having been detected (the goal). The typical rates of transition between the Markov states are the probability p per unit time of being infected with the virus, w for the rate of window-period removal (time until the virus is detectable), q for the quit/loss rate from the system, and d for the detection rate, assuming a typical rate at which the health system administers tests of the blood product or patients in question.

Classical example of HIV or hepatitis virus screening model.

It follows that we can "walk along" the Markov model to identify the overall probability of detection for a person starting as undetected, by multiplying the probabilities of transition to each next state of the model:

\frac{w}{w+q} \times \frac{d}{d+q}.

The subsequent total absolute number of false negative tests—the primary CDC concern—would then be the rate of tests, multiplied by the probability of reaching the infected but undetectable state, times the duration of staying in the infected undetectable state:

\text{(rate of tests)} \times \frac{p}{p+q} \times \frac{1}{w+q}.
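One way to check the detection-probability product is to build the embedded jump chain of the model and apply B = NR. The sketch below uses illustrative placeholder rates, not CDC-published values:

```python
import numpy as np

# Illustrative rates (made-up values, not CDC figures):
w = 0.5    # window-period removal rate (virus becomes detectable)
d = 0.2    # detection rate (testing)
q = 0.05   # quit/loss-to-follow-up rate

# Embedded jump chain over transient states [window, detectable]
# and absorbing states [detected, lost].
Q = np.array([
    [0.0, w / (w + q)],             # window -> detectable
    [0.0, 0.0],
])
R = np.array([
    [0.0,         q / (w + q)],     # window -> lost
    [d / (d + q), q / (d + q)],     # detectable -> detected or lost
])

N = np.linalg.inv(np.eye(2) - Q)
B = N @ R

# B[0, 0] is the probability of eventual detection starting in the window state;
# it equals (w / (w + q)) * (d / (d + q)), the product written above.
assert np.isclose(B[0, 0], (w / (w + q)) * (d / (d + q)))
```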



References

  1. Grinstead, Charles M.; Snell, J. Laurie (July 1997). "Ch. 11: Markov Chains" (PDF). Introduction to Probability. American Mathematical Society. ISBN 978-0-8218-0749-1.
  2. Kemeny, John G.; Snell, J. Laurie (July 1976) [1960]. "Ch. 3: Absorbing Markov Chains". In Gehring, F. W.; Halmos, P. R. (eds.). Finite Markov Chains (2nd ed.). New York: Springer-Verlag. p. 224. ISBN 978-0-387-90192-3.
  3. Based on the definition found in S. C. Althoen; L. King; K. Schilling (March 1993). "How Long Is a Game of Snakes and Ladders?". The Mathematical Gazette. 77 (478): 71–76. doi:10.2307/3619261. JSTOR 3619261.
  4. Norris, J. R. (1998-07-28). Markov Chains. Cambridge: Cambridge University Press. ISBN 9780521633963.
  5. Sanders, Gillian D.; Anaya, Henry D.; Asch, Steven; Hoang, Tuyen; Golden, Joya F.; Bayoumi, Ahmed M.; Owens, Douglas K. (June 2010). "Cost-Effectiveness of Strategies to Improve HIV Testing and Receipt of Results: Economic Analysis of a Randomized Controlled Trial". Journal of General Internal Medicine. 25 (6): 556–563. doi:10.1007/s11606-010-1265-5. ISSN   0884-8734. PMC   2869414 . PMID   20204538.