Continuous-time Markov chain


A continuous-time Markov chain (CTMC) is a continuous-time stochastic process in which, for each state, the process will change state after an exponentially distributed holding time and then move to a different state as specified by the probabilities of a stochastic matrix. An equivalent formulation describes the process as changing state according to the least value of a set of exponential random variables, one for each possible state it can move to, with the parameters determined by the current state.


An example of a CTMC with three states $\{0, 1, 2\}$ is as follows: the process makes a transition after the amount of time specified by the holding time, an exponential random variable $E_i$, where $i$ is its current state. Each random variable is independent and such that $E_0 \sim \text{Exp}(6)$, $E_1 \sim \text{Exp}(12)$ and $E_2 \sim \text{Exp}(18)$. When a transition is to be made, the process moves according to the jump chain, a discrete-time Markov chain with stochastic matrix:

$$\begin{pmatrix} 0 & \tfrac{1}{2} & \tfrac{1}{2} \\ \tfrac{1}{3} & 0 & \tfrac{2}{3} \\ \tfrac{5}{6} & \tfrac{1}{6} & 0 \end{pmatrix}$$

Equivalently, by the property of competing exponentials, this CTMC changes state from state $i$ according to the minimum of two random variables $E_{i,j}$, $j \ne i$, which are independent and such that $E_{i,j} \sim \text{Exp}(q_{i,j})$ for $i \ne j$, where the parameters are given by the Q-matrix $Q = (q_{i,j})$:

$$Q = \begin{pmatrix} -6 & 3 & 3 \\ 4 & -12 & 8 \\ 15 & 3 & -18 \end{pmatrix}$$

Each non-diagonal entry $q_{i,j}$ can be computed as the probability that the jump chain moves from state $i$ to state $j$, divided by the expected holding time of state $i$. The diagonal entries are chosen so that each row sums to 0.
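To make the two equivalent descriptions concrete, the following sketch simulates this chain both ways: via the jump chain with holding times, and via competing exponential clocks. It is a minimal illustration only; the function names are ours, not from any library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Q-matrix of the example chain; each row sums to 0.
Q = np.array([[-6.0, 3.0, 3.0],
              [4.0, -12.0, 8.0],
              [15.0, 3.0, -18.0]])

def step_jump_chain(i):
    """Hold for an Exp(-q_ii) time, then jump via the jump-chain probabilities."""
    rate = -Q[i, i]
    hold = rng.exponential(1.0 / rate)   # holding time E_i ~ Exp(rate)
    probs = Q[i].copy()
    probs[i] = 0.0
    probs /= rate                        # jump probabilities q_ij / (-q_ii)
    return hold, rng.choice(len(Q), p=probs)

def step_competing(i):
    """Equivalent description: the minimum of competing exponential clocks."""
    times = np.array([rng.exponential(1.0 / Q[i, j]) if j != i else np.inf
                      for j in range(len(Q))])
    j = int(times.argmin())
    return times[j], j

# Simulate a short trajectory; either stepper produces the same law.
t, state = 0.0, 0
for _ in range(5):
    hold, nxt = step_jump_chain(state)
    print(f"stay in {state} for {hold:.3f}, then jump to {nxt}")
    t, state = t + hold, nxt
```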

A CTMC satisfies the Markov property, that its behavior depends only on its current state and not on its past behavior, due to the memorylessness of the exponential distribution and of discrete-time Markov chains.

Definition

Let $(\Omega, \mathcal{A}, \Pr)$ be a probability space, let $S$ be a countable nonempty set, and let $T = \mathbb{R}_{\ge 0}$ ($T$ for "time"). Equip $S$ with the discrete metric, so that we can make sense of right continuity of functions $f : T \to S$. A continuous-time Markov chain is defined by an initial probability distribution $\lambda$ on $S$ and a transition-rate matrix $Q = (q_{i,j})_{i,j \in S}$ with real entries, such that: [1]

  1. $0 \le q_{i,j}$ for all distinct $i, j \in S$,
  2. $\sum_{j \in S \setminus \{i\}} q_{i,j} = -q_{i,i}$ for all $i \in S$. (Even if $S$ is infinite, this sum is a priori well defined (possibly equalling $+\infty$) because each term appearing in the sum is nonnegative. A posteriori, we know the sum must also be finite (not equal to $+\infty$), since we're assuming it's equal to $-q_{i,i}$ and we've assumed $Q$ is real valued. Some authors instead use a definition that's word-for-word the same except for a modified stipulation $q_{i,i} \in [-\infty, 0]$, and say $Q$ is stable or totally stable to mean $-q_{i,i} < +\infty$ for all $i$, i.e., every entry is real valued.) [2] [3] [4]

Note that the row sums of $Q$ are 0: $\sum_{j \in S} q_{i,j} = 0$ for every $i \in S$, or more succinctly, $Q \mathbf{1} = \mathbf{0}$. This situation contrasts with the situation for discrete-time Markov chains, where all row sums of the transition matrix equal unity.

Now, let $X : T \times \Omega \to S$ be such that $X_t := X(t, \cdot)$ is $\mathcal{A}$-measurable for every $t \in T$. There are three equivalent ways to define $X$ being Markov with initial distribution $\lambda$ and rate matrix $Q$: via transition probabilities, via the jump chain and holding times, or infinitesimally. [5]

As a prelude to a transition-probability definition, we first motivate the definition of a regular rate matrix. We will use the transition-rate matrix $Q$ to specify the dynamics of the Markov chain by means of generating a collection of transition matrices $P(t)$ on $S$ ($t \in T$), via the following theorem.

Theorem: Existence of solution to Kolmogorov backward equations. [6]   There exists a family of matrices $(P(t))_{t \in T}$ on $S$ such that for all $i, j \in S$ the entry $P(t)_{i,j}$ is differentiable in $t$ and $(P(t))_{t \in T}$ satisfies the Kolmogorov backward equations:

$$P'(t) = Q\,P(t), \qquad P(0) = I. \tag{0}$$

We say $Q$ is regular to mean that we do have uniqueness for the above system, i.e., that there exists exactly one solution. [7] [8] We say $Q$ is irregular to mean $Q$ is not regular. If $S$ is finite, then there is exactly one solution, namely $P(t) = e^{tQ}$, and hence $Q$ is regular. Otherwise, $S$ is infinite, and there exist irregular transition-rate matrices on $S$. [note 1] If $Q$ is regular, then for the unique solution $P$, for each $t \in T$, $P(t)$ will be a stochastic matrix. [6] We will assume $Q$ is regular from the beginning of the following subsection up through the end of this section, even though it is conventional [10] [11] [12] to not include this assumption. (Note for the expert: thus we are not defining continuous-time Markov chains in general but only non-explosive continuous-time Markov chains.)
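For finite $S$ the unique solution is the matrix exponential, and the backward equations can be verified numerically with any matrix-exponential routine. A small sketch, reusing the example Q-matrix from the introduction (the check itself is ours, not from the cited sources):

```python
import numpy as np
from scipy.linalg import expm

Q = np.array([[-6.0, 3.0, 3.0],
              [4.0, -12.0, 8.0],
              [15.0, 3.0, -18.0]])

P = lambda s: expm(s * Q)        # unique solution P(t) = e^{tQ} for finite S
t, h = 0.7, 1e-6

# Finite-difference check of the backward equations P'(t) = Q P(t):
lhs = (P(t + h) - P(t - h)) / (2 * h)    # numerical derivative of P at t
rhs = Q @ P(t)
print(np.max(np.abs(lhs - rhs)))         # small: the equations hold

print(P(t).sum(axis=1))                  # each row of P(t) sums to 1 (stochastic)
```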

Transition-probability definition

Let $P$ be the (unique) solution of the system (0). (Uniqueness is guaranteed by our assumption that $Q$ is regular.) We say $X$ is Markov with initial distribution $\lambda$ and rate matrix $Q$ to mean: for any nonnegative integer $n$, for all $t_0, \dots, t_{n+1} \in T$ such that $t_0 < t_1 < \cdots < t_{n+1}$, for all $i_0, \dots, i_{n+1} \in S$,

$$\Pr(X_{t_0} = i_0, \dots, X_{t_{n+1}} = i_{n+1}) = \lambda_{i_0}\, P(t_1 - t_0)_{i_0, i_1} \cdots P(t_{n+1} - t_n)_{i_n, i_{n+1}}. \tag{1}$$ [10]

Using induction and the fact that $\Pr(A \cap B) = \Pr(A \mid B)\,\Pr(B)$, we can show the equivalence of the above statement containing (1) and the following statement: for all $i \in S$, $\Pr(X_0 = i) = \lambda_i$, and for any nonnegative integer $n$, for all $t_0, \dots, t_{n+1} \in T$ such that $t_0 < t_1 < \cdots < t_{n+1}$, for all $i_0, \dots, i_{n+1} \in S$ such that $\Pr(X_{t_0} = i_0, \dots, X_{t_n} = i_n) > 0$ (it follows that $\Pr(X_{t_n} = i_n) > 0$),

$$\Pr(X_{t_{n+1}} = i_{n+1} \mid X_{t_n} = i_n, \dots, X_{t_0} = i_0) = P(t_{n+1} - t_n)_{i_n, i_{n+1}}. \tag{2}$$

It follows from continuity of the functions $t \mapsto P(t)_{i,j}$ ($i, j \in S$) that the trajectory $t \mapsto X_t(\omega)$ is almost surely right continuous (with respect to the discrete metric on $S$): there exists a $\Pr$-null set $N$ such that the trajectory $t \mapsto X_t(\omega)$ is right continuous for all $\omega \in \Omega \setminus N$. [13]

Jump-chain/holding-time definition

Sequences associated to a right-continuous function

Let $f : T \to S$ be right continuous (when we equip $S$ with the discrete metric). Define the jump times

$$J_0 = 0, \qquad J_{n+1} = \inf\{t \ge J_n : f(t) \ne f(J_n)\} \quad (\text{with } \inf \emptyset := +\infty),$$

let

$$H_n = J_{n+1} - J_n$$

be the holding-time sequence associated to $f$, choose an arbitrary $s_\infty \in S$ and let

$$Y_n = \begin{cases} f(J_n) & \text{if } J_n < +\infty, \\ s_\infty & \text{otherwise} \end{cases}$$

be "the state sequence" associated to $f$.

Definition of the jump matrix Π

The jump matrix $\Pi$, alternatively written $\Pi(Q)$ if we wish to emphasize the dependence on $Q$, is the matrix

$$\Pi = (\pi_{i,j})_{i,j \in S}, \qquad \pi_{i,j} = \begin{cases} 1 & \text{if } i = j \in Z, \\ 0 & \text{if } i = j \notin Z, \\ 0 & \text{if } i \ne j \text{ and } i \in Z, \\ -q_{i,j}/q_{i,i} & \text{if } i \ne j \text{ and } i \notin Z, \end{cases}$$

where $Z = \{ i \in S : q_{i,i} = 0 \}$ is the zero set of the function $i \mapsto q_{i,i}$. [14]
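A direct transcription of this definition into code (a sketch; the helper name jump_matrix is ours):

```python
import numpy as np

def jump_matrix(Q):
    """Jump matrix Pi(Q); states with q_ii = 0 are absorbing for the jump chain."""
    Q = np.asarray(Q, dtype=float)
    Pi = np.zeros_like(Q)
    for i in range(len(Q)):
        if Q[i, i] == 0.0:           # i in Z: the zero set of i -> q_ii
            Pi[i, i] = 1.0
        else:
            Pi[i] = -Q[i] / Q[i, i]  # pi_ij = -q_ij / q_ii for j != i
            Pi[i, i] = 0.0
    return Pi

Q = np.array([[-6.0, 3.0, 3.0],
              [4.0, -12.0, 8.0],
              [15.0, 3.0, -18.0]])
print(jump_matrix(Q))   # rows (0, 1/2, 1/2), (1/3, 0, 2/3), (5/6, 1/6, 0)
```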

Jump-chain/holding-time property

We say $X$ is Markov with initial distribution $\lambda$ and rate matrix $Q$ to mean: the trajectories of $X$ are almost surely right continuous; letting $f$ be a modification of $X$ with (everywhere) right-continuous trajectories, $\sup_n J_n(f) = +\infty$ almost surely (note to experts: this condition says $X$ is non-explosive); the state sequence $(Y_n)_n$ is a discrete-time Markov chain with initial distribution $\lambda$ (jump-chain property) and transition matrix $\Pi(Q)$; and, conditional on the state sequence, the holding times $H_0, H_1, \dots$ are independent and exponentially distributed with $H_n \sim \text{Exp}(-q_{Y_n, Y_n})$ (holding-time property).

Infinitesimal definition

Figure: The continuous-time Markov chain is characterized by the transition rates, the derivatives with respect to time of the transition probabilities between states i and j.

We say $X$ is Markov with initial distribution $\lambda$ and rate matrix $Q$ to mean: for all $i \in S$, $\Pr(X_0 = i) = \lambda_i$, and for all $i, j \in S$, for all $t \in T$ such that $\Pr(X_t = i) > 0$, and for small strictly positive values of $h$, the following holds:

$$\Pr(X_{t+h} = j \mid X_t = i) = [i = j] + q_{i,j}\, h + o(h),$$

where the term $[i = j]$ is $1$ if $i = j$ and otherwise $0$, and the little-o term $o(h)$ depends in a certain way on $i$, $j$, and $t$. [15] [16]

The above equation shows that $q_{i,j}$ can be seen as measuring how quickly the transition from $i$ to $j$ happens for $i \ne j$, and how quickly the transition away from $i$ happens for $i = j$.
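For a finite chain this asymptotic can be checked against the matrix exponential, since $P(h) = e^{hQ} = I + hQ + O(h^2)$. A small numerical sketch (assuming the example Q-matrix from the introduction):

```python
import numpy as np
from scipy.linalg import expm

Q = np.array([[-6.0, 3.0, 3.0],
              [4.0, -12.0, 8.0],
              [15.0, 3.0, -18.0]])

for h in (1e-2, 1e-3, 1e-4):
    # Pr(X_{t+h} = j | X_t = i) as a matrix, versus the first-order term I + hQ
    err = np.max(np.abs(expm(h * Q) - (np.eye(3) + h * Q)))
    print(f"h = {h:g}: max |P(h) - (I + hQ)| = {err:.2e}")   # shrinks like O(h^2)
```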

Properties

Communicating classes

Communicating classes, transience, recurrence and positive and null recurrence are defined identically as for discrete-time Markov chains.

Transient behaviour

Write P(t) for the matrix with entries $p_{ij} = \Pr(X_t = j \mid X_0 = i)$. Then the matrix P(t) satisfies the forward equation, a first-order differential equation

$$P'(t) = P(t)\,Q,$$

where the prime denotes differentiation with respect to t. The solution to this equation is given by a matrix exponential

$$P(t) = e^{tQ}.$$

Consider a simple case such as a CTMC on the state space {1, 2}. The general Q matrix for such a process is the following 2 × 2 matrix with α, β > 0:

$$Q = \begin{pmatrix} -\alpha & \alpha \\ \beta & -\beta \end{pmatrix}.$$

The above relation for the forward matrix can be solved explicitly in this case to give

$$P(t) = \frac{1}{\alpha + \beta} \begin{pmatrix} \beta + \alpha e^{-(\alpha+\beta)t} & \alpha - \alpha e^{-(\alpha+\beta)t} \\ \beta - \beta e^{-(\alpha+\beta)t} & \alpha + \beta e^{-(\alpha+\beta)t} \end{pmatrix}.$$

Computing direct solutions is complicated for larger matrices. Instead, the fact that Q is the generator for a semigroup of matrices,

$$P(t+s) = P(t)\,P(s),$$

is used.
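In code, one would simply call a matrix-exponential routine rather than solve the system by hand. The sketch below checks the closed-form two-state solution above against scipy's expm (the parameter values α = 2, β = 3 are ours):

```python
import numpy as np
from scipy.linalg import expm

alpha, beta, t = 2.0, 3.0, 0.5
Q = np.array([[-alpha, alpha],
              [beta, -beta]])

# Closed-form solution of the forward equation for the two-state chain
s = alpha + beta
e = np.exp(-s * t)
P_exact = np.array([[beta + alpha * e, alpha - alpha * e],
                    [beta - beta * e, alpha + beta * e]]) / s

print(np.allclose(expm(t * Q), P_exact))   # True: expm matches the formula
```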

Stationary distribution

The stationary distribution for an irreducible recurrent CTMC is the probability distribution to which the process converges for large values of t. Observe that for the two-state process considered earlier with P(t) given by

$$P(t) = \frac{1}{\alpha + \beta} \begin{pmatrix} \beta + \alpha e^{-(\alpha+\beta)t} & \alpha - \alpha e^{-(\alpha+\beta)t} \\ \beta - \beta e^{-(\alpha+\beta)t} & \alpha + \beta e^{-(\alpha+\beta)t} \end{pmatrix},$$

as t → ∞ the distribution tends to

$$P_\infty = \frac{1}{\alpha + \beta} \begin{pmatrix} \beta & \alpha \\ \beta & \alpha \end{pmatrix}.$$

Observe that each row converges to the same distribution, since the limit does not depend on the starting state. The row vector π may be found by solving

$$\pi Q = 0$$

with the constraint

$$\sum_{i} \pi_i = 1.$$
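Numerically, π is the normalized left null vector of Q, as in the following sketch (a minimal illustration for the two-state chain; the parameter values are ours):

```python
import numpy as np
from scipy.linalg import null_space

alpha, beta = 2.0, 3.0
Q = np.array([[-alpha, alpha],
              [beta, -beta]])

# pi Q = 0  <=>  Q^T pi^T = 0: take the (one-dimensional) null space of Q^T
pi = null_space(Q.T)[:, 0]
pi /= pi.sum()                 # normalize so the entries sum to 1
print(pi)                      # (beta, alpha) / (alpha + beta) = [0.6, 0.4]
```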

Example 1

Figure: Directed graph representation of a continuous-time Markov chain describing the state of financial markets (note: numbers are made-up).

The image to the right describes a continuous-time Markov chain with state space {Bull market, Bear market, Stagnant market} and the transition-rate matrix $Q$ shown in the figure.

The stationary distribution of this chain can be found by solving $\pi Q = 0$, subject to the constraint that the elements of $\pi$ must sum to 1.

Example 2

Figure: Transition graph with transition probabilities, exemplary for the states 1, 5, 6 and 8. There is a bidirectional secret passage between states 2 and 8.

The image to the right describes a discrete-time Markov chain modeling Pac-Man with state space {1,2,3,4,5,6,7,8,9}. The player controls Pac-Man through a maze, eating pac-dots. Meanwhile, he is being hunted by ghosts. For convenience, the maze shall be a small 3 × 3 grid, and the ghosts move randomly in horizontal and vertical directions. A secret passageway between states 2 and 8 can be used in both directions. Entries with probability zero are omitted from the transition matrix shown in the figure.

This Markov chain is irreducible, because the ghosts can fly from every state to every state in a finite number of transitions. Due to the secret passageway, the Markov chain is also aperiodic, because the ghosts can move from any state to any state both in an even and in an odd number of state transitions. Therefore, a unique stationary distribution exists and can be found by solving $\pi P = \pi$, subject to the constraint that the elements of $\pi$ sum to 1. Solving this linear equation subject to the constraint shows that the central state and the border states 2 and 8 of the adjacent secret passageway are visited most, and the corner states are visited least, as the sketch below illustrates.
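The transition matrix itself is not reproduced above, but under the stated rules (uniform random moves along the grid edges, plus the passage between 2 and 8) it can be reconstructed and solved as follows; this is our reconstruction of the figure, not the original data:

```python
import numpy as np

# 3x3 grid, states 1..9 laid out row by row; ghosts move uniformly at random
# along horizontal/vertical edges, plus a secret passage between 2 and 8.
edges = set()
for r in range(3):
    for c in range(3):
        s = 3 * r + c + 1
        if c < 2:
            edges.add((s, s + 1))      # horizontal neighbor
        if r < 2:
            edges.add((s, s + 3))      # vertical neighbor
edges.add((2, 8))                      # bidirectional secret passage

P = np.zeros((9, 9))
for a, b in edges:
    P[a - 1, b - 1] = P[b - 1, a - 1] = 1.0
P /= P.sum(axis=1, keepdims=True)      # uniform over the available moves

# Stationary distribution: solve pi P = pi with sum(pi) = 1 by power iteration
pi = np.full(9, 1.0 / 9)
for _ in range(1000):
    pi = pi @ P
print(pi.round(3))   # center 5 and passage states 2, 8 largest; corners smallest
```

For a random walk on a graph the stationary probability of a state is proportional to its degree, which is why the center and the passage states (degree 4) dominate the corners (degree 2).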

Time reversal

For a CTMC $X_t$, the time-reversed process is defined to be $\hat X_t = X_{T-t}$, for a fixed time $T$. By Kelly's lemma this process has the same stationary distribution as the forward process.

A chain is said to be reversible if the reversed process is the same as the forward process. Kolmogorov's criterion states that the necessary and sufficient condition for a process to be reversible is that the product of transition rates around a closed loop must be the same in both directions.

Embedded Markov chain

One method of finding the stationary probability distribution, π, of an ergodic continuous-time Markov chain, Q, is by first finding its embedded Markov chain (EMC). Strictly speaking, the EMC is a regular discrete-time Markov chain. Each element of the one-step transition probability matrix of the EMC, S, is denoted by $s_{ij}$, and represents the conditional probability of transitioning from state i into state j. These conditional probabilities may be found by

$$s_{ij} = \begin{cases} \dfrac{q_{ij}}{\sum_{k \ne i} q_{ik}} & \text{if } i \ne j, \\ 0 & \text{otherwise.} \end{cases}$$

From this, S may be written as

$$S = I - \left(\operatorname{diag}(Q)\right)^{-1} Q,$$

where I is the identity matrix and diag(Q) is the diagonal matrix formed by selecting the main diagonal from the matrix Q and setting all other elements to zero.

To find the stationary probability distribution vector, we must next find $\varphi$ such that

$$\varphi S = \varphi,$$

with $\varphi$ being a row vector, such that all elements in $\varphi$ are greater than 0 and $\|\varphi\|_1 = 1$. From this, π may be found as

$$\pi = \frac{-\varphi \left(\operatorname{diag}(Q)\right)^{-1}}{\left\| \varphi \left(\operatorname{diag}(Q)\right)^{-1} \right\|_1}.$$

(S may be periodic, even if Q is not. Once π is found, it must be normalized to a unit vector.)
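Putting the EMC method together for the example Q-matrix from the introduction (a sketch; the intermediate variable names are ours):

```python
import numpy as np
from scipy.linalg import null_space

Q = np.array([[-6.0, 3.0, 3.0],
              [4.0, -12.0, 8.0],
              [15.0, 3.0, -18.0]])

D_inv = np.diag(1.0 / np.diag(Q))        # inverse of diag(Q)
S = np.eye(len(Q)) - D_inv @ Q           # EMC transition matrix S = I - diag(Q)^{-1} Q

phi = null_space(S.T - np.eye(len(Q)))[:, 0]   # left fixed vector: phi S = phi
phi /= phi.sum()                               # fix sign and scale

pi = -phi @ D_inv                        # un-normalized stationary distribution
pi /= pi.sum()                           # normalize to a unit (probability) vector

print(np.allclose(pi @ Q, 0))            # True: pi is stationary for the CTMC
```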

Another discrete-time process that may be derived from a continuous-time Markov chain is a δ-skeleton: the (discrete-time) Markov chain formed by observing X(t) at intervals of δ units of time. The random variables X(0), X(δ), X(2δ), ... give the sequence of states visited by the δ-skeleton.


References

  1. Ross, S.M. (2010). Introduction to Probability Models (10th ed.). Elsevier. ISBN 978-0-12-375686-2.
  2. Anderson 1991, See definition on page 64.
  3. Chen & Mao 2021, Definition 2.2.
  4. Chen 2004, Definition 0.1(4).
  5. Norris 1997, Theorem 2.8.4 and Theorem 2.8.2(b).
  6. Anderson 1991, Theorem 2.2.2(1), page 70.
  7. Anderson 1991, Definition on page 81.
  8. Chen 2004, page 2.
  9. Anderson 1991, page 20.
  10. Suhov & Kelbert 2008, Definition 2.6.3.
  11. Chen & Mao 2021, Definition 2.1.
  12. Chen 2004, Definition 0.1.
  13. Chen & Mao 2021, page 56, just below Definition 2.2.
  14. Norris 1997, page 87.
  15. Suhov & Kelbert 2008, Theorem 2.6.6.
  16. Norris 1997, Theorem 2.8.2(c).


Notes

  1. For instance, consider the example $S = \{0, 1, 2, \dots\}$ and $Q$ being the (unique) transition-rate matrix on $S$ such that the only nonzero off-diagonal entries are $q_{i,i+1} = -q_{i,i}$, with rates growing fast enough that $\sum_i 1/q_{i,i+1} < +\infty$, e.g. $q_{i,i+1} = (i+1)^2$. (Then the remaining entries of $Q$ will all be zero. Cf. birth process.) Then $Q$ is irregular, since such a birth process explodes in finite time. Then, for general infinite $S$, indexing $S$ by the nonnegative integers yields that a suitably modified version of the above matrix will be irregular. [9]