Markov Chains and Mixing Times is a book on Markov chain mixing times. The second edition was written by David A. Levin, and Yuval Peres. Elizabeth Wilmer was a co-author on the first edition and is credited as a contributor to the second edition. The first edition was published in 2009 by the American Mathematical Society, [1] [2] with an expanded second edition in 2017. [3] [4] [5] [6]
A Markov chain is a stochastic process defined by a set of states and, for each state, a probability distribution on the states. Starting from an initial state, it follows a sequence of states where each state in the sequence is chosen randomly from the distribution associated with the previous state. In that sense, it is "memoryless": each random choice depends only on the current state, and not on the past history of states. Under mild restrictions, a Markov chain with a finite set of states will have a stationary distribution that it converges to, meaning that, after a sufficiently large number of steps, the probability of being in each state will close to that of the stationary distribution, regardless of the initial state or of the exact number of steps. The mixing time of a Markov chain is the number of steps needed for this convergence to happen, to a suitable degree of accuracy. A family of Markov chains is said to be rapidly mixing if the mixing time is a polynomial function of some size parameter of the Markov chain, and slowly mixing otherwise. This book is about finite Markov chains, their stationary distributions and mixing times, and methods for determining whether Markov chains are rapidly or slowly mixing. [1] [4]
A classical and familiar example of this phenomenon involves shuffling decks of cards: starting from a non-random initial deck of cards, how many shuffles does it take to reach a nearly-random permutation? This can be modeled as a Markov chain whose states are orderings of the card deck and whose state-to-state transition probabilities are given by some mathematical model of random shuffling such as the Gilbert–Shannon–Reeds model. In this situation, rapid mixing of the Markov chain means that one does not have to perform a huge number of shuffles to reach a sufficiently randomized state. Beyond card games, similar considerations arise in the behavior of the physical systems studied in statistical mechanics, and of certain randomized algorithms. [1] [4]
The book is divided into two parts, the first more introductory and the second more advanced. [2] [6] After three chapters of introductory material on Markov chains, chapter four defines the ways of measuring the distance of a Markov chain to its stationary distribution and the time it takes to reach that distance. Chapter five describes coupling, one of the standard techniques of bounding mixing times. In this technique, one sets up two Markov chains, one starting from the given initial state and the other from the stationary distribution, with transitions that have the correct probabilities within each chain but are not independent from chain-to-chain, in such a way that the two chains become likely to move to the same states as each other. In this way, the mixing time can be bounded by the time for the two coupled chains to synchronize. Chapter six discusses a technique called "strong stationary times" with which, for some Markov chains, one can prove that choosing a stopping time randomly from a certain distribution will result in a state drawn from the stationary distribution. [6]
After a chapter on lower bounds on mixing time based on the "bottleneck ratio" and isoperimetric number, [5] the next two chapters of the first part cover two important examples: card shuffling and random walks on graphs. Chapters 10 and 11 consider two more parameters closely related to the mixing time, the hitting time at which the Markov chain first reaches a specified state, and the cover time at which it has first reached all states. [6] They also discuss time-reversible Markov chains and their connection to electrical networks. [5] The final chapter of this part discusses the connection between the spectral gap of a Markov chain and its mixing time. [6]
The second part of the book includes many more examples in which this theory has been applied, including the Glauber dynamics on the Ising model, Markov models of chromosomal rearrangement, the asymmetric simple exclusion process in which particles randomly jump to unoccupied adjacent spaces, and random walks in the lamplighter group. [6] Topics covered in the second part of the book include more on spectral graphs and expander graphs, [5] path coupling (in which a sequence of more than two Markov chains is coupled in pairs), [6] connections between coupling and the earth mover's distance, martingales, [5] critical temperatures, [2] the "cutoff effect" in which the probability distribution of the chain transitions rapidly between unmixed and mixed, [1] [2] [6] the evolving set process (a derived Markov chain on sets of states of the given chain), [2] Markov chains with infinitely many states, and Markov chains that operate in continuous time rather than by a discrete sequence of steps. A guest chapter by Jim Propp and David B. Wilson describes coupling from the past, a method for obtaining samples drawn exactly from the stationary distribution rather than (as one obtains from Markov chain Monte Carlo methods) approximations to this distribution. [1] [2] The final chapter collects open problems in this area. [5]
This book can be used either as a reference by researchers in areas using these methods, or as the basis for a graduate-level course, [1] particularly one limited to the more introductory material in the first part of the book [6] where only an undergraduate-level knowledge of probability theory and linear algebra is required. [1] [2] However, reviewer Rick Durrett suggests that the book's contents would be too advanced for undergraduate courses, even at research-level universities, [6] and reviewer Takis Konstantopoulos suggests that the book's contents would be better appreciated by a reader who has already had some exposure to the material that it covers. [5]
Reviewer Olle Häggström calls the book "authoritative and highly readable". [1] Reviewer H. M. Mai writes that its explanations are careful and "well motivated", and that the writing is "lucid and clear". [2] Reviewer László Lakatos calls it "a brilliant guide to the modern theory of Markov chains". And reviewer David Aldous predicts that it "will long remain the definitive required reading" in this area.
In probability theory and related fields, a stochastic or random process is a mathematical object usually defined as a sequence of random variables, where the index of the sequence has the interpretation of time. Stochastic processes are widely used as mathematical models of systems and phenomena that appear to vary in a random manner. Examples include the growth of a bacterial population, an electrical current fluctuating due to thermal noise, or the movement of a gas molecule. Stochastic processes have applications in many disciplines such as biology, chemistry, ecology, neuroscience, physics, image processing, signal processing, control theory, information theory, computer science, and telecommunications. Furthermore, seemingly random changes in financial markets have motivated the extensive use of stochastic processes in finance.
A Markov chain or Markov process is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. Informally, this may be thought of as, "What happens next depends only on the state of affairs now." A countably infinite sequence, in which the chain moves state at discrete time steps, gives a discrete-time Markov chain (DTMC). A continuous-time process is called a continuous-time Markov chain (CTMC). It is named after the Russian mathematician Andrey Markov.
In mathematics, a stochastic matrix is a square matrix used to describe the transitions of a Markov chain. Each of its entries is a nonnegative real number representing a probability. It is also called a probability matrix, transition matrix, substitution matrix, or Markov matrix. The stochastic matrix was first developed by Andrey Markov at the beginning of the 20th century, and has found use throughout a wide variety of scientific fields, including probability theory, statistics, mathematical finance and linear algebra, as well as computer science and population genetics. There are several different definitions and types of stochastic matrices:
In mathematics, a random walk, sometimes known as a drunkard's walk, is a random process that describes a path that consists of a succession of random steps on some mathematical space.
In statistics, Markov chain Monte Carlo (MCMC) methods comprise a class of algorithms for sampling from a probability distribution. By constructing a Markov chain that has the desired distribution as its equilibrium distribution, one can obtain a sample of the desired distribution by recording states from the chain. The more steps that are included, the more closely the distribution of the sample matches the actual desired distribution. Various algorithms exist for constructing chains, including the Metropolis–Hastings algorithm.
In probability, a discrete-time Markov chain (DTMC) is a sequence of random variables, known as a stochastic process, in which the value of the next variable depends only on the value of the current variable, and not any variables in the past. For instance, a machine may have two states, A and E. When it is in state A, there is a 40% chance of it moving to state E and a 60% chance of it remaining in state A. When it is in state E, there is a 70% chance of it moving to A and a 30% chance of it staying in E. The sequence of states of the machine is a Markov chain. If we denote the chain by then is the state which the machine starts in and is the random variable describing its state after 10 transitions. The process continues forever, indexed by the natural numbers.
In probability theory, the mixing time of a Markov chain is the time until the Markov chain is "close" to its steady state distribution.
In graph theory the conductance of a graph G = (V, E) measures how "well-knit" the graph is: it controls how fast a random walk on G converges to its stationary distribution. The conductance of a graph is often called the Cheeger constant of a graph as the analog of its counterpart in spectral geometry. Since electrical networks are intimately related to random walks with a long history in the usage of the term "conductance", this alternative name helps avoid possible confusion.
Among Markov chain Monte Carlo (MCMC) algorithms, coupling from the past is a method for sampling from the stationary distribution of a Markov chain. Contrary to many MCMC algorithms, coupling from the past gives in principle a perfect sample from the stationary distribution. It was invented by James Propp and David Wilson in 1996.
James Gary Propp is a professor of mathematics at the University of Massachusetts Lowell.
Harry Kesten was a Jewish American mathematician best known for his work in probability, most notably on random walks on groups and graphs, random matrices, branching processes, and percolation theory.
In mathematics, the Fortuin–Kasteleyn–Ginibre (FKG) inequality is a correlation inequality, a fundamental tool in statistical mechanics and probabilistic combinatorics, due to Cees M. Fortuin, Pieter W. Kasteleyn, and Jean Ginibre (1971). Informally, it says that in many random systems, increasing events are positively correlated, while an increasing and a decreasing event are negatively correlated. It was obtained by studying the random cluster model.
This page lists articles related to probability theory. In particular, it lists many articles corresponding to specific probability distributions. Such articles are marked here by a code of the form (X:Y), which refers to number of random variables involved and the type of the distribution. For example (2:DC) indicates a distribution with two random variables, discrete or continuous. Other codes are just abbreviations for topics. The list of codes can be found in the table of contents.
In queueing theory, a discipline within the mathematical theory of probability, an M/G/1 queue is a queue model where arrivals are Markovian, service times have a General distribution and there is a single server. The model name is written in Kendall's notation, and is an extension of the M/M/1 queue, where service times must be exponentially distributed. The classic application of the M/G/1 queue is to model performance of a fixed head hard disk.
Yuval Peres is a mathematician known for his research in probability theory, ergodic theory, mathematical analysis, theoretical computer science, and in particular for topics such as fractals and Hausdorff measure, random walks, Brownian motion, percolation and Markov chain mixing times. He was born in Israel and obtained his Ph.D. at the Hebrew University of Jerusalem in 1990 under the supervision of Hillel Furstenberg. He was a faculty member at the Hebrew University and the University of California at Berkeley, and a Principal Researcher at Microsoft Research in Redmond, Washington. Peres has been accused of sexual harassment by several female scientists.
Elizabeth Lee Wilmer is an American mathematician known for her work on Markov chain mixing times. She is a professor, and former department head, of mathematics at Oberlin College.
Point Processes is a book on the mathematics of point processes, randomly located sets of points on the real line or in other geometric spaces. It was written by David Cox and Valerie Isham, and published in 1980 by Chapman & Hall in their Monographs on Applied Probability and Statistics book series. The Basic Library List Committee of the Mathematical Association of America has suggested its inclusion in undergraduate mathematics libraries.
In the mathematical theory of Markov chains, the Markov chain tree theorem is an expression for the stationary distribution of a Markov chain with finitely many states. It sums up terms for the rooted spanning trees of the Markov chain, with a positive combination for each tree. The Markov chain tree theorem is closely related to Kirchhoff's theorem on counting the spanning trees of a graph, from which it can be derived. It was first stated by Hill (1966), for certain Markov chains arising in thermodynamics, and proved in full generality by Leighton & Rivest (1986), motivated by an application in limited-memory estimation of the probability of a biased coin.