Branching process

In probability theory, a branching process is a type of mathematical object known as a stochastic process, which consists of collections of random variables indexed by some set, usually natural or non-negative real numbers. The original purpose of branching processes was to serve as a mathematical model of a population in which each individual in generation n produces some random number of individuals in generation n + 1, according, in the simplest case, to a fixed probability distribution that does not vary from individual to individual. [1] Branching processes are used to model reproduction; for example, the individuals might correspond to bacteria, each of which generates 0, 1, or 2 offspring with some probability in a single time unit. Branching processes can also be used to model other systems with similar dynamics, e.g., the spread of surnames in genealogy or the propagation of neutrons in a nuclear reactor.

A central question in the theory of branching processes is the probability of ultimate extinction, where no individuals exist after some finite number of generations. Using Wald's equation, it can be shown that starting with one individual in generation zero, the expected size of generation n equals μ^n, where μ is the expected number of children of each individual. If μ < 1, then the expected number of individuals goes rapidly to zero, which implies ultimate extinction with probability 1 by Markov's inequality. Alternatively, if μ > 1, then the probability of ultimate extinction is less than 1 (but not necessarily zero; consider a process where each individual has either 0 or 100 children with equal probability. In that case, μ = 50, but the probability of ultimate extinction is greater than 0.5, since 0.5 is already the probability that the first individual has 0 children). If μ = 1, then ultimate extinction occurs with probability 1 unless each individual always has exactly one child.
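
As a sketch of the expectation argument behind these statements (writing Zn for the size of generation n, as in the formulation below), conditioning on the previous generation gives

\mathbb{E}[Z_n] = \mathbb{E}\big[\,\mathbb{E}[Z_n \mid Z_{n-1}]\,\big] = \mu\,\mathbb{E}[Z_{n-1}] = \cdots = \mu^n\,\mathbb{E}[Z_0] = \mu^n,

and, for μ < 1, Markov's inequality bounds the survival probability:

\Pr(Z_n \ge 1) \le \mathbb{E}[Z_n] = \mu^n \to 0 \quad \text{as } n \to \infty.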

In theoretical ecology, the parameter μ of a branching process is called the basic reproductive rate.

Mathematical formulation

The most common formulation of a branching process is that of the Galton–Watson process. Let Zn denote the state in period n (often interpreted as the size of generation n), and let Xn,i be a random variable denoting the number of direct successors of member i in period n, where the Xn,i are independent and identically distributed random variables over all n ∈ {0, 1, 2, ...} and i ∈ {1, ..., Zn}. Then the recurrence equation is

Z_{n+1} = \sum_{i=1}^{Z_n} X_{n,i}

with Z0 = 1.
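
The recurrence can be simulated directly. Below is a minimal illustrative sketch in Python; the function name, the offspring distribution, and the generation count are arbitrary choices for illustration, not part of the standard formulation.

import random

def galton_watson(offspring_probs, generations, z0=1, seed=None):
    """Simulate Z_0, Z_1, ..., Z_n for a Galton-Watson process.

    offspring_probs[k] is the probability that an individual has k children.
    Stops early once a generation is empty, since all later generations are then empty too.
    """
    rng = random.Random(seed)
    support = list(range(len(offspring_probs)))
    sizes = [z0]
    for _ in range(generations):
        z = sizes[-1]
        if z == 0:
            break
        # Each of the z individuals draws an iid offspring count X_{n,i}; Z_{n+1} is their sum.
        sizes.append(sum(rng.choices(support, weights=offspring_probs, k=z)))
    return sizes

# Example: each individual has 0, 1, or 2 children with probabilities 0.1, 0.6, 0.3.
print(galton_watson([0.1, 0.6, 0.3], generations=20, seed=1))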

Alternatively, the branching process can be formulated as a random walk. Let Si denote the state in period i, and let Xi be a random variable that is iid over all i. Then the recurrence equation is

S_i = S_{i-1} + X_i - 1

with S0 = 1. To gain some intuition for this formulation, imagine a walk where the goal is to visit every node, but every time a previously unvisited node is visited, additional nodes are revealed that must also be visited. Let Si represent the number of revealed but unvisited nodes in period i, and let Xi represent the number of new nodes that are revealed when node i is visited. Then in each period, the number of revealed but unvisited nodes equals the number of such nodes in the previous period, plus the new nodes that are revealed when visiting a node, minus the node that is visited. The process ends once all revealed nodes have been visited.
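
This walk can also be sketched as code (again a hypothetical illustration, reusing the offspring distribution from the sketch above): the walk stops when S_i reaches 0, and the number of steps taken equals the total number of individuals ever produced by the branching process.

import random

def exploration_walk(offspring_probs, seed=None, max_steps=10_000):
    """Run the random-walk formulation until all revealed nodes are visited."""
    rng = random.Random(seed)
    support = list(range(len(offspring_probs)))
    s = 1            # S_0 = 1: one revealed but unvisited node (the root)
    visited = 0
    while s > 0 and visited < max_steps:
        x = rng.choices(support, weights=offspring_probs)[0]  # X_i: nodes revealed at step i
        s = s + x - 1      # reveal x new nodes, mark one node as visited
        visited += 1
    return visited, s      # total nodes visited, and 0 if the process died out

print(exploration_walk([0.1, 0.6, 0.3], seed=1))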

Continuous-time branching processes

For discrete-time branching processes, the "branching time" is fixed to be 1 for all individuals. For continuous-time branching processes, each individual waits for a random time (a continuous random variable) and then divides according to the given offspring distribution. The waiting times of different individuals are independent, and they are also independent of the numbers of children. In the most common case, the waiting time is an exponential random variable with the same parameter λ for all individuals, so that the process is Markovian.
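
A minimal event-driven simulation sketch of such a Markov branching process, under the assumptions above (exponential lifetimes with a common rate λ); the rate, time horizon, and offspring distribution below are illustrative choices only.

import heapq
import random

def continuous_time_branching(offspring_probs, lam, t_max, seed=None, max_events=100_000):
    """Simulate a Markov branching process: each individual lives an Exp(lam) time,
    then is replaced by k children with probability offspring_probs[k]."""
    rng = random.Random(seed)
    support = list(range(len(offspring_probs)))
    events = [rng.expovariate(lam)]     # division time of the single ancestor
    heapq.heapify(events)
    population = 1
    history = [(0.0, 1)]                # (time, population size) after each event
    n_events = 0
    while events and n_events < max_events:
        t = heapq.heappop(events)
        if t > t_max:
            break
        k = rng.choices(support, weights=offspring_probs)[0]
        population += k - 1             # the parent dies, k children are born
        history.append((t, population))
        for _ in range(k):
            heapq.heappush(events, t + rng.expovariate(lam))
        n_events += 1
    return history

# Example: rate-1 lifetimes, offspring counts 0/1/2 with probabilities 0.1/0.6/0.3.
print(continuous_time_branching([0.1, 0.6, 0.3], lam=1.0, t_max=5.0, seed=1)[-1])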

Extinction problem for a Galton–Watson process

The ultimate extinction probability is given by

d = \lim_{n \to \infty} \Pr(Z_n = 0).

For any nontrivial case (trivial cases are ones in which the probability of having no offspring is zero for every member of the population; in such cases the probability of ultimate extinction is 0), the probability of ultimate extinction equals one if μ ≤ 1 and is strictly less than one if μ > 1.

The process can be analyzed using the method of probability generating functions. Let p0, p1, p2, ... be the probabilities of producing 0, 1, 2, ... offspring by each individual in each generation. Let dm be the extinction probability by the mth generation. Obviously, d0 = 0. Since the probabilities for all paths that lead to 0 by the mth generation must be added up, the extinction probability is nondecreasing in generations. That is,

d_0 \le d_1 \le d_2 \le \cdots \le 1.

Therefore, dm converges to a limit d, and d is the ultimate extinction probability. If there are j offspring in the first generation, then to die out by the mth generation, each of these lines must die out in m − 1 generations. Since they proceed independently, the probability is (d_{m−1})^j. Thus,

d_m = \sum_{j=0}^{\infty} p_j \, (d_{m-1})^j.
The right-hand side of the equation is a probability generating function. Let h(z) be the ordinary generating function for the pi:

h(z) = \sum_{i=0}^{\infty} p_i z^i.
Using the generating function, the previous equation becomes

d_m = h(d_{m-1}).
Since dm → d, d can be found by solving

d = h(d).
This is also equivalent to finding the intersection point(s) of the lines y = z and y = h(z) for z ≥ 0. y = z is a straight line. y = h(z) is an increasing function (since h′(z) = p1 + 2p2z + 3p3z² + ... ≥ 0) and a convex function (since h″(z) = 2p2 + 6p3z + 12p4z² + ... ≥ 0). There are at most two intersection points. Since (1,1) is always an intersection point of the two functions, only three cases exist:

[Figure: Three cases of y = h(z) intersecting with y = z.]

Case 1 has another intersection point at z < 1 (see the red curve in the graph).

Case 2 has only one intersection point, at z = 1 (see the green curve in the graph).

Case 3 has another intersection point at z > 1 (see the black curve in the graph).

In case 1, the ultimate extinction probability is strictly less than one. For cases 2 and 3, the ultimate extinction probability equals one.

By observing that h′(1) = p1 + 2p2 + 3p3 + ... = μ is exactly the expected number of offspring a parent could produce, it can be concluded that for a branching process with generating function h(z) for the number of offspring of a given parent, if the mean number of offspring produced by a single parent is less than or equal to one, then the ultimate extinction probability is one. If the mean number of offspring produced by a single parent is greater than one, then the ultimate extinction probability is strictly less than one.
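
The three cases can be illustrated numerically by iterating d ← h(d) from d = 0, which converges to the smallest fixed point of h in [0, 1]; the three offspring distributions below are arbitrary examples with μ < 1, μ = 1, and μ > 1, respectively.

def smallest_fixed_point(probs, iters=200):
    """Iterate d <- h(d) starting from d = 0; this converges to the smallest
    root of d = h(d) in [0, 1], i.e. the ultimate extinction probability."""
    d = 0.0
    for _ in range(iters):
        d = sum(p * d ** k for k, p in enumerate(probs))
    return d

# Three illustrative offspring distributions (probabilities of 0, 1, 2 children):
for probs in ([0.5, 0.3, 0.2],    # mu = 0.7 < 1: extinction probability 1
              [0.25, 0.5, 0.25],  # mu = 1.0:     extinction probability 1 (convergence is slow here)
              [0.1, 0.6, 0.3]):   # mu = 1.2 > 1: extinction probability 1/3 < 1
    mu = sum(k * p for k, p in enumerate(probs))
    print(round(mu, 2), round(smallest_fixed_point(probs), 4))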

Size dependent branching processes

Along with the more general model of age-dependent branching processes discussed by Grimmett, [2] in which individuals live for more than one generation, Krishna Athreya has identified three broadly applicable distinctions among size-dependent branching processes. Athreya identifies these three classes of size-dependent branching processes as sub-critical, stable, and super-critical branching measures. For Athreya, the central parameters are crucial to control if sub-critical and super-critical unstable branching is to be avoided. [3] Size-dependent branching processes are also discussed under the topic of resource-dependent branching processes. [4]

Example of extinction problem

Consider the case in which a parent can produce at most two offspring. The extinction probability in each generation is

d_m = p_0 + p_1 d_{m-1} + p_2 (d_{m-1})^2

with d0 = 0. For the ultimate extinction probability, we need to find d which satisfies d = p0 + p1d + p2d².

Taking as example probabilities for the numbers of offspring produced p0 = 0.1, p1 = 0.6, and p2 = 0.3, the extinction probability for the first 20 generations is as follows:

Generation # (1–10)    Extinction probability    Generation # (11–20)    Extinction probability
1                      0.1                       11                      0.3156
2                      0.163                     12                      0.3192
3                      0.2058                    13                      0.3221
4                      0.2362                    14                      0.3244
5                      0.2584                    15                      0.3262
6                      0.2751                    16                      0.3276
7                      0.2878                    17                      0.3288
8                      0.2975                    18                      0.3297
9                      0.3051                    19                      0.3304
10                     0.3109                    20                      0.331

In this example, we can solve algebraically that d = 1/3, and this is the value to which the extinction probability converges with increasing generations.
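
The table above, and the limit d = 1/3, can be checked with a short computation (a sketch using the example probabilities p0 = 0.1, p1 = 0.6, p2 = 0.3):

import math

p0, p1, p2 = 0.1, 0.6, 0.3

# Iterate d_m = p0 + p1*d_{m-1} + p2*d_{m-1}^2 starting from d_0 = 0.
d = 0.0
for m in range(1, 21):
    d = p0 + p1 * d + p2 * d ** 2
    print(m, round(d, 4))          # reproduces the table above

# Ultimate extinction probability: the smallest root of p2*d^2 + (p1 - 1)*d + p0 = 0.
disc = (p1 - 1) ** 2 - 4 * p2 * p0
d_ultimate = ((1 - p1) - math.sqrt(disc)) / (2 * p2)
print(d_ultimate)                  # 0.333... = 1/3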

Simulating branching processes

Branching processes can be simulated for a range of problems. One specific use of simulated branching processes is in the field of evolutionary biology. [5] [6] Phylogenetic trees, for example, can be simulated under several models, [7] helping to develop and validate estimation methods as well as supporting hypothesis testing.
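
As a small illustration of simulation-based use, extinction probabilities can be estimated by Monte Carlo; the function name, trial count, and generation cutoff below are hypothetical choices for illustration.

import random

def extinction_frequency(offspring_probs, generations=30, trials=2_000, seed=0):
    """Estimate the probability that the process dies out within `generations` generations."""
    rng = random.Random(seed)
    support = list(range(len(offspring_probs)))
    died = 0
    for _ in range(trials):
        z = 1
        for _ in range(generations):
            # Each individual draws an iid offspring count; the next generation is their sum.
            z = sum(rng.choices(support, weights=offspring_probs, k=z))
            if z == 0:
                died += 1
                break
    return died / trials

# For the offspring distribution used in the worked example above, the estimate
# should be close to the analytic ultimate extinction probability of 1/3.
print(extinction_frequency([0.1, 0.6, 0.3]))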

Multitype branching processes

In multitype branching processes, individuals are not identical, but can be classified into n types. After each time step, an individual of type i produces individuals of various types, and the random vector recording the numbers of its children of each type follows a probability distribution on the set of n-tuples of non-negative integers.

For example, consider the population of cancer stem cells (CSCs) and non-stem cancer cells (NSCCs). After each time interval, each CSC has a certain probability of producing two CSCs (symmetric division), a probability of producing one CSC and one NSCC (asymmetric division), a probability of producing one CSC (stagnation), and a probability of producing nothing (death); each NSCC has a probability of producing two NSCCs (symmetric division), a probability of producing one NSCC (stagnation), and a probability of producing nothing (death). [8]
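
A minimal two-type simulation sketch of this kind of model is given below. The division probabilities are made up for illustration only and are not taken from the cited study.

import random

def multitype_step(counts, transition, rng):
    """One time step of a two-type branching process.

    counts: dict mapping type name to current number of individuals.
    transition[type] is a list of (probability, offspring-count dict) pairs.
    """
    new_counts = {t: 0 for t in counts}
    for t, n in counts.items():
        outcomes, weights = zip(*[(o, p) for p, o in transition[t]])
        for _ in range(n):
            offspring = rng.choices(outcomes, weights=weights)[0]
            for child_type, k in offspring.items():
                new_counts[child_type] += k
    return new_counts

# Illustrative (hypothetical) division probabilities, not from the cited study.
transition = {
    "CSC":  [(0.30, {"CSC": 2, "NSCC": 0}),   # symmetric division
             (0.40, {"CSC": 1, "NSCC": 1}),   # asymmetric division
             (0.20, {"CSC": 1, "NSCC": 0}),   # stagnation
             (0.10, {"CSC": 0, "NSCC": 0})],  # death
    "NSCC": [(0.40, {"CSC": 0, "NSCC": 2}),   # symmetric division
             (0.40, {"CSC": 0, "NSCC": 1}),   # stagnation
             (0.20, {"CSC": 0, "NSCC": 0})],  # death
}

rng = random.Random(1)
counts = {"CSC": 10, "NSCC": 0}
for step in range(10):
    counts = multitype_step(counts, transition, rng)
print(counts)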

Law of large numbers for multitype branching processes

For multitype branching processes in which the populations of the different types grow exponentially, the proportions of the different types converge almost surely to a constant vector under some mild conditions. This is the strong law of large numbers for multitype branching processes.

In the continuous-time case, the expected proportions of the population satisfy an ODE system, which has a unique attracting fixed point. This fixed point is exactly the vector to which the proportions converge in the law of large numbers.
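
As an illustrative sketch of where the limiting vector comes from: for a supercritical, positively regular process, the limiting proportions are given by the normalized left Perron–Frobenius eigenvector of the mean offspring matrix. The mean matrix below is an arbitrary example chosen for illustration.

import numpy as np

# Mean matrix M: M[i, j] = expected number of type-j children of a type-i parent.
# The entries below are arbitrary illustrative values (all positive, so the
# matrix is positively regular; largest eigenvalue > 1, so supercritical).
M = np.array([[1.2, 0.3],
              [0.1, 1.1]])

# The type proportions converge to the left eigenvector of M associated with
# its largest eigenvalue, i.e. a right eigenvector of M.T, normalized to sum to 1.
eigvals, eigvecs = np.linalg.eig(M.T)
k = np.argmax(eigvals.real)
v = np.abs(eigvecs[:, k].real)
print(v / v.sum())   # limiting proportions of the two types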

The monograph by Athreya and Ney [9] summarizes a common set of conditions under which this law of large numbers is valid. Later improvements relaxed or discarded some of these conditions. [10] [11]

Other branching processes

There are many other branching processes, for example, branching processes in random environments, in which the reproduction law is chosen at random at each generation, and branching processes in which the growth of the population is controlled by external influences or interacting processes. Branching processes in which particles have to work (contribute resources to the environment) in order to be able to reproduce, and which live in a changing society structure controlling the distribution of resources, are the so-called resource-dependent branching processes.

The scaling limit of near-critical branching processes can be used to obtain superprocesses.

See also

References

  1. Athreya, K. B. (2006). "Branching Process". Encyclopedia of Environmetrics. doi:10.1002/9780470057339.vab032. ISBN 978-0471899976.
  2. G. R. Grimmett and D. R. Stirzaker, Probability and Random Processes, 2nd ed., Clarendon Press, Oxford, 1992.
  3. Krishna Athreya and Peter Jagers. Branching Processes. Springer. 1973.
  4. F. Thomas Bruss and M. Duerinckx (2015). "Resource dependent branching processes and the envelope of societies". Annals of Applied Probability. 25: 324–372.
  5. Hagen, O.; Hartmann, K.; Steel, M.; Stadler, T. (2015-05-01). "Age-Dependent Speciation Can Explain the Shape of Empirical Phylogenies". Systematic Biology. 64 (3): 432–440. doi:10.1093/sysbio/syv001. ISSN 1063-5157. PMC 4395845. PMID 25575504.
  6. Hagen, Oskar; Andermann, Tobias; Quental, Tiago B.; Antonelli, Alexandre; Silvestro, Daniele (May 2018). "Estimating Age-Dependent Extinction: Contrasting Evidence from Fossils and Phylogenies". Systematic Biology. 67 (3): 458–474. doi:10.1093/sysbio/syx082. PMC 5920349. PMID 29069434.
  7. Hagen, Oskar; Stadler, Tanja (2018). "TreeSimGM: Simulating phylogenetic trees under general Bellman–Harris models with lineage-specific shifts of speciation and extinction in R". Methods in Ecology and Evolution. 9 (3): 754–760. doi:10.1111/2041-210X.12917. ISSN 2041-210X. PMC 5993341. PMID 29938014.
  8. Chen, Xiufang; Wang, Yue; Feng, Tianquan; Yi, Ming; Zhang, Xingan; Zhou, Da (2016). "The overshoot and phenotypic equilibrium in characterizing cancer dynamics of reversible phenotypic plasticity". Journal of Theoretical Biology. 390: 40–49. arXiv:1503.04558. doi:10.1016/j.jtbi.2015.11.008. PMID 26626088. S2CID 15335040.
  9. Athreya, Krishna B.; Ney, Peter E. (1972). Branching Processes. Berlin: Springer-Verlag. pp. 199–206. ISBN 978-3-642-65371-1.
  10. Janson, Svante (2003). "Functional limit theorems for multitype branching processes and generalized Pólya urns". Stochastic Processes and Their Applications. 110 (2): 177–245. doi:10.1016/j.spa.2003.12.002.
  11. Jiang, Da-Quan; Wang, Yue; Zhou, Da (2017). "Phenotypic equilibrium as probabilistic convergence in multi-phenotype cell population dynamics". PLOS ONE. 12 (2): e0170916. Bibcode:2017PLoSO..1270916J. doi:10.1371/journal.pone.0170916. PMC 5300154. PMID 28182672.