Maximal entropy random walk

Last updated November 29, 2023

Maximal entropy random walk (MERW) is a popular type of biased random walk on a graph, in which transition probabilities are chosen accordingly to the principle of maximum entropy, which says that the probability distribution which best represents the current state of knowledge is the one with largest entropy. While standard random walk chooses for every vertex uniform probability distribution among its outgoing edges, locally maximizing entropy rate, MERW maximizes it globally (average entropy production) by assuming uniform probability distribution among all paths in a given graph.

MERW is used in various fields of science. A direct application is choosing probabilities to maximize transmission rate through a constrained channel, analogously to Fibonacci coding. Its properties also made it useful for example in analysis of complex networks,^[1] like link prediction,^[2] community detection,^[3] robust transport over networks^[4] and centrality measures.^[5] Also in image analysis, for example for detecting visual saliency regions,^[6] object localization,^[7] tampering detection^[8] or tractography problem.^[9]

Additionally, it recreates some properties of quantum mechanics, suggesting a way to repair the discrepancy between diffusion models and quantum predictions, like Anderson localization.^[10]

Basic model

Consider a graph with $n$ vertices, defined by an adjacency matrix $A\in \left\{0,1\right\}^{n\times n}$ : $A_{ij}=1$ if there is an edge from vertex $i$ to $j$ , 0 otherwise. For simplicity assume it is an undirected graph, which corresponds to a symmetric $A$ ; however, MERW can also be generalized for directed and weighted graphs (for example Boltzmann distribution among paths instead of uniform).

We would like to choose a random walk as a Markov process on this graph: for every vertex $i$ and its outgoing edge to $j$ , choose probability $S_{ij}$ of the walker randomly using this edge after visiting $i$ . Formally, find a stochastic matrix $S$ (containing the transition probabilities of a Markov chain) such that

$0\leq S_{ij}\leq A_{ij}$ for all $i,j$ and
$\sum _{j=1}^{n}S_{ij}=1$ for all $i$ .

Assuming this graph is connected and not periodic, ergodic theory says that evolution of this stochastic process leads to some stationary probability distribution $\rho$ such that $\rho S=\rho$ .

Using Shannon entropy for every vertex and averaging over probability of visiting this vertex (to be able to use its entropy), we get the following formula for average entropy production (entropy rate) of the stochastic process:

H(S)=\sum _{i=1}^{n}\rho _{i}\sum _{j=1}^{n}S_{ij}\log(1/S_{ij})

This definition turns out to be equivalent to the asymptotic average entropy (per length) of the probability distribution in the space of paths for this stochastic process.

In the standard random walk, referred to here as generic random walk (GRW), we naturally choose that each outgoing edge is equally probable:

S_{ij}={\frac {A_{ij}}{\sum \limits _{k=1}^{n}A_{ik}}}

.

For a symmetric $A$ it leads to a stationary probability distribution $\rho$ with

\rho _{i}={\frac {\sum \limits _{j=1}^{n}A_{ij}}{\sum \limits _{i=1}^{n}\sum \limits _{j=1}^{n}A_{ij}}}

.

It locally maximizes entropy production (uncertainty) for every vertex, but usually leads to a suboptimal averaged global entropy rate $H(S)$ .

MERW chooses the stochastic matrix which maximizes $H(S)$ , or equivalently assumes uniform probability distribution among all paths in a given graph. Its formula is obtained by first calculating the dominant eigenvalue $\lambda$ and corresponding eigenvector $\psi$ of the adjacency matrix, i.e. the largest $\lambda \in \mathbb {R}$ with corresponding $\psi \in \mathbb {R} ^{n}$ such that $\psi A=\lambda \psi$ . Then stochastic matrix and stationary probability distribution are given by

S_{ij}={\frac {A_{ij}}{\lambda }}{\frac {\psi _{j}}{\psi _{i}}}

for which every possible path of length $l$ from the $i$ -th to $j$ -th vertex has probability

{\frac {1}{\lambda ^{l}}}{\frac {\psi _{j}}{\psi _{i}}}

.

Its entropy rate is $\log(\lambda )$ and the stationary probability distribution $\rho$ is

\rho _{i}={\frac {\psi _{i}^{2}}{\|\psi \|_{2}^{2}}}

.

In contrast to GRW, the MERW transition probabilities generally depend on the structure of the entire graph (are nonlocal). Hence, they should not be imagined as directly applied by the walker – if random-looking decisions are made based on the local situation, like a person would make, the GRW approach is more appropriate. MERW is based on the principle of maximum entropy, making it the safest assumption when we don't have any additional knowledge about the system. For example, it would be appropriate for modelling our knowledge about an object performing some complex dynamics – not necessarily random, like a particle.

Sketch of derivation

Assume for simplicity that the considered graph is indirected, connected and aperiodic, allowing to conclude from the Perron–Frobenius theorem that the dominant eigenvector is unique. Hence $A^{l}$ can be asymptotically ( $l\rightarrow \infty$ ) approximated by $\lambda ^{l}\psi \psi ^{T}$ (or $\lambda ^{l}|\psi \rangle \langle \psi |$ in bra–ket notation).

MERW requires uniform distribution along paths. The number $m_{il}$ of paths with length $2l$ and vertex $i$ in the center is

m_{il}=\sum _{j=1}^{n}\sum _{k=1}^{n}\left(A^{l}\right)_{ji}\left(A^{l}\right)_{ik}\approx \sum _{j=1}^{n}\sum _{k=1}^{n}\left(\lambda ^{l}\psi \psi ^{\top }\right)_{ji}\left(\lambda ^{l}\psi \psi ^{\top }\right)_{ik}=\sum _{j=1}^{n}\sum _{k=1}^{n}\lambda ^{2l}\psi _{j}\psi _{i}\psi _{i}\psi _{k}=\lambda ^{2l}\psi _{i}^{2}\underbrace {\sum _{j=1}^{n}\psi _{j}\sum _{k=1}^{n}\psi _{k}} _{=:b}

,

hence for all $i$ ,

\rho _{i}=\lim _{l\rightarrow \infty }{\frac {m_{il}}{\sum \limits _{k=1}^{n}m_{kl}}}=\lim _{l\rightarrow \infty }{\frac {\lambda ^{2l}\psi _{i}^{2}b}{\sum \limits _{k=1}^{n}\lambda ^{2l}\psi _{k}^{2}b}}=\lim _{l\rightarrow \infty }{\frac {\psi _{i}^{2}}{\sum \limits _{k=1}^{n}\psi _{k}^{2}}}={\frac {\psi _{i}^{2}}{\sum \limits _{k=1}^{n}\psi _{k}^{2}}}={\frac {\psi _{i}^{2}}{\|\psi \|_{2}^{2}}}

.

Analogously calculating probability distribution for two succeeding vertices, one obtains that the probability of being at the $i$ -th vertex and next at the $j$ -th vertex is

{\frac {\psi _{i}A_{ij}\psi _{j}}{\sum \limits _{i'=1}^{n}\sum \limits _{j'=1}^{n}\psi _{i'}A_{i'j'}\psi _{j'}}}={\frac {\psi _{i}A_{ij}\psi _{j}}{\psi A\psi ^{\top }}}={\frac {\psi _{i}A_{ij}\psi _{j}}{\lambda \|\psi \|_{2}^{2}}}

.

Dividing by the probability of being at the $i$ -th vertex, i.e. $\rho _{i}$ , gives for the conditional probability $S_{ij}$ of the $j$ -th vertex being next after the $i$ -th vertex

S_{ij}={\frac {A_{ij}}{\lambda }}{\frac {\psi _{j}}{\psi _{i}}}

.

Weighted MERW: Boltzmann path ensemble

We have assumed that $A_{ij}\in \{0,1\}$ for MERW corresponding to uniform ensemble among paths. However, the above derivation works for real nonnegative $A$ . Parametrizing $A_{ij}=\exp(-E_{ij})$ and asking for probability of length $l$ path $(\gamma _{0},\ldots ,\gamma _{l})$ , we get:

{\textrm {Pr}}(\gamma _{0},\ldots ,\gamma _{l})=\rho _{\gamma _{0}}S_{\gamma _{0}\gamma _{1}}\ldots S_{\gamma _{l-1}\gamma _{l}}=\psi _{\gamma _{0}}{\frac {A_{\gamma _{0}\gamma _{1}}\ldots A_{\gamma _{l-1}\gamma _{l}}}{\lambda ^{l}}}\psi _{\gamma _{l}}=\psi _{\gamma _{0}}{\frac {\exp(-(E_{\gamma _{0}\gamma _{1}}+\ldots +E_{\gamma _{l-1}\gamma _{l}}))}{\lambda ^{l}}}\psi _{\gamma _{l}}

As in Boltzmann distribution of paths for energy defined as sum of $E_{ij}$ over given path. For example, it allows to calculate probability distribution of patterns in Ising model.

Examples

Let us first look at a simple nontrivial situation: Fibonacci coding, where we want to transmit a message as a sequence of 0s and 1s, but not using two successive 1s: after a 1 there has to be a 0. To maximize the amount of information transmitted in such sequence, we should assume uniform probability distribution in the space of all possible sequences fulfilling this constraint. To practically use such long sequences, after 1 we have to use 0, but there remains a freedom of choosing the probability of 0 after 0. Let us denote this probability by $q$ , then entropy coding would allow encoding a message using this chosen probability distribution. The stationary probability distribution of symbols for a given $q$ turns out to be $\rho =(1/(2-q),1-1/(2-q))$ . Hence, entropy production is $H(S)=\rho _{0}\left(q\log(1/q)+(1-q)\log(1/(1-q))\right)$ , which is maximized for $q=({\sqrt {5}}-1)/2\approx 0.618$ , known as the golden ratio. In contrast, standard random walk would choose suboptimal $q=0.5$ . While choosing larger $q$ reduces the amount of information produced after 0, it also reduces frequency of 1, after which we cannot write any information.

A more complex example is the defected one-dimensional cyclic lattice: let say 1000 nodes connected in a ring, for which all nodes but the defects have a self-loop (edge to itself). In standard random walk (GRW) the stationary probability distribution would have defect probability being 2/3 of probability of the non-defect vertices – there is nearly no localization, also analogously for standard diffusion, which is infinitesimal limit of GRW. For MERW we have to first find the dominant eigenvector of the adjacency matrix – maximizing $\lambda$ in:

$(\lambda \psi )_{x}=(A\psi )_{x}=\psi _{x-1}+(1-V_{x})\psi _{x}+\psi _{x+1}$

for all positions $x$ , where $V_{x}=1$ for defects, 0 otherwise. Substituting $3\psi _{x}$ and multiplying the equation by −1 we get:

$E\psi _{x}=-(\psi _{x-1}-2\psi _{x}+\psi _{x+1})+V_{x}\psi _{x}$

where $E=3-\lambda$ is minimized now, becoming the analog of energy. The formula inside the bracket is discrete Laplace operator, making this equation a discrete analogue of stationary Schrodinger equation. As in quantum mechanics, MERW predicts that the probability distribution should lead exactly to the one of quantum ground state: $\rho _{x}\propto \psi _{x}^{2}$ with its strongly localized density (in contrast to standard diffusion). Taking the infinitesimal limit, we can get standard continuous stationary (time-independent) Schrodinger equation ( $E\psi =-C\psi _{xx}+V\psi$ for $C=\hbar ^{2}/2m$ ) here.^[11]

Related Research Articles

In particle physics, the Dirac equation is a relativistic wave equation derived by British physicist Paul Dirac in 1928. In its free form, or including electromagnetic interactions, it describes all spin-1⁄2 massive particles, called "Dirac particles", such as electrons and quarks for which parity is a symmetry. It is consistent with both the principles of quantum mechanics and the theory of special relativity, and was the first theory to account fully for special relativity in the context of quantum mechanics. It was validated by accounting for the fine structure of the hydrogen spectrum in a completely rigorous way.

<span class="mw-page-title-main">Negative binomial distribution</span> Probability distribution

In probability theory and statistics, the negative binomial distribution is a discrete probability distribution that models the number of failures in a sequence of independent and identically distributed Bernoulli trials before a specified (non-random) number of successes occurs. For example, we can define rolling a 6 on a die as a success, and rolling any other number as a failure, and ask how many failure rolls will occur before we see the third success. In such a case, the probability distribution of the number of failures that appear will be a negative binomial distribution.

In probability theory and statistics, the exponential distribution or negative exponential distribution is the probability distribution of the time between events in a Poisson point process, i.e., a process in which events occur continuously and independently at a constant average rate. It is a particular case of the gamma distribution. It is the continuous analogue of the geometric distribution, and it has the key property of being memoryless. In addition to being used for the analysis of Poisson point processes it is found in various other contexts.

In quantum mechanics, a density matrix is a matrix that describes the quantum state of a physical system. It allows for the calculation of the probabilities of the outcomes of any measurement performed upon this system, using the Born rule. It is a generalization of the more usual state vectors or wavefunctions: while those can only represent pure states, density matrices can also represent mixed states. Mixed states arise in quantum mechanics in two different situations:

when the preparation of the system is not fully known, and thus one must deal with a statistical ensemble of possible preparations, and
when one wants to describe a physical system which is entangled with another, without describing their combined state.

In statistics, the Wishart distribution is a generalization to multiple dimensions of the gamma distribution. It is named in honor of John Wishart, who first formulated the distribution in 1928. Other names include Wishart ensemble, or Wishart–Laguerre ensemble, or LOE, LUE, LSE.

In statistics and information theory, a maximum entropy probability distribution has entropy that is at least as great as that of all other members of a specified class of probability distributions. According to the principle of maximum entropy, if nothing is known about a distribution except that it belongs to a certain class, then the distribution with the largest entropy should be chosen as the least-informative default. The motivation is twofold: first, maximizing entropy minimizes the amount of prior information built into the distribution; second, many physical systems tend to move towards maximal entropy configurations over time.

In mathematics, the von Mangoldt function is an arithmetic function named after German mathematician Hans von Mangoldt. It is an example of an important arithmetic function that is neither multiplicative nor additive.

In quantum mechanics, the Hellmann–Feynman theorem relates the derivative of the total energy with respect to a parameter to the expectation value of the derivative of the Hamiltonian with respect to that same parameter. According to the theorem, once the spatial distribution of the electrons has been determined by solving the Schrödinger equation, all the forces in the system can be calculated using classical electrostatics.

The principle of detailed balance can be used in kinetic systems which are decomposed into elementary processes. It states that at equilibrium, each elementary process is in equilibrium with its reverse process.

In queueing theory, a discipline within the mathematical theory of probability, a Jackson network is a class of queueing network where the equilibrium distribution is particularly simple to compute as the network has a product-form solution. It was the first significant development in the theory of networks of queues, and generalising and applying the ideas of the theorem to search for similar product-form solutions in other networks has been the subject of much research, including ideas used in the development of the Internet. The networks were first identified by James R. Jackson and his paper was re-printed in the journal Management Science’s ‘Ten Most Influential Titles of Management Sciences First Fifty Years.’

In quantum mechanics, notably in quantum information theory, fidelity is a measure of the "closeness" of two quantum states. It expresses the probability that one state will pass a test to identify as the other. The fidelity is not a metric on the space of density matrices, but it can be used to define the Bures metric on this space.

The time-evolving block decimation (TEBD) algorithm is a numerical scheme used to simulate one-dimensional quantum many-body systems, characterized by at most nearest-neighbour interactions. It is dubbed Time-evolving Block Decimation because it dynamically identifies the relevant low-dimensional Hilbert subspaces of an exponentially larger original Hilbert space. The algorithm, based on the Matrix Product States formalism, is highly efficient when the amount of entanglement in the system is limited, a requirement fulfilled by a large class of quantum many-body systems in one dimension.

In quantum information theory, quantum relative entropy is a measure of distinguishability between two quantum states. It is the quantum mechanical analog of relative entropy.

The Newman–Penrose (NP) formalism is a set of notation developed by Ezra T. Newman and Roger Penrose for general relativity (GR). Their notation is an effort to treat general relativity in terms of spinor notation, which introduces complex forms of the usual variables used in GR. The NP formalism is itself a special case of the tetrad formalism, where the tensors of the theory are projected onto a complete vector basis at each point in spacetime. Usually this vector basis is chosen to reflect some symmetry of the spacetime, leading to simplified expressions for physical observables. In the case of the NP formalism, the vector basis chosen is a null tetrad: a set of four null vectors—two real, and a complex-conjugate pair. The two real members often asymptotically point radially inward and radially outward, and the formalism is well adapted to treatment of the propagation of radiation in curved spacetime. The Weyl scalars, derived from the Weyl tensor, are often used. In particular, it can be shown that one of these scalars— $in the appropriate frame—encodes the outgoing gravitational radiation of an asymptotically flat system.$

In mathematical physics, spacetime algebra (STA) is a name for the Clifford algebra Cl_1,3(R), or equivalently the geometric algebra G(M⁴). According to David Hestenes, spacetime algebra can be particularly closely associated with the geometry of special relativity and relativistic spacetime.

In mathematics, the spectral theory of ordinary differential equations is the part of spectral theory concerned with the determination of the spectrum and eigenfunction expansion associated with a linear ordinary differential equation. In his dissertation, Hermann Weyl generalized the classical Sturm–Liouville theory on a finite closed interval to second order differential operators with singularities at the endpoints of the interval, possibly semi-infinite or infinite. Unlike the classical case, the spectrum may no longer consist of just a countable set of eigenvalues, but may also contain a continuous part. In this case the eigenfunction expansion involves an integral over the continuous part with respect to a spectral measure, given by the Titchmarsh–Kodaira formula. The theory was put in its final simplified form for singular differential equations of even degree by Kodaira and others, using von Neumann's spectral theorem. It has had important applications in quantum mechanics, operator theory and harmonic analysis on semisimple Lie groups.

In general relativity, a point mass deflects a light ray with impact parameter $by an angle approximately equal to$

In quantum mechanics, and especially quantum information theory, the purity of a normalized quantum state is a scalar defined as

In mathematical physics, the Belinfante–Rosenfeld tensor is a modification of the energy–momentum tensor that is constructed from the canonical energy–momentum tensor and the spin current so as to be symmetric yet still conserved.

Tau functions are an important ingredient in the modern mathematical theory of integrable systems, and have numerous applications in a variety of other domains. They were originally introduced by Ryogo Hirota in his direct method approach to soliton equations, based on expressing them in an equivalent bilinear form.

References

↑ Sinatra, Roberta; Gómez-Gardeñes, Jesús; Lambiotte, Renaud; Nicosia, Vincenzo; Latora, Vito (2011). "Maximal-entropy random walks in complex networks with limited information" (PDF). Physical Review E. 83 (3): 030103. arXiv: 1007.4936 . Bibcode:2011PhRvE..83c0103S. doi:10.1103/PhysRevE.83.030103. ISSN 1539-3755. PMID 21517435. S2CID 6984660.
↑ Li, Rong-Hua; Yu, Jeffrey Xu; Liu, Jianquan (2011). Link prediction: the power of maximal entropy random walk (PDF). Association for Computing Machinery Conference on Information and Knowledge Management. p. 1147. doi:10.1145/2063576.2063741. S2CID 15309519. Archived from the original (PDF) on 12 February 2017.
↑ Ochab, J.K.; Burda, Z. (2013). "Maximal entropy random walk in community detection". The European Physical Journal Special Topics. 216 (1): 73–81. arXiv: 1208.3688 . Bibcode:2013EPJST.216...73O. doi:10.1140/epjst/e2013-01730-6. ISSN 1951-6355. S2CID 56409069.
↑ Chen, Y.; Georgiou, T.T.; Pavon, M.; Tannenbaum, A. (2016). "Robust transport over networks". IEEE Transactions on Automatic Control. 62 (9): 4675–4682. arXiv: 1603.08129 . Bibcode:2016arXiv160308129C. doi:10.1109/TAC.2016.2626796. PMC 5600536 . PMID 28924302.
↑ Delvenne, Jean-Charles; Libert, Anne-Sophie (2011). "Centrality measures and thermodynamic formalism for complex networks". Physical Review E. 83 (4): 046117. arXiv: 0710.3972 . Bibcode:2011PhRvE..83d6117D. doi:10.1103/PhysRevE.83.046117. ISSN 1539-3755. PMID 21599250. S2CID 25816198.
↑ Jin-Gang Yu; Ji Zhao; Jinwen Tian; Yihua Tan (2014). "Maximal Entropy Random Walk for Region-Based Visual Saliency". IEEE Transactions on Cybernetics. Institute of Electrical and Electronics Engineers (IEEE). 44 (9): 1661–1672. doi:10.1109/tcyb.2013.2292054. ISSN 2168-2267. PMID 25137693. S2CID 20962642.
↑ L. Wang, J. Zhao, X. Hu, J. Lu, Weakly supervised object localization via maximal entropy random walk, ICIP, 2014.
↑ Korus, Pawel; Huang, Jiwu (2016). "Improved Tampering Localization in Digital Image Forensics Based on Maximal Entropy Random Walk". IEEE Signal Processing Letters. Institute of Electrical and Electronics Engineers (IEEE). 23 (1): 169–173. Bibcode:2016ISPL...23..169K. doi:10.1109/lsp.2015.2507598. ISSN 1070-9908. S2CID 16305991.
↑ Galinsky, Vitaly L.; Frank, Lawrence R. (2015). "Simultaneous Multi-Scale Diffusion Estimation and Tractography Guided by Entropy Spectrum Pathways". IEEE Transactions on Medical Imaging. Institute of Electrical and Electronics Engineers (IEEE). 34 (5): 1177–1193. doi:10.1109/tmi.2014.2380812. ISSN 0278-0062. PMC 4417445 . PMID 25532167.
↑ Burda, Z.; Duda, J.; Luck, J. M.; Waclaw, B. (23 April 2009). "Localization of the Maximal Entropy Random Walk". Physical Review Letters. 102 (16): 160602. arXiv: 0810.4113 . Bibcode:2009PhRvL.102p0602B. doi:10.1103/physrevlett.102.160602. ISSN 0031-9007. PMID 19518691. S2CID 32134048.
↑ J. Duda, Extended Maximal Entropy Random Walk, PhD Thesis, 2012.

External links

Gábor Simonyi, Y. Lin, Z. Zhang, "Mean first-passage time for maximal-entropy random walks in complex networks". Scientific Reports, 2014.
Electron Conductance Models Using Maximal Entropy Random Walks Wolfram Demonstration Project

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[SinatraGómez-Gardeñes2011-1] Sinatra, Roberta; Gómez-Gardeñes, Jesús; Lambiotte, Renaud; Nicosia, Vincenzo; Latora, Vito (2011). "Maximal-entropy random walks in complex networks with limited information" (PDF). Physical Review E. 83 (3): 030103. arXiv: 1007.4936 . Bibcode:2011PhRvE..83c0103S. doi:10.1103/PhysRevE.83.030103. ISSN 1539-3755. PMID 21517435. S2CID 6984660.

[LiYu2011-2] Li, Rong-Hua; Yu, Jeffrey Xu; Liu, Jianquan (2011). Link prediction: the power of maximal entropy random walk (PDF). Association for Computing Machinery Conference on Information and Knowledge Management. p. 1147. doi:10.1145/2063576.2063741. S2CID 15309519. Archived from the original (PDF) on 12 February 2017.

[OchabBurda2013-3] Ochab, J.K.; Burda, Z. (2013). "Maximal entropy random walk in community detection". The European Physical Journal Special Topics. 216 (1): 73–81. arXiv: 1208.3688 . Bibcode:2013EPJST.216...73O. doi:10.1140/epjst/e2013-01730-6. ISSN 1951-6355. S2CID 56409069.

[CGPT2016-4] Chen, Y.; Georgiou, T.T.; Pavon, M.; Tannenbaum, A. (2016). "Robust transport over networks". IEEE Transactions on Automatic Control. 62 (9): 4675–4682. arXiv: 1603.08129 . Bibcode:2016arXiv160308129C. doi:10.1109/TAC.2016.2626796. PMC 5600536 . PMID 28924302.

[DelvenneLibert2011-5] Delvenne, Jean-Charles; Libert, Anne-Sophie (2011). "Centrality measures and thermodynamic formalism for complex networks". Physical Review E. 83 (4): 046117. arXiv: 0710.3972 . Bibcode:2011PhRvE..83d6117D. doi:10.1103/PhysRevE.83.046117. ISSN 1539-3755. PMID 21599250. S2CID 25816198.

[saliency-6] Jin-Gang Yu; Ji Zhao; Jinwen Tian; Yihua Tan (2014). "Maximal Entropy Random Walk for Region-Based Visual Saliency". IEEE Transactions on Cybernetics. Institute of Electrical and Electronics Engineers (IEEE). 44 (9): 1661–1672. doi:10.1109/tcyb.2013.2292054. ISSN 2168-2267. PMID 25137693. S2CID 20962642.

[local-7] L. Wang, J. Zhao, X. Hu, J. Lu, Weakly supervised object localization via maximal entropy random walk, ICIP, 2014.

[tamp-8] Korus, Pawel; Huang, Jiwu (2016). "Improved Tampering Localization in Digital Image Forensics Based on Maximal Entropy Random Walk". IEEE Signal Processing Letters. Institute of Electrical and Electronics Engineers (IEEE). 23 (1): 169–173. Bibcode:2016ISPL...23..169K. doi:10.1109/lsp.2015.2507598. ISSN 1070-9908. S2CID 16305991.

[trac-9] Galinsky, Vitaly L.; Frank, Lawrence R. (2015). "Simultaneous Multi-Scale Diffusion Estimation and Tractography Guided by Entropy Spectrum Pathways". IEEE Transactions on Medical Imaging. Institute of Electrical and Electronics Engineers (IEEE). 34 (5): 1177–1193. doi:10.1109/tmi.2014.2380812. ISSN 0278-0062. PMC 4417445 . PMID 25532167.

[prl-10] Burda, Z.; Duda, J.; Luck, J. M.; Waclaw, B. (23 April 2009). "Localization of the Maximal Entropy Random Walk". Physical Review Letters. 102 (16): 160602. arXiv: 0810.4113 . Bibcode:2009PhRvL.102p0602B. doi:10.1103/physrevlett.102.160602. ISSN 0031-9007. PMID 19518691. S2CID 32134048.

[ext-11] J. Duda, Extended Maximal Entropy Random Walk, PhD Thesis, 2012.

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

v t e Stochastic processes
Discrete time	Bernoulli process Branching process Chinese restaurant process Galton–Watson process Independent and identically distributed random variables Markov chain Moran process Random walk Loop-erased Self-avoiding Biased Maximal entropy
Continuous time	Additive process Bessel process Birth–death process pure birth Brownian motion Bridge Excursion Fractional Geometric Meander Cauchy process Contact process Continuous-time random walk Cox process Diffusion process Dyson Brownian motion Empirical process Feller process Fleming–Viot process Gamma process Geometric process Hawkes process Hunt process Interacting particle systems Itô diffusion Itô process Jump diffusion Jump process Lévy process Local time Markov additive process McKean–Vlasov process Ornstein–Uhlenbeck process Poisson process Compound Non-homogeneous Schramm–Loewner evolution Semimartingale Sigma-martingale Stable process Superprocess Telegraph process Variance gamma process Wiener process Wiener sausage
Both	Branching process Galves–Löcherbach model Gaussian process Hidden Markov model (HMM) Markov process Martingale Differences Local Sub- Super- Random dynamical system Regenerative process Renewal process Stochastic chains with memory of variable length White noise
Fields and other	Dirichlet process Gaussian random field Gibbs measure Hopfield model Ising model Potts model Boolean network Markov random field Percolation Pitman–Yor process Point process Cox Poisson Random field Random graph
Time series models	Autoregressive conditional heteroskedasticity (ARCH) model Autoregressive integrated moving average (ARIMA) model Autoregressive (AR) model Autoregressive–moving-average (ARMA) model Generalized autoregressive conditional heteroskedasticity (GARCH) model Moving-average (MA) model
Financial models	Binomial options pricing model Black–Derman–Toy Black–Karasinski Black–Scholes Chan–Karolyi–Longstaff–Sanders (CKLS) Chen Constant elasticity of variance (CEV) Cox–Ingersoll–Ross (CIR) Garman–Kohlhagen Heath–Jarrow–Morton (HJM) Heston Ho–Lee Hull–White LIBOR market Rendleman–Bartter SABR volatility Vašíček Wilkie
Actuarial models	Bühlmann Cramér–Lundberg Risk process Sparre–Anderson
Queueing models	Bulk Fluid Generalized queueing network M/G/1 M/M/1 M/M/c
Properties	Càdlàg paths Continuous Continuous paths Ergodic Exchangeable Feller-continuous Gauss–Markov Markov Mixing Piecewise-deterministic Predictable Progressively measurable Self-similar Stationary Time-reversible
Limit theorems	Central limit theorem Donsker's theorem Doob's martingale convergence theorems Ergodic theorem Fisher–Tippett–Gnedenko theorem Large deviation principle Law of large numbers (weak/strong) Law of the iterated logarithm Maximal ergodic theorem Sanov's theorem Zero–one laws (Blumenthal, Borel–Cantelli, Engelbert–Schmidt, Hewitt–Savage, Kolmogorov, Lévy)
Inequalities	Burkholder–Davis–Gundy Doob's martingale Doob's upcrossing Kunita–Watanabe Marcinkiewicz–Zygmund
Tools	Cameron–Martin formula Convergence of random variables Doléans-Dade exponential Doob decomposition theorem Doob–Meyer decomposition theorem Doob's optional stopping theorem Dynkin's formula Feynman–Kac formula Filtration Girsanov theorem Infinitesimal generator Itô integral Itô's lemma Karhunen–Loève theorem Kolmogorov continuity theorem Kolmogorov extension theorem Lévy–Prokhorov metric Malliavin calculus Martingale representation theorem Optional stopping theorem Prokhorov's theorem Quadratic variation Reflection principle Skorokhod integral Skorokhod's representation theorem Skorokhod space Snell envelope Stochastic differential equation Tanaka Stopping time Stratonovich integral Uniform integrability Usual hypotheses Wiener space Classical Abstract
Disciplines	Actuarial mathematics Control theory Econometrics Ergodic theory Extreme value theory (EVT) Large deviations theory Mathematical finance Mathematical statistics Probability theory Queueing theory Renewal theory Ruin theory Signal processing Statistics Stochastic analysis Time series analysis Machine learning
List of topics Category