Kolmogorov backward equations (diffusion)

Last updated August 21, 2025

Overview

The Kolmogorov forward equation is used to evolve the state of a system forward in time. Given an initial probability distribution $p_t(x)$ for a system being in state $x$ at time $t$, the forward PDE is integrated to obtain $p_s(x)$ at later times $s>t$. A common case takes the initial value $p_t(x)$ to be a Dirac delta function centered on the known initial state $x$.
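As an illustration of integrating the forward PDE from a delta-like initial condition, the following is a minimal sketch, not a definitive implementation. It assumes a one-dimensional diffusion with constant drift `mu` and constant volatility `sigma` (both illustrative choices), for which the forward equation reads $\partial_s p = -\mu\,\partial_y p + \tfrac{1}{2}\sigma^2\,\partial_y^2 p$. The function name `evolve_forward`, the grid parameters, and the explicit finite-difference scheme are assumptions made for this example.

```python
import math
import numpy as np


def evolve_forward(mu=0.5, sigma=1.0, x0=0.0, horizon=1.0, half_width=8.0, nx=321):
    """Evolve a density forward in time with an explicit finite-difference scheme.

    Assumes constant coefficients, so the forward equation is
        dp/ds = -mu * dp/dy + 0.5 * sigma**2 * d2p/dy2.
    """
    y = np.linspace(-half_width, half_width, nx)
    dy = y[1] - y[0]
    # The explicit diffusion step is stable for dt <= dy**2 / sigma**2;
    # a safety factor of 0.4 keeps the update a positive combination.
    dt_max = 0.4 * dy**2 / sigma**2
    n_steps = math.ceil(horizon / dt_max)
    dt = horizon / n_steps

    # Grid approximation of a Dirac delta: unit mass in the cell nearest x0.
    p = np.zeros(nx)
    p[np.argmin(np.abs(y - x0))] = 1.0 / dy

    for _ in range(n_steps):
        rhs = np.zeros(nx)
        rhs[1:-1] = (
            -mu * (p[2:] - p[:-2]) / (2.0 * dy)
            + 0.5 * sigma**2 * (p[2:] - 2.0 * p[1:-1] + p[:-2]) / dy**2
        )
        p = p + dt * rhs  # boundary values stay at zero (far from the mass)
    return y, p


if __name__ == "__main__":
    y, p = evolve_forward()
    dy = y[1] - y[0]
    mean = np.sum(y * p) * dy
    var = np.sum((y - mean) ** 2 * p) * dy
    print("mass:", np.sum(p) * dy, "mean:", mean, "variance:", var)
```

For these constant coefficients the exact solution is a Gaussian with mean `x0 + mu * horizon` and variance `sigma**2 * horizon`, so the printed moments give a quick sanity check of the scheme.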
The Kolmogorov backward equation is used to estimate the probability that the current system evolves so that its future state at time $s>t$ is given by some fixed probability function $p_s(x)$. That is, the probability distribution in the future is given as a boundary condition, and the backward PDE is integrated backwards in time.
A common boundary condition is to ask that the future state is contained in some subset of states $B$, the target set. Writing the set membership function as $1_B$, so that $1_B(x)=1$ if $x\in B$ and zero otherwise, the backward equation expresses the hit probability $p_t(x)$ that in the future, the set membership will be sharp, given by $p_s(x)=1_B(x)/\Vert B\Vert$. Here, $\Vert B\Vert$ is just the size of the set $B$, a normalization so that the total probability at time $s$ integrates to one.
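The backward integration from a set-membership boundary condition can be sketched numerically. The example below is a hedged illustration under simplifying assumptions: a driftless one-dimensional diffusion with constant `sigma`, a target interval $B=[a,b]$, and the unnormalized indicator $1_B$ as terminal data, so that the result is directly the probability of ending in $B$. The function names, grid parameters, and the explicit finite-difference scheme are choices made for this example. For this special case a closed form in terms of the normal distribution function is available for comparison.

```python
import math
import numpy as np


def hit_probability_fd(sigma=1.0, a=-0.5, b=0.5, horizon=1.0, half_width=6.0, nx=601):
    """Integrate u_t + 0.5 * sigma**2 * u_xx = 0 backward from u(T, x) = 1_B(x)."""
    x = np.linspace(-half_width, half_width, nx)
    dx = x[1] - x[0]
    # Explicit scheme: stable for dt <= dx**2 / sigma**2 (safety factor 0.4).
    dt_max = 0.4 * dx**2 / sigma**2
    n_steps = math.ceil(horizon / dt_max)
    dt = horizon / n_steps

    u = ((x >= a) & (x <= b)).astype(float)  # terminal condition 1_B(x)
    for _ in range(n_steps):
        lap = np.zeros(nx)
        lap[1:-1] = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / dx**2
        # One step backward in time: u(s - dt) = u(s) + dt * 0.5 * sigma**2 * u_xx(s).
        u = u + dt * 0.5 * sigma**2 * lap
    return x, u


def hit_probability_exact(x, sigma, a, b, tau):
    """P[X_T in [a, b] | X_t = x] for dX = sigma dW, with tau = T - t > 0."""
    def cdf(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

    s = sigma * math.sqrt(tau)
    return cdf((b - x) / s) - cdf((a - x) / s)


if __name__ == "__main__":
    x, u = hit_probability_fd()
    mid = len(x) // 2
    print("finite difference:", u[mid])
    print("closed form:      ", hit_probability_exact(x[mid], 1.0, -0.5, 0.5, 1.0))
```

Because the terminal condition is discontinuous, the agreement with the closed form is limited by how well the grid resolves the endpoints of $B$; refining `nx` tightens it.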
Kolmogorov backward equation

Let $\{X_t\}_{0\leq t\leq T}$ be the solution of the stochastic differential equation
$$dX_t = \mu(t,X_t)\,dt + \sigma(t,X_t)\,dW_t, \quad 0\leq t\leq T,$$

where $W_t$ is a (possibly multi-dimensional) Wiener process (Brownian motion), $\mu$ is the drift coefficient, and $\sigma$ is related to the diffusion coefficient $D$ as $D=\sigma^2/2$. Define the transition density (or fundamental solution) $p(t,x;\,T,y)$ by
$$p(t,x;\,T,y) = \frac{\mathbb{P}[\,X_T\in dy \mid X_t=x\,]}{dy}, \quad t<T.$$

Then the usual Kolmogorov backward equation for $p$ is
$$\frac{\partial p}{\partial t}(t,x;\,T,y) + A\,p(t,x;\,T,y) = 0, \quad \lim_{t\to T} p(t,x;\,T,y) = \delta_y(x),$$

where $\delta_y(x)$ is the Dirac delta in $x$ centered at $y$, and $A$ is the infinitesimal generator of the diffusion:
$$A\,f(x) = \sum_i \mu_i(x)\,\frac{\partial f}{\partial x_i}(x) + \frac{1}{2}\sum_{i,j}\bigl[\sigma(x)\,\sigma(x)^{\mathsf T}\bigr]_{ij}\,\frac{\partial^2 f}{\partial x_i\,\partial x_j}(x).$$

The backward Kolmogorov equation can be used to derive the Feynman–Kac formula. Given a function $F$ that satisfies the boundary value problem
$$\frac{\partial F}{\partial t}(t,x) + \mu(t,x)\,\frac{\partial F}{\partial x}(t,x) + \frac{1}{2}\,\sigma^2(t,x)\,\frac{\partial^2 F}{\partial x^2}(t,x) = 0, \quad 0\leq t\leq T, \quad F(T,x) = \Phi(x)$$

and given $\{X_t\}_{0\leq t\leq T}$ that, just as before, is a solution of
$$dX_t = \mu(t,X_t)\,dt + \sigma(t,X_t)\,dW_t, \quad 0\leq t\leq T,$$

then if the expectation value is finite
$$\int_0^T \mathbb{E}\Bigl[\bigl(\sigma(t,X_t)\,\frac{\partial F}{\partial x}(t,X_t)\bigr)^2\Bigr]\,dt < \infty,$$

the Feynman–Kac formula is obtained:
$$F(t,x) = \mathbb{E}\bigl[\,\Phi(X_T) \,\big|\, X_t=x\bigr].$$

Proof. Apply Itô's formula to $F(s,X_s)$ for $t\leq s\leq T$:
$$F(T,X_T) = F(t,X_t) + \int_t^T \Bigl\{\frac{\partial F}{\partial s}(s,X_s) + \mu(s,X_s)\,\frac{\partial F}{\partial x}(s,X_s) + \tfrac{1}{2}\,\sigma^2(s,X_s)\,\frac{\partial^2 F}{\partial x^2}(s,X_s)\Bigr\}\,ds + \int_t^T \sigma(s,X_s)\,\frac{\partial F}{\partial x}(s,X_s)\,dW_s.$$

Because $F$ solves the PDE, the first integral is zero. Taking conditional expectation and using the martingale property of the Itô integral gives
$$\mathbb{E}\bigl[F(T,X_T) \,\big|\, X_t=x\bigr] = F(t,x).$$

Substitute $F(T,X_T)=\Phi(X_T)$ to conclude
$$F(t,x) = \mathbb{E}\bigl[\,\Phi(X_T) \,\big|\, X_t=x\bigr].$$

Derivation of the backward Kolmogorov equation

The Feynman–Kac representation can be used to find the PDE solved by the transition densities of solutions to SDEs. Suppose
$$dX_t = \mu(t,X_t)\,dt + \sigma(t,X_t)\,dW_t.$$

For any set $B$, define
$$p_B(t,x;\,T) \triangleq \mathbb{P}\bigl[X_T\in B \mid X_t=x\bigr] = \mathbb{E}\bigl[\mathbf{1}_B(X_T) \,\big|\, X_t=x\bigr].$$

By Feynman–Kac (under integrability conditions), taking $\Phi=\mathbf{1}_B$ gives
$$\frac{\partial p_B}{\partial t}(t,x;\,T) + A\,p_B(t,x;\,T) = 0, \quad p_B(T,x;\,T) = \mathbf{1}_B(x),$$

where
$$A\,f(t,x) = \mu(t,x)\,\frac{\partial f}{\partial x}(t,x) + \tfrac{1}{2}\,\sigma^2(t,x)\,\frac{\partial^2 f}{\partial x^2}(t,x).$$

Taking Lebesgue measure as the reference measure, write $|B|$ for the measure of $B$. The transition density $p(t,x;\,T,y)$ is
$$p(t,x;\,T,y) \triangleq \lim_{B\to y}\,\frac{1}{|B|}\,\mathbb{P}\bigl[X_T\in B \mid X_t=x\bigr].$$

Then
$$\frac{\partial p}{\partial t}(t,x;\,T,y) + A\,p(t,x;\,T,y) = 0, \quad p(t,x;\,T,y) \to \delta_y(x) \quad \text{as } t\to T.$$

Derivation of the forward Kolmogorov equation

The Kolmogorov forward equation is
$$\frac{\partial}{\partial T}\,p(t,x;\,T,y) = A^{*}\bigl[p(t,x;\,T,y)\bigr], \quad \lim_{T\to t} p(t,x;\,T,y) = \delta_y(x).$$

For $T>r>t$, the Markov property implies
$$p(t,x;\,T,y) = \int_{-\infty}^{\infty} p(t,x;\,r,z)\,p(r,z;\,T,y)\,dz.$$

Differentiate both sides with respect to $r$:
$$0 = \int_{-\infty}^{\infty}\Bigl[\frac{\partial}{\partial r}\,p(t,x;\,r,z)\cdot p(r,z;\,T,y) + p(t,x;\,r,z)\cdot\frac{\partial}{\partial r}\,p(r,z;\,T,y)\Bigr]\,dz.$$

From the backward Kolmogorov equation:
$$\frac{\partial}{\partial r}\,p(r,z;\,T,y) = -\,A\,p(r,z;\,T,y).$$

Substitute into the integral:
$$0 = \int_{-\infty}^{\infty}\Bigl[\frac{\partial}{\partial r}\,p(t,x;\,r,z)\cdot p(r,z;\,T,y) - p(t,x;\,r,z)\cdot A\,p(r,z;\,T,y)\Bigr]\,dz.$$

By definition of the adjoint operator $A^{*}$:
$$\int_{-\infty}^{\infty}\Bigl[\frac{\partial}{\partial r}\,p(t,x;\,r,z) - A^{*}\,p(t,x;\,r,z)\Bigr]\,p(r,z;\,T,y)\,dz = 0.$$

Since $p(r,z;\,T,y)$ can be arbitrary, the bracket must vanish:
$$\frac{\partial}{\partial r}\,p(t,x;\,r,z) = A^{*}\bigl[p(t,x;\,r,z)\bigr].$$

Relabel $r\to T$ and $z\to y$, yielding the forward Kolmogorov equation:
$$\frac{\partial}{\partial T}\,p(t,x;\,T,y) = A^{*}\bigl[p(t,x;\,T,y)\bigr], \quad \lim_{T\to t} p(t,x;\,T,y) = \delta_y(x).$$

Finally,
$$A^{*}\,g(x) = -\sum_i \frac{\partial}{\partial x_i}\bigl[\mu_i(x)\,g(x)\bigr] + \frac{1}{2}\sum_{i,j}\frac{\partial^2}{\partial x_i\,\partial x_j}\Bigl[\bigl(\sigma(x)\,\sigma(x)^{\mathsf T}\bigr)_{ij}\,g(x)\Bigr].$$

References

Etheridge, A. (2002). A Course in Financial Calculus. Cambridge University Press.

Kolmogorov, A. (1931). "Über die analytischen Methoden in der Wahrscheinlichkeitsrechnung" (On Analytical Methods in the Theory of Probability).