Filtering problem (stochastic processes)


In the theory of stochastic processes, filtering describes the problem of determining the state of a system from an incomplete and potentially noisy set of observations. While originally motivated by problems in engineering, filtering has found applications in many fields, from signal processing to finance.


The problem of optimal non-linear filtering (even for the non-stationary case) was solved by Ruslan L. Stratonovich (1959, [1] 1960 [2]); see also the work of Harold J. Kushner [3] and of Moshe Zakai, who introduced a simplified dynamics for the unnormalized conditional law of the filter, [4] known as the Zakai equation. The solution, however, is infinite-dimensional in the general case. [5] Certain approximations and special cases are well understood: for example, linear filters are optimal for Gaussian random variables, and are known as the Wiener filter and the Kalman-Bucy filter.

More generally, because the solution is infinite-dimensional, it requires finite-dimensional approximations to be implemented on a computer with finite memory. A finite-dimensional approximate nonlinear filter may be based more on heuristics, such as the extended Kalman filter or the assumed density filters, [6] or be more methodologically oriented, such as the projection filters, [7] some sub-families of which are shown to coincide with the assumed density filters. [8] Particle filters [9] are another option for attacking the infinite-dimensional filtering problem and are based on sequential Monte Carlo methods.

In general, if the separation principle applies, then filtering also arises as part of the solution of an optimal control problem. For example, the Kalman filter is the estimation part of the optimal control solution to the linear-quadratic-Gaussian control problem.

The mathematical formalism

Consider a probability space (Ω, Σ, P) and suppose that the (random) state Yt in n-dimensional Euclidean space Rn of a system of interest at time t is a random variable Yt : Ω → Rn given by the solution to an Itō stochastic differential equation of the form

$$dY_t = b(t, Y_t)\,dt + \sigma(t, Y_t)\,dB_t,$$

where B denotes standard p-dimensional Brownian motion, b : [0, +∞) × Rn → Rn is the drift field, and σ : [0, +∞) × Rn → Rn×p is the diffusion field. It is assumed that observations Ht in Rm (note that m and n may, in general, be unequal) are taken for each time t according to

$$H_t = c(t, Y_t) + \gamma(t, Y_t) \cdot \text{noise}.$$

Adopting the Itō interpretation of the stochastic differential and setting

$$Z_t = \int_0^t H_s\,ds,$$

this gives the following stochastic integral representation for the observations Zt:

$$dZ_t = c(t, Y_t)\,dt + \gamma(t, Y_t)\,dW_t,$$

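where W denotes standard r-dimensional Brownian motion, independent of B and the initial condition Y0, and c : [0, +∞) × Rn → Rm and γ : [0, +∞) × Rn → Rm×r (both take values matching the observation space Rm) satisfy

$$|c(t, x)| + |\gamma(t, x)| \le C\,(1 + |x|)$$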

for all t and x and some constant C.
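A discretized version of the signal-observation pair dY = b dt + σ dB, dZ = c dt + γ dW can be simulated with the Euler-Maruyama scheme. The following is a minimal scalar sketch (n = m = p = r = 1), not a reference implementation; the function name `simulate_signal_and_observation` and the example coefficients (an Ornstein-Uhlenbeck signal observed linearly) are assumptions chosen purely for illustration.

```python
import math
import random


def simulate_signal_and_observation(b, sigma, c, gamma, y0, horizon=1.0,
                                    n_steps=1000, seed=0):
    """Euler-Maruyama paths of dY = b dt + sigma dB, dZ = c dt + gamma dW (scalar case)."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    sqrt_dt = math.sqrt(dt)
    t, y, z = 0.0, y0, 0.0  # Z_0 = 0 because Z_t is the integral of H over [0, t]
    ts, ys, zs = [t], [y], [z]
    for _ in range(n_steps):
        dB = rng.gauss(0.0, sqrt_dt)  # independent Brownian increments, variance dt
        dW = rng.gauss(0.0, sqrt_dt)
        # Ito convention: coefficients are evaluated at the start of the step.
        dy = b(t, y) * dt + sigma(t, y) * dB
        dz = c(t, y) * dt + gamma(t, y) * dW
        t, y, z = t + dt, y + dy, z + dz
        ts.append(t)
        ys.append(y)
        zs.append(z)
    return ts, ys, zs


# Hypothetical example: mean-reverting signal, linear sensor, constant noise levels.
ts, ys, zs = simulate_signal_and_observation(
    b=lambda t, y: -y,
    sigma=lambda t, y: 0.5,
    c=lambda t, y: y,
    gamma=lambda t, y: 0.2,
    y0=1.0,
)
print(f"Y_T = {ys[-1]:.3f}, Z_T = {zs[-1]:.3f}")
```

Only the path `zs` would be available to a filter; `ys` is kept here so that an estimate can be compared with the hidden state.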

The filtering problem is the following: given observations Zs for 0 ≤ s ≤ t, what is the best estimate Ŷt of the true state Yt of the system based on those observations?

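By "based on those observations" it is meant that Ŷt is measurable with respect to the σ-algebra Gt generated by the observations Zs, 0 ≤ s ≤ t. Denote by K = K(Z, t) the collection of all Rn-valued random variables Y that are square-integrable and Gt-measurable:

$$K = K(Z, t) = L^2(\Omega, G_t, \mathbf{P}; \mathbf{R}^n).$$

By "best estimate", it is meant that Ŷt minimizes the mean-square distance between Yt and all candidates in K:

$$\mathbf{E}\left[ |Y_t - \hat{Y}_t|^2 \right] = \inf_{Y \in K} \mathbf{E}\left[ |Y_t - Y|^2 \right]. \qquad \text{(M)}$$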

Basic result: orthogonal projection

The space K(Z, t) of candidates is a Hilbert space, and the general theory of Hilbert spaces implies that the solution Ŷt of the minimization problem (M) is given by

$$\hat{Y}_t = P_{K(Z,t)}(Y_t),$$

where PK(Z,t) denotes the orthogonal projection of L2(Ω, Σ, P; Rn) onto the linear subspace K(Z, t) = L2(Ω, Gt, P; Rn). Furthermore, it is a general fact about conditional expectations that if F is any sub-σ-algebra of Σ then the orthogonal projection

$$P_K : L^2(\Omega, \Sigma, \mathbf{P}; \mathbf{R}^n) \to L^2(\Omega, F, \mathbf{P}; \mathbf{R}^n)$$

is exactly the conditional expectation operator E[·|F], i.e.,

$$P_K(X) = \mathbf{E}[X \mid F].$$

Hence,

$$\hat{Y}_t = P_{K(Z,t)}(Y_t) = \mathbf{E}[Y_t \mid G_t].$$

This elementary result is the basis for the general Fujisaki-Kallianpur-Kunita equation of filtering theory.
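The projection property can be checked by hand on a finite sample space, where the conditional expectation is a weighted average over each level set of the observation. The sketch below is only an illustration under assumed data: the joint law `joint` of a scalar pair (Y, Z) is made up, and the helper names `cond_exp` and `mse` are hypothetical. It computes E[Y | Z] exactly with rational arithmetic, verifies that the residual is orthogonal to functions of Z, and compares the mean-square error against a grid of other Z-measurable candidates.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint law of (Y, Z) on a finite sample space: {(y, z): probability}.
joint = {
    (0, 0): Fraction(1, 4), (1, 0): Fraction(1, 4),
    (1, 1): Fraction(1, 8), (3, 1): Fraction(3, 8),
}


def cond_exp(joint):
    """Return the map z -> E[Y | Z = z]."""
    num, den = {}, {}
    for (y, z), p in joint.items():
        num[z] = num.get(z, 0) + p * y
        den[z] = den.get(z, 0) + p
    return {z: num[z] / den[z] for z in den}


def mse(joint, g):
    """E[(Y - g(Z))^2] for an estimator given as a dict z -> g(z)."""
    return sum(p * (y - g[z]) ** 2 for (y, z), p in joint.items())


y_hat = cond_exp(joint)

# Orthogonality: the residual Y - E[Y|Z] is uncorrelated with every indicator 1{Z = z0},
# hence with every function of Z.
for z0 in y_hat:
    assert sum(p * (y - y_hat[z]) for (y, z), p in joint.items() if z == z0) == 0

# Optimality: no other function of Z on this grid achieves a smaller mean-square error.
grid = [Fraction(k, 4) for k in range(13)]
best = mse(joint, y_hat)
assert all(mse(joint, {0: g0, 1: g1}) >= best for g0, g1 in product(grid, grid))
print(y_hat, best)
```

Here a random variable is Z-measurable exactly when it is a function of Z, so the dictionaries `z -> g(z)` play the role of the candidate space K.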

More advanced result: nonlinear filtering SPDE

The complete knowledge of the filter at a time t would be given by the probability law of the signal Yt conditional on the sigma-field Gt generated by observations Z up to time t. If this probability law admits a density, informally

$$p_t(y)\,dy = \mathbf{P}(Y_t \in dy \mid G_t),$$

then under some regularity assumptions the density pt(y) satisfies a non-linear stochastic partial differential equation (SPDE) driven by dZt and called the Kushner-Stratonovich equation, [10] or an unnormalized version qt(y) of the density pt(y) satisfies a linear SPDE called the Zakai equation. [10] These equations can be formulated for the above system, but to simplify the exposition one can assume that the unobserved signal Y and the partially observed noisy signal Z satisfy the equations

$$dY_t = b(t, Y_t)\,dt + \sigma(t, Y_t)\,dB_t,$$

$$dZ_t = c(t, Y_t)\,dt + dW_t.$$

In other terms, the system is simplified by assuming that the observation noise W is not state dependent.

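One might keep a deterministic time-dependent coefficient γ in front of dW, but we assume this has been taken out by re-scaling.

For this particular system, the Kushner-Stratonovich SPDE for the density pt reads

$$dp_t = \mathcal{L}_t^* p_t\,dt + p_t \left[ c(t, \cdot) - E_{p_t}(c(t, \cdot)) \right]^T \left[ dZ_t - E_{p_t}(c(t, \cdot))\,dt \right],$$

where T denotes transposition, E_p denotes the expectation with respect to the density p,

$$E_p[f] = \int f(y)\,p(y)\,dy,$$

and the forward diffusion operator is

$$\mathcal{L}_t^* f(t, y) = -\sum_i \frac{\partial}{\partial y_i} \left[ b_i(t, y) f(t, y) \right] + \frac{1}{2} \sum_{i,j} \frac{\partial^2}{\partial y_i \partial y_j} \left[ a_{ij}(t, y) f(t, y) \right],$$

where a = σσ^T. If we choose the unnormalized density qt(y), the Zakai SPDE for the same system reads

$$dq_t = \mathcal{L}_t^* q_t\,dt + q_t \left[ c(t, \cdot) \right]^T dZ_t.$$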
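One way to get a feel for the unnormalized density is through weighted samples: simulate independent copies of the signal and weight each copy by a discretization of the likelihood exp(∫ c dZ − ½ ∫ |c|² dt) from the Kallianpur-Striebel formula. The weighted empirical measure then approximates the unnormalized conditional law, and dividing by the total weight gives the normalized one. The sketch below is a scalar illustration under assumed coefficients (a double-well drift and a linear sensor, both hypothetical); it is plain sequential importance sampling without the resampling step that practical particle filters add, so its weights degenerate over long horizons.

```python
import math
import random


def weighted_particles(zs, dt, b, sigma, c, sample_y0, n_particles=500, seed=1):
    """Propagate signal copies and accumulate log-likelihood weights along the path zs."""
    rng = random.Random(seed)
    sqrt_dt = math.sqrt(dt)
    particles = [sample_y0(rng) for _ in range(n_particles)]
    log_w = [0.0] * n_particles
    t = 0.0
    for z_prev, z_next in zip(zs, zs[1:]):
        dz = z_next - z_prev
        for i, y in enumerate(particles):
            ci = c(t, y)
            log_w[i] += ci * dz - 0.5 * ci * ci * dt  # log of exp(c dZ - |c|^2 dt / 2)
            particles[i] = y + b(t, y) * dt + sigma(t, y) * rng.gauss(0.0, sqrt_dt)
        t += dt
    top = max(log_w)  # subtract the maximum before exponentiating to avoid overflow
    unnormalized = [math.exp(lw - top) for lw in log_w]
    total = sum(unnormalized)
    return particles, [w / total for w in unnormalized]


# Hypothetical model: dY = (Y - Y^3) dt + 0.5 dB, dZ = Y dt + dW, Y_0 ~ N(0, 1).
def drift(t, y):
    return y - y ** 3


def diffusion(t, y):
    return 0.5


def sensor(t, y):
    return y


dt, n_steps = 0.01, 100
data_rng = random.Random(0)
y_true, z, zs = data_rng.gauss(0.0, 1.0), 0.0, [0.0]
for k in range(n_steps):
    dz = sensor(k * dt, y_true) * dt + data_rng.gauss(0.0, math.sqrt(dt))
    y_true += drift(k * dt, y_true) * dt + diffusion(k * dt, y_true) * data_rng.gauss(0.0, math.sqrt(dt))
    z += dz
    zs.append(z)

particles, weights = weighted_particles(
    zs, dt, drift, diffusion, sensor, sample_y0=lambda rng: rng.gauss(0.0, 1.0))
estimate = sum(w * y for w, y in zip(weights, particles))  # approximates E[Y_t | G_t]
ess = 1.0 / sum(w * w for w in weights)                    # effective sample size
print(f"estimate = {estimate:.3f}, true state = {y_true:.3f}, ESS = {ess:.1f}")
```

The effective sample size reported at the end is a common diagnostic: a value far below the number of particles signals that a few weights dominate and that resampling would be needed.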

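These SPDEs for p and q are written in Itō calculus form. It is possible to write them in Stratonovich calculus form, which turns out to be helpful when deriving filtering approximations based on differential geometry, as in the projection filters. For example, the Kushner-Stratonovich equation written in Stratonovich calculus reads

$$dp_t = \mathcal{L}_t^* p_t\,dt - \frac{1}{2}\,p_t \left[ |c(t, \cdot)|^2 - E_{p_t}(|c(t, \cdot)|^2) \right] dt + p_t \left[ c(t, \cdot) - E_{p_t}(c(t, \cdot)) \right]^T \circ dZ_t.$$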

From either of the densities p and q one can calculate all statistics of the signal Yt conditional on the sigma-field generated by observations Z up to time t, so that the densities give complete knowledge of the filter. Consider the particular linear-constant assumptions with respect to Y: the system coefficients b and c are linear functions of Y, σ and γ do not depend on Y, and the initial condition for the signal Y is Gaussian or deterministic. In this case the density is Gaussian and can be characterized by its mean and variance-covariance matrix, whose evolution is described by the Kalman-Bucy filter, which is finite-dimensional. [10] More generally, the evolution of the filter density occurs in an infinite-dimensional function space, [5] and it has to be approximated via a finite-dimensional approximation, as hinted above.
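In the scalar linear case dY = aY dt + σ dB, dZ = cY dt + dW, the Kalman-Bucy filter reduces to two coupled equations: dm = a m dt + P c (dZ − c m dt) for the conditional mean m and the Riccati equation dP/dt = 2aP + σ² − c²P² for the conditional variance P. The sketch below discretizes both with a plain Euler step; the function name `kalman_bucy`, the parameter values, and the synthetic data are assumptions for illustration, and a production implementation would likely use a more careful integrator.

```python
import math
import random


def kalman_bucy(zs, dt, a, sigma, c, m0, p0):
    """Euler-discretized scalar Kalman-Bucy filter for dY = a Y dt + sigma dB, dZ = c Y dt + dW."""
    m, p = m0, p0
    means, variances = [m], [p]
    for z_prev, z_next in zip(zs, zs[1:]):
        innovation = (z_next - z_prev) - c * m * dt                # dZ_t - c m_t dt
        m_next = m + a * m * dt + p * c * innovation               # mean equation, gain P c
        p_next = p + (2 * a * p + sigma ** 2 - (c * p) ** 2) * dt  # Riccati equation
        m, p = m_next, p_next
        means.append(m)
        variances.append(p)
    return means, variances


# Hypothetical parameters and synthetic data from the same linear model.
a, sigma, c = -1.0, 1.0, 2.0
dt, n_steps = 0.01, 1000
rng = random.Random(0)
y, z = rng.gauss(0.0, 1.0), 0.0  # Gaussian initial condition Y_0 ~ N(0, 1)
ys, zs = [y], [z]
for _ in range(n_steps):
    dz = c * y * dt + rng.gauss(0.0, math.sqrt(dt))
    y += a * y * dt + sigma * rng.gauss(0.0, math.sqrt(dt))
    z += dz
    ys.append(y)
    zs.append(z)

means, variances = kalman_bucy(zs, dt, a, sigma, c, m0=0.0, p0=1.0)

# Positive root of 0 = 2 a P + sigma^2 - c^2 P^2, the steady-state variance.
p_inf = (a + math.sqrt(a * a + (c * sigma) ** 2)) / c ** 2
print(f"P_T = {variances[-1]:.6f}, steady state = {p_inf:.6f}")
print(f"m_T = {means[-1]:.3f}, Y_T = {ys[-1]:.3f}")
```

The variance recursion does not depend on the observations, which is why the whole filter is described by just two numbers per time step in this scalar setting.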

References

  1. Stratonovich, R. L. (1959). Optimum nonlinear systems which bring about a separation of a signal with constant parameters from noise. Radiofizika, 2:6, pp. 892-901.
  2. Stratonovich, R. L. (1960). Application of the Markov processes theory to optimal filtering. Radio Engineering and Electronic Physics, 5:11, pp. 1-19.
  3. Kushner, Harold (1967). Nonlinear filtering: The exact dynamical equations satisfied by the conditional mode. IEEE Transactions on Automatic Control, Volume 12, Issue 3, pp. 262-267.
  4. Zakai, Moshe (1969). On the optimal filtering of diffusion processes. Zeit. Wahrsch. 11, pp. 230-243. MR 242552, Zbl 0164.19201, doi: 10.1007/BF00536382
  5. Chaleyat-Maurel, Mireille and Michel, Dominique (1984). Des resultats de non existence de filtre de dimension finie. Stochastics, 13(1+2), pp. 83-102.
  6. Maybeck, Peter S. (1979). Stochastic models, estimation, and control. Mathematics in Science and Engineering, Volume 141. Academic Press.
  7. Brigo, Damiano, Hanzon, Bernard and Le Gland, François (1998). A Differential Geometric approach to nonlinear filtering: the Projection Filter. IEEE Transactions on Automatic Control, Vol. 43, 2, pp. 247-252.
  8. Brigo, Damiano, Hanzon, Bernard and Le Gland, François (1999). Approximate Nonlinear Filtering by Projection on Exponential Manifolds of Densities. Bernoulli, Vol. 5, N. 3, pp. 495-534.
  9. Del Moral, Pierre (1998). "Measure Valued Processes and Interacting Particle Systems. Application to Non Linear Filtering Problems". Annals of Applied Probability (Publications du Laboratoire de Statistique et Probabilités, 96-15 (1996) ed.). 8 (2): 438–495. doi: 10.1214/aoap/1028903535
  10. Bain, A. and Crisan, D. (2009). Fundamentals of Stochastic Filtering. Springer-Verlag, New York. https://doi.org/10.1007/978-0-387-76896-0
