Filtering problem (stochastic processes)

In the theory of stochastic processes, filtering describes the problem of determining the state of a system from an incomplete and potentially noisy set of observations. Although originally motivated by problems in engineering, filtering has found applications in many fields, from signal processing to finance.

The problem of optimal non-linear filtering (even for the non-stationary case) was solved by Ruslan L. Stratonovich (1959, [1] 1960 [2]); see also the work of Harold J. Kushner [3] and of Moshe Zakai, who introduced a simplified dynamics for the unnormalized conditional law of the filter, [4] known as the Zakai equation. The solution, however, is infinite-dimensional in the general case. [5] Certain approximations and special cases are well understood: for example, linear filters are optimal for Gaussian random variables, and are known as the Wiener filter and the Kalman–Bucy filter. More generally, because the solution is infinite-dimensional, it must be approximated in finite dimensions before it can be implemented on a computer with finite memory. A finite-dimensional approximate nonlinear filter may be based on heuristics, as with the extended Kalman filter or the assumed density filters, [6] or be more methodologically oriented, as with the projection filters, [7] some sub-families of which have been shown to coincide with the assumed density filters. [8] Particle filters [9] are another way to attack the infinite-dimensional filtering problem and are based on sequential Monte Carlo methods.
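As a rough illustration of the sequential Monte Carlo idea, the following Python sketch implements a bootstrap particle filter for a scalar toy model. The model (a signal dY = aY dt + σ dB observed through increments dZ = cY dt + dW), the N(0, 1) prior, the Euler time stepping, the multinomial resampling at every step and all function and parameter names are assumptions made for this example; they are not taken from the cited references.

```python
import numpy as np


def simulate_toy_model(n_steps, dt, a=-1.0, sigma=1.0, c=5.0, seed=1):
    """Simulate the assumed toy model and return the states and observation increments."""
    rng = np.random.default_rng(seed)
    y = np.empty(n_steps)
    dz = np.empty(n_steps)
    y_prev = rng.normal()  # initial state drawn from the assumed prior N(0, 1)
    for k in range(n_steps):
        # Euler-Maruyama step of dY = a*Y dt + sigma dB
        y[k] = y_prev + a * y_prev * dt + sigma * np.sqrt(dt) * rng.normal()
        # Observation increment dZ = c*Y dt + dW
        dz[k] = c * y[k] * dt + np.sqrt(dt) * rng.normal()
        y_prev = y[k]
    return y, dz


def bootstrap_particle_filter(dz, dt, n_particles=500, a=-1.0, sigma=1.0, c=5.0, seed=0):
    """Approximate the conditional mean of the state after each observation increment."""
    rng = np.random.default_rng(seed)
    particles = rng.normal(size=n_particles)  # samples from the assumed prior N(0, 1)
    means = np.empty(len(dz))
    for k, dz_k in enumerate(dz):
        # Prediction: move every particle with one Euler-Maruyama step of the signal.
        noise = rng.normal(size=n_particles)
        particles = particles + a * particles * dt + sigma * np.sqrt(dt) * noise
        # Correction: weight by the likelihood of the increment, dZ ~ N(c*y*dt, dt),
        # dropping factors that do not depend on the particle position y.
        log_w = c * particles * dz_k - 0.5 * (c * particles) ** 2 * dt
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        means[k] = np.sum(w * particles)
        # Resampling: draw particles in proportion to their weights (multinomial).
        particles = rng.choice(particles, size=n_particles, p=w)
    return means


if __name__ == "__main__":
    states, increments = simulate_toy_model(n_steps=2000, dt=0.01)
    estimates = bootstrap_particle_filter(increments, dt=0.01)
    print("mean squared error of the filter:", np.mean((estimates - states) ** 2))
```

Resampling at every step keeps the sketch short; practical implementations often resample only when the effective sample size drops below a threshold.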

In general, if the separation principle applies, then filtering also arises as part of the solution of an optimal control problem. For example, the Kalman filter is the estimation part of the optimal control solution to the linear-quadratic-Gaussian control problem.

The mathematical formalism

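Consider a probability space $(\Omega, \Sigma, P)$ and suppose that the (random) state $Y_t$ in $n$-dimensional Euclidean space $\mathbb{R}^n$ of a system of interest at time $t$ is a random variable $Y_t : \Omega \to \mathbb{R}^n$ given by the solution to an Itō stochastic differential equation of the form

$$dY_t = b(t, Y_t)\,dt + \sigma(t, Y_t)\,dB_t,$$

where $B$ denotes standard $p$-dimensional Brownian motion, $b : [0, +\infty) \times \mathbb{R}^n \to \mathbb{R}^n$ is the drift field, and $\sigma : [0, +\infty) \times \mathbb{R}^n \to \mathbb{R}^{n \times p}$ is the diffusion field. It is assumed that observations $H_t$ in $\mathbb{R}^m$ (note that $m$ and $n$ may, in general, be unequal) are taken for each time $t$ according to

$$H_t = c(t, Y_t) + \gamma(t, Y_t) \cdot \text{noise}.$$

Adopting the Itō interpretation of the stochastic differential and setting

$$Z_t = \int_0^t H_s\,ds,$$

this gives the following stochastic integral representation for the observations $Z_t$:

$$dZ_t = c(t, Y_t)\,dt + \gamma(t, Y_t)\,dW_t,$$

where $W$ denotes standard $r$-dimensional Brownian motion, independent of $B$ and the initial condition $Y_0$, and $c : [0, +\infty) \times \mathbb{R}^n \to \mathbb{R}^m$ and $\gamma : [0, +\infty) \times \mathbb{R}^n \to \mathbb{R}^{m \times r}$ satisfy

$$|c(t, x)| + |\gamma(t, x)| \le C\,(1 + |x|)$$

for all $t$ and $x$ and some constant $C$.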
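The signal equation dY = b dt + σ dB and the observation equation dZ = c dt + γ dW can be simulated numerically. The sketch below uses the Euler–Maruyama scheme, which is one common choice and not the only one; the function name, its arguments and the example coefficients are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate_signal_and_observation(b, sigma, c, gamma, y0, p, r,
                                    T=1.0, n_steps=1000, seed=0):
    """Euler-Maruyama paths of the state Y (n-dim) and the observation Z (m-dim).

    b(t, y) -> (n,), sigma(t, y) -> (n, p), c(t, y) -> (m,), gamma(t, y) -> (m, r).
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    y0 = np.asarray(y0, dtype=float)
    m = np.asarray(c(0.0, y0)).shape[0]
    Y = np.empty((n_steps + 1, y0.size))
    Z = np.zeros((n_steps + 1, m))  # Z_0 = 0 because Z is an integral from 0
    Y[0] = y0
    for k in range(n_steps):
        t = k * dt
        # Independent Brownian increments for the signal (B) and the observation (W).
        dB = rng.normal(scale=np.sqrt(dt), size=p)
        dW = rng.normal(scale=np.sqrt(dt), size=r)
        Y[k + 1] = Y[k] + b(t, Y[k]) * dt + sigma(t, Y[k]) @ dB
        Z[k + 1] = Z[k] + c(t, Y[k]) * dt + gamma(t, Y[k]) @ dW
    return Y, Z


if __name__ == "__main__":
    # Hypothetical example: n = 2, p = 2, m = 1, r = 1.
    Y, Z = simulate_signal_and_observation(
        b=lambda t, y: -y,
        sigma=lambda t, y: 0.5 * np.eye(2),
        c=lambda t, y: np.array([y[0]]),
        gamma=lambda t, y: np.array([[0.2]]),
        y0=[1.0, 2.0], p=2, r=1,
    )
    print(Y.shape, Z.shape)
```

Only the first state component enters the observation in this example, which is one way the observations can be "incomplete" in the sense of the filtering problem.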

The filtering problem is the following: given observations $Z_s$ for $0 \le s \le t$, what is the best estimate $\hat{Y}_t$ of the true state $Y_t$ of the system based on those observations?

By "based on those observations" it is meant that $\hat{Y}_t$ is measurable with respect to the σ-algebra $G_t$ generated by the observations $Z_s$, $0 \le s \le t$. Denote by $K = K(Z, t)$ the collection of all $\mathbb{R}^n$-valued random variables $Y$ that are square-integrable and $G_t$-measurable:

$$K = K(Z, t) = L^2(\Omega, G_t, P; \mathbb{R}^n).$$

By "best estimate", it is meant that $\hat{Y}_t$ minimizes the mean-square distance between $Y_t$ and all candidates in $K$:

$$E\left[|Y_t - \hat{Y}_t|^2\right] = \inf_{Y \in K} E\left[|Y_t - Y|^2\right]. \qquad \text{(M)}$$

Basic result: orthogonal projection

The space $K(Z, t)$ of candidates is a Hilbert space, and the general theory of Hilbert spaces implies that the solution $\hat{Y}_t$ of the minimization problem (M) is given by

$$\hat{Y}_t = P_{K(Z,t)}(Y_t),$$

where $P_{K(Z,t)}$ denotes the orthogonal projection of $L^2(\Omega, \Sigma, P; \mathbb{R}^n)$ onto the linear subspace $K(Z, t) = L^2(\Omega, G_t, P; \mathbb{R}^n)$. Furthermore, it is a general fact about conditional expectations that if $F$ is any sub-σ-algebra of $\Sigma$ then the orthogonal projection

$$P_K : L^2(\Omega, \Sigma, P; \mathbb{R}^n) \to L^2(\Omega, F, P; \mathbb{R}^n)$$

is exactly the conditional expectation operator $E[\,\cdot \mid F]$, i.e.,

$$P_K(X) = E[X \mid F].$$

Hence,

$$\hat{Y}_t = P_{K(Z,t)}(Y_t) = E[Y_t \mid G_t].$$

This elementary result is the basis for the general Fujisaki–Kallianpur–Kunita equation of filtering theory.
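The reason the conditional expectation is the minimizer can be made explicit with a short computation. The following is a sketch of the standard orthogonality argument, written with $\hat{Y}_t = E[Y_t \mid G_t]$ and an arbitrary candidate $Y \in K(Z, t)$:

```latex
\begin{aligned}
E\bigl[|Y_t - Y|^2\bigr]
  &= E\bigl[|Y_t - \hat{Y}_t|^2\bigr]
     + 2\,E\bigl[(Y_t - \hat{Y}_t)^T (\hat{Y}_t - Y)\bigr]
     + E\bigl[|\hat{Y}_t - Y|^2\bigr] \\
  &= E\bigl[|Y_t - \hat{Y}_t|^2\bigr] + E\bigl[|\hat{Y}_t - Y|^2\bigr] \\
  &\ge E\bigl[|Y_t - \hat{Y}_t|^2\bigr].
\end{aligned}
```

The cross term vanishes because $\hat{Y}_t - Y$ is $G_t$-measurable and $E[Y_t - \hat{Y}_t \mid G_t] = 0$, so conditioning on $G_t$ inside the expectation gives zero. Equality holds only when $Y = \hat{Y}_t$ almost surely.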

More advanced result: nonlinear filtering SPDE

The complete knowledge of the filter at a time $t$ would be given by the probability law of the signal $Y_t$ conditional on the sigma-field $G_t$ generated by observations $Z$ up to time $t$. If this probability law admits a density, informally

$$p_t(y)\,dy = P(Y_t \in dy \mid G_t),$$

then under some regularity assumptions the density $p_t(y)$ satisfies a non-linear stochastic partial differential equation (SPDE) driven by $dZ_t$ and called the Kushner–Stratonovich equation, [10] or an unnormalized version $q_t(y)$ of the density $p_t(y)$ satisfies a linear SPDE called the Zakai equation. [10] These equations can be formulated for the above system, but to simplify the exposition one can assume that the unobserved signal $Y$ and the partially observed noisy signal $Z$ satisfy the equations

$$dY_t = b(t, Y_t)\,dt + \sigma(t, Y_t)\,dB_t,$$
$$dZ_t = c(t, Y_t)\,dt + dW_t.$$

In other terms, the system is simplified by assuming that the observation noise $W$ is not state dependent.

One might keep a deterministic time-dependent $\gamma$ in front of $dW$, but we assume this has been taken out by re-scaling.

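For this particular system, the Kushner–Stratonovich SPDE for the density $p_t$ reads

$$dp_t = L_t^* p_t\,dt + p_t\,[c(t, \cdot) - E_{p_t}(c(t, \cdot))]^T\,[dZ_t - E_{p_t}(c(t, \cdot))\,dt],$$

where $T$ denotes transposition, $E_p$ denotes the expectation with respect to the density $p$, $E_p[f] = \int f(y)\,p(y)\,dy$, and the forward diffusion operator $L_t^*$ is

$$L_t^* f(t, y) = -\sum_i \frac{\partial}{\partial y_i}\left[b_i(t, y)\,f(t, y)\right] + \frac{1}{2}\sum_{i,j} \frac{\partial^2}{\partial y_i\,\partial y_j}\left[a_{ij}(t, y)\,f(t, y)\right],$$

where $a = \sigma\sigma^T$. If we choose the unnormalized density $q_t(y)$, the Zakai SPDE for the same system reads

$$dq_t = L_t^* q_t\,dt + q_t\,[c(t, \cdot)]^T\,dZ_t.$$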
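One crude way to approximate the conditional density numerically is a prediction-correction scheme on a grid, in the spirit of splitting-up methods for the Zakai equation: a Fokker–Planck step with the forward operator, followed by multiplication with the likelihood of the observed increment. The Python sketch below does this for an assumed scalar model dY = aY dt + σ dB, dZ = cY dt + dW with an N(0, 1) prior. The explicit finite-difference discretisation, the grid size and all names are assumptions of this example, and the density is renormalised after every step, so the returned array approximates the normalised density rather than the unnormalised one.

```python
import numpy as np


def zakai_grid_filter(dz, dt, a=-1.0, sigma=1.0, c=2.0, half_width=6.0, n_grid=241):
    """Grid approximation of the conditional density for the assumed scalar model."""
    y = np.linspace(-half_width, half_width, n_grid)
    dy = y[1] - y[0]
    q = np.exp(-0.5 * y ** 2)  # assumed prior N(0, 1), up to a constant
    q /= q.sum() * dy
    means = np.empty(len(dz))
    for k, dz_k in enumerate(dz):
        # Prediction: one explicit Euler step of the Fokker-Planck equation,
        # dq/dt = -d/dy (a*y*q) + 0.5 * sigma^2 * d^2 q / dy^2.
        drift = -np.gradient(a * y * q, dy)
        diffusion = np.zeros_like(q)
        diffusion[1:-1] = 0.5 * sigma ** 2 * (q[2:] - 2.0 * q[1:-1] + q[:-2]) / dy ** 2
        q = np.maximum(q + dt * (drift + diffusion), 0.0)
        # Correction: multiply by the likelihood of the increment dZ ~ N(c*y*dt, dt),
        # dropping factors that do not depend on y.
        q = q * np.exp(c * y * dz_k - 0.5 * (c * y) ** 2 * dt)
        # Renormalise to avoid underflow or overflow; this turns q into a density.
        q /= q.sum() * dy
        means[k] = np.sum(y * q) * dy
    return y, q, means


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dt, n_steps = 1e-3, 2000
    state, increments = rng.normal(), np.empty(n_steps)
    for k in range(n_steps):
        state += -state * dt + np.sqrt(dt) * rng.normal()
        increments[k] = 2.0 * state * dt + np.sqrt(dt) * rng.normal()
    grid, density, means = zakai_grid_filter(increments, dt)
    print("final state:", state, "final conditional mean:", means[-1])
```

The explicit time stepping is only stable when dt is small relative to the squared grid spacing; the default values above were picked with that restriction in mind. Grid methods of this kind become impractical as the state dimension grows, which is one motivation for the approximations mentioned in the introduction.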

These SPDEs for $p$ and $q$ are written in Itō calculus form. It is possible to write them in Stratonovich calculus form, which turns out to be helpful when deriving filtering approximations based on differential geometry, as in the projection filters. For example, the Kushner–Stratonovich equation written in Stratonovich calculus reads

$$dp_t = L_t^* p_t\,dt - \frac{1}{2}\,p_t\,\left[|c(t, \cdot)|^2 - E_{p_t}(|c(t, \cdot)|^2)\right]dt + p_t\,[c(t, \cdot) - E_{p_t}(c(t, \cdot))]^T \circ dZ_t.$$

From any of the densities $p$ and $q$ one can calculate all statistics of the signal $Y_t$ conditional on the sigma-field generated by observations $Z$ up to time $t$, so that the densities give complete knowledge of the filter. Under the particular linear-constant assumptions with respect to $Y$, where the system coefficients $b$ and $c$ are linear functions of $Y$ and where $\sigma$ and $\gamma$ do not depend on $Y$, with the initial condition for the signal $Y$ being Gaussian or deterministic, the density $p_t(y)$ is Gaussian and can be characterized by its mean and variance-covariance matrix, whose evolution is described by the Kalman–Bucy filter, which is finite dimensional. [10] More generally, the evolution of the filter density occurs in an infinite-dimensional function space, [5] and it has to be approximated via a finite-dimensional approximation, as hinted above.
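In the scalar linear case dY = aY dt + σ dB, dZ = cY dt + dW with a Gaussian initial condition, the Kalman–Bucy filter reduces to two equations for the conditional mean m and variance P: dm = a m dt + P c (dZ − c m dt) and dP/dt = 2aP + σ² − c²P². The Python sketch below integrates them with a plain Euler scheme; the scalar setting, the discretisation and the function name are assumptions made for this illustration.

```python
import numpy as np


def kalman_bucy_scalar(dz, dt, a, sigma, c, m0=0.0, p0=1.0):
    """Euler discretisation of the scalar Kalman-Bucy mean and variance equations."""
    m, p = m0, p0
    means = np.empty(len(dz))
    variances = np.empty(len(dz))
    for k, dz_k in enumerate(dz):
        # Innovation: observed increment minus its predicted value c*m*dt.
        innovation = dz_k - c * m * dt
        # Mean: signal drift plus gain (P*c) times the innovation.
        m = m + a * m * dt + p * c * innovation
        # Variance: deterministic Riccati equation, independent of the data.
        p = p + (2.0 * a * p + sigma ** 2 - (c * p) ** 2) * dt
        means[k] = m
        variances[k] = p
    return means, variances


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a, sigma, c, dt, n_steps = -1.0, 1.0, 2.0, 1e-3, 5000
    state, increments = rng.normal(), np.empty(n_steps)
    for k in range(n_steps):
        state += a * state * dt + sigma * np.sqrt(dt) * rng.normal()
        increments[k] = c * state * dt + np.sqrt(dt) * rng.normal()
    means, variances = kalman_bucy_scalar(increments, dt, a, sigma, c)
    print("final state:", state, "estimate:", means[-1], "variance:", variances[-1])
```

Because the variance equation does not involve the observations, P can be computed in advance; for these coefficients it settles at the positive root of 2aP + σ² − c²P² = 0. Only two numbers are propagated, which is what "finite dimensional" means here.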

References

  1. Stratonovich, R. L. (1959). Optimum nonlinear systems which bring about a separation of a signal with constant parameters from noise. Radiofizika, 2(6), pp. 892–901.
  2. Stratonovich, R. L. (1960). Application of the Markov processes theory to optimal filtering. Radio Engineering and Electronic Physics, 5(11), pp. 1–19.
  3. Kushner, Harold (1967). Nonlinear filtering: The exact dynamical equations satisfied by the conditional mode. IEEE Transactions on Automatic Control, 12(3), pp. 262–267.
  4. Zakai, Moshe (1969). On the optimal filtering of diffusion processes. Zeit. Wahrsch., 11, pp. 230–243. MR 242552, Zbl 0164.19201, doi: 10.1007/BF00536382
  5. Chaleyat-Maurel, Mireille; Michel, Dominique (1984). Des resultats de non existence de filtre de dimension finie. Stochastics, 13(1+2), pp. 83–102.
  6. Maybeck, Peter S. (1979). Stochastic Models, Estimation, and Control. Mathematics in Science and Engineering, Volume 141. Academic Press.
  7. Brigo, Damiano; Hanzon, Bernard; Le Gland, François (1998). A differential geometric approach to nonlinear filtering: the projection filter. IEEE Transactions on Automatic Control, 43(2), pp. 247–252.
  8. Brigo, Damiano; Hanzon, Bernard; Le Gland, François (1999). Approximate nonlinear filtering by projection on exponential manifolds of densities. Bernoulli, 5(3), pp. 495–534.
  9. Del Moral, Pierre (1998). Measure valued processes and interacting particle systems. Application to non linear filtering problems. Annals of Applied Probability, 8(2) (Publications du Laboratoire de Statistique et Probabilités, 96-15 (1996) ed.), pp. 438–495. doi: 10.1214/aoap/1028903535
  10. Bain, A.; Crisan, D. (2009). Fundamentals of Stochastic Filtering. Springer-Verlag, New York. https://doi.org/10.1007/978-0-387-76896-0
