Smoothing problem (stochastic processes)


The smoothing problem (not to be confused with smoothing in statistics, image processing and other contexts) is the problem of estimating an unknown probability density function recursively over time using incremental incoming measurements. It is one of the main problems defined by Norbert Wiener. [1] [2] A smoother is an algorithm that implements a solution to this problem, typically based on recursive Bayesian estimation. The smoothing problem is closely related to the filtering problem, both of which are studied in Bayesian smoothing theory.


A smoother is often a two-pass process, composed of forward and backward passes. Consider estimating (prediction/retrodiction) an ongoing process (e.g. tracking a missile) from incoming observations. When new observations arrive, estimates about the past need to be updated to obtain a smoother (more accurate) estimate of the whole path up to the present, taking the newer observations into account. Without a backward pass (for retrodiction), the sequence of predictions in an online filtering algorithm does not look smooth. In other words, retrospectively, it is as if future observations were used to improve the estimate of a point in the past, once those observations become available. Note that the time of estimation (which determines which observations are available) can differ from the time of the point the estimate is about (the subject of prediction/retrodiction). Observations from later times can be used to update and improve the estimates of earlier times, which yields smoother-looking estimates (retrodiction) of the whole path.
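In standard state-space notation (an illustrative formalization, not quoted from the sources cited here), with hidden state x_k at time k and observations y_1, ..., y_T, the two estimation problems amount to computing different conditional densities:

    \text{Filtering:} \quad p(x_k \mid y_{1:k}) \quad \text{(only observations up to time } k \text{ are available)}

    \text{Smoothing:} \quad p(x_k \mid y_{1:T}), \; T > k \quad \text{(later observations are used as well)}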

Examples of smoothers

Some variants include: [3]

  - the Rauch–Tung–Striebel (RTS) smoother;
  - Gaussian smoothers (e.g., the extended Kalman smoother or sigma-point smoothers) for non-linear state-space models;
  - particle smoothers.

Confusion of terms and the relation between the filtering and smoothing problems

The terms Smoothing and Filtering are used for four concepts that may initially be confusing: Smoothing (in two senses: estimation and convolution), and Filtering (again in two senses: estimation and convolution).

Smoothing (estimation) and smoothing (convolution), despite sharing the same name in English, can denote totally different mathematical procedures, and the requirements of the problems they solve are different. The concepts are distinguished by context (signal processing versus estimation of stochastic processes).

The historical reason for this confusion is that Wiener initially suggested a "smoothing" filter that was just a convolution. Later developments separated his proposed solutions into two distinct concepts: one was attaining a smoother estimate by taking past observations into account, and the other was smoothing by filter design (the design of a convolution filter).

Both the smoothing problem (in the sense of estimation) and the filtering problem (in the sense of estimation) are often confused with smoothing and filtering in other contexts (especially non-stochastic signal processing, where the names often refer to various types of convolution). The estimation senses date to World War II problems framed by people like Norbert Wiener. [1] [2] One source of confusion is that the Wiener filter has the form of a simple convolution: in Wiener's setting two time series are given, and once the filter is defined, a straightforward convolution produces the answer. In later developments such as Kalman filtering, however, the nature of the filtering is different from convolution, and it deserves a different name.

The distinction is described in the following two senses:

1. Convolution: Smoothing in the sense of convolution is the simpler notion; examples include the moving average, low-pass filtering, convolution with a kernel, and blurring using Laplace filters in image processing. It is often a filter-design problem, particularly in non-stochastic and non-Bayesian signal processing, without any hidden variables (see the first sketch after this list).

2. Estimation: The smoothing problem (smoothing in the sense of estimation) uses Bayesian and state-space models to estimate the hidden state variables. This usage arose in the World War II context framed by people like Norbert Wiener, in (stochastic) control theory, radar, signal detection, tracking, etc. It is one of the main problems solved by Norbert Wiener. [1] [2] The most common implementation is the Kalman smoother used with the Kalman filter, which was actually developed by Rauch; the procedure is called the Kalman–Rauch recursion (see the second sketch after this list). Most importantly, in the filtering problem (sense 2) only the information from observations up to the time of the current sample is used, whereas in smoothing (also sense 2) observation samples from the future are used as well. Filtering is causal, but smoothing is batch processing of the same problem, namely estimation of a time-series process based on serial incremental observations.
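As an illustration of sense 1, the first sketch below implements a plain moving-average smoother as a convolution with a uniform kernel; this is a minimal example, and the window length and test signal are invented for illustration.

    import numpy as np

    def moving_average(x, window=5):
        """Smooth a 1-D signal by convolving it with a uniform kernel."""
        kernel = np.ones(window) / window           # constant averaging kernel
        return np.convolve(x, kernel, mode="same")  # same-length smoothed output

    # Usage: smooth a noisy sine wave (illustrative data).
    t = np.linspace(0, 2 * np.pi, 200)
    noisy = np.sin(t) + 0.3 * np.random.randn(t.size)
    smoothed = moving_average(noisy, window=9)

There are no hidden variables here: the output is a direct transformation of the observed samples.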
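For sense 2, the second sketch is a minimal version of the Kalman–Rauch (Rauch–Tung–Striebel) recursion for a scalar linear-Gaussian state-space model: a forward Kalman filter pass followed by a backward smoothing pass. The model parameters (F, H, Q, R) and the test data are assumptions of this example, not taken from any cited source.

    import numpy as np

    def kalman_rauch_smoother(ys, F=1.0, H=1.0, Q=0.01, R=0.5, m0=0.0, P0=1.0):
        """Forward Kalman filter plus Rauch backward pass for the scalar model
        x_k = F x_{k-1} + process noise (variance Q),
        y_k = H x_k + measurement noise (variance R)."""
        n = len(ys)
        m_f, P_f = np.zeros(n), np.zeros(n)  # filtered means/variances
        m_p, P_p = np.zeros(n), np.zeros(n)  # one-step predicted means/variances

        m, P = m0, P0
        for k, y in enumerate(ys):                    # forward (filtering) pass
            m_pred, P_pred = F * m, F * P * F + Q     # predict
            K = P_pred * H / (H * P_pred * H + R)     # Kalman gain
            m = m_pred + K * (y - H * m_pred)         # update with the innovation
            P = (1 - K * H) * P_pred
            m_p[k], P_p[k], m_f[k], P_f[k] = m_pred, P_pred, m, P

        m_s, P_s = m_f.copy(), P_f.copy()             # backward (smoothing) pass
        for k in range(n - 2, -1, -1):
            G = P_f[k] * F / P_p[k + 1]               # smoother gain
            m_s[k] = m_f[k] + G * (m_s[k + 1] - m_p[k + 1])
            P_s[k] = P_f[k] + G * (P_s[k + 1] - P_p[k + 1]) * G
        return m_s, P_s

    # Usage: smooth noisy observations of a slowly drifting level.
    ys = np.cumsum(0.1 * np.random.randn(100)) + 0.7 * np.random.randn(100)
    means, variances = kalman_rauch_smoother(ys)

The backward pass revises each filtered estimate using the observations that arrived after it, which is exactly the retrodiction described above.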

But the usual and more common smoothing and filtering (in the sense of 1.) do not carry such a distinction, because there is no separation between hidden and observable variables.

The distinction between smoothing (estimation) and filtering (estimation): in smoothing, all observation samples (including those from the future) are used; filtering is causal, whereas smoothing is batch processing of the given data. Filtering is the estimation of a (hidden) time-series process based on serial incremental observations.


Related Research Articles

Signal processing

Signal processing is an electrical engineering subfield that focuses on analysing, modifying, and synthesizing signals such as sound, images, and scientific measurements. Signal processing techniques can be used to improve transmission, storage efficiency, and subjective quality, and also to emphasize or detect components of interest in a measured signal.

A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process, call it X, with unobservable ("hidden") states. As part of the definition, an HMM requires that there be an observable process Y whose outcomes are "influenced" by the outcomes of X in a known way. Since X cannot be observed directly, the goal is to learn about X by observing Y. The HMM has the additional requirement that the outcome of Y at time t = t0 must be "influenced" exclusively by the outcome of X at t = t0, and that the outcomes of X and Y at t < t0 must not affect the outcome of Y at t = t0.

Kalman filter

In statistics and control theory, Kalman filtering, also known as linear quadratic estimation (LQE), is an algorithm that uses a series of measurements observed over time, containing statistical noise and other inaccuracies, and produces estimates of unknown variables that tend to be more accurate than those based on a single measurement alone, by estimating a joint probability distribution over the variables for each timeframe. The filter is named after Rudolf E. Kálmán, who was one of the primary developers of its theory.

Prediction

A prediction, or forecast, is a statement about a future event or data. They are often, but not always, based upon experience or knowledge. There is no universal agreement about the exact difference from "estimation"; different authors and disciplines ascribe different connotations.

Deconvolution

In mathematics, deconvolution is the operation inverse to convolution. Both operations are used in signal processing and image processing. For example, it may be possible to recover the original signal after a filter (convolution) by using a deconvolution method with a certain degree of accuracy. Because of measurement error in the recorded signal or image, it can be shown that the worse the signal-to-noise ratio (SNR), the worse the inversion of a filter will be; hence, inverting a filter is not always a good solution, as the error is amplified. Deconvolution offers a solution to this problem.

Time series

In mathematics, a time series is a series of data points indexed in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. Thus it is a sequence of discrete-time data. Examples of time series are heights of ocean tides, counts of sunspots, and the daily closing value of the Dow Jones Industrial Average.

In signal processing, the Wiener filter is a filter used to produce an estimate of a desired or target random process by linear time-invariant (LTI) filtering of an observed noisy process, assuming known stationary signal and noise spectra, and additive noise. The Wiener filter minimizes the mean square error between the estimated random process and the desired process.
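As a sketch of that idea, the non-causal frequency-domain form of the filter applies the gain H(f) = S_s(f) / (S_s(f) + S_n(f)) to the observed spectrum; the spectra used below are invented for illustration, since in practice they are assumed known.

    import numpy as np

    def wiener_filter(observed, S_signal, S_noise):
        """Apply the non-causal Wiener gain H(f) = S_s / (S_s + S_n)
        to a noisy observation, given signal and noise power spectra
        sampled on the same rFFT frequency grid as the observation."""
        H = S_signal / (S_signal + S_noise)   # minimum-MSE gain per frequency
        return np.fft.irfft(H * np.fft.rfft(observed), n=len(observed))

    # Usage with an assumed low-pass signal spectrum and flat noise spectrum.
    n = 512
    clean = np.sin(2 * np.pi * 5 * np.arange(n) / n)
    noisy = clean + 0.5 * np.random.randn(n)
    f = np.fft.rfftfreq(n)
    S_s = 1.0 / (1.0 + (f / 0.02) ** 2)   # assumed signal power spectrum
    S_n = np.full_like(f, 0.25)           # assumed white-noise power level
    estimate = wiener_filter(noisy, S_s, S_n)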

Data assimilation is a mathematical discipline that seeks to optimally combine theory with observations. There may be a number of different goals sought – for example, to determine the optimal state estimate of a system, to determine initial conditions for a numerical forecast model, to interpolate sparse observation data using knowledge of the system being observed, to set numerical parameters based on training a model from observed data. Depending on the goal, different solution methods may be used. Data assimilation is distinguished from other forms of machine learning, image analysis, and statistical methods in that it utilizes a dynamical model of the system being analyzed.

In probability theory, statistics, and machine learning, recursive Bayesian estimation, also known as a Bayes filter, is a general probabilistic approach for estimating an unknown probability density function (PDF) recursively over time using incoming measurements and a mathematical process model. The process relies heavily upon mathematical concepts and models that are theorized within a study of prior and posterior probabilities known as Bayesian statistics.

In applied mathematics, the Wiener–Khinchin theorem, also known as the Wiener–Khintchine theorem and sometimes as the Wiener–Khinchin–Einstein theorem or the Khinchin–Kolmogorov theorem, states that the autocorrelation function of a wide-sense-stationary random process has a spectral decomposition given by the power spectrum of that process.

In the theory of stochastic processes, the filtering problem is a mathematical model for a number of state estimation problems in signal processing and related fields. The general idea is to establish a "best estimate" for the true value of some system from an incomplete, potentially noisy set of observations on that system. The problem of optimal non-linear filtering was solved by Ruslan L. Stratonovich; see also the work of Harold J. Kushner and of Moshe Zakai, who introduced a simplified dynamics for the unnormalized conditional law of the filter known as the Zakai equation. The solution, however, is infinite-dimensional in the general case. Certain approximations and special cases are well understood: for example, the linear filters are optimal for Gaussian random variables, and are known as the Wiener filter and the Kalman–Bucy filter. More generally, since the solution is infinite-dimensional, it requires finite-dimensional approximations to be implemented in a computer with finite memory. A finite-dimensional approximate nonlinear filter may be based more on heuristics, such as the extended Kalman filter or the assumed density filters, or be more methodologically oriented, such as the projection filters, some sub-families of which have been shown to coincide with the assumed density filters.

In estimation theory, the extended Kalman filter (EKF) is the nonlinear version of the Kalman filter which linearizes about an estimate of the current mean and covariance. In the case of well-defined transition models, the EKF has been considered the de facto standard in the theory of nonlinear state estimation, navigation systems, and GPS.

In time series analysis — as conducted in statistics, signal processing, and many other fields — the innovation is the difference between the observed value of a variable at time t and the optimal forecast of that value based on information available prior to time t. If the forecasting method is working correctly, successive innovations are uncorrelated with each other, i.e., constitute a white noise time series. Thus it can be said that the innovation time series is obtained from the measurement time series by a process of 'whitening', or removing the predictable component. The use of the term innovation in the sense described here is due to Hendrik Bode and Claude Shannon (1950) in their discussion of the Wiener filter problem, although the notion was already implicit in the work of Kolmogorov.

Kolmogorov–Zurbenko filter

Within statistics, the Kolmogorov–Zurbenko (KZ) filter was first proposed by A. N. Kolmogorov and formally defined by Zurbenko. It is a series of k iterations of a moving-average filter of length m, where m is a positive, odd integer. The KZ filter belongs to the class of low-pass filters. It has two parameters: the length m of the moving-average window and the number of iterations k of the moving average itself. It can also be considered a special window function designed to eliminate spectral leakage.
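A minimal sketch of that iteration (the parameter values m and k below are illustrative only):

    import numpy as np

    def kz_filter(x, m=5, k=3):
        """Kolmogorov-Zurbenko filter: k iterations of a length-m
        (m odd) moving-average filter applied to the series x."""
        kernel = np.ones(m) / m
        y = np.asarray(x, dtype=float)
        for _ in range(k):                          # iterate the moving average
            y = np.convolve(y, kernel, mode="same")
        return y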

Moving horizon estimation (MHE) is an optimization approach that uses a series of measurements observed over time, containing noise and other inaccuracies, and produces estimates of unknown variables or parameters. Unlike deterministic approaches, MHE requires an iterative approach that relies on linear programming or nonlinear programming solvers to find a solution.

Gopinath Kallianpur (1925–2015) was an Indian American mathematician and statistician who became the first director of the Indian Statistical Institute (1976–79) under its new Memorandum of Association. During his tenure as the director the new centre of ISI at Bangalore, Karnataka was founded.

Generalized filtering is a generic Bayesian filtering scheme for nonlinear state-space models. It is based on a variational principle of least action, formulated in generalized coordinates of motion. Note that "generalized coordinates of motion" are related to—but distinct from—generalized coordinates as used in (multibody) dynamical systems analysis. Generalized filtering furnishes posterior densities over hidden states generating observed data using a generalized gradient descent on variational free energy, under the Laplace assumption. Unlike classical filtering, generalized filtering eschews Markovian assumptions about random fluctuations. Furthermore, it operates online, assimilating data to approximate the posterior density over unknown quantities, without the need for a backward pass. Special cases include variational filtering, dynamic expectation maximization and generalized predictive coding.

Outline of machine learning

The following outline is provided as an overview of and topical guide to machine learning. Machine learning is a subfield of soft computing within computer science that evolved from the study of pattern recognition and computational learning theory in artificial intelligence. In 1959, Arthur Samuel defined machine learning as a "field of study that gives computers the ability to learn without being explicitly programmed". Machine learning explores the study and construction of algorithms that can learn from and make predictions on data. Such algorithms operate by building a model from an example training set of input observations in order to make data-driven predictions or decisions expressed as outputs, rather than following strictly static program instructions.

Probabilistic numerics is a scientific field at the intersection of statistics, machine learning and applied mathematics, where tasks in numerical analysis including finding numerical solutions for integration, linear algebra, optimisation and differential equations are seen as problems of statistical, probabilistic, or Bayesian inference.

References

  1. Wiener, Norbert (1942). Extrapolation, Interpolation and Smoothing of Stationary Time Series. A wartime classified report nicknamed "the yellow peril" because of the color of the cover and the difficulty of the subject. Published postwar by MIT Press, 1949. http://www.isss.org/lumwiener.htm
  2. Wiener, Norbert (1949). Extrapolation, Interpolation, and Smoothing of Stationary Time Series. New York: Wiley. ISBN 0-262-73005-7.
  3. Särkkä, Simo (2013). Bayesian Filtering and Smoothing. Cambridge University Press. ISBN 978-1107619289.