Auxiliary particle filter

The auxiliary particle filter is a particle filtering algorithm introduced by Pitt and Shephard in 1999 to remedy some deficiencies of the sequential importance resampling (SIR) algorithm when dealing with tailed observation densities.

Motivation

Particle filters approximate a continuous random variable $x$ by $M$ particles with discrete probability masses, say $1/M$ each for a uniform weighting. The randomly sampled particles can be used to approximate the probability density function of the continuous random variable as $M \to \infty$.

The empirical prediction density is produced as the weighted summation of these particles: [1]

$$\widehat{f}(x_{t+1}\mid Y_t) = \sum_{j=1}^{M} f\bigl(x_{t+1}\mid x_t^{(j)}\bigr)\,\frac{1}{M},$$

and we can view it as the "prior" density. Note that the particles are assumed to have the same weight $1/M$.

Combining the prior density $\widehat{f}(x_{t+1}\mid Y_t)$ and the likelihood $f(y_{t+1}\mid x_{t+1})$, the empirical filtering density can be produced as

$$\widehat{f}(x_{t+1}\mid Y_{t+1}) \propto f(y_{t+1}\mid x_{t+1})\,\widehat{f}(x_{t+1}\mid Y_t) = \sum_{j=1}^{M} f(y_{t+1}\mid x_{t+1})\, f\bigl(x_{t+1}\mid x_t^{(j)}\bigr)\,\frac{1}{M},$$

where $Y_t = (y_1, \dots, y_t)$ denotes the observations up to time $t$.

On the other hand, the true filtering density that we want to estimate is

$$f(x_{t+1}\mid Y_{t+1}) = \frac{f(y_{t+1}\mid x_{t+1})\, f(x_{t+1}\mid Y_t)}{f(y_{t+1}\mid Y_t)}.$$

The prior density $\widehat{f}(x_{t+1}\mid Y_t)$ can be used to approximate the true filtering density $f(x_{t+1}\mid Y_{t+1})$:

$$f(x_{t+1}\mid Y_{t+1}) \approx \widehat{f}(x_{t+1}\mid Y_{t+1}) \propto f(y_{t+1}\mid x_{t+1})\,\widehat{f}(x_{t+1}\mid Y_t).$$

The weaknesses of such particle filters include the following:

- The particles are drawn from the prior (prediction) density without using the new observation $y_{t+1}$, so when the likelihood $f(y_{t+1}\mid x_{t+1})$ lies in the tail of the prior, or is sharply peaked relative to it, most particles fall in regions of low likelihood.
- In that case most importance weights are close to zero, the effective number of particles collapses, and a very large $M$ is needed to approximate $f(x_{t+1}\mid Y_{t+1})$ accurately.

Therefore, the auxiliary particle filter is proposed to solve this problem.
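A small numerical experiment makes the degeneracy concrete. The following sketch assumes a one-dimensional Gaussian random-walk transition and a narrow Gaussian observation density (both illustrative choices, not from the original description), propagates particles blindly from the prior, and measures the effective sample size when the observation falls in the tail of the prediction density:

```python
import numpy as np

# Toy illustration of SIR weight degeneracy: an assumed Gaussian random-walk
# transition with a narrow Gaussian observation density, and an observation
# that falls in the tail of the prediction density.
rng = np.random.default_rng(0)

M = 1000
x_prev = rng.normal(0.0, 1.0, size=M)            # particles approximating f(x_t | Y_t)
x_pred = x_prev + rng.normal(0.0, 1.0, size=M)   # blind propagation through f(x_{t+1} | x_t)

obs_sigma = 0.5
y_next = 5.0                                     # outlying observation

# Importance weights proportional to the likelihood f(y_{t+1} | x_{t+1}),
# computed in log space to avoid underflow.
log_w = -0.5 * ((y_next - x_pred) / obs_sigma) ** 2
w = np.exp(log_w - log_w.max())
w /= w.sum()

# Effective sample size: near M for even weights, near 1 under severe degeneracy.
ess = 1.0 / np.sum(w ** 2)
print(f"Effective sample size: {ess:.1f} of {M} particles")
```

With these illustrative values, only a handful of particles carry non-negligible weight, which is exactly the situation the auxiliary particle filter is designed to avoid.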

Auxiliary particle filter

Auxiliary variable

Compared with the empirical filtering density, which satisfies

$$\widehat{f}(x_{t+1}\mid Y_{t+1}) \propto f(y_{t+1}\mid x_{t+1})\,\widehat{f}(x_{t+1}\mid Y_t),$$

we now define

$$\widehat{f}(x_{t+1}, k \mid Y_{t+1}) \propto f(y_{t+1}\mid x_{t+1})\, f\bigl(x_{t+1}\mid x_t^{(k)}\bigr)\,\frac{1}{M},$$

where $k \in \{1, \dots, M\}$ indexes the particles.

Being aware that $\widehat{f}(x_{t+1}\mid Y_{t+1})$ is formed by the summation of $M$ particles, the auxiliary variable $k$ represents one specific particle. With the aid of $k$, we can form a set of samples that has the distribution $\widehat{f}(x_{t+1}, k \mid Y_{t+1})$. Then, we draw from this sample set instead of directly from $\widehat{f}(x_{t+1}\mid Y_{t+1})$. In other words, the samples are drawn from $\widehat{f}(x_{t+1}\mid Y_{t+1})$ with different probabilities. The samples are ultimately utilized to approximate $f(x_{t+1}\mid Y_{t+1})$.
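Summing the joint density over the auxiliary index recovers the empirical filtering density, so drawing pairs $(x_{t+1}, k)$ and then discarding $k$ yields samples distributed according to $\widehat{f}(x_{t+1}\mid Y_{t+1})$:

$$\sum_{k=1}^{M} \widehat{f}(x_{t+1}, k \mid Y_{t+1}) \;\propto\; f(y_{t+1}\mid x_{t+1}) \sum_{k=1}^{M} f\bigl(x_{t+1}\mid x_t^{(k)}\bigr)\,\frac{1}{M} \;=\; f(y_{t+1}\mid x_{t+1})\,\widehat{f}(x_{t+1}\mid Y_t) \;\propto\; \widehat{f}(x_{t+1}\mid Y_{t+1}).$$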

Take the SIR method for example: the original particle filters draw samples from the prior density, while the auxiliary particle filters draw from the joint distribution of the prior density and the likelihood. In other words, the auxiliary particle filters avoid the circumstance in which the particles are generated in regions with low likelihood. As a result, the samples can approximate $f(x_{t+1}\mid Y_{t+1})$ more precisely.

Selection of the auxiliary variable

The selection of the auxiliary variable affects and controls the distribution of the samples. A possible choice is

$$\widehat{g}(x_{t+1}, k \mid Y_{t+1}) \propto f\bigl(y_{t+1}\mid \mu_{t+1}^{(k)}\bigr)\, f\bigl(x_{t+1}\mid x_t^{(k)}\bigr)\,\frac{1}{M},$$

where $k \in \{1, \dots, M\}$ and $\mu_{t+1}^{(k)}$ is the mean of the transition density $f\bigl(x_{t+1}\mid x_t^{(k)}\bigr)$, i.e. $\mu_{t+1}^{(k)} = \operatorname{E}\bigl[x_{t+1}\mid x_t^{(k)}\bigr]$.

We sample from $\widehat{g}(x_{t+1}, k \mid Y_{t+1})$ to approximate $f(x_{t+1}\mid Y_{t+1})$ by the following procedure:

1. Draw the auxiliary index $k^{(i)}$ with probability proportional to $f\bigl(y_{t+1}\mid \mu_{t+1}^{(k)}\bigr)$.
2. Draw the sample $x_{t+1}^{(i)}$ from the transition density $f\bigl(x_{t+1}\mid x_t^{(k^{(i)})}\bigr)$.
3. Reweight each sample with $w^{(i)} \propto f\bigl(y_{t+1}\mid x_{t+1}^{(i)}\bigr) \big/ f\bigl(y_{t+1}\mid \mu_{t+1}^{(k^{(i)})}\bigr)$ to correct for the use of $\mu_{t+1}^{(k)}$ in place of $x_{t+1}$.

Following this procedure, we draw samples from $\widehat{g}(x_{t+1}, k \mid Y_{t+1})$. Since the index $k$ is chosen according to the conditional likelihood $f\bigl(y_{t+1}\mid \mu_{t+1}^{(k)}\bigr)$ of the predicted mean, particles are propagated mainly from ancestors that are compatible with the new observation. As a result, the sampling procedure is more efficient and the number of particles $M$ required can be reduced.
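The reweighting in the procedure above follows from treating $\widehat{g}$ as an importance-sampling proposal for the target $\widehat{f}$: the transition-density factors cancel, leaving

$$w^{(i)} \;\propto\; \frac{\widehat{f}\bigl(x_{t+1}^{(i)}, k^{(i)} \mid Y_{t+1}\bigr)}{\widehat{g}\bigl(x_{t+1}^{(i)}, k^{(i)} \mid Y_{t+1}\bigr)} \;=\; \frac{f\bigl(y_{t+1}\mid x_{t+1}^{(i)}\bigr)}{f\bigl(y_{t+1}\mid \mu_{t+1}^{(k^{(i)})}\bigr)}.$$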

Other point of view

Assume that the filtered posterior is described by the following $M$ weighted samples:

$$p(x_t \mid y_{1:t}) \approx \sum_{i=1}^{M} \omega_t^{(i)}\, \delta\bigl(x_t - x_t^{(i)}\bigr).$$

Then, each step in the algorithm consists of first drawing a sample of the particle index $k^{(i)}$ which will be propagated from step $t$ into the new step $t+1$. These indexes are auxiliary variables only used as an intermediary step, hence the name of the algorithm. The indexes are drawn according to the likelihood of some reference point $\mu_{t+1}^{(i)}$ which in some way is related to the transition model $p(x_{t+1}\mid x_t)$ (for example, the mean, a sample, etc.):

$$k^{(i)} \sim P(i) \propto \omega_t^{(i)}\, p\bigl(y_{t+1}\mid \mu_{t+1}^{(i)}\bigr).$$

This is repeated for $i = 1, \dots, M$, and using these indexes we can now draw the conditional samples:

$$x_{t+1}^{(i)} \sim p\bigl(x_{t+1} \mid x_t^{(k^{(i)})}\bigr).$$

Finally, the weights are updated to account for the mismatch between the likelihood at the actual sample and at the predicted point $\mu_{t+1}^{(k^{(i)})}$:

$$\omega_{t+1}^{(i)} \propto \frac{p\bigl(y_{t+1}\mid x_{t+1}^{(i)}\bigr)}{p\bigl(y_{t+1}\mid \mu_{t+1}^{(k^{(i)})}\bigr)}.$$
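Putting the three steps together, the following is a minimal sketch of a single auxiliary-particle-filter step, written for an assumed one-dimensional Gaussian random-walk transition with Gaussian observation noise; the model, parameter values, and the helper `apf_step` are illustrative assumptions rather than a canonical implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

def gaussian_logpdf(y, mean, sigma):
    """Log of a Gaussian density, used as the assumed observation model p(y | x)."""
    return -0.5 * ((y - mean) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

def apf_step(particles, weights, y_next, trans_sigma=1.0, obs_sigma=0.5):
    """One auxiliary-particle-filter step for a toy Gaussian random-walk model.

    particles, weights : current weighted sample {x_t^(i), w_t^(i)}
    y_next             : new observation y_{t+1}
    """
    M = len(particles)

    # Reference points mu_{t+1}^(i): the mean of the transition density,
    # which for a random walk is the current particle itself.
    mu = particles

    # First stage: draw auxiliary indexes k^(i) with probability
    # proportional to w_t^(i) * p(y_{t+1} | mu_{t+1}^(i)).
    log_first = np.log(weights) + gaussian_logpdf(y_next, mu, obs_sigma)
    first = np.exp(log_first - log_first.max())
    first /= first.sum()
    k = rng.choice(M, size=M, p=first)

    # Second stage: propagate the selected ancestors through the transition model.
    new_particles = particles[k] + trans_sigma * rng.standard_normal(M)

    # Reweight: likelihood at the actual sample over likelihood at the predicted point.
    log_w = (gaussian_logpdf(y_next, new_particles, obs_sigma)
             - gaussian_logpdf(y_next, mu[k], obs_sigma))
    new_weights = np.exp(log_w - log_w.max())
    new_weights /= new_weights.sum()
    return new_particles, new_weights

# Usage: start from an equally weighted prior sample and assimilate one observation.
M = 500
particles = rng.normal(0.0, 1.0, size=M)
weights = np.full(M, 1.0 / M)
particles, weights = apf_step(particles, weights, y_next=2.5)
print("Posterior mean estimate:", np.sum(weights * particles))
```

The two-stage structure mirrors the equations above: the first-stage weights select promising ancestors using the reference points, and the second-stage weights correct for evaluating the likelihood at those reference points rather than at the propagated samples.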


References

  1. Pitt, Michael K.; Shephard, Neil (1999). "Filtering via Simulation: Auxiliary Particle Filters". Journal of the American Statistical Association. 94 (446): 590–599.
