Doubly stochastic model

In statistics, a doubly stochastic model is a type of model that can arise in many contexts, but in particular in the modelling of time series and stochastic processes.

The basic idea for a doubly stochastic model is that an observed random variable is modelled in two stages. In the first stage, the distribution of the observed outcome is represented in a fairly standard way using one or more parameters. In the second stage, some of these parameters (often only one) are treated as being themselves random variables. In a univariate context this is essentially the same as the well-known concept of compound distributions. In the more general case of doubly stochastic models, many values in a time series or stochastic process are affected simultaneously by the underlying parameters: either a single parameter affects many outcome variates, or the underlying parameter is itself treated as a time series or stochastic process in its own right.
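As an illustration of the two-stage idea, the following minimal Python sketch (assuming NumPy is available; the shape and scale values are invented) draws a Gamma-distributed rate at the second stage and, given that rate, a Poisson-distributed count at the first stage; marginally the count then follows a compound (negative binomial) distribution.

    import numpy as np

    rng = np.random.default_rng(0)

    # Stage 2: the Poisson rate is itself a random variable (Gamma-distributed).
    shape, scale = 3.0, 2.0
    rates = rng.gamma(shape, scale, size=100_000)

    # Stage 1: given its rate, each observation is Poisson-distributed.
    counts = rng.poisson(rates)

    # Marginally the counts follow a negative binomial (a compound distribution).
    print(counts.mean(), shape * scale)               # both close to 6
    print(counts.var(), shape * scale * (1 + scale))  # extra variance from the random rate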

The idea is broadly similar to that used in latent variable models, except that here the quantities playing the role of latent variables usually have an underlying dependence structure related to the time-series or spatial context.

An example of a doubly stochastic model is the following:[1] the observed values in a point process might be modelled as a Poisson process in which the rate (the relevant underlying parameter) is treated as the exponential of a Gaussian process.
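A minimal Python sketch of this example (assuming NumPy; the grid size, covariance length-scale and other values are invented) approximates a Gaussian process on a fine grid of an interval, exponentiates it to obtain a random rate, and then samples Poisson counts cell by cell.

    import numpy as np

    rng = np.random.default_rng(1)

    # Discretize [0, T] into small cells.
    T, n = 10.0, 500
    t = np.linspace(0.0, T, n)
    dt = T / n

    # Draw one realization of a Gaussian process (squared-exponential covariance).
    cov = np.exp(-0.5 * (t[:, None] - t[None, :]) ** 2 / 1.0**2)
    g = rng.multivariate_normal(np.zeros(n), cov + 1e-9 * np.eye(n))

    # The Poisson rate is the exponential of the Gaussian process.
    rate = np.exp(g)

    # Given the rate, the number of points in each cell is Poisson.
    counts = rng.poisson(rate * dt)
    print("total points in one realization:", counts.sum())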

Related Research Articles

In probability theory and statistics, a probability distribution is a mathematical function that provides the probabilities of occurrence of different possible outcomes in an experiment. In more technical terms, the probability distribution is a description of a random phenomenon in terms of the probabilities of events. For instance, if the random variable X is used to denote the outcome of a coin toss, then the probability distribution of X would take the value 0.5 for X = heads, and 0.5 for X = tails. Examples of random phenomena can include the results of an experiment or survey.

A statistical model is a mathematical model that embodies a set of statistical assumptions concerning the generation of sample data. A statistical model represents, often in considerably idealized form, the data-generating process.

White noise: a random signal having equal intensity at different frequencies, giving it a constant power spectral density

In signal processing, white noise is a random signal having equal intensity at different frequencies, giving it a constant power spectral density. The term is used, with this or similar meanings, in many scientific and technical disciplines, including physics, acoustical engineering, telecommunications, and statistical forecasting. White noise refers to a statistical model for signals and signal sources, rather than to any specific signal. White noise draws its name from white light, although light that appears white generally does not have a flat power spectral density over the visible band.
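A minimal Python sketch (assuming NumPy) generates Gaussian white noise and checks that the estimated power is roughly the same at low and high frequencies, i.e. that the spectrum is approximately flat.

    import numpy as np

    rng = np.random.default_rng(2)

    # Gaussian white noise: independent, identically distributed samples.
    x = rng.standard_normal(2**16)

    # Estimate power at low vs. high frequencies via the periodogram.
    power = np.abs(np.fft.rfft(x)) ** 2 / x.size
    low, high = power[1:1000].mean(), power[-1000:].mean()
    print(low, high)  # roughly equal: the spectrum is approximately flat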

A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process, X, with unobservable ("hidden") states. An HMM assumes that there is another process, Y, whose behavior "depends" on X. The goal is to learn about X by observing Y. The HMM stipulates that, at each time instant, the conditional probability distribution of the observation given the full history of the hidden process must depend only on the current hidden state.
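A minimal Python sketch (assuming NumPy; the transition and emission matrices are invented) samples a hidden state sequence X and observations Y whose distribution depends only on the current hidden state.

    import numpy as np

    rng = np.random.default_rng(3)

    # Hidden Markov chain X with two states; observations Y with three symbols.
    A = np.array([[0.9, 0.1],          # state transition probabilities
                  [0.2, 0.8]])
    B = np.array([[0.6, 0.3, 0.1],     # emission probabilities per hidden state
                  [0.1, 0.3, 0.6]])

    x, states, obs = 0, [], []
    for _ in range(20):
        x = rng.choice(2, p=A[x])      # hidden: Markov transition
        y = rng.choice(3, p=B[x])      # observed: depends only on current x
        states.append(x)
        obs.append(y)

    print(states)
    print(obs)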

Logistic regression: a statistical model for a binary dependent variable

In statistics, the logistic model is used to model the probability of a certain class or event, such as pass/fail, win/lose, alive/dead or healthy/sick. This can be extended to model several classes of events, such as determining whether an image contains a cat, dog, lion, etc. Each object being detected in the image would be assigned a probability between 0 and 1, with the probabilities summing to one.
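A minimal Python sketch (assuming NumPy; the coefficients are invented, not fitted) shows how a linear predictor passed through the logistic function yields a probability between 0 and 1, with the two class probabilities summing to one.

    import numpy as np

    def logistic(z):
        # Maps any real-valued linear predictor to a probability in (0, 1).
        return 1.0 / (1.0 + np.exp(-z))

    # Hypothetical fitted coefficients: intercept and one explanatory variable.
    beta0, beta1 = -3.0, 0.8
    x = 5.0                               # value of the explanatory variable
    p_pass = logistic(beta0 + beta1 * x)  # probability of the "pass" outcome
    print(p_pass, 1.0 - p_pass)           # the two probabilities sum to one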

In mathematics and statistics, a stationary process is a stochastic process whose unconditional joint probability distribution does not change when shifted in time. Consequently, parameters such as mean and variance also do not change over time.
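A minimal Python sketch (assuming NumPy) of a stationary AR(1) process: after a burn-in, the mean and variance estimated over different stretches of time are essentially the same.

    import numpy as np

    rng = np.random.default_rng(4)

    # AR(1): x[t] = phi * x[t-1] + e[t], stationary because |phi| < 1.
    phi, n = 0.7, 200_000
    e = rng.standard_normal(n)
    x = np.empty(n)
    x[0] = 0.0
    for t in range(1, n):
        x[t] = phi * x[t - 1] + e[t]

    first, second = x[1000:100_000], x[100_000:]
    print(first.mean(), second.mean())  # both near 0
    print(first.var(), second.var())    # both near 1 / (1 - phi**2), about 1.96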

Time series: a sequence of data points over time

A time series is a series of data points indexed in time order. Most commonly, a time series is a sequence taken at successive equally spaced points in time. Thus it is a sequence of discrete-time data. Examples of time series are heights of ocean tides, counts of sunspots, and the daily closing value of the Dow Jones Industrial Average.

In statistics, Gibbs sampling or a Gibbs sampler is a Markov chain Monte Carlo (MCMC) algorithm for obtaining a sequence of observations which are approximately from a specified multivariate probability distribution, when direct sampling is difficult. This sequence can be used to approximate the joint distribution; to approximate the marginal distribution of one of the variables, or of some subset of the variables; or to compute an integral. Typically, some of the variables correspond to observations whose values are known, and hence do not need to be sampled.
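A minimal Python sketch (assuming NumPy) of a Gibbs sampler for a bivariate normal target with correlation 0.8: drawing alternately from the two conditional distributions produces a sequence approximately distributed according to the joint distribution.

    import numpy as np

    rng = np.random.default_rng(5)

    # Target: bivariate normal with zero means, unit variances, correlation rho.
    rho = 0.8
    x, y = 0.0, 0.0
    samples = []
    for _ in range(50_000):
        # Each full conditional of a bivariate normal is itself normal.
        x = rng.normal(rho * y, np.sqrt(1 - rho**2))
        y = rng.normal(rho * x, np.sqrt(1 - rho**2))
        samples.append((x, y))

    samples = np.array(samples[1000:])   # discard burn-in
    print(np.corrcoef(samples.T)[0, 1])  # close to 0.8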

Mathematical statistics: the branch of statistics that uses mathematical methods

Mathematical statistics is the application of probability theory, a branch of mathematics, to statistics, as opposed to techniques for collecting statistical data. Specific mathematical techniques which are used for this include mathematical analysis, linear algebra, stochastic analysis, differential equations, and measure theory.

In mathematical finance, a Monte Carlo option model uses Monte Carlo methods to calculate the value of an option with multiple sources of uncertainty or with complicated features. The first application to option pricing was by Phelim Boyle in 1977. In 1996, M. Broadie and P. Glasserman showed how to price Asian options by Monte Carlo. An important development was the introduction in 1996 by Carriere of Monte Carlo methods for options with early exercise features.
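A minimal Python sketch (assuming NumPy; the contract and market parameters are invented) illustrates the general method for a European call: simulate terminal prices under geometric Brownian motion and average the discounted payoff. It does not reproduce any of the specific models cited above.

    import numpy as np

    rng = np.random.default_rng(6)

    # Hypothetical contract and market parameters.
    s0, k, r, sigma, t = 100.0, 105.0, 0.02, 0.25, 1.0

    # Simulate terminal prices under risk-neutral geometric Brownian motion.
    z = rng.standard_normal(1_000_000)
    st = s0 * np.exp((r - 0.5 * sigma**2) * t + sigma * np.sqrt(t) * z)

    # Discounted average payoff estimates the value of a European call.
    payoff = np.maximum(st - k, 0.0)
    price = np.exp(-r * t) * payoff.mean()
    print(round(price, 2))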

"Stochastic" means being or having a random variable. A stochastic model is a tool for estimating probability distributions of potential outcomes by allowing for random variation in one or more inputs over time. The random variation is usually based on fluctuations observed in historical data for a selected period using standard time-series techniques. Distributions of potential outcomes are derived from a large number of simulations which reflect the random variation in the input(s).

Multinomial logistic regression: regression for more than two discrete outcomes

In statistics, multinomial logistic regression is a classification method that generalizes logistic regression to multiclass problems, i.e. with more than two possible discrete outcomes. That is, it is a model that is used to predict the probabilities of the different possible outcomes of a categorically distributed dependent variable, given a set of independent variables.
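A minimal Python sketch (assuming NumPy; the weights are invented, not fitted) of the multinomial logistic (softmax) model: one linear predictor per class, normalized so that the class probabilities are positive and sum to one.

    import numpy as np

    def softmax(scores):
        # Subtracting the maximum is a standard trick for numerical stability.
        e = np.exp(scores - scores.max())
        return e / e.sum()

    # Hypothetical linear predictors for three classes (e.g. cat / dog / lion).
    x = np.array([1.0, -0.5])         # explanatory variables
    W = np.array([[0.4, 1.2],         # one row of coefficients per class
                  [0.1, -0.3],
                  [-0.6, 0.8]])
    b = np.array([0.0, 0.5, -0.2])

    probs = softmax(W @ x + b)
    print(probs, probs.sum())         # probabilities sum to one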

Binomial regression

In statistics, binomial regression is a regression analysis technique in which the response has a binomial distribution: it is the number of successes in a series of independent Bernoulli trials, where each trial has probability of success p. In binomial regression, the probability of a success is related to explanatory variables: the corresponding concept in ordinary regression is to relate the mean value of the unobserved response to explanatory variables.

In probability theory, a Cox process, also known as a doubly stochastic Poisson process, is a point process which is a generalization of a Poisson process where the intensity that varies across the underlying mathematical space is itself a stochastic process. The process is named after the statistician David Cox, who first published the model in 1955.

Dirichlet process

In probability theory, Dirichlet processes are a family of stochastic processes whose realizations are probability distributions. In other words, a Dirichlet process is a probability distribution whose range is itself a set of probability distributions. It is often used in Bayesian inference to describe the prior knowledge about the distribution of random variables—how likely it is that the random variables are distributed according to one or another particular distribution.
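A minimal Python sketch (assuming NumPy) of the stick-breaking construction of a Dirichlet process: random weights obtained by repeatedly breaking a unit stick are attached to atoms drawn from a base distribution, giving one realization, which is itself a discrete probability distribution.

    import numpy as np

    rng = np.random.default_rng(7)

    # Stick-breaking construction, truncated at n_atoms terms.
    alpha, n_atoms = 2.0, 1000
    betas = rng.beta(1.0, alpha, size=n_atoms)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    weights = betas * remaining           # weights sum to (almost) one

    atoms = rng.standard_normal(n_atoms)  # base distribution: standard normal

    # One realization of the Dirichlet process: a random discrete distribution.
    print(weights.sum())                  # close to 1 for a long truncation
    draw = rng.choice(atoms, p=weights / weights.sum(), size=5)
    print(draw)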

A standard normal deviate is a normally distributed deviate. It is a realization of a standard normal random variable, defined as a random variable with expected value 0 and variance 1. Where collections of such random variables are used, there is often an associated assumption that members of such collections are statistically independent.

In the theory of stochastic processes in probability theory and statistics, a nuisance variable is a random variable that is fundamental to the probabilistic model, but that is of no particular interest in itself or is no longer of interest: one such usage arises for the Chapman–Kolmogorov equation. For example, a model for a stochastic process may be defined conceptually using intermediate variables that are not observed in practice. If the problem is to derive the theoretical properties, such as the mean, variance and covariances of quantities that would be observed, then the intermediate variables are nuisance variables.

In estimation theory in statistics, stochastic equicontinuity is a property of estimators that is useful in dealing with their asymptotic behaviour as the amount of data increases. It is a version of equicontinuity used in the context of functions of random variables: that is, random functions. The property relates to the rate of convergence of sequences of random variables and requires that this rate is essentially the same within a region of the parameter space being considered.

In probability, statistics and related fields, a Poisson point process is a type of random mathematical object that consists of points randomly located on a mathematical space. The Poisson point process is often called simply the Poisson process, but it is also called a Poisson random measure, Poisson random point field or Poisson point field. This point process has convenient mathematical properties, which has led to it being frequently defined in Euclidean space and used as a mathematical model for seemingly random processes in numerous disciplines such as astronomy, biology, ecology, geology, seismology, physics, economics, image processing, and telecommunications.
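A minimal Python sketch (assuming NumPy; the intensity and window are invented) of a homogeneous Poisson point process on a rectangle: the number of points is Poisson with mean equal to the intensity times the area, and, given that number, the points are placed uniformly.

    import numpy as np

    rng = np.random.default_rng(8)

    # Homogeneous Poisson point process with intensity lam on [0, 10] x [0, 5].
    lam, width, height = 1.5, 10.0, 5.0
    n_points = rng.poisson(lam * width * height)

    # Conditional on the count, the points are independent and uniform.
    xs = rng.uniform(0.0, width, size=n_points)
    ys = rng.uniform(0.0, height, size=n_points)
    print(n_points, "points; expected", lam * width * height)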

References

  1. Cox, D.R.; Isham, V. (1980). Point Processes. Chapman and Hall. p. 10. ISBN 978-0-412-21910-8.
