Innovation Method

In statistics, the innovation method provides an estimator for the parameters of stochastic differential equations (SDEs) given a time series of (potentially noisy) observations of the state variables. In the framework of continuous-discrete state space models, the innovation estimator is obtained by maximizing the log-likelihood of the corresponding discrete-time innovation process with respect to the parameters. The innovation estimator can be classified as an M-estimator, a quasi-maximum likelihood estimator or a prediction error estimator, depending on which inferential considerations are emphasized. The innovation method is a system identification technique for developing mathematical models of dynamical systems from measured data and for the optimal design of experiments.

Background

Stochastic differential equations (SDEs) have become an important mathematical tool for describing the time evolution of random phenomena in the natural, social and applied sciences. Statistical inference for SDEs is thus of great importance in applications for model building, model selection, model identification and forecasting. To carry out statistical inference for SDEs, measurements of the state variables of these random phenomena are indispensable. Usually, in practice, only a few of the state variables are measured, by physical devices that introduce random measurement errors (observational errors).

Mathematical model for inference

The innovation estimator [1] for SDEs is defined in the framework of continuous-discrete state space models. [2] These models arise as a natural mathematical representation of the temporal evolution of continuous random phenomena and their measurements at a succession of time instants. In the simplest formulation, these continuous-discrete models [2] are expressed in terms of an SDE of the form

$$d\mathbf{x}(t) = \mathbf{f}(t,\mathbf{x}(t);\theta)\,dt + \mathbf{g}(t,\mathbf{x}(t);\theta)\,d\mathbf{w}(t) \qquad (1)$$

describing the time evolution of the state variables $\mathbf{x}(t) \in \mathbb{R}^{d}$ of the phenomenon for all time instants $t \geq t_{0}$, and an observation equation

$$\mathbf{z}_{t_{k}} = \mathbf{C}\,\mathbf{x}(t_{k}) + \mathbf{e}_{t_{k}} \qquad (2)$$

describing the time series of measurements of at least one of the variables of the random phenomenon at the time instants $\{t_{k}\}_{k=0,\ldots,M-1}$. In the model (1)-(2), $\mathbf{f}$ and $\mathbf{g}$ are differentiable functions, $\mathbf{w}$ is an $m$-dimensional standard Wiener process, $\theta$ is a vector of parameters, $\{\mathbf{e}_{t_{k}}\}$ is a sequence of $r$-dimensional i.i.d. Gaussian random vectors independent of $\mathbf{w}$ with zero mean and $r \times r$ positive definite variance matrix $\Pi$, and $\mathbf{C}$ is an $r \times d$ matrix.

Statistical problem to solve

Once the dynamics of a phenomenon is described by a state equation such as (1) and the way the state variables are measured is specified by an observation equation such as (2), the inference problem to solve is the following: [1] [3] given $M$ partial and noisy observations $\mathbf{z}_{t_{0}},\ldots,\mathbf{z}_{t_{M-1}}$ of the stochastic process $\mathbf{x}$ on the observation times $t_{0},\ldots,t_{M-1}$, estimate the unobserved state variables of $\mathbf{x}$ and the unknown parameters $\theta$ in (1) that best fit the given observations.

Discrete-time innovation process

Let $\{t_{k}\}_{k=0,\ldots,M-1}$ be the sequence of observation times of the states of (1), and $Z_{k} = \{\mathbf{z}_{t_{0}},\ldots,\mathbf{z}_{t_{k}}\}$ the time series of partial and noisy measurements of $\mathbf{x}$ described by the observation equation (2).

Further, let $\mathbf{x}_{t_{k}/t_{k-1}} = E(\mathbf{x}(t_{k}) \mid Z_{k-1})$ and $\mathbf{P}_{t_{k}/t_{k-1}} = \operatorname{Var}(\mathbf{x}(t_{k}) \mid Z_{k-1})$ be the conditional mean and variance of $\mathbf{x}(t_{k})$ given $Z_{k-1}$, where $E(\cdot)$ denotes the expected value of random vectors.

The random sequence $\{\boldsymbol{\nu}_{t_{k}}\}_{k=1,\ldots,M-1}$ with

$$\boldsymbol{\nu}_{t_{k}} = \mathbf{z}_{t_{k}} - \mathbf{C}\,\mathbf{x}_{t_{k}/t_{k-1}} \qquad (3)$$

defines the discrete-time innovation process, [4] [1] [5] where $\boldsymbol{\nu}_{t_{k}}$ is proved to be an independent normally distributed random vector with zero mean and variance

$$\boldsymbol{\Sigma}_{t_{k}} = \mathbf{C}\,\mathbf{P}_{t_{k}/t_{k-1}}\mathbf{C}^{\intercal} + \Pi \qquad (4)$$

for small enough time distances $t_{k} - t_{k-1}$ between consecutive observations. In practice, [6] this distribution for the discrete-time innovation is valid when, with a suitable selection of both the number $M$ of observations and the time distance between consecutive observations, the time series of observations of the SDE contains the main information about the continuous-time process $\mathbf{x}$. That is, when the sampling of the continuous-time process has low distortion (aliasing) and when there is a suitable signal-to-noise ratio.
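For linear SDEs the conditional moments above are available in closed form, so the innovation sequence can be computed exactly by Kalman-type recursions. The following is a minimal sketch for a scalar Ornstein-Uhlenbeck process $dx = -\theta x\,dt + \sigma\,dw$ observed as $z_{k} = x(t_{k}) + e_{k}$ with equally spaced observations; the function name and interface are illustrative assumptions, not taken from any referenced software.

```python
import numpy as np

def ou_innovations(z, dt, theta, sigma, pi_obs, m0, P0):
    """Discrete-time innovations (3) and variances (4) for a scalar
    Ornstein-Uhlenbeck process dx = -theta*x dt + sigma dw observed as
    z_k = x(t_k) + e_k, e_k ~ N(0, pi_obs).  The OU transition moments
    are known in closed form, so these recursions are exact."""
    a = np.exp(-theta * dt)                      # exact state transition over one step
    q = sigma**2 * (1.0 - a**2) / (2.0 * theta)  # exact transition-noise variance
    m, P = m0, P0                                # filter moments at the previous time
    nu, S = np.empty(len(z)), np.empty(len(z))
    for k, zk in enumerate(z):
        m_pred = a * m                # conditional mean  x_{t_k/t_{k-1}}
        P_pred = a * a * P + q        # conditional variance P_{t_k/t_{k-1}}
        nu[k] = zk - m_pred           # innovation (3)
        S[k] = P_pred + pi_obs        # innovation variance (4)
        K = P_pred / S[k]             # Kalman gain
        m = m_pred + K * nu[k]        # filtered mean
        P = (1.0 - K) * P_pred        # filtered variance
    return nu, S
```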

Innovation estimator

The innovation estimator for the parameters $\theta$ of the SDE (1) is the one that maximizes the likelihood function of the discrete-time innovation process with respect to the parameters. [1] More precisely, given $M$ measurements $Z_{M} = \{\mathbf{z}_{t_{0}},\ldots,\mathbf{z}_{t_{M-1}}\}$ of the state space model (1)-(2) on the observation times $t_{0},\ldots,t_{M-1}$, the innovation estimator for the parameters of (1) is defined by

$$\widehat{\theta}_{M} = \arg\min_{\theta}\, U_{M}(\theta, Z_{M}), \qquad (5)$$

where

$$U_{M}(\theta, Z_{M}) = \sum_{k=1}^{M-1}\left(\ln\det(\boldsymbol{\Sigma}_{t_{k}}) + \boldsymbol{\nu}_{t_{k}}^{\intercal}\boldsymbol{\Sigma}_{t_{k}}^{-1}\boldsymbol{\nu}_{t_{k}}\right)$$

is, up to an additive constant, minus twice the log-likelihood of the innovation process, being $\boldsymbol{\nu}_{t_{k}}$ the discrete-time innovation (3) and $\boldsymbol{\Sigma}_{t_{k}}$ the innovation variance (4) of the model (1)-(2) at $t_{k}$, for all $k = 1,\ldots,M-1$. In the above expression, the conditional mean $\mathbf{x}_{t_{k}/t_{k-1}}$ and variance $\mathbf{P}_{t_{k}/t_{k-1}}$ are computed by the continuous-discrete filtering algorithm for the evolution of the moments (Section 6.4 in [2] ), for all $k = 1,\ldots,M-1$.
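Concretely, evaluating (5) reduces to a numerical optimization of the innovation-based objective over the parameter space. The sketch below, a non-authoritative illustration, reuses the hypothetical `ou_innovations` helper from the previous example together with a generic optimizer from SciPy:

```python
import numpy as np
from scipy.optimize import minimize

def innovation_objective(params, z, dt, pi_obs, m0, P0):
    """U_M in (5): minus twice the innovation log-likelihood, up to an
    additive constant, i.e. sum_k [ log Sigma_k + nu_k^2 / Sigma_k ]."""
    theta, sigma = params
    nu, S = ou_innovations(z, dt, theta, sigma, pi_obs, m0, P0)
    return np.sum(np.log(S) + nu**2 / S)

# The innovation estimator is the minimizer of the objective, e.g.:
# res = minimize(innovation_objective, x0=[1.0, 1.0],
#                args=(z, dt, 0.01, 0.0, 1.0),
#                bounds=[(1e-6, None), (1e-6, None)])
# theta_hat, sigma_hat = res.x
```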

Differences with the maximum likelihood estimator

The maximum likelihood estimator of the parameters $\theta$ in the model (1)-(2) involves the evaluation of the - usually unknown - transition density function between the states of the diffusion process $\mathbf{x}$ at consecutive observation times. [7] Instead of this, the innovation estimator (5) is obtained by maximizing the likelihood of the discrete-time innovation process, taking into account that the $\boldsymbol{\nu}_{t_{k}}$ are Gaussian and independent random vectors. Remarkably, whereas the transition density function of $\mathbf{x}$ changes when the SDE for $\mathbf{x}$ does, the density function of the innovation process remains Gaussian independently of the SDE for $\mathbf{x}$. Only in the case that the diffusion $\mathbf{x}$ is described by a linear SDE with additive noise is its transition density function Gaussian, and then the maximum likelihood and the innovation estimator coincide. [5] Otherwise, [5] the innovation estimator is an approximation to the maximum likelihood estimator and, in this sense, the innovation estimator is a quasi-maximum likelihood estimator. In addition, the innovation method is a particular instance of the prediction error method according to the definition given in [8]. Therefore, the asymptotic results obtained there for that general class of estimators are valid for the innovation estimators. [1] [9] [10] Intuitively, following the typical control engineering viewpoint, it is expected that the innovation process - viewed as a measure of the prediction errors of the fitted model - be approximately a white noise process when the model fits the data, [11] [3] which can be used as a practical tool for the design of models and for optimal experimental design. [6]
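This relation can be made explicit through the prediction error decomposition of the likelihood: the joint density of the observations factorizes exactly into one-step-ahead conditional densities, and the innovation method replaces each, generally intractable, factor by a Gaussian density with the innovation moments (3)-(4):

$$p(\mathbf{z}_{t_{1}},\ldots,\mathbf{z}_{t_{M-1}} \mid \mathbf{z}_{t_{0}};\theta) = \prod_{k=1}^{M-1} p(\mathbf{z}_{t_{k}} \mid Z_{k-1};\theta) \approx \prod_{k=1}^{M-1} \mathcal{N}\big(\mathbf{z}_{t_{k}};\ \mathbf{C}\,\mathbf{x}_{t_{k}/t_{k-1}},\ \boldsymbol{\Sigma}_{t_{k}}\big),$$

with equality instead of approximation when (1) is linear with additive noise.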

Properties

The innovation estimator (5) has a number of important attributes:

  • It is asymptotically normal and, consequently, approximate confidence limits for the parameters can be constructed as

$$\widehat{\theta}_{M} \pm t_{1-\alpha,\,M-p}\,\sqrt{\operatorname{Var}(\widehat{\theta}_{M})},$$

where $t_{1-\alpha,\,M-p}$ is the Student's t distribution with significance level $\alpha$ and $M-p$ degrees of freedom, $p$ being the dimension of $\theta$. Here, $\operatorname{Var}(\widehat{\theta}_{M})$ denotes the variance of the innovation estimator $\widehat{\theta}_{M}$, where

$$\operatorname{Var}(\widehat{\theta}_{M}) \approx \operatorname{diag}\big(\mathbf{I}^{-1}(\widehat{\theta}_{M})\big),$$

$\mathbf{I}(\widehat{\theta}_{M})$ is the Fisher information matrix of the innovation estimator of $\theta$, and

$$[\mathbf{I}]_{i,j} = \sum_{k=1}^{M-1}\left(\frac{\partial\boldsymbol{\nu}_{t_{k}}^{\intercal}}{\partial\theta_{i}}\,\boldsymbol{\Sigma}_{t_{k}}^{-1}\,\frac{\partial\boldsymbol{\nu}_{t_{k}}}{\partial\theta_{j}} + \frac{1}{2}\operatorname{tr}\!\left(\boldsymbol{\Sigma}_{t_{k}}^{-1}\frac{\partial\boldsymbol{\Sigma}_{t_{k}}}{\partial\theta_{i}}\,\boldsymbol{\Sigma}_{t_{k}}^{-1}\frac{\partial\boldsymbol{\Sigma}_{t_{k}}}{\partial\theta_{j}}\right)\right)$$

is the $(i,j)$ entry of the matrix $\mathbf{I}$, with $i$ and $j$ ranging over $1,\ldots,p$.

  • For smooth enough functions $\mathbf{h}$, nonlinear observation equations of the form

$$\mathbf{z}_{t_{k}} = \mathbf{h}(t_{k},\mathbf{x}(t_{k})) + \mathbf{e}_{t_{k}} \qquad (6)$$

can be transformed to the simpler one (2), and the innovation estimator (5) can be applied. [5]

Approximate innovation estimators

In practice, closed-form expressions for computing $\boldsymbol{\nu}_{t_{k}}$ and $\boldsymbol{\Sigma}_{t_{k}}$ in (5) are only available for a few models (1)-(2). Therefore, approximate filtering algorithms such as the following are used in applications.

Given $M$ measurements $Z_{M}$ and the initial filter estimates $\mathbf{y}_{t_{0}/t_{0}} = \mathbf{x}_{t_{0}/t_{0}}$ and $\mathbf{Q}_{t_{0}/t_{0}} = \mathbf{P}_{t_{0}/t_{0}}$, the approximate linear minimum variance (LMV) filter for the model (1)-(2) is iteratively defined at each observation time $t_{k+1}$ by the prediction estimates [2] [13]

$$\mathbf{y}_{t_{k+1}/t_{k}} = E(\mathbf{y}(t_{k+1}) \mid Z_{k})$$

and

$$\mathbf{Q}_{t_{k+1}/t_{k}} = \operatorname{Var}(\mathbf{y}(t_{k+1}) \mid Z_{k}), \qquad (7)$$

with initial conditions $\mathbf{y}_{t_{k}/t_{k}}$ and $\mathbf{Q}_{t_{k}/t_{k}}$, and the filter estimates

$$\mathbf{y}_{t_{k+1}/t_{k+1}} = \mathbf{y}_{t_{k+1}/t_{k}} + \mathbf{K}_{t_{k+1}}\left(\mathbf{z}_{t_{k+1}} - \mathbf{C}\,\mathbf{y}_{t_{k+1}/t_{k}}\right)$$

and

$$\mathbf{Q}_{t_{k+1}/t_{k+1}} = \mathbf{Q}_{t_{k+1}/t_{k}} - \mathbf{K}_{t_{k+1}}\mathbf{C}\,\mathbf{Q}_{t_{k+1}/t_{k}}, \qquad (8)$$

with filter gain

$$\mathbf{K}_{t_{k+1}} = \mathbf{Q}_{t_{k+1}/t_{k}}\mathbf{C}^{\intercal}\left(\mathbf{C}\,\mathbf{Q}_{t_{k+1}/t_{k}}\mathbf{C}^{\intercal} + \Pi\right)^{-1}$$

for all $k = 0,\ldots,M-2$, where $\mathbf{y}$ is an approximation to the solution $\mathbf{x}$ of (1) on the observation times $\{t_{k}\}$.
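In code, the filter update (8) is a standard Kalman-type correction applied to whatever prediction moments the chosen approximation $\mathbf{y}$ delivers. A minimal sketch follows; the names and interface are illustrative assumptions, not the API of any referenced software.

```python
import numpy as np

def lmv_filter_step(y_pred, Q_pred, z, C, Pi):
    """Filter update (8) of the approximate LMV filter.  The prediction
    moments y_pred, Q_pred in (7) come from a weak approximation y of
    the SDE solution between consecutive observation times."""
    S = C @ Q_pred @ C.T + Pi            # approximate innovation variance
    nu = z - C @ y_pred                  # approximate innovation
    K = Q_pred @ C.T @ np.linalg.inv(S)  # filter gain
    y_filt = y_pred + K @ nu             # filtered mean
    Q_filt = Q_pred - K @ C @ Q_pred     # filtered variance
    return y_filt, Q_filt, nu, S
```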

Given $M$ measurements $Z_{M} = \{\mathbf{z}_{t_{0}},\ldots,\mathbf{z}_{t_{M-1}}\}$ of the state space model (1)-(2) on the observation times $t_{0},\ldots,t_{M-1}$, the approximate innovation estimator for the parameters of (1) is defined by [1] [12]

$$\widehat{\theta}_{M} = \arg\min_{\theta}\, \widetilde{U}_{M}(\theta, Z_{M}), \qquad (9)$$

where

$$\widetilde{U}_{M}(\theta, Z_{M}) = \sum_{k=1}^{M-1}\left(\ln\det(\widetilde{\boldsymbol{\Sigma}}_{t_{k}}) + \widetilde{\boldsymbol{\nu}}_{t_{k}}^{\intercal}\widetilde{\boldsymbol{\Sigma}}_{t_{k}}^{-1}\widetilde{\boldsymbol{\nu}}_{t_{k}}\right),$$

being

$$\widetilde{\boldsymbol{\nu}}_{t_{k}} = \mathbf{z}_{t_{k}} - \mathbf{C}\,\mathbf{y}_{t_{k}/t_{k-1}}$$

and

$$\widetilde{\boldsymbol{\Sigma}}_{t_{k}} = \mathbf{C}\,\mathbf{Q}_{t_{k}/t_{k-1}}\mathbf{C}^{\intercal} + \Pi$$

approximations to the discrete-time innovation (3) and innovation variance (4), respectively, resulting from the filtering algorithm (7)-(8).

For models with complete observations free of noise (i.e., with $\mathbf{C} = \mathbf{I}$ and $\Pi = \mathbf{0}$ in (2)), the approximate innovation estimator (9) reduces to the known quasi-maximum likelihood estimators for SDEs. [12]

Main conventional-type estimators

Conventional-type innovation estimators are those estimators (9) derived from conventional-type continuous-discrete or discrete-discrete approximate filtering algorithms. Among the approximate continuous-discrete filters are the innovation estimators based on Local Linearization (LL) filters, [1] [14] [5] on the extended Kalman filter, [15] [16] and on second order filters. [3] [16] Approximate innovation estimators based on discrete-discrete filters result from the discretization of the SDE (1) by means of a numerical scheme. [17] [18] Typically, the effectiveness of these innovation estimators is directly related to the stability of the involved filtering algorithms.

A shared drawback of these conventional-type filters is that, once the observations are given, the error between the approximate and the exact innovation process is fixed and completely determined by the time distance between observations. [12] This might introduce a large bias into the approximate innovation estimators in some applications, a bias that cannot be corrected by increasing the number of observations. However, the conventional-type innovation estimators are useful in many practical situations for which only medium or low accuracy of the parameter estimation is required. [12]
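To make the source of this fixed error concrete, the following hedged sketch shows a conventional discrete-discrete prediction step in which the SDE (1) is discretized by a single Euler step spanning the whole sampling interval, so its accuracy is tied to the sampling period and to nothing tunable. The linearized second-moment update is one common choice, and all names are illustrative assumptions.

```python
import numpy as np

def euler_prediction(y_filt, Q_filt, f, Jf, g, t, Delta):
    """Conventional prediction over a whole sampling interval Delta:
    one Euler step for the mean and a linearized update for the
    variance.  The discretization error is fixed by Delta and cannot
    be reduced by collecting more observations."""
    A = Jf(t, y_filt)                        # Jacobian of the drift f
    G = g(t, y_filt)                         # diffusion coefficient
    y_pred = y_filt + f(t, y_filt) * Delta   # first-moment prediction
    Q_pred = Q_filt + (A @ Q_filt + Q_filt @ A.T + G @ G.T) * Delta
    return y_pred, Q_pred
```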

Order-β innovation estimators

Let us consider a finer time discretization $(\tau)_{h} = \{\tau_{n} : \tau_{n} \leq \tau_{n+1}\}$ of the time interval $[t_{0}, t_{M-1}]$ satisfying the condition $(\tau)_{h} \supset \{t_{k}\}_{k=0,\ldots,M-1}$, with maximum stepsize $h = \sup_{n}(\tau_{n+1} - \tau_{n})$. Further, let $\mathbf{y}(\tau_{n})$ be the approximate value of $\mathbf{x}(\tau_{n})$ obtained from a discretization of the equation (1) for all $\tau_{n} \in (\tau)_{h}$, and

$$\mathbf{y}(t) = \mathbf{y}(\tau_{n_{t}}), \quad \text{with } \tau_{n_{t}} = \max\{\tau_{n} \in (\tau)_{h} : \tau_{n} \leq t\}, \qquad (10)$$

for all $t \in [t_{0}, t_{M-1}]$,

a continuous-time approximation to $\mathbf{x}$.

An order-$\beta$ LMV filter [13] is an approximate LMV filter for which $\mathbf{y}$ is an order-$\beta$ weak approximation to $\mathbf{x}$ satisfying (10) and the weak convergence condition

$$\left|E\big(\phi(\mathbf{x}(t_{k}))\big) - E\big(\phi(\mathbf{y}(t_{k}))\big)\right| \leq L\,h^{\beta}$$

for all $t_{k}$ and any $2(\beta+1)$ times continuously differentiable function $\phi$ for which $\phi$ and all its partial derivatives up to order $2(\beta+1)$ have polynomial growth, $L$ being a positive constant. This order-$\beta$ LMV filter converges with rate $\beta$ to the exact LMV filter as $h$ goes to zero, [13] where $h$ is the maximum stepsize of the time discretization $(\tau)_{h}$ on which the approximation $\mathbf{y}$ to $\mathbf{x}$ is defined.

An order-$\beta$ innovation estimator is an approximate innovation estimator (9) for which the approximations to the discrete-time innovation (3) and innovation variance (4) result from an order-$\beta$ LMV filter. [12]

Approximations $\mathbf{y}$ of any kind converging to $\mathbf{x}$ in a weak sense (as, e.g., those in [19] [13] ) can be used to design an order-$\beta$ LMV filter and, consequently, an order-$\beta$ innovation estimator. These order-$\beta$ innovation estimators are intended for the recurrent practical situation in which a diffusion process should be identified from a reduced number of observations distant in time, or when high accuracy for the estimated parameters is required.

Properties

An order-$\beta$ innovation estimator $\widehat{\theta}_{h,M}$ has a number of important properties: [12] [6]

  • For each given set of $M$ observations, $\widehat{\theta}_{h,M}$ converges to the exact innovation estimator $\widehat{\theta}_{M}$ as the maximum stepsize $h$ of the time discretization $(\tau)_{h}$ goes to zero.
  • For finite samples of $M$ observations, the expected value of $\widehat{\theta}_{h,M}$ converges to the expected value of the exact innovation estimator $\widehat{\theta}_{M}$ as $h$ goes to zero.
  • For an increasing number of observations, $\widehat{\theta}_{h,M}$ is asymptotically normally distributed and its bias decreases as $h$ goes to zero.
  • Likewise to the convergence of the order-$\beta$ LMV filter to the exact LMV filter, for the convergence and asymptotic properties of $\widehat{\theta}_{h,M}$ there are no constraints on the time distance between two consecutive observations $t_{k}$ and $t_{k+1}$, nor on the time discretization $(\tau)_{h}$.
  • Approximations to the Akaike or Bayesian information criterion and to confidence limits are directly obtained by replacing the exact estimator $\widehat{\theta}_{M}$ by its approximation $\widehat{\theta}_{h,M}$. These approximations converge to the corresponding exact ones when the maximum stepsize $h$ of the time discretization goes to zero.
  • The distribution of the approximate fitting-innovation process measures the goodness of fit of the model to the data, which is also used as a practical tool for the design of models and for optimal experimental design.
  • For smooth enough functions $\mathbf{h}$, nonlinear observation equations of the form (6) can be transformed to the simpler one (2), and the order-$\beta$ innovation estimator can be applied.
Fig. 1 Histograms of the differences $({\widehat{\alpha}}_{M}-{\widehat{\alpha}}_{h,M}^{D},\,{\widehat{\sigma}}_{M}-{\widehat{\sigma}}_{h,M}^{D})$ and $({\widehat{\alpha}}_{M}-{\widehat{\alpha}}_{h,M},\,{\widehat{\sigma}}_{M}-{\widehat{\sigma}}_{h,M})$ between the exact innovation estimator $({\widehat{\alpha}}_{M},{\widehat{\sigma}}_{M})$ and the conventional $({\widehat{\alpha}}_{h,M}^{D},{\widehat{\sigma}}_{h,M}^{D})$ and order-$1$ $({\widehat{\alpha}}_{h,M},{\widehat{\sigma}}_{h,M})$ innovation estimators for the parameters $(\alpha,\sigma)$ of the model (11)-(12), given $100$ time series of $M=10$ noisy observations on the time interval $[0.5,\,0.5+M-1]$ with sampling period $\Delta=1$.

Figure 1 presents the histograms of the differences $({\widehat{\alpha}}_{M}-{\widehat{\alpha}}_{h,M}^{D},\,{\widehat{\sigma}}_{M}-{\widehat{\sigma}}_{h,M}^{D})$ and $({\widehat{\alpha}}_{M}-{\widehat{\alpha}}_{h,M},\,{\widehat{\sigma}}_{M}-{\widehat{\sigma}}_{h,M})$ between the exact innovation estimator and the conventional and order-$1$ innovation estimators for the parameters $\alpha$ and $\sigma$ of the equation [12]

obtained from 100 time series of noisy observations

of $x$ on the observation times $\{t_{k} = 0.5 + k\}_{k=0,\ldots,M-1}$, with $M = 10$ and $\Delta = 1$. The classical and the order-$1$ Local Linearization filters of the innovation estimators $({\widehat{\alpha}}_{h,M}^{D},{\widehat{\sigma}}_{h,M}^{D})$ and $({\widehat{\alpha}}_{h,M},{\widehat{\sigma}}_{h,M})$ are defined as in [12], respectively, on uniform time discretizations $(\tau)_{h,M}$ with decreasing maximum stepsize $h$. The number of stochastic simulations of the order-$1$ Local Linearization filter is estimated via an adaptive sampling algorithm with moderate tolerance. Figure 1 illustrates the convergence of the order-$1$ innovation estimator $({\widehat{\alpha}}_{h,M},{\widehat{\sigma}}_{h,M})$ to the exact innovation estimator $({\widehat{\alpha}}_{M},{\widehat{\sigma}}_{M})$ as $h$ decreases, which substantially improves the estimation provided by the conventional innovation estimator $({\widehat{\alpha}}_{h,M}^{D},{\widehat{\sigma}}_{h,M}^{D})$.

Deterministic approximations

The order-$\beta$ innovation estimators overcome the drawback of the conventional-type innovation estimators concerning the impossibility of reducing their bias. [12] However, the viable bias reduction of an order-$\beta$ innovation estimator might eventually require that the associated order-$\beta$ LMV filter perform a large number of stochastic simulations. [13] In situations where only low or medium precision approximate estimators are needed, an alternative deterministic filter algorithm - called the deterministic order-$\beta$ LMV filter [13] - can be obtained by tracking the first two conditional moments of the order-$\beta$ weak approximation $\mathbf{y}$ at all the time instants $\tau_{n} \in (\tau)_{h}$ in between two consecutive observation times $t_{k}$ and $t_{k+1}$. That is, the values of the predictions $\mathbf{y}_{t_{k+1}/t_{k}}$ and $\mathbf{Q}_{t_{k+1}/t_{k}}$ in the filtering algorithm are computed from recursive formulas for the first two conditional moments of $\mathbf{y}$ over the finer discretization $(\tau)_{h}$, with initial conditions given by the filter estimates at $t_{k}$. The approximate innovation estimators defined with these deterministic order-$\beta$ LMV filters no longer converge to the exact innovation estimator, but allow a significant bias reduction in the estimated parameters for a given finite sample, with a lower computational cost.
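A hedged sketch of this deterministic prediction between two consecutive observation times follows: the first two conditional moments are propagated over a finer grid of stepsize at most $h$, so the prediction error is controlled by $h$ rather than by the sampling period. Plain Euler moment steps are used here as a simple stand-in for the Local Linearization formulas of the cited filters, and all names are illustrative.

```python
import numpy as np

def deterministic_prediction(y_filt, Q_filt, f, Jf, g, t0, t1, h):
    """Propagate the first two conditional moments from t0 to t1 over a
    finer grid of stepsize at most h.  Refining h reduces the prediction
    error of the filter, and hence the bias of the resulting estimator,
    at a purely deterministic (simulation-free) cost."""
    n = max(1, int(np.ceil((t1 - t0) / h)))  # number of substeps
    dt = (t1 - t0) / n
    y, Q, t = y_filt, Q_filt, t0
    for _ in range(n):
        A = Jf(t, y)                         # Jacobian of the drift f
        G = g(t, y)                          # diffusion coefficient
        Q = Q + (A @ Q + Q @ A.T + G @ G.T) * dt
        y = y + f(t, y) * dt
        t += dt
    return y, Q
```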

Fig. 2 Histograms and confidence limits for the innovation estimators $({\widehat{\alpha}}_{h,M},{\widehat{\sigma}}_{h,M})$ and $({\widehat{\alpha}}_{\cdot,M},{\widehat{\sigma}}_{\cdot,M})$ of $(\alpha,\sigma)$ computed with the deterministic order-$1$ LL filter on uniform $(\tau)_{h,M}$ and adaptive $(\tau)_{\cdot,M}$ time discretizations, respectively, from $100$ noisy realizations of the Van der Pol model (13)-(15) with sampling period $\Delta=1$ on the time interval $[0,\,M-1]$ and $M=30$. Observe the bias reduction of the estimated parameter as $h$ decreases.

Figure 2 presents the histograms and the confidence limits of the approximate innovation estimators $({\widehat{\alpha}}_{h,M},{\widehat{\sigma}}_{h,M})$ and $({\widehat{\alpha}}_{\cdot,M},{\widehat{\sigma}}_{\cdot,M})$ for the parameters $\alpha$ and $\sigma$ of the Van der Pol oscillator with random frequency [12]

obtained from 100 time series of partial and noisy observations

of $\mathbf{x}$ on the observation times $\{t_{k} = k\}_{k=0,\ldots,M-1}$, with $M = 30$ and $\Delta = 1$. The deterministic order-$1$ Local Linearization filter of the innovation estimators $({\widehat{\alpha}}_{h,M},{\widehat{\sigma}}_{h,M})$ and $({\widehat{\alpha}}_{\cdot,M},{\widehat{\sigma}}_{\cdot,M})$ is defined, [12] for each estimator, on uniform time discretizations $(\tau)_{h,M}$ with decreasing stepsize $h$, and on an adaptive time-stepping discretization $(\tau)_{\cdot,M}$ with moderate relative and absolute tolerances, respectively. Observe the bias reduction of the estimated parameter as $h$ decreases.

Software

A Matlab implementation of various approximate innovation estimators is provided by the SdeEstimation toolbox. [20] This toolbox implements Local Linearization filters, including deterministic and stochastic variants with fixed stepsizes and sample numbers, as well as adaptive time-stepping and sampling algorithms, along with local and global optimization algorithms for the innovation estimation. For models with complete observations free of noise, various approximations to the quasi-maximum likelihood estimator are implemented in R. [21]


References

  1. Ozaki, Tohru (1994). "The Local Linearization Filter with Application to Nonlinear System Identifications". In Bozdogan, H.; Sclove, S. L.; Gupta, A. K.; Haughton, D. (eds.), Proceedings of the First US/Japan Conference on the Frontiers of Statistical Modeling: An Informational Approach: Volume 3 Engineering and Scientific Applications. Dordrecht: Springer Netherlands, pp. 217–240. doi:10.1007/978-94-011-0854-6_10. ISBN 978-94-011-0854-6.
  2. Jazwinski, A. H., Stochastic Processes and Filtering Theory, Academic Press, New York, 1970.
  3. Nielsen, Jan Nygaard; Vestergaard, Martin (2000). "Estimation in continuous-time stochastic volatility models using nonlinear filters". International Journal of Theoretical and Applied Finance. 3 (2): 279–308. doi:10.1142/S0219024900000139. ISSN 0219-0249.
  4. Kailath, T., Lectures on Wiener and Kalman Filtering. New York: Springer-Verlag, 1981.
  5. Jimenez, J. C.; Ozaki, T. (2006). "An Approximate Innovation Method for the Estimation of Diffusion Processes from Discrete Data". Journal of Time Series Analysis. 27 (1): 77–97. doi:10.1111/j.1467-9892.2005.00454.x. ISSN 0143-9782.
  6. Jimenez, J. C.; Yoshimoto, A.; Miwakeichi, F. (2021). "State and parameter estimation of stochastic physical systems from uncertain and indirect measurements". The European Physical Journal Plus. 136 (8): 869. doi:10.1140/epjp/s13360-021-01859-1. ISSN 2190-5444.
  7. Schweppe, F. (1965). "Evaluation of likelihood functions for Gaussian signals". IEEE Transactions on Information Theory. 11 (1): 61–70. doi:10.1109/TIT.1965.1053737. ISSN 1557-9654.
  8. Ljung, L., System Identification: Theory for the User (2nd edn). Englewood Cliffs: Prentice Hall, 1999.
  9. Ljung, Lennart; Caines, Peter E. (1980). "Asymptotic normality of prediction error estimators for approximate system models". Stochastics. 3 (1–4): 29–46. doi:10.1080/17442507908833135. ISSN 0090-9491.
  10. Nolsoe, K.; Nielsen, J. N.; Madsen, H. (2000). "Prediction-based estimating function for diffusion processes with measurement noise". Technical Reports 2000, No. 10. Informatics and Mathematical Modelling, Technical University of Denmark.
  11. Ozaki, T.; Jimenez, J. C.; Haggan-Ozaki, V. (2000). "The Role of the Likelihood Function in the Estimation of Chaos Models". Journal of Time Series Analysis. 21 (4): 363–387. doi:10.1111/1467-9892.00189. ISSN 0143-9782.
  12. Jimenez, J. C. (2020). "Bias reduction in the estimation of diffusion processes from discrete observations". IMA Journal of Mathematical Control and Information. 37 (4): 1468–1505. doi:10.1093/imamci/dnaa021.
  13. Jimenez, J. C. (2019). "Approximate linear minimum variance filters for continuous-discrete state space models: convergence and practical adaptive algorithms". IMA Journal of Mathematical Control and Information. 36 (2): 341–378. doi:10.1093/imamci/dnx047.
  14. Shoji, Isao (1998). "A comparative study of maximum likelihood estimators for nonlinear dynamical system models". International Journal of Control. 71 (3): 391–404. doi:10.1080/002071798221731. ISSN 0020-7179.
  15. Nielsen, Jan Nygaard; Madsen, Henrik (2001). "Applying the EKF to stochastic differential equations with level effects". Automatica. 37 (1): 107–112. doi:10.1016/S0005-1098(00)00128-X. ISSN 0005-1098.
  16. Singer, Hermann (2002). "Parameter Estimation of Nonlinear Stochastic Differential Equations: Simulated Maximum Likelihood versus Extended Kalman Filter and Itô-Taylor Expansion". Journal of Computational and Graphical Statistics. 11 (4): 972–995. doi:10.1198/106186002808. ISSN 1061-8600.
  17. Ozaki, Tohru; Iino, Mitsunori (2001). "An innovation approach to non-Gaussian time series analysis". Journal of Applied Probability. 38 (A): 78–92. doi:10.1239/jap/1085496593. ISSN 0021-9002.
  18. Peng, H.; Ozaki, T.; Jimenez, J. C. (2002). "Modeling and control for foreign exchange based on a continuous time stochastic microstructure model". Proceedings of the 41st IEEE Conference on Decision and Control. Vol. 4, pp. 4440–4445. doi:10.1109/CDC.2002.1185071. ISBN 0-7803-7516-5.
  19. Kloeden, P. E.; Platen, E., Numerical Solution of Stochastic Differential Equations, 3rd edn. Berlin: Springer, 1999.
  20. "GitHub - locallinearization/SdeEstimation". GitHub. Retrieved 2023-07-06.
  21. Iacus, S. M., Simulation and Inference for Stochastic Differential Equations: With R Examples, New York: Springer, 2008.