In numerical analysis, the local linearization (LL) method is a general strategy for designing numerical integrators for differential equations based on a local (piecewise) linearization of the given equation on consecutive time intervals. The numerical integrators are then iteratively defined as the solution of the resulting piecewise linear equation at the end of each consecutive interval. The LL method has been developed for a variety of equations, such as ordinary, delayed, random and stochastic differential equations. LL integrators are a key component in the implementation of inference methods for estimating unknown parameters and unobserved variables of differential equations given time series of (potentially noisy) observations. LL schemes are well suited to dealing with complex models in a variety of fields, such as neuroscience, finance, forestry management, control engineering and mathematical statistics.
Differential equations have become an important mathematical tool for describing the time evolution of many phenomena, e.g., the rotation of the planets around the sun, the dynamics of asset prices in the market, the firing of neurons, the propagation of epidemics, etc. However, since the exact solutions of these equations are usually unknown, numerical approximations obtained by numerical integrators are necessary. Currently, many applications in engineering and the applied sciences focused on dynamical studies demand the development of efficient numerical integrators that preserve, as much as possible, the dynamics of these equations. This is the main motivation for the development of the Local Linearization integrators.
The high-order local linearization (HOLL) method is a generalization of the Local Linearization method oriented to obtaining high-order integrators for differential equations that preserve the stability and dynamics of linear equations. The integrators are obtained by splitting, on consecutive time intervals, the solution x of the original equation into two parts: the solution z of the locally linearized equation plus a high-order approximation of the residual.
A Local Linearization (LL) scheme is the final recursive algorithm that allows the numerical implementation of a discretization derived from the LL or HOLL method for a class of differential equations.
Consider the d-dimensional Ordinary Differential Equation (ODE)
with initial condition , where is a differentiable function.
Let be a time discretization of the time interval with maximum stepsize h such that and . After the local linearization of the equation (4.1) at the time step the variation of constants formula yields
where
results from the linear approximation, and
is the residual of the linear approximation. Here, and denote the partial derivatives of f with respect to the variables x and t, respectively, and
For a time discretization , the Local Linear discretization of the ODE (4.1) at each point is defined by the recursive expression [1] [2]
The Local Linear discretization (4.3) converges with order 2 to the solution of nonlinear ODEs, but it matches the solution of linear ODEs exactly. The recursion (4.3) is also known as the Exponential Euler discretization. [3]
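As a concrete illustration, a single LL (exponential Euler) step for an autonomous ODE x′ = f(x) can be sketched in Python; the integral defining the step is read off from the top-right block of an augmented matrix exponential, with SciPy's `expm` standing in for a Padé-based matrix exponential (the function names here are illustrative, not from the original literature):

```python
import numpy as np
from scipy.linalg import expm

def ll_step(f, jac, x, h):
    # One Local Linearization (exponential Euler) step for an
    # autonomous ODE x' = f(x). The LL integral is obtained from the
    # top-right block of the exponential of an augmented matrix:
    # expm([[h*J, h*f], [0, 0]])[:d, d] = phi_1(h*J) * h * f.
    d = x.size
    J = jac(x)                      # Jacobian of f at x
    M = np.zeros((d + 1, d + 1))
    M[:d, :d] = h * J
    M[:d, d] = h * f(x)
    return x + expm(M)[:d, d]
```

For a linear ODE x′ = Ax this step reproduces the exact solution x(h) = e^{hA} x(0), consistent with the exactness of the LL discretization for linear equations.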
For a time discretization a high-order local linear (HOLL) discretization of the ODE (4.1) at each point is defined by the recursive expression [1] [4] [5] [6]
where is an order (> 2) approximation to the residual r. The HOLL discretization (4.4) converges with order to the solution of nonlinear ODEs, but matches the solution of linear ODEs exactly.
HOLL discretizations can be derived in two ways: [1] [4] [5] [6] 1) (quadrature-based) by approximating the integral representation (4.2) of r; and 2) (integrator-based) by using a numerical integrator for the differential representation of r defined by
for all , where
HOLL discretizations include, for instance, the following:
which is obtained by solving (4.5) via an s-stage explicit Runge–Kutta (RK) scheme with coefficients .
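A minimal sketch of such an integrator-based HOLL step, assuming an autonomous ODE and using the classical four-stage RK scheme for the residual equation (helper names are hypothetical):

```python
import numpy as np
from scipy.linalg import expm

def llrk4_step(f, jac, x, h):
    # One integrator-based HOLL step for an autonomous ODE x' = f(x):
    # the locally linearized equation is solved exactly via the matrix
    # exponential, and the residual equation r' = f(z + r) - L(z + r),
    # r(0) = 0, is integrated with the classical 4-stage Runge-Kutta.
    d = x.size
    J, fx = jac(x), f(x)

    def z(s):
        # exact solution of the locally linearized equation at time s
        M = np.zeros((d + 1, d + 1))
        M[:d, :d] = s * J
        M[:d, d] = s * fx
        return x + expm(M)[:d, d]

    def q(s, r):
        # residual vector field: nonlinear flow minus its linearization
        return f(z(s) + r) - fx - J @ (z(s) + r - x)

    k1 = q(0.0, np.zeros(d))
    k2 = q(h / 2, (h / 2) * k1)
    k3 = q(h / 2, (h / 2) * k2)
    k4 = q(h, h * k3)
    r = (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)
    return z(h) + r
```

For linear ODEs the residual vector field vanishes identically, so the step reduces to the exact linear flow, as the theory requires.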
which results from the approximation of in (4.2) by its order-p truncated Taylor expansion.
which results from the interpolation of in (4.2) by a polynomial of degree p on , where denotes the j-th backward difference of .
which results from the interpolation of in (4.2) by a polynomial of degree p on ,
which results from the interpolation of in (4.2) by a Hermite polynomial of degree p on .
Every numerical implementation of the LL (or of a HOLL) discretization involves approximations to integrals of the form
where A is a d × d matrix. Every numerical implementation of the LL (or of a HOLL) of any order is generically called a Local Linearization scheme. [1] [9]
Among the various algorithms to compute the integrals , those based on rational Padé and Krylov subspace approximations of the matrix exponential are preferred. For this, a central role is played by the expression [10] [5] [11]
where are d-dimensional vectors,
, , being the d-dimensional identity matrix.
If denotes the (p; q)-Padé approximation of and k is the smallest natural number such that [12] [9]
If denotes the (m; p; q; k) Krylov-Padé approximation of , then [12]
where is the dimension of the Krylov subspace.
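The action of the matrix exponential on a vector, which underlies the Krylov–Padé approximation, can be sketched with a basic Arnoldi iteration (a textbook version, without the error control a production code would need):

```python
import numpy as np
from scipy.linalg import expm

def krylov_expm_v(A, v, m=10):
    # Arnoldi-based Krylov approximation of expm(A) @ v: build an
    # orthonormal basis V of span{v, Av, ..., A^(m-1) v}, project A
    # onto it as the small Hessenberg matrix H, and approximate
    # expm(A) v ~ beta * V @ expm(H) @ e1.
    n = v.size
    m = min(m, n)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    beta = np.linalg.norm(v)
    V[:, 0] = v / beta
    for j in range(m):
        w = A @ V[:, j]
        for i in range(j + 1):          # modified Gram-Schmidt
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:         # happy breakdown: exact subspace
            m = j + 1
            break
        V[:, j + 1] = w / H[j + 1, j]
    e1 = np.zeros(m)
    e1[0] = 1.0
    return beta * V[:, :m] @ expm(H[:m, :m]) @ e1
```

Only the small m × m exponential is computed (by Padé in SciPy), which is the source of the efficiency of Krylov-type LL schemes for large systems.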
where the matrices , L and r are defined as
and with . For large systems of ODEs [3]
where for autonomous ODEs the matrices and are defined as
. Here, denotes the second derivative of f with respect to x, and p + q > 2. For large systems of ODEs
where
and
with and p + q > 3. For large systems of ODEs, the vector in the above scheme is replaced by with
where s = 7 is the number of stages,
with , and are the Runge–Kutta coefficients of Dormand and Prince and p + q > 4. The vector in the above scheme is computed by a Padé or Krylov–Padé approximation for small or large systems of ODEs, respectively.
By construction, the LL and HOLL discretizations inherit the stability and dynamics of linear ODEs, but this is not generally the case for the LL schemes. With , the LL schemes (4.6)-(4.9) are A-stable. [4] With q = p + 1 or q = p + 2, the LL schemes (4.6)-(4.9) are also L-stable. [4] For linear ODEs, the LL schemes (4.6)-(4.9) converge with order p + q. [4] [9] In addition, with p = q = 6 and = d, all the LL schemes described above yield the "exact computation" (up to the precision of floating-point arithmetic) of linear ODEs on current personal computers. [4] [9] This includes stiff and highly oscillatory linear equations. Moreover, the LL schemes (4.6)-(4.9) are regular for linear ODEs and inherit the symplectic structure of Hamiltonian harmonic oscillators. [5] [13] These LL schemes are also linearization preserving, and display a better reproduction of the stable and unstable manifolds around hyperbolic equilibrium points and periodic orbits than other numerical schemes with the same stepsize. [5] [13] For instance, Figure 1 shows the phase portrait of the ODE
with , and , and its approximation by various schemes. This system has two stable stationary points and one unstable stationary point in the region .
Consider the d-dimensional Delay Differential Equation (DDE)
with m constant delays and initial condition for all where f is a differentiable function, is the segment function defined as
for all is a given function, and
For a time discretization , the Local Linear discretization of the DDE (5.1) at each point is defined by the recursive expression [11]
where
is the segment function defined as
and is a suitable approximation to for all such that . Here,
are constant matrices and
are constant vectors. denote, respectively, the partial derivatives of f with respect to the variables t and x, and . The Local Linear discretization (5.2) converges to the solution of (5.1) with order if approximates with order for all .
Depending on the approximations and on the algorithm used to compute , different Local Linearization schemes can be defined. Every numerical implementation of a Local Linear discretization is generically called a local linearization scheme.
where the matrices and are defined as
and , and . Here, the matrices , , and are defined as in (5.2), but replacing by and where
with , is the Local Linear Approximation to the solution of (5.1) defined through the LL scheme (5.3) for all and by for . For large systems of DDEs
with and . Fig. 2 illustrates the stability of the LL scheme (5.3) and that of an explicit scheme of similar order in the integration of a stiff system of DDEs.
Consider the d-dimensional Random Differential Equation (RDE)
with initial condition where is a k-dimensional separable finite continuous stochastic process, and f is a differentiable function. Suppose that a realization (path) of is given.
For a time discretization , the Local Linear discretization of the RDE (6.1) at each point is defined by the recursive expression [16]
where
and is an approximation to the process for all Here, and denote the partial derivatives of with respect to and , respectively.
Depending on the approximation to the process and on the algorithm used to compute , different Local Linearization schemes can be defined. Every numerical implementation of the local linear discretization is generically called a local linearization scheme.
where the matrices are defined as
, , and p+q>1. For large systems of RDEs, [17]
The convergence rate of both schemes is , where is the exponent of the Hölder condition of .
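A step of such an LL scheme for an RDE, under the simplifying assumption that the noise path is frozen at its value at the beginning of each step (the function names are illustrative):

```python
import numpy as np
from scipy.linalg import expm

def rde_ll_step(f, jac_x, x, xi, h):
    # One LL step for the RDE x' = f(x, xi(t)) along a given
    # realization of the noise process, freezing the noise at its
    # value xi at the start of the step (a simple choice of path
    # approximation). The drift is then advanced by the deterministic
    # LL formula using the augmented-matrix exponential.
    d = x.size
    J = jac_x(x, xi)                # Jacobian of f w.r.t. x at (x, xi)
    M = np.zeros((d + 1, d + 1))
    M[:d, :d] = h * J
    M[:d, d] = h * f(x, xi)
    return x + expm(M)[:d, d]
```

More accurate schemes replace the frozen value by a finer approximation of the noise path over the step; the frozen-noise variant is the simplest case.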
Figure 3 presents the phase portrait of the RDE
and its approximation by two numerical schemes, where denotes a fractional Brownian process with Hurst exponent H=0.45.
Consider the d-dimensional Stochastic Differential Equation (SDE)
with initial condition , where the drift coefficient and the diffusion coefficient are differentiable functions, and is an m-dimensional standard Wiener process.
For a time discretization , the order- (=1,1.5) Strong Local Linear discretization of the solution of the SDE (7.1) is defined by the recursive relation [18] [19]
where
and
Here,
denote the partial derivatives of with respect to the variables and t, respectively, and the Hessian matrix of with respect to . The strong Local Linear discretization converges with order (= 1, 1.5) to the solution of (7.1).
After the local linearization of the drift term of (7.1) at , the equation for the residual is given by
for all , where
A high-order local linear discretization of the SDE (7.1) at each point is then defined by the recursive expression [20]
where is a strong approximation to the residual of order higher than 1.5. The strong HOLL discretization converges with order to the solution of (7.1).
Depending on the way of computing , and , different numerical schemes can be obtained. Every numerical implementation of a strong Local Linear discretization of any order is generically called a Strong Local Linearization (SLL) scheme.
where the matrices , and are defined as in (4.6), is an i.i.d. zero mean Gaussian random variable with variance , and p + q > 1. For large systems of SDEs, [21] in the above scheme is replaced by .
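A sketch of one step of an order-1 SLL scheme of this kind, under the simplifying assumptions of an autonomous drift and an Euler-type treatment of the diffusion increment (helper names are hypothetical):

```python
import numpy as np
from scipy.linalg import expm

def sll1_step(f, jac, g, x, h, rng):
    # One order-1 Strong Local Linearization step for the SDE
    # dx = f(x) dt + g(x) dW: the drift is advanced by the
    # deterministic LL (exponential Euler) formula, the diffusion by
    # a Gaussian increment g(x) dW with dW ~ N(0, h I_m).
    d = x.size
    J = jac(x)
    M = np.zeros((d + 1, d + 1))
    M[:d, :d] = h * J
    M[:d, d] = h * f(x)
    drift = expm(M)[:d, d]
    G = g(x)                                   # d x m diffusion matrix
    dW = rng.normal(0.0, np.sqrt(h), size=G.shape[1])
    return x + drift + G @ dW
```

When the noise vanishes, the step reduces to the deterministic LL scheme, illustrating the small-noise behavior of SLL schemes described below.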
where the matrices , and are defined as
, is an i.i.d. zero mean Gaussian random variable with variance and covariance , and p+q>1. [12] For large systems of SDEs, [12] in the above scheme is replaced by .
where , , and are defined as in the order-1 SLL schemes, and is an order-2 approximation to the multiple Stratonovich integral . [20]
For SDEs with a single Wiener noise (m=1) [20]
where
with .
Here, for low dimensional SDEs, and for large systems of SDEs, where , , , and are defined as in the order-2 SLL-Taylor schemes, p+q>1 and .
By construction, the strong LL and HOLL discretizations inherit the stability and dynamics of linear SDEs, but this is not generally the case for the strong LL schemes. LL schemes (7.2)-(7.5) with are A-stable, including for stiff and highly oscillatory linear equations. [12] Moreover, for linear SDEs with random attractors, these schemes also have a random attractor that converges in probability to the exact one as the stepsize decreases, and they preserve the ergodicity of these equations for any stepsize. [20] [12] These schemes also reproduce essential dynamical properties of simple and coupled harmonic oscillators, such as the linear growth of energy along the paths, the oscillatory behavior around 0, the symplectic structure of Hamiltonian oscillators, and the mean of the paths. [20] [22] For nonlinear SDEs with small noise (i.e., (7.1) with ), the paths of these SLL schemes are basically the nonrandom paths of the LL scheme (4.6) for ODEs plus a small disturbance related to the small noise. In this situation, the dynamical properties of that deterministic scheme, such as linearization preserving and the preservation of the exact solution dynamics around hyperbolic equilibrium points and periodic orbits, become relevant for the paths of the SLL scheme. [20] For instance, Fig. 4 shows the evolution of domains in the phase plane and the energy of the stochastic oscillator
and their approximations by two numerical schemes.
Consider the d-dimensional stochastic differential equation
with initial condition , where the drift coefficient and the diffusion coefficient are differentiable functions, and is an m-dimensional standard Wiener process.
For a time discretization , the order-Weak Local Linear discretization of the solution of the SDE (8.1) is defined by the recursive relation [23]
where
with
and is a zero mean stochastic process with variance matrix
Here, , denote the partial derivatives of with respect to the variables and t, respectively, the Hessian matrix of with respect to , and . The weak Local Linear discretization converges with order (=1,2) to the solution of (8.1).
Depending on the way of computing and , different numerical schemes can be obtained. Every numerical implementation of the Weak Local Linear discretization is generically called a Weak Local Linearization (WLL) scheme.
where, for SDEs with autonomous diffusion coefficients, , and are the submatrices defined by the partitioned matrix , with
and is a sequence of d-dimensional independent two-point distributed random vectors satisfying .
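Such two-point distributed driving vectors are straightforward to sample; a sketch (with a hypothetical helper name):

```python
import numpy as np

def two_point_vectors(d, n, rng):
    # n i.i.d. d-dimensional random vectors whose entries take the
    # values +1 and -1 with probability 1/2 each, so each vector has
    # zero mean and identity covariance -- the kind of two-point
    # distributed noise used to drive weak schemes in place of
    # Gaussian increments.
    return rng.choice([-1.0, 1.0], size=(n, d))
```

Because weak schemes only need to match moments of the Wiener increments, these cheap two-point variables can replace Gaussian sampling without affecting the weak convergence order.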
where , and are the submatrices defined by the partitioned matrix with
and
By construction, the weak LL discretizations inherit the stability and dynamics of linear SDEs, but this is not generally the case for the weak LL schemes. WLL schemes with preserve the first two moments of the solution of linear SDEs, and inherit the mean-square stability or instability that such a solution may have. [24] This includes, for instance, the equations of coupled harmonic oscillators driven by a random force, and large systems of stiff linear SDEs that result from the method of lines for linear stochastic partial differential equations. Moreover, these WLL schemes preserve the ergodicity of linear equations, and are geometrically ergodic for some classes of nonlinear SDEs. [26] For nonlinear SDEs with small noise (i.e., (8.1) with ), the solutions of these WLL schemes are basically the nonrandom paths of the LL scheme (4.6) for ODEs plus a small disturbance related to the small noise. In this situation, the dynamical properties of that deterministic scheme, such as linearization preserving and the preservation of the exact solution dynamics around hyperbolic equilibrium points and periodic orbits, become relevant for the mean of the WLL scheme. [24] For instance, Fig. 5 shows the approximate mean of the SDE
computed by various schemes.
Below is a time line of the main developments of the Local Linearization (LL) method.