Empirical dynamic modeling

Empirical dynamic modeling (EDM) is a framework for analysis and prediction of nonlinear dynamical systems. Applications include population dynamics, [1] [2] [3] [4] [5] [6] ecosystem service, [7] medicine, [8] neuroscience, [9] [10] [11] dynamical systems, [12] [13] [14] geophysics, [15] [16] [17] and human-computer interaction. [18] EDM was originally developed by Robert May and George Sugihara. It can be considered a methodology for data modeling, predictive analytics, dynamical system analysis, machine learning and time series analysis.

Description

Mathematical models have tremendous power to describe observations of real-world systems. They are routinely used to test hypotheses, explain mechanisms and predict future outcomes. However, real-world systems are often nonlinear and multidimensional, which in some instances makes explicit equation-based modeling problematic. Empirical models, which infer patterns and associations from the data instead of using hypothesized equations, represent a natural and flexible framework for modeling complex dynamics.

Donald DeAngelis and Simeon Yurek illustrated that canonical statistical models are ill-posed when applied to nonlinear dynamical systems. [19] A hallmark of nonlinear dynamics is state-dependence: each system state depends on previous states, and that dependence governs the transition from one state to the next. EDM operates in this space, the multidimensional state-space of system dynamics, rather than on one-dimensional observational time series. EDM does not presume relationships among states (for example, a functional dependence) but projects future states from localised, neighboring states. EDM is thus a state-space, nearest-neighbors paradigm in which system dynamics are inferred from states derived from observational time series. This provides a model-free representation of the system that naturally encompasses nonlinear dynamics.

A cornerstone of EDM is recognition that time series observed from a dynamical system can be transformed into higher-dimensional state-spaces by time-delay embedding with Takens's theorem. The state-space models are evaluated based on in-sample fidelity to observations, conventionally with Pearson correlation between predictions and observations.
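As a rough illustration of the time-delay embedding and the correlation-based evaluation described above, the following Python sketch builds lagged state vectors from a scalar series. The function names and the lag spacing `tau` are assumptions made for this example and do not come from any particular EDM package.

```python
import numpy as np

def time_delay_embed(x, E, tau=1):
    """Return rows (x[t], x[t - tau], ..., x[t - (E - 1) * tau]) for every valid t.

    E is the embedding dimension; tau (lag spacing) is an assumed extra parameter.
    """
    x = np.asarray(x, dtype=float)
    start = (E - 1) * tau                      # first index with a full lag history
    if start >= len(x):
        raise ValueError("time series too short for this E and tau")
    cols = [x[start - j * tau : len(x) - j * tau] for j in range(E)]
    return np.column_stack(cols)

def pearson_skill(pred, obs):
    """Pearson correlation between predictions and observations."""
    return float(np.corrcoef(pred, obs)[0, 1])

series = np.arange(6.0)                        # 0, 1, 2, 3, 4, 5
M = time_delay_embed(series, E=3)              # four state vectors of dimension 3
first_state = M[0].tolist()                    # (x[2], x[1], x[0])
```

Each row of `M` is one point in the reconstructed state-space; the first `E - 1` observations have no complete lag history and therefore produce no row.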

Methods

EDM is continuing to evolve. As of 2022, the main algorithms are Simplex projection, [20] Sequential locally weighted global linear maps (S-Map) projection, [21] Multivariate embedding in Simplex or S-Map, [1] Convergent cross mapping (CCM), [22] and Multiview Embedding, [23] described below.

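Nomenclature
Parameter: Description
$E$: embedding dimension
$k$: number of nearest neighbors
$T_p$: prediction interval
$X$: observed time series
$y \in \mathbb{R}^{E}$: vector of lagged observations
$\theta \geq 0$: S-Map localization
$X_t^{E} = (X_t, X_{t-1}, \dots, X_{t-E+1})$: lagged embedding vectors
$\| v \| = \sqrt{v_1^2 + v_2^2 + \dots + v_E^2}$: norm of $v$
$\text{NN}(y, X, k)$: list of $k$ nearest neighbors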

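Nearest neighbors are found according to:

$\text{NN}(y, X, k) = (N_1, \dots, N_k)$ with $\| X_{N_i}^{E} - y \| \leq \| X_{N_j}^{E} - y \|$ if $1 \leq i \leq j \leq k$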

Simplex

Simplex projection [20] [24] [25] [26] is a nearest neighbor projection. It locates the $k$ nearest neighbors to the location in the state-space from which a prediction is desired. To minimize the number of free parameters, $k$ is typically set to $E+1$, defining an $E$-dimensional simplex in the state-space. The prediction is computed as the weighted average of the state-space simplex projected $T_p$ points ahead. Each neighbor is weighted by an exponentially decreasing function of its distance to the projection origin vector in the state-space.

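  1. Find $k$ nearest neighbors: $N_k \gets \text{NN}(y, X, k)$
  2. Define the distance scale: $d \gets \| X_{N_1}^{E} - y \|$
  3. Compute weights: For ($i = 1, \dots, k$): $w_i \gets \exp(-\| X_{N_i}^{E} - y \| / d)$
  4. Average of state-space simplex: $\hat{y} \gets \sum_{i=1}^{k} \left( w_i X_{N_i + T_p} \right) / \sum_{i=1}^{k} w_i$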
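The four steps above can be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation: the function names are hypothetical, $k$ is fixed at $E+1$, and the small floors on the distance scale and on the weights are assumptions added here to avoid division by zero.

```python
import numpy as np

def embed(x, E):
    """Rows (x[t], x[t-1], ..., x[t-E+1]) for t = E-1, ..., len(x)-1."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([x[E - 1 - j : len(x) - j] for j in range(E)])

def simplex_predict(x, E, Tp, y):
    """Project the state y forward by Tp steps using the library series x."""
    x = np.asarray(x, dtype=float)
    lib = embed(x, E)
    times = np.arange(E - 1, len(x))          # time index of each library row
    usable = times + Tp < len(x)              # the projected point must exist
    lib, times = lib[usable], times[usable]
    k = E + 1                                 # simplex with E + 1 vertices
    dist = np.linalg.norm(lib - np.asarray(y, dtype=float), axis=1)
    nn = np.argsort(dist)[:k]                 # step 1: k nearest neighbors
    d = max(dist[nn[0]], 1e-12)               # step 2: distance scale (floored, assumption)
    w = np.exp(-dist[nn] / d)                 # step 3: exponential weights
    w = np.maximum(w, 1e-6)                   # weight floor (assumption)
    return float(np.sum(w * x[times[nn] + Tp]) / np.sum(w))  # step 4

x = np.tile([0.0, 1.0, 2.0, 3.0], 10)         # exactly periodic toy series
pred = simplex_predict(x, E=2, Tp=1, y=[2.0, 1.0])
```

In this toy series every occurrence of the state (2, 1) is followed by 3, so all neighbors agree on the projected value.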

S-Map

S-Map [21] extends the state-space prediction in Simplex from an average of the nearest neighbors to a linear regression fit to all neighbors, but localised with an exponential decay kernel. The exponential localisation function is $F(\theta) = \exp(-\theta d / D)$, where $d$ is the neighbor distance and $D$ the mean distance. In this way, depending on the value of $\theta$, neighbors close to the prediction origin point have a higher weight than those further from it, such that a local linear approximation to the nonlinear system is reasonable. This localisation ability allows one to identify an optimal local scale, in effect quantifying the degree of state dependence, and hence the nonlinearity of the system.

Another feature of S-Map is that for a properly fit model, the regression coefficients between variables have been shown to approximate the gradient (directional derivative) of variables along the manifold. [27] These Jacobians represent the time-varying interaction strengths between system variables.

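  1. Find $k$ nearest neighbors: $N \gets \text{NN}(y, X, k)$
  2. Mean of distances: $D \gets \frac{1}{k} \sum_{i=1}^{k} \| X_{N_i}^{E} - y \|$
  3. Compute weights: For ($i = 1, \dots, k$): $w_i \gets \exp(-\theta \| X_{N_i}^{E} - y \| / D)$
  4. Reweighting matrix: $W \gets \text{diag}(w_1, \dots, w_k)$
  5. Design matrix: $A \gets \begin{bmatrix} 1 & X_{N_1} & X_{N_1 - 1} & \dots & X_{N_1 - E + 1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & X_{N_k} & X_{N_k - 1} & \dots & X_{N_k - E + 1} \end{bmatrix}$
  6. Weighted design matrix: $A \gets W A$
  7. Response vector at $T_p$: $b \gets [X_{N_1 + T_p}, X_{N_2 + T_p}, \dots, X_{N_k + T_p}]^{\top}$
  8. Weighted response vector: $b \gets W b$
  9. Least squares solution (SVD): $\hat{c} \gets \arg\min_{c} \| A c - b \|_2^2$
  10. Local linear model is prediction: $\hat{y} \gets \hat{c}_0 + \sum_{i=1}^{E} \hat{c}_i y_i$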
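A compact Python sketch of the S-Map steps is given here for illustration. It assumes that every library point is used as a neighbor and relies on `numpy.linalg.lstsq`, which solves the least squares problem through an SVD-based routine; the function names are hypothetical.

```python
import numpy as np

def embed(x, E):
    """Rows (x[t], x[t-1], ..., x[t-E+1]) for t = E-1, ..., len(x)-1."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([x[E - 1 - j : len(x) - j] for j in range(E)])

def smap_predict(x, E, Tp, theta, y):
    """Return (prediction, coefficients) of the local linear model at state y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    lib = embed(x, E)
    times = np.arange(E - 1, len(x))
    usable = times + Tp < len(x)                   # response value must exist
    lib, times = lib[usable], times[usable]
    dist = np.linalg.norm(lib - y, axis=1)         # step 1: all points as neighbors
    D = dist.mean()                                # step 2: mean distance
    w = np.exp(-theta * dist / D) if D > 0 else np.ones_like(dist)  # step 3
    A = np.column_stack([np.ones(len(lib)), lib])  # step 5: design matrix
    b = x[times + Tp]                              # step 7: response vector
    Aw = A * w[:, None]                            # steps 4 and 6: W A with W = diag(w)
    bw = b * w                                     # step 8: W b
    c, *_ = np.linalg.lstsq(Aw, bw, rcond=None)    # step 9: least squares solution
    return float(c[0] + c[1:] @ y), c              # step 10: local linear prediction

# Toy linear series x[t+1] = 0.5 * x[t] + 1, which a linear map fits exactly.
x = [0.0]
for _ in range(19):
    x.append(0.5 * x[-1] + 1.0)
pred, coef = smap_predict(x, E=1, Tp=1, theta=2.0, y=[1.0])
```

With `theta=0` all weights are equal and the fit reduces to a single global linear map; increasing `theta` makes the fit more local. The returned coefficients `coef[1:]` are the quantities interpreted as interaction strengths in the paragraph on S-Map Jacobians.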

Multivariate Embedding

Multivariate Embedding [1] [12] [28] recognizes that time-delay embeddings are not the only valid state-space construction. In Simplex and S-Map one can generate a state-space from observational vectors, or time-delay embeddings of a single observational time series, or both.
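One way to picture a mixed state-space is to stack selected variables and selected lags as columns. The sketch below is illustrative only; the function name and the `(name, lag)` specification format are assumptions made for this example.

```python
import numpy as np

def multivariate_embed(columns, lags):
    """Build state vectors from several series and chosen lags.

    columns: dict mapping a variable name to a 1-D series (all of equal length)
    lags:    list of (name, lag) pairs; lag 0 is the current observation
    """
    max_lag = max(lag for _, lag in lags)
    n = len(next(iter(columns.values())))
    coords = [
        np.asarray(columns[name], dtype=float)[max_lag - lag : n - lag]
        for name, lag in lags
    ]
    return np.column_stack(coords)             # row r corresponds to time r + max_lag

data = {"a": np.arange(5.0), "b": 10.0 * np.arange(5.0)}
# State vector (a[t], b[t], a[t-1]): two observed variables plus one lag.
states = multivariate_embed(data, [("a", 0), ("b", 0), ("a", 1)])
```

The resulting matrix can take the place of a univariate time-delay embedding as the library of state vectors in a Simplex or S-Map calculation.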

Convergent Cross Mapping

Convergent cross mapping (CCM) [22] leverages a corollary to the Generalized Takens Theorem [12] that it should be possible to cross predict or cross map between variables observed from the same system. Suppose that in some dynamical system involving variables $X$ and $Y$, $X$ causes $Y$. Since $X$ and $Y$ belong to the same dynamical system, their reconstructions (via embeddings) $M_x$ and $M_y$ also map to the same system.

The causal variable $X$ leaves a signature on the affected variable $Y$, and consequently, the reconstructed states based on $Y$ can be used to cross predict values of $X$. CCM leverages this property to infer causality by predicting $X$ using the $M_y$ library of points (or vice versa for the other direction of causality), while assessing improvements in cross map predictability as larger and larger random samplings of $M_y$ are used. If the prediction skill of $X$ increases and saturates as the entire $M_y$ is used, this provides evidence that $X$ is causally influencing $Y$.
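The convergence check can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the cross map uses simplex-style weights with $E+1$ neighbors, the function name is hypothetical, and the coupled logistic maps with their coefficients are a toy system chosen for the example rather than data from the cited studies.

```python
import numpy as np

def embed(x, E):
    """Rows (x[t], x[t-1], ..., x[t-E+1]) for t = E-1, ..., len(x)-1."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([x[E - 1 - j : len(x) - j] for j in range(E)])

def cross_map_skill(source, target, E, lib_size, rng):
    """Correlation between target and its estimate cross mapped from source's embedding."""
    M = embed(source, E)                           # reconstructed states of source
    tgt = np.asarray(target, dtype=float)[E - 1:]  # target aligned with rows of M
    lib = rng.choice(len(M), size=lib_size, replace=False)  # random library sample
    est = np.empty(len(M))
    for i in range(len(M)):
        cand = lib[lib != i]                       # never use the point itself
        dist = np.linalg.norm(M[cand] - M[i], axis=1)
        order = np.argsort(dist)[: E + 1]
        d = max(dist[order[0]], 1e-12)             # floored distance scale (assumption)
        w = np.maximum(np.exp(-dist[order] / d), 1e-6)
        est[i] = np.sum(w * tgt[cand[order]]) / np.sum(w)
    return float(np.corrcoef(est, tgt)[0, 1])

# Toy system (assumed for illustration): x drives y more strongly than y drives x.
n = 500
x, y = np.empty(n), np.empty(n)
x[0], y[0] = 0.4, 0.2
for t in range(n - 1):
    x[t + 1] = x[t] * (3.8 - 3.8 * x[t] - 0.02 * y[t])
    y[t + 1] = y[t] * (3.5 - 3.5 * y[t] - 0.1 * x[t])

rng = np.random.default_rng(0)
# Estimate x from the embedding of y, with growing library sizes.
skills = [cross_map_skill(y, x, E=2, lib_size=L, rng=rng) for L in (20, 100, 400)]
```

A cross map skill that rises with library size and then levels off is the convergence signature described above; a full analysis would also repeat the random sampling at each library size and test the opposite direction.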

Multiview Embedding

Multiview Embedding [23] is a dimensionality reduction technique in which a large number of state-space time series vectors are combinatorially assessed to find the combinations with maximal model predictability.
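The combinatorial search can be illustrated with a small Python sketch. It is a simplified reading of the idea, not the published procedure: each $E$-column combination ("view") is scored by leave-one-out simplex-style forecasts, and the forecasts of the best `top` views are averaged. The function names, the `top` parameter and the synthetic data are assumptions made for this example.

```python
import numpy as np
from itertools import combinations

def loo_simplex(view, target):
    """Leave-one-out nearest-neighbor forecasts of target from the rows of view."""
    n, E = view.shape
    pred = np.empty(n)
    for i in range(n):
        dist = np.linalg.norm(view - view[i], axis=1)
        dist[i] = np.inf                           # exclude the point itself
        nn = np.argsort(dist)[: E + 1]
        d = max(dist[nn[0]], 1e-12)                # floored distance scale (assumption)
        w = np.maximum(np.exp(-dist[nn] / d), 1e-6)
        pred[i] = np.sum(w * target[nn]) / np.sum(w)
    return pred

def multiview(coords, target, E, top):
    """coords: (n, m) candidate coordinates; target: (n,) values to forecast."""
    scored = []
    for cols in combinations(range(coords.shape[1]), E):
        pred = loo_simplex(coords[:, list(cols)], target)
        rho = float(np.corrcoef(pred, target)[0, 1])
        scored.append((rho, cols, pred))
    scored.sort(key=lambda s: s[0], reverse=True)  # best-scoring views first
    best = scored[:top]
    forecast = np.mean([p for _, _, p in best], axis=0)
    return forecast, [c for _, c, _ in best]

t = np.arange(60)
rng = np.random.default_rng(0)
# Two informative coordinates and one pure-noise coordinate.
coords = np.column_stack([np.sin(0.3 * t), np.cos(0.3 * t), rng.normal(size=60)])
target = np.sin(0.3 * (t + 1))
forecast, views = multiview(coords, target, E=2, top=1)
```

Because the number of combinations grows quickly with the number of candidate coordinates, practical use typically restricts the candidate set or the number of views that are retained.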

Extensions

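Extensions to EDM techniques include generalized theorems for nonlinear state-space reconstruction, [12] extended convergent cross mapping for time-delayed causal interactions, [13] time-varying stability of interaction networks, [4] regularized S-Map, [29] visual analytics, [30] convergent cross sorting, [31] hybrid empirical and parametric modeling, [32] sliding-window extended convergent cross mapping, [33] empirical mode modeling, [17] handling of missing or irregular samples, [34] and reconstruction of large interaction networks. [35]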

References

  1. Dixon, P. A., et al. 1999. Episodic fluctuations in larval supply. Science 283:1528–1530
  2. Hao Ye, Richard J. Beamish, Sarah M. Glaser, et al. 2015. Equation-free mechanistic ecosystem forecasting using empirical dynamic modeling. Proceedings of the National Academy of Sciences Mar 2015, 112 (13) E1569-E1576; DOI: 10.1073/pnas.1417063112
  3. Ethan R. Deyle, Michael Fogarty, Chih-hao Hsieh, et al. 2013. Proceedings of the National Academy of Sciences Apr 2013, 110 (16) 6430-6435; DOI: 10.1073/pnas.1215506110
  4. Ushio, M., Hsieh, Ch., Masuda, R. et al., 2018. Fluctuating interaction network and time-varying stability of a natural fish community. Nature 554, 360–363
  5. Deyle E.R., et al. 2016. Tracking and forecasting ecosystem interactions in real time. Proc. R. Soc. B 283: 20152258
  6. Tanya L. Rogers, Stephan B. Munch, Simon D. Stewart, Eric P. Palkovacs, Alfredo Giron-Nava, Shin-ichiro S. Matsuzaki, Celia C. Symons. Ecology Letters, 23 (8) August 2020, 1287-1297
  7. Park J., et al. 2021. Dynamics of Florida milk production and total phosphate in Lake Okeechobee. PLoS ONE 16(8): e0248910. doi:10.1371/journal.pone.0248910
  8. George Sugihara, Walter Allan, Daniel Sobel, and Kenneth D. Allan, 1996. Nonlinear control of heart rate variability in human infants. Proc. Natl. Acad. Sci. USA. Vol. 93, pp. 2608-2613, March 1996. Medical Sciences
  9. McBride, J. C., et al. Sugihara causality analysis of scalp EEG for detection of early Alzheimer's disease. Neuroimage-Clinical 7:258–265 (2015)
  10. Tajima S, Yanagawa T, Fujii N, Toyoizumi T (2015) Untangling Brain-Wide Dynamics in Consciousness by Cross-Embedding. PLoS Comput Biol 11(11): e1004537. https://doi.org/10.1371/journal.pcbi.1004537
  11. W. Watanakeesuntorn et al., "Massively Parallel Causal Inference of Whole Brain Dynamics at Single Neuron Resolution," 2020 IEEE 26th International Conference on Parallel and Distributed Systems (ICPADS), 2020, pp. 196-205, doi: 10.1109/ICPADS51040.2020.00035
  12. Deyle ER, Sugihara G (2011) Generalized Theorems for Nonlinear State Space Reconstruction. PLoS ONE 6(3): e18295. doi:10.1371/journal.pone.0018295
  13. Ye, H., Deyle, E., Gilarranz, L. et al., 2015. Distinguishing time-delayed causal interactions using convergent cross mapping. Sci Rep 5, 14750 (2015). doi:10.1038/srep14750
  14. Cenci, S., Saavedra, S. Non-parametric estimation of the structural stability of non-equilibrium community dynamics. Nat Ecol Evol 3, 912–918 (2019). https://doi.org/10.1038/s41559-019-0879-1
  15. Tsonis A. A., et al. Dynamical evidence for causality between galactic cosmic rays and interannual variation in global temperature. Proc Natl Acad Sci 112(11):3253–3256 (2015).
  16. Nes EH Van, et al. Causal feedbacks in climate change. Nat Clim Chang 5(5):445–448 (2015)
  17. Park, J., et al. Empirical mode modeling. Nonlinear Dyn (2022). https://doi.org/10.1007/s11071-022-07311-y
  18. van Berkel, Niels; Dennis, Simon; Zyphur, Michael; Li, Jinjing; Heathcote, Andrew; Kostakos, Vassilis (2021-07-04). "Modeling interaction as a complex system". Human–Computer Interaction. 36 (4): 279–305. doi:10.1080/07370024.2020.1715221. hdl:11343/247884. ISSN 0737-0024. S2CID 211267275.
  19. Donald L. DeAngelis, Simeon Yurek, 2015, Equation-free modeling unravels the behavior of complex ecological systems. Proceedings of the National Academy of Sciences Mar 2015, 112 (13) 3856-3857; DOI: 10.1073/pnas.1503154112
  20. Sugihara G. and May R., 1990. Nonlinear forecasting as a way of distinguishing chaos from measurement error in time series. Nature, 344:734–741
  21. Sugihara G., 1994. Nonlinear forecasting for the classification of natural time series. Philosophical Transactions: Physical Sciences and Engineering, 348 (1688): 477–495
  22. Sugihara G., May R., Ye H., et al. 2012. Detecting Causality in Complex Ecosystems. Science 338:496–500
  23. Ye H., and G. Sugihara, 2016. Information leverage in interconnected ecosystems: Overcoming the curse of dimensionality. Science 353:922–925
  24. Takens, F. (1981). Detecting strange attractors in turbulence. In D. A. Rand & L. S. Young (Eds.), Dynamical Systems and Turbulence (pp. 366–381). Springer.
  25. Casdagli, M. (1989). Nonlinear prediction of chaotic time series. Physica D: Nonlinear Phenomena, 35(3), 335–356.
  26. Judd, K., & Mees, A. (1998). Embedding as a modeling problem. Physica D: Nonlinear Phenomena, 120(3), 273–286.
  27. Deyle ER. et al. 2016. Tracking and forecasting ecosystem interactions in real time. Proc. R. Soc. B 283: 20152258
  28. Sauer, T., Yorke, J. A., & Casdagli, M. (1991). Embedology. Journal of Statistical Physics, 65(3), 579–616
  29. Cenci S, Sugihara G, Saavedra S, 2019. Regularized S-map for inference and forecasting with noisy ecological time series. Methods in Ecology and Evolution, 10 (5), 650–660
  30. Hiroaki Natsukawa, et al. 2021. A Visual Analytics Approach for Ecosystem Dynamics based on Empirical Dynamic Modeling. IEEE Transactions on Visualization and Computer Graphics. Feb. 2021, 506-516, vol. 27 DOI: 10.1109/TVCG.2020.3028956
  31. Breston, L., Leonardis, E.J., Quinn, L.K. et al. 2021. Convergent cross sorting for estimating dynamic coupling. Sci Rep 11, 20374 (2021). doi:10.1038/s41598-021-98864-2
  32. Deyle E. R. et al. A hybrid empirical and parametric approach for managing ecosystem complexity: Water quality in Lake Geneva under nonstationary futures. PNAS Vol. 119, No. 26 (2022).
  33. Ge, X., Lin, A. Dynamic causality analysis using overlapped sliding windows based on the extended convergent cross-mapping. Nonlinear Dyn 104, 1753–1765 (2021). https://doi.org/10.1007/s11071-021-06362-x
  34. Bethany Johnson, Stephan B. Munch. 2022. An empirical dynamic modeling framework for missing or irregular samples. Ecological Modelling, Volume 468, June 2022, 109948.
  35. Chang, C.-W., Miki, T., Ushio, M., et al. (2021) Reconstructing large interaction networks from empirical time series data. Ecology Letters, 24, 2763– 2774. https://doi.org/10.1111/ele.13897
