Surrogate data testing

Surrogate data testing [1] (or the method of surrogate data) is a statistical proof-by-contradiction technique similar to permutation tests [2] and, as a resampling technique, related to (but different from) parametric bootstrapping. It is used to detect non-linearity in a time series. [3] The technique involves specifying a null hypothesis that describes a linear process and then generating several surrogate data sets according to that hypothesis using Monte Carlo methods. A discriminating statistic is then calculated for the original time series and for each of the surrogate data sets. If the value of the statistic for the original series is significantly different from its values for the surrogate sets, the null hypothesis is rejected and non-linearity is assumed. [3]
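The procedure described above can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions, not a reference implementation: the function names (`surrogate_test`, `shuffle_surrogate`, `lag1_autocorrelation`), the choice of 99 surrogates and the rank-based two-sided p-value are all choices made here. To keep the sketch short it uses random-shuffle surrogates, which correspond to the simpler null hypothesis of uncorrelated noise rather than a general linear process.

```python
import numpy as np

def lag1_autocorrelation(x):
    """Discriminating statistic (an illustrative choice): autocorrelation at lag 1."""
    x = x - x.mean()
    return np.dot(x[:-1], x[1:]) / np.dot(x, x)

def shuffle_surrogate(x, rng):
    """Surrogate consistent with an i.i.d. null: a random permutation of x."""
    return rng.permutation(x)

def surrogate_test(x, statistic, make_surrogate, n_surrogates=99, seed=0):
    """Return a rank-based two-sided Monte Carlo p-value for the null hypothesis."""
    rng = np.random.default_rng(seed)
    t_orig = statistic(x)
    t_surr = np.array([statistic(make_surrogate(x, rng))
                       for _ in range(n_surrogates)])
    # Count surrogates whose statistic is at least as far from the
    # surrogate median as the statistic of the original series.
    center = np.median(t_surr)
    n_extreme = np.sum(np.abs(t_surr - center) >= np.abs(t_orig - center))
    return (1 + n_extreme) / (n_surrogates + 1)

# Example: a strongly autocorrelated AR(1) series tested against the i.i.d. null.
rng = np.random.default_rng(1)
noise = rng.standard_normal(500)
ar1 = np.empty(500)
ar1[0] = noise[0]
for t in range(1, 500):
    ar1[t] = 0.9 * ar1[t - 1] + noise[t]

p_value = surrogate_test(ar1, lag1_autocorrelation, shuffle_surrogate)
print(p_value)
```

With 99 surrogates the smallest attainable p-value is 1/100, so the number of surrogates bounds the significance level that can be reported. Testing for non-linearity proper would replace `shuffle_surrogate` with a generator that preserves the linear correlations, and the statistic with one that is sensitive to non-linear structure.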

The particular surrogate data testing method to be used is directly related to the null hypothesis. Usually this is similar to the following: the data are a realization of a stationary linear system whose output may have been measured through a monotonically increasing, possibly nonlinear (but static) function. [1] Here linear means that each value depends linearly on past values, or on present and past values of some independent and identically distributed (i.i.d.) process, usually also Gaussian. This is equivalent to saying that the process is of ARMA type. In the case of flows (continuous-time systems), linearity means that the system can be expressed by a linear differential equation. In this hypothesis, a static measurement function is one that depends only on the present value of its argument, not on past ones.
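The null hypothesis above can be written compactly as follows. The notation is not taken from the source and is assumed here for illustration: x_t is the unobserved linear process, y_t the observed series, e_t the i.i.d. driving noise, a_i and b_j the ARMA coefficients of orders p and q, and h the static measurement function.

```latex
\begin{align}
x_t &= \sum_{i=1}^{p} a_i\, x_{t-i} + \sum_{j=0}^{q} b_j\, e_{t-j},
  \qquad e_t \sim \text{i.i.d. } \mathcal{N}(0, \sigma^2), \\
y_t &= h(x_t), \qquad h \text{ monotonically increasing.}
\end{align}
```

The measurement function is static in the sense that y_t depends on x_t alone and on no earlier value of x.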

Methods

Many algorithms to generate surrogate data have been proposed. They are usually classified in two groups: [4]

  • Typical realizations: data series are generated as outputs of a model fitted to the original data.
  • Constrained realizations: data series are created directly from the original data, generally by some suitable transformation of it.

The latter methods do not depend on a particular model or on any parameters, and are thus non-parametric. They are usually based on preserving the linear structure of the original series (for instance, by preserving the autocorrelation function, or equivalently the periodogram, an estimate of the power spectrum). [5] Among constrained-realization methods, the most widely used (which could therefore be called the classical methods) are:

  1. Algorithm 0, or RS (for Random Shuffle): [1] [6] New data are created simply by random permutations of the original series. This concept is also used in permutation tests. The permutations guarantee the same amplitude distribution as the original series, but destroy any temporal correlation that may have been present in the original data. This method is associated with the null hypothesis that the data are uncorrelated i.i.d. noise (possibly Gaussian and measured by a static nonlinear function).
  2. Algorithm 1, or RP (for Random Phases; also known as FT, for Fourier Transform): [1] [7] In order to preserve the linear correlation (the periodogram) of the series, surrogate data are created by taking the inverse Fourier transform of the moduli of the Fourier transform of the original data combined with new (uniformly random) phases. For the surrogates to be real-valued, the Fourier phases must be antisymmetric about the centre of the spectrum (the phase at frequency −f is the negative of the phase at f).
  3. Algorithm 2, or AAFT (for Amplitude Adjusted Fourier Transform): [1] [4] This method approximately combines the advantages of the two previous ones: it tries to preserve both the linear structure and the amplitude distribution. It consists of these steps:
    • Rescaling the data to a Gaussian distribution (Gaussianization).
    • Performing an RP transformation of the new data.
    • Finally, applying the inverse of the first transformation (de-Gaussianization).
    The drawback of this method is precisely that the last step somewhat changes the linear structure.
  4. Iterative algorithm 2, or IAAFT (for Iterative Amplitude Adjusted Fourier Transform): [8] This algorithm is an iterative version of AAFT. The steps are repeated until the autocorrelation function is sufficiently similar to the original, or until there is no change in the amplitudes.
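The four classical algorithms above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions rather than the authors' code: the function names and the `max_iter` parameter are hypothetical, Gaussianization is done by rank matching against sorted normal draws, the real-input FFT routines are used so that the phase antisymmetry is enforced implicitly, and the iterative variant stops when the rank ordering no longer changes (one possible reading of "no change in the amplitudes").

```python
import numpy as np

def random_shuffle(x, rng):
    """Algorithm 0 (RS): a random permutation of the original series."""
    return rng.permutation(x)

def random_phase(x, rng):
    """Algorithm 1 (RP): keep the Fourier moduli, draw new uniform phases."""
    n = len(x)
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.shape)
    new_spectrum = np.abs(spectrum) * np.exp(1j * phases)
    new_spectrum[0] = spectrum[0]        # keep the mean unchanged
    if n % 2 == 0:
        new_spectrum[-1] = spectrum[-1]  # Nyquist term must stay real
    return np.fft.irfft(new_spectrum, n=n)

def aaft(x, rng):
    """Algorithm 2 (AAFT): Gaussianize, randomize phases, de-Gaussianize."""
    ranks = np.argsort(np.argsort(x))
    # Gaussianization: normal draws reordered to follow the ranks of x.
    gaussian = np.sort(rng.standard_normal(len(x)))[ranks]
    randomized = random_phase(gaussian, rng)
    # De-Gaussianization: original values reordered to follow the new ranks.
    return np.sort(x)[np.argsort(np.argsort(randomized))]

def iaaft(x, rng, max_iter=100):
    """Iterative AAFT: alternate between imposing spectrum and distribution."""
    n = len(x)
    sorted_x = np.sort(x)
    target_moduli = np.abs(np.fft.rfft(x))
    s = rng.permutation(x)
    previous_ranks = None
    for _ in range(max_iter):
        # Impose the original Fourier moduli, keeping the current phases.
        phases = np.angle(np.fft.rfft(s))
        s = np.fft.irfft(target_moduli * np.exp(1j * phases), n=n)
        # Impose the original amplitude distribution by rank ordering.
        ranks = np.argsort(np.argsort(s))
        s = sorted_x[ranks]
        if previous_ranks is not None and np.array_equal(ranks, previous_ranks):
            break
        previous_ranks = ranks
    return s
```

Each function takes a one-dimensional array and a `numpy.random.Generator`, for example `iaaft(x, np.random.default_rng(0))`. In this sketch `random_shuffle`, `aaft` and `iaaft` return exact permutations of the input, while `random_phase` reproduces the Fourier moduli of the input but not its amplitude distribution.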

Many other surrogate data methods have been proposed, some based on optimizations to achieve an autocorrelation close to the original one, [9] [10] [11] some based on the wavelet transform [12] [13] [14] and some capable of dealing with some types of non-stationary data. [15] [16] [17]

The above-mentioned techniques are called linear surrogate methods, because they are based on a linear process and address a linear null hypothesis. [9] Broadly speaking, these methods are useful for data showing irregular fluctuations (short-term variability), and such data abound in the real world. However, data with obvious periodicity are also often observed, for example annual sunspot numbers and electrocardiograms (ECG). Time series exhibiting strong periodicities are clearly not consistent with the linear null hypotheses. To tackle this case, some algorithms and null hypotheses have been proposed. [18] [19] [20]

References

  1. J. Theiler; S. Eubank; A. Longtin; B. Galdrikian; J. Doyne Farmer (1992). "Testing for nonlinearity in time series: the method of surrogate data". Physica D. 58 (1–4): 77–94. Bibcode:1992PhyD...58...77T. doi:10.1016/0167-2789(92)90102-S.
  2. J.H. Moore (1999). "Bootstrapping, permutation testing and the method of surrogate data". Physics in Medicine & Biology. 44 (6): L11.
  3. A. Galka (2000). Topics in Nonlinear Time Series Analysis: with Implications for EEG Analysis. River Edge, N.J.: World Scientific. pp. 222–223. ISBN 9789810241483.
  4. J. Theiler; D. Prichard (1996). "Constrained-realization Monte-Carlo method for hypothesis testing". Physica D. 94 (4): 221–235. arXiv:comp-gas/9603001. Bibcode:1996PhyD...94..221T. doi:10.1016/0167-2789(96)00050-4. S2CID 12568769.
  5. A. Galka; T. Ozaki (2001). "Testing for nonlinearity in high-dimensional time series from continuous dynamics". Physica D. 158 (1–4): 32–44. Bibcode:2001PhyD..158...32G. CiteSeerX 10.1.1.379.7641. doi:10.1016/s0167-2789(01)00318-9.
  6. J.A. Scheinkman; B. LeBaron (1989). "Nonlinear Dynamics and Stock Returns". The Journal of Business. 62 (3): 311. doi:10.1086/296465.
  7. A.R. Osborne; A.D. Kirwan Jr.; A. Provenzale; L. Bergamasco (1986). "A search for chaotic behavior in large and mesoscale motions in the Pacific Ocean". Physica D. 23 (1–3): 75–83. Bibcode:1986PhyD...23...75O. doi:10.1016/0167-2789(86)90113-2.
  8. T. Schreiber; A. Schmitz (1996). "Improved Surrogate Data for Nonlinearity Tests". Phys. Rev. Lett. 77 (4): 635–638. arXiv:chao-dyn/9909041. Bibcode:1996PhRvL..77..635S. doi:10.1103/PhysRevLett.77.635. PMID 10062864. S2CID 13193081.
  9. T. Schreiber; A. Schmitz (2000). "Surrogate time series". Physica D. 142 (3–4): 346–382. arXiv:chao-dyn/9909037. Bibcode:2000PhyD..142..346S. doi:10.1016/S0167-2789(00)00043-9. S2CID 13889229.
  10. T. Schreiber (1998). "Constrained Randomization of Time Series Data". Phys. Rev. Lett. 80 (4): 2105–2108. arXiv:chao-dyn/9909042. Bibcode:1998PhRvL..80.2105S. doi:10.1103/PhysRevLett.80.2105. S2CID 42976448.
  11. R. Engbert (2002). "Testing for nonlinearity: the role of surrogate data". Chaos, Solitons & Fractals. 13 (1): 79–84. Bibcode:2002CSF....13...79E. doi:10.1016/S0960-0779(00)00236-8.
  12. M. Breakspear; M. Brammer; P.A. Robinson (2003). "Construction of multivariate surrogate sets from nonlinear data using the wavelet transform". Physica D. 182 (1): 1–22. Bibcode:2003PhyD..182....1B. doi:10.1016/S0167-2789(03)00136-2.
  13. C.J. Keylock (2006). "Constrained surrogate time series with preservation of the mean and variance structure". Phys. Rev. E. 73 (3): 036707. Bibcode:2006PhRvE..73c6707K. doi:10.1103/PhysRevE.73.036707. PMID 16605698.
  14. C.J. Keylock (2007). "A wavelet-based method for surrogate data generation". Physica D. 225 (2): 219–228. Bibcode:2007PhyD..225..219K. doi:10.1016/j.physd.2006.10.012.
  15. T. Nakamura; M. Small (2005). "Small-shuffle surrogate data: Testing for dynamics in fluctuating data with trends". Phys. Rev. E. 72 (5): 056216. Bibcode:2005PhRvE..72e6216N. doi:10.1103/PhysRevE.72.056216. hdl:10397/4826. PMID 16383736.
  16. T. Nakamura; M. Small; Y. Hirata (2006). "Testing for nonlinearity in irregular fluctuations with long-term trends". Phys. Rev. E. 74 (2): 026205. Bibcode:2006PhRvE..74b6205N. doi:10.1103/PhysRevE.74.026205. hdl:10397/7633. PMID 17025523.
  17. J.H. Lucio; R. Valdés; L.R. Rodríguez (2012). "Improvements to surrogate data methods for nonstationary time series". Phys. Rev. E. 85 (5): 056202. Bibcode:2012PhRvE..85e6202L. doi:10.1103/PhysRevE.85.056202. PMID 23004838.
  18. J. Theiler (1995). "On the evidence for low-dimensional chaos in an epileptic electroencephalogram". Physics Letters A. 196 (5–6): 335–341. Bibcode:1995PhLA..196..335T. doi:10.1016/0375-9601(94)00856-K.
  19. M. Small; D. Yu; R.G. Harrison (2001). "Surrogate test for pseudoperiodic time series data". Phys. Rev. Lett. 87 (18): 188101. Bibcode:2001PhRvL..87r8101S. doi:10.1103/PhysRevLett.87.188101. hdl:10397/4856.
  20. X. Luo; T. Nakamura; M. Small (2005). "Surrogate test to distinguish between chaotic and pseudoperiodic time series". Phys. Rev. E. 71 (2): 026230. arXiv:nlin/0404054. Bibcode:2005PhRvE..71b6230L. doi:10.1103/PhysRevE.71.026230. hdl:10397/4828. PMID 15783410. S2CID 35512941.