Detrended fluctuation analysis

In stochastic processes, chaos theory and time series analysis, detrended fluctuation analysis (DFA) is a method for determining the statistical self-affinity of a signal. It is useful for analysing time series that appear to be long-memory processes (diverging correlation time, e.g. power-law decaying autocorrelation function) or 1/f noise.

The obtained exponent is similar to the Hurst exponent, except that DFA may also be applied to signals whose underlying statistics (such as mean and variance) or dynamics are non-stationary (changing with time). It is related to measures based upon spectral techniques such as autocorrelation and Fourier transform.

Peng et al. introduced DFA in 1994, in a paper that had been cited over 3,000 times as of 2022. [1] The method extends (ordinary) fluctuation analysis (FA), which is affected by non-stationarities.

Definition

Figure: DFA on a Brownian motion process, with increasing values of $n$.

Algorithm

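Given: a time series $x_1, x_2, \ldots, x_N$.

Compute its average value $\langle x \rangle = \frac{1}{N} \sum_{t=1}^{N} x_t$.

Sum it into a process $X_t = \sum_{i=1}^{t} (x_i - \langle x \rangle)$. This is the cumulative sum, or profile, of the original time series. For example, the profile of an i.i.d. white noise is a standard random walk.

Select a set $T = \{n_1, \ldots, n_k\}$ of integers such that $n_1 < n_2 < \cdots < n_k$, the smallest $n_1 \approx 4$, the largest $n_k \approx N$, and the sequence is roughly evenly distributed in log-scale: $\log(n_2) - \log(n_1) \approx \log(n_3) - \log(n_2) \approx \cdots$. In other words, it is approximately a geometric progression. [2]

For each $n \in T$, divide the sequence $X_t$ into consecutive segments of length $n$, indexed $i = 0, 1, \ldots, N/n - 1$. Within each segment, compute the least-squares straight-line fit (the local trend). Let $Y_{1,n}, Y_{2,n}, \ldots, Y_{N,n}$ be the resulting piecewise-linear fit.

Compute the root-mean-square deviation from the local trend (local fluctuation):

$$F(n, i) = \sqrt{\frac{1}{n} \sum_{t = in+1}^{in+n} \left( X_t - Y_{t,n} \right)^2}$$

And their root-mean-square is the total fluctuation:

$$F(n) = \sqrt{\frac{1}{N/n} \sum_{i=0}^{N/n-1} F(n, i)^2}$$

(If $N$ is not divisible by $n$, then one can either discard the remainder of the sequence, or repeat the procedure on the reversed sequence, then take their root-mean-square. [3] )

Make the log-log plot of $\log F(n)$ against $\log n$. [4] [5]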
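The steps above can be sketched in Python with NumPy. This is a minimal illustration rather than a reference implementation: the function names `dfa` and `dfa_exponent` are hypothetical, the default of about 20 window lengths capped at one tenth of the series length is an assumption (chosen so that every window length has at least ten segments), and a remainder that does not fill a whole segment is simply discarded.

```python
import numpy as np

def dfa(x, scales=None):
    """Return (scales, F): window lengths and the total fluctuation at each one."""
    x = np.asarray(x, dtype=float)
    N = x.size
    profile = np.cumsum(x - x.mean())          # cumulative sum of the centred series
    if scales is None:
        # Assumed default: about 20 roughly geometric window lengths from 4 to N // 10.
        scales = np.unique(np.rint(np.geomspace(4, N // 10, 20)).astype(int))
    scales = np.asarray(scales, dtype=int)
    F = np.empty(scales.size)
    for k, n in enumerate(scales):
        n_seg = N // n                          # remainder of the sequence is discarded
        segments = profile[: n_seg * n].reshape(n_seg, n).T   # shape (n, n_seg)
        # Least-squares straight line in every segment at once.
        V = np.vander(np.arange(n, dtype=float), 2)           # columns: t, 1
        coeffs, *_ = np.linalg.lstsq(V, segments, rcond=None)
        residual = segments - V @ coeffs
        local = np.sqrt(np.mean(residual ** 2, axis=0))       # one value per segment
        F[k] = np.sqrt(np.mean(local ** 2))                   # total fluctuation
    return scales, F

def dfa_exponent(x, scales=None):
    """Slope of log F(n) against log n, fitted by ordinary least squares."""
    scales, F = dfa(x, scales)
    slope, _intercept = np.polyfit(np.log(scales), np.log(F), 1)
    return slope

rng = np.random.default_rng(0)
white = rng.standard_normal(10_000)
print(dfa_exponent(white))             # theory suggests a value near 0.5
print(dfa_exponent(np.cumsum(white)))  # theory suggests a value near 1.5
```

The sketch does not guard against series so short that fewer than ten samples fit in the largest window; a practical implementation would validate the window lengths first.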

Interpretation

A straight line of slope $\alpha$ on the log-log plot indicates a statistical self-affinity of form $F(n) \propto n^{\alpha}$. Since $F(n)$ monotonically increases with $n$, we always have $\alpha > 0$.

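The scaling exponent $\alpha$ is a generalization of the Hurst exponent, with the precise value giving information about the series self-correlations: $\alpha < 1/2$ indicates an anti-correlated series; $\alpha \simeq 1/2$ an uncorrelated series (white noise); $\alpha > 1/2$ a correlated series; $\alpha \simeq 1$ 1/f noise (pink noise); $\alpha > 1$ a non-stationary, unbounded series; and $\alpha \simeq 3/2$ Brownian noise.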

Because the expected displacement in an uncorrelated random walk of length $N$ grows like $\sqrt{N}$, an exponent of $\tfrac{1}{2}$ would correspond to uncorrelated white noise. When the exponent is between 0 and 1, the result is fractional Gaussian noise.
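The square-root growth of the displacement can be checked numerically. The following sketch simulates uncorrelated random walks with steps of +1 or -1 and measures the root-mean-square end-to-end displacement, which is one way to make "expected displacement" precise. The helper name `rms_displacement`, the number of walks and the seed are arbitrary choices for illustration.

```python
import numpy as np

def rms_displacement(n_steps, n_walks=2000, seed=0):
    """RMS end-to-end displacement of uncorrelated +/-1 random walks."""
    rng = np.random.default_rng(seed)
    steps = rng.choice([-1.0, 1.0], size=(n_walks, n_steps))
    return np.sqrt(np.mean(steps.sum(axis=1) ** 2))

r_short = rms_displacement(400)
r_long = rms_displacement(1600)
# Quadrupling the walk length should roughly double the RMS displacement.
print(r_short, r_long, r_long / r_short)
```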

Pitfalls in interpretation

Though the DFA algorithm always produces a positive number $\alpha$ for any time series, this does not necessarily imply that the time series is self-similar. Self-similarity requires the log-log graph to be sufficiently linear over a wide range of $n$. Furthermore, a combination of techniques including maximum likelihood estimation (MLE), rather than least squares, has been shown to better approximate the scaling, or power-law, exponent. [6]
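One informal way to probe the linearity requirement is to fit the slope of the log-log graph in sliding windows of points and see whether it stays constant. The sketch below is only an illustration of that idea, not a method taken from the cited references: the helper `local_slopes`, the window of five points and the two synthetic fluctuation functions are all assumptions.

```python
import numpy as np

def local_slopes(scales, F, window=5):
    """Least-squares slope of log F versus log n in each sliding window of points."""
    log_n = np.log(np.asarray(scales, dtype=float))
    log_F = np.log(np.asarray(F, dtype=float))
    slopes = []
    for start in range(log_n.size - window + 1):
        sl = slice(start, start + window)
        slope, _intercept = np.polyfit(log_n[sl], log_F[sl], 1)
        slopes.append(slope)
    return np.array(slopes)

# Synthetic fluctuation functions (made-up numbers, for illustration only).
n = np.geomspace(4, 1024, 17)
pure = n ** 0.7                                                # single scaling regime
crossover = np.where(n < 64, n ** 1.2, 64 ** 0.7 * n ** 0.5)   # slope changes at n = 64

for name, F in (("pure", pure), ("crossover", crossover)):
    s = local_slopes(n, F)
    print(name, "slope spread:", s.max() - s.min())
```

A small spread is consistent with a single power law over the tested range; a large spread suggests a crossover or no scaling at all, in which case a single fitted exponent is hard to interpret.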

Also, there are many scaling exponent-like quantities that can be measured for a self-similar time series, including the divider dimension and Hurst exponent. Therefore, the DFA scaling exponent is not a fractal dimension, and does not have certain desirable properties that the Hausdorff dimension has, though in certain special cases it is related to the box-counting dimension for the graph of a time series.

Generalizations

The standard DFA algorithm given above removes a linear trend in each segment. If we remove a degree-n polynomial trend in each segment, it is called DFAn, or higher order DFA. [7]

Since $X_t$ is a cumulative sum of $x_t - \langle x \rangle$, a linear trend in $X_t$ is a constant trend in $x_t - \langle x \rangle$, which is a constant trend in $x_t$ (visible as short sections of "flat plateaus"). In this regard, DFA1 removes the mean from segments of the time series $x_t$ before quantifying the fluctuation.

Similarly, a degree n trend in $X_t$ is a degree (n-1) trend in $x_t$. For example, DFA2 removes linear trends from segments of the time series $x_t$ before quantifying the fluctuation, DFA3 removes parabolic trends from $x_t$, and so on.

The Hurst R/S analysis removes constant trends in the original sequence and thus, in its detrending it is equivalent to DFA1.
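The effect of the detrending order can be illustrated on a series with a purely linear trend, whose cumulative sum is exactly quadratic. In the sketch below, the function `fluctuation` and its `order` parameter are hypothetical names, and time is rescaled to [-1, 1] inside each segment only to keep the polynomial fit well conditioned.

```python
import numpy as np

def fluctuation(x, n, order=1):
    """Total fluctuation at window length n after removing a degree-`order`
    polynomial from each segment of the cumulative sum."""
    x = np.asarray(x, dtype=float)
    profile = np.cumsum(x - x.mean())
    n_seg = x.size // n                                    # remainder is discarded
    segments = profile[: n_seg * n].reshape(n_seg, n).T    # shape (n, n_seg)
    t = np.linspace(-1.0, 1.0, n)                          # rescaled time
    V = np.vander(t, order + 1)                            # columns: t^order, ..., t, 1
    coeffs, *_ = np.linalg.lstsq(V, segments, rcond=None)
    residual = segments - V @ coeffs
    # With equal-length segments, the RMS over all residuals equals the
    # root-mean-square of the per-segment fluctuations.
    return np.sqrt(np.mean(residual ** 2))

ramp = 0.01 * np.arange(1000)           # linear trend in the original series
print(fluctuation(ramp, 50, order=1))   # linear detrending leaves the curvature behind
print(fluctuation(ramp, 50, order=2))   # quadratic detrending removes it almost entirely
```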

Generalization to different moments (multifractal DFA)

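DFA can be generalized by computing

$$F_q(n) = \left( \frac{1}{N/n} \sum_{i} F(n, i)^q \right)^{1/q}$$

where the sum runs over all $N/n$ segments, then making the log-log plot of $\log F_q(n)$ against $\log n$. If the plot is strongly linear, then its slope is $\alpha(q)$. [8] DFA is the special case where $q = 2$.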

Multifractal systems scale as a function $F_q(n) \propto n^{\alpha(q)}$. Essentially, the scaling exponents need not be independent of the scale of the system. In particular, DFA measures the scaling behavior of the second-moment fluctuations.

Kantelhardt et al. intended this scaling exponent as a generalization of the classical Hurst exponent. The classical Hurst exponent corresponds to $H = \alpha(2)$ for stationary cases, and $H = \alpha(2) - 1$ for nonstationary cases. [8] [9] [10]
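The q-th order fluctuation is a power mean of the per-segment fluctuations: raise each to the power q, average, and take the q-th root. A minimal sketch follows, assuming the local fluctuations for one window length are already available as an array. The function name is hypothetical, the sample values are made up, and the treatment of q = 0 as a geometric mean (the limit of the power mean) is a convention assumed here rather than something stated in the text above.

```python
import numpy as np

def generalized_fluctuation(local, q):
    """Power mean of order q of the per-segment fluctuations."""
    local = np.asarray(local, dtype=float)
    if q == 0:
        # Limit q -> 0 of the power mean: the geometric mean (assumed convention).
        return np.exp(np.mean(np.log(local)))
    return np.mean(local ** q) ** (1.0 / q)

# Made-up local fluctuations for a single window length.
local = np.array([0.5, 1.0, 2.0, 4.0])
for q in (-4, -2, 0, 2, 4):
    print(q, generalized_fluctuation(local, q))
```

Negative q emphasises segments with small fluctuations and positive q those with large fluctuations, so a strong dependence of the fitted slope on q points to multifractal scaling.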

Applications

The DFA method has been applied to many systems, e.g. DNA sequences, [11] [12] neuronal oscillations, [10] speech pathology detection, [13] heartbeat fluctuation in different sleep stages, [14] and animal behavior pattern analysis. [15]

The effect of trends on DFA has been studied. [16]

Relations to other methods, for specific types of signal

For signals with power-law-decaying autocorrelation

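In the case of power-law decaying auto-correlations, the correlation function decays with an exponent $\gamma$: $C(L) \sim L^{-\gamma}$. In addition the power spectrum decays as $P(f) \sim f^{-\beta}$. The three exponents are related by: [11] $\gamma = 2 - 2\alpha$, $\beta = 2\alpha - 1$, and $\gamma = 1 - \beta$.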

The relations can be derived using the Wiener–Khinchin theorem. The relation of DFA to the power spectrum method has been well studied. [17]

Thus, $\alpha$ is tied to the slope of the power spectrum $\beta$ and is used to describe the color of noise by this relationship: $\alpha = (\beta + 1)/2$.
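The relations between the DFA exponent, the power-spectrum exponent and the autocorrelation exponent are simple enough to tabulate. The helper names below are hypothetical. The autocorrelation relation is normally applied only when the autocorrelation actually decays as a power law, which corresponds to a DFA exponent between 1/2 and 1.

```python
def spectral_exponent(alpha):
    """Power-spectrum exponent beta from the DFA exponent: beta = 2*alpha - 1."""
    return 2.0 * alpha - 1.0

def autocorrelation_exponent(alpha):
    """Autocorrelation exponent gamma from the DFA exponent: gamma = 2 - 2*alpha."""
    return 2.0 - 2.0 * alpha

# alpha = 0.5, 1.0, 1.5 correspond to white, pink (1/f) and Brownian noise.
for alpha in (0.5, 1.0, 1.5):
    print(alpha, spectral_exponent(alpha))

# A long-range correlated example inside the range where gamma is meaningful.
print(autocorrelation_exponent(0.75))
```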

For fractional Gaussian noise

For fractional Gaussian noise (FGN), we have $\beta \in [-1, 1]$, and thus $\alpha \in [0, 1]$, and $\beta = 2H - 1$, where $H$ is the Hurst exponent. $\alpha$ for FGN is equal to $H$. [18]

For fractional Brownian motion

For fractional Brownian motion (FBM), we have $\beta \in [1, 3]$, and thus $\alpha \in [1, 2]$, and $\beta = 2H + 1$, where $H$ is the Hurst exponent. $\alpha$ for FBM is equal to $H + 1$. [9] In this context, FBM is the cumulative sum or the integral of FGN, thus, the exponents of their power spectra differ by 2.
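The two cases can be combined into a small conversion from a measured DFA exponent to a Hurst exponent. The sketch below assumes the signal is known to be either FGN-like (exponent below 1) or FBM-like (exponent between 1 and 2). The function name is hypothetical, and the boundary value 1 is rejected because it belongs to both ranges.

```python
def hurst_from_alpha(alpha):
    """Map a DFA exponent to (Hurst exponent, assumed signal class)."""
    if 0.0 < alpha < 1.0:
        return alpha, "FGN"          # noise-like signal: H equals the DFA exponent
    if 1.0 < alpha < 2.0:
        return alpha - 1.0, "FBM"    # motion-like signal: H is the DFA exponent minus 1
    raise ValueError("alpha must lie in (0, 1) or (1, 2) for this mapping")

print(hurst_from_alpha(0.7))
print(hurst_from_alpha(1.7))
```

Both calls describe the same Hurst exponent, reflecting that integrating a noise-like signal raises the DFA exponent by 1.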

References

  1. Peng, C.K.; et al. (1994). "Mosaic organization of DNA nucleotides". Phys. Rev. E. 49 (2): 1685–1689. Bibcode:1994PhRvE..49.1685P. doi:10.1103/physreve.49.1685. PMID 9961383. S2CID 3498343.
  2. Hardstone, Richard; Poil, Simon-Shlomo; Schiavone, Giuseppina; Jansen, Rick; Nikulin, Vadim; Mansvelder, Huibert; Linkenkaer-Hansen, Klaus (2012). "Detrended Fluctuation Analysis: A Scale-Free View on Neuronal Oscillations". Frontiers in Physiology. 3: 450. doi:10.3389/fphys.2012.00450. ISSN 1664-042X. PMC 3510427. PMID 23226132.
  3. Zhou, Yu; Leung, Yee (2010-06-21). "Multifractal temporally weighted detrended fluctuation analysis and its application in the analysis of scaling behavior in temperature series". Journal of Statistical Mechanics: Theory and Experiment. 2010 (6): P06021. doi:10.1088/1742-5468/2010/06/P06021. ISSN 1742-5468. S2CID 119901219.
  4. Peng, C.K.; et al. (1995). "Quantification of scaling exponents and crossover phenomena in nonstationary heartbeat time series". Chaos. 5 (1): 82–87. Bibcode:1995Chaos...5...82P. doi:10.1063/1.166141. PMID 11538314. S2CID 722880.
  5. Bryce, R.M.; Sprague, K.B. (2012). "Revisiting detrended fluctuation analysis". Sci. Rep. 2: 315. Bibcode:2012NatSR...2E.315B. doi:10.1038/srep00315. PMC 3303145. PMID 22419991.
  6. Clauset, Aaron; Rohilla Shalizi, Cosma; Newman, M. E. J. (2009). "Power-Law Distributions in Empirical Data". SIAM Review. 51 (4): 661–703. arXiv:0706.1062. Bibcode:2009SIAMR..51..661C. doi:10.1137/070710111. S2CID 9155618.
  7. Kantelhardt, J.W.; et al. (2001). "Detecting long-range correlations with detrended fluctuation analysis". Physica A. 295 (3–4): 441–454. arXiv:cond-mat/0102214. Bibcode:2001PhyA..295..441K. doi:10.1016/s0378-4371(01)00144-3. S2CID 55151698.
  8. Kantelhardt, J.W.; Zschiegner, S.A.; Koscielny-Bunde, E.; Havlin, S.; Bunde, A.; Stanley, H.E. (2002). "Multifractal detrended fluctuation analysis of nonstationary time series". Physica A. 316 (1–4): 87–114. arXiv:physics/0202070. Bibcode:2002PhyA..316...87K. doi:10.1016/s0378-4371(02)01383-3. S2CID 18417413.
  9. Movahed, M. Sadegh; et al. (2006). "Multifractal detrended fluctuation analysis of sunspot time series". Journal of Statistical Mechanics: Theory and Experiment. 02.
  10. Hardstone, Richard; Poil, Simon-Shlomo; Schiavone, Giuseppina; Jansen, Rick; Nikulin, Vadim V.; Mansvelder, Huibert D.; Linkenkaer-Hansen, Klaus (1 January 2012). "Detrended Fluctuation Analysis: A Scale-Free View on Neuronal Oscillations". Frontiers in Physiology. 3: 450. doi:10.3389/fphys.2012.00450. PMC 3510427. PMID 23226132.
  11. Buldyrev; et al. (1995). "Long-range correlation properties of coding and noncoding DNA sequences: GenBank analysis". Phys. Rev. E. 51 (5): 5084–5091. Bibcode:1995PhRvE..51.5084B. doi:10.1103/physreve.51.5084. PMID 9963221.
  12. Bunde, A.; Havlin, S. (1996). Fractals and Disordered Systems. Berlin, Heidelberg, New York: Springer.
  13. Little, M.; McSharry, P.; Moroz, I.; Roberts, S. (2006). "Nonlinear, Biophysically-Informed Speech Pathology Detection" (PDF). 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings. Vol. 2. pp. II-1080–II-1083. doi:10.1109/ICASSP.2006.1660534. ISBN 1-4244-0469-X. S2CID 11068261.
  14. Bunde, A.; et al. (2000). "Correlated and uncorrelated regions in heart-rate fluctuations during sleep". Phys. Rev. Lett. 85 (17): 3736–3739. Bibcode:2000PhRvL..85.3736B. doi:10.1103/physrevlett.85.3736. PMID 11030994. S2CID 21568275.
  15. Bogachev, Mikhail I.; Lyanova, Asya I.; Sinitca, Aleksandr M.; Pyko, Svetlana A.; Pyko, Nikita S.; Kuzmenko, Alexander V.; Romanov, Sergey A.; Brikova, Olga I.; Tsygankova, Margarita; Ivkin, Dmitry Y.; Okovityi, Sergey V.; Prikhodko, Veronika A.; Kaplun, Dmitrii I.; Sysoev, Yuri I.; Kayumov, Airat R. (March 2023). "Understanding the complex interplay of persistent and antipersistent regimes in animal movement trajectories as a prominent characteristic of their behavioral pattern profiles: Towards an automated and robust model based quantification of anxiety test data". Biomedical Signal Processing and Control. 81: 104409. doi:10.1016/j.bspc.2022.104409. S2CID   254206934.
  16. Hu, K.; et al. (2001). "Effect of trends on detrended fluctuation analysis". Phys. Rev. E. 64 (1): 011114. arXiv:physics/0103018. Bibcode:2001PhRvE..64a1114H. doi:10.1103/physreve.64.011114. PMID 11461232. S2CID 2524064.
  17. Heneghan; et al. (2000). "Establishing the relation between detrended fluctuation analysis and power spectral density analysis for stochastic processes". Phys. Rev. E. 62 (5): 6103–6110. Bibcode:2000PhRvE..62.6103H. doi:10.1103/physreve.62.6103. PMID 11101940. S2CID 10791480.
  18. Taqqu, Murad S.; et al. (1995). "Estimators for long-range dependence: an empirical study". Fractals. 3 (4): 785–798. doi:10.1142/S0218348X95000692.