In stochastic processes, chaos theory and time series analysis, detrended fluctuation analysis (DFA) is a method for determining the statistical self-affinity of a signal. It is useful for analysing time series that appear to be long-memory processes (diverging correlation time, e.g. power-law decaying autocorrelation function) or 1/f noise.
The obtained exponent is similar to the Hurst exponent, except that DFA may also be applied to signals whose underlying statistics (such as mean and variance) or dynamics are non-stationary (changing with time). It is related to measures based upon spectral techniques such as autocorrelation and Fourier transform.
Peng et al. introduced DFA in 1994, in a paper that had been cited over 3,000 times as of 2022. [1] The method extends (ordinary) fluctuation analysis (FA), which is affected by non-stationarities.
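Given: a time series $x_1, x_2, \dots, x_N$.
Compute its average value $\langle x \rangle = \frac{1}{N}\sum_{t=1}^{N} x_t$.
Sum it into a process $X_t = \sum_{i=1}^{t} (x_i - \langle x \rangle)$. This is the cumulative sum, or profile, of the original time series. For example, the profile of an i.i.d. white noise is a standard random walk.
Select a set $T = \{n_1, \dots, n_k\}$ of integers such that $n_1 < n_2 < \cdots < n_k$, the smallest $n_1 \approx 4$, the largest $n_k \approx N$, and the sequence is roughly evenly distributed in log-scale: $\log n_2 - \log n_1 \approx \log n_3 - \log n_2 \approx \cdots$. In other words, it is approximately a geometric progression. [2]
For each $n \in T$, divide the sequence $X_t$ into consecutive segments of length $n$. Within each segment, compute the least-squares straight-line fit (the local trend). Let $Y_{1,n}, Y_{2,n}, \dots, Y_{N,n}$ be the resulting piecewise-linear fit.
Compute the root-mean-square deviation from the local trend (the local fluctuation) in each segment $i = 0, 1, \dots, N/n - 1$: $F(n, i) = \sqrt{\frac{1}{n}\sum_{t=in+1}^{in+n} \left(X_t - Y_{t,n}\right)^2}$. Their root-mean-square is the total fluctuation: $F(n) = \sqrt{\frac{1}{N/n}\sum_{i=0}^{N/n-1} F(n, i)^2}$.
(If $N$ is not divisible by $n$, then one can either discard the remainder of the sequence, or repeat the procedure on the reversed sequence, then take their root-mean-square. [3] )
Make the log-log plot of $\log F(n)$ against $\log n$. [4] [5]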
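The steps above can be sketched in Python with NumPy. This is a minimal illustration, not a reference implementation: the function names `log_spaced_scales` and `dfa_fluctuation` are made up for this example, the incomplete final segment is simply discarded, and the largest window is capped at a quarter of the series length, which is an assumption made here to keep several segments per window.

```python
import numpy as np

def log_spaced_scales(n_min, n_max, count):
    """Roughly geometric sequence of distinct integer window sizes."""
    raw = np.geomspace(n_min, n_max, count)
    return np.unique(np.round(raw).astype(int))

def dfa_fluctuation(x, scales):
    """Total fluctuation F(n) for each window size n, with linear detrending."""
    x = np.asarray(x, dtype=float)
    profile = np.cumsum(x - x.mean())            # the profile X_t
    fluct = []
    for n in scales:
        n_seg = len(profile) // n                # remainder is discarded
        # One column per segment, shape (n, n_seg).
        seg = profile[: n_seg * n].reshape(n_seg, n).T
        t = np.arange(n, dtype=float)
        # Straight-line fit to every segment at once.
        slope, intercept = np.polyfit(t, seg, 1)
        trend = np.outer(t, slope) + intercept   # local trend, shape (n, n_seg)
        local_f2 = np.mean((seg - trend) ** 2, axis=0)   # F(n, i)^2
        fluct.append(np.sqrt(np.mean(local_f2)))         # F(n)
    return np.array(fluct)

# Example input: seeded white noise (illustrative data only).
rng = np.random.default_rng(0)
x = rng.standard_normal(4096)
scales = log_spaced_scales(4, len(x) // 4, 20)
fluct = dfa_fluctuation(x, scales)
```

Plotting `np.log(fluct)` against `np.log(scales)` gives the log-log plot of the final step.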
A straight line of slope $\alpha$ on the log-log plot indicates a statistical self-affinity of form $F(n) \propto n^{\alpha}$. Since $F(n)$ monotonically increases with $n$, we always have $\alpha > 0$.
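One simple way to read off the slope is an ordinary least-squares line through the points of the log-log plot. The sketch below assumes the window sizes and fluctuation values are already available as arrays; the function name `fit_scaling_exponent` and the synthetic power law are invented for illustration.

```python
import numpy as np

def fit_scaling_exponent(scales, fluct):
    """Least-squares slope and intercept of log F(n) against log n."""
    log_n = np.log(np.asarray(scales, dtype=float))
    log_f = np.log(np.asarray(fluct, dtype=float))
    slope, intercept = np.polyfit(log_n, log_f, 1)
    return slope, intercept

# Synthetic check on an exact power law F(n) = 3 * n**0.7 (made-up numbers).
scales = np.array([4, 8, 16, 32, 64, 128])
fluct = 3.0 * scales ** 0.7
alpha, log_prefactor = fit_scaling_exponent(scales, fluct)
```

On an exact power law the fitted slope recovers the exponent and the intercept is the logarithm of the prefactor; on real data the fit is only meaningful over the range of window sizes where the plot is close to linear.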
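The scaling exponent $\alpha$ is a generalization of the Hurst exponent, with the precise value giving information about the series self-correlations:
$\alpha < 1/2$: anti-correlated
$\alpha \simeq 1/2$: uncorrelated, white noise
$\alpha > 1/2$: correlated
$\alpha \simeq 1$: 1/f noise, pink noise
$\alpha > 1$: non-stationary, unbounded
$\alpha \simeq 3/2$: Brownian noise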
Because the expected displacement in an uncorrelated random walk of length $N$ grows like $\sqrt{N}$, an exponent of $\tfrac{1}{2}$ would correspond to uncorrelated white noise. When the exponent is between 0 and 1, the result is fractional Gaussian noise.
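The square-root growth follows from the independence of the steps. A short derivation, assuming zero-mean, uncorrelated steps $x_i$ with common variance $\sigma^2$ (a symbol introduced here for illustration):

```latex
\begin{aligned}
X_N &= \sum_{i=1}^{N} x_i, \qquad \mathbb{E}[x_i] = 0, \quad \operatorname{Var}(x_i) = \sigma^2, \\
\mathbb{E}\!\left[X_N^2\right] &= \sum_{i=1}^{N}\sum_{j=1}^{N} \mathbb{E}[x_i x_j]
  = \sum_{i=1}^{N} \mathbb{E}\!\left[x_i^2\right] = N\sigma^2, \\
\sqrt{\mathbb{E}\!\left[X_N^2\right]} &= \sigma\sqrt{N} \;\propto\; N^{1/2}.
\end{aligned}
```

The cross terms vanish because $\mathbb{E}[x_i x_j] = 0$ for $i \neq j$, which is exactly the absence of correlation.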
Though the DFA algorithm always produces a positive number $\alpha$ for any time series, this does not necessarily imply that the time series is self-similar. Self-similarity requires the log-log graph to be sufficiently linear over a wide range of $n$. Furthermore, a combination of techniques including maximum likelihood estimation (MLE), rather than least squares, has been shown to better approximate the scaling, or power-law, exponent. [6]
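A rough way to inspect linearity is to look at the slope between each pair of adjacent points on the log-log plot: for a genuine power law these local slopes are all equal, while a crossover between regimes shows up as a drift. The sketch below is one possible diagnostic, not a method prescribed by the text; the function name `local_slopes` and both test curves are fabricated for illustration, and how much spread is acceptable remains a judgment call.

```python
import numpy as np

def local_slopes(scales, fluct):
    """Slope of log F(n) vs log n between each pair of adjacent window sizes."""
    log_n = np.log(np.asarray(scales, dtype=float))
    log_f = np.log(np.asarray(fluct, dtype=float))
    return np.diff(log_f) / np.diff(log_n)

scales = np.array([4, 8, 16, 32, 64, 128, 256])

# Exact power law: every local slope equals 0.8.
pure = scales ** 0.8
# Sum of two power laws (exponents 0.5 and 1): the local slope drifts upward.
crossover = np.sqrt(scales) + scales / 16.0

pure_slopes = local_slopes(scales, pure)
cross_slopes = local_slopes(scales, crossover)
pure_spread = pure_slopes.max() - pure_slopes.min()
cross_spread = cross_slopes.max() - cross_slopes.min()
```

A single fitted slope through the second curve would still return a positive number, which is why the spread of local slopes is worth checking before interpreting the exponent.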
Also, there are many scaling exponent-like quantities that can be measured for a self-similar time series, including the divider dimension and Hurst exponent. Therefore, the DFA scaling exponent is not a fractal dimension, and does not have certain desirable properties that the Hausdorff dimension has, though in certain special cases it is related to the box-counting dimension for the graph of a time series.
The standard DFA algorithm given above removes a linear trend in each segment. If we remove a degree-n polynomial trend in each segment, it is called DFAn, or higher order DFA. [7]
Since $X_t$ is a cumulative sum of $x_t - \langle x \rangle$, a linear trend in $X_t$ is a constant trend in $x_t - \langle x \rangle$, which is a constant trend in $x_t$ (visible as short sections of "flat plateaus"). In this regard, DFA1 removes the mean from segments of the time series before quantifying the fluctuation.
Similarly, a degree-n trend in $X_t$ is a degree-(n-1) trend in $x_t$. For example, DFA2 removes linear trends from segments of the time series before quantifying the fluctuation, DFA3 removes parabolic trends from $x_t$, and so on.
The Hurst R/S analysis removes constant trends in the original sequence and thus, in its detrending it is equivalent to DFA1.
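The effect of the detrending order can be illustrated with a series that is a pure linear ramp: its profile is quadratic, so a straight-line fit leaves a residual while a degree-2 fit removes it almost entirely. The sketch below assumes NumPy; the function name `dfa_order` and the `order` parameter are hypothetical names chosen for this example, and the incomplete final segment is discarded.

```python
import numpy as np

def dfa_order(x, scales, order=1):
    """F(n) with a degree-`order` polynomial removed from each profile segment."""
    x = np.asarray(x, dtype=float)
    profile = np.cumsum(x - x.mean())
    out = []
    for n in scales:
        n_seg = len(profile) // n                       # drop the incomplete tail
        seg = profile[: n_seg * n].reshape(n_seg, n).T  # shape (n, n_seg)
        t = np.linspace(-1.0, 1.0, n)                   # rescaled time for conditioning
        coeffs = np.polyfit(t, seg, order)              # shape (order + 1, n_seg)
        trend = np.vander(t, order + 1) @ coeffs        # fitted polynomials, (n, n_seg)
        # Segments have equal length, so the overall mean equals the
        # mean over segments of the squared local fluctuations.
        out.append(np.sqrt(np.mean((seg - trend) ** 2)))
    return np.array(out)

# A pure linear trend in the series (illustrative data only).
x = np.arange(1024, dtype=float)
scales = [16, 32, 64]
f1 = dfa_order(x, scales, order=1)   # residual of a quadratic after a line fit
f2 = dfa_order(x, scales, order=2)   # quadratic fully absorbed by the fit
```

Here `f2` is zero up to floating-point rounding, whereas `f1` grows with the window size.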
DFA can be generalized by computing $F_q(n) = \left(\frac{1}{N/n}\sum_{i=0}^{N/n-1} F(n, i)^q\right)^{1/q}$, where $F(n, i)$ is the local fluctuation of segment $i$ at window size $n$, then making the log-log plot of $\log F_q(n)$ against $\log n$. If the plot is strongly linear, then its slope is $\alpha(q)$. [8] DFA is the special case where $q = 2$.
Multifractal systems scale as a function $F_q(n) \propto n^{\alpha(q)}$. Essentially, the scaling exponents need not be independent of the scale of the system. In particular, DFA measures the scaling behavior of the second-moment fluctuations.
Kantelhardt et al. intended this scaling exponent as a generalization of the classical Hurst exponent. The classical Hurst exponent corresponds to $H = \alpha(2)$ for stationary cases, and $H = \alpha(2) - 1$ for nonstationary cases. [8] [9] [10]
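The generalized fluctuation is a power mean of order q over the local fluctuations of the segments. A minimal sketch for a single window size, assuming NumPy and linear detrending; the names `local_fluctuations` and `fq` are hypothetical, and the q = 0 branch uses the limiting value of the power mean (the geometric mean), which is an addition made here rather than something stated in the text.

```python
import numpy as np

def local_fluctuations(x, n):
    """Local fluctuation of every complete segment of length n (linear detrending)."""
    x = np.asarray(x, dtype=float)
    profile = np.cumsum(x - x.mean())
    n_seg = len(profile) // n
    seg = profile[: n_seg * n].reshape(n_seg, n).T   # shape (n, n_seg)
    t = np.arange(n, dtype=float)
    coeffs = np.polyfit(t, seg, 1)                   # shape (2, n_seg)
    trend = np.vander(t, 2) @ coeffs                 # fitted lines, (n, n_seg)
    return np.sqrt(np.mean((seg - trend) ** 2, axis=0))

def fq(x, n, q):
    """Order-q power mean of the local fluctuations at window size n."""
    f = local_fluctuations(x, n)
    if q == 0:
        # Limit of the power mean as q -> 0: the geometric mean.
        return float(np.exp(np.mean(np.log(f))))
    return float(np.mean(f ** q) ** (1.0 / q))

# Seeded white noise as example input (illustrative data only).
rng = np.random.default_rng(1)
x = rng.standard_normal(2048)
values = [fq(x, 32, q) for q in (-2, 0, 2, 4)]
```

Repeating this over a set of window sizes and fitting the slope of the log-log plot for each q gives an estimate of the exponent as a function of q; with q = 2 the function reduces to the ordinary DFA fluctuation.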
The DFA method has been applied to many systems, e.g. DNA sequences, [11] [12] neuronal oscillations, [10] speech pathology detection, [13] heartbeat fluctuation in different sleep stages, [14] and animal behavior pattern analysis. [15]
The effect of trends on DFA has been studied. [16]
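In the case of power-law decaying auto-correlations, the correlation function decays with an exponent $\gamma$: $C(L) \sim L^{-\gamma}$. In addition, the power spectrum decays as $P(f) \sim f^{-\beta}$. The three exponents are related by: [11] $\gamma = 2 - 2\alpha$, $\beta = 2\alpha - 1$, and $\gamma = 1 - \beta$.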
The relations can be derived using the Wiener–Khinchin theorem. The relation of DFA to the power spectrum method has been well studied. [17]
Thus, $\alpha$ is tied to the slope of the power spectrum $\beta$ and is used to describe the color of noise by this relationship: $\alpha = (\beta + 1)/2$.
For fractional Gaussian noise (FGN), we have $\beta \in [-1, 1]$, and thus $\alpha \in [0, 1]$, and $\beta = 2H - 1$, where $H$ is the Hurst exponent. $\alpha$ for FGN is equal to $H$. [18]
For fractional Brownian motion (FBM), we have $\beta \in [1, 3]$, and thus $\alpha \in [1, 2]$, and $\beta = 2H + 1$, where $H$ is the Hurst exponent. $\alpha$ for FBM is equal to $H + 1$. [9] In this context, FBM is the cumulative sum or the integral of FGN; thus, the exponents of their power spectra differ by 2.
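The relations between the exponents can be collected into a few small helper functions. This is only a convenience sketch: the function names are invented here, and treating the boundary value of exactly 1 as ambiguous (raising an error rather than choosing between the FGN and FBM readings) is a choice made for this example.

```python
def beta_from_alpha(alpha):
    """Power-spectrum exponent from the DFA exponent: beta = 2*alpha - 1."""
    return 2.0 * alpha - 1.0

def gamma_from_alpha(alpha):
    """Autocorrelation decay exponent: gamma = 2 - 2*alpha.

    Only meaningful when the autocorrelation actually decays as a power law.
    """
    return 2.0 - 2.0 * alpha

def hurst_from_alpha(alpha):
    """Hurst exponent under the FGN (alpha < 1) or FBM (alpha > 1) reading."""
    if 0.0 < alpha < 1.0:
        return alpha          # fractional Gaussian noise: alpha equals H
    if 1.0 < alpha < 2.0:
        return alpha - 1.0    # fractional Brownian motion: alpha equals H + 1
    raise ValueError("alpha must lie strictly inside (0, 1) or (1, 2)")

# Example: a correlated stationary signal and its cumulative sum.
alpha_fgn = 0.75
alpha_fbm = alpha_fgn + 1.0
beta_gap = beta_from_alpha(alpha_fbm) - beta_from_alpha(alpha_fgn)
```

The two example exponents describe the same Hurst exponent, and their power-spectrum exponents differ by 2, as stated for FGN and its integral FBM.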