|Part of a series on|
Least-squares spectral analysis (LSSA) is a method of estimating a frequency spectrum, based on a least squares fit of sinusoids to data samples, similar to Fourier analysis.Fourier analysis, the most used spectral method in science, generally boosts long-periodic noise in long gapped records; LSSA mitigates such problems. Unlike with Fourier analysis, data need not be equally spaced to use LSSA.
LSSA is also known as the Vaníček methodor the Gauss-Vaniček method after Petr Vaníček, and as the Lomb method or the Lomb–Scargle periodogram, based on the contributions of Nicholas R. Lomb and, independently, Jeffrey D. Scargle.
The close connections between Fourier analysis, the periodogram, and least-squares fitting of sinusoids have long been known.Most developments, however, are restricted to complete data sets of equally spaced samples. In 1963, Freek J. M. Barning of Mathematisch Centrum, Amsterdam, handled unequally spaced data by similar techniques, including both a periodogram analysis equivalent to what is now referred to the Lomb method, and least-squares fitting of selected frequencies of sinusoids determined from such periodograms, connected by a procedure that is now known as matching pursuit with post-backfitting or orthogonal matching pursuit.
Petr Vaníček, a Canadian geodesist of the University of New Brunswick, also proposed the matching-pursuit approach, which he called "successive spectral analysis" and the result a "least-squares periodogram", with equally and unequally spaced data, in 1969.He generalized this method to account for systematic components beyond a simple mean, such as a "predicted linear (quadratic, exponential, ...) secular trend of unknown magnitude", and applied it to a variety of samples, in 1971.
The Vaníček method was then simplified in 1976 by Nicholas R. Lomb of the University of Sydney, who pointed out its close connection to periodogram analysis.The definition of a periodogram of unequally spaced data was subsequently further modified and analyzed by Jeffrey D. Scargle of NASA Ames Research Center, who showed that with minor changes it could be made identical to Lomb's least-squares formula for fitting individual sinusoid frequencies.
Scargle states that his paper "does not introduce a new detection technique, but instead studies the reliability and efficiency of detection with the most commonly used technique, the periodogram, in the case where the observation times are unevenly spaced," and further points out in reference to least-squares fitting of sinusoids compared to periodogram analysis, that his paper "establishes, apparently for the first time, that (with the proposed modifications) these two methods are exactly equivalent."
Presssummarizes the development this way:
A completely different method of spectral analysis for unevenly sampled data, one that mitigates these difficulties and has some other very desirable properties, was developed by Lomb, based in part on earlier work by Barning and Vanicek, and additionally elaborated by Scargle.
In 1989, Michael J. Korenberg of Queen's University in Kingston, Ontario, developed the "fast orthogonal search" method of more quickly finding a near-optimal decomposition of spectra or other problems,similar to the technique that later became known as orthogonal matching pursuit.
In the Vaníček method, a discrete data set is approximated by a weighted sum of sinusoids of progressively determined frequencies, using a standard linear regression, or least-squares fit.The frequencies are chosen using a method similar to Barning's, but going further in optimizing the choice of each successive new frequency by picking the frequency that minimizes the residual after least-squares fitting (equivalent to the fitting technique now known as matching pursuit with pre-backfitting ). The number of sinusoids must be less than or equal to the number of data samples (counting sines and cosines of the same frequency as separate sinusoids).
A data vector Φ is represented as a weighted sum of sinusoidal basis functions, tabulated in a matrix A by evaluating each function at the sample times, with weight vector x:
where the weight vector x is chosen to minimize the sum of squared errors in approximating Φ. The solution for x is closed-form, using standard linear regression:
Here the matrix A can be based on any set of functions that are mutually independent (not necessarily orthogonal) when evaluated at the sample times; for spectral analysis, the functions used are typically sines and cosines evenly distributed over the frequency range of interest. If too many frequencies are chosen in a too-narrow frequency range, the functions will not be sufficiently independent, the matrix will be badly conditioned, and the resulting spectrum will not be meaningful.
When the basis functions in A are orthogonal (that is, not correlated, meaning the columns have zero pair-wise dot products), the matrix ATA is a diagonal matrix; when the columns all have the same power (sum of squares of elements), then that matrix is an identity matrix times a constant, so the inversion is trivial. The latter is the case when the sample times are equally spaced and the sinusoids are chosen to be sines and cosines equally spaced in pairs on the frequency interval 0 to a half cycle per sample (spaced by 1/N cycle per sample, omitting the sine phases at 0 and maximum frequency where they are identically zero). This particular case is known as the discrete Fourier transform, slightly rewritten in terms of real data and coefficients.
Trying to lower the computational burden of the Vaníček method in 1976(no longer an issue), Lomb proposed using the above simplification in general, except for pair-wise correlations between sine and cosine bases of the same frequency, since the correlations between pairs of sinusoids are often small, at least when they are not too closely spaced. This is essentially the traditional periodogram formulation, but now adopted for use with unevenly spaced samples. The vector x is a good estimate of an underlying spectrum, but since correlations are ignored, Ax is no longer a good approximation to the signal, and the method is no longer a least-squares method – yet it has continued to be referred to as such.
Rather than just taking dot products of the data with sine and cosine waveforms directly, Scargle modified the standard periodogram formula to first find a time delay such that this pair of sinusoids would be mutually orthogonal at sample times , and also adjusted for the potentially unequal powers of these two basis functions, to obtain a better estimate of the power at a frequency, which made his modified periodogram method exactly equivalent to Lomb's method. The time delay is defined by the formula
The periodogram at frequency is then estimated as:
which Scargle reports then has the same statistical distribution as the periodogram in the evenly sampled case.
At any individual frequency , this method gives the same power as does a least-squares fit to sinusoids of that frequency, of the form
The standard Lomb–Scargle periodogram is valid for a model with zero mean. Commonly, this is approximated by subtracting the mean of the data before calculating the periodogram. However, this is an inaccurate assumption when the mean of the model (the fitted sinusoids) is non-zero. The generalized Lomb–Scargle periodogram removes this assumption, and explicitly solves for the mean. In this case, the function fitted is
The generalized Lomb–Scargle periodogram has also been referred to as a floating mean periodogram.
Michael Korenberg of Queen's University in Kingston, Ontario, developed a method for choosing a sparse set of components from an over-complete set, such as sinusoidal components for spectral analysis, called fast orthogonal search (FOS). Mathematically, FOS uses a slightly modified Cholesky decomposition in a mean-square error reduction (MSER) process, implemented as a sparse matrix inversion.As with the other LSSA methods, FOS avoids the major shortcoming of discrete Fourier analysis, and can achieve highly accurate identifications of embedded periodicities and excels with unequally spaced data; the fast orthogonal search method has also been applied to other problems such as nonlinear system identification.
Palmer has developed a method for finding the best-fit function to any chosen number of harmonics, allowing more freedom to find non-sinusoidal harmonic functions.This method is a fast technique (FFT-based) for doing weighted least-squares analysis on arbitrarily spaced data with non-uniform standard errors. Source code that implements this technique is available. Because data are often not sampled at uniformly spaced discrete times, this method "grids" the data by sparsely filling a time series array at the sample times. All intervening grid points receive zero statistical weight, equivalent to having infinite error bars at times between samples.
The most useful feature of the LSSA method is enabling incomplete records to be spectrally analyzed, without the need to manipulate the record or to invent otherwise non-existent data.
Magnitudes in the LSSA spectrum depict the contribution of a frequency or period to the variance of the time series.Generally, spectral magnitudes defined in the above manner enable the output's straightforward significance level regime. Alternatively, magnitudes in the Vaníček spectrum can also be expressed in dB. Note that magnitudes in the Vaníček spectrum follow β-distribution.
Inverse transformation of Vaníček's LSSA is possible, as is most easily seen by writing the forward transform as a matrix; the matrix inverse (when the matrix is not singular) or pseudo-inverse will then be an inverse transformation; the inverse will exactly match the original data if the chosen sinusoids are mutually independent at the sample points and their number is equal to the number of data points.No such inverse procedure is known for the periodogram method.
The LSSA can be implemented in less than a page of MATLAB code.In essence:
"to compute the least-squares spectrum we must compute m spectral values ... which involves performing the least-squares approximation m times, each time to get [the spectral power] for a different frequency"
I.e., for each frequency in a desired set of frequencies, sine and cosine functions are evaluated at the times corresponding to the data samples, and dot products of the data vector with the sinusoid vectors are taken and appropriately normalized; following the method known as Lomb/Scargle periodogram, a time shift is calculated for each frequency to orthogonalize the sine and cosine components before the dot product, as described by Craymer;finally, a power is computed from those two amplitude components. This same process implements a discrete Fourier transform when the data are uniformly spaced in time and the frequencies chosen correspond to integer numbers of cycles over the finite data record.
This method treats each sinusoidal component independently, or out of context, even though they may not be orthogonal on the data points; it is Vaníček's original method. In contrast, as Craymer explains, it is also possible to perform a full simultaneous or in-context least-squares fit by solving a matrix equation, partitioning the total data variance between the specified sinusoid frequencies.Such a matrix least-squares solution is natively available in MATLAB as the backslash operator.
Craymer explains that the simultaneous or in-context method, as opposed to the independent or out-of-context version (as well as the periodogram version due to Lomb), cannot fit more components (sines and cosines) than there are data samples, and further that:
"...serious repercussions can also arise if the selected frequencies result in some of the Fourier components (trig functions) becoming nearly linearly dependent with each other, thereby producing an ill-conditioned or near singular N. To avoid such ill conditioning it becomes necessary to either select a different set of frequencies to be estimated (e.g., equally spaced frequencies) or simply neglect the correlations in N (i.e., the off-diagonal blocks) and estimate the inverse least squares transform separately for the individual frequencies..."
Lomb's periodogram method, on the other hand, can use an arbitrarily high number of, or density of, frequency components, as in a standard periodogram; that is, the frequency domain can be over-sampled by an arbitrary factor.
In Fourier analysis, such as the Fourier transform or the discrete Fourier transform, the sinusoids being fitted to the data are all mutually orthogonal, so there is no distinction between the simple out-of-context dot-product-based projection onto basis functions versus an in-context simultaneous least-squares fit; that is, no matrix inversion is required to least-squares partition the variance between orthogonal sinusoids of different frequencies.This method is usually preferred for its efficient fast Fourier transform implementation, when complete data records with equally spaced samples are available.
In mathematics, the discrete Fourier transform (DFT) converts a finite sequence of equally-spaced samples of a function into a same-length sequence of equally-spaced samples of the discrete-time Fourier transform (DTFT), which is a complex-valued function of frequency. The interval at which the DTFT is sampled is the reciprocal of the duration of the input sequence. An inverse DFT is a Fourier series, using the DTFT samples as coefficients of complex sinusoids at the corresponding DTFT frequencies. It has the same sample-values as the original input sequence. The DFT is therefore said to be a frequency domain representation of the original input sequence. If the original sequence spans all the non-zero values of a function, its DTFT is continuous, and the DFT provides discrete samples of one cycle. If the original sequence is one cycle of a periodic function, the DFT provides all the non-zero values of one DTFT cycle.
In mathematics, an operator is generally a mapping or function that acts on elements of a space to produce elements of another space. There is no general definition of an operator, but the term is often used in place of function when the domain is a set of functions or other structured objects. Also, the domain of an operator is often difficult to be explicitly characterized, and may be extended to related objects. See Operator (physics) for other examples.
A Fourier transform (FT) is a mathematical transform that decomposes functions depending on space or time into functions depending on spatial frequency or temporal frequency. An example application would be decomposing the waveform of a musical chord into terms of the intensity of its constituent pitches. The term Fourier transform refers to both the frequency domain representation and the mathematical operation that associates the frequency domain representation to a function of space or time.
In mathematics, the discrete sine transform (DST) is a Fourier-related transform similar to the discrete Fourier transform (DFT), but using a purely real matrix. It is equivalent to the imaginary parts of a DFT of roughly twice the length, operating on real data with odd symmetry, where in some variants the input and/or output data are shifted by half a sample.
In signal processing and statistics, a window function is a mathematical function that is zero-valued outside of some chosen interval, normally symmetric around the middle of the interval, usually near a maximum in the middle, and usually tapering away from the middle. Mathematically, when another function or waveform/data-sequence is "multiplied" by a window function, the product is also zero-valued outside the interval: all that is left is the part where they overlap, the "view through the window". Equivalently, and in actual practice, the segment of data within the window is first isolated, and then only that data is multiplied by the window function values. Thus, tapering, not segmentation, is the main purpose of window functions.
A sine wave, sinusoidal wave, or just sinusoid is a mathematical curve defined in terms of the sine trigonometric function, of which it is the graph. It is a type of continuous wave and also a smooth periodic function. It occurs often in mathematics, as well as in physics, engineering, signal processing and many other fields.
In signal processing, a periodogram is an estimate of the spectral density of a signal. The term was coined by Arthur Schuster in 1898. Today, the periodogram is a component of more sophisticated methods. It is the most common tool for examining the amplitude vs frequency characteristics of FIR filters and window functions. FFT spectrum analyzers are also implemented as a time-sequence of periodograms.
In mathematics, the Hartley transform (HT) is an integral transform closely related to the Fourier transform (FT), but which transforms real-valued functions to real-valued functions. It was proposed as an alternative to the Fourier transform by Ralph V. L. Hartley in 1942, and is one of many known Fourier-related transforms. Compared to the Fourier transform, the Hartley transform has the advantages of transforming real functions to real functions and of being its own inverse.
The concept of signed frequency can indicate both the rate and sense of rotation; in can be as simple as a wheel rotating clockwise or counterclockwise. The rate is expressed in units such as revolutions per second (hertz) or radian/second.
Spectrum continuation analysis (SCA) is a generalization of the concept of Fourier series to non-periodic functions of which only a fragment has been sampled in the time domain.
Prony analysis was developed by Gaspard Riche de Prony in 1795. However, practical use of the method awaited the digital computer. Similar to the Fourier transform, Prony's method extracts valuable information from a uniformly sampled signal and builds a series of damped complex exponentials or damped sinusoids. This allows for the estimation of frequency, amplitude, phase and damping components of a signal.
MUSIC is an algorithm used for frequency estimation and radio direction finding.
Geophysical survey is the systematic collection of geophysical data for spatial studies. Detection and analysis of the geophysical signals forms the core of Geophysical signal processing. The magnetic and gravitational fields emanating from the Earth's interior hold essential information concerning seismic activities and the internal structure. Hence, detection and analysis of the electric and Magnetic fields is very crucial. As the Electromagnetic and gravitational waves are multi-dimensional signals, all the 1-D transformation techniques can be extended for the analysis of these signals as well. Hence this article also discusses multi-dimensional signal processing techniques.
Maximum entropy spectral estimation is a method of spectral density estimation. The goal is to improve the spectral quality based on the principle of maximum entropy. The method is based on choosing the spectrum which corresponds to the most random or the most unpredictable time series whose autocorrelation function agrees with the known values. This assumption, which corresponds to the concept of maximum entropy as used in both statistical mechanics and information theory, is maximally non-committal with regard to the unknown values of the autocorrelation function of the time series. It is simply the application of maximum entropy modeling to any type of spectrum and is used in all fields where data is presented in spectral form. The usefulness of the technique varies based on the source of the spectral data since it is dependent on the amount of assumed knowledge about the spectrum that can be applied to the model.
In statistical signal processing, the goal of spectral density estimation (SDE) is to estimate the spectral density of a random signal from a sequence of time samples of the signal. Intuitively speaking, the spectral density characterizes the frequency content of the signal. One purpose of estimating the spectral density is to detect any periodicities in the data, by observing peaks at the frequencies corresponding to these periodicities.
In statistics, signal processing, and time series analysis, a sinusoidal model is used to approximate a sequence Yi to a sine function:
SigSpec is a statistical technique to provide the reliability of periodicities in a measured time series. It relies on the amplitude spectrum obtained by the Discrete Fourier transform (DFT) and assigns a quantity called the spectral significance to each amplitude. This quantity is a logarithmic measure of the probability that the given amplitude level would be seen in white noise, in the sense of a type I error. It represents the answer to the question, “What would be the chance to obtain an amplitude like the measured one or higher, if the analysed time series were random?”
The spectrum of a chirp pulse describes its characteristics in terms of its frequency components. This frequency-domain representation is an alternative to the more familiar time-domain waveform, and the two versions are mathematically related by the Fourier transform.
The spectrum is of particular interest when pulses are subject to signal processing. For example, when a chirp pulse is compressed by its matched filter, the resulting waveform contains not only a main narrow pulse but, also, a variety of unwanted artifacts many of which are directly attributable to features in the chirp's spectral characteristics.
The simplest way to derive the spectrum of a chirp, now that computers are widely available, is to sample the time-domain waveform at a frequency well above the Nyquist limit and call up an FFT algorithm to obtain the desired result. As this approach was not an option for the early designers, they resorted to analytic analysis, where possible, or to graphical or approximation methods, otherwise. These early methods still remain helpful, however, as they give additional insight into the behavior and properties of chirps.
A frequency-selective surface (FSS) is any thin, repetitive surface designed to reflect, transmit or absorb electromagnetic fields based on the frequency of the field. In this sense, an FSS is a type of optical filter or metal-mesh optical filters in which the filtering is accomplished by virtue of the regular, periodic pattern on the surface of the FSS. Though not explicitly mentioned in the name, FSS's also have properties which vary with incidence angle and polarization as well - these are unavoidable consequences of the way in which FSS's are constructed. Frequency-selective surfaces have been most commonly used in the radio frequency region of the electromagnetic spectrum and find use in applications as diverse as the aforementioned microwave oven, antenna radomes and modern metamaterials. Sometimes frequency selective surfaces are referred to simply as periodic surfaces and are a 2-dimensional analog of the new periodic volumes known as photonic crystals.