Spectral flatness

Maximum spectral flatness (approaching 1) is achieved by white noise.

Spectral flatness or tonality coefficient, [1] [2] also known as Wiener entropy, [3] [4] is a measure used in digital signal processing to characterize an audio spectrum. Spectral flatness is typically measured in decibels, and provides a way to quantify how much a sound resembles a pure tone, as opposed to being noise-like. [2]

Interpretation

The meaning of tonal in this context is in the sense of the number of peaks or the amount of resonant structure in a power spectrum, as opposed to the flat spectrum of white noise. A high spectral flatness (approaching 1.0 for white noise) indicates that the spectrum has a similar amount of power in all spectral bands; this would sound similar to white noise, and the graph of the spectrum would appear relatively flat and smooth. A low spectral flatness (approaching 0.0 for a pure tone) indicates that the spectral power is concentrated in a relatively small number of bands; this would typically sound like a mixture of sine waves, and the spectrum would appear "spiky". [5]

Dubnov [2] has shown that spectral flatness is equivalent to the information-theoretic concept of mutual information known as dual total correlation.

Formulation

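The spectral flatness is calculated by dividing the geometric mean of the power spectrum by the arithmetic mean of the power spectrum, i.e.:

$$
\mathrm{Flatness} = \frac{\sqrt[N]{\prod_{n=0}^{N-1} x(n)}}{\frac{1}{N}\sum_{n=0}^{N-1} x(n)} = \frac{\exp\left(\frac{1}{N}\sum_{n=0}^{N-1} \ln x(n)\right)}{\frac{1}{N}\sum_{n=0}^{N-1} x(n)}
$$

where x(n) represents the magnitude of bin number n and N is the number of bins. Note that one or more empty bins yield a flatness of 0, so this measure is most useful when bins are generally not empty.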
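The ratio of geometric mean to arithmetic mean described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `spectral_flatness` and the choice to raise an error for an all-zero spectrum are assumptions made here. The geometric mean is computed through logarithms to avoid underflow when many small values are multiplied.

```python
import math

def spectral_flatness(power_spectrum):
    """Return geometric mean / arithmetic mean of non-negative bin values."""
    values = list(power_spectrum)
    if not values:
        raise ValueError("power spectrum must contain at least one bin")
    if any(v < 0 for v in values):
        raise ValueError("bin values must be non-negative")

    arithmetic_mean = sum(values) / len(values)
    if arithmetic_mean == 0:
        # Assumption for this sketch: 0/0 is treated as undefined.
        raise ValueError("flatness is undefined for an all-zero spectrum")

    # A single empty bin forces the geometric mean, and so the flatness, to 0.
    if any(v == 0 for v in values):
        return 0.0

    # Geometric mean as exp(mean(log x)) for numerical stability.
    log_mean = sum(math.log(v) for v in values) / len(values)
    return math.exp(log_mean) / arithmetic_mean

# Illustrative, made-up spectra: one flat, one dominated by a single bin.
flat = spectral_flatness([1.0, 1.0, 1.0, 1.0])
peaky = spectral_flatness([1e-6, 1e-6, 1.0, 1e-6])
```

Here `flat` comes out at 1, while `peaky` is close to 0 because nearly all of the power sits in one bin.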

The ratio produced by this calculation is often converted to a decibel scale for reporting, with a maximum of 0 dB and a minimum of −∞ dB.
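As a rough sketch of that conversion, the snippet below maps a flatness ratio to decibels. It assumes the ratio was formed from power values, so the factor is 10·log10; the helper name `flatness_to_db` is hypothetical.

```python
import math

def flatness_to_db(flatness):
    """Convert a flatness ratio in [0, 1] to decibels (power-ratio convention)."""
    if not 0.0 <= flatness <= 1.0:
        raise ValueError("flatness must lie between 0 and 1")
    if flatness == 0.0:
        # A flatness of exactly 0 corresponds to minus infinity dB.
        return -math.inf
    return 10.0 * math.log10(flatness)

white_noise_db = flatness_to_db(1.0)   # upper limit of the scale
tonal_db = flatness_to_db(0.001)       # strongly tonal spectrum
```

A flatness of 1 maps to the 0 dB maximum, and smaller ratios give increasingly negative values.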

The spectral flatness can also be measured within a specified sub-band, rather than across the whole band.
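A sub-band measurement might look like the following sketch, which keeps only the bins whose centre frequency falls inside a chosen range before forming the ratio. The function name `band_flatness`, the half-open interval `[low_hz, high_hz)`, and the example data are all assumptions made for illustration.

```python
import math

def band_flatness(power_spectrum, bin_freqs_hz, low_hz, high_hz):
    """Flatness of the bins with low_hz <= frequency < high_hz."""
    band = [p for p, f in zip(power_spectrum, bin_freqs_hz)
            if low_hz <= f < high_hz]
    if not band:
        raise ValueError("no bins fall inside the requested band")

    arithmetic_mean = sum(band) / len(band)
    if arithmetic_mean == 0:
        raise ValueError("flatness is undefined for an all-zero band")
    if min(band) == 0:
        return 0.0  # an empty bin drives the geometric mean to zero

    log_mean = sum(math.log(p) for p in band) / len(band)
    return math.exp(log_mean) / arithmetic_mean

# Made-up spectrum: flat below 4 kHz, one strong peak above it.
freqs = [0, 1000, 2000, 3000, 4000, 5000, 6000, 7000]
power = [1.0, 1.0, 1.0, 1.0, 0.01, 4.0, 0.01, 0.01]

low_band = band_flatness(power, freqs, 0, 4000)
high_band = band_flatness(power, freqs, 4000, 8000)
```

The two bands of the same spectrum give very different values, which is why a per-band figure can be more informative than a single full-band number.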

Applications

This measurement is one of the many audio descriptors used in the MPEG-7 standard, in which it is labelled "AudioSpectralFlatness".

In birdsong research, it has been used as one of the features measured on birdsong audio, when testing similarity between two excerpts. [6] Spectral flatness has also been used in the analysis of electroencephalography (EEG) diagnostics and research, [7] and psychoacoustics in humans. [8]

References

  1. J. D. Johnston (1988). "Transform coding of audio signals using perceptual noise criteria". IEEE Journal on Selected Areas in Communications. 6 (2): 314–332. doi:10.1109/49.608. S2CID 5999699.
  2. Shlomo Dubnov (2004). "Generalization of Spectral Flatness Measure for Non-Gaussian Linear Processes". Signal Processing Letters. 11 (8): 698–701. Bibcode:2004ISPL...11..698D. doi:10.1109/LSP.2004.831663. ISSN 1070-9908. S2CID 14778866.
  3. The Song Features › Wiener entropy "defined as the ratio of geometric mean to arithmetic mean of the spectrum"
  4. Luscinia parameters "Wiener entropy is an alternative measure of the noisiness of a signal. It is defined as the ratio of the geometric mean to the arithmetic mean of the power spectrum."
  5. A Large Set of Audio Features for Sound Description. Technical report published by IRCAM in 2003. Section 9.1.
  6. Tchernichovski, O.; Nottebohm, F.; Ho, C. E.; Pesaran, B.; Mitra, P. P. (2000). "A procedure for an automated measurement of song similarity". Animal Behaviour. 59 (6): 1167–1176. doi:10.1006/anbe.1999.1416.
  7. Burns, T.; Rajan, R. (2015). "Combining complexity measures of EEG data: multiplying measures reveal previously hidden information". F1000Research. 4: 137. doi:10.12688/f1000research.6590.1. PMC 4648221. PMID 26594331.
  8. Burns, T.; Rajan, R. (2019). "A Mathematical Approach to Correlating Objective Spectro-Temporal Features of Non-linguistic Sounds With Their Subjective Perceptions in Humans". Frontiers in Neuroscience. 13: 794. doi:10.3389/fnins.2019.00794. PMC 6685481. PMID 31417350.