Gammatone filter

A gammatone impulse response

A gammatone filter is a linear filter described by an impulse response that is the product of a gamma distribution and a sinusoidal tone. It is a widely used model of the auditory filters in the auditory system.

A gammatone response was originally proposed in 1972 as a description of revcor functions measured in the cochlear nucleus of cats. [1]

The gammatone impulse response is given by

$$g(t) = a\, t^{n-1} e^{-2\pi b t} \cos(2\pi f t + \phi),$$

where $f$ (in Hz) is the center frequency, $\phi$ (in radians) is the phase of the carrier, $a$ is the amplitude, $n$ is the filter's order, $b$ (in Hz) is the filter's bandwidth, and $t$ (in seconds) is time.
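As a minimal sketch, the impulse response can be sampled numerically. The code assumes the standard form g(t) = a t^(n-1) exp(-2πbt) cos(2πft + φ) for t ≥ 0. The function name `gammatone_ir`, the sampling rate `fs`, the `duration` argument, and the example values (f = 1000 Hz, b = 125 Hz, n = 4) are illustrative choices, not taken from the article.

```python
import numpy as np

def gammatone_ir(f, b, n=4, a=1.0, phi=0.0, fs=16000, duration=0.05):
    """Sample g(t) = a * t**(n-1) * exp(-2*pi*b*t) * cos(2*pi*f*t + phi).

    f, b are in Hz, phi in radians; fs and duration are illustrative assumptions.
    """
    # Time axis in seconds, starting at t = 0.
    t = np.arange(int(round(duration * fs))) / fs
    # Gamma-shaped amplitude envelope.
    envelope = a * t ** (n - 1) * np.exp(-2 * np.pi * b * t)
    # Envelope multiplied by the sinusoidal carrier (the "tone").
    return t, envelope * np.cos(2 * np.pi * f * t + phi)

# Hypothetical example: a 4th-order filter centered at 1 kHz with b = 125 Hz.
t, g = gammatone_ir(f=1000.0, b=125.0)
```

For order n > 1 the response starts at zero, rises to a maximum near t = (n - 1)/(2πb), and then decays exponentially.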

This time-domain impulse response is a sinusoid (a pure tone) with an amplitude envelope which is a scaled gamma distribution function. [2]
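The scaling can be made explicit. As a sketch, write the envelope as $a\,t^{n-1}e^{-2\pi b t}$ (amplitude $a$, order $n$, bandwidth $b$) and compare it with the gamma probability density with shape $n$ and rate $\lambda = 2\pi b$. The symbols $\lambda$ and $p$ are introduced here only for illustration.

```latex
p(t) = \frac{\lambda^{n}}{\Gamma(n)}\, t^{n-1} e^{-\lambda t},
\qquad \lambda = 2\pi b, \quad t \ge 0
\quad\Longrightarrow\quad
a\, t^{n-1} e^{-2\pi b t} = \frac{a\,\Gamma(n)}{(2\pi b)^{n}}\; p(t)
```

The envelope therefore peaks at the mode of this density, $t = (n-1)/(2\pi b)$ for $n > 1$, so a larger bandwidth gives a shorter, earlier-peaking response.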

Gammatone filterbank cepstral coefficients (GFCCs) are auditory features that were first used in the speech domain and later in underwater target recognition.[citation needed] In GFCCs, a bank of gammatone filters replaces the triangular filters conventionally used in mel-scale filterbanks and MFCC features.
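The following is a rough sketch of how such features might be computed by analogy with MFCCs: filter the signal with a bank of gammatone filters, take per-frame channel energies, compress them, and apply a discrete cosine transform. Everything specific here is an assumption for illustration and is not taken from the article: the function names `gammatone_fir` and `gfcc`, the geometric spacing of center frequencies, the bandwidth rule b = 1.019 × ERB(f) with ERB(f) = 24.7 (4.37 f/1000 + 1), the logarithmic compression, and the frame and hop sizes. Published GFCC pipelines differ in these details.

```python
import numpy as np

def gammatone_fir(f, b, n=4, fs=16000, duration=0.025):
    """Truncated, sampled gammatone impulse response with unit energy (assumption)."""
    t = np.arange(int(round(duration * fs))) / fs
    g = t ** (n - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * f * t)
    return g / np.sqrt(np.sum(g ** 2))

def gfcc(signal, fs=16000, n_filters=32, n_coeffs=13, frame=400, hop=160):
    """Sketch of GFCC-style features; returns an array of shape (n_frames, n_coeffs)."""
    # Assumed center frequencies: geometrically spaced from 100 Hz to 0.4 * fs.
    centers = np.geomspace(100.0, 0.4 * fs, n_filters)
    # Assumed bandwidth rule based on the equivalent rectangular bandwidth (ERB).
    erb = 24.7 * (4.37 * centers / 1000.0 + 1.0)
    n_frames = 1 + (len(signal) - frame) // hop
    energies = np.empty((n_frames, n_filters))
    for j, (f, e) in enumerate(zip(centers, erb)):
        # Causal FIR filtering of the whole signal by one gammatone channel.
        y = np.convolve(signal, gammatone_fir(f, 1.019 * e, fs=fs))[: len(signal)]
        for i in range(n_frames):
            seg = y[i * hop : i * hop + frame]
            energies[i, j] = np.sum(seg ** 2)
    # Log compression (assumption); the small offset avoids log(0).
    log_e = np.log(energies + 1e-10)
    # DCT-II across channels, written out explicitly to stay NumPy-only.
    m = np.arange(n_filters)
    k = np.arange(n_coeffs)[:, None]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n_filters))
    return log_e @ basis.T

# Hypothetical usage: half a second of a 1 kHz tone sampled at 16 kHz.
fs = 16000
x = np.sin(2 * np.pi * 1000.0 * np.arange(fs // 2) / fs)
features = gfcc(x, fs=fs)
```

Each row of `features` holds the cepstral coefficients for one analysis frame. Only the filterbank stage differs from a conventional MFCC computation in this sketch.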

Different ways of motivating the gammatone filter for auditory processing have been presented by Johannesma, [1] Patterson et al., [3] Hewitt and Meddis, [4] and Lindeberg and Friberg. [5]

Variations

Variations and improvements of the gammatone model of auditory filtering include the complex gammatone filter, the gammachirp filter, the all-pole and one-zero gammatone filters, the two-sided gammatone filter, and filter-cascade models, as well as various level-dependent and dynamically nonlinear versions of these. [6]

References

  1. P. I. M. Johannesma (1972). "The pre-response stimulus ensemble of neurons in the cochlear nucleus". IPO Symposium on Hearing Theory. Eindhoven, the Netherlands. pp. 58–69.
  2. Slaney, Malcolm (1993). "An Efficient Implementation of the Patterson–Holdsworth Auditory Filter Bank" (PDF). Apple Computer Technical Report #35.
  3. R. D. Patterson, I. Nimmo-Smith, J. Holdsworth and P. Rice (1987). "An efficient auditory filterbank based on the gammatone function". A Meeting of the IOC Speech Group on Auditory Modelling at RSRE. Vol. 2, no. 7.
  4. M. J. Hewitt and R. Meddis (1994). "A computer model of amplitude-modulation sensitivity of single units in the inferior colliculus". The Journal of the Acoustical Society of America. 95 (4): 2145–2159. doi:10.1121/1.408676.
  5. T. Lindeberg and A. Friberg (2015). "Idealized computational models for auditory receptive fields". PLOS ONE. 10 (3): e0119032. doi:10.1371/journal.pone.0119032.
  6. Richard F. Lyon; Andreas G. Katsiamis; Emmanuel M. Drakakis (2010). "History and Future of Auditory Filter Models" (PDF). Proc. ISCAS. IEEE.