| Ernst Terhardt | |
|---|---|
| Born | 11 December 1934 |
| Nationality | German |
| Education | University of Stuttgart |
| Engineering career | |
| Discipline | Electrical engineer |
Ernst Terhardt (born 11 December 1934) is a German engineer and psychoacoustician who has made significant contributions in diverse areas of audio communication, including pitch perception, music cognition, and Fourier transformation. He was a professor of acoustic communication at the Institute for Electroacoustics, Technical University of Munich, Germany. [1] [2]
Terhardt studied electrical engineering at the University of Stuttgart. His Master's thesis (Diplomarbeit) was entitled "Ein Funktionsmodell des Gehörs" (A functional model of hearing). His doctoral dissertation (Dissertation) was entitled "Beitrag zur Ermittlung der informationstragenden Merkmale von Schallen mit Hilfe der Hörempfindungen" (literally, "Contribution to the determination of the information-carrying characteristics of sounds with the help of auditory sensations"). Both projects were supervised by Eberhard Zwicker, with whom he founded the Institute for Electroacoustics at the Technical University of Munich in 1967. Terhardt's Habilitation thesis (1972) was entitled "Ein Funktionsschema der Tonhöhenwahrnehmung von Klängen" (A functional model of pitch perception in complex sounds).
According to Terhardt's theory of pitch perception, [3] [4] pitch perception proceeds in two separate stages: auditory spectral analysis and harmonic pitch-pattern recognition. In the first stage, the inner ear (cochlea and basilar membrane) performs a running spectral analysis of the incoming signal. The parameters of this analysis (e.g. the effective length and shape of the analysis window) depend directly on physiology and indirectly on the co-evolution of ear and voice as our human and prehuman ancestors interacted with their social and physical environments. The output of this first stage is called a spectral pitch pattern when it is determined by psychoacoustic experiments in which listeners make subjective judgments, matching the perceived pitch of a pure reference tone to that of a successively presented complex tone. The spectral pitches differ in perceptual salience for several reasons: their sound pressure levels differ physically, they lie at different distances above the threshold of hearing, they mask one another (and therefore lie at different distances above the masked threshold), and they may or may not fall within a region to which the ear is particularly sensitive (the dominance region of pitch perception). A cornerstone of Terhardt's approach is the idea that, because spectral pitches are subjective, we must not jump to conclusions about their relationship to their physiological (physical) foundations in the ear and brain.
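The first stage can be illustrated with a short numerical sketch. The following Python fragment is not Terhardt's published procedure; the 40 ms Hann window, the local-maximum rule, and the -20 dB floor are illustrative assumptions standing in for the auditory spectral analysis and salience weighting described above.

```python
import numpy as np

def spectral_pitch_pattern(signal, sample_rate, window_ms=40.0, floor_db=-20.0):
    """Toy first-stage analysis: one short-window spectrum -> prominent peaks.

    Returns (frequency_hz, level_db) pairs standing in for a 'spectral pitch
    pattern'.  The 40 ms Hann window, the -20 dB floor and the local-maximum
    rule are illustrative choices, not Terhardt's published analysis.
    """
    n = int(sample_rate * window_ms / 1000.0)
    frame = signal[:n] * np.hanning(n)            # one analysis window
    spectrum = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    level_db = 20.0 * np.log10(spectrum / (spectrum.max() + 1e-12) + 1e-12)

    peaks = []
    for i in range(1, len(level_db) - 1):
        is_local_max = level_db[i] > level_db[i - 1] and level_db[i] > level_db[i + 1]
        if is_local_max and level_db[i] > floor_db:
            peaks.append((float(freqs[i]), float(level_db[i])))
    return peaks

# Example: a complex tone with equal-amplitude partials at 200, 400 and 600 Hz.
sr = 16000
t = np.arange(0, 0.1, 1.0 / sr)
tone = sum(np.sin(2 * np.pi * f * t) for f in (200.0, 400.0, 600.0))
print(spectral_pitch_pattern(tone, sr))
```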
In the second stage of pitch perception, harmonic patterns among the spectral pitches are spontaneously recognized by the auditory system, in a process analogous to pattern recognition in vision. The output of this stage is a set of virtual pitches that correspond approximately to the fundamentals of near-harmonic series among the spectral pitches. In this process, the auditory system tolerates a certain degree of mistuning, for two main reasons. First, the partials of complex tones in the environment may be physically mistuned relative to a harmonic series (e.g. piano tones). Second, the frequencies of partials may be known only approximately because of the uncertainty principle: the shorter the effective time window, the less accurately the frequency can be determined. The auditory system is therefore physically unable to determine frequencies accurately in very short sound presentations, or in tones whose fundamental frequency changes quickly, for example in speech.
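The second stage can likewise be sketched as a subharmonic-coincidence search, a simplified stand-in for virtual-pitch extraction. The scoring rule used below (number of supporting spectral pitches, then summed 1/m weight) and the 3 % mistuning tolerance are assumptions for this sketch, not the weights of Terhardt's published algorithm.

```python
def virtual_pitch_candidates(spectral_pitches, max_subharmonic=8, tolerance=0.03):
    """Toy second-stage analysis: subharmonic coincidence of spectral pitches.

    Each spectral pitch f 'votes' for the candidate fundamentals f/1 ... f/m.
    Candidates are ranked by (number of supporting partials, summed 1/m
    weight) -- an illustrative scoring rule, not Terhardt's published
    weighting.  The 3 % mistuning tolerance is likewise an assumption.
    """
    scores = {}  # candidate frequency -> [supporting partials, summed weight]
    for f in spectral_pitches:
        for m in range(1, max_subharmonic + 1):
            candidate = f / m
            for existing in scores:
                # merge candidates that agree within the mistuning tolerance
                if abs(existing - candidate) / existing < tolerance:
                    scores[existing][0] += 1
                    scores[existing][1] += 1.0 / m
                    break
            else:
                scores[candidate] = [1, 1.0 / m]
    ranked = sorted(scores.items(), key=lambda kv: (kv[1][0], kv[1][1]), reverse=True)
    return [(freq, count, weight) for freq, (count, weight) in ranked]

# Partials at 400, 600 and 800 Hz: the top candidate lies near 200 Hz (the
# missing fundamental); the runner-up an octave lower illustrates the
# inherent subharmonic ambiguity of the matching stage.
print(virtual_pitch_candidates([400.0, 600.0, 800.0])[:3])
```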
If only one virtual pitch is perceived in a sound, it is generally the one with the highest salience. The output of Terhardt's algorithm for pitch perception is a series of virtual pitches of differing salience, of which the most salient is the prediction for “the” pitch of the sound. The existence of several competing virtual pitches can explain the ambiguity of the pitch of many sounds. Bells with inharmonic spectra are an obvious example: it is often possible to hear the main virtual pitch as the strike tone at the start of the sound, and the main spectral pitch as a hum tone that becomes directly audible as the sound dies away. But Terhardt and his colleagues also demonstrated that regular harmonic complex tones in speech and music are slightly ambiguous in pitch, which may be the ultimate origin of octave equivalence in music and of the perceived tonal affinity of successive tones at octave or fifth intervals. Terhardt claimed that the root of a chord in Western music typically corresponds to its most salient virtual pitch, and that the virtual pitch phenomenon is the ultimate origin of the root effect. He also investigated the perception of roughness in music and claimed that musical consonance and dissonance have two main psychoacoustic components, roughness and harmony, harmony being related to the perception of virtual pitch. [5]
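One way to see why even a strictly harmonic tone is slightly ambiguous is that every partial of a complex tone with fundamental $f_0$ is also a harmonic of the subharmonics $f_0/2$, $f_0/3$, and so on:

$$ n f_0 \;=\; (2n)\,\frac{f_0}{2} \;=\; (3n)\,\frac{f_0}{3}, \qquad n = 1, 2, 3, \dots $$

The candidates an octave and a twelfth below the nominal pitch therefore receive some support in the matching stage, albeit with lower salience than $f_0$ itself; on this simplified reading (an illustration of the argument, not a quotation of Terhardt's derivation), the weaker candidates stand in octave and fifth relationships to the main pitch.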
Terhardt's approach to acoustic communication [6] is based on Karl Popper's theory of three worlds, [7] according to which reality comprises three aspects: the physical, the experiential (perceptions, sensations, emotions), and the abstract (thoughts, knowledge, information, culture). Terhardt maintains that these three aspects of acoustic communication must be carefully separated before the relationships between them can be explored empirically. In the physical world, we consider the physics of sound sources such as the voice and musical instruments; auditory environments, including reflectors; electroacoustic systems such as microphones and loudspeakers; and the ear and brain, considered as a purely physical system. Sound is a signal that is analysed by the ear; to understand this process, we need the foundations of signal processing. To understand auditory perception, we perform psychoacoustic experiments, which are generally about relationships between and among Popper's three worlds.
Acoustics is a branch of physics that deals with the study of mechanical waves in gases, liquids, and solids, including topics such as vibration, sound, ultrasound, and infrasound. A scientist who works in the field of acoustics is an acoustician, while someone working in the field of acoustics technology may be called an acoustical engineer. The application of acoustics is present in almost all aspects of modern society, the most obvious examples being the audio and noise control industries.
In music, harmony is the concept of combining different sounds together in order to create new, distinct musical ideas. Theories of harmony seek to describe or explain the effects created by distinct pitches or tones coinciding with one another; harmonic objects such as chords, textures and tonalities are identified, defined, and categorized in the development of these theories. Harmony is broadly understood to involve both a "vertical" dimension (frequency-space) and a "horizontal" dimension (time-space), and often overlaps with related musical concepts such as melody, timbre, and form.
In music, timbre, also known as tone color or tone quality, is the perceived sound quality of a musical note, sound or tone. Timbre distinguishes different types of sound production, such as choir voices and musical instruments. It also enables listeners to distinguish different instruments in the same category.
Pitch is a perceptual property that allows sounds to be ordered on a frequency-related scale, or more commonly, pitch is the quality that makes it possible to judge sounds as "higher" and "lower" in the sense associated with musical melodies. Pitch is a major auditory attribute of musical tones, along with duration, loudness, and timbre.
The perception of a pitch when the first harmonic (the fundamental) is absent from the waveform is called the missing fundamental phenomenon.
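For exactly harmonic partials, the missing fundamental can be computed as a greatest common divisor; real hearing tolerates some mistuning, so the following is only a first approximation for illustration.

```python
from functools import reduce
from math import gcd

# For exactly harmonic partials, the missing fundamental is simply their
# greatest common divisor: 1200, 1600 and 2000 Hz imply a pitch near 400 Hz
# even though no energy is present at 400 Hz.
partials_hz = [1200, 1600, 2000]
print(reduce(gcd, partials_hz))  # -> 400
```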
In psychoacoustics, a pure tone is a sound with a sinusoidal waveform; that is, a sine wave of constant frequency, phase shift, and amplitude. By extension, in signal processing a single-frequency tone or pure tone is a purely sinusoidal signal. A pure tone has the property – unique among real-valued wave shapes – that its wave shape is unchanged by linear time-invariant systems; that is, only the phase and amplitude change between such a system's pure-tone input and its output.
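The invariance property can be stated compactly: for a linear time-invariant system with frequency response $H(f)$, a sinusoidal input emerges as a sinusoid of the same frequency, altered only in amplitude and phase:

$$ x(t) = A \sin(2\pi f t + \varphi) \;\longrightarrow\; y(t) = A\,\lvert H(f)\rvert\,\sin\bigl(2\pi f t + \varphi + \arg H(f)\bigr). $$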
In music, consonance and dissonance are categorizations of simultaneous or successive sounds. Within the Western tradition, some listeners associate consonance with sweetness, pleasantness, and acceptability, and dissonance with harshness, unpleasantness, or unacceptability, although there is broad acknowledgement that this depends also on familiarity and musical expertise. The terms form a structural dichotomy in which they define each other by mutual exclusion: a consonance is what is not dissonant, and a dissonance is what is not consonant. However, a finer consideration shows that the distinction forms a gradation, from the most consonant to the most dissonant. In casual discourse, as German composer and music theorist Paul Hindemith stressed, "The two concepts have never been completely explained, and for a thousand years the definitions have varied". The term sonance has been proposed to encompass or refer indistinctly to the terms consonance and dissonance.
In audiology and psychoacoustics the concept of critical bands, introduced by Harvey Fletcher in 1933 and refined in 1940, describes the frequency bandwidth of the "auditory filter" created by the cochlea, the sense organ of hearing within the inner ear. Roughly, the critical band is the band of audio frequencies within which a second tone will interfere with the perception of the first tone by auditory masking.
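Widely cited analytical approximations for the critical-band rate $z$ (in Bark) and the critical bandwidth $\Delta f_c$, published by Zwicker and Terhardt in 1980, are, with $f$ in Hz:

$$ \frac{z}{\mathrm{Bark}} \approx 13\arctan\!\bigl(0.00076\,\tfrac{f}{\mathrm{Hz}}\bigr) + 3.5\arctan\!\Bigl[\bigl(\tfrac{f}{7500\,\mathrm{Hz}}\bigr)^{2}\Bigr], \qquad \frac{\Delta f_c}{\mathrm{Hz}} \approx 25 + 75\,\Bigl[1 + 1.4\,\bigl(\tfrac{f}{1000\,\mathrm{Hz}}\bigr)^{2}\Bigr]^{0.69}. $$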
Volley theory states that groups of neurons of the auditory system respond to a sound by firing action potentials slightly out of phase with one another, so that when combined, a greater frequency of sound can be encoded and sent to the brain to be analyzed. The theory was proposed by Ernest Wever and Charles Bray in 1930 as a supplement to the frequency theory of hearing. It was later discovered that this occurs only in response to sounds in the range of roughly 500 Hz to 5000 Hz.
Ohm's acoustic law, sometimes called the acoustic phase law or simply Ohm's law, states that a musical sound is perceived by the ear as a set of a number of constituent pure harmonic tones.
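In modern terms, the law states that a periodic sound $p(t)$ is heard approximately as the set of its Fourier components, largely independently of their relative phases:

$$ p(t) = \sum_{n=1}^{N} A_n \sin(2\pi n f_0 t + \varphi_n), $$

so that the percept is determined mainly by the frequencies $n f_0$ and amplitudes $A_n$ rather than by the phases $\varphi_n$.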
In perception and psychophysics, auditory scene analysis (ASA) is a proposed model for the basis of auditory perception. This is understood as the process by which the human auditory system organizes sound into perceptually meaningful elements. The term was coined by psychologist Albert Bregman. The related concept in machine perception is computational auditory scene analysis (CASA), which is closely related to source separation and blind signal separation.
Computational auditory scene analysis (CASA) is the study of auditory scene analysis by computational means. In essence, CASA systems are "machine listening" systems that aim to separate mixtures of sound sources in the same way that human listeners do. CASA differs from the field of blind signal separation in that it is based on the mechanisms of the human auditory system, and thus uses no more than two microphone recordings of an acoustic environment. It is related to the cocktail party problem.
Computer audition (CA) or machine listening is the general field of study of algorithms and systems for audio interpretation by machines. Since the notion of what it means for a machine to "hear" is very broad and somewhat vague, computer audition attempts to bring together several disciplines that originally dealt with specific problems or had a concrete application in mind. The engineer Paris Smaragdis, interviewed in Technology Review, talks about these systems — "software that uses sound to locate people moving through rooms, monitor machinery for impending breakdowns, or activate traffic cameras to record accidents."
In audio signal processing, auditory masking occurs when the perception of one sound is affected by the presence of another sound.
In physics, sound is a vibration that propagates as an acoustic wave through a transmission medium such as a gas, liquid or solid. In human physiology and psychology, sound is the reception of such waves and their perception by the brain. Only acoustic waves that have frequencies lying between about 20 Hz and 20 kHz, the audio frequency range, elicit an auditory percept in humans. In air at atmospheric pressure, these represent sound waves with wavelengths of 17 meters (56 ft) to 1.7 centimeters (0.67 in). Sound waves above 20 kHz are known as ultrasound and are not audible to humans. Sound waves below 20 Hz are known as infrasound. Different animal species have varying hearing ranges.
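The quoted wavelengths follow from $\lambda = c/f$ with a speed of sound in air of roughly $c \approx 343\ \mathrm{m/s}$ at room temperature:

$$ \lambda_{20\,\mathrm{Hz}} = \frac{343\ \mathrm{m/s}}{20\ \mathrm{Hz}} \approx 17\ \mathrm{m}, \qquad \lambda_{20\,\mathrm{kHz}} = \frac{343\ \mathrm{m/s}}{20\,000\ \mathrm{Hz}} \approx 1.7\ \mathrm{cm}. $$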
William M. Hartmann is a noted physicist, psychoacoustician, author, and former president of the Acoustical Society of America. His major contributions in psychoacoustics are in pitch perception, binaural hearing, and sound localization. Working with junior colleagues, he discovered several major pitch effects: the binaural edge pitch, the binaural coherence edge pitch, the pitch shifts of mistuned harmonics, and the harmonic unmasking effect. His textbook, Signals, Sound and Sensation, is widely used in courses on psychoacoustics. He is currently a professor of physics at Michigan State University.
Psychoacoustics is the branch of psychophysics involving the scientific study of the perception of sound by the human auditory system. It is the branch of science studying the psychological responses associated with sound including noise, speech, and music. Psychoacoustics is an interdisciplinary field including psychology, acoustics, electronic engineering, physics, biology, physiology, and computer science.
Richard Parncutt is an Australian-born academic. He has been professor of systematic musicology at Karl Franzens University Graz in Austria since 1998.
Temporal envelope (ENV) and temporal fine structure (TFS) are changes in the amplitude and frequency of sound perceived by humans over time. These temporal changes are responsible for several aspects of auditory perception, including loudness, pitch and timbre perception and spatial hearing.
Brian C.J. Moore FMedSci, FRS is an Emeritus Professor of Auditory Perception in the University of Cambridge and an Emeritus Fellow of Wolfson College, Cambridge. His research focuses on psychoacoustics, audiology, and the development and assessment of hearing aids.