Transaural


Transaural stereo is a suite of analog circuits and digital signal processing algorithms for sound playback in audio communication and entertainment. It is based on crosstalk cancellation, but some versions also incorporate other processes such as binaural synthesis and equalization.

The technology was developed in the 1970s by Duane H. Cooper and Jerald L. Bauck.

Description

The central concept behind transaural stereo is that there are two loudspeakers and a single listener (two ears). The left-channel signal should only reach the left ear and the right-channel signal should only reach the right ear, each with appropriate timbral corrections.

To achieve this, a circuit or computer algorithm is designed from knowledge of the four frequency-dependent transfer functions between the two loudspeakers and the two ears: the ipsilateral (same-side) and contralateral (opposite-side) paths.

These four functions are examples of head-related transfer functions (HRTF).
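The idea of cancelling crosstalk from the four transfer functions can be illustrated with a minimal sketch. It assumes the four paths are available as complex frequency responses and simply inverts the resulting 2×2 matrix at each frequency bin. The function names, the argument naming (`h_lr` is taken to mean "right loudspeaker to left ear") and the toy path model (a unit ipsilateral path and an attenuated, delayed contralateral path) are illustrative assumptions, not taken from Cooper and Bauck's designs. A practical filter would also need measured HRTFs and some protection against ill-conditioned frequencies.

```python
import numpy as np

def acoustic_matrix(h_ll, h_lr, h_rl, h_rr):
    """Stack the four paths into one 2x2 matrix per frequency bin.

    Rows are ears (left, right); columns are loudspeakers (left, right).
    h_ll and h_rr are the ipsilateral paths, h_lr and h_rl the contralateral ones.
    """
    top = np.stack([h_ll, h_lr], axis=-1)      # paths into the left ear
    bottom = np.stack([h_rl, h_rr], axis=-1)   # paths into the right ear
    return np.stack([top, bottom], axis=-2)    # shape: (bins, 2, 2)

def crosstalk_canceller(h_ll, h_lr, h_rl, h_rr):
    """Plain per-bin matrix inverse (no regularization; a sketch only)."""
    return np.linalg.inv(acoustic_matrix(h_ll, h_lr, h_rl, h_rr))

# Hypothetical toy model: the contralateral path is quieter and arrives later.
freqs = np.linspace(100.0, 8000.0, 64)                   # Hz
gain, delay = 0.7, 0.25e-3                               # assumed values
h_ipsi = np.ones_like(freqs, dtype=complex)
h_contra = gain * np.exp(-2j * np.pi * freqs * delay)

H = acoustic_matrix(h_ipsi, h_contra, h_contra, h_ipsi)
C = crosstalk_canceller(h_ipsi, h_contra, h_contra, h_ipsi)

# Speakers driven through C, then through the acoustic paths H:
# each input channel should arrive at its own ear only.
ears = H @ C
print(np.allclose(ears, np.eye(2)))
```

Feeding the loudspeakers with `C` applied to the two input channels makes the combined response `H @ C` the identity matrix in this toy model, which is the "left signal to left ear only, right signal to right ear only" condition described above.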

A more general theory allows arbitrary numbers of loudspeakers and ears (listeners). The inputs to the process are sometimes binaural signals recorded with a mannequin ("dummy head"), but this is not a requirement. Virtual loudspeakers can be formed by combining crosstalk cancellation with binaural image synthesis. For example, narrowly spaced loudspeakers can be made to sound farther apart, or a five-channel surround sound system can be produced with only two actual loudspeakers: a virtual home theater.

History

Transaural stereo was developed by Duane H. Cooper and Jerald L. Bauck. An early version was published as a master's thesis at the University of Illinois in 1978 and later in the Journal of the Audio Engineering Society. [1] The work continued in the mid-1980s as an improvement on, and practical implementation of, the comparative auditoria studies carried out in the 1960s by Schroeder and Atal, [2] whose approach was reported to produce unstable images under slight head movements.

Using methods to stabilize images and reduce the filter count, Cooper and Bauck built an analog crosstalk canceller, a two-speaker spreader, and an eight-position binaural image synthesizer that doubled as a binaural pan pot between 1987 and 1989. These devices used biquadratic analog filters in shuffler configurations. Later implementations used highly efficient digital biquadratic filters.
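The reduction in filter count offered by a shuffler arrangement can be sketched as follows, assuming a left-right symmetric listening geometry so that both ipsilateral paths are equal and both contralateral paths are equal. In that case the 2×2 inverse separates into one filter on the sum of the channels and one on their difference, so two filters replace four. The function name and the toy path model are hypothetical, and the sketch works on frequency-domain values only; it does not reproduce the biquadratic filter designs mentioned above.

```python
import numpy as np

def shuffler_canceller(left, right, h_ipsi, h_contra):
    """Symmetric crosstalk canceller in sum/difference (shuffler) form.

    All arguments are complex spectra with one value per frequency bin.
    Returns the spectra to feed to the left and right loudspeakers.
    """
    total = left + right                        # first shuffle: sum channel
    diff = left - right                         # first shuffle: difference channel
    sum_out = total / (h_ipsi + h_contra)       # filter 1 acts on the sum
    diff_out = diff / (h_ipsi - h_contra)       # filter 2 acts on the difference
    # Second shuffle returns to left/right loudspeaker feeds.
    return 0.5 * (sum_out + diff_out), 0.5 * (sum_out - diff_out)

# Hypothetical symmetric toy model (assumed values, not measured HRTFs).
freqs = np.linspace(100.0, 8000.0, 64)
h_ipsi = np.ones_like(freqs, dtype=complex)
h_contra = 0.7 * np.exp(-2j * np.pi * freqs * 0.25e-3)

# Arbitrary test input spectra for the two channels.
rng = np.random.default_rng(0)
x_l = rng.standard_normal(64) + 1j * rng.standard_normal(64)
x_r = rng.standard_normal(64) + 1j * rng.standard_normal(64)

spk_l, spk_r = shuffler_canceller(x_l, x_r, h_ipsi, h_contra)

# Propagate the loudspeaker feeds through the acoustic paths to the ears.
ear_l = h_ipsi * spk_l + h_contra * spk_r
ear_r = h_contra * spk_l + h_ipsi * spk_r
print(np.allclose(ear_l, x_l), np.allclose(ear_r, x_r))
```

The two division steps correspond to the only two filters required; the sums and differences around them are the shuffles themselves and need no frequency-dependent processing.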

The distributed source concept, with both discrete and continuous source distributions, was created in March 1997. It was later refined, and the refinement was named Optimal Source Distribution.

References

  1. Bauck, Jerald; Cooper, Duane (May 1980). "On Acoustical Specification of Natural Stereo Imaging". Journal of the Audio Engineering Society. Retrieved July 30, 2020.
  2. Schroeder, M.R.; Atal, B.S. (1963). "Computer Simulation of Sound Transmission in Rooms". IEEE Conv. Rec., Pt. 7: 150–155.