Ambiophonics

Ambiophonics is a method in the public domain that employs digital signal processing (DSP) and two loudspeakers directly in front of the listener in order to improve reproduction of stereophonic and 5.1 surround sound for music, movies, and games in home theaters, gaming PCs, workstations, or studio monitoring applications. It was first implemented using mechanical means in 1986; [1] [2] today a number of hardware and VST plug-in makers offer Ambiophonic DSP. [3] Ambiophonics eliminates the crosstalk inherent in the conventional stereo triangle speaker placement and thereby generates a speaker-binaural soundfield that emulates headphone-binaural sound, giving the listener a more realistic perception of recorded auditory scenes. A second speaker pair can be added behind the listener to enable 360° surround sound reproduction. Additional surround speakers may be used for hall ambience, including height, if desired.

Ambiophonics, stereophonics, and human hearing

In stereophonics, the reproduced sound is distorted by crosstalk: the signal from each speaker reaches not only the intended ear but also the opposite ear, causing comb filtering that distorts the timbre of central voices and creating false “early reflections” due to the delay of sound reaching the opposite ear. In addition, auditory images are bounded by the left (L) and right (R) speakers, usually positioned at ±30° with respect to the listener, and so span 60°, only 1/6 of the horizontal circle with the listener at the center. Human hearing, by contrast, can locate sound from directions not only around a 360° circle but over a full sphere.
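The comb filtering described above can be estimated with a little arithmetic. The following is a minimal sketch rather than a measurement: it assumes a rigid spherical head (Woodworth's approximation) with a radius of 8.75 cm, a speed of sound of 343 m/s, and an illustrative crosstalk gain of 0.7. None of these values come from the cited sources, and the function names are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s (assumed, room temperature)
HEAD_RADIUS = 0.0875     # m (assumed average head radius)

def interaural_delay(azimuth_deg, radius=HEAD_RADIUS, c=SPEED_OF_SOUND):
    """Woodworth spherical-head estimate of the extra delay to the far ear."""
    theta = math.radians(azimuth_deg)
    return radius / c * (theta + math.sin(theta))

def comb_magnitude(freq_hz, delay_s, gain):
    """|1 + gain * exp(-j*2*pi*f*delay)|: direct sound plus one delayed copy."""
    phase = 2.0 * math.pi * freq_hz * delay_s
    return math.hypot(1.0 + gain * math.cos(phase), gain * math.sin(phase))

def notch_frequencies(delay_s, max_hz=20000.0):
    """Frequencies at which the delayed copy arrives in antiphase."""
    notches = []
    k = 0
    while (2 * k + 1) / (2.0 * delay_s) <= max_hz:
        notches.append((2 * k + 1) / (2.0 * delay_s))
        k += 1
    return notches

# Speakers at +/-30 degrees, centrally panned voice (same signal in both).
tau = interaural_delay(30.0)
first_notch = notch_frequencies(tau)[0]
print(f"crosstalk delay: {tau * 1e6:.0f} microseconds")
print(f"first notch:     {first_notch:.0f} Hz")
print(f"level at notch:  {comb_magnitude(first_notch, tau, 0.7):.2f} (1.0 = no crosstalk)")
```

Under these assumptions the crosstalk reaches the far ear roughly a quarter of a millisecond late, which places the first cancellation notch a little below 2 kHz for a centrally panned voice. Real heads shadow high frequencies more than low ones, so the actual notch depth varies with frequency.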

Ambiophonics eliminates speaker crosstalk and its deleterious effects. Using ambiophonics, auditory images can in theory extend all the way to the sides, at ±90° left and right, covering the front semicircle of 180°, depending on listening acoustics and on the degree to which the recording has captured the interaural level differences (ILD) and the interaural time differences (ITD) that characterize two-eared human hearing. Most existing two-channel discs (LPs as well as CDs) include ILD and ITD data that cannot be reproduced by the stereo loudspeaker “triangle” due to inherent crosstalk. When reproduced using ambiophonics, such existing recordings’ true qualities are revealed, with natural solo voices and wider images, up to 150° in practice.
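To make the two cues concrete, the sketch below estimates ILD and ITD from a pair of channel signals. It is an illustration under stated assumptions, not a method taken from the cited sources: ILD is taken as the ratio of RMS levels in decibels, ITD as the lag that maximizes a brute-force cross-correlation, and the test signal, the 48 kHz sample rate, and all function names are hypothetical.

```python
import math

SAMPLE_RATE = 48000  # Hz (assumed)

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def ild_db(left, right):
    """Interaural level difference in dB; positive means the left is louder."""
    return 20.0 * math.log10(rms(left) / rms(right))

def itd_samples(left, right, max_lag):
    """Lag (in samples) by which `right` trails `left`, found by
    brute-force cross-correlation; positive means left arrives first."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = sum(left[n] * right[n + lag]
                    for n in range(len(left))
                    if 0 <= n + lag < len(right))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# Synthetic source off to the left: the right channel is the same windowed
# tone burst, 12 samples later and at half the amplitude.
burst = [math.sin(2 * math.pi * n / 16) * 0.5 * (1 - math.cos(2 * math.pi * n / 63))
         for n in range(64)]
left = [0.0] * 50 + burst + [0.0] * 50
right = [0.0] * 62 + [0.5 * x for x in burst] + [0.0] * 38

lag = itd_samples(left, right, max_lag=40)
print(f"ILD: {ild_db(left, right):.2f} dB")
print(f"ITD: {lag} samples = {lag / SAMPLE_RATE * 1e6:.0f} microseconds")
```

Broadband estimates like these are a simplification; in hearing, ITD dominates at low frequencies and ILD at high frequencies, so a fuller analysis would work per frequency band.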

It is also possible to make new recordings using binaurally-based main microphones, such as an ambiophone, [3] which is optimized for Ambiophonic reproduction (and is stereo-compatible) since it captures and preserves the same ILD and ITD that one would experience with one's own ears at the recording session. Along with lifelike spatial qualities, more correct timbre (tone color) of sounds is preserved. Use of ORTF, Jecklin Disk, and sphere microphones without pinna (outer ear) can produce similar results. (Note that binaural-based microphone techniques such as these, without pinna, also produce compatible results with conventional speaker-stereo, 5.1 surround, and MP3 players.)

Roots and research

In 1981, Carver Corporation incorporated filtering that attempted to cancel crosstalk in advance in its analogue Carver C4000 Control Console. This was called "Sonic Holography". [4]

An early hardware attempt to compensate for loudspeaker-ear crosstalk was to apply a small amount of out-of-phase left-channel signal to a separate driver in the right speaker cabinet, and vice versa. Polk Audio marketed this in 1982 as "true stereo" in its SDA-SRS, SDA1 and SDA2 series speakers, [5] [6] under license from the Carver Sonic Holography patent. [7]

In 1991, Roland Corporation launched Roland Sound Space, [8] a system that created a 3D sound-space using stereo speakers. It worked better for some listeners than others.

Ambiophonics is an amalgam of new research and previously known psychoacoustic principles and binaural technologies. This knowledge has enabled audio recording and reproduction that approaches, at the ears of the listener, a realistic soundfield comparable to what one would perceive in a concert hall, movie scene, or game environment. This level of fidelity was not realizable until human hearing and acoustics principles had been thoroughly researched and affordable PCs with sufficient processing speed became available. At the Casa Della Musica at the University of Parma, Italy, or at the listening lab at Filmaker Technology, Pennsylvania, US, ambiophonics, ambisonics, stereophonics, 5.1 2D surround, and hybrid full-sphere 3D systems can be compared for their ability to convey the spatiality and tone color of real perception. Developers have provided many scientific papers and downloadable tools for implementing ambiophonics free of charge for personal use. [9]

Results and limitations

By repositioning speakers closer together to create a stereo dipole, and using digital signal processing (DSP) such as free RACE (Recursive Ambiophonic Crosstalk Elimination) or similar software, [10] ambiophonic reproduction is able to generate wide auditory images from most ordinary CDs/LPs/DVDs or MP3s of music, movies, or games and, depending upon the recording, restore the lifelike localization, spatiality, and tone color they have captured. For most test subjects, results are dramatic, suggesting that Ambiophonics has the potential to revitalize interest in high-fidelity sound reproduction, both in stereo and surround.
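The recursive idea behind this kind of crosstalk elimination can be sketched in a few lines. The code below is a simplified illustration, not the RACE software itself: each speaker feed is its input minus a delayed, attenuated copy of the opposite speaker feed, so every cancellation signal is in turn cancelled at the other ear. The delay of 3 samples, the attenuation of 0.7, the toy acoustic model, and the function names are all assumptions chosen for the demonstration; practical implementations also restrict the processing to a limited frequency band and tune these values to the speaker geometry.

```python
def recursive_xtc(left_in, right_in, delay_samples=3, attenuation=0.7):
    """Recursive crosstalk canceller sketch (delay_samples must be >= 1).
    Each output is its input minus a delayed, attenuated copy of the
    opposite OUTPUT, which is what makes the structure recursive."""
    total = len(left_in)
    left_out = [0.0] * total
    right_out = [0.0] * total
    for n in range(total):
        past = n - delay_samples
        prev_right = right_out[past] if past >= 0 else 0.0
        prev_left = left_out[past] if past >= 0 else 0.0
        left_out[n] = left_in[n] - attenuation * prev_right
        right_out[n] = right_in[n] - attenuation * prev_left
    return left_out, right_out

def simulate_ears(left_spk, right_spk, delay_samples=3, attenuation=0.7):
    """Toy acoustic model: each ear hears its own speaker plus a delayed,
    attenuated copy of the opposite speaker (the crosstalk)."""
    left_ear, right_ear = [], []
    for n in range(len(left_spk)):
        past = n - delay_samples
        leak_from_right = right_spk[past] if past >= 0 else 0.0
        leak_from_left = left_spk[past] if past >= 0 else 0.0
        left_ear.append(left_spk[n] + attenuation * leak_from_right)
        right_ear.append(right_spk[n] + attenuation * leak_from_left)
    return left_ear, right_ear

# A click in the left channel only.
impulse = [1.0] + [0.0] * 15
silence = [0.0] * 16

plain_l, plain_r = simulate_ears(impulse, silence)       # ordinary stereo
spk_l, spk_r = recursive_xtc(impulse, silence)           # processed feeds
ear_l, ear_r = simulate_ears(spk_l, spk_r)

print("right ear, unprocessed:", max(abs(x) for x in plain_r))
print("right ear, processed:  ", max(abs(x) for x in ear_r))
```

In this idealized model the canceller's delay and attenuation match the acoustic path exactly, so the right ear receives nothing from the left-channel click. With a real head and room the match is only approximate, which is one reason results depend on the listening setup.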

Additionally, ambiophonics provides for the optional use of concert-hall or other ambience impulse response convolution to generate hall ambience signals for virtually any number and any placement of surround speakers. [11] [12] Ambiophonics is not, however, suited to theaters, auditoriums, or other large audiences. It can usually accommodate more than one listener, since the effect holds as one moves back and forth along the line bisecting the speakers. Precisely because of the higher level of envelopment along this line, the loss of realism when one moves away from the center line is more dramatic with Ambiophonics than with stereo. The listening area can be enlarged with ambience convolution, whereby surround speakers mimic the contributions of concert-hall walls.

Ambiophonics methods can be implemented with ordinary laptops, PCs, soundcards, hi-fi amplifiers, and even modest loudspeakers, provided the loudspeakers have consistent phase response, especially in any crossover regions. Unlike headphone-binaural listening, ambiophonics requires neither true-binaural (dummy head with pinna) recordings nor head tracking. Commercial products now implement ambiophonics DSP, and tools for use on PCs are also available online. [9]

Surround sound

In practice, in its simplest two-speaker implementation, ambiophonic reproduction unlocks auditory cues for images of up to 150° horizontally (azimuth), depending on the binaural cues captured in existing stereo recordings. Multi-channel, 5.1-compatible DVD/SACD recordings made with ambiophone-like microphone arrays can be reproduced using just four speakers (a center speaker is obviated in ambiophonic layouts). Allowing for the human hearing “cone of confusion” at each side, a full 360° circle of perceived sound localization has been measured within ±5° of actual source azimuth, reproducing lifelike spatial envelopment and timbre (contributed by accurate directional provenance of early reflections) of multi-channel music, movies, and game content. [3] [13] [14]

Especially in the case of stereo content where ambience has been purposely reduced (because a natural level coming from front 60°-only is perceived as too much), additional signals for surround speakers can be produced using a measured hall impulse response, convolved in a PC with the two front channel signals. For full ambiophonic replay, one PC can provide the DSP for 4-channel crosstalk-cancellation and four or more (up to 16 depending on the PC) surround speakers. [15]
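The ambience convolution described above amounts to filtering the front channels with a hall impulse response to derive each surround feed. The sketch below shows the operation in its simplest direct form. It is an illustration only: the five-tap impulse response is a made-up stand-in for a measured hall response, the choice to sum one convolved copy of each front channel is an assumption, and the function names are hypothetical. Measured hall responses are seconds long, so practical systems generally use FFT-based convolution instead of this direct loop.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution; output length is
    len(signal) + len(impulse_response) - 1."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def surround_feed(front_left, front_right, ir_from_left, ir_from_right):
    """One surround speaker's ambience signal: each front channel is
    convolved with its own impulse response and the results are summed."""
    wet_left = convolve(front_left, ir_from_left)
    wet_right = convolve(front_right, ir_from_right)
    length = max(len(wet_left), len(wet_right))
    wet_left += [0.0] * (length - len(wet_left))
    wet_right += [0.0] * (length - len(wet_right))
    return [a + b for a, b in zip(wet_left, wet_right)]

# Toy impulse response: two "reflections" arriving 2 and 4 samples late.
toy_ir = [0.0, 0.0, 0.5, 0.0, 0.25]
front_left = [1.0, 0.0, 0.0]    # click at sample 0
front_right = [0.0, 1.0, 0.0]   # click at sample 1

feed = surround_feed(front_left, front_right, toy_ir, toy_ir)
print(feed)
```

Each additional surround speaker would get its own impulse-response pair, which is why the number of speakers a single PC can serve depends on its processing capacity.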

The development of ambiophonics is the work of several researchers and companies including Ralph Glasgal, founder of the Ambiophonic Institute; Dr. Angelo Farina, University of Parma; Robin Miller, Filmaker Technology; Waves Audio; Dr. Roger West, Soundlab; Dr. Radomir Bozovic, TacT Audio; and Prof. Edgar Choueiri, Princeton University.


References

  1. Bock, T.M. and Keele, D.B. Jr., “The Effects of Interaural Crosstalk on Stereo Reproduction and Minimizing Interaural Crosstalk in Nearfield Monitoring by the Use of a Physical Barrier,” AES Preprints 2420-A and 2420-B, November 1986
  2. Glasgal, Ralph, “The Domestic Concert Hall,” Stereophile (magazine), July 1988
  3. Robert E. (Robin) Miller III, "User Guide to VST plug-in Ambiophonic DSP," www.filmaker.com
  4. US patent 4218585, Carver, Robert W., "Dimensional sound producing apparatus and method", issued 1980-08-19
  5. "Archived copy". Archived from the original on 2010-12-01. Retrieved 2010-11-22.
  6. "Vintage Polk Audio Speakers: SDA 2".
  7. US 4,218,585
  8. "The History of Roland: Part 3".
  9. Robert E. (Robin) Miller III, “Spatial Definition and the PanAmbiophone Microphone Array for 2D Surround & 3D Fully Periphonic Recording,” AES Preprint, Oct. 2004
  10. Instructions and free downloads available at http://www.ambiophonics.org/
  11. Glasgal, Ralph, “360 Degree Localization via 4.x RACE Processing,” AES Preprint 7301, Oct. 2007
  12. Farina, Angelo et al., “Ambiophonic Principles for the Recording and Reproduction of Surround Sound for Music. Spatial Sound Techniques, Part 2,” AES Anthology 2006
  13. Glasgal, Ralph, “Ambiophonics, 2nd Edition”
  14. Glasgal, Ralph, “The Ambiophone, Derivation of a Recording Methodology Optimized for Ambiophonic Reproduction,” AES 19th Conference, Schloss Elmau, June 2001
  15. Robert E. (Robin) Miller III, “Compatible PanAmbiophonic 4.1 and PerAmbiophonic 6.1 Surround Sound for Advanced Television-Beyond ITU 5.1,” SMPTE 144th Technical Conference, October 2002