Wave field synthesis

Last updated November 15, 2024

Wave field synthesis (WFS) is a spatial audio rendering technique characterized by the creation of virtual acoustic environments. It produces artificial wavefronts, synthesized from elementary waves by a large number of individually driven loudspeakers. Such wavefronts seem to originate from a virtual starting point, the virtual sound source. In contrast to traditional phantom sound sources, the localization of virtual sound sources established by WFS does not depend on the listener's position. Like a genuine sound source, the virtual source remains at a fixed starting point.

Physical fundamentals

WFS is based on the Huygens–Fresnel principle, which states that any wavefront can be regarded as a superposition of spherical elementary waves. Therefore, any wavefront can be synthesized from such elementary waves. In practice, a computer controls a large array of individual loudspeakers and actuates each one at exactly the time and level at which the desired virtual wavefront would pass through its position. In this way, a genuine wavefront of a sound source may be restored from a mono signal source.
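The "time and level" per loudspeaker can be illustrated with a deliberately simplified delay-and-gain model. The following is a minimal sketch, not a complete WFS driving function: the function name `delays_and_gains`, the speed of sound of 343 m/s, and the 1/sqrt(distance) amplitude law are all assumptions made for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at roughly 20 °C


def delays_and_gains(source_xy, speaker_positions):
    """Return one (delay_seconds, gain) pair per loudspeaker.

    Simplified model: every loudspeaker replays the same mono signal,
    delayed by the travel time from the virtual source to the loudspeaker
    and attenuated by an assumed 1/sqrt(distance) amplitude law.
    """
    sx, sy = source_xy
    result = []
    for x, y in speaker_positions:
        distance = math.hypot(x - sx, y - sy)
        delay = distance / SPEED_OF_SOUND
        gain = 1.0 / math.sqrt(max(distance, 1e-6))  # guard against distance 0
        result.append((delay, gain))
    return result


# Eight loudspeakers on the x axis, 0.15 m apart (hypothetical layout),
# with a virtual source 2 m behind the array.
speakers = [(i * 0.15, 0.0) for i in range(8)]
for delay, gain in delays_and_gains((0.5, -2.0), speakers):
    print(f"delay = {delay * 1000:.3f} ms, gain = {gain:.3f}")
```

Loudspeakers closer to the virtual source receive a shorter delay and a higher gain, so the emitted elementary waves add up to a wavefront that appears to spread from the virtual source position.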

The basic procedure was developed in 1988 by Professor A.J. Berkhout at the Delft University of Technology.^[1] Its mathematical basis is the Kirchhoff–Helmholtz integral. It states that the sound pressure within a source-free volume is completely determined if the sound pressure and velocity are known at all points on its surface.

{\boldsymbol {P}}(w,z)=\iint _{dA}\left(G(w,z\vert z'){\frac {\partial }{\partial n}}P(w,z')-P(w,z'){\frac {\partial }{\partial n}}G(w,z\vert z')\right)dz'

Therefore, any sound field can be reconstructed if sound pressure and acoustic velocity are restored at all points on the surface of its volume. This approach is the underlying principle of holophony.

For reproduction, the entire surface of the volume would have to be covered with closely spaced loudspeakers, each individually driven with its own signal. Moreover, the listening area would have to be anechoic in order to avoid sound reflections that would violate the assumption of a source-free volume. In practice, this is hardly feasible. Because our acoustic perception is most exact in the horizontal plane, practical approaches generally reduce the array to a horizontal loudspeaker line, circle or rectangle around the listener. The origin of the synthesized wavefront can then lie at any point on the horizontal plane of the loudspeakers, but only there; real 3D audio is not possible with such loudspeaker rows. For sources behind the loudspeakers, the array produces convex wavefronts. Sources in front of the loudspeakers can be rendered by concave wavefronts that focus at the virtual source inside the playback area and diverge again as convex waves. Hence the reproduction inside the volume is incomplete: it breaks down if the listener is situated between the loudspeakers and the virtual source.
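For a source in front of the loudspeakers, one simple way to obtain a concave wavefront is to let the loudspeakers farthest from the focus point fire first, so that all elementary waves reach the focus at the same instant. The sketch below shows only this timing idea under assumed names and values (`focused_source_delays`, 343 m/s); it ignores amplitude weighting and filtering that a practical system would need.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value


def focused_source_delays(focus_xy, speaker_positions):
    """Return one emission delay (seconds) per loudspeaker.

    The farthest loudspeaker gets delay 0 and nearer ones wait longer, so
    every elementary wave arrives at the focus point simultaneously.
    """
    fx, fy = focus_xy
    distances = [math.hypot(x - fx, y - fy) for x, y in speaker_positions]
    farthest = max(distances)
    return [(farthest - d) / SPEED_OF_SOUND for d in distances]


# Three loudspeakers on the x axis, focus point 1 m in front of the middle one.
speakers = [(-1.0, 0.0), (0.0, 0.0), (1.0, 0.0)]
for delay in focused_source_delays((0.0, 1.0), speakers):
    print(f"emission delay = {delay * 1000:.3f} ms")
```

Because the waves only converge at the focus and diverge beyond it, a listener standing between the loudspeakers and the focus hears the converging part, which is the breakdown described above.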

Procedural advantages

If the restriction to the horizontal plane is overcome, it becomes possible to establish a virtual copy of a genuine sound field that is indistinguishable from the real sound field. Changes of the listener position in the rendition area produce the same impression as a corresponding change of location in the recording room. Two-dimensional arrays can establish parallel (plane) wavefronts, which are no louder directly at the loudspeakers than at a distance of some meters. Horizontal arrays can only produce cylindrical waves, which lose 3 dB of level with each doubling of distance. But even with that restriction, listeners in a wave field synthesis system are no longer relegated to a sweet spot within the room.
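The level behaviour of these wave types can be checked with a short calculation. The sketch below assumes ideal free-field spreading; the function name `level_drop_db` is hypothetical, and the spherical case (a single point source, 6 dB per doubling) is added only for comparison.

```python
import math


def level_drop_db(r_near, r_far, wave="cylindrical"):
    """Level difference in dB between two distances from the array.

    Idealised spreading laws: plane waves keep their level, cylindrical
    waves fall with 10*log10 of the distance ratio, and spherical waves
    fall with 20*log10 of the distance ratio.
    """
    ratio = r_far / r_near
    if wave == "plane":
        return 0.0
    if wave == "cylindrical":
        return 10.0 * math.log10(ratio)
    if wave == "spherical":
        return 20.0 * math.log10(ratio)
    raise ValueError(f"unknown wave type: {wave}")


# Level lost when moving from 2 m to 4 m, i.e. one doubling of distance.
for wave in ("plane", "cylindrical", "spherical"):
    print(f"{wave:12s} {level_drop_db(2.0, 4.0, wave):.2f} dB")
```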

The Moving Picture Experts Group standardized the object-oriented transmission standard MPEG-4, which allows separate transmission of content (the dry recorded audio signal) and form (the impulse response or the acoustic model). Each virtual acoustic source needs its own (mono) audio channel. The spatial sound field in the recording room consists of the direct wave of the acoustic source and a spatially distributed pattern of mirror acoustic sources caused by reflections from the room surfaces. Reducing that spatial mirror source distribution to a few transmission channels causes a significant loss of spatial information. This spatial distribution can be synthesized much more accurately on the rendition side.

Compared to conventional channel-oriented rendition procedures, WFS provides a clear advantage: virtual acoustic sources guided by the signal content of the associated channels can be positioned far beyond the physical rendition area. This reduces the influence of the listener position, because the relative changes in angles and levels are clearly smaller than with conventional loudspeakers located within the rendition area. This extends the sweet spot considerably; it can now cover nearly the entire rendition area. WFS is thus not only compatible with conventional channel-oriented methods but can potentially improve their reproduction.

Challenges

Sensitivity to room acoustics

Since WFS attempts to simulate the acoustic characteristics of the recording space, the acoustics of the rendition area must be suppressed. One possible solution is the use of acoustic damping, or otherwise arranging the walls in an absorbing, non-reflective configuration. A second possibility is playback within the near field. For this to work effectively, the loudspeakers must couple very closely to the hearing zone, or the diaphragm surface must be very large.

In some cases, the most perceptible difference compared to the original sound field is the reduction of the sound field to two dimensions along the horizontal of the loudspeaker lines. This is particularly noticeable for reproduction of ambiance. The suppression of acoustics in the rendition area does not complement playback of natural acoustic ambient sources.

Aliasing

Spatial aliasing produces undesirable distortions: position-dependent, narrow-band breakdowns in the frequency response within the rendition range. Their frequency depends on the angle of the virtual acoustic source and on the angle of the listener to the loudspeaker arrangement:

f_{\text{alias}}={\frac {c}{\Delta x\left|\sin \Theta ^{\text{sec}}-\sin \Theta ^{\text{v}}\right|}}

For aliasing-free rendition over the entire audio range, an emitter spacing below 2 cm would be necessary. Fortunately, the ear is not particularly sensitive to spatial aliasing, and an emitter spacing of 10–15 cm is generally sufficient.^[2]
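The aliasing formula can be evaluated directly for a given emitter spacing. In the sketch below, the function and parameter names are hypothetical, and the speed of sound of 343 m/s is an assumed value. Mapping Θ^sec to the listener angle and Θ^v to the virtual source angle follows the sentence introducing the formula and is likewise an assumption.

```python
import math


def aliasing_frequency(spacing_m, theta_listener_deg, theta_source_deg, c=343.0):
    """Evaluate f_alias = c / (dx * |sin(theta_sec) - sin(theta_v)|) in Hz.

    Returns math.inf when both angles have the same sine, because the
    formula then places no upper limit on the frequency.
    """
    sine_difference = abs(
        math.sin(math.radians(theta_listener_deg))
        - math.sin(math.radians(theta_source_deg))
    )
    if sine_difference == 0.0:
        return math.inf
    return c / (spacing_m * sine_difference)


# Worst case: the two angles are +90 and -90 degrees.
for spacing in (0.15, 0.10, 0.02):
    f_alias = aliasing_frequency(spacing, 90.0, -90.0)
    print(f"spacing {spacing * 100:4.0f} cm -> f_alias = {f_alias:7.0f} Hz")
```

For a 15 cm spacing and these worst-case angles the formula gives roughly 1.1 kHz. Smaller angle differences move the aliasing frequency upwards.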

Truncation effect

Another cause of disturbance of the spherical wavefront is the truncation effect. Because the resulting wavefront is a composite of elementary waves, a sudden change of pressure can occur where the speaker row ends and no further speakers deliver elementary waves. This causes a 'shadow-wave' effect. For virtual acoustic sources placed in front of the loudspeaker arrangement, this pressure change runs ahead of the actual wavefront, whereby it becomes clearly audible.

In signal processing terms, this is spectral leakage in the spatial domain and is caused by application of a rectangular function as a window function on what would otherwise be an infinite array of speakers. The shadow wave can be reduced if the volume of the outer loudspeakers is reduced; this corresponds to using a different window function that tapers off instead of being truncated.
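Such a tapering window can be sketched as a set of per-loudspeaker gains. The example below uses a raised-cosine (Tukey-style) taper as one possible choice; the function name `tapered_gains` and the default taper fraction of 20% per edge are assumptions, not values taken from a specific WFS system.

```python
import math


def tapered_gains(num_speakers, taper_fraction=0.2):
    """Return one gain per loudspeaker with raised-cosine fades at both ends.

    taper_fraction is the share of the array (per edge) that is faded;
    it must lie between 0 and 0.5 so the two fades do not overlap.
    """
    if not 0.0 <= taper_fraction <= 0.5:
        raise ValueError("taper_fraction must be between 0 and 0.5")
    n_taper = int(num_speakers * taper_fraction)
    gains = [1.0] * num_speakers  # rectangular window as the starting point
    for i in range(n_taper):
        # Ramp rises smoothly from the outermost loudspeaker inwards.
        weight = 0.5 * (1.0 - math.cos(math.pi * (i + 1) / (n_taper + 1)))
        gains[i] = weight
        gains[num_speakers - 1 - i] = weight
    return gains


print([round(g, 2) for g in tapered_gains(10)])
```

Setting `taper_fraction` to 0 reproduces the rectangular window of a truncated array, which makes it easy to compare both cases in a simulation.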

High cost

A further, resulting problem is high cost. A large number of individual transducers must be placed very close together. Reducing the number of transducers by increasing their spacing introduces spatial aliasing artifacts. Reducing the number of transducers at a given spacing reduces the size of the emitter field and limits the representation range; outside of its borders no virtual acoustic sources can be produced.

Research and market maturity

Early development of WFS began in 1988 at Delft University.^[citation needed] Further work was carried out from January 2001 to June 2003 in the context of the CARROUSO project by the European Union, which included ten institutes.^[citation needed] The WFS sound system IOSONO was developed by the Fraunhofer Institute for Digital Media Technology (IDMT) by the Technische Universität Ilmenau in 2004.

The first live WFS transmission took place in July 2008, recreating an organ recital at Cologne Cathedral in lecture hall 104 of Technische Universität Berlin.^[3] The room contains the world's largest speaker system with 2700 loudspeakers on 832 independent channels.

Research trends in wave field synthesis include the consideration of psychoacoustics to reduce the necessary number of loudspeakers, and to implement complicated sound radiation properties so that a virtual grand piano sounds as grand as in real life.^[4]^[5]^[6]

A practical breakthrough of WFS technology only came with the X1 modules from the Berlin-based technology company Holoplot. The startup eschewed the usual restriction to a horizontal plane and installed 96 individually controlled speaker drivers in a modular system. Optimized according to WFS principles, the beams are able to deliver sound very evenly to large, arbitrarily shaped audience areas, even simultaneously with beams of different content. Because reflective surfaces are not hit unintentionally, there is hardly any reverberation even in highly reflective environments. The company's largest project to date is the Sphere in the Las Vegas Valley. The venue's sound system is made of 1,586 permanently installed X1 Matrix Arrays comprising 167,000 speaker drivers, and it combines elementary waves into common wavefronts.

References

[1] Brandenburg, Karlheinz; Brix, Sandra; Sporer, Thomas (2009). 2009 3DTV Conference: The True Vision - Capture, Transmission and Display of 3D Video. pp. 1–4. doi:10.1109/3DTV.2009.5069680. ISBN 978-1-4244-4317-8. S2CID 22600136.
[2] "Audio Engineering Society Convention Paper, Spatial Aliasing Artifacts Produced by Linear and Circular Loudspeaker Arrays used for Wave Field Synthesis" (PDF). Retrieved 2012-02-03.
[3] "Birds on the wire – Olivier Messiaen's Livre du Saint Sacrément in the world's first wave field synthesis live transmission (technical project report)" (PDF). 2008. Retrieved 2013-03-27.
[4] Ziemer, Tim (2018). "Wave Field Synthesis". In Bader, Rolf (ed.). Springer Handbook of Systematic Musicology. Springer Handbooks. Berlin / Heidelberg: Springer. pp. 329–347. doi:10.1007/978-3-662-55004-5_18. ISBN 978-3-662-55004-5.
[5] Ziemer, Tim (2017). "Source Width in Music Production. Methods in Stereo, Ambisonics, and Wave Field Synthesis". In Schneider, Albrecht (ed.). Studies in Musical Acoustics and Psychoacoustics. Current Research in Systematic Musicology. Vol. 4. Cham: Springer. pp. 299–340. doi:10.1007/978-3-319-47292-8_10. ISBN 978-3-319-47292-8.
[6] Ziemer, Tim (2020). Psychoacoustic Music Sound Field Synthesis. Current Research in Systematic Musicology. Vol. 7. Cham: Springer International Publishing. doi:10.1007/978-3-030-23033-3. ISBN 978-3-030-23033-3.

External links

Photo of wave field synthesis installation
Perceptual Differences Between Wavefield Synthesis and Stereophony by Helmut Wittek
Inclusion of the playback room properties into the synthesis for WFS - Holophony
Wave Field Synthesis – A Promising Spatial Audio Rendering Concept by Günther Theile/(IRT)
Wave Field Synthesis at IRCAM
Wave Field Synthesis at the University of Erlangen-Nuremberg
Wavefield Generator built by HOLOPLOT Germany
The theory of wave field synthesis revisited. S. Spors, R. Rabenstein, and J. Ahrens. In 124th AES Convention, May 2008.
Sound Reproduction by Wave Field Synthesis (Thesis, 1997) by Edwin Verheijen
Sphere Unveils State-of-the-Art Immersive Audio, Pollstar News 07/24/2023
Wave field synthesis Animation (60 sec.)

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
