Wave field synthesis

WFS principle, shown as an animation in the external links.

Wave field synthesis (WFS) is a spatial audio rendering technique for creating virtual acoustic environments. It synthesizes artificial wavefronts from elementary waves produced by a large number of individually driven loudspeakers. These wavefronts appear to originate from a virtual starting point, the virtual sound source. In contrast to traditional phantom sound sources, the localization of virtual sound sources established by WFS does not depend on the listener's position: like a genuine sound source, the virtual source remains at a fixed starting point.


Physical fundamentals

WFS is based on the Huygens–Fresnel principle, which states that any wavefront can be regarded as a superposition of spherical elementary waves; therefore, any wavefront can be synthesized from such elementary waves. In practice, a computer controls a large array of individual loudspeakers and drives each one with exactly the delay and level at which the desired virtual wavefront would pass through its position. In this way, the genuine wavefront of a sound source can be reconstructed from a mono source signal.
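The delay-and-level actuation described above can be sketched for a straight array. This is a minimal illustration only, assuming a speed of sound of 343 m/s and a simple 1/r amplitude law rather than a full WFS driving function:

```python
import math

C = 343.0  # assumed speed of sound in air, m/s

def driving_delays_and_gains(speaker_positions, source_position):
    """For a virtual point source behind a linear array, feed each
    loudspeaker the mono source signal delayed by the travel time from
    the virtual source to that speaker, attenuated with distance.
    The 1/r gain law and the geometry are illustrative; real WFS
    driving functions also include a spectral pre-filter and
    secondary-source selection."""
    sx0, sy0 = source_position
    result = []
    for sx, sy in speaker_positions:
        r = math.hypot(sx - sx0, sy - sy0)   # source-to-speaker distance
        delay = r / C                        # arrival time of the virtual wavefront
        gain = 1.0 / max(r, 1e-6)            # simple distance attenuation
        result.append((delay, gain))
    return result

# Eight speakers 10 cm apart along the x-axis; virtual source 2 m behind them.
speakers = [(0.1 * i, 0.0) for i in range(8)]
source = (0.35, -2.0)
for delay, gain in driving_delays_and_gains(speakers, source):
    print(f"delay {delay * 1000:.2f} ms, gain {gain:.3f}")
```

Speakers nearest the virtual source fire first and loudest, so the superposed elementary waves approximate the spherical wavefront the real source would have produced.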

The basic procedure was developed in 1988 by Professor A. J. Berkhout at the Delft University of Technology.[1] Its mathematical basis is the Kirchhoff–Helmholtz integral, which states that the sound pressure within a source-free volume is completely determined if sound pressure and particle velocity are known at all points on its surface.

Therefore, any sound field can be reconstructed if sound pressure and acoustic particle velocity are restored at all points on the surface of its volume. This approach is the underlying principle of holophony.
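In a common frequency-domain formulation (sign and normal-direction conventions vary with the assumed time dependence), the Kirchhoff–Helmholtz integral can be written as:

```latex
P(\mathbf{x},\omega) = \oint_{\partial V} \left[
  G(\mathbf{x}|\mathbf{x}_0,\omega)\,
  \frac{\partial P(\mathbf{x}_0,\omega)}{\partial n}
  - P(\mathbf{x}_0,\omega)\,
  \frac{\partial G(\mathbf{x}|\mathbf{x}_0,\omega)}{\partial n}
\right] \mathrm{d}S(\mathbf{x}_0),
\qquad
G(\mathbf{x}|\mathbf{x}_0,\omega)
  = \frac{e^{-j\frac{\omega}{c}\,|\mathbf{x}-\mathbf{x}_0|}}{4\pi\,|\mathbf{x}-\mathbf{x}_0|}
```

Here P is the sound pressure inside the source-free volume V and G is the free-field Green's function of a monopole. The ∂P/∂n term corresponds to the particle-velocity (monopole) contribution and the ∂G/∂n term to a dipole contribution, which is why an exact reconstruction would formally require both monopole and dipole secondary sources on the surface.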

For reproduction, the entire surface of the volume would have to be covered with closely spaced loudspeakers, each driven individually with its own signal. Moreover, the listening area would have to be anechoic in order to avoid sound reflections that would violate the source-free volume assumption. In practice, this is hardly feasible. Because our acoustic perception is most exact in the horizontal plane, practical approaches generally reduce the array to a horizontal line, circle, or rectangle of loudspeakers around the listener. The origin of the synthesized wavefront is then restricted to points on the horizontal plane of the loudspeakers, and real 3D audio is not possible with such loudspeaker rows. For sources behind the loudspeakers, the array produces convex wavefronts. Sources in front of the speakers can be rendered by concave wavefronts that focus at the virtual source inside the playback area and diverge again as a convex wave. The reproduction inside the volume is therefore incomplete: it breaks down if the listener sits between the speakers and the virtual source.
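A focused source in front of the array can be sketched in the same spirit, with the delays reversed so that the elementary waves converge at the focus before diverging again. This is a minimal illustration with an assumed speed of sound and illustrative geometry:

```python
import math

C = 343.0  # assumed speed of sound in air, m/s

def focused_source_delays(speaker_xs, focus, array_y=0.0):
    """Focused virtual source IN FRONT of a linear array: each speaker's
    signal is advanced by its travel time to the focus point, so all
    elementary waves arrive there simultaneously and the wavefront is
    concave up to the focus, diverging (convex) beyond it.  To keep the
    system causal, everything is delayed by the largest travel time,
    so the speaker farthest from the focus fires first (delay 0)."""
    travel = [math.hypot(x - focus[0], array_y - focus[1]) / C
              for x in speaker_xs]
    t_max = max(travel)
    return [t_max - t for t in travel]

xs = [0.1 * i for i in range(8)]                  # 10 cm spacing
delays = focused_source_delays(xs, (0.35, 1.0))   # focus 1 m in front
print(delays)
```

The speakers closest to the focus get the largest delay, which is exactly the time reversal of the point-source case: the wave first collapses onto the focus, then spreads from it as if a real source sat there.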

Procedural advantages

If the restriction to the horizontal plane could be overcome, it would become possible to establish a virtual copy of a genuine sound field that is indistinguishable from the real one. Changes of the listener's position in the rendition area would then produce the same impression as a corresponding change of location in the recording room. Two-dimensional arrays can establish plane wavefronts, which are no louder directly at the loudspeakers than several meters away; horizontal arrays can only produce cylindrical waves, which lose 3 dB of level with every doubling of distance. But even with that restriction, listeners in wave field synthesis are no longer confined to a sweet spot within the room.

The Moving Picture Experts Group standardized the object-oriented transmission standard MPEG-4, which allows separate transmission of content (the dry recorded audio signal) and form (the impulse response or acoustic model). Each virtual acoustic source needs its own (mono) audio channel. The spatial sound field in the recording room consists of the direct wave of the acoustic source and a spatially distributed pattern of mirror sources caused by reflections from the room surfaces. Reducing that spatial mirror-source distribution to a few transmission channels causes a significant loss of spatial information; this distribution can be synthesized much more accurately on the rendition side.
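The separation of content and form described above can be illustrated with a toy convolution: the dry signal is transmitted once, and the rendition side applies a room model, here reduced to an impulse response containing a direct path and two mirror-source reflections. All names and numbers are illustrative, not taken from any standard:

```python
def convolve(dry, impulse_response):
    """Plain discrete convolution: out[n] = sum_k dry[k] * ir[n - k].
    The rendition side applies the room model (form) to the
    transmitted dry signal (content)."""
    out = [0.0] * (len(dry) + len(impulse_response) - 1)
    for n, x in enumerate(dry):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out

dry = [1.0, 0.5, 0.25]                # dry recorded audio (toy samples)
ir = [1.0, 0.0, 0.0, 0.6, 0.0, 0.3]   # direct path plus two delayed reflections
wet = convolve(dry, ir)
print(wet)
```

Because the reverberant "wet" signal is reconstructed at playback time, each virtual source can be repositioned or re-rendered for a different room model without retransmitting the audio.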

Compared to conventional channel-oriented rendition procedures, WFS provides a clear advantage: virtual acoustic sources guided by the signal content of the associated channels can be positioned far beyond the physical boundaries of the rendition area. This reduces the influence of the listener's position, because the relative changes in angles and levels are much smaller than with conventional loudspeakers located within the rendition area. The sweet spot is thereby extended considerably and can cover nearly the entire rendition area. WFS is thus not only compatible with conventional channel-oriented methods, but can potentially improve their reproduction.

Challenges

Sensitivity to room acoustics

Since WFS attempts to simulate the acoustic characteristics of the recording space, the acoustics of the rendition area must be suppressed. One possible solution is acoustic damping, arranging the walls in an absorbing, non-reflective configuration. A second possibility is playback within the near field; for this to work effectively, the loudspeakers must couple very closely to the hearing zone, or the diaphragm surface must be very large.

In some cases, the most perceptible difference from the original sound field is the reduction of the sound field to two dimensions along the horizontal plane of the loudspeaker lines. This is particularly noticeable in the reproduction of ambience, and the suppression of the rendition area's own acoustics does nothing to compensate for it.

Aliasing

Spatial aliasing causes undesirable distortions: position-dependent narrow-band breakdowns in the frequency response within the rendition range. Their frequency depends on the angle of the virtual acoustic source and on the angle of the listener relative to the loudspeaker arrangement.

For aliasing-free rendition across the entire audio range, an emitter spacing below 2 cm would be necessary. Fortunately, the ear is not particularly sensitive to spatial aliasing, and an emitter spacing of 10–15 cm is generally sufficient.[2]
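As a rough sanity check of these figures, a simplified plane-wave estimate of the aliasing frequency, f_al ≈ c / (Δx · (1 + |sin θ|)), can be evaluated. This form is only one of several published approximations, and the exact value depends on the source and listener geometry described above:

```python
import math

C = 343.0  # assumed speed of sound in air, m/s

def aliasing_frequency(spacing_m, max_angle_deg=90.0):
    """Rough spatial-Nyquist estimate of the frequency above which a
    linear array of discrete emitters produces aliasing artifacts,
    using the simplified form f_al ~ c / (dx * (1 + |sin(theta)|)).
    Treat the result as an order-of-magnitude guide only."""
    theta = math.radians(max_angle_deg)
    return C / (spacing_m * (1.0 + abs(math.sin(theta))))

# ~2 cm spacing keeps aliasing above the audio band for frontal sources:
print(round(aliasing_frequency(0.017, 0.0)))
# A practical 12 cm spacing aliases within the audio band (worst-case angle):
print(round(aliasing_frequency(0.12, 90.0)))
```

With these assumptions, a 1.7 cm spacing stays above roughly 20 kHz for frontal incidence, while a 12 cm spacing aliases somewhere above about 1.4 kHz in the worst case, which is consistent with relying on the ear's limited sensitivity to spatial aliasing.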

Truncation effect

Another cause of disturbance of the spherical wavefront is the truncation effect. Because the resulting wavefront is a composite of elementary waves, a sudden change of pressure can occur where the speaker row ends and no further speakers deliver elementary waves. This causes a 'shadow-wave' effect. For virtual acoustic sources placed in front of the loudspeaker arrangement, this pressure change runs ahead of the actual wavefront and becomes clearly audible.

In signal processing terms, this is spectral leakage in the spatial domain and is caused by application of a rectangular function as a window function on what would otherwise be an infinite array of speakers. The shadow wave can be reduced if the volume of the outer loudspeakers is reduced; this corresponds to using a different window function that tapers off instead of being truncated.
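The tapering described above can be sketched as a window applied to the speaker gains. The raised-cosine ramp and the fraction of tapered speakers below are illustrative choices, not a prescribed WFS design:

```python
import math

def tapered_gains(n_speakers, taper_fraction=0.25):
    """Reduce the drive level of the outermost speakers with a
    raised-cosine (Hann-style) ramp, instead of the implicit
    rectangular window of a hard-truncated array.  The fraction of
    speakers tapered on each side is an illustrative parameter."""
    n_taper = max(1, int(n_speakers * taper_fraction))
    gains = [1.0] * n_speakers
    for i in range(n_taper):
        # Ramp from near 0 at the array edge up toward full level.
        w = 0.5 * (1.0 - math.cos(math.pi * (i + 1) / (n_taper + 1)))
        gains[i] = w
        gains[n_speakers - 1 - i] = w
    return gains

gains = tapered_gains(8)  # roughly [0.25, 0.75, 1, 1, 1, 1, 0.75, 0.25]
print(gains)
```

Softening the array ends in this way trades a slightly smaller effective aperture for a weaker shadow wave, the spatial analogue of the usual leakage-versus-resolution trade-off in window design.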

High cost

A further and resultant problem is high cost. A large number of individual transducers must be very close together. Reducing the number of transducers by increasing their spacing introduces spatial aliasing artifacts. Reducing the number of transducers at a given spacing reduces the size of the emitter field and limits the representation range; outside of its borders no virtual acoustic sources can be produced.

Research and market maturity

Two-dimensional placement of wave field synthesis speaker arrays (Eindhoven, 2006).

Early development of WFS began in 1988 at Delft University of Technology.[citation needed] Further work was carried out from January 2001 to June 2003 in the context of the CARROUSO project, funded by the European Union, which included ten institutes.[citation needed] The WFS sound system IOSONO was developed by the Fraunhofer Institute for Digital Media Technology (IDMT) at the Technical University of Ilmenau in 2004.

The first live WFS transmission took place in July 2008, recreating an organ recital at Cologne Cathedral in lecture hall 104 of the Technical University of Berlin.[3] The room contains the world's largest speaker system, with 2,700 loudspeakers on 832 independent channels.

Research trends in wave field synthesis include the use of psychoacoustics to reduce the necessary number of loudspeakers, and the implementation of complicated sound radiation properties, so that a virtual grand piano sounds as grand as in real life.[4][5][6]

A practical breakthrough of WFS technology came with the X1 modules from the Berlin-based technology company HOLOPLOT. The startup abandoned the usual restriction to a horizontal plane and installed 96 individually controlled loudspeakers as a sound-transducing surface in a modular system. Optimized according to WFS principles, the resulting beams can deliver sound very evenly to large, arbitrarily shaped audience areas, even simultaneously with beams carrying different content. Because reflective surfaces are not hit unintentionally, there is hardly any reverberation even in highly reflective environments. The company's largest project to date is the MSG Sphere in Las Vegas, where 167,000 individually controlled loudspeakers will combine their elementary waves into common wavefronts at the U2 opening concert at the end of September 2023.


References

  1. Brandenburg, Karlheinz; Brix, Sandra; Sporer, Thomas (2009). 2009 3DTV Conference: The True Vision – Capture, Transmission and Display of 3D Video. pp. 1–4. doi:10.1109/3DTV.2009.5069680. ISBN 978-1-4244-4317-8. S2CID 22600136.
  2. "Audio Engineering Society Convention Paper, Spatial Aliasing Artifacts Produced by Linear and Circular Loudspeaker Arrays used for Wave Field Synthesis" (PDF). Retrieved 2012-02-03.
  3. "Birds on the wire – Olivier Messiaen's Livre du Saint Sacrément in the world's first wave field synthesis live transmission (technical project report)" (PDF). 2008. Retrieved 2013-03-27.
  4. Ziemer, Tim (2018). "Wave Field Synthesis". In Bader, Rolf (ed.). Springer Handbook of Systematic Musicology. Springer Handbooks. Berlin / Heidelberg: Springer. pp. 329–347. doi:10.1007/978-3-662-55004-5_18. ISBN 978-3-662-55004-5.
  5. Ziemer, Tim (2017). "Source Width in Music Production. Methods in Stereo, Ambisonics, and Wave Field Synthesis". In Schneider, Albrecht (ed.). Studies in Musical Acoustics and Psychoacoustics. Current Research in Systematic Musicology. Vol. 4. Cham: Springer. pp. 299–340. doi:10.1007/978-3-319-47292-8_10. ISBN 978-3-319-47292-8.
  6. Ziemer, Tim (2020). Psychoacoustic Music Sound Field Synthesis. Current Research in Systematic Musicology. Vol. 7. Cham: Springer International Publishing. doi:10.1007/978-3-030-23033-3. ISBN 978-3-030-23033-3.

Further reading