Robotic voice effects

Robotic voice effects became a recurring element in popular music starting in the second half of the twentieth century. Several methods of producing variations on this effect have arisen.

Vocoder

The vocoder was originally designed to reduce the bandwidth needed to transmit voices over telephony systems. In musical applications, a modulator signal (usually a vocal) and a carrier signal (typically a synthesizer or another instrument) are fed into the device. The modulator is split by a bank of band-pass filters, and the level in each band is tracked by an envelope follower; these envelopes then control the level of the corresponding bands of the carrier. Some designs substitute a noise source for the carrier during unvoiced sounds such as sibilants. Mixing the processed carrier bands back together, often with some of the original signal, produces the characteristic "talking instrument" effect.
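
A minimal channel-vocoder sketch along these lines is shown below, assuming NumPy and SciPy; the function name, band count, and filter orders are illustrative choices rather than any particular commercial design.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def channel_vocoder(modulator, carrier, sr, n_bands=10, f_lo=80.0, f_hi=8000.0):
    """Classic band-pass vocoder sketch: the envelope of each modulator band
    (e.g. a voice) controls the level of the matching carrier band
    (e.g. a sustained synthesizer chord)."""
    edges = np.geomspace(f_lo, min(f_hi, 0.45 * sr), n_bands + 1)
    n = min(len(modulator), len(carrier))
    out = np.zeros(n)
    # Envelope follower: rectify each band, then low-pass at ~50 Hz.
    env_sos = butter(2, 50.0, btype="low", fs=sr, output="sos")
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_sos = butter(4, [lo, hi], btype="bandpass", fs=sr, output="sos")
        mod_band = sosfilt(band_sos, modulator[:n])
        car_band = sosfilt(band_sos, carrier[:n])
        envelope = sosfilt(env_sos, np.abs(mod_band))
        out += car_band * envelope
    return out / (np.max(np.abs(out)) + 1e-12)   # normalize to avoid clipping
```

Real units additionally provide a noise ("hiss") channel for unvoiced consonants and a dry-signal mix, as described above.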

Vocoders were used in analog form from as early as 1959 at the Siemens Studio for Electronic Music [1][2] but were made more famous after Robert Moog developed one of the first solid-state musical vocoders. [3]

In 1970, Wendy Carlos and Robert Moog built another musical vocoder, a 10-band device inspired by the vocoder designs of Homer Dudley; the device came to be referred to simply as a vocoder.

Carlos and Moog's vocoder was featured in several recordings, including the soundtrack to Stanley Kubrick's A Clockwork Orange, where it was used for the vocal part of Beethoven's Ninth Symphony and a piece called "Timesteps". [4] In 1974, Isao Tomita used a Moog vocoder on the classical music album Snowflakes Are Dancing, which became a worldwide success. [5] Vocoders have since been used widely, for example on Kraftwerk's album Autobahn (1974), on The Alan Parsons Project's track "The Raven" (from Tales of Mystery and Imagination, 1976), and by Electric Light Orchestra, using EMS Vocoder 2000s, on "Mr. Blue Sky" and "Sweet Talkin' Woman" (from Out of the Blue, 1977).

Other examples include Pink Floyd's album Animals, on which the band put the sound of a barking dog through the device, and the Styx song "Mr. Roboto". Vocoders have appeared on pop recordings from time to time ever since, most often simply as a special effect rather than a featured aspect of the work. Some experimental electronic artists in the new-age genre have used the vocoder more comprehensively in specific works, such as Jean Michel Jarre on Zoolook (1984) and Mike Oldfield on QE2 (1980) and Five Miles Out (1982). Other artists have made vocoders an essential part of their music, overall or during an extended phase, such as the German synthpop group Kraftwerk and the jazz-infused metal band Cynic.

Other examples

Though the vocoder is by far the best-known, the following other pieces of music technology are often confused with it:

Sonovox
This was an early version of the talk box, invented by Gilbert Wright in 1939. It worked by placing two loudspeakers over the performer's larynx; as the speakers transmitted sound up through the throat, the performer silently articulated words, making the sounds appear to "speak". It was used to create the voice of the piano in the Sparky's Magic Piano series from 1947, many of the musical instruments in Rusty in Orchestraville, and the voice of Casey the train in the films Dumbo and The Reluctant Dragon. Radio jingle companies PAMS and JAM Creative Productions used the Sonovox in many of the station IDs they produced.
Talk box
The talk box guitar effect was invented by Doug Forbes and popularized by Peter Frampton. In the talk box effect, amplified sound is fed via a tube into the performer's mouth and is then shaped by the performer's lip, tongue, and mouth movements before being picked up by a microphone; in contrast, the vocoder effect is produced entirely electronically. The background riff from "Sensual Seduction" by Snoop Dogg is a well-known example. "California Love" by 2Pac and Roger Troutman is another well-known recording, with the talk box fed by a synthesizer instead of a guitar. Steven Drozd of The Flaming Lips used the talk box on parts of the group's eleventh album, At War with the Mystics, to imitate some of Wayne Coyne's repeated lyrics in the "Yeah Yeah Yeah Song".
Pitch correction
The vocoder should also not be confused with the Antares Auto-Tune Pitch Correcting Plug-In, which can be used to achieve a robotic-sounding vocal effect by quantizing (removing smooth changes in) voice pitch or by adding pitch changes. The first such use in a commercial song was in 1998 on "Believe" by Cher, and the radical pitch changes became known as the "Cher effect". [6] This has been employed in recent years by artists such as Daft Punk (who also use vocoders and talk boxes), T-Pain, Kanye West, the Italian dance/pop group Eiffel 65, Japanese electropop acts Aira Mitsuki, Saori@destiny, Capsule, Meg and Perfume, and some Korean pop groups, most notably 2NE1 and Big Bang.
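
As an illustration of the quantizing step only (not of Auto-Tune's actual algorithm, which must also detect the pitch and resynthesize the audio), the sketch below snaps an already-detected pitch track to the nearest equal-tempered semitone; NumPy is assumed and the function name is hypothetical.

```python
import numpy as np

def quantize_pitch_track(f0_hz):
    """Snap detected pitches (in Hz) to the nearest equal-tempered semitone.

    Hard quantization with no smoothing is what produces the stepped,
    robotic-sounding 'Cher effect' in a full pitch corrector."""
    f0_hz = np.asarray(f0_hz, dtype=float)
    midi = 69.0 + 12.0 * np.log2(f0_hz / 440.0)              # frequency -> MIDI note number
    return 440.0 * 2.0 ** ((np.round(midi) - 69.0) / 12.0)   # nearest note -> frequency
```

For example, a sung pitch of 452 Hz is pulled to 440 Hz (A4), and any glide between notes becomes an abrupt step.
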
Linear prediction coding
Linear prediction coding is also used as a musical effect (generally for cross-synthesis of musical timbres), but is not as popular as bandpass filter bank vocoders, and the musical use of the word vocoder refers exclusively to the latter type of device.
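
A rough sketch of LPC-based cross-synthesis is shown below, assuming NumPy, SciPy, and librosa; the function name, frame length, and gain handling are illustrative choices rather than a reference implementation. An all-pole filter is fitted to each frame of the modulator (the voice), and the carrier is filtered through it so that it takes on the modulator's spectral envelope.

```python
import numpy as np
import librosa
import scipy.signal as sig

def lpc_cross_synthesis(modulator, carrier, order=16, frame=1024, hop=256):
    """Impose the spectral envelope of `modulator` (e.g. speech) onto `carrier`."""
    n = min(len(modulator), len(carrier))
    out = np.zeros(n)
    window = np.hanning(frame)
    for start in range(0, n - frame, hop):
        m = modulator[start:start + frame] * window
        c = carrier[start:start + frame] * window
        if np.max(np.abs(m)) < 1e-6:         # skip near-silent frames
            continue
        a = librosa.lpc(m, order=order)      # all-pole model A(z) of the voice frame
        shaped = sig.lfilter([1.0], a, c)    # filter the carrier through 1/A(z)
        gain = np.sqrt(np.sum(m ** 2) / (np.sum(shaped ** 2) + 1e-12))
        out[start:start + frame] += gain * shaped   # overlap-add
    return out / (np.max(np.abs(out)) + 1e-12)
```
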
Ring modulator
Although ring modulation usually does not work well with melodic sounds, it can be used to make speech sound robotic. For example, it was used to create the robotic voices of the Daleks in Doctor Who.
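
Ring modulation itself is just multiplication of the voice by a carrier oscillator, as in the minimal sketch below (NumPy assumed; the function name is hypothetical, and the 30 Hz default reflects the low carrier frequency commonly cited for the Dalek treatment).

```python
import numpy as np

def ring_modulate(voice, sr, carrier_hz=30.0):
    """Multiply the voice by a sine carrier, producing sum and difference
    frequencies; low carrier frequencies give a metallic, robotic warble."""
    t = np.arange(len(voice)) / sr
    return voice * np.sin(2.0 * np.pi * carrier_hz * t)
```
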
Speech synthesis
Robotic voices in music may also be produced by speech synthesis. This does not usually create a "singing" effect (although it can). Speech synthesis means that, unlike in vocoding, no human speech is employed as the basis. One example of such use is the song "Das Boot" by U96. A more tongue-in-cheek musical use of speech synthesis is MC Hawking. Most notably, Kraftwerk, who had previously used the vocoder extensively in their 1970s recordings, began opting for speech synthesis software in place of vocoders starting with 1981's Computer World album; on newer recordings, and in the reworked versions of older songs that appear on The Mix and in the band's current live show, the previously vocoder-processed vocals have been almost completely replaced by software-synthesized "singing".
Comb filter
A comb filter can be used to emphasize a series of evenly spaced frequencies in the audio signal, producing a sharp, resonant transformation of the voice. Comb filtering can be performed with a delay unit set to a high feedback level and a delay time of less than a tenth of a second. Of the robot voice effects listed here, this one requires the fewest resources, since delay units are a staple of recording studios and sound editing software. Because the effect deprives a voice of much of its musical quality (and offers few options for sound customization), the robotic delay is mostly used in TV and film applications.
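
The delay-and-feedback recipe just described can be sketched in a few lines (NumPy assumed; the function name and parameter values are illustrative).

```python
import numpy as np

def robot_comb(voice, sr, delay_s=0.02, feedback=0.85):
    """Feedback comb filter: y[n] = x[n] + feedback * y[n - D]."""
    d = int(delay_s * sr)
    y = np.asarray(voice, dtype=float).copy()
    for n in range(d, len(y)):
        y[n] += feedback * y[n - d]
    return y / (np.max(np.abs(y)) + 1e-12)   # normalize to avoid clipping
```

A 20 ms delay places resonant peaks every 50 Hz, which is what gives the voice its flat, metallic ring.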

Related Research Articles

Electronic musical instrument

An electronic musical instrument or electrophone is a musical instrument that produces sound using electronic circuitry. Such an instrument produces sound by outputting an electrical, electronic or digital audio signal that is ultimately fed to a power amplifier driving a loudspeaker, creating the sound heard by the performer and listener.

Kraftwerk

Kraftwerk is a German band formed in Düsseldorf in 1970 by Ralf Hütter and Florian Schneider. Widely considered innovators and pioneers of electronic music, Kraftwerk were among the first successful acts to popularize the genre. The group began as part of West Germany's experimental krautrock scene in the early 1970s before fully embracing electronic instrumentation, including synthesizers, drum machines, and vocoders. Wolfgang Flür joined the band in 1974 and Karl Bartos in 1975, expanding the band to a quartet.

Vocoder

A vocoder is a category of speech coding that analyzes and synthesizes the human voice signal for audio data compression, multiplexing, voice encryption or voice transformation.

Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and can be implemented in software or hardware products. A text-to-speech (TTS) system converts normal language text into speech; other systems render symbolic linguistic representations like phonetic transcriptions into speech. The reverse process is speech recognition.

Sound effect

A sound effect is an artificially created or enhanced sound, or a sound process, used to emphasize artistic or other content of films, television shows, live performance, animation, video games, music, or other media. Traditionally, in the twentieth century, they were created using foley techniques. In motion picture and television production, a sound effect is a sound recorded and presented to make a specific storytelling or creative point without the use of dialogue or music. The term often refers to a process applied to a recording, without necessarily referring to the recording itself. In professional motion picture and television production, dialogue, music, and sound effects recordings are treated as separate elements. Dialogue and music recordings are never referred to as sound effects, even though the processes applied to them, such as reverberation or flanging, are often called "sound effects".

Time stretching is the process of changing the speed or duration of an audio signal without affecting its pitch. Pitch scaling is the opposite: the process of changing the pitch without affecting the speed. Pitch shift is pitch scaling implemented in an effects unit and intended for live performance. Pitch control is a simpler process which affects pitch and speed simultaneously by slowing down or speeding up a recording.

Music technology (electronic and digital)

Digital music technology encompasses the use of digital instruments, computers, electronic effects units, software, and digital audio equipment by a performer, composer, sound engineer, DJ, or record producer to produce, perform or record music. The term refers to electronic devices, instruments, computer hardware, and software used in the performance, playback, recording, composition, mixing, analysis, and editing of music.

Analog synthesizer

An analog synthesizer is a synthesizer that uses analog circuits and analog signals to generate sound electronically.

Computer World

Computer World is the eighth studio album by German electronic band Kraftwerk, released on 10 May 1981.

The Mix

The Mix is the tenth studio album by the German electronic music band Kraftwerk. It was released on 11 June 1991 by Kling Klang and EMI in Europe and by Elektra Records in North America. It consists of entirely re-arranged and re-recorded versions of a selection of songs that had originally appeared on Kraftwerk's albums from Autobahn (1974) to Electric Café (1986). Some of the songs, such as "The Robots" and "Radioactivity", have new additional melodies and/or lyrics.

Electronic Music Studios (EMS) is a synthesizer company formed in Putney, London in 1969 by Peter Zinovieff, Tristram Cary and David Cockerell. It is now based in Ladock, Cornwall.

Talk box

A talk box is an effects unit that allows musicians to modify the sound of a musical instrument by shaping the frequency content of the sound and to apply speech sounds onto the sounds of the instrument. Typically, a talk box directs sound from the instrument into the musician's mouth by means of a plastic tube adjacent to a vocal microphone. The musician controls the modification of the instrument's sound by changing the shape of the mouth, "vocalizing" the instrument's output into a microphone.

Auto-Tune

Auto-Tune is an audio processor introduced in 1996 by the American company Antares Audio Technologies. It uses a proprietary device to measure and alter pitch in vocal and instrumental music recording and performances.

Pitch correction

Pitch correction is an electronic effects unit or audio software that changes the intonation of an audio signal so that all pitches become notes from the equally tempered system. Pitch correction devices do this without affecting other aspects of the sound. Pitch correction first detects the pitch of an audio signal, then calculates the desired change and modifies the audio signal accordingly. The widest use of pitch correction devices is on vocal lines in Western popular music.

Synthesizer

A synthesizer is an electronic musical instrument that generates audio signals. Synthesizers typically create sounds by generating waveforms through methods including subtractive synthesis, additive synthesis and frequency modulation synthesis. These sounds may be altered by components such as filters, which cut or boost frequencies; envelopes, which control articulation, or how notes begin and end; and low-frequency oscillators, which modulate parameters such as pitch, volume, or filter characteristics affecting timbre. Synthesizers are typically played with keyboards or controlled by sequencers, software or other instruments, and may be synchronized to other equipment via MIDI.

Homer W. Dudley was a pioneering electronic and acoustic engineer who created the first electronic voice synthesizer for Bell Labs in the 1930s and led the development of a method of sending secure voice transmissions during World War Two. His awards include the Franklin Institute's Stuart Ballantine Medal (1965).

Korg VC-10

The Korg VC-10 is an analogue vocoder which Korg launched in 1978. Vocoding here refers to the encoding of speech and singing for use with musical synthesis; the technique gained popularity in the 1970s following its use by bands such as Kraftwerk and Electric Light Orchestra. The VC-10 offers basic control over the operation and modulation of its carrier signals and has two microphone inputs.

Voder

The Bell Telephone Laboratory's Voder was the first attempt to electronically synthesize human speech by breaking it down into its acoustic components. It was invented by Homer Dudley in 1937–1938 and developed from his earlier work on the vocoder. The quality of the speech was limited; however, it demonstrated the synthesis of the human voice, which became one component of the vocoder used in voice communications for security and to save bandwidth.

WaveNet

WaveNet is a deep neural network for generating raw audio. It was created by researchers at London-based AI firm DeepMind. The technique, outlined in a paper in September 2016, is able to generate relatively realistic-sounding human-like voices by directly modelling waveforms using a neural network method trained with recordings of real speech. Tests with US English and Mandarin reportedly showed that the system outperformed Google's best existing text-to-speech (TTS) systems, although as of 2016 its text-to-speech synthesis was still less convincing than actual human speech. WaveNet's ability to generate raw waveforms means that it can model any kind of audio, including music.

References

  1. "Das Siemens-Studio für elektronische Musik von Alexander Schaaf und Helmut Klein" (in German). Deutsches Museum. Archived from the original on 2013-09-30.
  2. Siemens Electronic Music Studio in Deutsches Museum (multi-part video). Archived from the original on 2021-12-19. Details of the Siemens Electronic Music Studio exhibited at the Deutsches Museum.
  3. Harald Bode (October 1984). "History of Electronic Sound Modification". Journal of the Audio Engineering Society. 32 (10): 730–739.
  4. Spencer, Kristopher (2008). Film and Television Scores, 1950–1979: A Critical Survey by Genre. Jefferson, N.C.: McFarland & Co. ISBN 978-0-7864-3682-8.
  5. Mark Jenkins (2007). Analog Synthesizers: From the Legacy of Moog to Software Synthesis. Elsevier. pp. 133–134. ISBN 978-0-240-52072-8. Retrieved 2011-05-27.
  6. Sue Sillitoe (February 1999). "Recording Cher's 'Believe'". Sound On Sound. Historical footnote by Matt Bell: "Cher's 'Believe' (Dec 1998) was the first commercial recording to feature the audible side-effects of Antares Auto-tune software used as a deliberate creative effect... As most people are now all-too familiar with the 'Cher effect', as it became known..."