SIGSALY

SIGSALY exhibit at the National Cryptologic Museum

SIGSALY (also known as the X System, Project X, Ciphony I, and the Green Hornet) was a secure speech system used in World War II for the highest-level Allied communications. It pioneered a number of digital communications concepts, including the first transmission of speech using pulse-code modulation.


The name SIGSALY was not an acronym, but a cover name that resembled an acronym: the SIG part was common in Army Signal Corps names (e.g., SIGABA). [1] The prototype was called the "Green Hornet" after the radio show The Green Hornet, because to anyone trying to eavesdrop on the conversation it sounded like a buzzing hornet, resembling the show's theme tune. [2]

Development

At the time of its inception, long-distance telephone communications used the "A-3" voice scrambler developed by Western Electric. It worked on the voice inversion principle. The Germans had a listening station on the Dutch coast which could intercept and break A-3 traffic. [1]

Although telephone scramblers were used by both sides in World War II, they were known to be insecure in general, and each side often cracked the other's scrambled conversations. Inspection of the audio spectrum using a spectrum analyzer often provided significant clues to the scrambling technique. The insecurity of most telephone scrambler schemes led to the development of a more secure alternative, based on the one-time pad principle.

A prototype was developed at Bell Telephone Laboratories, under the direction of A. B. Clark, assisted by British mathematician Alan Turing, [1] [3] and demonstrated to the US Army. The Army was impressed and awarded Bell Labs a contract for two systems in 1942. SIGSALY went into service in 1943 and remained in service until 1946.

Operation

SIGSALY used a random noise mask to encrypt voice conversations which had been encoded by a vocoder. The latter was used to minimize the amount of redundancy (which is high in voice traffic), in order to reduce the amount of information to be encrypted. [2]

The voice encoding used the fact that speech varies fairly slowly as the components of the throat move. The system extracted information about the voice signal 50 times a second (every 20 milliseconds): the vocoder reduced the speech to a set of slowly varying signals, consisting of the amplitude in each of a number of frequency bands together with a pitch signal. [4]

Next, each of these signals was sampled for its amplitude once every 20 milliseconds. [4] For the band amplitude signals, the amplitude was converted into one of six levels, with values from 0 through 5. The amplitude levels were on a nonlinear scale, with the steps between levels wider at high amplitudes and narrower at low amplitudes. This scheme, known as "companding" or "compressing-expanding", exploits the fact that the fidelity of voice signals is more sensitive to errors at low amplitudes than at high amplitudes. The pitch signal, which required greater sensitivity, was encoded by a pair of six-level values (one coarse and one fine), giving thirty-six levels in all.
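
As a rough illustration of this quantization scheme, the following Python sketch maps a normalized sample amplitude to one of the six levels through a logarithmic compression curve; the particular curve and its parameter are assumptions chosen for illustration, not SIGSALY's actual companding characteristic.

    import math

    def compand_quantize(amplitude, levels=6):
        # Quantize a normalized amplitude (0.0 to 1.0) to one of six levels
        # (0 through 5) using a logarithmic, mu-law-like compression curve,
        # so that the quantization steps are narrow at low amplitudes and
        # wide at high amplitudes. The curve and mu value are illustrative
        # assumptions, not SIGSALY's actual characteristic.
        mu = 15.0
        compressed = math.log(1 + mu * amplitude) / math.log(1 + mu)
        return min(int(compressed * levels), levels - 1)

    # A quiet sample still resolves to a distinct level, while a loud one saturates:
    print(compand_quantize(0.05))  # -> 1
    print(compand_quantize(0.90))  # -> 5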

A cryptographic key, consisting of a series of random values drawn from the same set of six levels, was subtracted from each sampled voice amplitude value to encrypt it before transmission. The subtraction was performed using modular arithmetic, in a "wraparound" fashion: if the result was negative, six was added to it to give a positive value. For example, if the voice amplitude value was 3 and the random key value was 5, then the subtraction would work as follows:

3 − 5 = −2, which wraps around to −2 + 6 = 4, giving a transmitted value of 4.

The sampled value was then transmitted, with each sample level transmitted on one of six corresponding frequencies in a frequency band, a scheme known as frequency-shift keying (FSK). The receiving SIGSALY read the frequency values, converted them into samples, and added the key values back to them to decrypt them. The addition was also performed modulo six, with six subtracted from any value over five. To match the example above, if the receiving SIGSALY got a sample value of 4 with a matching key value of 5, then the addition would be as follows:

4 + 5 = 9, which wraps around to 9 − 6 = 3, giving back the correct value of 3.
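
The modulo-6 key subtraction and addition described above can be sketched in a few lines of Python; the tone frequencies at the end are placeholders rather than SIGSALY's actual FSK channel plan.

    def encrypt_sample(voice_level, key_level):
        # Subtract the key level from the voice level modulo 6, i.e. the
        # "wraparound" subtraction described above.
        return (voice_level - key_level) % 6

    def decrypt_sample(cipher_level, key_level):
        # Add the same key level back modulo 6 to recover the voice level.
        return (cipher_level + key_level) % 6

    # Worked example from the text: voice level 3, key level 5.
    cipher = encrypt_sample(3, 5)       # (3 - 5) wraps around to 4
    plain = decrypt_sample(cipher, 5)   # (4 + 5) wraps around to 3
    assert (cipher, plain) == (4, 3)

    # Each level would then be sent as one of six FSK tones; these
    # frequencies are purely illustrative.
    fsk_tones_hz = {level: 1000 + 100 * level for level in range(6)}
    print(fsk_tones_hz[cipher])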

To convert the samples back into a voice waveform, they were first turned back into the dozen low-frequency vocoded signals, which were then used to drive an inversion of the vocoder process that synthesized the output speech.

The noise values used for the encryption key were originally produced by large mercury-vapor rectifying vacuum tubes and stored on a phonograph record. The record was then duplicated, with the records being distributed to SIGSALY systems on both ends of a conversation. The records served as the SIGSALY one-time pad, and distribution was very strictly controlled (although if one had been seized, it would have been of little importance, since only one pair of each was ever produced). For testing and setup purposes, a pseudo-random number generating system made out of relays, known as the "threshing machine", was used.
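
As a purely illustrative sketch, the Python snippet below generates one channel's worth of six-level key values for a single record; the real key material came from recorded tube noise pressed onto discs, not from a software random number generator, and the frame rate and record length are taken from the figures in this article.

    import secrets

    FRAMES_PER_SECOND = 50   # one set of vocoder samples every 20 ms
    RECORD_MINUTES = 12      # each record held roughly 12 minutes of key

    def make_key_record():
        # Generate one channel's worth of six-level key values for a record.
        # Illustrative only: SIGSALY's key came from noise produced by
        # mercury-vapor rectifier tubes and was recorded onto phonograph
        # records, not generated in software.
        n = FRAMES_PER_SECOND * 60 * RECORD_MINUTES
        return [secrets.randbelow(6) for _ in range(n)]

    key_record = make_key_record()
    print(len(key_record), key_record[:10])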

The records were played on turntables, but since the timing – the clock synchronization – between the two SIGSALY terminals had to be precise, the turntables were by no means just ordinary record-players. The rotation rate of the turntables was carefully controlled, and the records were started at highly specific times, based on precision time-of-day clock standards. Since each record only provided 12 minutes of key, each SIGSALY had two turntables, with a second record "queued up" while the first was "playing".

Usage

A SIGSALY terminal in 1943.

The SIGSALY terminal was massive, consisting of 40 racks of equipment. It weighed over 50 tons, and used about 30 kW of power, necessitating an air-conditioned room to hold it. Too big and cumbersome for general use, it was only used for the highest level of voice communications. [5]

A dozen SIGSALY terminal installations were eventually set up all over the world. The first was installed in the Pentagon building rather than the White House, which was served by an extension line, as US President Franklin D. Roosevelt knew of British Prime Minister Winston Churchill's insistence that he be able to call at any time of the day or night. The second was installed 60 metres (200 ft) below street level in the basement of Selfridges department store on Oxford Street, London, close to the US Embassy on Grosvenor Square. The first conference took place on 15 July 1943, and the terminal was used by both General Dwight D. Eisenhower, as commander of SHAEF, and Churchill before extensions were installed to the Embassy, 10 Downing Street and the Cabinet War Rooms. [1] One terminal was installed on a ship and followed General Douglas MacArthur during his South Pacific campaigns. In total, the system supported about 3,000 high-level telephone conferences during World War II.

The installation and maintenance of all SIGSALY machines was undertaken by the specially formed and vetted members of the 805th Signal Service Company of the US Army Signal Corps. The system was cumbersome, but it worked very effectively. When the Allies invaded Germany, an investigative team discovered that the Germans had recorded significant amounts of traffic from the system, but had erroneously concluded that it was a complex telegraphic encoding system. [1]

Significance

SIGSALY has been credited with a number of "firsts"; this list is taken from Bennett (1983):

  1. The first realization of enciphered telephony
  2. The first quantized speech transmission
  3. The first transmission of speech by pulse-code modulation (PCM)
  4. The first use of companded PCM
  5. The first examples of multilevel frequency-shift keying (FSK)
  6. The first useful realization of speech bandwidth compression
  7. The first use of FSK-FDM (Frequency Shift Keying-Frequency Division Multiplex) as a viable transmission method over a fading medium
  8. The first use of a multilevel "eye pattern" to adjust the sampling intervals (a new, and important, instrumentation technique)

See also

  - Amplitude modulation
  - Baseband
  - Delta modulation
  - Eye pattern
  - Frequency modulation
  - Frequency-shift keying
  - Homer Dudley
  - Modulation
  - NXDN
  - Pulse-code modulation
  - Scrambler
  - Secure voice
  - Sideband
  - Single-sideband modulation
  - Vocoder
  - Voice inversion

References

  1. Patrick D. Weadon. The SIGSALY Story. National Security Agency/Central Security Service.
  2. "Vox Ex Machina". 99% Invisible. Retrieved 2022-03-28.
  3. Liat Clark; Ian Steadman. "Turing's achievements: codebreaking, AI and the birth of computer science". Retrieved 12 February 2013.
  4. Jon D. Paul. "Rebuilding a Piece of the First Digital Voice Scrambler". IEEE Spectrum. 2019.
  5. Boone, J. V.; Peterson, R. R. (July 2000). The Start of the Digital Revolution: SIGSALY, Secure Digital Voice Communications in World War II (PDF). National Security Agency.