S/PDIF

TOSLINK connector (JIS F05)

S/PDIF (Sony/Philips Digital Interface) [1] [2] is a type of digital audio interface used in consumer audio equipment to output audio over relatively short distances. The signal is transmitted over either a coaxial cable using RCA or BNC connectors, or a fiber-optic cable using TOSLINK connectors. S/PDIF interconnects components in home theaters and other digital high-fidelity systems.

S/PDIF is based on the AES3 interconnect standard. [3] S/PDIF can carry two channels of uncompressed PCM audio or compressed 5.1 surround sound (such as DTS or Dolby Digital); it cannot carry lossless surround formats, which require greater bandwidth. [4]

S/PDIF is a data link layer protocol as well as a set of physical layer specifications for carrying digital audio signals over either optical or electrical cable. The name stands for Sony/Philips Digital Interconnect Format but is also known as Sony/Philips Digital Interface. Sony and Philips were the primary designers of S/PDIF. S/PDIF is standardized in IEC 60958 as IEC 60958 type II (IEC 958 before 1998). [5]

Applications

A common use is to carry two channels of uncompressed digital audio from a CD player to an amplifying receiver.

The S/PDIF interface is also used to carry compressed digital audio for surround sound as defined by the IEC 61937 standard. This mode is used to connect the output of a Blu-ray, DVD player or computer, via optical or coax, to a home theatre amplifying receiver that supports Dolby Digital or DTS decoding.

Hardware specifications

Composite Video RCA connector (yellow)
Digital Audio Coaxial RCA connector (orange)

S/PDIF was developed at the same time as AES3, the main standard used to interconnect professional audio equipment. The various stakeholders wanted the two interfaces to be similar enough to allow the same, or very similar, designs for interfacing ICs. [6] S/PDIF is nearly identical to AES3 at the protocol level, [lower-alpha 1] but uses either coaxial cable (with RCA connectors) or optical fibre (TOSLINK; i.e., JIS F05 or EIAJ optical), both of which cost less than the XLR connection used by AES3. The RCA connectors are typically colour-coded orange to differentiate them from other RCA connector uses such as composite video. S/PDIF uses 75 Ω coaxial cable while AES3 uses 110 Ω balanced twisted pair.

Signals transmitted over consumer-grade TOSLINK connections are identical in content to those transmitted over coaxial connectors, though TOSLINK S/PDIF commonly exhibits higher jitter. One test found that a coaxial connection introduced less than 1 nanosecond of jitter, with an average of 8 picoseconds, while a TOSLINK cable introduced less than 2 nanoseconds, with an average of 100 picoseconds. [7]

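Comparison of AES3 and S/PDIF [8]

| | AES3 balanced | AES3 unbalanced | S/PDIF copper | S/PDIF optical |
|---|---|---|---|---|
| Cabling | 110 Ω STP | 75 Ω coaxial | 75 Ω coaxial | optical fibre |
| Connector | 3-pin XLR | BNC | RCA or BNC | TOSLINK |
| Output level | 2–7 V peak to peak | 1.0–1.2 V peak to peak | 0.5–0.6 V peak to peak | |
| Min. input level | 0.2 V | 0.32 V | 0.2 V | |
| Max. distance | 1000 m | 100 m | 10 m | |
| Modulation | Biphase mark code | Biphase mark code | Biphase mark code | Biphase mark code |
| Subcode information | ASCII ID text | ASCII ID text | SCMS copy protection info | SCMS copy protection info |
| Audio bit depth | 24 bits | 24 bits | 20 bits (24 bits optional) [citation needed] | 20 bits (24 bits optional) [citation needed] |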

Protocol specifications

S/PDIF is used to transmit digital signals in a number of formats, the most common being the 48 kHz sample rate format (used in Digital Audio Tape) and the 44.1 kHz format, used in CD audio. In order to support both sample rates, as well as others that might be needed, the format has no defined bit rate. Instead, the data is sent using biphase mark code, which has either one or two transitions for every bit, allowing the original word clock to be extracted from the signal itself.
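The self-clocking property of biphase mark code can be illustrated with a minimal Python sketch. The function names `bmc_encode` and `bmc_decode` are hypothetical, each bit cell is modelled as two half-cell line levels, and the S/PDIF preambles (which deliberately break the coding rule) are not modelled.

```python
def bmc_encode(bits, level=0):
    """Encode bits as biphase mark code.

    Returns two half-cell line levels per input bit. The level always
    toggles at the start of a bit cell; a 1 adds a second toggle in the
    middle of the cell, a 0 does not.
    """
    out = []
    for bit in bits:
        level ^= 1              # transition at every cell boundary
        out.append(level)
        if bit:
            level ^= 1          # extra mid-cell transition marks a 1
        out.append(level)
    return out


def bmc_decode(halves):
    """Recover bits: a cell whose two halves differ carries a 1."""
    return [int(halves[i] != halves[i + 1]) for i in range(0, len(halves), 2)]


data = [1, 0, 1, 1, 0]
line = bmc_encode(data)
print(line)              # half-cell levels on the wire
print(bmc_decode(line))  # original bits
```

Because only transitions carry information, inverting every line level decodes to the same bits, and the guaranteed transition at each cell boundary is what lets a receiver recover the clock at any sample rate.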

S/PDIF protocol differs from AES3 only in the channel status bits; see AES3 § Protocol for the high-level view. Both protocols group 192 samples into an audio block, and transmit one channel status bit per sample, providing one 192-bit channel status word per channel per audio block. For S/PDIF, the 192-bit status word is identical between the two channels and is divided into 12 words of 16 bits each, with the first 16 bits being a control code.
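As a rough illustration of how the 192 channel status bits of one audio block form a 24-byte status word, the following Python sketch packs them into bytes. The function name `pack_channel_status` is hypothetical, and storing bit n of each byte at position n (least significant bit first) is an assumption chosen so that the byte and bit numbers used for the control word can be read off directly.

```python
BLOCK_FRAMES = 192  # samples (frames) per audio block


def pack_channel_status(c_bits):
    """Pack one channel's 192 channel status bits into 24 bytes.

    c_bits[i] is the status bit carried with sample i of the block.
    Bit i lands in byte i // 8 at bit position i % 8 (assumed order).
    """
    if len(c_bits) != BLOCK_FRAMES:
        raise ValueError("an audio block carries exactly 192 status bits")
    status = bytearray(BLOCK_FRAMES // 8)
    for i, bit in enumerate(c_bits):
        if bit:
            status[i // 8] |= 1 << (i % 8)
    return bytes(status)


bits = [0] * BLOCK_FRAMES
bits[1] = 1    # byte 0, bit 1
bits[25] = 1   # byte 3, bit 1
status = pack_channel_status(bits)
print(len(status), status[:4].hex())
```

The 24 bytes correspond to the 12 words of 16 bits mentioned above, with the first two bytes forming the control code.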

S/PDIF control word components [9]

| Byte | Bit | Meaning |
|---|---|---|
| 0 | 0 | 0 = Consumer (S/PDIF); 1 = Professional (AES3), which changes the meaning to the AES3 channel status word |
| 0 | 1 | 0 = Normal PCM; 1 = Compressed data |
| 0 | 2 | 0 = Copy restrict; 1 = Copy permit |
| 0 | 3 | 0 = 2 channels; 1 = 4 channels |
| 0 | 4 | |
| 0 | 5 | 0 = No pre-emphasis; 1 = Pre-emphasis 50/15 |
| 0 | 6–7 | Mode, defines subsequent bytes, always zero |
| 1 | 0–6 | Audio source category indicating the type of source equipment (general, CD-DA, DVD, etc.) |
| 1 | 7 | L-bit, original or copy [upper-alpha 1] |
| 2 | 0–3 | Source number |
| 2 | 4–7 | Channel number |
| 3 | 0–3 | Sampling frequency: 0000 = 44.1 kHz, 0100 = 48 kHz, 1100 = 32 kHz |
| 3 | 4–5 | Clock accuracy: 10 = 50 ppm, 00 = 1000 ppm, 01 = variable pitch (requires special receiver) |
| 3 | 6–7 | (undefined) |
| 4 | 0 | 0 = Word length 20 bits; 1 = Word length 24 bits |
| 4 | 1–3 | Sample length (0 = undefined, 1–4 = word length minus 1–4 bits, 5 = full word length) |
| 4 | 4–7 | (undefined) |
| 5–11 | 0–7 | EAN-13 code (BCD?) |
| 11 | 4–7 | (undefined; padding on 13-digit EAN code) |
| 12–13 | 0–7 | (undefined) |
| 14 | 0–3 | (undefined) |
| 14–21 | 0–7 | ISRC (encoding unclear; ISRC is 2 alphabetic, 3 alphanumeric and 7 numeric characters, which is 26^2 × 36^3 × 10^7 ≈ 2^48.164 combinations and so fits easily into 7.5 bytes, but a naive 5 ASCII + 7 BCD encoding would need 8.5 bytes) |
| 22–23 | 0–7 | (undefined) |
  1. For most category codes, the L-bit indicates whether copy-restricted audio is original (may be copied once) or a copy (may not be recorded again). The L-bit is only used if bit 2 is zero, meaning copy-restricted audio. Its polarity depends on the category: recording is allowed if it is 1 for DVD-R and DVD-RW, but 0 for CD-R, CD-RW, and DVD. For plain CD-DA (ordinary nonrecordable CDs), the L-bit is not defined, and recording is prevented by alternating bit 2 at a rate of 4–10 Hz.
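A few of the fields in the control word table above can be decoded as in the following Python sketch. The function name `decode_channel_status` and the dictionary keys are hypothetical. Two assumptions are made: bit n of each byte is stored at bit position n, and the bit patterns in the table are written in bit order 0 to 3 (first bit on the left), a reading that is consistent with the `IEC958_AES3_CON_FS_*` constants in the Linux ALSA headers.

```python
# Sampling-frequency patterns from the table, written in bit order 0..3.
SAMPLE_RATES = {"0000": 44100, "0100": 48000, "1100": 32000}


def bit(byte, n):
    """Return bit n of a byte (bit 0 = least significant, by assumption)."""
    return (byte >> n) & 1


def decode_channel_status(status):
    """Decode selected consumer-mode fields from bytes 0-3 of the status word."""
    if len(status) < 4:
        raise ValueError("need at least the first four status bytes")
    b0, b1, b2, b3 = status[:4]
    fs_code = "".join(str(bit(b3, n)) for n in range(4))
    return {
        "consumer": bit(b0, 0) == 0,
        "compressed": bit(b0, 1) == 1,
        "copy_permit": bit(b0, 2) == 1,
        "category": b1 & 0x7F,          # byte 1, bits 0-6
        "l_bit": bit(b1, 7),            # byte 1, bit 7
        "source_number": b2 & 0x0F,     # byte 2, bits 0-3
        "channel_number": b2 >> 4,      # byte 2, bits 4-7
        "sample_rate_hz": SAMPLE_RATES.get(fs_code),  # None if not in table
    }


info = decode_channel_status(bytes([0x02, 0x00, 0x00, 0x02] + [0] * 20))
print(info["compressed"], info["sample_rate_hz"])
```

The sketch only covers fields whose layout the table states unambiguously; the EAN and ISRC fields are left out because their encoding is marked as unclear.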

Data framing

S/PDIF is meant to be used for transmitting 20-bit audio data streams plus other related information. S/PDIF can also transport 24-bit samples by way of four extra bits; however, not all equipment supports this, and these extra bits may be ignored.

To transmit sources with less than 20 bits of sample accuracy, the superfluous bits are set to zero, and the sample length field (byte 4, bits 1–3) is set accordingly.
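The zero-padding can be sketched as follows in Python. The function name `pad_sample` is hypothetical, and the sketch assumes the usual convention that the sample is aligned to the most significant end of the audio field, so the zeroed bits are the least significant ones.

```python
def pad_sample(sample, bits, word_bits=20):
    """Place a signed `bits`-wide sample in a `word_bits`-wide field.

    The sample is kept in two's complement, shifted to the most
    significant end, and the unused low-order bits are left at zero.
    """
    if not 1 <= bits <= word_bits:
        raise ValueError("sample width must fit in the audio word")
    unsigned = sample & ((1 << bits) - 1)   # two's-complement bit pattern
    return unsigned << (word_bits - bits)


print(hex(pad_sample(0x7FFF, 16)))       # 16-bit CD sample in a 20-bit word
print(hex(pad_sample(-1, 16)))           # negative sample, low 4 bits zero
print(hex(pad_sample(0x7FFF, 16, 24)))   # same sample in a 24-bit word
```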

IEC 61937 encapsulation

IEC 61937 defines a way to transmit compressed, multi-channel data over S/PDIF. [10]

  • Bit 1 of byte 0 of the control word is set to indicate the presence of non-linear-PCM data.
  • The sample rate is set to maintain the needed symbol (data) rate. The symbol rate is usually 64 times the sample rate.
  • Data is packed into blocks. Each data block is given an IEC 61937 preamble, containing two 16-bit sync words and indicating the state and identity (type, validity, bitstream number, length) of the encapsulated data. Padding is added to match the full block size as required by timing.
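The packing step in the list above can be sketched in Python as follows. This is an illustration under stated assumptions, not the normative layout: the sync word values 0xF872 and 0x4E1F and the AC-3 data-type code 0x01 follow the constants used by FFmpeg's S/PDIF muxer, the length field is assumed to be counted in bits (the unit varies by data type), only the data-type part of the third preamble word is filled in, and payload bytes are assumed to be paired big-endian into 16-bit words. The names `build_burst` and `line_rate` are hypothetical.

```python
SYNC_PA = 0xF872        # first 16-bit sync word (assumed value)
SYNC_PB = 0x4E1F        # second 16-bit sync word (assumed value)
DATA_TYPE_AC3 = 0x01    # data-type code assumed for AC-3


def build_burst(payload, data_type, burst_words):
    """Return one data block as a list of 16-bit words.

    Layout: two sync words, a burst-info word (data type only, all other
    fields left at zero), a length word, the payload, then zero padding
    up to burst_words.
    """
    length_bits = len(payload) * 8
    if length_bits > 0xFFFF:
        raise ValueError("payload too long for a 16-bit length field")
    if len(payload) % 2:
        payload += b"\x00"              # complete the last 16-bit word
    words = [SYNC_PA, SYNC_PB, data_type & 0x1F, length_bits]
    words += [int.from_bytes(payload[i:i + 2], "big")
              for i in range(0, len(payload), 2)]
    if len(words) > burst_words:
        raise ValueError("payload does not fit in the block")
    return words + [0] * (burst_words - len(words))


def line_rate(sample_rate_hz, bits_per_frame=64):
    """Symbol rate when each frame carries 64 bits, as stated above."""
    return sample_rate_hz * bits_per_frame


burst = build_burst(b"\x0b\x77\x01", DATA_TYPE_AC3, burst_words=3072)
print([hex(w) for w in burst[:6]], len(burst))
print(line_rate(48000))  # 3072000
```

A block of 3072 words corresponds to 1536 stereo frames of 16-bit data, which is the repetition period commonly used for AC-3; other formats use different block sizes.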

A number of encodings are available over IEC 61937, including Dolby AC-3/E-AC-3, Dolby TrueHD, MP3, AAC, ATRAC, DTS, and WMA Pro. [11] [12]

Limitations

The receiver does not control the data rate, so it must avoid bit slip by synchronizing its reception with the source clock. Many S/PDIF implementations cannot fully decouple the final signal from influence of the source or the interconnect. Specifically, the process of clock recovery used to synchronize reception may produce jitter. [13] [14] [15] If the DAC does not have a stable clock reference then noise will be introduced into the resulting analog signal. However, receivers can implement various strategies that limit this influence. [15] [16]

Notes

  1. Consumer S/PDIF supports the Serial Copy Management System, whereas professional interfaces do not.

References

  1. "S/PDIF Information". Intel. 21 July 2017. Retrieved 3 April 2018.
  2. "S/PDIF". Retrieved 3 April 2018.
  3. "SoundSystem SixPack 5.1+ True 6 Channel + Digital In & out – Stuff Worth Knowing" (PDF). TerraTec. 5 July 2001. p. 43. Retrieved 18 January 2011.
  4. Mark Johnson; Charles Crawford; Chris Armbrust (2007). High-Definition DVD Handbook: Producing for HD-DVD and Blu-Ray Disc. McGraw Hill Professional. pp. 4–10. ISBN 9780071485852. ...connections such as S/PDIF do not have the bandwidth necessary to deliver uncompressed surround sound...
  5. "Sound card". kioskea.net. Kioskea Network. Retrieved 4 August 2010. The components of a sound card are: [...] An SPDIF digital output (Sony Philips Digital Interface, also known as S/PDIF or S-PDIF or IEC 958 or IEC 60958 since 1998). This is an output line that sends digitised audio data to a digital amplifier using a coaxial cable with RCA connectors at the ends.
  6. Finger, Robert A. (1992). "AES3-1992: The Revised Two-Channel Digital Audio Interface". J. Audio Eng. Soc., Vol. 40, No. 3, March 1992, p. 108.
  7. "Toslink or Coax". Retrieved 15 April 2015.
  8. Dennis Bohn (2001). "Interfacing AES3 & S/PDIF" (PDF). Rane Corporation. p. 2. Retrieved 18 January 2011.
  9. Understanding/Analyzing Digital Audio Channel Status Bits at the Wayback Machine (archived 2019-02-28)
  10. Digital audio - Interface for non-linear PCM encoded audio bitstreams applying IEC 60958 - Part 1: General (IEC 61937-1:2007 + A1:2011); German version EN 61937-1:2007 + A1:2011
  11. "FFmpeg: libavformat/spdif.h File Reference". ffmpeg.org.
  12. "Representing Formats for IEC 61937 Transmissions - Win32 apps". learn.microsoft.com. 15 May 2023.
  13. Giorgio Pozzoli. "DIGITabilis: crash course on digital audio interfaces". tnt-audio.com.
  14. Chris Dunn, Malcolm J. Hawksford. "Is the AES/EBU/SPDIF Digital Audio Interface Flawed?" AES Convention 93, paper 3360.
  15. Tracy, Norman. "On Jitter, the S/PDIF Standard, and Audio DACs". Archived from the original on 1 July 2017.
  16. Lesso, Paul (2006). "A High Performance S/PDIF Receiver" (PDF). Audio Engineering Society. Archived from the original (PDF) on 4 June 2014. AES Convention 121, paper 6948.