CTA-708

CTA-708 (formerly EIA-708 and CEA-708) is the standard for closed captioning for ATSC digital television (DTV) streams in the United States and Canada. It was developed by the Consumer Electronics sector of the Electronic Industries Alliance, which later became the standalone organization Consumer Technology Association.

Unlike run-length-encoded (RLE) DVB and DVD subtitles, CTA-708 captions are low-bandwidth and textual, like traditional EIA-608 captions and EBU Teletext subtitles. Unlike EIA-608 byte pairs, however, CTA-708 captions cannot be modulated onto an ATSC receiver's NTSC VBI line 21 composite output and must instead be rendered by the receiver and composited onto the digital video frames. They also cover more of the Latin-1 character set and include stubs to support full UTF-32 captions and downloadable fonts. A CTA-708 caption stream can also optionally encapsulate EIA-608 byte pairs internally, a fairly common usage. [1]

CTA-708 captions are injected into MPEG-2 video streams in the picture user data. The packets arrive in coded picture order and must be rearranged into display order, just as the picture frames are. The result is known as the DTVCC Transport Stream: a fixed-bandwidth channel with 960 bit/s typically allocated to backward-compatible "encapsulated" Line 21 captions and 1.08 kB/s (8.64 kbit/s) allocated to CTA-708 captions, for a total of 1.2 kB/s (9.6 kbit/s). [2] The ATSC A/53 standard contains the encoding specifics. The main form of signalling is a PSIP caption descriptor, which indicates the language of each caption service and whether it is formatted for "easy reader" (third-grade level, for language learners). The descriptor appears in the PSIP EIT on a per-event basis, and optionally in the H.222 PMT when the video always carries caption data.
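The cc_data() construct that carries these bytes can be unpacked with a few shifts and masks. A minimal Python sketch, assuming a standalone cc_data() blob laid out as in ATSC A/53 Part 4 (a flags byte, an em_data byte, then cc_count three-byte constructs); verify the layout against the standard before relying on it:

```python
def parse_cc_data(payload: bytes):
    """Yield (cc_valid, cc_type, byte_pair) triplets from a cc_data() blob."""
    # First byte: process_em_data_flag(1) process_cc_data_flag(1)
    #             additional_data_flag(1) cc_count(5)
    process_cc_data_flag = (payload[0] >> 6) & 0x01
    cc_count = payload[0] & 0x1F
    triplets = []
    if not process_cc_data_flag:
        return triplets
    pos = 2  # skip the em_data byte that follows the flags byte
    for _ in range(cc_count):
        b0 = payload[pos]
        cc_valid = (b0 >> 2) & 0x01
        # cc_type 0/1: EIA-608 field 1/2 byte pairs; 2/3: DTVCC channel data
        cc_type = b0 & 0x03
        triplets.append((cc_valid, cc_type, payload[pos + 1:pos + 3]))
        pos += 3
    return triplets
```

The cc_type field is where the 608-over-708 encapsulation mentioned above becomes visible: a decoder that only handles legacy captions keeps the type 0/1 pairs and discards the rest.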

CTA-708 caption decoders are required in the U.S. by FCC regulation in all 13-inch (33 cm) diagonal or larger digital televisions. Further, some broadcasters are required by FCC regulations to caption a percentage of their broadcasts.

CTA-708 Technical Details

Caption streams are transmitted inside several nested packet wrappers: the picture user data contains the caption data structure, which contains cc_data, which contains Caption Channel packets, which contain Service Blocks, which carry the caption streams themselves.

This layering is based on the OSI Protocol Reference Model:

OSI Layer      DTVCC Layer      Comments
Application    Interpretation   Issuing commands and appending text to windows
Presentation   Coding           Breaking up individual commands and characters
Session        Service          Service Block packets
--             Packet           DTVCC packet assembly from cc_data packets
Transport      Injection        cc_data packets extracted from video frames
Network        unused           Directly connected link
Link           --               SMPTE 259M, H.222, or MXF video frames split from the link format
Physical       --               SDI or 8VSB link format demodulated from the transmission
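The Packet and Service layers in the table can be illustrated with a small parser. A sketch assuming a fully reassembled Caption Channel packet, with field widths as given in CTA-708 (a 2-bit sequence number and 6-bit packet size code, then service blocks with 3-bit service numbers and 5-bit block sizes); treat it as illustrative rather than a reference implementation:

```python
def split_service_blocks(packet: bytes):
    """Return (sequence_number, [(service_number, block_data), ...])."""
    sequence_number = (packet[0] >> 6) & 0x03
    packet_size_code = packet[0] & 0x3F
    # A packet_size_code of 0 means the maximum packet size of 128 bytes
    packet_data_size = 127 if packet_size_code == 0 else packet_size_code * 2 - 1
    pos, end = 1, 1 + packet_data_size
    blocks = []
    while pos < end and packet[pos] != 0:  # a zero header byte is null padding
        service_number = (packet[pos] >> 5) & 0x07
        block_size = packet[pos] & 0x1F
        pos += 1
        if service_number == 7:  # extended services 7-63 use an extra byte
            service_number = packet[pos] & 0x3F
            pos += 1
        blocks.append((service_number, packet[pos:pos + block_size]))
        pos += block_size
    return sequence_number, blocks
```

Each service block's payload then feeds the Coding and Interpretation layers, where it is parsed into caption commands and text for one caption service.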

Picture User Data

These are inserted before an SMPTE 259M active video frame or before a video packet. Common video packets are a picture header, a picture parameter set, and a Material Exchange Format essence.

ISO/IEC 13818-2 (H.262) and ISO/IEC 14496-2 (MPEG-4 Video) user data structure prefix

Length      Name                       Type               Value
32 bits     user_data_start_code       patterned bslbf    0x000001B2 [3]
32 bits     user_identifier            ASCII bslbf        "GA94" [4]
8 bits      user_data_type_code        uimsbf             3
X*8 bits    user_data_type_structure   binary             free form

This structure was designed to carry any digital audio or metadata that must be synchronized with a video frame. SDI transports each eight-bit byte in a 10-bit-aligned word, and the ancillary-data flag bytes are replaced by a 128-bit header.
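Matching the prefix in the table above takes only a few comparisons. A minimal Python sketch that scans an MPEG-2 video elementary stream for GA94 caption user data; the scan-to-next-start-code step is a simplifying heuristic for this example, not something the standard defines:

```python
USER_DATA_START_CODE = b'\x00\x00\x01\xB2'
ATSC_IDENTIFIER = b'GA94'  # user_identifier, 0x47413934 in ASCII

def find_caption_user_data(es: bytes):
    """Return the user_data_type_structure payloads of GA94 type-3 user data."""
    payloads = []
    pos = es.find(USER_DATA_START_CODE)
    while pos != -1:
        body = pos + 4
        if es[body:body + 4] == ATSC_IDENTIFIER and len(es) > body + 4 \
                and es[body + 4] == 3:  # user_data_type_code for captions
            # Take bytes up to the next start code prefix (0x000001)
            nxt = es.find(b'\x00\x00\x01', body)
            payloads.append(es[body + 5:nxt if nxt != -1 else len(es)])
        pos = es.find(USER_DATA_START_CODE, pos + 4)
    return payloads
```

Each returned payload is the cc_data-bearing structure that the DTVCC Transport Stream layers above are reassembled from.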

A cdp_timecode is used when the CDP data stream is discontinuous (i.e., not padded), and cdp_service_info adds extra details to the PSIP broadcast metadata, such as the language code, easy-reader formatting, and widescreen usage.
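These CDP fields travel alongside cc_data inside a caption distribution packet header, defined in SMPTE 334-2 for SDI ancillary data. A sketch of decoding that fixed header; the field widths and flag positions here are quoted from memory of SMPTE 334-2 and should be checked against the specification before use:

```python
import struct

def parse_cdp_header(cdp: bytes):
    """Decode the fixed 7-byte caption distribution packet header."""
    cdp_identifier, cdp_length, rate_byte, flags, seq = \
        struct.unpack('>HBBBH', cdp[:7])
    assert cdp_identifier == 0x9669, "not a CDP"
    return {
        'cdp_length': cdp_length,
        'cdp_frame_rate': (rate_byte >> 4) & 0x0F,   # e.g. 4 = 29.97 fps
        'time_code_present': bool(flags & 0x80),     # cdp_timecode follows
        'ccdata_present': bool(flags & 0x40),
        'svcinfo_present': bool(flags & 0x20),       # cdp_service_info follows
        'caption_service_active': bool(flags & 0x02),
        'cdp_hdr_sequence_cntr': seq,
    }
```

The presence flags are what make the CDP self-describing: a padded, continuous stream can omit the time code section entirely, which is why cdp_timecode is only needed when the stream is discontinuous.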

Fonts

CTA-708 supports eight font tags: undefined, monospaced serif, proportional serif, monospaced sans serif, proportional sans serif, casual, cursive, and small capitals. For more on these styles, see the Wikipedia article on fonts.

[Figure: comparison of proportional vs. monospace fonts, with samples of a proportional serif (Rockwell) and a proportional sans serif (Arial) typeface.]
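The eight tags above are carried as a 3-bit font tag value in the caption stream's pen-attribute command. A plain lookup table, assuming the 0-7 ordering follows the order the tags are listed in above:

```python
# Font tag values 0-7; ordering assumed to match the list in the text.
FONT_TAGS = {
    0: "undefined (default)",
    1: "monospaced serif",
    2: "proportional serif",
    3: "monospaced sans serif",
    4: "proportional sans serif",
    5: "casual",
    6: "cursive",
    7: "small capitals",
}
```

Because the tags name styles rather than specific typefaces, each receiver maps them onto whatever locally available fonts it chooses.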


References

  1. https://www.adobe.com/content/dam/acom/en/devnet/video/pdfs/introduction_to_closed_captions.pdf (2015) "The majority of premium content produced for the United States today still contains 608 captions embedded in the 608 over 708 digital formats."
  2. https://ecfsapi.fcc.gov/file/6008646915.pdf
  3. Table A7, Picture User Data Syntax
  4. "Archived copy" (PDF). Archived from the original (PDF) on November 20, 2010. Retrieved May 25, 2012.