Timed Text Markup Language

Filename extension: .ttml, .dfxp, .xml
Internet media type: application/ttml+xml
Developed by: W3C
Initial release: 1 November 2004 [1]
Type of format: Timed text
Extended from: XML
Standard: W3C TTML1
Open format: Yes

Timed Text Markup Language (TTML), previously referred to as Distribution Format Exchange Profile (DFXP), is an XML-based W3C standard for timed text in online media. It was designed for authoring, transcoding and exchanging timed text information, and is presently used primarily for subtitling and captioning. TTML2, the second major revision of the language, was finalized on November 8, 2018. TTML has been widely adopted in the television industry, including by the Society of Motion Picture and Television Engineers (SMPTE), the European Broadcasting Union (EBU), ATSC, DVB, HbbTV and MPEG CMAF, and several profiles and extensions of the language now exist.
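As an illustration of the format described above, a minimal TTML document might look like the following sketch. The cue text, timings and style name are invented for illustration; the namespaces are those defined by the TTML specification:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<tt xmlns="http://www.w3.org/ns/ttml"
    xmlns:tts="http://www.w3.org/ns/ttml#styling"
    xml:lang="en">
  <head>
    <styling>
      <!-- A reusable style applied to the subtitle paragraphs below -->
      <style xml:id="s1" tts:color="white" tts:backgroundColor="black"/>
    </styling>
  </head>
  <body>
    <div>
      <!-- Each <p> is one timed cue; begin/end give media times -->
      <p begin="00:00:01.000" end="00:00:03.500" style="s1">Hello, world.</p>
      <p begin="00:00:04.000" end="00:00:06.000" style="s1">This is a TTML subtitle.</p>
    </div>
  </body>
</tt>
```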


TTML content may also be used directly as a distribution format and is widely supported in media players, with the exception of major web browsers, where WebVTT, the second W3C standard for timed text in online media, has better built-in support through the HTML5 <track> element; many organisations nevertheless deliver TTML content on web video using their own player code.


History

The idea of adding timing information on the Web by extending HTML [2] came very early on, out of the work done on the Synchronized Multimedia Integration Language. Work on the XML-based TTML started in 2003 [3] and an early draft was released in November 2004 as Timed Text (TT) Authoring Format 1.0 – Distribution Format Exchange Profile (DFXP). [4] The first version of TTML, TTML1, was finalized in November 2010.

In 2010, after discussions about its adoption in HTML5, the WHATWG opted for a new, more lightweight standard based on the popular SRT format, now named WebVTT. [5]

In February 2012 the FCC declared the SMPTE closed-captioning standard for online video content, a superset of TTML, as a "safe harbor interchange, delivery format". [6]

In 2015, Netflix, Home Box Office (HBO), Telestream, SMPTE, and W3C received a Technology & Engineering Emmy Award for the category “Standardization and Pioneering Development of Non-Live Broadband Captioning,” for their work on TTML.

Work on TTML2, the second version of TTML, started in February 2015; it was finalized in November 2018, along with a new revision of TTML1.


Profiles

The TTML standard specifies a wide range of features, of which only a subset is typically necessary for a given application. For this reason, the standard introduced the concept of profiles: subsets of required features drawn from the full specification. TTML1 defines three standard profiles: DFXP Transformation, DFXP Presentation and DFXP Full. Many profiles of TTML have been developed by W3C and other organizations over the years to subset or extend the features of TTML. The Timed Text Working Group maintains a registry used to identify TTML profiles.
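A document can declare which profile it requires via the `ttp:profile` attribute from the TTML parameter namespace. The sketch below, with an invented cue, shows a document declaring the DFXP Presentation profile by its TTML1 profile designator URI:

```xml
<!-- Declaring that this document requires the DFXP Presentation profile -->
<tt xmlns="http://www.w3.org/ns/ttml"
    xmlns:ttp="http://www.w3.org/ns/ttml#parameter"
    ttp:profile="http://www.w3.org/ns/ttml/profile/dfxp-presentation"
    xml:lang="en">
  <body>
    <div>
      <p begin="0s" end="2s">Example cue</p>
    </div>
  </body>
</tt>
```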

DFXP Transformation

This profile defines the minimum feature requirements that a transformation processor (e.g. caption converter) needs to support in order to be considered TTML compliant.

DFXP Presentation

This profile defines the minimum feature requirements that a presentation processor (e.g. video player) needs to support in order to be considered TTML compliant.


DFXP Full

This profile requires support for all features defined by the TTML specification.


SMPTE-TT

This profile extends TTML with three SMPTE-specific elements aimed at legacy formats. Interoperability with pre-existing and regionally-specific formats (such as CEA-708, CEA-608, DVB Subtitles, and WST (World System Teletext)) is provided by means of tunneling data or bit map images and adding necessary metadata. [7]

In February 2012, the U.S. Federal Communications Commission (FCC) declared SMPTE-TT a safe harbor interchange and delivery format.


EBU-TT

The European Broadcasting Union (EBU) defined several related profiles. EBU-TT Part 1 (Tech3350) uses a subset of TTML1, constraining the features to make it more suitable for archive, exchange and use with broadcast video and web video applications. [8] EBU-TT Part 3 (Tech3370) extends and constrains Part 1 further, in particular adding functionality to support live streaming of subtitles from the subtitle author to a distribution encoder. [9] EBU-TT-D (Tech3380) is a highly constrained profile of TTML1 intended specifically for distribution to players, and has been adopted by HbbTV, DVB and Freeview Play, for example. [10]


IMSC

TTML Profiles for Internet Media Subtitles and Captions specifies two profiles, a text-only profile and an image-only profile, intended to be used across subtitle and caption delivery applications worldwide, thereby simplifying interoperability, consistent rendering and conversion to other subtitling and captioning formats. It incorporates extensions from SMPTE-TT and EBU-TT.
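In the image-only profile, subtitles are delivered as pre-rendered images rather than text. A hedged sketch of such a cue is shown below; the file name and timings are invented, and it assumes the `smpte:backgroundImage` extension attribute from SMPTE-TT, which IMSC uses to reference PNG subtitle images:

```xml
<!-- Sketch of an image-profile cue: the subtitle is a pre-rendered PNG
     referenced via the SMPTE-TT backgroundImage extension attribute -->
<tt xmlns="http://www.w3.org/ns/ttml"
    xmlns:smpte="http://www.smpte-ra.org/schemas/2052-1/2010/smpte"
    xml:lang="en">
  <body>
    <div begin="00:00:01.000" end="00:00:03.000"
         smpte:backgroundImage="subtitle001.png"/>
  </body>
</tt>
```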



ATSC 3.0

ATSC A/343 requires subtitle and caption content essence to be either IMSC 1 Text or Image Profile conformant.


DVB

ETSI EN 303 560 v1.1.1 (May 2018) is the DVB TTML Subtitling Systems specification. It defines a default conformance point that is the common intersection of conformance between EBU-TT-D and IMSC 1 Text Profile. It allows subtitle and caption documents conformant to EBU-TT-D, IMSC 1 Text Profile or other profiles of TTML to be sent and signalled within DVB MPEG-2 transport streams, and includes the ability to embed fonts for subtitle presentation, also within the transport stream.

HbbTV 2

ETSI TS 102 796 V1.5.1 (2018-09) is the HbbTV 2.0.2 specification. It specifies that conformant players must be able to play back EBU-TT-D subtitles delivered online for example in ISO BMFF via MPEG DASH, as well as allowing for other existing broadcast subtitle formats.


Apple HLS

At WWDC 2017, Apple announced support for IMSC 1 Text Profile in HLS, and shortly afterwards shipped presentation support in its operating systems, including iOS and tvOS.

Freeview Play

Freeview Play — Technical Specification 2018 Profile Version: 3.0.9 (14/07/2017) defines the application requirements for the Freeview (UK) hybrid IPTV and Broadcast device for the UK market, conforming to the HbbTV specification, requiring support for "DASH streaming technology with integrated EBU-TT-D subtitles".


MPEG CMAF

CMAF is the Common Media Application Format published by MPEG as part 19 of MPEG-A, also published as ISO/IEC 23000-19:2018 Information technology -- Multimedia application format (MPEG-A) -- Part 19: Common media application format (CMAF) for segmented media. The format specifies CMFHD presentation profiles in which subtitle tracks shall include at least one "switching set" for each language and role in the IMSC 1 Text Profile, while also allowing for other representations of subtitles in WebVTT.



References

  1. "Timed Text (TT) Authoring Format 1.0 – Distribution Format Exchange Profile (DFXP)". Retrieved 16 February 2015.
  2. "Timed Interactive Multimedia Extensions for HTML (HTML+TIME)". Retrieved 9 August 2019.
  3. "W3C Launches Timed Text Working Group". Retrieved 9 August 2019.
  4. "Timed Text (TT) Authoring Format 1.0 – Distribution Format Exchange Profile (DFXP)". Retrieved 1 November 2004.
  5. "WebVTT versus TTML: XML considered harmful for web captions?". Retrieved 16 February 2015.
  6. "FCC Declares SMPTE Closed-Captioning Standard For Online Video Content As Safe Harbor Interchange, Delivery Format". Retrieved 20 February 2015.
  7. "SMPTE Timed Text Format (SMPTE ST 2052-1:2010)" (PDF). 3 December 2010.
  8. "Part 1: EBU-TT Part 1 - Subtitle format definition (EBU Tech 3350)". 24 May 2017.
  9. "Part 3: EBU-TT Part 3 Live Subtitling (EBU Tech 3370)". 24 May 2017.
  10. "EBU-TT-D Subtitling Distribution Format (Tech3380)". 22 May 2018.