Timed text

Timed text is the presentation of text media in synchrony with other media, such as audio and video.



Typical applications of timed text include real-time subtitling of foreign-language movies on the Web, captioning for people who lack audio devices or have hearing impairments, karaoke, scrolling news items, and teleprompters.

Timed text for MPEG-4 movies and cellphone media is specified in MPEG-4 Part 17 Timed Text, and its MIME type is specified by RFC 3839.

Markup language specifications

The W3C maintains two standards for timed text on the Internet: the Timed Text Markup Language (TTML) [1] and WebVTT [2] (currently in draft stage). SMPTE created additional metadata structures for use in TTML and developed a profile of TTML called SMPTE-TT. [3] The DECE incorporated SMPTE Timed Text into its UltraViolet Common File Format specification.

Competing formats

The question of interoperability for timed text arose during the development of the SMIL 2.0 specification. Today, incompatible formats for captioning, subtitling and other forms of timed text are used on the Web. As a result, the text portion of a SMIL presentation often has to be targeted to a particular playback environment. Moreover, the accessibility community relies heavily on captioning to make audiovisual content accessible. The lack of an interoperable format adds significantly to the already high cost of captioning Web content.


The following is an extract from the English closed captioning file, in SubRip format, for the 1916 film Krazy Kat, Bugologist.

    1
    00:00:22,000 --> 00:00:27,000
    I'll teach thee Bugology, Ignatzes

    2
    00:00:40,000 --> 00:00:43,000
    Something tells me

    3
    00:00:58,000 --> 00:01:59,000
    Look, Ignatz, a sleeping bee
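A SubRip file like the extract above is a sequence of cues separated by blank lines, each consisting of a counter, a timing line and one or more lines of text. The following Python sketch shows one way such a file might be read into structured cues. The names `Cue` and `parse_srt` are hypothetical, and the parser assumes well-formed input with two-digit hours, which real-world files do not always guarantee.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Cue:
    index: int      # SubRip counter
    start_ms: int   # start time in milliseconds
    end_ms: int     # end time in milliseconds
    text: str       # cue text, possibly several lines


# SubRip timing line: HH:MM:SS,mmm --> HH:MM:SS,mmm
_TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def _to_ms(h, m, s, ms):
    """Convert the four captured timestamp fields to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def parse_srt(source: str) -> List[Cue]:
    """Parse SubRip text into a list of cues (sketch, assumes clean input)."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", source.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 2:
            continue
        match = _TIMING.search(lines[1])
        if match is None:
            continue
        fields = match.groups()
        cues.append(
            Cue(
                index=int(lines[0]),
                start_ms=_to_ms(*fields[:4]),
                end_ms=_to_ms(*fields[4:]),
                text="\n".join(lines[2:]),
            )
        )
    return cues


SAMPLE = """\
1
00:00:22,000 --> 00:00:27,000
I'll teach thee Bugology, Ignatzes

2
00:00:40,000 --> 00:00:43,000
Something tells me

3
00:00:58,000 --> 00:01:59,000
Look, Ignatz, a sleeping bee
"""

cues = parse_srt(SAMPLE)
```

Storing times as integer milliseconds rather than strings makes it straightforward to re-emit the cues in another format's timestamp syntax.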

The equivalent in W3C WebVTT is the following:

    WEBVTT

    00:22.000 --> 00:27.000
    I'll teach thee Bugology, Ignatzes

    00:40.000 --> 00:43.000
    Something tells me

    00:58.000 --> 01:59.000
    Look, Ignatz, a sleeping bee
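Comparing the two extracts, the visible differences are the `WEBVTT` header line, the absence of cue counters, and a full stop instead of a comma before the milliseconds. A minimal Python sketch of a converter built on those three observations follows. The function name `srt_to_webvtt` is hypothetical, and the sketch ignores features such as styling tags and positioning that a complete converter would need to handle.

```python
import re

# Matches a SubRip timestamp and captures the parts around the comma.
_SRT_TIMESTAMP = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_webvtt(source: str) -> str:
    """Convert simple SubRip text to WebVTT (sketch, plain cues only)."""
    out = ["WEBVTT", ""]
    for block in re.split(r"\n\s*\n", source.strip()):
        lines = block.strip().splitlines()
        # Drop the SubRip counter; WebVTT cue identifiers are optional.
        if lines and lines[0].strip().isdigit():
            lines = lines[1:]
        if not lines or "-->" not in lines[0]:
            continue
        # Replace the comma before the milliseconds with a full stop.
        lines[0] = _SRT_TIMESTAMP.sub(r"\1.\2", lines[0])
        out.extend(lines)
        out.append("")  # blank line terminates the cue
    return "\n".join(out)
```

The sketch keeps the hours field in each timestamp. WebVTT allows the hours to be omitted, as in the extract above, but including them is equally valid.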

The equivalent in W3C TTML is the following:
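In TTML, each cue is typically a `p` element carrying `begin` and `end` attributes, nested inside `div` and `body` elements under a root `tt` element in the `http://www.w3.org/ns/ttml` namespace. The Python sketch below generates such a document for the same three cues. The function name `cues_to_ttml` and the bare structure (one `div`, no `head`, no styling or layout) are assumptions made for illustration, so the output should be read as one plausible rendering rather than a canonical one.

```python
from xml.sax.saxutils import escape, quoteattr

TTML_NS = "http://www.w3.org/ns/ttml"


def cues_to_ttml(cues, lang="en"):
    """Build a minimal TTML document from (begin, end, text) tuples."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        f"<tt xmlns={quoteattr(TTML_NS)} xml:lang={quoteattr(lang)}>",
        "  <body>",
        "    <div>",
    ]
    for begin, end, text in cues:
        # escape() protects &, < and > in the cue text.
        lines.append(
            f"      <p begin={quoteattr(begin)} end={quoteattr(end)}>"
            f"{escape(text)}</p>"
        )
    lines += ["    </div>", "  </body>", "</tt>"]
    return "\n".join(lines)


# Times use TTML clock-time syntax (hours:minutes:seconds.fraction).
CUES = [
    ("00:00:22.000", "00:00:27.000", "I'll teach thee Bugology, Ignatzes"),
    ("00:00:40.000", "00:00:43.000", "Something tells me"),
    ("00:00:58.000", "00:01:59.000", "Look, Ignatz, a sleeping bee"),
]

if __name__ == "__main__":
    print(cues_to_ttml(CUES))
```

Building the markup by string concatenation keeps the example short; a production tool would more likely use an XML library and add the styling and region metadata that TTML profiles expect.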


Related Research Articles

HTML: HyperText Markup Language

The HyperText Markup Language or HTML is the standard markup language for documents designed to be displayed in a web browser. It defines the meaning and structure of web content. It is often assisted by technologies such as Cascading Style Sheets (CSS) and scripting languages such as JavaScript.

Scalable Vector Graphics (SVG) is an XML-based vector image format for defining two-dimensional graphics, having support for interactivity and animation. The SVG specification is an open standard developed by the World Wide Web Consortium since 1999.

Synchronized Multimedia Integration Language: XML-based markup language for multimedia presentations

Synchronized Multimedia Integration Language (SMIL) is a World Wide Web Consortium recommended Extensible Markup Language (XML) markup language for describing multimedia presentations. It defines markup for timing, layout, animations, visual transitions, and media embedding, among other things. SMIL allows presenting media items such as text, images, video, audio, links to other SMIL presentations, and files from multiple web servers. SMIL markup is written in XML, and has similarities to HTML.

Closed captioning: Process of displaying interpretive texts to screens

Closed captioning (CC) and subtitling are both processes of displaying text on a television, video screen, or other visual display to provide additional or interpretive information. Both are typically used as a transcription of the audio portion of a program as it occurs, sometimes including descriptions of non-speech elements. Other uses have included providing a textual alternative language translation of a presentation's primary audio language that is usually burned-in to the video and unselectable.

Wireless Markup Language: Markup language intended for devices that implement the Wireless Application Protocol specification

Wireless Markup Language (WML), based on XML, is an obsolete markup language intended for devices that implement the Wireless Application Protocol (WAP) specification, such as mobile phones. It provides navigational support, data input, hyperlinks, text and image presentation, and forms, much like HTML. It preceded the use of other markup languages used with WAP, such as XHTML and HTML itself, which achieved dominance as processing power in mobile devices increased.

Material Exchange Format (MXF) is a container format for professional digital video and audio media defined by a set of SMPTE standards. A typical example of its use is for delivering advertisements to TV stations and tapeless archiving of broadcast TV programs. It is also used as part of the Digital Cinema Package for delivering movies to commercial theaters.

An HTML element is a type of HTML document component, one of several types of HTML nodes. The first version of HTML to be used was written by Tim Berners-Lee in 1993, and there have been many versions since. The most commonly used version is HTML 4.01, which became an official standard in December 1999. An HTML document is composed of a tree of simple HTML nodes, such as text nodes, and HTML elements, which add semantics and formatting to parts of the document. Each element can have HTML attributes specified. Elements can also have content, including other elements and text.

Vector Markup Language (VML) is an obsolete XML-based file format for two-dimensional vector graphics. It was specified in Part 4 of the Office Open XML standards ISO/IEC 29500 and ECMA-376. According to the specification, VML is a deprecated format included in Office Open XML for legacy reasons only.

MPEG-4 Part 17, or MPEG-4 Timed Text (MP4TT), or MPEG-4 Streaming text format is the text-based subtitle format for MPEG-4, published as ISO/IEC 14496-17 in 2006. It was developed in response to the need for a generic method for coding of text as one of the multimedia components within audiovisual presentations.

These tables compare features of multimedia container formats, most often used for storing or streaming digital video or digital audio content. To see which multimedia players support which container format, look at comparison of media players.

SubRip is a free software program for Microsoft Windows which extracts subtitles and their timings from various video formats to a text file. It is released under the GNU GPL. Its subtitle format's file extension is .srt and is widely supported. Each .srt file is a human-readable file format where the subtitles are stored sequentially along with the timing information. Most subtitles distributed on the Internet are in this format.

HTML5: Fifth and current version of hypertext markup language

HTML5 is a markup language used for structuring and presenting content on the World Wide Web. It is the fifth and final major HTML version that is a World Wide Web Consortium (W3C) recommendation. The current specification is known as the HTML Living Standard. It is maintained by the Web Hypertext Application Technology Working Group (WHATWG), a consortium of the major browser vendors.

Extensible HyperText Markup Language (XHTML) is part of the family of XML markup languages. It mirrors or extends versions of the widely used HyperText Markup Language (HTML), the language in which Web pages are formulated.

Animation of Scalable Vector Graphics, an open XML-based standard vector graphics format, is possible through various means.

CSS: Style sheet language

Cascading Style Sheets (CSS) is a style sheet language used for describing the presentation of a document written in a markup language such as HTML or XML. CSS is a cornerstone technology of the World Wide Web, alongside HTML and JavaScript.

WebVTT is a World Wide Web Consortium (W3C) standard for displaying timed text in connection with the HTML5 <track> element.

Timed Text Markup Language (TTML), previously referred to as Distribution Format Exchange Profile (DFXP), is an XML-based W3C standard for timed text in online media. It was designed for authoring, transcoding and exchanging timed text information, and is used primarily for subtitling and captioning. TTML2, the second major revision of the language, was finalized on November 8, 2018. TTML has been widely adopted in the television industry, including by the Society of Motion Picture and Television Engineers (SMPTE), the European Broadcasting Union (EBU), ATSC, DVB, HbbTV and MPEG CMAF, and several profiles and extensions of the language exist.

Interoperable Master Format (IMF) is a container format for the standardized digital delivery and storage of finished audio-visual masters, including movies, episodic content and advertisements.


  1. Glenn Adams (Ed.): Timed Text Markup Language (TTML) 1.0 - W3C Recommendation, 18 November 2010
  2. "WebVTT Draft Report". Archived from the original on 2015-10-06. Retrieved 2015-02-16.
  3. SMPTE (August 2010): ST-2052-1; SMPTE Timed Text (PDF). Copyright © 2010 SMPTE. Retrieved 2011-03-25.