Annodex

Annodex is a digital media format developed by CSIRO to provide annotation and indexing of continuous media, such as audio and video.

It is based on the Ogg container format, with an XML language called CMML (Continuous Media Markup Language) providing additional metadata. It is intended to create a Continuous Media Web (CMWeb), whereby continuous media can be manipulated in a similar manner to text media on the World Wide Web, including searching and dynamic arrangement of elements.

History

The specific design of the elements of the Continuous Media Web project was created by Silvia Pfeiffer and Conrad Parker at CSIRO Australia in mid-2001. Some of the ideas behind CMML and the generic addressing of temporal offsets had been proposed in a 1997 paper by Bill Simpson-Young and Ken Yap.

In January 2002 the Annodex team took on two students, Andrew Nesbit and Andre Pang, along with Simon Lai, who became the first person to author meaningful content in CMML. During this time the basics of the Annodex technology were designed, including the temporal URI fragments, the basic DTDs, the choice of the Ogg encapsulation format and the initial design of the libraries.

By late 2004, Andre Pang developed the Annodex plug-in for the Mozilla Firefox browser, allowing playback of Annodex media encoded with the Ogg Theora video codec and the Ogg Vorbis audio codec. Temporal URIs entered in the location bar provide server-side seeking into Annodex media and enable hyperlinking into and out of it through a table-of-contents clip list generated from the CMML content.

Over time the open-source community contributed increasingly to Annodex technology, starting with Debian packages by Jamie Wilkinson, Python bindings by Ben Leslie, and Perl bindings by Angus Lees. The command-line authoring tools were completed early in 2001 and were continually updated, adhering to Version 3 of the Annodex annotation standards by 2005. [1]

In November 2005, CSIRO decided to focus on closed-source research and on building products on top of the technology, losing interest in its open-source standard components. A decision was therefore made to separate the open-source components into their own organisation by creating an Annodex Foundation, similar in spirit to the many foundations that have been created around other FOSS technologies. [2]

Technology

The core technical specification documents for Annodex are developed through the Annodex community. They consist of the following components: CMML, the Annodex file format, and time intervals in URIs.

CMML

Continuous Media Markup Language is an XML markup language for time-continuous data such as audio and video. It divides a time-continuously sampled data stream into temporal sections (clips) and attaches HTML-like annotations, hyperlinks and metadata to each clip, enabling textual search over otherwise binary files.

Example of CMML Content

<cmml>
  <stream timebase="0">
    <import src="galaxies.mpg" contenttype="video/mpeg"/>
  </stream>
  <head>
    <title>Hidden Galaxies</title>
    <meta name="author" content="CSIRO"/>
  </head>
  <clip id="findingGalaxies" start="15">
    <a href="http://www.aao.gov.au/galaxies.anx#radio">
      Related video on detection of galaxies
    </a>
    <img src="galaxy.jpg"/>
    <desc>What's out there?</desc>
    <meta name="KEYWORDS" content="Radio Telescope"/>
  </clip>
</cmml>
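
Because CMML is plain XML, a client can inspect the clip structure with any XML parser. The following is a minimal sketch in Python (not part of the Annodex tooling; the variable names are illustrative) that reads a trimmed version of the example above with the standard library and lists each clip's start time, description and outgoing hyperlink:

import xml.etree.ElementTree as ET

CMML = """
<cmml>
  <head><title>Hidden Galaxies</title></head>
  <clip id="findingGalaxies" start="15">
    <a href="http://www.aao.gov.au/galaxies.anx#radio">Related video on detection of galaxies</a>
    <desc>What's out there?</desc>
  </clip>
</cmml>
"""

root = ET.fromstring(CMML)
print("Title:", root.findtext("head/title"))

# Each <clip> marks a temporal section of the media stream.
for clip in root.iter("clip"):
    print("clip id:", clip.get("id"))
    print("  starts at second:", clip.get("start"))
    print("  description:", (clip.findtext("desc") or "").strip())
    anchor = clip.find("a")
    if anchor is not None:
        print("  links to:", anchor.get("href"))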

The origin of the CMML document, along with further documentation and standards, can be found in the Annodex CMML Standard, Version 2.1.

Annodex File Format

[Figure: Annodex file structure]

Annodex is an encapsulation format that interleaves time-continuous data with CMML markup in a streamable manner. The Annodex format is built on the Ogg encapsulation format, which allows internet servers and proxies to manage temporal subparts and to reconstruct files from annodexed clips. It introduces additional stream types to carry this structuring information.
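
The streamability requirement means annotations must be multiplexed into the byte stream in time order rather than appended at the end. The sketch below is purely illustrative (it does not produce real Ogg pages, whose framing is considerably more involved); it shows the merge-by-timestamp discipline such a muxer has to follow so that a receiver can render annotations and media as they arrive:

import heapq

# Hypothetical packet lists as (timestamp_seconds, payload) tuples.
# In a real Annodex stream these would be Ogg packets from each
# logical bitstream (CMML, Theora video, Vorbis audio, ...).
cmml_packets = [(0.0, "<head>...</head>"), (15.0, '<clip id="findingGalaxies">')]
media_packets = [(0.0, "video frame 0"), (7.5, "video frame 1"),
                 (15.0, "video frame 2"), (22.5, "video frame 3")]

# heapq.merge yields items from both streams in timestamp order,
# which is the interleaving a streamable container needs: every
# annotation arrives no later than the media it describes.
for ts, payload in heapq.merge(cmml_packets, media_packets):
    print(f"{ts:6.1f}s  {payload}")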

Further information can be found in the Annodex Annotation Format for Time-continuous Bitstreams, Version 3.0.

Time intervals in URIs

To include time-continuous content such as audio and video in the Web, hyperlinks must be able to address temporal offsets within that content. Further information can be found in Annodex Time Intervals in URI Queries and Fragments.
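
For illustration, the Annodex drafts described time specifiers such as npt (normal play time) carried in a URI query or fragment, along the lines of http://example.com/galaxies.anx#t=npt:15/25 for the interval from second 15 to second 25. The exact grammar comes from the now-unavailable draft, so the syntax below is an assumption; the sketch merely shows how a client might split such a reference into a resource and a time interval:

from urllib.parse import urlsplit, parse_qs

def parse_time_interval(uri: str):
    """Split a temporal URI into (resource, start_s, end_s).

    Assumes an npt interval of the form t=npt:<start>/<end> in the
    fragment or query, in the style of the Annodex temporal-URI
    draft (the precise syntax is assumed here, not confirmed).
    """
    parts = urlsplit(uri)
    # Look in the fragment first, then in the query string.
    source = parts.fragment or parts.query
    spec = parse_qs(source).get("t", [None])[0]
    if spec is None:
        return uri, None, None
    spec = spec.removeprefix("npt:")
    start, _, end = spec.partition("/")
    resource = parts._replace(query="", fragment="").geturl()
    return resource, float(start), float(end) if end else None

print(parse_time_interval("http://example.com/galaxies.anx#t=npt:15/25"))
# -> ('http://example.com/galaxies.anx', 15.0, 25.0)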

Notes and references
