MPEG-7

MPEG-7 is a multimedia content description standard, formally named Multimedia Content Description Interface and standardized as ISO/IEC 15938. [1] [2] [3] [4] The description is associated with the content itself, to allow fast and efficient searching for material that is of interest to the user. Unlike MPEG-1, MPEG-2 and MPEG-4, it is not a standard that deals with the actual encoding of moving pictures and audio. It uses XML to store metadata, and can be attached to timecode in order to tag particular events, or to synchronise lyrics to a song, for example.

It was designed to standardize:

  - a set of Descriptors (D) and Description Schemes (DS),
  - a language to specify these schemes, called the Description Definition Language (DDL), and
  - a scheme for coding the descriptions.

The combination of MPEG-4 and MPEG-7 has been sometimes referred to as MPEG-47. [5]

Introduction

MPEG-7 is intended to complement the previous MPEG standards by standardizing multimedia metadata: information about the content, not the content itself. MPEG-7 can be used independently of the other MPEG standards; the description might even be attached to an analog movie. The representation defined within MPEG-4, i.e. the representation of audio-visual data in terms of objects, is however very well suited to what will be built on the MPEG-7 standard, and this object-based representation is basic to the process of categorization. In addition, MPEG-7 descriptions could be used to improve the functionality of previous MPEG standards. With these tools, an MPEG-7 Description can be built and deployed. According to the requirements document, "a Description consists of a Description Scheme (structure) and the set of Descriptor Values (instantiations) that describe the Data." A Descriptor Value is "an instantiation of a Descriptor for a given data set (or subset thereof)." The Descriptor is the syntactic and semantic definition of the content. Extraction algorithms are outside the scope of the standard because their standardization is not required to allow interoperability.
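As an illustration, the sketch below builds a minimal MPEG-7-style description as XML from Python. It is a hedged example, not code from the standard: the nesting is simplified and the element names (Mpeg7, Description, CreationInformation, Title, MediaDuration, Keyword) are used loosely, so a real ISO/IEC 15938 document would be structured in more detail.

```python
# Minimal sketch of an MPEG-7-style description document.
# The element layout is simplified for illustration; the real
# ISO/IEC 15938 schemas define the structure much more precisely.
import xml.etree.ElementTree as ET

NS = "urn:mpeg:mpeg7:schema:2004"   # namespace associated with MPEG-7 schemas
ET.register_namespace("mpeg7", NS)

def build_description(title, duration_seconds, keywords):
    """Return an ElementTree whose leaf values act as Descriptor Values."""
    root = ET.Element(f"{{{NS}}}Mpeg7")
    desc = ET.SubElement(root, f"{{{NS}}}Description")
    creation = ET.SubElement(desc, f"{{{NS}}}CreationInformation")
    ET.SubElement(creation, f"{{{NS}}}Title").text = title
    # One Descriptor Value: an instantiation of a duration Descriptor,
    # written as an ISO 8601 duration.
    ET.SubElement(desc, f"{{{NS}}}MediaDuration").text = f"PT{duration_seconds}S"
    for keyword in keywords:
        ET.SubElement(desc, f"{{{NS}}}Keyword").text = keyword
    return ET.ElementTree(root)

tree = build_description("Sample clip", 95, ["news", "interview"])
tree.write("clip-description.xml", xml_declaration=True, encoding="utf-8")
```

In the terminology quoted above, the title string, the duration and the keywords play the role of Descriptor Values, while the schema that dictates where such values may appear corresponds to the Description Scheme.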

Parts

MPEG-7 (ISO/IEC 15938) consists of several parts; each part covers a certain aspect of the whole specification.

MPEG-7 Parts [4] [6]

  - Part 1 (ISO/IEC 15938-1): Systems. First edition 2002, latest edition 2002, latest amendment 2006. Covers the architectural framework of MPEG-7 and the carriage of MPEG-7 content: TeM (Textual format for MPEG-7) and the binary format for MPEG-7 descriptions (BiM). [7]
  - Part 2 (ISO/IEC 15938-2): Description definition language. First edition 2002, latest edition 2002.
  - Part 3 (ISO/IEC 15938-3): Visual. First edition 2002, latest edition 2002, latest amendment 2010.
  - Part 4 (ISO/IEC 15938-4): Audio. First edition 2002, latest edition 2002, latest amendment 2006.
  - Part 5 (ISO/IEC 15938-5): Multimedia description schemes. First edition 2003, latest edition 2003, latest amendment 2015.
  - Part 6 (ISO/IEC 15938-6): Reference software. First edition 2003, latest edition 2003, latest amendment 2011.
  - Part 7 (ISO/IEC 15938-7): Conformance testing. First edition 2003, latest edition 2003, latest amendment 2011.
  - Part 8 (ISO/IEC TR 15938-8): Extraction and use of MPEG-7 descriptions. First edition 2002, latest edition 2002, latest amendment 2011.
  - Part 9 (ISO/IEC 15938-9): Profiles and levels. First edition 2005, latest edition 2005, latest amendment 2012.
  - Part 10 (ISO/IEC 15938-10): Schema definition. First edition 2005, latest edition 2005.
  - Part 11 (ISO/IEC TR 15938-11): MPEG-7 profile schemas. First edition 2005, latest edition 2005, latest amendment 2012.
  - Part 12 (ISO/IEC 15938-12): Query format. First edition 2008, latest edition 2012.
  - Part 13 (ISO/IEC 15938-13): Compact descriptors for visual search. First edition 2015, latest edition 2015.

Relation between description and content

[Figure: Independence between description and content]

An MPEG-7 architecture requirement is that the description must be separate from the audiovisual content.

On the other hand, there must be a relation between the content and the description, so the description is multiplexed with the content itself. The figure above illustrates this relation between description and content.

MPEG-7 tools

[Figure: Relation between different tools and elaboration process of MPEG-7]

MPEG-7 uses the following tools:

  - Descriptor (D): a representation of a feature that defines its syntax and semantics.
  - Description Scheme (DS): specifies the structure and semantics of the relationships between its components, which may be both Descriptors and Description Schemes.
  - Description Definition Language (DDL): based on XML Schema, it allows the creation of new Description Schemes and Descriptors and the extension of existing ones (see the validation sketch below).
  - System tools: support binarization (BiM), synchronization, transport and storage of descriptions, and management and protection of intellectual property.

The figure above shows the relation between these tools and the elaboration process of MPEG-7.
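Because the DDL is based on XML Schema, a description can in principle be checked against the MPEG-7 schemas with any ordinary XML Schema processor. The sketch below shows this with the third-party Python package xmlschema; the file names are placeholders, and the schemas themselves are specified by ISO/IEC 15938-10 (Schema definition) rather than reproduced here.

```python
# Hedged sketch: validating a description against an MPEG-7 XML Schema.
# "mpeg7-v2.xsd" and "clip-description.xml" are hypothetical local files.
import xmlschema  # third-party package: pip install xmlschema

schema = xmlschema.XMLSchema("mpeg7-v2.xsd")

if schema.is_valid("clip-description.xml"):
    print("Description conforms to the schema.")
else:
    # iter_errors yields one object per validation problem, with location info.
    for error in schema.iter_errors("clip-description.xml"):
        print(error.reason, "at", error.path)
```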

MPEG-7 applications

There are many applications and application domains that can benefit from the MPEG-7 standard. A few application examples are:

  - digital libraries (for example, image catalogues and musical dictionaries)
  - multimedia directory services (for example, yellow pages)
  - broadcast media selection (radio and TV channels)
  - multimedia editing (personalised electronic news services, media authoring)

Limitations

The MPEG-7 standard was originally written in XML Schema (XSD), which constitutes semi-structured data. For example, the running time of a movie annotated using MPEG-7 in XML is machine-readable data, so software agents will know that the number expressing the running time is a positive integer, but such data is not machine-interpretable (it cannot be understood by agents), because it does not convey semantics (meaning). This discrepancy is known as the "Semantic Gap." To address this issue, there were many attempts to map the MPEG-7 XML Schema to the Web Ontology Language (OWL), producing structured-data equivalents of the terms of the MPEG-7 standard (MPEG-7Ontos, COMM, SWIntO, etc.). However, these mappings did not really bridge the "Semantic Gap," because low-level video features alone are inadequate for representing video semantics. [9] In other words, annotating an automatically extracted video feature, such as color distribution, does not provide the meaning of the actual visual content. [10]
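To make the gap concrete, the sketch below contrasts the two representations: parsing a running time out of an illustrative MPEG-7-style XML fragment, and stating the same fact as an RDF triple of the kind an OWL mapping would emit. The ontology namespace and the property name hasRunningTime are invented for illustration and are not taken from MPEG-7Ontos, COMM or SWIntO.

```python
# Hedged sketch of the "Semantic Gap": the XML value is machine-readable,
# but only the RDF statement ties it to formally defined semantics.
# The example.org namespace and hasRunningTime property are hypothetical.
import xml.etree.ElementTree as ET
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import XSD

xml_doc = """<Mpeg7><Description>
  <MediaDuration>PT5025S</MediaDuration>
</Description></Mpeg7>"""

# Machine-readable: the value can be parsed, but the tag name alone carries
# no formal meaning a software agent could reason over.
duration = ET.fromstring(xml_doc).find(".//MediaDuration").text
print("parsed duration string:", duration)

# Machine-interpretable: the same fact as an RDF triple whose predicate is
# defined in an ontology, so an agent can relate it to other concepts.
EX = Namespace("http://example.org/video-ontology#")
graph = Graph()
graph.add((URIRef("http://example.org/movies/sample"),
           EX.hasRunningTime,
           Literal(5025, datatype=XSD.integer)))
print(graph.serialize(format="turtle"))
```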


References

  1. ISO. "ISO/IEC 15938-1:2002 - Information technology -- Multimedia content description interface -- Part 1: Systems". Retrieved 2009-10-31.
  2. MPEG. "About MPEG - Achievements". chiariglione.org. Archived from the original on July 8, 2008. Retrieved 2009-10-31.
  3. MPEG. "Terms of Reference". chiariglione.org. Archived from the original on February 21, 2010. Retrieved 2009-10-31.
  4. MPEG. "MPEG standards - Full list of standards developed or under development". chiariglione.org. Archived from the original on April 20, 2010. Retrieved 2009-10-31.
  5. NetworkDictionary. "Complete Protocol dictionary, glossary and reference - M". Archived from the original on 2010-01-01. Retrieved 2009-12-26.
  6. ISO/IEC JTC 1/SC 29 (2009-10-30). "MPEG-7 (Multimedia content description interface)". Archived from the original on 2013-12-31. Retrieved 2009-11-10.
  7. ISO/IEC JTC1/SC29/WG11 (October 2004). "MPEG-7 Overview (version 10)". chiariglione.org. Retrieved 2009-11-01.
  8. "MPEG-7 Ontology" . Retrieved 29 June 2017.
  9. Sikos, Leslie F.; Powers, David M.W. (2015). "Knowledge-Driven Video Information Retrieval with LOD". Proceedings of the Eighth Workshop on Exploiting Semantic Annotations in Information Retrieval. pp. 35–37. doi:10.1145/2810133.2810141. ISBN 9781450337908. S2CID 16544890.
  10. Boll, Susanne; Klas, Wolfgang; Sheth, Amit (1998). "Overview on Using Metadata to Manage Multimedia Data". Using Metadata to Integrate and Apply Digital Media. McGraw-Hill. p. 3. ISBN 978-0070577350.