A video coding format (or sometimes video compression format) is a content representation format for storage or transmission of digital video content (such as in a data file or bitstream). It typically uses a standardized video compression algorithm, most commonly based on discrete cosine transform (DCT) coding and motion compensation. Examples of video coding formats include H.262 (MPEG-2 Part 2), MPEG-4 Part 2, H.264 (MPEG-4 Part 10), HEVC (H.265), Theora, RealVideo RV40, VP9, and AV1. A specific software or hardware implementation capable of compression or decompression to/from a specific video coding format is called a video codec; an example of a video codec is Xvid, which is one of several codecs that implement encoding and decoding of video in the MPEG-4 Part 2 video coding format in software.
Some video coding formats are documented by a detailed technical specification document known as a video coding specification. Some such specifications are written and approved by standardization organizations as technical standards, and are thus known as video coding standards. The term 'standard' is also sometimes used for de facto standards as well as formal standards.
Video content encoded using a particular video coding format is normally bundled with an audio stream (encoded using an audio coding format) inside a multimedia container format such as AVI, MP4, FLV, RealMedia, or Matroska. As such, the user normally doesn't have an H.264 file, but instead has a .mp4 video file, which is an MP4 container containing H.264-encoded video, normally alongside AAC-encoded audio. Multimedia container formats can contain any one of a number of different video coding formats; for example, the MP4 container format can contain video in either the MPEG-2 Part 2 or the H.264 video coding format, among others. Another example is the initial specification for the file type WebM, which specified the container format (Matroska), but also exactly which video (VP8) and audio (Vorbis) compression formats are used inside the Matroska container, even though the Matroska container format itself is capable of containing other video coding formats (VP9 video and Opus audio support was later added to the WebM specification).
Although video coding formats such as H.264 are sometimes referred to as codecs, there is a clear conceptual difference between a specification and its implementations. Video coding formats are described in specifications, and software or hardware to encode/decode data in a given video coding format from/to uncompressed video are implementations of those specifications. As an analogy, the video coding format H.264 (specification) is to the codec OpenH264 (specific implementation) what the C Programming Language (specification) is to the compiler GCC (specific implementation). Note that for each specification (e.g. H.264), there can be many codecs implementing that specification (e.g. x264, OpenH264, H.264/MPEG-4 AVC products and implementations).
This distinction is not consistently reflected terminologically in the literature. The H.264 specification calls H.261, H.262, H.263, and H.264 video coding standards and does not contain the word codec. The Alliance for Open Media clearly distinguishes between the AV1 video coding format and the accompanying codec they are developing, but calls the video coding format itself a video codec specification. The VP9 specification calls the video coding format VP9 itself a codec.
As an example of conflation, Chromium's and Mozilla's pages listing their video format support both refer to video coding formats such as H.264 as codecs. As another example, in Cisco's announcement of a free-as-in-beer video codec, the press release refers to the H.264 video coding format as a "codec" ("choice of a common video codec"), but calls Cisco's implementation of an H.264 encoder/decoder a "codec" shortly thereafter ("open-source our H.264 codec").
A video coding format does not dictate all algorithms used by a codec implementing the format. For example, a large part of how video compression typically works is by finding similarities between video frames (block-matching), and then achieving compression by copying previously-coded similar subimages (e.g., macroblocks) and adding small differences when necessary. Finding optimal combinations of such predictors and differences is an NP-hard problem, meaning that it is practically impossible to find an optimal solution. While the video coding format must support such compression across frames in the bitstream format, by not needlessly mandating specific algorithms for finding such block-matches and other encoding steps, the codecs implementing the video coding specification have some freedom to optimize and innovate in their choice of algorithms. For example, section 0.5 of the H.264 specification says that encoding algorithms are not part of the specification. Free choice of algorithm also allows different space–time complexity trade-offs for the same video coding format, so a live feed can use a fast but space-inefficient algorithm, while a one-time DVD encoding for later mass production can trade long encoding-time for space-efficient encoding.
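As a rough illustration of block-matching, the sketch below exhaustively searches a small window of a reference frame for the block that best matches a block of the current frame, using the sum of absolute differences (SAD) as the cost. The function names, the toy frames, and the choice of SAD and full search are assumptions made for this example; real encoders are free to use any search strategy, which is exactly the freedom described above.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between the n x n block of `cur` at (cx, cy)
    and the n x n block of `ref` at (rx, ry). Frames are lists of rows."""
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(n)
        for i in range(n)
    )


def best_match(cur, ref, cx, cy, n, search):
    """Try every displacement within +/- `search` pixels and return
    (dx, dy, cost) for the candidate block with the lowest SAD."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue  # candidate block would fall outside the reference frame
            cost = sad(cur, ref, cx, cy, rx, ry, n)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


# Toy data: a bright 2x2 patch at (2, 2) in the reference frame has moved
# one pixel right and one pixel down in the current frame.
ref = [[0] * 8 for _ in range(8)]
cur = [[0] * 8 for _ in range(8)]
for j in range(2):
    for i in range(2):
        ref[2 + j][2 + i] = 200
        cur[3 + j][3 + i] = 200

motion = best_match(cur, ref, cx=3, cy=3, n=2, search=2)
print(motion)  # (-1, -1, 0): the block is found one pixel up and left, with zero residual
```

A full search like this is simple but slow; a faster, less thorough search would still produce a valid bitstream, only with a less efficient encoding.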
The concept of analog video compression dates back to 1929, when R.D. Kell in Britain proposed the concept of transmitting only the portions of the scene that changed from frame-to-frame. The concept of digital video compression dates back to 1952, when Bell Labs researchers B.M. Oliver and C.W. Harrison proposed the use of differential pulse-code modulation (DPCM) in video coding. The concept of inter-frame motion compensation dates back to 1959, when NHK researchers Y. Taki, M. Hatori and S. Tanaka proposed predictive inter-frame video coding in the temporal dimension. In 1967, University of London researchers A.H. Robinson and C. Cherry proposed run-length encoding (RLE), a lossless compression scheme, to reduce the transmission bandwidth of analog television signals.
The earliest digital video coding algorithms were either for uncompressed video or used lossless compression, both methods inefficient and impractical for digital video coding. Digital video was introduced in the 1970s, initially using uncompressed pulse-code modulation (PCM) requiring high bitrates around 45–200 Mbit/s for standard-definition (SD) video, which was up to 2,000 times greater than the telecommunication bandwidth (up to 100 kbit/s) available until the 1990s. Similarly, uncompressed high-definition (HD) 1080p video requires bitrates exceeding 1 Gbit/s, significantly greater than the bandwidth available in the 2000s.
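The bitrate figures above can be checked with simple arithmetic. The sketch below assumes 24 bits per pixel and 30 frames per second for 1080p video; those two parameters are illustrative assumptions, not values given in the text.

```python
def uncompressed_bitrate(width, height, bits_per_pixel, frames_per_second):
    """Raw bitrate in bits per second for uncompressed video."""
    return width * height * bits_per_pixel * frames_per_second


# Assumed parameters (not from the text): 24 bits per pixel, 30 frames per second.
hd_1080p = uncompressed_bitrate(1920, 1080, 24, 30)
print(f"1080p: {hd_1080p / 1e9:.2f} Gbit/s")  # 1080p: 1.49 Gbit/s

# Figures from the text: 200 Mbit/s SD video versus a 100 kbit/s link.
ratio = 200_000_000 // 100_000
print(f"SD video needs {ratio}x the link bandwidth")  # SD video needs 2000x the link bandwidth
```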
Practical video compression was made possible by the development of motion-compensated DCT (MC DCT) coding, also called block motion compensation (BMC) or DCT motion compensation. This is a hybrid coding algorithm, which combines two key data compression techniques: discrete cosine transform (DCT) coding in the spatial dimension, and predictive motion compensation in the temporal dimension.
DCT coding is a lossy block compression transform coding technique that was first proposed by Nasir Ahmed, who initially intended it for image compression, while he was working at Kansas State University in 1972. It was then developed into a practical image compression algorithm by Ahmed with T. Natarajan and K. R. Rao at the University of Texas in 1973, and was published in 1974.
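To make the idea of lossy block transform coding concrete, the following is a minimal sketch of an orthonormal two-dimensional DCT-II on a single block, followed by uniform quantization and the inverse transform. The 8×8 block size, the quantization step of 10, the helper names, and the direct (slow) summation are all assumptions chosen for readability; this is not the fast algorithm used in practical codecs.

```python
import math

N = 8  # block size; 8x8 is assumed here for illustration


def _scale(k):
    # Normalisation factors that make the transform orthonormal.
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def _basis(i, k):
    # DCT-II basis function: sample position i, frequency index k.
    return math.cos((2 * i + 1) * k * math.pi / (2 * N))


def dct2(block):
    """Forward 2-D DCT-II of an N x N block (list of lists)."""
    return [
        [
            _scale(u) * _scale(v) * sum(
                block[x][y] * _basis(x, u) * _basis(y, v)
                for x in range(N) for y in range(N)
            )
            for v in range(N)
        ]
        for u in range(N)
    ]


def idct2(coeffs):
    """Inverse of dct2."""
    return [
        [
            sum(
                _scale(u) * _scale(v) * coeffs[u][v] * _basis(x, u) * _basis(y, v)
                for u in range(N) for v in range(N)
            )
            for y in range(N)
        ]
        for x in range(N)
    ]


def quantize(coeffs, step):
    """Uniform quantisation: the lossy step that discards fine detail."""
    return [[round(c / step) * step for c in row] for row in coeffs]


# A smooth gradient block: most of its energy lands in a few low frequencies.
block = [[100 + 2 * x + y for y in range(N)] for x in range(N)]
quantized = quantize(dct2(block), step=10)
nonzero = sum(1 for row in quantized for c in row if c != 0)
restored = idct2(quantized)
max_error = max(
    abs(restored[x][y] - block[x][y]) for x in range(N) for y in range(N)
)
print(nonzero, "of", N * N, "coefficients are nonzero; max pixel error", round(max_error, 2))
```

For smooth image content, only a handful of coefficients remain nonzero after quantization, so the block can be stored far more compactly than its 64 original samples, at the cost of a small reconstruction error.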
The other key development was motion-compensated hybrid coding. In 1974, Ali Habibi at the University of Southern California introduced hybrid coding, which combines predictive coding with transform coding. He examined several transform coding techniques, including the DCT, Hadamard transform, Fourier transform, slant transform, and Karhunen-Loeve transform. However, his algorithm was initially limited to intra-frame coding in the spatial dimension. In 1975, John A. Roese and Guner S. Robinson extended Habibi's hybrid coding algorithm to the temporal dimension, using transform coding in the spatial dimension and predictive coding in the temporal dimension, developing inter-frame motion-compensated hybrid coding. For the spatial transform coding, they experimented with different transforms, including the DCT and the fast Fourier transform (FFT), developing inter-frame hybrid coders for them, and found that the DCT is the most efficient due to its reduced complexity, capable of compressing image data down to 0.25 bits per pixel for a videotelephone scene with image quality comparable to a typical intra-frame coder requiring 2 bits per pixel.
The DCT was applied to video encoding by Wen-Hsiung Chen, who developed a fast DCT algorithm with C.H. Smith and S.C. Fralick in 1977, and founded Compression Labs to commercialize DCT technology. In 1979, Anil K. Jain and Jaswant R. Jain further developed motion-compensated DCT video compression. This led to Chen developing a practical video compression algorithm, called motion-compensated DCT or adaptive scene coding, in 1981. Motion-compensated DCT later became the standard coding technique for video compression from the late 1980s onwards.
The first digital video coding standard was H.120, developed by the CCITT (now ITU-T) in 1984. H.120 was not usable in practice, as its performance was too poor. H.120 used motion-compensated DPCM coding, a lossless compression algorithm that was inefficient for video coding. During the late 1980s, a number of companies began experimenting with discrete cosine transform (DCT) coding, a much more efficient form of compression for video coding. The CCITT received 14 proposals for DCT-based video compression formats, in contrast to a single proposal based on vector quantization (VQ) compression. The H.261 standard was developed based on motion-compensated DCT compression. H.261 was the first practical video coding standard, and was developed with patents licensed from a number of companies, including Hitachi, PictureTel, NTT, BT, and Toshiba, among others. Since H.261, motion-compensated DCT compression has been adopted by all the major video coding standards (including the H.26x and MPEG formats) that followed.
MPEG-1, developed by the Moving Picture Experts Group (MPEG), followed in 1991, and it was designed to compress VHS-quality video. It was succeeded in 1994 by MPEG-2/H.262, which was developed with patents licensed from a number of companies, primarily Sony, Thomson and Mitsubishi Electric. MPEG-2 became the standard video format for DVD and SD digital television. Its motion-compensated DCT algorithm was able to achieve a compression ratio of up to 100:1, enabling the development of digital media technologies such as video-on-demand (VOD) and high-definition television (HDTV). In 1999, it was followed by MPEG-4/H.263, which was a major leap forward for video compression technology. It was developed with patents licensed from a number of companies, primarily Mitsubishi, Hitachi and Panasonic.
The most widely used video coding format as of 2019 is H.264/MPEG-4 AVC. It was developed in 2003, with patents licensed from a number of organizations, primarily Panasonic, Godo Kaisha IP Bridge and LG Electronics. In contrast to the standard DCT used by its predecessors, AVC uses the integer DCT. H.264 is one of the video encoding standards for Blu-ray Discs; all Blu-ray Disc players must be able to decode H.264. It is also widely used by streaming internet sources, such as videos from YouTube, Netflix, Vimeo, and the iTunes Store, web software such as the Adobe Flash Player and Microsoft Silverlight, and also various HDTV broadcasts over terrestrial (Advanced Television Systems Committee standards, ISDB-T, DVB-T or DVB-T2), cable (DVB-C), and satellite (DVB-S2).
A main problem for many video coding formats has been patents, making them expensive to use or potentially risking a patent lawsuit due to submarine patents. The motivation behind many recently designed video coding formats such as Theora, VP8 and VP9 has been to create a (libre) video coding standard covered only by royalty-free patents. Patent status has also been a major point of contention for the choice of which video formats the mainstream web browsers will support inside the HTML5 video tag.
The current-generation video coding format is HEVC (H.265), introduced in 2013. While AVC uses the integer DCT with 4x4 and 8x8 block sizes, HEVC uses integer DCT and DST transforms with varied block sizes between 4x4 and 32x32. As of 2019, AVC is by far the most commonly used format for the recording, compression and distribution of video content, used by 91% of video developers, followed by HEVC which is used by 43% of developers. HEVC is heavily patented, with the majority of patents belonging to Samsung Electronics, GE, NTT and JVC Kenwood. It is currently being challenged by the aiming-to-be-freely-licensed AV1 format.
|Basic algorithm||Video coding standard||Year||Publisher(s)||Committee(s)||Licensor(s)||Market share (2019)||Popular implementations|
|DCT||H.261||1988||CCITT||VCEG||Hitachi, PictureTel, NTT, BT, Toshiba, etc.||N/A||Videoconferencing, videotelephony|
|DCT||Motion JPEG (MJPEG)||1992||JPEG||JPEG||N/A||N/A||QuickTime|
|DCT||MPEG-1 Part 2||1993||ISO, IEC||MPEG||Fujitsu, IBM, Matsushita, etc.||N/A||Video-CD, Internet video|
|DCT||H.262 / MPEG-2 Part 2 (MPEG-2 Video)||1995||ISO, IEC, ITU-T||MPEG, VCEG||Sony, Thomson, Mitsubishi, etc.||29%||DVD Video, Blu-ray, DVB, ATSC, SVCD, SDTV|
|DCT||DV||1995||IEC||IEC||Sony, Panasonic||Unknown||Camcorders, digital cassettes|
|DCT||H.263||1996||ITU-T||VCEG||Mitsubishi, Hitachi, Panasonic, etc.||Unknown||Videoconferencing, videotelephony, H.320, Integrated Services Digital Network (ISDN), mobile video (3GP), MPEG-4 Visual|
|DCT||MPEG-4 Part 2 (MPEG-4 Visual)||1999||ISO, IEC||MPEG||Mitsubishi, Hitachi, Panasonic, etc.||Unknown||Internet video, DivX, Xvid|
|DWT||Motion JPEG 2000 (MJ2)||2001||JPEG||JPEG||N/A||Unknown||Digital cinema|
|DCT||Advanced Video Coding (H.264 / MPEG-4 AVC)||2003||ISO, IEC, ITU-T||MPEG, VCEG||Panasonic, Godo Kaisha IP Bridge, LG, etc.||91%||Blu-ray, HD DVD, HDTV (DVB, ATSC), video streaming (YouTube, Netflix, Vimeo), iTunes Store, iPod Video, Apple TV, videoconferencing, Flash Player, Silverlight, VOD|
|DCT||Theora||2004||Xiph||Xiph||N/A||Unknown||Internet video, web browsers|
|DCT||VC-1||2006||SMPTE||SMPTE||Microsoft, Panasonic, LG, Samsung, etc.||Unknown||Blu-ray, Internet video|
|DCT||Apple ProRes||2007||Apple||Apple||Apple||Unknown||Video production, post-production|
|DCT||High Efficiency Video Coding (H.265 / MPEG-H HEVC)||2013||ISO, IEC, ITU-T||MPEG, VCEG||Samsung, GE, NTT, JVC Kenwood, etc.||43%||UHD Blu-ray, DVB, ATSC 3.0, UHD streaming, High Efficiency Image Format, macOS High Sierra, iOS 11|
|DCT||Versatile Video Coding (VVC / H.266)||2020||JVET||JVET||Unknown||N/A||N/A|
Consumer video is generally compressed using lossy video codecs, since that results in significantly smaller files than lossless compression. While there are video coding formats designed explicitly for either lossy or lossless compression, some video coding formats such as Dirac and H.264 support both.
Uncompressed video formats, such as Clean HDMI, are a form of lossless video used in some circumstances, such as when sending video to a display over an HDMI connection. Some high-end cameras can also capture video directly in this format.
Interframe compression complicates editing of an encoded video sequence. One subclass of relatively simple video coding formats is the intra-frame video formats, such as DV, in which each frame of the video stream is compressed independently without referring to other frames in the stream, and no attempt is made to take advantage of correlations between successive pictures over time for better compression. One example is Motion JPEG, which is simply a sequence of individually JPEG-compressed images. This approach is quick and simple, at the expense of the encoded video being much larger than with a video coding format that supports inter-frame coding.
Because interframe compression copies data from one frame to another, if the original frame is simply cut out (or lost in transmission), the following frames cannot be reconstructed properly. Making 'cuts' in intraframe-compressed video while video editing is almost as easy as editing uncompressed video: one finds the beginning and ending of each frame, and simply copies bit-for-bit each frame that one wants to keep, and discards the frames one doesn't want. Another difference between intraframe and interframe compression is that, with intraframe systems, each frame uses a similar amount of data. In most interframe systems, certain frames (such as "I frames" in MPEG-2) aren't allowed to copy data from other frames, so they require much more data than other frames nearby.
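The dependency problem can be sketched with a deliberately simplified model in which a stream is a string of frame types, "I" frames are self-contained, and each "P" frame depends only on the frame immediately before it. This model, the function name, and the example streams are assumptions for illustration; real formats also have frames that refer to several or to later frames.

```python
def frames_needed(frame_types, target):
    """Return the indices of all frames that must be decoded in order to
    reconstruct frame `target`, walking back to the nearest I frame."""
    start = target
    while start >= 0 and frame_types[start] != "I":
        start -= 1
    if start < 0:
        raise ValueError("no I frame precedes the target frame")
    return list(range(start, target + 1))


interframe_stream = "IPPPIPPP"  # hypothetical stream with an I frame every 4 frames
intraframe_stream = "IIIIIIII"  # every frame is coded independently

print(frames_needed(interframe_stream, 6))  # [4, 5, 6]
print(frames_needed(intraframe_stream, 6))  # [6]
```

In the intra-frame stream, a cut at any frame needs only that frame, whereas in the inter-frame stream a cut at frame 6 cannot simply discard frames 4 and 5.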
It is possible to build a computer-based video editor that spots problems caused when I frames are edited out while other frames need them. This has allowed newer formats like HDV to be used for editing. However, this process demands much more computing power than editing intraframe-compressed video with the same picture quality. This kind of compression is also not very effective for audio formats.
A video coding format can define optional restrictions to encoded video, called profiles and levels. It is possible to have a decoder which only supports decoding a subset of profiles and levels of a given video format, for example to make the decoder program/hardware smaller, simpler, or faster.
A profile restricts which encoding techniques are allowed. For example, the H.264 format includes the profiles baseline, main and high (and others). While P-slices (which can be predicted based on preceding slices) are supported in all profiles, B-slices (which can be predicted based on both preceding and following slices) are supported in the main and high profiles but not in baseline.
A level is a restriction on parameters such as maximum resolution and data rates.
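A decoder's support for profiles and levels can be modelled as a simple capability check. In the sketch below, the slice-type table encodes only what is stated above about P- and B-slices in the baseline, main and high profiles; the `Decoder` class, the function names, and the example level values are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass

# P- and B-slice support per H.264 profile, as described in the text.
# Other slice types and other profiles are intentionally not modelled.
PROFILE_SLICE_TYPES = {
    "baseline": {"P"},
    "main": {"P", "B"},
    "high": {"P", "B"},
}


@dataclass(frozen=True)
class Decoder:
    """Hypothetical decoder description: supported profiles and highest level."""
    profiles: frozenset
    max_level: float


def profile_allows(profile, slice_type):
    """True if streams in `profile` may contain slices of `slice_type`."""
    return slice_type in PROFILE_SLICE_TYPES[profile]


def can_decode(decoder, profile, level):
    """A stream is decodable if its profile is supported and its level
    does not exceed the decoder's maximum level."""
    return profile in decoder.profiles and level <= decoder.max_level


# Example: a small decoder that only handles baseline streams up to level 3.1
# (the level value is an illustrative choice).
small = Decoder(profiles=frozenset({"baseline"}), max_level=3.1)

print(profile_allows("baseline", "B"))    # False
print(profile_allows("main", "B"))        # True
print(can_decode(small, "baseline", 3.0))  # True
print(can_decode(small, "high", 3.0))      # False
```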
A significant advance in image coding methodology occurred with the introduction of the concept of hybrid transform/DPCM coding (Habibi, 1974).
H.263 is similar to, but more complex than H.261. It is currently the most widely used international video compression standard for video telephony on ISDN (Integrated Services Digital Network) telephone lines.
In signal processing, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation. Any particular compression is either lossy or lossless. Lossless compression reduces bits by identifying and eliminating statistical redundancy. No information is lost in lossless compression. Lossy compression reduces bits by removing unnecessary or less important information. Typically, a device that performs data compression is referred to as an encoder, and one that performs the reversal of the process (decompression) as a decoder.
In information technology, lossy compression or irreversible compression is the class of data encoding methods that uses inexact approximations and partial data discarding to represent the content. These techniques are used to reduce data size for storing, handling, and transmitting content. Higher degrees of approximation create coarser images as more details are removed. This is opposed to lossless data compression, which does not degrade the data. The amount of data reduction possible using lossy compression is much higher than using lossless techniques.
Motion compensation is an algorithmic technique used to predict a frame in a video, given the previous and/or future frames by accounting for motion of the camera and/or objects in the video. It is employed in the encoding of video data for video compression, for example in the generation of MPEG-2 files. Motion compensation describes a picture in terms of the transformation of a reference picture to the current picture. The reference picture may be previous in time or even from the future. When images can be accurately synthesized from previously transmitted/stored images, the compression efficiency can be improved.
A video codec is software or hardware that compresses and decompresses digital video. In the context of video compression, codec is a portmanteau of encoder and decoder, while a device that only compresses is typically called an encoder, and one that only decompresses is a decoder.
Dirac is an open and royalty-free video compression format, specification and system developed by BBC Research & Development. Schrödinger and dirac-research are open and royalty-free software implementations of Dirac. Dirac format aims to provide high-quality video compression for Ultra HDTV and beyond, and as such competes with existing formats such as H.264 and VC-1.
Motion JPEG is a video compression format in which each video frame or interlaced field of a digital video sequence is compressed separately as a JPEG image.
A compression artifact is a noticeable distortion of media caused by the application of lossy compression. Lossy data compression involves discarding some of the media's data so that it becomes small enough to be stored within the desired disk space or transmitted (streamed) within the available bandwidth. If the compressor cannot store enough data in the compressed version, the result is a loss of quality, or introduction of artifacts. The compression algorithm may not be intelligent enough to discriminate between distortions of little subjective importance and those objectionable to the user.
Advanced Video Coding (AVC), also referred to as H.264 or MPEG-4 Part 10, Advanced Video Coding, is a video compression standard based on block-oriented, motion-compensated integer-DCT coding. It is by far the most commonly used format for the recording, compression, and distribution of video content, used by 91% of video industry developers as of September 2019. It supports resolutions up to and including 8K UHD.
H.261 is an ITU-T video compression standard, first ratified in November 1988. It is the first member of the H.26x family of video coding standards in the domain of the ITU-T Study Group 16 Video Coding Experts Group, and was developed with a number of companies, including Hitachi, PictureTel, NTT, BT and Toshiba. It was the first video coding standard that was useful in practical terms.
An inter frame is a frame in a video compression stream which is expressed in terms of one or more neighboring frames. The "inter" part of the term refers to the use of inter-frame prediction. This kind of prediction tries to take advantage of temporal redundancy between neighboring frames, enabling higher compression rates.
x264 is a free and open-source software library and a command-line utility developed by VideoLAN for encoding video streams into the H.264/MPEG-4 AVC video coding format. It is released under the terms of the GNU General Public License.
Quarter-pixel motion (also known as Q-pel motion or Qpel motion) refers to using a quarter of the distance between pixels as the motion vector precision for motion estimation and motion compensation in video compression schemes. It is used in many modern video coding formats such as MPEG-4 ASP, H.264/AVC, and HEVC. Though higher precision motion vectors take more bits to encode, they can sometimes result in more efficient compression overall, by increasing the quality of the prediction signal.
The Video Coding Experts Group or Visual Coding Experts Group is a working group of the ITU Telecommunication Standardization Sector (ITU-T) concerned with video coding standards. It is responsible for standardization of the "H.26x" line of video coding standards, the "T.8xx" line of image coding standards, and related technologies.
The macroblock is a processing unit in image and video compression formats based on linear block transforms, typically the discrete cosine transform (DCT). A macroblock typically consists of 16×16 samples, and is further subdivided into transform blocks, and may be further subdivided into prediction blocks. Formats which are based on macroblocks include JPEG, where they are called MCU blocks, H.261, MPEG-1 Part 2, H.262/MPEG-2 Part 2, H.263, MPEG-4 Part 2, and H.264/MPEG-4 AVC. In H.265/HEVC, the macroblock as a basic processing unit has been replaced by the coding tree unit.
VP8 is an open and royalty-free video compression format created by On2 Technologies as a successor to VP7 and owned by Google from 2010.
Multiview Video Coding is a stereoscopic video coding standard for video compression that allows for the efficient encoding of video sequences captured simultaneously from multiple camera angles in a single video stream. It uses the 2D plus Delta method and is an amendment to the H.264 video compression standard, developed jointly by MPEG and VCEG, with contributions from a number of companies, primarily Panasonic and LG Electronics.
High Efficiency Video Coding (HEVC), also known as H.265 and MPEG-H Part 2, is a video compression standard designed as part of the MPEG-H project as a successor to the widely used Advanced Video Coding. In comparison to AVC, HEVC offers from 25% to 50% better data compression at the same level of video quality, or substantially improved video quality at the same bit rate. It supports resolutions up to 8192×4320, including 8K UHD, and unlike the primarily 8-bit AVC, HEVC's higher fidelity Main10 profile has been incorporated into nearly all supporting hardware.
VP9 is an open and royalty-free video coding format developed by Google.
ZPEG is a motion video technology that applies a human visual acuity model to a decorrelated transform-domain space, thereby optimally reducing the redundancies in motion video by removing the subjectively imperceptible. This technology is applicable to a wide range of video processing problems such as video optimization, real-time motion video compression, subjective quality monitoring, and format conversion.
Versatile Video Coding (VVC), also known as H.266, ISO/IEC 23090-3, MPEG-I Part 3 and Future Video Coding (FVC), is a video compression standard finalized on 6 July 2020, by the Joint Video Experts Team (JVET), a joint video expert team of the VCEG working group of ITU-T Study Group 16 and the MPEG working group of ISO/IEC JTC 1. It is the successor to High Efficiency Video Coding. The aim is to make 4K broadcast and streaming commercially viable.