Video codec


A video codec is software or hardware that compresses and decompresses digital video. In the context of video compression, codec is a portmanteau of encoder and decoder; a device that only compresses is typically called an encoder, and one that only decompresses is called a decoder.

The compressed data format usually conforms to a standard video coding format. The compression is typically lossy, meaning that the compressed video lacks some information present in the original video. A consequence of this is that decompressed video has lower quality than the original, uncompressed video because there is insufficient information to accurately reconstruct the original video.

There are complex relationships between the video quality, the amount of data used to represent the video (determined by the bit rate), the complexity of the encoding and decoding algorithms, sensitivity to data losses and errors, ease of editing, random access, and end-to-end delay (latency).

History

Historically, video was stored as an analog signal on magnetic tape. Around the time when the compact disc entered the market as a digital-format replacement for analog audio, it became feasible to also store and convey video in digital form. Because of the large amount of storage and bandwidth needed to record and convey raw video, a method was needed to reduce the amount of data used to represent the raw video. Since then, engineers and mathematicians have developed a number of solutions for achieving this goal that involve compressing the digital video data.

In 1974, discrete cosine transform (DCT) compression was introduced by Nasir Ahmed, T. Natarajan and K. R. Rao. [1] [2] [3] During the late 1980s, a number of companies began experimenting with DCT lossy compression for video coding, leading to the development of the H.261 standard. [4] H.261 was the first practical video coding standard, [5] and was developed by a number of companies, including Hitachi, PictureTel, NTT, BT, and Toshiba, among others. [6] Since H.261, DCT compression has been adopted by all the major video coding standards that followed. [4]

The most popular video coding standards used for codecs have been the MPEG standards. MPEG-1 was developed by the Moving Picture Experts Group (MPEG) in 1991, and it was designed to compress VHS-quality video. It was succeeded in 1994 by MPEG-2/H.262, [5] which was developed by a number of companies, primarily Sony, Thomson and Mitsubishi Electric. [7] MPEG-2 became the standard video format for DVD and SD digital television. [5] In 1999, it was followed by MPEG-4/H.263, which was a major leap forward for video compression technology. [5] It was developed by a number of companies, primarily Mitsubishi Electric, Hitachi and Panasonic. [8]

The most widely used video coding format, as of 2016, is H.264/MPEG-4 AVC. It was developed in 2003 by a number of organizations, primarily Panasonic, Godo Kaisha IP Bridge and LG Electronics. [9] H.264 is the main video encoding standard for Blu-ray Discs, and is widely used by streaming internet services such as YouTube, Netflix, Vimeo, and iTunes Store, web software such as Adobe Flash Player and Microsoft Silverlight, and various HDTV broadcasts over terrestrial and satellite television.

AVC has been succeeded by HEVC (H.265), developed in 2013. It is heavily patented, with the majority of patents belonging to Samsung Electronics, GE, NTT and JVC Kenwood. [10] [11] The adoption of HEVC has been hampered by its complex licensing structure. HEVC is in turn succeeded by Versatile Video Coding (VVC).

There are also the open and free VP8, VP9 and AV1 video coding formats, used by YouTube, all of which were developed with involvement from Google.

Applications

Video codecs are used in DVD players, Internet video, video on demand, digital cable, digital terrestrial television, videotelephony and a variety of other applications. In particular, they are widely used in applications that record or transmit video, which may not be feasible with the high data volumes and bandwidths of uncompressed video. For example, they are used in operating theaters to record surgical operations, in IP cameras in security systems, and in remotely operated underwater vehicles and unmanned aerial vehicles. Any video stream or file can be encoded using a wide variety of live video format options. Streaming to an HTML5 video player, for example, requires specific H.264 encoder settings. [12]

Video codec design

Video codecs seek to represent a fundamentally analog data set in a digital format. Because of the design of analog video signals, which represent luminance (luma) and color information (chrominance, chroma) separately, a common first step in image compression in codec design is to represent and store the image in a YCbCr color space. The conversion to YCbCr provides two benefits: first, it improves compressibility by decorrelating the color signals; and second, it separates the luma signal, which is perceptually much more important, from the chroma signal, which is perceptually less important and can be represented at lower resolution using chroma subsampling to achieve more efficient data compression.

It is common to express the ratios of information stored in these channels in the form Y:Cb:Cr. Different codecs use different chroma subsampling ratios as appropriate to their compression needs. Video compression schemes for Web and DVD make use of a 4:2:0 color sampling pattern, and the DV standard uses 4:1:1 sampling ratios. Professional video codecs designed to function at much higher bitrates and to record a greater amount of color information for post-production manipulation sample in 4:2:2 and 4:4:4 ratios. Examples of these codecs include Panasonic's DVCPRO50 and DVCPRO HD codecs (4:2:2), Sony's HDCAM-SR (4:4:4), Panasonic's HD D5 (4:2:2), and Apple's ProRes 422 HQ (4:2:2). [13]
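The color conversion and chroma subsampling described above can be sketched in a few lines of Python. This is an illustration only, not the procedure of any particular codec: the function names are hypothetical, the conversion assumes the full-range BT.601 coefficients used by JFIF (the text does not specify a matrix), and the subsampler simply averages each 2×2 block, whereas real encoders may filter differently.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to YCbCr (assumed: full-range BT.601, as in JFIF)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round and clamp each component to the 8-bit range.
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))


def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging 2x2 blocks.

    `plane` is a list of rows with even width and height.
    """
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1] + 2) // 4
            for x in range(0, len(plane[0]), 2)
        ]
        for y in range(0, len(plane), 2)
    ]


def samples_per_frame(width, height, scheme):
    """Total Y + Cb + Cr samples in one frame for a given subsampling scheme."""
    # Each chroma plane holds 1/divisor as many samples as the luma plane.
    divisor = {"4:4:4": 1, "4:2:2": 2, "4:2:0": 4, "4:1:1": 4}[scheme]
    luma = width * height
    return luma + 2 * (luma // divisor)


if __name__ == "__main__":
    print(rgb_to_ycbcr(255, 0, 0))
    for scheme in ("4:4:4", "4:2:2", "4:2:0"):
        print(scheme, samples_per_frame(1920, 1080, scheme))
```

Under these assumptions a 1920×1080 frame carries half as many samples in 4:2:0 as in 4:4:4, before any transform or entropy coding has been applied.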

Video codecs can also operate in RGB space. These codecs tend not to sample the red, green, and blue channels at different ratios, since there is less perceptual motivation for doing so; at most, the blue channel could be undersampled.

Some amount of spatial and temporal downsampling may also be used to reduce the raw data rate before the basic encoding process. The most popular encoding transform is the 8×8 DCT. Block-based transform coding of this kind involves representing the video image as a set of macroblocks; for more information about this critical facet of video codec design, see B-frames. [14] Codecs which make use of a wavelet transform are also entering the market, especially in camera workflows which involve dealing with RAW image formatting in motion sequences.
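As a rough illustration of the 8×8 DCT mentioned above, the following Python sketch computes an orthonormal two-dimensional DCT-II directly from its definition. The function name `dct_2d` is hypothetical, and the direct four-loop form is chosen for readability; practical codecs use fast, often integer, approximations instead.

```python
import math

N = 8  # block size assumed here; the text names the 8x8 DCT


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct O(N^4) form)."""
    def scale(k):
        # Normalization that makes the transform orthonormal.
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = scale(u) * scale(v) * total
    return coeffs


if __name__ == "__main__":
    flat = [[100] * N for _ in range(N)]
    coeffs = dct_2d(flat)
    # A flat block puts all of its energy into the top-left (DC) coefficient.
    print(round(coeffs[0][0], 6))
```

For a perfectly flat block every coefficient other than the DC term comes out as zero (up to floating-point error), which is why smooth image regions compress well after the transform.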

The output of the transform is first quantized, then entropy encoding is applied to the quantized values. When a DCT has been used, the coefficients are typically scanned using a zig-zag scan order, and the entropy coding typically combines a number of consecutive zero-valued quantized coefficients with the value of the next non-zero quantized coefficient into a single symbol, and also has special ways of indicating when all of the remaining quantized coefficient values are equal to zero. The entropy coding method typically uses variable-length coding tables. Some encoders compress the video in a multiple step process called n-pass encoding (e.g. 2-pass), which performs a slower but potentially higher quality compression.
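The zig-zag scan and the pairing of zero runs with the next non-zero value can be sketched as follows. This is a simplified model under stated assumptions: the names `zigzag_order`, `run_level_encode` and the `EOB` marker are hypothetical, an end-of-block symbol is always emitted, and the final mapping of symbols to variable-length codes is left out because those tables are specific to each standard.

```python
EOB = "EOB"  # hypothetical end-of-block marker: all remaining coefficients are zero


def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag scan order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Walk anti-diagonals in turn; alternate direction on each diagonal.
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )


def run_level_encode(block):
    """Turn a block of quantized coefficients into (zero_run, value) symbols."""
    scanned = [block[r][c] for r, c in zigzag_order(len(block))]
    symbols, run = [], 0
    for value in scanned:
        if value == 0:
            run += 1          # extend the current run of zeros
        else:
            symbols.append((run, value))
            run = 0
    symbols.append(EOB)       # any trailing zeros are implied by the marker
    return symbols


if __name__ == "__main__":
    block = [[0] * 8 for _ in range(8)]
    block[0][0], block[0][1], block[2][0] = 52, -3, 4
    print(run_level_encode(block))
```

Because quantization tends to zero out the high-frequency coefficients, which the zig-zag order visits last, a typical block collapses to a handful of symbols followed by the end-of-block marker.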

The decoding process consists of performing, to the extent possible, an inversion of each stage of the encoding process. [15] The one stage that cannot be exactly inverted is the quantization stage. There, a best-effort approximation of inversion is performed. This part of the process is often called inverse quantization or dequantization, although quantization is an inherently non-invertible process.
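A minimal sketch of why quantization cannot be exactly inverted, assuming a plain uniform quantizer with a single step size. The function names and the sample values are illustrative only; real codecs use per-coefficient step sizes and their own rounding rules.

```python
def quantize(coeff, step):
    """Map a transform coefficient to an integer level (uniform quantizer)."""
    return round(coeff / step)


def dequantize(level, step):
    """Best-effort inverse: the centre of the interval the level stands for."""
    return level * step


if __name__ == "__main__":
    step = 16
    coeffs = [-415.4, 30.2, -7.9, 3.0]          # made-up example values
    levels = [quantize(c, step) for c in coeffs]
    recon = [dequantize(q, step) for q in levels]
    for c, q, r in zip(coeffs, levels, recon):
        print(f"{c:8.1f} -> level {q:4d} -> {r:6d}  (error {abs(c - r):.1f})")
```

Many different input coefficients map to the same level, so the decoder can only return one representative value for each level. With this quantizer the reconstruction error is at most half the step size, and a larger step trades more error for smaller levels and more zeros.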

Video codec designs are usually standardized or eventually become standardized—i.e., specified precisely in a published document. However, only the decoding process need be standardized to enable interoperability. The encoding process is typically not specified at all in a standard, and implementers are free to design their encoder however they want, as long as the video can be decoded in the specified manner. For this reason, the quality of the video produced by decoding the results of different encoders that use the same video codec standard can vary dramatically from one encoder implementation to another.

Commonly used video codecs

A variety of video compression formats can be implemented on PCs and in consumer electronics equipment. It is therefore possible for multiple codecs to be available in the same product, reducing the need to choose a single dominant video compression format to achieve interoperability.

Standard video compression formats can be supported by multiple encoder and decoder implementations from multiple sources. For example, video encoded with a standard MPEG-4 Part 2 codec such as Xvid can be decoded using any other standard MPEG-4 Part 2 codec such as FFmpeg MPEG-4 or DivX Pro Codec, because they all use the same video format.

Each codec has its own strengths and drawbacks, and comparisons are frequently published. The trade-off between compression power, speed, and fidelity (including artifacts) is usually considered the most important figure of technical merit.

Codec packs

Online video material is encoded by a variety of codecs, and this has led to the availability of codec packs: pre-assembled sets of commonly used codecs combined with an installer and distributed as a software package for PCs, such as K-Lite Codec Pack, Perian and Combined Community Codec Pack.


References

  1. Ahmed, Nasir; Natarajan, T.; Rao, K. R. (January 1974), "Discrete Cosine Transform", IEEE Transactions on Computers, C-23 (1): 90–93, doi:10.1109/T-C.1974.223784, S2CID 149806273
  2. Rao, K. R.; Yip, P. (1990), Discrete Cosine Transform: Algorithms, Advantages, Applications, Boston: Academic Press, ISBN 978-0-12-580203-1
  3. "T.81 – DIGITAL COMPRESSION AND CODING OF CONTINUOUS-TONE STILL IMAGES – REQUIREMENTS AND GUIDELINES" (PDF). CCITT. September 1992. Retrieved 12 July 2019.
  4. Ghanbari, Mohammed (2003). Standard Codecs: Image Compression to Advanced Video Coding. Institution of Engineering and Technology. pp. 1–2. ISBN 9780852967102.
  5. "The History of Video File Formats Infographic — RealPlayer". 22 April 2012.
  6. "ITU-T Recommendation declared patent(s)". ITU. Retrieved 12 July 2019.
  7. "MPEG-2 Patent List" (PDF). MPEG LA. Retrieved 7 July 2019.
  8. "MPEG-4 Visual - Patent List" (PDF). MPEG LA. Retrieved 6 July 2019.
  9. "AVC/H.264 – Patent List" (PDF). MPEG LA. Retrieved 6 July 2019.
  10. "HEVC Patent List" (PDF). MPEG LA. Retrieved 6 July 2019.
  11. "HEVC Advance Patent List". HEVC Advance. Archived from the original on 24 August 2020. Retrieved 6 July 2019.
  12. "What is the Best Video Codec for Web Streaming? (2021 Update)". Dacast. 2021-06-18. Retrieved 2022-02-11.
  13. Hoffman, P. (June 2011). Requirements for Internet-Draft Tracking by the IETF Community in the Datatracker. doi:10.17487/rfc6293.
  14. "Video Codec Design: Developing Image and Video Compression Systems | Wiley". Wiley.com. Retrieved 2022-02-11.
  15. "Encoding Stage - an overview | ScienceDirect Topics". www.sciencedirect.com. Retrieved 2022-02-11.