H.261

H.261
Video codec for audiovisual services at p x 64 kbit/s
Status: Published
Year started: 1988
Latest version: (03/93)
Organization: ITU-T, Hitachi, PictureTel, NTT, BT, Toshiba, etc.
Committee: ITU-T Study Group 16 VCEG (then: Specialists Group on Coding for Visual Telephony)
Related standards: H.262, H.263, H.264, H.265, H.266, H.320
Domain: video compression
Website: https://www.itu.int/rec/T-REC-H.261

H.261 is an ITU-T video compression standard, first ratified in November 1988. [1] [2] It is the first member of the H.26x family of video coding standards in the domain of the ITU-T Study Group 16 Video Coding Experts Group (VCEG, then Specialists Group on Coding for Visual Telephony). It was the first video coding standard that was useful in practical terms.


H.261 was originally designed for transmission over ISDN lines on which data rates are multiples of 64 kbit/s. The coding algorithm was designed to be able to operate at video bit rates between 40 kbit/s and 2 Mbit/s. The standard supports two video frame sizes: CIF (352×288 luma with 176×144 chroma) and QCIF (176×144 with 88×72 chroma) using a 4:2:0 sampling scheme. It also has a backward-compatible trick for sending still images with 704×576 luma resolution and 352×288 chroma resolution (which was added in a later revision in 1993).
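As a rough illustration of the arithmetic behind these figures, the sketch below (plain Python; the 8 bits per sample, the nominal 29.97 Hz picture clock, and all names are assumptions for illustration rather than text from the standard) computes the number of macroblocks per picture and the uncompressed data rate of the two picture formats, alongside a few example p x 64 kbit/s channel rates.

```python
# Illustrative arithmetic only: macroblock counts and uncompressed 4:2:0 data
# rates for the two H.261 picture formats, compared with p x 64 kbit/s channels.
# 8 bits per sample and the nominal 29.97 Hz picture clock are assumed here.

FORMATS = {
    "CIF":  (352, 288),   # luma width x height; chroma planes are 176 x 144
    "QCIF": (176, 144),   # chroma planes are 88 x 72
}
FRAME_RATE = 30000 / 1001          # ~29.97 pictures per second
BITS_PER_SAMPLE = 8

for name, (w, h) in FORMATS.items():
    luma = w * h
    chroma = 2 * (w // 2) * (h // 2)               # Cb + Cr at 4:2:0
    raw_bps = (luma + chroma) * BITS_PER_SAMPLE * FRAME_RATE
    macroblocks = (w // 16) * (h // 16)
    print(f"{name}: {macroblocks} macroblocks/picture, "
          f"uncompressed ~{raw_bps / 1e6:.1f} Mbit/s")

for p in (1, 2, 6, 30):                            # a few example channel widths
    print(f"p = {p:2d}: channel rate {p * 64} kbit/s")
```

The gap between the roughly 36 Mbit/s of uncompressed CIF video and even the widest channel considered (p = 30, about 2 Mbit/s) indicates the degree of compression the codec was expected to achieve.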

History

The first digital video coding standard was H.120, created by the CCITT (now ITU-T) in 1984. [3] H.120 was not usable in practice, as its performance was too poor. [3] It was based on differential pulse-code modulation (DPCM), which compressed inefficiently. During the late 1980s, a number of companies began experimenting with the much more efficient discrete cosine transform (DCT) compression for video coding, and the CCITT received 14 proposals for DCT-based video compression formats, in contrast to a single proposal based on vector quantization (VQ) compression. The H.261 standard was subsequently developed based on DCT compression. [4]

H.261 was developed by the CCITT Study Group XV Specialists Group on Coding for Visual Telephony (which later became part of ITU-T SG16), chaired by Sakae Okubo of NTT. [5] Since H.261, DCT compression has been adopted by all the major video coding standards that followed. [4]

Whilst H.261 was preceded in 1984 by H.120 (which also underwent a revision in 1988 of some historic importance) as a digital video coding standard, H.261 was the first truly practical digital video coding standard (in terms of product support in significant quantities). In fact, all subsequent international video coding standards (MPEG-1 Part 2, H.262/MPEG-2 Part 2, H.263, MPEG-4 Part 2, H.264/MPEG-4 Part 10, and HEVC) have been based closely on the H.261 design. Additionally, the methods used by the H.261 development committee to collaboratively develop the standard have remained the basic operating process for subsequent standardization work in the field. [5]

Although H.261 was first approved as a standard in 1988, the first version was missing some significant elements necessary to make it a complete interoperability specification. Various parts of it were marked as "Under Study". [2] It was later revised in 1990 to add the remaining necessary aspects, [6] and was then revised again in 1993. [7] The 1993 revision added an Annex D entitled "Still image transmission", which provided a backward-compatible way to send still images with 704×576 luma resolution and 352×288 chroma resolution by using a staggered 2:1 subsampling horizontally and vertically to separate the picture into four sub-pictures that were sent sequentially. [7]

H.261 design

The basic processing unit of the design is called a macroblock, and H.261 was the first standard in which the macroblock concept appeared. Each macroblock consists of a 16×16 array of luma samples and two corresponding 8×8 arrays of chroma samples, using 4:2:0 sampling and a YCbCr color space. The coding algorithm uses a hybrid of motion-compensated inter-picture prediction and spatial transform coding with scalar quantization, zig-zag scanning and entropy encoding.
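To make the block structure concrete, the following sketch (NumPy; the function and variable names are illustrative and not taken from the standard) slices one macroblock of a 4:2:0 picture into the four 8×8 luma blocks and two 8×8 chroma blocks that are handed to the transform stage.

```python
import numpy as np

def macroblock_blocks(y, cb, cr, mbx, mby):
    """Return the six 8x8 blocks (four luma, one Cb, one Cr) of one macroblock.

    y is the full-resolution luma plane of a 4:2:0 picture; cb and cr are the
    half-resolution chroma planes.  The luma block ordering used here
    (top-left, top-right, bottom-left, bottom-right) is illustrative.
    """
    ly, lx = mby * 16, mbx * 16        # top-left luma sample of the macroblock
    cy, cx = mby * 8, mbx * 8          # corresponding chroma sample position
    luma = y[ly:ly + 16, lx:lx + 16]
    return [luma[0:8, 0:8], luma[0:8, 8:16],
            luma[8:16, 0:8], luma[8:16, 8:16],
            cb[cy:cy + 8, cx:cx + 8],
            cr[cy:cy + 8, cx:cx + 8]]

# A QCIF picture (176 x 144 luma) contains 11 x 9 = 99 macroblocks.
y  = np.zeros((144, 176), dtype=np.uint8)
cb = np.zeros((72, 88), dtype=np.uint8)
cr = np.zeros((72, 88), dtype=np.uint8)
assert all(b.shape == (8, 8) for b in macroblock_blocks(y, cb, cr, 10, 8))
```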

The inter-picture prediction reduces temporal redundancy, with motion vectors used to compensate for motion. Whilst only integer-valued motion vectors are supported in H.261, a blurring filter can be applied to the prediction signal – partially mitigating the lack of fractional-sample motion vector precision. Transform coding using an 8×8 discrete cosine transform (DCT) reduces the spatial redundancy. The DCT that is widely used in this regard was introduced by N. Ahmed, T. Natarajan and K. R. Rao in 1974. [8] Scalar quantization is then applied to round the transform coefficients to the appropriate precision determined by a step size control parameter, and the quantized transform coefficients are zig-zag scanned and entropy-coded (using a "run-level" variable-length code) to remove statistical redundancy.
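The simplified sketch below (Python/NumPy) walks one 8×8 block through the same chain of steps: a 2-D DCT, uniform scalar quantization, zig-zag scanning, and run-level symbol formation. The quantizer, scan order, and symbol set used here are generic stand-ins rather than the exact tables specified in H.261.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal 1-D DCT-II basis matrix (the transform family H.261 uses)."""
    k = np.arange(n)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

# Zig-zag scan order for an 8x8 block (low-frequency coefficients first).
ZIGZAG = sorted(((r, c) for r in range(8) for c in range(8)),
                key=lambda rc: (rc[0] + rc[1],
                                rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def encode_block(block, qstep):
    """Toy forward path for one 8x8 block: 2-D DCT, uniform scalar
    quantization with step size qstep, zig-zag scan, and (run, level)
    pairs for the non-zero coefficients.  The quantizer and symbol set
    are simplified stand-ins for the ones actually defined in H.261."""
    C = dct_matrix()
    coeffs = C @ block.astype(np.float64) @ C.T    # separable 2-D DCT
    q = np.rint(coeffs / qstep).astype(int)        # scalar quantization
    pairs, run = [], 0
    for r, c in ZIGZAG:                            # zig-zag ordering
        if q[r, c] == 0:
            run += 1
        else:
            pairs.append((run, q[r, c]))           # run-level symbol
            run = 0
    return pairs

block = np.tile(np.arange(8) * 16, (8, 1))         # simple horizontal gradient
print(encode_block(block, qstep=16))
```

In a real encoder each (run, level) pair would then be mapped to a variable-length codeword, so that frequent symbols cost few bits.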

The H.261 standard actually only specifies how to decode the video. Encoder designers were left free to design their own encoding algorithms (such as their own motion estimation algorithms), as long as their output was constrained properly to allow it to be decoded by any decoder made according to the standard. Encoders are also left free to perform any pre-processing they want to their input video, and decoders are allowed to perform any post-processing they want to their decoded video prior to display. One effective post-processing technique that became a key element of the best H.261-based systems is called deblocking filtering. This reduces the appearance of block-shaped artifacts caused by the block-based motion compensation and spatial transform parts of the design. Indeed, blocking artifacts are probably a familiar phenomenon to almost everyone who has watched digital video. Deblocking filtering has since become an integral part of the more recent standards H.264 and HEVC (although even when using these newer standards, additional post-processing is still allowed and can enhance visual quality if performed well).
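As an illustration of the idea behind deblocking as a decoder-side post-process (and not the adaptive in-loop filters specified by later standards such as H.264 and HEVC), the following sketch applies a simple smoothing across vertical 8×8 block boundaries; the function and its parameters are purely hypothetical.

```python
import numpy as np

def deblock_columns(plane, block=8, strength=0.5):
    """Very simple decoder-side deblocking (illustrative only).

    Softens the two sample columns on either side of every vertical 8x8
    block boundary.  Practical deblocking filters, such as the in-loop
    filters in H.264 and HEVC, adapt their strength to local activity,
    quantizer step size, and coding modes; none of that is modelled here.
    """
    out = plane.astype(np.float64)
    for x in range(block, plane.shape[1], block):
        left, right = out[:, x - 1].copy(), out[:, x].copy()
        avg = (left + right) / 2.0
        out[:, x - 1] = (1 - strength) * left + strength * avg
        out[:, x] = (1 - strength) * right + strength * avg
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

# Example: soften the vertical block boundaries of a decoded QCIF luma plane.
decoded = np.random.randint(0, 256, size=(144, 176), dtype=np.uint8)
filtered = deblock_columns(decoded)
```

A second, analogous pass over horizontal block boundaries would complete the filter; stronger systems also decide per boundary whether smoothing would blur genuine image detail.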

Design refinements introduced in later standardization efforts have resulted in significant improvements in compression capability relative to the H.261 design. This has resulted in H.261 becoming essentially obsolete, although it is still used as a backward-compatibility mode in some video-conferencing systems (such as H.323) and for some types of internet video. However, H.261 remains a major historical milestone in the field of video coding development.

Software implementations

The LGPL-licensed libavcodec library includes an H.261 encoder and decoder, which is used by the free VLC media player and MPlayer multimedia players as well as by the ffdshow and FFmpeg projects.

Patent holders

The following companies contributed patents towards the development of the H.261 format: [9]


Related Research Articles

In information theory, data compression, source coding, or bit-rate reduction is the process of encoding information using fewer bits than the original representation. Any particular compression is either lossy or lossless. Lossless compression reduces bits by identifying and eliminating statistical redundancy. No information is lost in lossless compression. Lossy compression reduces bits by removing unnecessary or less important information. Typically, a device that performs data compression is referred to as an encoder, and one that performs the reversal of the process (decompression) as a decoder.

H.263 is a video compression standard originally designed as a low-bit-rate compressed format for videotelephony. It was standardized by the ITU-T Video Coding Experts Group (VCEG) in a project ending in 1995/1996. It is a member of the H.26x family of video coding standards in the domain of the ITU-T.

Lossy compression

In information technology, lossy compression or irreversible compression is the class of data compression methods that uses inexact approximations and partial data discarding to represent the content. These techniques are used to reduce data size for storing, handling, and transmitting content. Higher degrees of approximation produce coarser images as more detail is removed. This is opposed to lossless data compression, which does not degrade the data. The amount of data reduction possible using lossy compression is much higher than using lossless techniques.

MPEG-1 is a standard for lossy compression of video and audio. It is designed to compress VHS-quality raw digital video and CD audio down to about 1.5 Mbit/s without excessive quality loss, making video CDs, digital cable/satellite TV and digital audio broadcasting (DAB) practical.

Motion compensation

Motion compensation, in computing, is an algorithmic technique used to predict a frame in a video, given the previous and/or future frames, by accounting for motion of the camera and/or objects in the video. It is employed in the encoding of video data for video compression, for example in the generation of MPEG-2 files. Motion compensation describes a picture in terms of the transformation of a reference picture to the current picture. The reference picture may be previous in time or even from the future. When images can be accurately synthesized from previously transmitted/stored images, the compression efficiency can be improved.

A video codec is software or hardware that compresses and decompresses digital video. In the context of video compression, codec is a portmanteau of encoder and decoder, while a device that only compresses is typically called an encoder, and one that only decompresses is a decoder.

Compression artifact

A compression artifact is a noticeable distortion of media caused by the application of lossy compression. Lossy data compression involves discarding some of the media's data so that it becomes small enough to be stored within the desired disk space or transmitted (streamed) within the available bandwidth. If the compressor cannot store enough data in the compressed version, the result is a loss of quality, or introduction of artifacts. The compression algorithm may not be intelligent enough to discriminate between distortions of little subjective importance and those objectionable to the user.

Advanced Video Coding

Advanced Video Coding (AVC), also referred to as H.264 or MPEG-4 Part 10, is a video compression standard based on block-oriented, motion-compensated coding. It is by far the most commonly used format for the recording, compression, and distribution of video content, used by 91% of video industry developers as of September 2019. It supports resolutions up to and including 8K UHD.

H.262 or MPEG-2 Part 2 is a video coding format standardised and jointly maintained by ITU-T Study Group 16 Video Coding Experts Group (VCEG) and ISO/IEC Moving Picture Experts Group (MPEG), and developed with the involvement of many companies. It is the second part of the ISO/IEC MPEG-2 standard. The ITU-T Recommendation H.262 and ISO/IEC 13818-2 documents are identical.

Indeo

Indeo Video is a family of audio and video formats and codecs first released in 1992, and designed for real-time video playback on desktop CPUs. While its original version was related to Intel's DVI video stream format, a hardware-only codec for the compression of television-quality video onto compact discs, Indeo was distinguished by being one of the first codecs allowing full-speed video playback without using hardware acceleration. Also unlike Cinepak and TrueMotion S, the compression used the same Y'CbCr 4:2:0 colorspace as the ITU's H.261 and ISO's MPEG-1. Indeo use was free of charge to allow for broadest usage.

JPEG XR is an image compression standard for continuous tone photographic images, based on the HD Photo specifications that Microsoft originally developed and patented. It supports both lossy and lossless compression, and is the preferred image format for Ecma-388 Open XML Paper Specification documents.

The Video Coding Experts Group or Visual Coding Experts Group is a working group of the ITU Telecommunication Standardization Sector (ITU-T) concerned with standards for compression coding of video, images, audio, and other signals. It is responsible for standardization of the "H.26x" line of video coding standards, the "T.8xx" line of image coding standards, and related technologies.

The macroblock is a processing unit in image and video compression formats based on linear block transforms, typically the discrete cosine transform (DCT). A macroblock typically consists of 16×16 samples, and is further subdivided into transform blocks, and may be further subdivided into prediction blocks. Formats which are based on macroblocks include JPEG, where they are called MCU blocks, H.261, MPEG-1 Part 2, H.262/MPEG-2 Part 2, H.263, MPEG-4 Part 2, and H.264/MPEG-4 AVC. In H.265/HEVC, the macroblock as a basic processing unit has been replaced by the coding tree unit.

Intra-frame coding

Intra-frame coding is a data compression technique used within a video frame, enabling smaller file sizes and lower bitrates, with little or no loss in quality. Since neighboring pixels within an image are often very similar, rather than storing each pixel independently, the frame image is divided into blocks and the typically minor difference between each pixel can be encoded using fewer bits.

A deblocking filter is a video filter applied to decoded compressed video to improve visual quality and prediction performance by smoothing the sharp edges which can form between macroblocks when block coding techniques are used. The filter aims to improve the appearance of decoded pictures. It is a part of the specification for both the SMPTE VC-1 codec and the ITU H.264 codec.

H.120 was the first digital video compression standard. It was developed by COST 211 and published by the CCITT in 1984, with a revision in 1988 that included contributions proposed by other organizations. The video turned out not to be of adequate quality, there were few implementations, and there are no existing codecs for the format, but it provided important knowledge leading directly to its practical successors, such as H.261. The latest revision was published in March 1993.

High Efficiency Video Coding (HEVC), also known as H.265 and MPEG-H Part 2, is a video compression standard designed as part of the MPEG-H project as a successor to the widely used Advanced Video Coding. In comparison to AVC, HEVC offers from 25% to 50% better data compression at the same level of video quality, or substantially improved video quality at the same bit rate. It supports resolutions up to 8192×4320, including 8K UHD, and unlike the primarily 8-bit AVC, HEVC's higher fidelity Main 10 profile has been incorporated into nearly all supporting hardware.

A video coding format is a content representation format for storage or transmission of digital video content. It typically uses a standardized video compression algorithm, most commonly based on discrete cosine transform (DCT) coding and motion compensation. A specific software, firmware, or hardware implementation capable of compression or decompression to/from a specific video coding format is called a video codec.

Coding tree unit (CTU) is the basic processing unit of the High Efficiency Video Coding (HEVC) video standard and conceptually corresponds in structure to macroblock units that were used in several previous video standards. CTU is also referred to as largest coding unit (LCU).

References

  1. "(Nokia position paper) Web Architecture and Codec Considerations for Audio-Visual Services" (PDF). H.261, which (in its first version) was ratified in November 1988.
  2. 1 2 ITU-T (1988). "H.261 : Video codec for audiovisual services at p x 384 kbit/s - Recommendation H.261 (11/88)" . Retrieved 2010-10-21.
  3. 1 2 "The History of Video File Formats Infographic". RealNetworks . 22 April 2012. Retrieved 5 August 2019.
  4. 1 2 Ghanbari, Mohammed (2003). Standard Codecs: Image Compression to Advanced Video Coding. Institution of Engineering and Technology. pp. 1–2. ISBN   9780852967102.
  5. 1 2 S. Okubo, "Reference model methodology – A tool for the collaborative creation of video coding standards", Proceedings of the IEEE, vol. 83, no. 2, Feb. 1995, pp. 139–150
  6. ITU-T (1990). "H.261 : Video codec for audiovisual services at p x 64 kbit/s - Recommendation H.261 (12/90)" . Retrieved 2015-12-10.
  7. 1 2 ITU-T (1993). "H.261 : Video codec for audiovisual services at p x 64 kbit/s - Recommendation H.261 (03/93)" . Retrieved 2015-12-10.
  8. N. Ahmed, T. Natarajan and K. R. Rao, "Discrete Cosine Transform", IEEE Transactions on Computers, Jan. 1974, pp. 90-93; PDF file Archived 2011-11-25 at the Wayback Machine .
  9. "ITU-T Recommendation declared patent(s)". ITU. Retrieved 12 July 2019.
  10. "Patent statement declaration registered as H261-07". ITU. Retrieved 11 July 2019.