Global motion compensation

Last updated

Global motion compensation(GMC) is a motion compensation technique used in video compression to reduce the bitrate required to encode video. It is most commonly used in MPEG-4 ASP, such as with the DivX and Xvid codecs.

Contents

Operation

Global motion compensation describes the motion in a scene based on a single affine transform instruction. The reference frame is panned, rotated and zoomed in accordance to GMC warp points to create a prediction of how the following frame will look. Since this operation works on individual pixels (rather than blocks), it is capable of creating predictions that are not possible using block-based approaches.

Each macroblock in such a frame can be compensated using global motion (no further motion information is then signalled) or, alternatively, local motion (as if GMC were off). This choice, while costing an additional bit per macroblock, can improve prediction quality and therefore reduce residual.

Because the transforms used in global motion compensation are only added to the encoding stream when used, they do not have a constant bitrate overhead. A predicted frame which uses GMC is called an S-frame (sprite frame) while a predicted frame encoded without GMC is called either a P-frame, if it was predicted purely by previous (past) frames, or a B-frame if it was predicted jointly with past and future frames (an unpredicted frame encoded as a whole image is referred to as an I-frame).

Implementations

DivX offers 1 warp-point GMC encoding: This enables easier hardware support in DivX certified and non-certified devices. But as 1 warp-point GMC limits the global transform to panning operation only (since panning can be described using blocks), this implementation rarely improves video quality.

Xvid offers 3 warp-point GMC encoding: As a result, it currently has no hardware support.

Criticism

GMC failed to meet expectations of dramatic improvements in motion compensation, and as a result it was omitted from the H.264/MPEG-4 AVC specification - designed as a successor to MPEG-4 ASP. Most of GMC's benefits could be obtained via better motion vector prediction. [1] GMC also represents a large computational cost during encoding which yields relatively minor quality improvements.

Due to the extra decoding CPU cost of global motion compensation, most hardware players do not support global motion compensation.

See also

Related Research Articles

MPEG-1 is a standard for lossy compression of video and audio. It is designed to compress VHS-quality raw digital video and CD audio down to about 1.5 Mbit/s without excessive quality loss, making video CDs, digital cable/satellite TV and digital audio broadcasting (DAB) practical.

<span class="mw-page-title-main">Motion compensation</span> Video compression technique, used to efficiently predict and generate video frames

Motion compensation in computing, is an algorithmic technique used to predict a frame in a video, given the previous and/or future frames by accounting for motion of the camera and/or objects in the video. It is employed in the encoding of video data for video compression, for example in the generation of MPEG-2 files. Motion compensation describes a picture in terms of the transformation of a reference picture to the current picture. The reference picture may be previous in time or even from the future. When images can be accurately synthesized from previously transmitted/stored images, the compression efficiency can be improved.

Motion JPEG is a video compression format in which each video frame or interlaced field of a digital video sequence is compressed separately as a JPEG image.

<span class="mw-page-title-main">Compression artifact</span> Distortion of media caused by lossy data compression

A compression artifact is a noticeable distortion of media caused by the application of lossy compression. Lossy data compression involves discarding some of the media's data so that it becomes small enough to be stored within the desired disk space or transmitted (streamed) within the available bandwidth. If the compressor cannot store enough data in the compressed version, the result is a loss of quality, or introduction of artifacts. The compression algorithm may not be intelligent enough to discriminate between distortions of little subjective importance and those objectionable to the user.

<span class="mw-page-title-main">Xvid</span> Video codec library

Xvid is a video codec library following the MPEG-4 video coding standard, specifically MPEG-4 Part 2 Advanced Simple Profile (ASP). It uses ASP features such as b-frames, global and quarter pixel motion compensation, lumi masking, trellis quantization, and H.263, MPEG and custom quantization matrices.

<span class="mw-page-title-main">Advanced Video Coding</span> Most widely used standard for video compression

Advanced Video Coding (AVC), also referred to as H.264 or MPEG-4 Part 10, is a video compression standard based on block-oriented, motion-compensated coding. It is by far the most commonly used format for the recording, compression, and distribution of video content, used by 91% of video industry developers as of September 2019. It supports resolutions up to and including 8K UHD.

In the field of video compression a video frame is compressed using different algorithms with different advantages and disadvantages, centered mainly around amount of data compression. These different algorithms for video frames are called picture types or frame types. The three major picture types used in the different video algorithms are I, P and B. They are different in the following characteristics:

An inter frame is a frame in a video compression stream which is expressed in terms of one or more neighboring frames. The "inter" part of the term refers to the use of Inter frame prediction. This kind of prediction tries to take advantage from temporal redundancy between neighboring frames enabling higher compression rates.

H.262 or MPEG-2 Part 2 is a video coding format standardised and jointly maintained by ITU-T Study Group 16 Video Coding Experts Group (VCEG) and ISO/IEC Moving Picture Experts Group (MPEG), and developed with the involvement of many companies. It is the second part of the ISO/IEC MPEG-2 standard. The ITU-T Recommendation H.262 and ISO/IEC 13818-2 documents are identical.

Smart Bitrate Control, commonly referred to as SBC, was a technique for achieving greatly improved video compression efficiency using the DivX 3.11 Alpha video codec or Microsoft's proprietary MPEG4v2 video codec and the Nandub video encoder. SBC relied on two main technologies to achieve this improved efficiency: Multipass encoding and Variable Keyframe Intervals (VKI). SBC ceased to be commonly used after XviD and DivX development progressed to a point where they incorporated the same features that SBC pioneered and could offer even more efficient video compression without the need for a specialized application. Files created by SBC are compatible with DivX 3.11 Alpha and can be decoded by most codecs that support ISO MPEG4 video.

MPEG-4 Part 2, MPEG-4 Visual is a video compression format developed by the Moving Picture Experts Group (MPEG). It belongs to the MPEG-4 ISO/IEC standards. It uses block-wise motion compensation and a discrete cosine transform (DCT), similar to previous standards such as MPEG-1 Part 2 and H.262/MPEG-2 Part 2.

Quarter-pixel motion(also known as Q-pel motion or Qpel motion) refers to using a quarter of the distance between pixels as the motion vector precision for motion estimation and motion compensation in video compression schemes. It is used in many modern video coding formats such as MPEG-4 ASP, H.264/AVC, and HEVC. Though higher precision motion vectors take more bits to encode, they can sometimes result in more efficient compression overall, by increasing the quality of the prediction signal.

Α video codec is software or a device that provides encoding and decoding for digital video, and which may or may not include the use of video compression and/or decompression. Most codecs are typically implementations of video coding formats.

The macroblock is a processing unit in image and video compression formats based on linear block transforms, typically the discrete cosine transform (DCT). A macroblock typically consists of 16×16 samples, and is further subdivided into transform blocks, and may be further subdivided into prediction blocks. Formats which are based on macroblocks include JPEG, where they are called MCU blocks, H.261, MPEG-1 Part 2, H.262/MPEG-2 Part 2, H.263, MPEG-4 Part 2, and H.264/MPEG-4 AVC. In H.265/HEVC, the macroblock as a basic processing unit has been replaced by the coding tree unit.

<span class="mw-page-title-main">Intra-frame coding</span>

Intra-frame coding is a data compression technique used within a video frame, enabling smaller file sizes and lower bitrates, with little or no loss in quality. Since neighboring pixels within an image are often very similar, rather than storing each pixel independently, the frame image is divided into blocks and the typically minor difference between each pixel can be encoded using fewer bits.

Rate-distortion optimization (RDO) is a method of improving video quality in video compression. The name refers to the optimization of the amount of distortion against the amount of data required to encode the video, the rate. While it is primarily used by video encoders, rate-distortion optimization can be used to improve quality in any encoding situation where decisions have to be made that affect both file size and quality simultaneously.

Reference frames are frames of a compressed video that are used to define future frames. As such, they are only used in inter-frame compression techniques. In older video encoding standards, such as MPEG-2, only one reference frame – the previous frame – was used for P-frames. Two reference frames were used for B-frames.

<span class="mw-page-title-main">VP8</span> Open and royalty-free video coding format released by Google in 2010

VP8 is an open and royalty-free video compression format released by On2 Technologies in 2008.

A video coding format is a content representation format for storage or transmission of digital video content. It typically uses a standardized video compression algorithm, most commonly based on discrete cosine transform (DCT) coding and motion compensation. A specific software, firmware, or hardware implementation capable of compression or decompression to/from a specific video coding format is called a video codec.

<span class="mw-page-title-main">VP9</span> Open and royalty-free video coding format released by Google in 2013

VP9 is an open and royalty-free video coding format developed by Google.

References

  1. Lair Of The Multimedia Guru » 15 reasons why MPEG4 sucks