Internet media type | video/VP9 |
---|---|
Developed by | Google |
Initial release | June 17, 2013 |
Type of format | Video coding format |
Contained by | |
Extended from | VP8 |
Extended to | AV1 |
Standard | VP9 Bitstream & Decoding Process Specification |
Open format? | Yes |
Free format? | Yes (see § Patent claims) |
Website | webmproject.org/vp9 |
VP9 is an open and royalty-free [1] video coding format developed by Google.
VP9 is the successor to VP8 and competes mainly with MPEG's High Efficiency Video Coding (HEVC/H.265). At first, VP9 was mainly used on Google's video platform YouTube. [2] [3] The emergence of the Alliance for Open Media, of which Google is a member, and its support for the ongoing development of the successor AV1 led to growing interest in the format.
In contrast to HEVC, VP9 support is common among modern web browsers (see HTML video § Browser support). Android has supported VP9 since version 4.4 KitKat, [4] while Safari 14 added support for VP9 in iOS / iPadOS / tvOS 14 and macOS Big Sur. [5] [6]
Parts of the format are covered by patents held by Google. The company grants free usage of its own related patents based on reciprocity, i.e. as long as the user does not engage in patent litigation. [7]
VP9 is the last official iteration of the TrueMotion series of video formats, which Google acquired in 2010 for $134 million together with On2 Technologies, the company that created it. Development of VP9 started in the second half of 2011 under the development names Next Gen Open Video (NGOV) and VP-Next. [8] [9] [10] The design goals for VP9 included reducing the bit rate by 50% compared to VP8 while maintaining the same video quality, and achieving better compression efficiency than the MPEG High Efficiency Video Coding (HEVC) standard. [9] [11] In June 2013, "profile 0" of VP9 was finalized, and two months later Google's Chrome browser was released with support for VP9 video playback. [12] [13] In October of that year a native VP9 decoder was added to FFmpeg, [14] and to Libav six weeks later. Mozilla added VP9 support to Firefox in March 2014. [15] In 2014 Google added two high bit depth profiles: profile 2 and profile 3. [16] [17]
In 2013 an updated version of the WebM format was published, featuring support for VP9 together with Opus audio.
In March 2013, the MPEG Licensing Administration dropped an announced assertion of disputed patent claims against VP8 and its successors after the United States Department of Justice started to investigate whether it was acting to unfairly stifle competition. [18] [19] [20]
Throughout, Google has worked with hardware vendors to get VP9 support into silicon. In January 2014, Ittiam, in collaboration with ARM and Google, demonstrated its VP9 decoder for ARM Cortex devices. Using GPGPU techniques, the decoder was capable of 1080p at 30fps on an Arndale Board. [21] [22] In early 2015 Nvidia announced VP9 support in its Tegra X1 SoC, and VeriSilicon announced VP9 Profile 2 support in its Hantro G2v2 decoder IP. [23] [24] [25]
In April 2015 Google released a significant update to its libvpx library, with version 1.4.0 adding support for 10-bit and 12-bit bit depth, 4:2:2 and 4:4:4 chroma subsampling, and VP9 multithreaded decoding/encoding. [26]
In December 2015, Netflix published a draft proposal for including VP9 video in an MP4 container with MPEG Common Encryption. [27]
In January 2016, Ittiam demonstrated an OpenCL-based VP9 encoder. [28] The encoder targets ARM Mali mobile GPUs and was demonstrated on a Samsung Galaxy S6.
VP9 support was added to Microsoft's web browser Edge in 2016. [29]
In March 2017, Ittiam announced the completion of a project to enhance the encoding speed of libvpx. The speed improvement was said to be 50–70%, and the code was said to be "publicly available as part of libvpx". [30]
VP9 is customized for video resolutions greater than 1080p (such as UHD) and also enables lossless compression. It supports resolutions up to 65536×65536, whereas HEVC supports resolutions up to 8192×4320 pixels.
The VP9 format supports the following color spaces (and corresponding YCbCr to RGB transformation matrices): Rec. 601, Rec. 709, Rec. 2020, SMPTE-170, SMPTE-240, and sRGB. [31] [32]
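As a rough illustration of what a "YCbCr to RGB transformation matrix" means for three of the listed color spaces, the sketch below converts normalized full-range values using the luma coefficients (Kr, Kb) published in the respective ITU-R recommendations. The function names and the normalized value ranges are assumptions made for this example; they are not part of the VP9 specification, which only signals which color space applies.

```python
# Luma coefficients (Kr, Kb) from the ITU-R recommendations named in the text.
LUMA_COEFFS = {
    "Rec. 601": (0.299, 0.114),
    "Rec. 709": (0.2126, 0.0722),
    "Rec. 2020": (0.2627, 0.0593),
}


def rgb_to_ycbcr(r, g, b, standard="Rec. 709"):
    """Convert normalized RGB in [0, 1] to Y in [0, 1] and Cb, Cr in [-0.5, 0.5]."""
    kr, kb = LUMA_COEFFS[standard]
    y = kr * r + (1.0 - kr - kb) * g + kb * b
    cb = (b - y) / (2.0 * (1.0 - kb))
    cr = (r - y) / (2.0 * (1.0 - kr))
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr, standard="Rec. 709"):
    """Invert rgb_to_ycbcr for the same standard."""
    kr, kb = LUMA_COEFFS[standard]
    kg = 1.0 - kr - kb
    r = y + 2.0 * (1.0 - kr) * cr
    b = y + 2.0 * (1.0 - kb) * cb
    g = (y - kr * r - kb * b) / kg
    return r, g, b
```

Decoding the same YCbCr triple with the wrong standard shifts the colors, which is why the bitstream carries the color space as metadata.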
VP9 supports many transfer functions and supports HDR video with hybrid log–gamma (HLG) or perceptual quantizer (PQ). [33] [34]
An early comparison that took varying encoding speed into account showed x265 to narrowly beat libvpx at the very highest quality (slowest encoding) whereas libvpx was superior at any other encoding speed, by SSIM. [35]
In a subjective quality comparison conducted in 2014 featuring the reference encoders for HEVC (HM 15.0), MPEG-4 AVC/H.264 (JM 18.6), and VP9 (libvpx 1.2.0 with preliminary VP9 support), VP9, like H.264, required about two times the bitrate to reach video quality comparable to HEVC, while with synthetic imagery VP9 was close to HEVC. [36] By contrast, another subjective comparison from 2014 concluded that at higher quality settings HEVC and VP9 were tied at a 40 to 45% bitrate advantage over H.264. [37]
Netflix, after a large test in August 2016, concluded that libvpx was 20% less efficient than x265, but by October the same year also found that tweaking encoding parameters could "reduce or even reverse the gap between VP9 and HEVC". [38] At NAB 2017, Netflix shared that they had switched to the EVE encoder, which according to their studies offered better two-pass rate control and was 8% more efficient than libvpx. [39]
An offline encoder comparison between libvpx, two HEVC encoders and x264 in May 2017 by Jan Ozer of Streaming Media Magazine, with encoding parameters supplied or reviewed by each encoder vendor (Google, MulticoreWare and MainConcept respectively), and using Netflix's VMAF objective metric, concluded that "VP9 and both HEVC codecs produce very similar performance" and "Particularly at lower bitrates, both HEVC codecs and VP9 deliver substantially better performance than H.264". [40]
An encoding speed versus efficiency comparison of the reference implementation in libvpx, x264 and x265 was made by an FFmpeg developer in September 2015. By SSIM index, libvpx was mostly superior to x264 across the range of comparable encoding speeds, but the main benefit was at the slower end, around x264@veryslow (reaching a sweet spot of 30–40% bitrate improvement at up to twice the encoding time of that preset), whereas x265 only became competitive with libvpx at around 10 times the encoding time of x264@veryslow. It was concluded that libvpx and x265 were both capable of the claimed 50% bitrate improvement over H.264, but only at 10–20 times the encoding time of x264. [35] Judged by the objective quality metric VQM in early 2015, the VP9 reference encoder delivered video quality on par with the best HEVC implementations. [41]
A decoder comparison by the same developer showed 10% faster decoding for ffvp9 than ffh264 for same-quality video, or "identical" at same bitrate. It also showed that the implementation can make a difference, concluding that "ffvp9 beats libvpx consistently by 25–50%". [42]
Another decoder comparison indicated 10–40 percent higher CPU load than H.264 (but does not say whether this was with ffvp9 or libvpx), and that on mobile, the Ittiam demo player was about 40 percent faster than the Chrome browser at playing VP9. [43]
There are several variants of the VP9 format (known as "coding profiles"), which successively allow more features; profile 0 is the basic variant, requiring the least from a hardware implementation.
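As a hedged sketch of how the four profiles differ, the function below assumes the commonly documented mapping: profiles 0 and 1 carry 8-bit video, profiles 2 and 3 carry 10-bit or 12-bit video, and the odd-numbered profiles add chroma formats other than 4:2:0. The function name and the string notation for chroma subsampling are choices made for this example only.

```python
def vp9_profile(bit_depth, chroma_subsampling):
    """Return the lowest VP9 profile number for a bit depth and chroma format.

    Assumed mapping (not taken from the text above):
      profile 0: 8-bit, 4:2:0
      profile 1: 8-bit, other chroma formats
      profile 2: 10/12-bit, 4:2:0
      profile 3: 10/12-bit, other chroma formats
    """
    if bit_depth not in (8, 10, 12):
        raise ValueError(f"unsupported bit depth: {bit_depth}")
    if chroma_subsampling not in ("4:2:0", "4:2:2", "4:4:0", "4:4:4"):
        raise ValueError(f"unsupported chroma subsampling: {chroma_subsampling}")
    high_bit_depth = bit_depth > 8
    non_420 = chroma_subsampling != "4:2:0"
    return (2 if high_bit_depth else 0) + (1 if non_420 else 0)
```

For example, 10-bit 4:2:0 HDR video maps to profile 2 under this assumption, which matches the profile that streaming devices are described as supporting later in the article.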
VP9 offers the following 14 levels: [45]
Level | Luma sample rate (samples/s) | Luma picture size (samples) | Max bitrate (Mbit/s) | Max CPB size for visual layer (Mbit) | Min compression ratio | Max tiles | Min alt-ref distance | Max reference frames | Example resolution @ frame rate |
---|---|---|---|---|---|---|---|---|---|
1 | 829440 | 36864 | 0.20 | 0.40 | 2 | 1 | 4 | 8 | 256×144@15 |
1.1 | 2764800 | 73728 | 0.80 | 1.0 | 2 | 1 | 4 | 8 | 384×192@30 |
2 | 4608000 | 122880 | 1.8 | 1.5 | 2 | 1 | 4 | 8 | 480×256@30 |
2.1 | 9216000 | 245760 | 3.6 | 2.8 | 2 | 2 | 4 | 8 | 640×384@30 |
3 | 20736000 | 552960 | 7.2 | 6.0 | 2 | 4 | 4 | 8 | 1080×512@30 |
3.1 | 36864000 | 983040 | 12 | 10 | 2 | 4 | 4 | 8 | 1280×768@30 |
4 | 83558400 | 2228224 | 18 | 16 | 4 | 4 | 4 | 8 | 2048×1088@30 |
4.1 | 160432128 | 2228224 | 30 | 18 | 4 | 4 | 5 | 6 | 2048×1088@60 |
5 | 311951360 | 8912896 | 60 | 36 | 6 | 8 | 6 | 4 | 4096×2176@30 |
5.1 | 588251136 | 8912896 | 120 | 46 | 8 | 8 | 10 | 4 | 4096×2176@60 |
5.2 | 1176502272 | 8912896 | 180 | TBD | 8 | 8 | 10 | 4 | 4096×2176@120 |
6 | 1176502272 | 35651584 | 180 | TBD | 8 | 16 | 10 | 4 | 8192×4352@30 |
6.1 | 2353004544 | 35651584 | 240 | TBD | 8 | 16 | 10 | 4 | 8192×4352@60 |
6.2 | 4706009088 | 35651584 | 480 | TBD | 8 | 16 | 10 | 4 | 8192×4352@120 |
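The table can be read as a lookup: a stream fits a level when both its picture size and its luma sample rate stay within that level's limits. The sketch below copies those two columns from the table and returns the lowest level that fits. It is a simplification that ignores the bitrate, tile, and reference-frame columns, and the function name is made up for this example.

```python
# (level, max luma samples per second, max luma picture size), from the table above.
VP9_LEVELS = [
    ("1", 829440, 36864),
    ("1.1", 2764800, 73728),
    ("2", 4608000, 122880),
    ("2.1", 9216000, 245760),
    ("3", 20736000, 552960),
    ("3.1", 36864000, 983040),
    ("4", 83558400, 2228224),
    ("4.1", 160432128, 2228224),
    ("5", 311951360, 8912896),
    ("5.1", 588251136, 8912896),
    ("5.2", 1176502272, 8912896),
    ("6", 1176502272, 35651584),
    ("6.1", 2353004544, 35651584),
    ("6.2", 4706009088, 35651584),
]


def minimum_level(width, height, fps):
    """Return the lowest level whose size and rate limits fit, or None."""
    picture_size = width * height
    sample_rate = picture_size * fps
    for level, max_rate, max_size in VP9_LEVELS:
        if picture_size <= max_size and sample_rate <= max_rate:
            return level
    return None
```

Under these two constraints alone, 1920×1080 at 30 frames per second lands in level 4, and doubling the frame rate moves it to level 4.1.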
VP9 is a traditional block-based transform coding format. The bitstream format is relatively simple compared to formats that offer similar bitrate efficiency like HEVC. [46]
VP9 has many design improvements compared to VP8. Its biggest improvement is support for the use of coding units [47] of 64×64 pixels. This is especially useful with high-resolution video. [3] [8] [9] Also, the prediction of motion vectors was improved. [48] In addition to VP8's four modes (average/"DC", "true motion", horizontal, vertical), VP9 supports six oblique directions for linear extrapolation of pixels in intra-frame prediction.[ citation needed ]
New coding tools also include:
In order to enable some parallel processing of frames, video frames can be split along coding unit boundaries into evenly spaced tiles that are 256 to 4096 pixels wide and arranged in up to four rows, with each tile column coded independently. Tiling is mandatory for video wider than 4096 pixels. A tile header contains the tile size in bytes so decoders can skip ahead and decode each tile row in a separate thread. The image is then divided into coding units called superblocks of 64×64 pixels, which are adaptively subpartitioned in a quadtree coding structure. [8] [9] They can be subdivided either horizontally or vertically or both; square (sub)units can be subdivided recursively down to 4×4 pixel blocks. Subunits are coded in raster scan order: left to right, top to bottom.
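To make the tile and superblock arithmetic concrete, the sketch below counts the 64×64 superblocks covering a frame and derives the range of tile column counts allowed by the 256 to 4096 pixel tile width bounds. It assumes that tile column counts are powers of two and that the bounds are measured in whole superblocks (4 and 64 superblocks respectively); both the helper names and those assumptions are illustrative rather than quoted from the specification.

```python
SUPERBLOCK = 64                      # superblock edge length in pixels
MIN_TILE_WIDTH_SB = 256 // SUPERBLOCK    # 4 superblocks
MAX_TILE_WIDTH_SB = 4096 // SUPERBLOCK   # 64 superblocks


def superblock_grid(width, height):
    """Return (columns, rows) of 64x64 superblocks needed to cover the frame."""
    cols = -(-width // SUPERBLOCK)   # ceiling division
    rows = -(-height // SUPERBLOCK)
    return cols, rows


def tile_column_range(width):
    """Return (min, max) number of tile columns, assuming power-of-two counts."""
    sb_cols, _ = superblock_grid(width, 1)
    # Fewest columns such that no tile exceeds the maximum width.
    min_log2 = 0
    while (MAX_TILE_WIDTH_SB << min_log2) < sb_cols:
        min_log2 += 1
    # Most columns such that every tile still reaches the minimum width.
    max_log2 = 0
    while (sb_cols >> (max_log2 + 1)) >= MIN_TILE_WIDTH_SB:
        max_log2 += 1
    return 1 << min_log2, 1 << max_log2
```

With these assumptions a 1920-pixel-wide frame may use one to four tile columns, while an 8192-pixel-wide frame needs at least two, which is the sense in which tiling becomes mandatory beyond 4096 pixels.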
Starting from each key frame, decoders keep 8 frames buffered to be used as reference frames or to be shown later. Transmitted frames signal which buffer to overwrite and can optionally be decoded into one of the buffers without being shown. The encoder can send a minimal frame that just triggers one of the buffers to be displayed ("skip frame"). Each inter frame can reference up to three of the buffered frames for temporal prediction. Up to two of those reference frames can be used in each coding block to calculate a sample data prediction, using spatially displaced (motion compensation) content from a reference frame or an average of content from two reference frames ("compound prediction mode"). The (ideally small) remaining difference (delta encoding) from the computed prediction to the actual image content is transformed using a DCT or ADST (for edge blocks) and quantized.
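The buffer handling described above can be modeled with a small toy class. This is a simplified sketch, not decoder code: frames are represented by plain labels, the `refresh_mask` bitmask and the class and method names are assumptions for illustration, and the rounding in `compound_average` is one plausible way to average two predictions.

```python
NUM_SLOTS = 8  # number of buffered reference frames described in the text


class ReferencePool:
    """Toy model of the eight reference buffers and the display order."""

    def __init__(self):
        self.slots = [None] * NUM_SLOTS
        self.displayed = []

    def decode(self, frame, refresh_mask, show=True):
        """Store a decoded frame in every slot selected by refresh_mask."""
        for slot in range(NUM_SLOTS):
            if refresh_mask & (1 << slot):
                self.slots[slot] = frame
        if show:
            self.displayed.append(frame)

    def show_existing(self, slot):
        """Model a 'skip frame' that only displays an already buffered frame."""
        if self.slots[slot] is None:
            raise ValueError(f"slot {slot} is empty")
        self.displayed.append(self.slots[slot])


def compound_average(pred_a, pred_b):
    """Average two predictions sample by sample, rounding half up."""
    return [(a + b + 1) >> 1 for a, b in zip(pred_a, pred_b)]
```

A typical sequence in this model is a key frame that fills all slots, a hidden frame decoded with `show=False` into one slot, an ordinary inter frame, and finally `show_existing` on the hidden frame's slot, so the hidden frame is displayed after the inter frame even though it was transmitted first.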
An equivalent of a B-frame can be coded while preserving the original frame order in the bitstream using a structure named superframes. Hidden alternate reference frames can be packed together with an ordinary inter frame and a skip frame that triggers display of previous hidden altref content from its reference frame buffer right after the accompanying P-frame. [46]
VP9 enables lossless encoding by transmitting at the lowest quantization level (q index 0) an additional 4×4-block encoded Walsh–Hadamard transformed (WHT) residue signal. [49] [50]
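The reason a Walsh–Hadamard transform suits lossless coding is that it can be computed and inverted exactly in integer arithmetic. The sketch below shows this with a plain, unnormalized 4×4 Walsh–Hadamard transform; it is a generic textbook variant chosen for illustration and is not claimed to be the specific variant used in VP9.

```python
# Symmetric 4x4 Hadamard matrix with orthogonal rows; H times H equals 4 * identity.
H = [
    [1, 1, 1, 1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
    [1, -1, 1, -1],
]


def _matmul(a, b):
    """Multiply two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def wht4x4(block):
    """Forward transform: H * block * H, all in integers."""
    return _matmul(_matmul(H, block), H)


def inverse_wht4x4(coeffs):
    """Inverse transform: H * coeffs * H equals 16 * block, so divide exactly."""
    scaled = _matmul(_matmul(H, coeffs), H)
    return [[value // 16 for value in row] for row in scaled]
```

Because no coefficient is rounded or quantized, applying `inverse_wht4x4` to the output of `wht4x4` returns the original residual block bit for bit.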
In order to be seekable, raw VP9 bitstreams have to be encapsulated in a container format, for example Matroska (.mkv), its derived WebM format (.webm) or the older minimalistic Indeo video file (IVF) format which is traditionally supported by libvpx. [46] [47] VP9 is identified as V_VP9 in WebM and VP09 in MP4, adhering to respective naming conventions. [51]
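To show how minimal the IVF container is, the sketch below parses its fixed 32-byte file header. The field layout used here (a `DKIF` signature, little-endian fields, and a FourCC such as `VP90` for VP9) follows the commonly documented IVF layout and is an assumption not stated in the text above; the function and dictionary key names are invented for this example.

```python
import struct

# Assumed layout: signature, version, header size, FourCC, width, height,
# time base denominator, time base numerator, frame count, 4 unused bytes.
IVF_HEADER = struct.Struct("<4sHH4sHHIII4x")


def parse_ivf_header(data):
    """Parse the first 32 bytes of an IVF file into a dictionary."""
    if len(data) < IVF_HEADER.size:
        raise ValueError("IVF header needs at least 32 bytes")
    (signature, version, header_size, fourcc, width, height,
     rate, scale, frame_count) = IVF_HEADER.unpack_from(data)
    if signature != b"DKIF":
        raise ValueError("not an IVF file")
    return {
        "version": version,
        "header_size": header_size,
        "fourcc": fourcc.decode("ascii"),
        "width": width,
        "height": height,
        "timebase_denominator": rate,
        "timebase_numerator": scale,
        "frame_count": frame_count,
    }
```

Reading the first 32 bytes of a file opened in binary mode and passing them to `parse_ivf_header` is enough to learn the codec and frame size before touching any frame data.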
Adobe Flash, which traditionally used VPx formats up to VP7, was never upgraded to VP8 or VP9, but instead to H.264. Therefore, VP9 often penetrated corresponding web applications only with the gradual shift from Flash to HTML5 technology, which was still somewhat immature when VP9 was introduced. Trends towards UHD resolutions, higher color depth and wider gamuts are driving a shift towards new, specialized video formats. Given the clear development perspective and industry support demonstrated by the founding of the Alliance for Open Media, as well as the pricey and complex licensing situation of HEVC, it is expected that users of the hitherto leading MPEG formats will often switch to the royalty-free alternative formats of the VPx/AVx series instead of upgrading to HEVC. [52]
A main user of VP9 is Google's popular video platform YouTube, which offers VP9 video at all resolutions [52] along with Opus audio in the WebM file format, through DASH streaming.
Another early adopter was Wikipedia (specifically Wikimedia Commons, which hosts multimedia files across Wikipedia's subpages and languages). Wikipedia endorses open and royalty-free multimedia formats. [53] As of 2016, the three accepted video formats are VP9, VP8 and Theora. [54]
Since December 2016, Netflix has used VP9 encoding for its catalog, alongside H.264 and HEVC. In February 2020, Netflix began adopting AV1 for mobile devices, much as VP9 was first introduced on the platform. [55]
Google TV uses (at least in part) VP9 profile 2 with Widevine DRM. [56] [57] [58]
Stadia used VP9 for video game streaming at up to 4K on supported hardware such as the Chromecast Ultra, mobile phones, and web browsers. [59]
A series of cloud encoding services offer VP9 encoding, including Amazon, Bitmovin, [60] Brightcove, castLabs, JW Player, Telestream, and Wowza. [43]
Encoding.com has offered VP9 encoding since Q4 2016, [61] which amounted to a yearly average of 11% popularity for VP9 among its customers that year. [62]
JW Player supports VP9 in its widely used software-as-a-service HTML video player. [43]
VP9 is implemented in these web browsers:
Operating system | Microsoft Windows | macOS | BSD / Linux | Android OS | iOS |
---|---|---|---|---|---|
Codec support | Yes Partial: Win 10 v1607 Full: Win 10 v1809 | Yes | Yes | Yes | Yes |
Container support | On Windows 10 Anniversary Update (1607): WebM (.webm is not recognized; requires pseudo extension) Matroska (.mkv) On Windows 10 October 2018 Update (1809): | WebM (.webm) - Introduced in macOS 11.3 | WebM (.webm) Matroska (.mkv) | WebM (.webm) Matroska (.mkv) | WebM (.webm) - Introduced in iOS 17.4 |
Notes | On Windows 10 : - On Anniversary Update (1607), limited support is available in Microsoft Edge (via MSE only) and Universal Windows Platform apps. - On April 2018 Update (1803) with Web Media Extensions preinstalled, Microsoft Edge (EdgeHTML 17) supports VP9 videos embedded in <video> tags. - On October 2018 Update (1809), VP9 Video Extensions is preinstalled. It enables encoding of VP8 and VP9 content on devices that do not have a hardware-based video encoder. [67] | Support introduced in macOS 11.0 | Support introduced by FFmpeg 2.7.7 "Nash" | Support introduced in Android 4.4 | Support introduced in iOS 14.0 [5] [6] |
VP9 is supported in all major open source media player software, including VLC, MPlayer/MPlayer2/MPV, Kodi, MythTV, [68] and FFplay.
Android has had VP9 software decoding since version 4.4 "KitKat". [69] For a list of consumer electronics with hardware support, including TVs, smartphones, set top boxes and game consoles, see webmproject.org's list. [70]
Hardware-accelerated VP9 decoding is now ubiquitous, as most GPUs and SoCs support it natively. Hardware encoding is available in Intel's Kaby Lake processors and later. [71]
The Sony PlayStation 5 supports capturing 1080p and 2160p footage using VP9 in a WebM container. [72]
The reference implementation from Google is found in the free software programming library libvpx. It has a single-pass and a two-pass encoding mode, but the single-pass mode is considered broken and does not offer effective control over the target bitrate. [43] [73]
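Since the two-pass mode is the one that offers effective rate control, a two-pass encode is usually driven as two consecutive runs of the encoder. The sketch below builds the two command lines for FFmpeg's `libvpx-vp9` wrapper without executing them. It assumes an `ffmpeg` binary built with libvpx and libopus is on the path; the helper name, file names, and the 2 Mbit/s default are placeholders for this example.

```python
import os


def two_pass_commands(src, dst, bitrate="2M"):
    """Return the argument lists for a two-pass libvpx-vp9 encode via FFmpeg."""
    common = ["ffmpeg", "-y", "-i", src, "-c:v", "libvpx-vp9", "-b:v", bitrate]
    # Pass 1 only gathers statistics: no audio, output discarded.
    first = common + ["-pass", "1", "-an", "-f", "null", os.devnull]
    # Pass 2 uses the statistics to hit the target bitrate and writes the file.
    second = common + ["-pass", "2", "-c:a", "libopus", dst]
    return [first, second]
```

Each returned list can be handed to `subprocess.run(cmd, check=True)` in order; the second run relies on the statistics log that the first run leaves in the working directory.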
FFmpeg's VP9 decoder takes advantage of a corpus of SIMD optimizations shared with other codecs to make it fast. A comparison made by an FFmpeg developer indicated that it was faster than libvpx and, compared to FFmpeg's H.264 decoder, delivered "identical" performance for same-bitrate video, or about 10% faster decoding for same-quality video. [42]
In March 2019, Luxembourg-based Sisvel announced the formation of patent pools for VP9 and AV1. Members of the pools included JVCKenwood, NTT, Orange S.A., Philips, and Toshiba, all of whom were also licensing patents to the MPEG-LA for either the AVC, DASH, or the HEVC patent pools. [75] [76] A list of claimed patents was first published on 10 March 2020. This list contains over 650 patents. [77]
Sisvel's prices are €0.24 for display devices and €0.08 for non-display devices using VP9; it would not seek royalties for encoded content. [78] [75] However, its license makes no exemption for software. [77]
According to The WebM Project, Google does not plan to alter its current or upcoming usage plans for VP9 or AV1 even though it is aware of the patent pools; none of the licensors in the patent pools were involved in the development of VP9 or VP8; and third parties cannot be stopped from demanding licensing fees for any technology that is open-source, royalty-free, or free of charge. [79]
On September 12, 2014, Google announced that development on VP10 had begun and that after the release of VP10 they planned to have an 18-month gap between releases of video formats. [80] In August 2015, Google began to publish code for VP10. [81]
However, Google decided to incorporate VP10 into AOMedia Video 1 (AV1). The AV1 codec was developed based on a combination of technologies from VP10, Daala (Xiph/Mozilla) and Thor (Cisco). [82] [83] [84] Accordingly, Google has stated that they will not deploy VP10 internally nor officially release it, making VP9 the last of the VPx-based codecs to be released by Google. [85]