Original author(s) | Jyrki Alakuijala, Zoltán Szabadka |
---|---|
Developer(s) | Jyrki Alakuijala, Eugene Kliuchnikov, Robert Obryk, Zoltán Szabadka, Lode Vandevenne |
Initial release | 15 October 2013 |
Stable release | |
Repository | |
Written in | C |
Operating system | Cross-platform |
Platform | Portable |
Type | Data compression |
License | MIT License |
Website | brotli |
Brotli is a lossless data compression algorithm developed by Google. It uses a combination of the general-purpose LZ77 lossless compression algorithm, Huffman coding and 2nd-order context modelling. Brotli is primarily used by web servers and content delivery networks to compress HTTP content, making internet websites load faster. A successor to gzip, it is supported by all major web browsers and has become increasingly popular, as it provides better compression than gzip.[ citation needed ]
Google employees Jyrki Alakuijala and Zoltán Szabadka initially developed Brotli in 2013 to decrease the size of transmissions of WOFF web font. [2] Alakuijala and Szabadka completed the Brotli specification during 2013–2016. The specification was accompanied with a reference implementation developed by two additional authors, Evgenii Kliuchnikov and Lode Vandevenne, who had previously developed Google's zopfli implementation of deflate and gzip compatible compression in 2013. [3] : 1 Unlike zopfli, which was a reimplementation of an existing data format specification, Brotli was a new data format and allowed the authors to improve compression ratios even further. [4]
The Brotli specification was generalized in September 2015 for HTTP stream compression (content-encoding type "br"). This generalized iteration also improved the compression ratio by using a predefined dictionary of frequently used words and phrases. The version of Brotli released in September 2015 by the Google software engineers contained enhancements in generic lossless data compression, with particular emphasis on use for HTTP compression. The encoder was partly rewritten, with the result that the compression ratio improved, both the encoder and the decoder have been sped up, the streaming API was improved, and more compression quality levels have been added. Additionally, the new release shows performance improvements across platforms, with decoding memory reduction. [4]
The Internet Engineering Task Force approved the Brotli compressed data format specification as an informational request for comment ( RFC 7932) in July 2016. [5] The Brotli data format is an integral part of the 2nd iteration of the Web Open Font Format, [5] : 3 which was recognized in a 2021 Technology & Engineering Emmy Award from the National Academy of Television Arts & Sciences for font technology standardization at W3C. [6] [7]
Brotli support has been added over the years to web browsers, with 96% of worldwide users using a browser that supports the format, as of July 2022. [8]
In 2016 Dropbox reimplemented Brotli in Rust to fulfill their requirement to be more secure against a malicious client. [9] [10]
Brotli's new file format allows its authors to improve upon Deflate by several algorithmic and format-level improvements: the use of context models for literals and copy distances, describing copy distances through past distances, use of move-to-front queue in entropy code selection, joint-entropy coding of literal and copy lengths, the use of graph algorithms in block splitting, and a larger backward reference window are example improvements.
Unlike most general-purpose compression algorithms, Brotli uses a predefined dictionary, roughly 120 KiB in size, in addition to the dynamically populated ("sliding window") dictionary. The predefined dictionary contains over 13000 common words, phrases and other substrings derived from a large corpus of text and HTML documents. [11] [3] Using a predefined dictionary has been shown to increase compression where a file mostly contains commonly used words. [12] Brotli's sliding window is limited to 16 MiB. This enables decoding on mobile phones with limited resources, but makes Brotli underperform on compression benchmarks having larger files. The constraints of the small window size can be alleviated by using Large Window Brotli, which is not compatible with RFC7932 (Brotli proper). [13]
While Google's zopfli implementation of the deflate compression algorithm is named after Zöpfli, the Swiss German word for a snack-sized braided buttery bread, brotli is named after Brötli, the Swiss German word for a bread roll. [4] Google's own implementation of the Brotli specification was released under the terms of the permissive free software MIT license in 2016. A formal validation of the Brotli specification was independently implemented by Mark Adler, [5] : 126 one of the co-authors of the zlib/gzip compression format and library. Adler's implementation was released under the terms of the similarly permissive Apache License. [14] Other implementations of the specification also exist, including one in the source-to-source Haxe language.
Brotli compression is generally used as an alternative to gzip on the web, as Brotli provides better overall compression. [15] Compared to gzip compression, JavaScript files compressed with Brotli are roughly 15% smaller, HTML files are around 20% smaller, and CSS files are around 16% smaller. [16]
The reference implementation does ship a command-line program brotli
similar to gzip
, [17] but use in the Unix-like world as a simple compressor is scarce. Libarchive developers find the raw stream format of .br
files difficult to support, as there is no magic number to indicate the file format. [18]
gzip is a file format and a software application used for file compression and decompression. The program was created by Jean-loup Gailly and Mark Adler as a free software replacement for the compress program used in early Unix systems, and intended for use by GNU. Version 0.1 was first publicly released on 31 October 1992, and version 1.0 followed in February 1993.
Lossless compression is a class of data compression that allows the original data to be perfectly reconstructed from the compressed data with no loss of information. Lossless compression is possible because most real-world data exhibits statistical redundancy. By contrast, lossy compression permits reconstruction only of an approximation of the original data, though usually with greatly improved compression rates.
Portable Network Graphics is a raster-graphics file format that supports lossless data compression. PNG was developed as an improved, non-patented replacement for Graphics Interchange Format (GIF)—unofficially, the initials PNG stood for the recursive acronym "PNG's not GIF".
zlib is a software library used for data compression as well as a data format. zlib was written by Jean-loup Gailly and Mark Adler and is an abstraction of the DEFLATE compression algorithm used in their gzip file compression program. zlib is also a crucial component of many software platforms, including Linux, macOS, and iOS. It has also been used in gaming consoles such as the PlayStation 4, PlayStation 3, Wii U, Wii, Xbox One and Xbox 360.
In computing, Deflate is a lossless data compression file format that uses a combination of LZ77 and Huffman coding. It was designed by Phil Katz, for version 2 of his PKZIP archiving tool. Deflate was later specified in RFC 1951 (1996).
ZIP is an archive file format that supports lossless data compression. A ZIP file may contain one or more files or directories that may have been compressed. The ZIP file format permits a number of compression algorithms, though DEFLATE is the most common. This format was originally created in 1989 and was first implemented in PKWARE, Inc.'s PKZIP utility, as a replacement for the previous ARC compression format by Thom Henderson. The ZIP format was then quickly supported by many software utilities other than PKZIP. Microsoft has included built-in ZIP support in versions of Microsoft Windows since 1998 via the "Plus! 98" addon for Windows 98. Native support was added as of the year 2000 in Windows ME. Apple has included built-in ZIP support in Mac OS X 10.3 and later. Most free operating systems have built in support for ZIP in similar manners to Windows and macOS.
compress is a Unix shell compression program based on the LZW compression algorithm. Compared to gzip's fastest setting, compress is slightly slower at compression, slightly faster at decompression, and has a significantly lower compression ratio. 1.8 MiB of memory is used to compress the Hutter Prize data, slightly more than gzip's slowest setting.
7-Zip is a free and open-source file archiver, a utility used to place groups of files within compressed containers known as "archives". It is developed by Igor Pavlov and was first released in 1999. 7-Zip has its own archive format called 7z, but can read and write several others.
7z is a compressed archive file format that supports several different data compression, encryption and pre-processing algorithms. The 7z format initially appeared as implemented by the 7-Zip archiver. The 7-Zip program is publicly available under the terms of the GNU Lesser General Public License. The LZMA SDK 4.62 was placed in the public domain in December 2008. The latest stable version of 7-Zip and LZMA SDK is version 24.08.
Lempel–Ziv–Oberhumer (LZO) is a lossless data compression algorithm that is focused on decompression speed.
The following tables compare general and technical information for a number of file archivers. Please see the individual products' articles for further information. They are neither all-inclusive nor are some entries necessarily up to date. Unless otherwise specified in the footnotes section, comparisons are based on the stable versions—without add-ons, extensions or external programs.
An image file format is a file format for a digital image. There are many formats that can be used, such as JPEG, PNG, and GIF. Most formats up until 2022 were for storing 2D images, not 3D ones. The data stored in an image file format may be compressed or uncompressed. If the data is compressed, it may be done so using lossy compression or lossless compression. For graphic design applications, vector formats are often used. Some image file formats support transparency.
HTTP compression is a capability that can be built into web servers and web clients to improve transfer speed and bandwidth utilization.
XZ Utils is a set of free software command-line lossless data compressors, including the programs lzma and xz, for Unix-like operating systems and, from version 5.0 onwards, Microsoft Windows. For compression/decompression the Lempel–Ziv–Markov chain algorithm (LZMA) is used. XZ Utils started as a Unix port of Igor Pavlov's LZMA-SDK that has been adapted to fit seamlessly into Unix environments and their usual structure and behavior.
WebP is a raster graphics file format developed by Google intended as a replacement for JPEG, PNG, and GIF file formats. It supports both lossy and lossless compression, as well as animation and alpha transparency.
Zopfli is a data compression library that performs Deflate, gzip and zlib data encoding. It achieves higher compression ratios than mainstream Deflate and zlib implementations at the cost of being slower. Google first released Zopfli in February 2013 under the terms of Apache License 2.0.
Zstandard is a lossless data compression algorithm developed by Yann Collet at Facebook. Zstd is the corresponding reference implementation in C, released as open-source software on 31 August 2016.
Guetzli is a freely licensed JPEG encoder that Jyrki Alakuijala, Robert Obryk, and Zoltán Szabadka have developed in Google's Zürich research branch. The encoder seeks to produce significantly smaller files than prior encoders at equivalent quality, albeit at very low speed. It is named after the Swiss German diminutive expression for biscuits, in line with the names of other compression technology from Google.
JPEG XL is a royalty-free open standard for the compressed representation of raster graphics images. It defines a graphics file format and the abstract device for coding JPEG XL bitstreams. It is developed by the Joint Photographic Experts Group (JPEG) and standardized by the International Electrotechnical Commission (IEC) and the International Organization for Standardization (ISO) as the international standard ISO/IEC 18181. As a superset of JPEG/JFIF encoding, with a compression mode built on a traditional block-based transform coding core and a "modular mode" for synthetic image content and lossless compression. Optional lossy quantization enables both lossless and lossy compression.
Currently we are testing "Large Window Brotli" extension that will allow up to 1GiB window. [...] "Large Window Brotli" is landed.
Without a magic signature, libarchive cannot automatically recognize the file type, so it cannot automatically decompress. (Libarchive does not consider the file name, only the contents.)