LZX is an LZ77 family compression algorithm, a slightly improved version of DEFLATE. [1] It is also the name of a file archiver with the same name. Both were invented by Jonathan Forbes and Tomi Poutanen in the 1990s.
LZX was publicly released as an Amiga file archiver in 1995, while the authors were studying at the University of Waterloo in Canada. The software was shareware, which was common for compression software at the time. The registered version contained fixes and improvements that were not available in the evaluation version. In 1997, the authors gave away a free keyfile, which allowed anyone to use the registered version, as they had stopped work on the archiver and stopped accepting registrations.
In 1996, Forbes went to work for Microsoft, [2] and Microsoft's cabinet archiver was enhanced to include the LZX compression method. Improvements included a variable search window size; Amiga LZX was fixed to 64 KB, and Microsoft LZX could range on powers of two between 32 and 2048 kilobytes (32,768 to 2,097,152 bytes). A special preprocessor was added to detect Intel 80x86 "CALL" instructions, converting their operands from relative addressing to absolute addressing, thus calls to the same location resulted in repeated strings that the compressor could match, improving compression of 80x86 binary code. (This technique is later generalized as Branch-Call-Jump [BCJ] filtering.)
When Microsoft introduced Microsoft Compressed HTML Help, the replacement for their classic Help file format, they chose to compress all of the HTML data with the LZX algorithm. However, in order to improve random access speed, the compressor was altered to reset itself after every 64 kilobyte (65,536 bytes) interval and re-align to a 16-bit boundary after every 32 kilobyte interval. Thus, the HTMLHelp software could immediately seek the nearest 64 kilobyte interval and start decoding from there, rather than decoding from the beginning of the compressed datastream at all times.
Microsoft LIT files for Microsoft Reader are simply an extension of the CHM file format, and thus also use LZX compression.
Windows Imaging Format, the installation/drive image file format of Windows Vista and Windows 7, uses LZX as one of the compression methods. [3]
In Windows 10, LZX compression from Windows Imaging Format is used for the new CompactOS NTFS file compression.
Microsoft uses LZX compression on Xbox Live Avatars to reduce their disk and bandwidth requirements. [4]
The unlzx program and XAD can unpack Amiga LZX archives. The cabextract program can unpack Microsoft cabinet files using the LZX method. [5] There are a multitude of cross-platform tools for decompiling or viewing CHM files, as stated in the CHM article. LIT files can be unpacked using the Convert LIT software. [6]
A file archiver is a computer program that combines a number of files together into one archive file, or a series of archive files, for easier transportation or storage. File archivers may employ lossless data compression in their archive formats to reduce the size of the archive.
Lossless compression is a class of data compression that allows the original data to be perfectly reconstructed from the compressed data with no loss of information. Lossless compression is possible because most real-world data exhibits statistical redundancy. By contrast, lossy compression permits reconstruction only of an approximation of the original data, though usually with greatly improved compression rates.
Portable Network Graphics is a raster-graphics file format that supports lossless data compression. PNG was developed as an improved, non-patented replacement for Graphics Interchange Format (GIF)—unofficially, the initials PNG stood for the recursive acronym "PNG's not GIF".
Windows Media Audio (WMA) is a series of audio codecs and their corresponding audio coding formats developed by Microsoft. It is a proprietary technology that forms part of the Windows Media framework. WMA consists of four distinct codecs. The original WMA codec, known simply as WMA, was conceived as a competitor to the popular MP3 and RealAudio codecs. WMA Pro, a newer and more advanced codec, supports multichannel and high-resolution audio. A lossless codec, WMA Lossless, compresses audio data without loss of audio fidelity. WMA Voice, targeted at voice content, applies compression using a range of low bit rates. Microsoft has also developed a digital container format called Advanced Systems Format to store audio encoded by WMA.
zlib is a software library used for data compression as well as a data format. zlib was written by Jean-loup Gailly and Mark Adler and is an abstraction of the DEFLATE compression algorithm used in their gzip file compression program. zlib is also a crucial component of many software platforms, including Linux, macOS, and iOS. It has also been used in gaming consoles such as the PlayStation 4, PlayStation 3, Wii U, Wii, Xbox One and Xbox 360.
ZIP is an archive file format that supports lossless data compression. A ZIP file may contain one or more files or directories that may have been compressed. The ZIP file format permits a number of compression algorithms, though DEFLATE is the most common. This format was originally created in 1989 and was first implemented in PKWARE, Inc.'s PKZIP utility, as a replacement for the previous ARC compression format by Thom Henderson. The ZIP format was then quickly supported by many software utilities other than PKZIP. Microsoft has included built-in ZIP support in versions of Microsoft Windows since 1998 via the "Plus! 98" addon for Windows 98. Native support was added as of the year 2000 in Windows ME. Apple has included built-in ZIP support in Mac OS X 10.3 and later. Most free operating systems have built in support for ZIP in similar manners to Windows and macOS.
RAR is a proprietary archive file format that supports data compression, error correction and file spanning. It was developed in 1993 by Russian software engineer Eugene Roshal and the software is licensed by win.rar GmbH. The name RAR stands for Roshal Archive.
Cabinet is an archive-file format for Microsoft Windows that supports lossless data compression and embedded digital certificates used for maintaining archive integrity. Cabinet files have .cab
filename extensions and are recognized by their first four bytes MSCF. Cabinet files were known originally as Diamond files.
7-Zip is a free and open-source file archiver, a utility used to place groups of files within compressed containers known as "archives". It is developed by Igor Pavlov and was first released in 1999. 7-Zip has its own archive format called 7z, but can read and write several others.
7z is a compressed archive file format that supports several different data compression, encryption and pre-processing algorithms. The 7z format initially appeared as implemented by the 7-Zip archiver. The 7-Zip program is publicly available under the terms of the GNU Lesser General Public License. The LZMA SDK 4.62 was placed in the public domain in December 2008. The latest stable version of 7-Zip and LZMA SDK is version 24.08.
LHA or LZH is a freeware compression utility and associated file format. It was created in 1988 by Haruyasu Yoshizaki, a doctor, and originally named LHarc. A complete rewrite of LHarc, tentatively named LHx, was eventually released as LH. It was then renamed to LHA to avoid conflicting with the then-new MS-DOS 5.0 LH command. The original LHA and its Windows port, LHA32, are no longer in development because Yoshizaki is busy at work.
Microsoft Compiled HTML Help is a Microsoft proprietary online help format, consisting of a collection of HTML pages, an index and other navigation tools. The files are compressed and deployed in a binary format with the extension .CHM, for Compiled HTML. The format is often used for software documentation.
The Windows Imaging Format (WIM) is a file-based disk image format. It was developed by Microsoft to help deploy Windows Vista and subsequent versions of the Windows operating system family, as well as Windows Fundamentals for Legacy PCs.
JPEG XR is an image compression standard for continuous tone photographic images, based on the HD Photo specifications that Microsoft originally developed and patented. It supports both lossy and lossless compression, and is the preferred image format for Ecma-388 Open XML Paper Specification documents.
XZ Utils is a set of free software command-line lossless data compressors, including the programs lzma and xz, for Unix-like operating systems and, from version 5.0 onwards, Microsoft Windows. For compression/decompression the Lempel–Ziv–Markov chain algorithm (LZMA) is used. XZ Utils started as a Unix port of Igor Pavlov's LZMA-SDK that has been adapted to fit seamlessly into Unix environments and their usual structure and behavior.
STDU Viewer is computer software, a compact viewer for many computer file formats: Portable Document Format (PDF), World Wide Fund for Nature (WWF), DjVu, comic book archive, FB2, ePUB, XML Paper Specification (XPS), Text Compression for Reader (TCR), Mobipocket (MOBI), AZW, multi-page TIFF, text file (TXT), PalmDoc (PDB), Windows Metafile (EMF), Windows Metafile (WMF), bitmap (BMP), Graphics Interchange Format (GIF), JPEG-JPG, Portable Network Graphics (PNG), Photoshop Document (PSD), PiCture eXchange (PCX-DCX). It works under Microsoft Windows, and is free for non-commercial use.
Brotli is a lossless data compression algorithm developed by Google. It uses a combination of the general-purpose LZ77 lossless compression algorithm, Huffman coding and 2nd-order context modelling. Brotli is primarily used by web servers and content delivery networks to compress HTTP content, making internet websites load faster. A successor to gzip, it is supported by all major web browsers and has become increasingly popular, as it provides better compression than gzip.
In data compression, BCJ, short for branch/call/jump, refers to a technique that improves the compression of machine code by replacing relative branch addresses with absolute ones. This allows a Lempel–Ziv compressor to identify duplicate targets and more efficiently encode them. On decompression, the inverse filter restores the original encoding. Different BCJ filters are used for different instruction sets, as each use different opcodes for branching.