Cabinet (file format)

Cabinet
Filename extension: .cab
Internet media type: application/vnd.ms-cab-compressed
Uniform Type Identifier (UTI): public.archive.cab
UTI conformation: public.data, public.archive
Magic number: MSCF
Developed by: Microsoft
Type of format: Archive file format

Cabinet (or CAB) is an archive-file format for Microsoft Windows that supports lossless data compression and embedded digital certificates used for maintaining archive integrity. Cabinet files have the .cab filename extension and are recognized by their first four bytes (also called their magic number), MSCF. Cabinet files were originally known as Diamond files.

Design

A CAB archive can contain up to 65,535 folders (distinct from standard operating system directories), each of which can contain up to 65,535 files, for a maximum of 4,294,836,225 files. Internally, each folder is treated as a single compressed block, which provides more efficient compression than compressing each file individually.
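The maximum quoted above is simply the product of the two limits. A short check, using constant names chosen here for illustration:

```python
# Limits as stated in the text; the constant names are illustrative.
MAX_FOLDERS = 65_535            # largest value of a 16-bit unsigned integer
MAX_FILES_PER_FOLDER = 65_535

max_files = MAX_FOLDERS * MAX_FILES_PER_FOLDER
print(f"{max_files:,}")  # prints 4,294,836,225
```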

Every entry in a folder has to be a file. [1] Due to this structure, it is not possible to store empty folders in CAB archives.

The following shows an example of a CAB file structure, demonstrating the relationship between folders and files:
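A minimal sketch of that relationship, modelled in Python rather than as the on-disk layout. The class names, the helper methods and the file names are all hypothetical. The sketch only encodes the rules described above: an archive holds folders, every folder entry is a file, and an empty folder is rejected.

```python
from dataclasses import dataclass, field
from typing import List

MAX_ENTRIES = 65_535  # limit stated in the text for folders and for files per folder


@dataclass
class CabFolder:
    """One CAB folder: a non-empty group of files compressed as a single block."""
    files: List[str]

    def __post_init__(self) -> None:
        if not self.files:
            raise ValueError("a CAB folder cannot be empty")
        if len(self.files) > MAX_ENTRIES:
            raise ValueError("too many files in one folder")


@dataclass
class Cabinet:
    """An archive modelled as an ordered list of folders."""
    folders: List[CabFolder] = field(default_factory=list)

    def add_folder(self, files: List[str]) -> CabFolder:
        if len(self.folders) >= MAX_ENTRIES:
            raise ValueError("too many folders in one archive")
        folder = CabFolder(list(files))
        self.folders.append(folder)
        return folder

    def file_count(self) -> int:
        return sum(len(folder.files) for folder in self.folders)


# Hypothetical contents: two folders holding three files in total.
cab = Cabinet()
cab.add_folder(["setup.exe", "readme.txt"])
cab.add_folder(["data1.bin"])

for index, folder in enumerate(cab.folders):
    print(f"folder {index}: {', '.join(folder.files)}")
```

Note that a CAB folder in this sense is a compression unit, not a directory, so the model stores no path hierarchy.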

The CAB file format does not specify how paths should be handled, leaving this to the software implementation.

The CAB file format may employ the following compression algorithms: DEFLATE (in the form of MSZIP), Quantum, and LZX. Files can also be stored without compression.

A CAB archive can reserve empty space, both for the archive as a whole and for each file in it, for application-specific uses such as digital signatures or arbitrary data.

Implementations

Microsoft Windows supports creating CAB archive files using the makecab command-line utility. It supports extracting the contents of CAB archive files using File Explorer, Setup API, and the command-line commands expand.exe, [10] extract.exe and extrac32.exe. [11] [12]

Other well-known software with CAB archive support includes WinZip, WinRAR and 7-Zip. cabextract is a common tool for Linux systems, [13] but it can only extract archives. The gcab tool, however, can both extract and create CAB archives. [14] For a full list, see Comparison of file archivers § archive formats.

Uses

A variety of Microsoft installation technologies use the CAB format: these include Windows Installer, Setup API, Device Installer, Theme Pack and AdvPack (used by Internet Explorer to install ActiveX components). CAB files are also often associated with self-extracting programs like IExpress where the executable program extracts the associated CAB file. CAB files are also sometimes embedded into other files. For example, MSI and MSU files usually include one or more embedded CAB files.

Windows uses the cabinet format to archive its Component-Based Servicing (CBS) log, which is kept in the folder C:\Windows\Logs\CBS. A bug in the compression process can cause runaway generation of useless log files both in that folder and in C:\Windows\Temp, which can consume disk storage until the hard drive is completely full. [15] [16] Deleting the files without following a specific procedure [16] can cause them to be regenerated at an increased pace.

On Linux systems, CAB archives are used by fwupd to distribute firmware updates from hardware vendors [17] that can be flashed on non-volatile memory.

The .cab filename extension is also used by other installer programs (e.g. InstallShield) for their own proprietary archiving formats. InstallShield uses zlib for compression (see Deflate), but its headers differ from those of Microsoft CAB files. The two formats are therefore incompatible, and InstallShield archives cannot be manipulated or edited with programs made for the standard cabinet format. Specialized third-party utilities, such as Unshield, can extract this proprietary format. [18] The format has a different magic number, ISc(. [19]

The Windows CE installer uses a variant of the Microsoft CAB format with the magic number MSCE\0\0\0. [19] The compression is typically NONE, but MSZIP can also be found. [20]
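Because these formats share the .cab extension, a tool has to look at the leading bytes to tell them apart. The sketch below is one way to do that, not how any of the cited tools work: the function name and the labels are hypothetical, and the InstallShield signature is written with a lowercase c, as Unshield defines it.

```python
from typing import Optional

# Signature prefixes from the text above; the labels are illustrative only.
SIGNATURES = (
    (b"MSCF", "Microsoft Cabinet"),
    (b"ISc(", "InstallShield cabinet"),
    (b"MSCE\x00\x00\x00", "Windows CE installer cabinet"),
)


def identify_cab_variant(header: bytes) -> Optional[str]:
    """Return a label for the first matching signature, or None if none matches."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return None


def identify_cab_file(path: str) -> Optional[str]:
    """Read the first eight bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return identify_cab_variant(handle.read(8))
```

A result of None means only that the file matches none of these three signatures. It does not establish what the file is.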

Microsoft Publisher has a "Pack and Go" feature that bundles a Publisher document, together with all external links, into a CAB file with a .PUZ extension. These files are meant to be unpacked by a companion .EXE file distributed along with the .PUZ file, but they can also be opened with any CAB file extraction program.

References

  1. Microsoft Cabinet Format
  2. "Microsoft Cabinet Software Development Kit". Support. Microsoft. Retrieved 27 March 2013.
  3. "lcab". Freecode. Dice. Retrieved 27 March 2013.
  4. "cabextract".
  5. "[Undocumented][Bugs] Expand.exe (more about it) (Page 1) / Windows CMD Shell / SS64 Forum". ss64.org. Retrieved 21 April 2018.
  6. "[MS-MCI]: Microsoft ZIP (MSZIP) Compression and Decompression Data Structure". docs.microsoft.com. 13 October 2020.
  7. Salomon, D. (David) (2007). Data compression: the complete reference. Library Genesis. London: Springer. p. 187. ISBN 978-1-84628-602-5.
  8. "Quantum compression format". Matthew Russotto. Retrieved 29 March 2022.
  9. "Cabinet File Format – Open Specifications Documentation" (PDF). Microsoft Corporation. 4 March 2011. Archived (PDF) from the original on 18 November 2016. Retrieved 15 February 2023.
  10. expand – Windows CMD – SS64.com
  11. How to extract CAB File using command line tools in Windows 10
  12. Extrac32 – Uncompress files – Windows CMD – SS64.com
  13. How to extract CAB files in Linux [permanent dead link]
  14. gcab project on GNOME
  15. "Clean Up Component-Based Servicing logs". Microsoft TechNet. 7 July 2017.
  16. Leonhard, Woody (25 August 2016). "Windows 7 log file compression bug can fill up your hard drive". Computerworld. Archived from the original on 8 January 2019. Retrieved 8 January 2019.
  17. "LVFS: Hardware Vendors". Linux Vendor Firmware Service Project. Retrieved 3 May 2022.
  18. "twogood/unshield". GitHub. Retrieved 21 April 2018.
  19. "kyz/libmspack: doc/magic". GitHub. 2 January 2022.
  20. "Windows CE installation cabinet (.CAB) file format". cabextract.