File archiver

A file archiver is a computer program that combines a number of files together into one archive file, or a series of archive files, for easier transportation or storage. File archivers may employ lossless data compression in their archive formats to reduce the size of the archive.

Basic archivers just take a list of files and concatenate their contents sequentially into archives. The archive files must also store metadata, at least the names and lengths of the original files, if proper reconstruction is to be possible. More advanced archivers store additional metadata, such as the original timestamps, file attributes or access control lists.

The process of making an archive file is called archiving or packing. Reconstructing the original files from the archive is termed unarchiving, unpacking or extracting.
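As a minimal sketch of the basic scheme described above, the following Python code packs files by writing each file's name and length ahead of its contents, then reverses the process. The header layout (a 2-byte name length followed by an 8-byte content length) and the function names `pack` and `unpack` are assumptions made for illustration; they do not correspond to any real archive format.

```python
import struct

# Hypothetical header: 2-byte name length, 8-byte content length, big-endian.
HEADER = struct.Struct(">HQ")

def pack(files: dict[str, bytes]) -> bytes:
    """Concatenate files into one archive, storing each name and length."""
    out = bytearray()
    for name, data in files.items():
        encoded_name = name.encode("utf-8")
        out += HEADER.pack(len(encoded_name), len(data))
        out += encoded_name
        out += data
    return bytes(out)

def unpack(archive: bytes) -> dict[str, bytes]:
    """Rebuild the original files by walking the stored names and lengths."""
    files = {}
    offset = 0
    while offset < len(archive):
        name_len, data_len = HEADER.unpack_from(archive, offset)
        offset += HEADER.size
        name = archive[offset:offset + name_len].decode("utf-8")
        offset += name_len
        files[name] = archive[offset:offset + data_len]
        offset += data_len
    return files
```

Without the stored lengths, `unpack` would have no way to tell where one file ends and the next begins, which is why even the simplest archive needs this metadata.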

History

An early archiver was the Multics command archive, descended from the CTSS command of the same name, which was a basic archiver and performed no compression. Multics also had a "tape_archiver" command, abbreviated ta, which was perhaps the forerunner of the Unix command tar. [1]

Unix archivers

The Unix tools ar, tar, and cpio act as archivers but not compressors. Users of the Unix tools use additional compression tools, such as gzip, bzip2, or xz, to compress the archive file after packing or to remove compression before unpacking the archive file. The filename extensions are successively added at each step of this process. For example, archiving a collection of files with tar and then compressing the resulting archive file with gzip results in a file with the .tar.gz extension.
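The two-step workflow described above can be sketched with Python's standard `tarfile` and `gzip` modules, which here stand in for the command-line tools. The helper names `make_tar`, `make_tar_gz` and `read_tar_gz` are illustrative assumptions; everything is kept in memory so the example is self-contained.

```python
import gzip
import io
import tarfile

def make_tar(files: dict[str, bytes]) -> bytes:
    """Step 1: archive only. Produces an uncompressed tar stream."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def make_tar_gz(files: dict[str, bytes]) -> bytes:
    """Step 2: compress the finished archive as a whole."""
    return gzip.compress(make_tar(files))

def read_tar_gz(blob: bytes) -> dict[str, bytes]:
    """Reverse order: remove the compression first, then unpack the archive."""
    tar_bytes = gzip.decompress(blob)
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as tar:
        return {
            member.name: tar.extractfile(member).read()
            for member in tar.getmembers()
            if member.isfile()
        }
```

Because the compressor only ever sees the finished tar stream, swapping `gzip` for another compressor would leave `make_tar` untouched.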

This approach has two goals:

  1. It follows the Unix philosophy that each program should accomplish a single task to perfection, as opposed to attempting to accomplish everything with one tool. As compression technology progresses, users may use different compression programs without having to modify or abandon their archiver.
  2. The archives use solid compression. When the files are combined, the compressor can exploit redundancy across several archived files and achieve better compression than a compressor that compresses each file individually.
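The benefit of solid compression can be illustrated with a small experiment. This is a sketch under stated assumptions: `zlib` (DEFLATE) stands in for gzip, the files are plainly concatenated rather than wrapped in a real tar archive, and the sample data is hypothetical text chosen so that the files share most of their content.

```python
import zlib

def individual_size(files: list[bytes]) -> int:
    """Total size when every file is compressed on its own (non-solid)."""
    return sum(len(zlib.compress(data)) for data in files)

def solid_size(files: list[bytes]) -> int:
    """Size when the files are concatenated first and compressed once (solid)."""
    return len(zlib.compress(b"".join(files)))

# Hypothetical sample data: 50 small files that share most of their text.
shared = b"The quick brown fox jumps over the lazy dog. " * 20
sample_files = [shared + f"This is file number {i}.\n".encode() for i in range(50)]

if __name__ == "__main__":
    print("individually compressed:", individual_size(sample_files), "bytes")
    print("solid:", solid_size(sample_files), "bytes")
```

With data like this, the solid result should come out smaller, because text already seen in one file can be referenced when the compressor reaches the next.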

This approach, however, has disadvantages too:

  1. Extracting or modifying one file is difficult. Extracting one file requires decompressing the entire archive, which can be time- and space-consuming. Modifying a file means it has to be put back into the archive and the whole archive recompressed, which requires additional time and disk space.
  2. The archive becomes damage-prone. If the area holding shared data for several files is damaged, all those files are lost.
  3. It is impossible to take advantage of redundancy between files unless the compression window is larger than the size of an individual file. For example, gzip uses DEFLATE, which typically operates with a 32768-byte window, whereas bzip2 applies the Burrows–Wheeler transform to blocks of up to 900 kB, roughly 27 times larger. xz defaults to an 8 MiB window but supports significantly larger ones.
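The window-size limitation in the last point can be demonstrated with a rough experiment, assuming Python's `zlib` module as a stand-in for gzip's DEFLATE and the `lzma` module as a stand-in for xz at its default preset. The block size of 100,000 bytes and the helper names are arbitrary choices for illustration. Two identical copies of an incompressible block are compressed together; the only redundancy available is the second copy, which lies further back than DEFLATE's 32768-byte window can reach.

```python
import lzma
import random
import zlib

def make_block(size: int, seed: int = 0) -> bytes:
    """Return pseudo-random bytes, which offer no redundancy within the block."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(size))

def compare_windows(block: bytes) -> tuple[int, int]:
    """Compress two identical copies of block with DEFLATE and with LZMA."""
    doubled = block + block
    deflate_size = len(zlib.compress(doubled, 9))  # 32 KiB window
    lzma_size = len(lzma.compress(doubled))        # xz container, default preset
    return deflate_size, lzma_size

if __name__ == "__main__":
    block = make_block(100_000)
    deflate_size, lzma_size = compare_windows(block)
    print("one block:", len(block), "bytes")
    print("DEFLATE, two copies:", deflate_size, "bytes")
    print("LZMA, two copies:", lzma_size, "bytes")
```

The DEFLATE result should stay close to twice the block size, since the first copy has already left the window by the time the second one arrives, while the LZMA result should stay close to the size of a single block.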

Windows archivers

The built-in archiver of Microsoft Windows, as well as third-party archiving software such as WinRAR and 7-Zip, often use a graphical user interface. The third-party tools also offer an optional command-line interface, while Windows itself does not. Windows archivers perform both archiving and compression. Solid compression may or may not be offered, depending on the product: Windows itself does not support it, while WinRAR and 7-Zip offer it as an option that can be turned on or off.

Related Research Articles

gzip GNU file compression/decompression tool

gzip is a file format and a software application used for file compression and decompression. The program was created by Jean-loup Gailly and Mark Adler as a free software replacement for the compress program used in early Unix systems, and intended for use by GNU. Version 0.1 was first publicly released on 31 October 1992, and version 1.0 followed in February 1993.

In computing, tar is a computer software utility for collecting many files into one archive file, often referred to as a tarball, for distribution or backup purposes. The name is derived from "tape archive", as it was originally developed to write data to sequential I/O devices with no file system of their own. The archive data sets created by tar contain various file system parameters, such as name, timestamps, ownership, file-access permissions, and directory organization. POSIX abandoned tar in favor of pax, yet tar sees continued widespread use.

compress is a Unix shell compression program based on the LZW compression algorithm. Compared to gzip's fastest setting, compress is slightly slower at compression, slightly faster at decompression, and has a significantly lower compression ratio. 1.8 MiB of memory is used to compress the Hutter Prize data, slightly more than gzip's slowest setting.

7-Zip Open-source file archiver

7-Zip is a free and open-source file archiver, a utility used to place groups of files within compressed containers known as "archives". It is developed by Igor Pavlov and was first released in 1999. 7-Zip has its own archive format called 7z, but can read and write several others.

7z is a compressed archive file format that supports several different data compression, encryption and pre-processing algorithms. The 7z format initially appeared as implemented by the 7-Zip archiver. The 7-Zip program is publicly available under the terms of the GNU Lesser General Public License. The LZMA SDK 4.62 was placed in the public domain in December 2008. The latest stable version of 7-Zip and LZMA SDK is version 22.01.

StuffIt Expander File decompressor software utility

StuffIt Expander is a proprietary, freeware, closed source, decompression software utility developed by Allume Systems. It runs on the classic Mac OS, macOS, and Microsoft Windows. Prior to 2011, a Linux version had also been available for download.

pax is an archiving utility available for various operating systems and defined since 1995. Rather than sort out the incompatible options that have crept up between tar and cpio, along with their implementations across various versions of Unix, the IEEE designed a new archive utility, pax, that could support various archive formats with useful options from both archivers. The pax command is available on Unix and Unix-like operating systems, on IBM i, and on Microsoft Windows NT up to Windows 2000.

WinRAR File archiver

WinRAR is a trialware file archiver utility for Windows, developed by Eugene Roshal of win.rar GmbH. It can create and view archives in RAR or ZIP file formats, and unpack numerous archive file formats. To enable the user to test the integrity of archives, WinRAR embeds CRC32 or BLAKE2 checksums for each file in each archive. WinRAR supports creating encrypted, multi-part and self-extracting archives.

rzip is a huge-scale data compression computer program designed around initial LZ77-style string matching on a 900 MB dictionary window, followed by bzip2-based Burrows–Wheeler transform and entropy coding (Huffman) on 900 kB output chunks.

The following tables compare general and technical information for a number of file archivers. Please see the individual products' articles for further information. The tables are not all-inclusive, and some entries may not be up to date. Unless otherwise specified in the footnotes section, comparisons are based on the stable versions, without add-ons, extensions or external programs.

dar is a computer program, a command-line archiving tool intended as a replacement for tar in Unix-like operating systems.

In computing, solid compression is a method for data compression of multiple files, wherein all the uncompressed files are concatenated and treated as a single data block. Such an archive is called a solid archive. It is used natively in the 7z and RAR formats, as well as indirectly in tar-based formats such as .tar.gz and .tar.bz2. By contrast, the ZIP format is not solid because it stores separately compressed files.

Ark (software) Archiving tool for KDE desktop environment

Ark is a file archiver and compressor developed by KDE and included in the KDE Applications software bundle. It supports various common archive and compression formats including zip, 7z, rar, lha and tar.

Self-extracting archive

A self-extracting archive is a computer executable program which contains compressed data in an archive file combined with machine-executable program instructions to extract this information on a compatible operating system and without the necessity for a suitable extractor to be already installed on the target computer. The executable part of the file is known as a decompressor stub.

PeaZip File archive computer program

PeaZip is a free and open-source file manager and file archiver for Microsoft Windows, ReactOS, Linux, macOS and BSD by Giorgio Tani. It supports its native PEA archive format and other mainstream formats, with special focus on handling open formats. Version 8.6.0 supported 226 file extensions.

XZ Utils is a set of free software command-line lossless data compressors, including the programs lzma and xz, for Unix-like operating systems and, from version 5.0 onwards, Microsoft Windows. For compression/decompression the Lempel–Ziv–Markov chain algorithm (LZMA) is used. XZ Utils started as a Unix port of Igor Pavlov's LZMA-SDK that has been adapted to fit seamlessly into Unix environments and their usual structure and behavior.

FreeArc is a free and open-source high-performance file archiver developed by Bulat Ziganshin. The project is presumably discontinued: no information has been released by the developers since 2016, and the official website is down.

lzip Data compression utility

lzip is a free, command-line tool for the compression of data; it employs the Lempel–Ziv–Markov chain algorithm (LZMA) with a user interface that is familiar to users of usual Unix compression tools, such as gzip and bzip2.

BetterZip

BetterZip is a trialware file archiver developed by Robert Rezabek and first released in May 2006. It is developed solely for the macOS platform. Unlike the built-in Archive Utility from Apple, it can extract and compress in many archive formats, and it can view an archive and selectively extract files without automatically extracting the entire contents.

References

  1. Van Vleck, Tom. "Multics Glossary - A". multicians.org. Retrieved Jan 6, 2016.