Original author(s) | Mark H. Colburn (sponsored by The USENIX Association) |
---|---|
Developer(s) | Various open-source and commercial developers |
Initial release | 1989 |
Written in | Colburn pax, Muller pax, Heirloom Project pax: C |
Operating system | Unix, Unix-like, IBM i, Windows |
Platform | Cross-platform |
Type | Command |
License | Colburn pax: Prior BSD License; Muller pax: BSD-4-Clause; Heirloom Project pax: zlib; Windows: Proprietary software |
pax is an archiving utility available for various operating systems and defined since 1995. [1] Rather than sort out the incompatible options that have crept up between tar and cpio, along with their implementations across various versions of Unix, the IEEE designed a new archive utility, pax, that could support various archive formats with useful options from both archivers. The pax command is available on Unix and Unix-like operating systems, on IBM i, [2] and on Microsoft Windows NT [3] until Windows 2000.
In 2001, IEEE defined a new pax format which is basically tar with additional extended attributes. [4] [5] The format is not supported by pax commands in most Linux distributions and in FreeBSD, but it is supported by tar commands from GNU and FreeBSD; the format is further supported by pax commands in AIX, Solaris and HP-UX.
The name "pax" is an acronym for portable archive exchange, but is also an allusion to the Latin word for "peace"; the command invocation and structure represent something of a peaceful unification of both tar and cpio.
The first public implementation of pax was written by Mark H. Colburn in 1989. Colburn posted it to comp.sources.unix as "Usenix/IEEE POSIX replacement for TAR and CPIO". Manual pages for pax on HP-UX, IRIX, and SCO UNIX attribute pax to Colburn. [6]
The pax command is covered as early as POSIX.2 draft 10 from July 1990. [7] Furthermore, POSIX.2 and IEEE 1003.1b drafts from 1991 cover the pax command, featuring the cpio and ustar archive formats.
Another version of the pax program was created by Keith Muller in 1992–1993. [8] This version first appeared in 4.4BSD (1995). [9] The pax command appeared in X/Open issue 4 (Single Unix Specification version 1) in 1995, [1] featuring the cpio and ustar archive formats, which were also the only two formats featured in the 1997 Single Unix Specification.
In 1997, Sun Microsystems proposed a method for adding extensions to the ustar format. This method was accepted for the POSIX.1-2001 standard as the new pax file format. The POSIX specification for the pax utility was updated to include this format. [1]
pax has four general modes that are invoked by a combination of the -r ("read") and -w ("write") options.
This table summarizes the modal behaviour:
Option | Mode | Description |
---|---|---|
(none) | "list" | shows contents of archive, does not modify or extract anything. |
-r | "read" | reads and extracts contents of an archive |
-w | "write" | creates archives or appends files to an archive |
-rw | "copy" | reads and copies files and directory tree to a specified directory |
This model resembles that of cpio, which has a comparable set of basic operations.
List contents of an archive:
$ pax < archive.tar
Extract contents of an archive into the current directory:
$ pax -r < archive.tar
Create an archive of the current directory:
$ printf '.' | pax -w > archive.tar
$ pax -w . > archive.tar
Copy current directory tree to another location:
$ pax -rw . target_dir
(The target directory must exist beforehand!)
By default, pax uses standard input and standard output for archive and listing operations. This can be changed with the "tar-style" option -f, which specifies the archive file. Pax differs from cpio by recursively considering the content of a directory; POSIX pax has an option -d to disable this behavior.
The pax command is a mixture of cpio and tar features. Like tar, pax processes directory entries recursively, a feature that can be disabled with -d for cpio-style behavior. The handling of file input and output is also a mix: when a list of file names is specified on the command line, they are taken as shell globs for file input or listing (tar-like); otherwise pax takes the cpio-style behavior of reading the file list from standard input. Finally, pax supports reading from and writing to a named archive file using tar's -f FILE option.
For example, if one desires a cpio-style archiving of the current directory, pax can be used with find just like one does using cpio:
$ find . -depth -print | pax -wd > archive.tar
(This construct is pointless without any filters for find, as it becomes identical to the above example.)
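With a filter, the pipeline does something the plain invocation cannot. The following is a minimal sketch, assuming a pax implementation is installed; the directory layout and the file names (notes.txt, sub/todo.txt, image.bin) are invented for illustration and do not come from any real system.

```shell
#!/bin/sh
# Sketch: archive only the *.txt files selected by find (cpio-style file list on stdin).
# All file and directory names below are hypothetical.
set -eu

workdir=$(mktemp -d)
mkdir -p "$workdir/src/sub"
echo "alpha" > "$workdir/src/notes.txt"
echo "beta"  > "$workdir/src/sub/todo.txt"
echo "gamma" > "$workdir/src/image.bin"

if command -v pax >/dev/null 2>&1; then
    # -w writes an archive to standard output; -d keeps pax from recursing
    # into any directory named in the list, so only the listed names are stored.
    (cd "$workdir/src" && find . -type f -name '*.txt' -print | pax -wd > "$workdir/text.tar")
    # With neither -r nor -w, pax lists the archive it reads from standard input.
    listing=$(pax < "$workdir/text.tar")
else
    listing="pax is not installed"
fi

printf '%s\n' "$listing"
rm -rf "$workdir"
```

The listing should name the two text files and omit image.bin, which find never passed to pax.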
The command for extracting the contents of an archive is the same as with cpio:
$ pax -r < archive.tar
It is possible to invoke these commands in a tar-like syntax as well:
$ pax -wf archive.tar .   # write (archive)
$ pax -rf archive.tar     # read (extract)
$ pax -f archive.tar      # list
$ pax -rw . archive_dir   # copy
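The four invocations can be combined into a small round trip. This is only a sketch under stated assumptions: it presumes pax is installed, and the scratch directories (src, extracted, copied) and the file hello.txt are hypothetical names chosen for the example.

```shell
#!/bin/sh
# Sketch: round trip through the write, list, read and copy modes using -f.
# Directory and file names are hypothetical.
set -eu

workdir=$(mktemp -d)
mkdir -p "$workdir/src" "$workdir/extracted" "$workdir/copied"
echo "hello pax" > "$workdir/src/hello.txt"

status="pax is not installed"
if command -v pax >/dev/null 2>&1; then
    (cd "$workdir/src" && pax -wf "$workdir/archive.tar" .)       # write
    pax -f "$workdir/archive.tar"                                  # list
    (cd "$workdir/extracted" && pax -rf "$workdir/archive.tar")   # read
    (cd "$workdir/src" && pax -rw . "$workdir/copied")            # copy; target must already exist
    # cmp -s is silent and exits non-zero when the two files differ.
    if cmp -s "$workdir/src/hello.txt" "$workdir/extracted/hello.txt" &&
       cmp -s "$workdir/src/hello.txt" "$workdir/copied/hello.txt"; then
        status="round trip ok"
    else
        status="round trip failed"
    fi
fi

echo "$status"
rm -rf "$workdir"
```

The archive is written outside the directory being archived so that pax does not try to include the archive in itself.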
Most implementations of pax use the -z (gzip) and -j (bzip2) switches for compression; this feature, however, is not specified by POSIX. Note that pax cannot append to compressed archives.
Example for extracting a gzipped archive:
$ pax -rzf archive.tar.gz
As in tar and cpio, pax output can be piped to another compressor/decompressor program. As an example xz is used here:
$ pax -w . | xz > archive.tar.xz
and listing an xz-compressed archive as the input:
$ xzcat archive.tar.xz | pax
POSIX.1-2001 requires that the pax command support at least the cpio, ustar and pax archive formats.
Versions of the pax command that stem from the 4.4BSD implementation usually inherit the formats supported by that version, which are selectable via the -x option.
The POSIX.1-2001 pax format is not supported by this BSD version of the pax command. The format is not supported on most Linux distributions (whose pax command is from the MirBSD branch of MirCPIO-paxmirabilis) [10] [11] [12] or on FreeBSD. [9] The format is supported by the pax command in AIX, [13] Solaris [14] and HP-UX. It is further supported by the Python tarfile module, [15] by the BSD-licensed libarchive, [16] [17] by 7-Zip, [18] and by GNU tar [19] and FreeBSD tar. [20] The Heirloom Project pax command, developed by Gunnar Ritter in 2003, supports the pax format as well as many extra formats. [21]
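Because many pax commands cannot write the pax format, the format is often easier to try out through tar. The following sketch assumes a tar that accepts --format=pax (GNU tar and libarchive's bsdtar are believed to). It relies on one background fact not stated above: a ustar header has a 100-byte name field, so a single path component longer than that does not fit without the pax extensions. The file name used here is generated purely for illustration.

```shell
#!/bin/sh
# Sketch: store a file name that does not fit a ustar header by using the pax format.
# Assumes a tar that accepts --format=pax (for example GNU tar or bsdtar).
set -eu

workdir=$(mktemp -d)
# 120 zeros plus ".txt": one path component longer than ustar's 100-byte name field.
long_name="$(printf '%0120d' 0).txt"
echo "payload" > "$workdir/$long_name"

# -C makes tar change into the directory before adding the file.
tar --format=pax -cf "$workdir/long.tar" -C "$workdir" "$long_name"

# Listing the archive should print the full name again.
listed=$(tar -tf "$workdir/long.tar")
printf '%s\n' "$listed"
rm -rf "$workdir"
```

Where the installed pax supports it, the equivalent POSIX spelling would be pax -w -x pax, but as noted above many pax commands lack that format.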
pax supports archiving on multiple volumes. When the end of a volume is reached, the following message appears:
$ pax -wf /dev/fd0 .
ATTENTION! pax archive volume change required.
/dev/fd0 ready for archive volume: 2
Load the NEXT STORAGE MEDIA (if required) and make sure it is WRITE ENABLED.
Type "y" to continue, "." to quit pax, or "s" to switch to new device.
If you cannot change storage media, type "s"
Is the device ready and online? >
When restoring an archive from multiple media, pax asks for the next medium in the same fashion whenever the end of a medium is reached before the end of the archive.
Despite being standardized in 2001 by IEEE, as of 2010, pax enjoys relatively little popularity or adoption. [citation needed] This is in part because Unix users had no need for it; it was the POSIX committee that wanted a more consistent interface. Pax is also fairly chatty and expects user interaction when things go wrong. [citation needed]
pax is required to be present in all conformant systems by Linux Standard Base since version 3.0 (released on July 6, 2005), [22] but so far few Linux distributions ship and install it by default. However, most distributions include pax as a separately installable package. [23] [24] [25]
pax has also been present in Windows NT, where it is limited to file archives (tapes not supported). [3] It was later moved to the Interix subsystem. It does not support archiving or restoring Win32 ACLs. [26] pax was further present in Windows 2000. [27]
Packages handled by the Installer (macOS) often carry the bulk of their contents in an Archive.pax.gz file that may be read using the system's pax (heirloom) utility. [28]