Pax (Unix)

pax
Operating system: Unix and Unix-like
Type: Command

pax is an archiving utility specified by POSIX since 1995. [1] Rather than reconcile the incompatible options that had crept in between tar and cpio, and between their implementations across various versions of Unix, the IEEE designed a new archive utility that supports several archive formats and borrows useful options from both archivers.

In 2001, the IEEE defined a new pax archive format, which is essentially the tar format with additional extended attributes. [2]

The name "pax" is an acronym for portable archive exchange. The command invocation and structure are to some extent a unification of tar and cpio.

History

A version of the pax program was created by Keith Muller in 1992–1993. It first appeared in 4.4BSD (1995). [3] pax was accepted into X/Open issue 4 (Single Unix Specification version 1) in 1995. [1] These versions defined only the command-line interface, a tar/cpio hybrid; the pax file format was not yet defined. (The work on defining pax likely precedes Muller's work; it appears in the early POSIX.2 and IEEE 1003.1b drafts circa 1991.)

In 1997, Sun proposed a method for adding extensions to the ustar format. This method was later accepted for the POSIX.1-2001 standard, as the new pax file format. The POSIX specification for the pax utility was updated to include this format. [1]

Features

Modes

pax has four general modes that are invoked by a combination of the -r ("read") and -w ("write") options.

This table summarizes the modal behaviour:

Option   Mode      Description
(none)   "list"    Shows the contents of an archive; does not modify or extract anything.
-r       "read"    Reads and extracts the contents of an archive.
-w       "write"   Creates an archive or appends files to an archive.
-rw      "copy"    Reads and copies files and directory trees to a specified directory.

This model resembles that of cpio, which has a comparable set of basic operations.

Examples:
List contents of an archive:

pax < archive.tar

Extract contents of an archive into the current directory:

pax -r < archive.tar

Create an archive of the current directory, passing the file list either on standard input or as an operand:

printf '.' | pax -w > archive.tar
pax -w . > archive.tar

Copy the current directory tree to another location (the target directory must already exist; it is the last operand):

pax -rw . target_dir
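
The four modes can be tried together in a throwaway directory. The following is a minimal sketch rather than a reference procedure: the scratch directory layout and the names work, demo.tar and note.txt are made up for illustration, and the script simply stops if pax is not installed.

```shell
#!/bin/sh
# Sketch: exercise write, list, read and copy modes on a scratch tree.
set -eu
command -v pax >/dev/null 2>&1 || { echo "pax not installed; skipping" >&2; exit 0; }

work=$(mktemp -d)                      # hypothetical scratch area
mkdir -p "$work/src/docs" "$work/out" "$work/copy"
printf 'hello\n' > "$work/src/docs/note.txt"

# Write mode (-w): archive the tree with paths relative to src.
(cd "$work/src" && pax -w -f "$work/demo.tar" .)

# List mode (neither -r nor -w): print the member names.
listing=$(pax -f "$work/demo.tar")

# Read mode (-r): members are extracted relative to the current directory.
(cd "$work/out" && pax -r -f "$work/demo.tar")
restored=$(cat "$work/out/docs/note.txt")

# Copy mode (-rw): no archive file involved; the last operand is the target.
(cd "$work/src" && pax -rw . "$work/copy")
copied=$(cat "$work/copy/docs/note.txt")

printf '%s\n' "$listing"
rm -rf "$work"
```

Running each step from inside a subshell keeps the stored path names relative, so the archive can be unpacked anywhere.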

Command invocation

By default, pax uses standard input and standard output for archive and listing operations, but this can be overridden with the tar-style option -f, which names the archive file. Unlike cpio, pax descends into directories recursively; the POSIX option -d disables this behavior.

The pax command is a mish-mash of cpio and tar features. Like tar, pax processes directory entries recursively, a feature that can be disabled with -d for cpio-style behavior. The handling of file input and output is also a mix: when file names are specified on the command line, they are taken as shell globs for file input or listing (tar-like); otherwise pax takes the cpio-style behavior of reading a file list from standard input. Finally, pax supports reading from and writing to a named archive file using tar's -f FILE option.

For example, for cpio-style archiving of the current directory, pax can be combined with find just as cpio is:

find . -depth -print | pax -wd > archive.tar

(This construct is pointless without any filters for find, as it becomes identical to the above example.)
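
With a filter, the find pipeline does become useful. The sketch below archives only files matching *.txt; the directory layout and file names are invented for the example, and the script exits early when pax is not available.

```shell
#!/bin/sh
# Sketch: let find choose the members, and use -d so pax adds nothing on its own.
set -eu
command -v pax >/dev/null 2>&1 || { echo "pax not installed; skipping" >&2; exit 0; }

work=$(mktemp -d)                      # hypothetical scratch area
mkdir -p "$work/src/a"
printf 'keep\n' > "$work/src/a/keep.txt"
printf 'skip\n' > "$work/src/a/skip.log"

# Only regular files named *.txt reach pax on standard input.
(cd "$work/src" && find . -type f -name '*.txt' -print | pax -w -d > "$work/texts.tar")

# List the result from standard input, cpio-style.
listing=$(pax < "$work/texts.tar")
printf '%s\n' "$listing"
rm -rf "$work"
```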

The command for extracting the contents of an archive has the same form as with cpio:

pax -r < archive.tar

It is possible to invoke these commands in a tar-like syntax as well:

pax -wf archive.tar .   # write (archive)
pax -rf archive.tar     # read (extract)
pax -f archive.tar      # list
pax -rw . archive_dir   # copy

Compression

Most implementations of pax use the -z (gzip) and -j (bzip2) switches for compression; this feature, however, is not specified by POSIX. Note that pax cannot append to compressed archives.
Example of extracting a gzipped archive:

pax -rzf archive.tar.gz

As with tar and cpio, pax output can be piped to a separate compression or decompression program. Here xz is used as an example:

pax -w . | xz > archive.tar.xz

Listing an xz-compressed archive supplied on standard input:

xzcat archive.tar.xz | pax
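
Because -z is not specified by POSIX, a script meant to run on several pax implementations may prefer the pipe form for gzip as well. A minimal sketch, assuming gzip is installed and using made-up file names in a scratch directory:

```shell
#!/bin/sh
# Sketch: gzip compression through pipes instead of the non-POSIX -z switch.
set -eu
command -v pax >/dev/null 2>&1 || { echo "pax not installed; skipping" >&2; exit 0; }

work=$(mktemp -d)                      # hypothetical scratch area
mkdir -p "$work/src" "$work/out"
printf 'data\n' > "$work/src/file.txt"

# Write to standard output and compress on the fly.
(cd "$work/src" && pax -w . | gzip > "$work/demo.tar.gz")

# Decompress to standard output, then list and extract from standard input.
listing=$(gzip -dc "$work/demo.tar.gz" | pax)
(cd "$work/out" && gzip -dc "$work/demo.tar.gz" | pax -r)

restored=$(cat "$work/out/file.txt")
printf '%s\n' "$listing"
rm -rf "$work"
```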

Format support

Almost all extant versions of pax stem from the original 4.4BSD implementation. Most of them inherit the formats supported by that version, selectable via the -x option.

Notably, the 2001 pax format is not supported by this legacy pax. This is the case on most Linux distributions (which use the MirBSD branch of MirCPIO-paxmirabilis) [4] [5] [6] and on FreeBSD. [3]

The Heirloom Project pax supports the pax format as well as many extra formats. [7]
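
Since support for the pax format varies, a script can probe for it and fall back to ustar. The sketch below is one possible approach, not an established recipe: it assumes that an implementation lacking the format rejects -x pax with a non-zero exit status, and the file names are invented. It also reads the five bytes at offset 257 of the ustar archive, where the ustar header stores its magic string.

```shell
#!/bin/sh
# Sketch: select an archive format with -x, probing for pax-format support.
set -eu
command -v pax >/dev/null 2>&1 || { echo "pax not installed; skipping" >&2; exit 0; }

work=$(mktemp -d)                      # hypothetical scratch area
mkdir -p "$work/src"
printf 'data\n' > "$work/src/file.txt"

# ustar is accepted by the BSD-derived implementations and by POSIX pax.
(cd "$work/src" && pax -w -x ustar -f "$work/ustar.tar" .)

# The ustar header carries the magic string "ustar" at byte offset 257.
magic=$(dd if="$work/ustar.tar" bs=1 skip=257 count=5 2>/dev/null)

# Assumption: -x pax fails cleanly where the 2001 format is unsupported.
if (cd "$work/src" && pax -w -x pax -f "$work/ext.tar" .) 2>/dev/null; then
    fmt=pax
else
    fmt=ustar
fi

printf 'header magic: %s, best available format: %s\n' "$magic" "$fmt"
rm -rf "$work"
```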

Multiple volumes

pax supports archiving on multiple volumes. When the end of a volume is reached, the following message appears:

$ pax -wf /dev/fd0 .
ATTENTION! pax archive volume change required.
/dev/fd0 ready for archive volume: 2
Load the NEXT STORAGE MEDIA (if required) and make sure it is WRITE ENABLED.
Type "y" to continue, "." to quit pax, or "s" to switch to new device.
If you cannot change storage media, type "s"
Is the device ready and online? >

When restoring an archive from multiple media, pax asks for the next medium in the same fashion whenever the end of a medium is reached before the end of the archive.

Standardization, reception and popularity

Despite being standardized by the IEEE in 2001, pax had seen relatively little popularity or adoption as of 2010. This is partly because Unix users had no need for it; it was the POSIX committee that wanted a more consistent interface. pax is also fairly chatty and expects user interaction when things go wrong.

pax is required to be present in all conformant systems by Linux Standard Base since version 3.0 (released on July 6, 2005), [8] but so far few Linux distributions ship and install it by default. However, most distributions include pax as a separately installable package.

pax has also been present in Windows NT, where it is limited to file archives (tapes not supported). [9] It was later moved to the Interix subsystem. It does not support archiving or restoring Win32 ACLs. [10]

Packages handled by the Installer (OS X) often carry the bulk of their contents in an Archive.pax.gz file that may be read using the system's pax (heirloom) utility. [11]

Notes

  1. The Open Group Base Specifications Issue 6, POSIX.1-2001 (IEEE Std 1003.1). Copyright © 2001-2004 The IEEE and The Open Group.
  2. "GNU tar 1.30: 8.1 Using Less Space through Compression". www.gnu.org. Retrieved 22 April 2018.
  3. pax(1), FreeBSD General Commands Manual.
  4. "MirCPIO — paxmirabilis: tar and even ar!". MirBSD. Retrieved 24 January 2020.
  5. "MirCPIO in Launchpad". launchpad.net. Retrieved 22 April 2018.
  6. "Bug #456405 "pax lack of support for "pax" format fails LSB" : Bugs : MirCPIO". bugs.launchpad.net. Retrieved 22 April 2018.
  7. "pax.1". heirloom.sourceforge.net. Retrieved 22 April 2018.
  8. Release notes for LSB 3.0 Archived 2011-07-17 at the Wayback Machine note that pax was added: LSB 3.0. Commands and Utilities requires pax, but LSB 2.1.0. Commands and Utilities does not require it
  9. Eric Pearce (1997). Windows NT in a nutshell: a desktop quick reference for system administrators. O'Reilly Media, Inc. p. 271. ISBN 978-1-56592-251-8.
  10. http://support.microsoft.com/kb/246322
  11. "Extracting files from a pax.gz package". cocoabuilder.com. Archived from the original on 14 February 2015. Retrieved 15 August 2013.
