Terse

TERSE file format
Filename extension .trs
Developed by IBM
Initial release 1984
Type of format Data compression
Open format? Yes

TERSE is an IBM archive file format that supports lossless compression. A TERSE file may contain a sequential data set, a partitioned data set (PDS), a partitioned data set extended (PDSE), or a large format data set (DSNTYPE=LARGE). Any record format (RECFM) is allowed as long as the record length is less than 32 K (64 K for RECFM=VBS). Records may contain printer control characters. [1]

Terse files are compressed using a modification of the Ziv–Lempel compression algorithm developed by Victor S. Miller and Mark Wegman at the Thomas J. Watson Research Center in Yorktown Heights, New York. [2] [3]

The Terse algorithm was proprietary to IBM; however, IBM has released an open source Java decompressor under the Apache License 2.0. [4] The compression/decompression program, AMATERSE or TRSMAIN (also called terse and unterse), is available from IBM for z/OS; the z/VM equivalents are the TERSE and DETERSE commands, which handle sequential datasets only. Versions for PC DOS, OS/2, AIX, Windows (2000, XP, 2003), Linux, and Mac OS X are available online. [5]

AMATERSE

The following JCL can be used to invoke AMATERSE on z/OS (TRSMAIN uses INFILE and OUTFILE instead of SYSUT1 and SYSUT2): [6] [7]

//jobname  JOB ...
//stepname EXEC PGM=AMATERSE,PARM=ppppp
//SYSPRINT DD SYSOUT=*
//SYSUT1   DD DISP=SHR,DSN=input.dataset
//SYSUT2   DD DISP=(NEW,CATLG),DCB=ddd,DSN=output.dataset,
//            SPACE=space_parameters
//SYSUT3   DD DISP=(NEW,DELETE),SPACE=space_parameters     Optional temporary dataset
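
As a concrete illustration of the skeleton above, the following sketch compresses a dataset with PARM=PACK and then restores it with PARM=UNPACK. The dataset names (HLQ.TRACE.DATA, HLQ.TRACE.TRS, HLQ.TRACE.RESTORED), the step names, and the SPACE values are hypothetical, and the JOB statement is left site-specific. The DCB attributes on the packed output (RECFM=FB, LRECL=1024) are an assumption based on common practice for tersed datasets, and DCB is omitted on the UNPACK step on the assumption that the original attributes are recovered from the tersed data; both should be checked against the IBM documentation cited above.

```jcl
//* Sketch only: dataset names, SPACE values and DCB are assumptions.
//PACKJOB  JOB ...
//* Step 1: compress HLQ.TRACE.DATA into HLQ.TRACE.TRS
//PACK     EXEC PGM=AMATERSE,PARM=PACK
//SYSPRINT DD SYSOUT=*
//SYSUT1   DD DISP=SHR,DSN=HLQ.TRACE.DATA
//SYSUT2   DD DISP=(NEW,CATLG),DSN=HLQ.TRACE.TRS,
//            DCB=(RECFM=FB,LRECL=1024),
//            SPACE=(CYL,(10,10),RLSE)
//* Step 2: restore the tersed dataset into a new dataset
//UNPACK   EXEC PGM=AMATERSE,PARM=UNPACK
//SYSPRINT DD SYSOUT=*
//SYSUT1   DD DISP=SHR,DSN=HLQ.TRACE.TRS
//SYSUT2   DD DISP=(NEW,CATLG),DSN=HLQ.TRACE.RESTORED,
//            SPACE=(CYL,(10,10),RLSE)
```

Replacing PARM=PACK with PARM=SPACK would trade longer run time for better compression, as described in note 7. With TRSMAIN, the same steps would use INFILE and OUTFILE in place of SYSUT1 and SYSUT2.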

Uses

Terse can be used as a general-purpose compression/decompression tool. IBM also distributes downloadable program temporary fixes (PTFs) as tersed datasets. Terse is also used by IBM customers to package diagnostic information, such as z/OS dumps and traces, for transmission to IBM.

References

  1. IBM Corporation (25 August 2016). "AMATERSE: Pack and unpack a data set". IBM Knowledge Center. Retrieved 4 September 2016.
  2. Miller, Victor S.; Wegman, Mark N. (1988). "Variations on a theme by Ziv and Lempel (Data compression)". IEEE International Conference on Communications - Spanning the Universe. pp. 390–394. doi:10.1109/ICC.1988.13597. S2CID 56571935. Retrieved 4 September 2016.
  3. Lalonde, Bill. "Terse/MVS". Big Iron. Retrieved 5 September 2016.
  4. "openmainframeproject/tersedecompress". GitHub. Retrieved 28 November 2020.
  5. "Hercules-390". Discussion group for users of the Hercules ESA/390 mainframe emulator. Retrieved 5 September 2016.
  6. IBM Corporation (25 August 2016). "Specifying the JCL statements for AMATERSE". IBM Knowledge Center. Retrieved 4 September 2016.
  7. ppppp is PACK (compress), SPACK (compress, slower and compresses better), or UNPACK (uncompress)