CMS Pipelines

Paradigm: Dataflow programming
Designed by: John P. Hartmann (IBM)
Developer: IBM
First appeared: 1986
Stable release: 1.1.12/0012 / 2020-06-03
Platform: IBM z Systems
OS: z/VM 7.1
Website: http://vm.marist.edu/~pipeline
Influenced by: Pipeline (Unix)

CMS Pipelines is a feature of the VM/CMS operating system that allows the user to create and use a pipeline. The programs in a pipeline operate on a sequential stream of records: a program writes records that are read by the next program in the pipeline. Any program can be combined with any other because reading and writing are done through a device-independent interface.

Overview

CMS Pipelines provides a CMS command, PIPE. The argument string to the PIPE command is the pipeline specification. PIPE selects programs to run and chains them together in a pipeline to pump data through.

Because CMS programs and utilities do not provide a device-independent stdin and stdout interface, CMS Pipelines has a built-in library of programs that can be called in a pipeline specification. These built-in programs interface with the operating system and perform many utility functions.
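
For instance, a minimal pipeline built only from built-in programs (the file name here is just an illustration) counts the lines of a CMS file and displays the result at the terminal:

PIPE < profile exec | count lines | console

The < device driver reads the file, count lines emits a single record containing the number of records that passed through it, and console writes that record to the terminal.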

Data on CMS is structured in logical records rather than a stream of bytes. For textual data a line of text corresponds to a logical record. In CMS Pipelines the data is passed between the stages as logical records.

CMS Pipelines users issue pipeline commands from the terminal or in EXEC procedures. Users can write programs in REXX that can be used in addition to the built-in programs.
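
As a sketch of what such a REXX program (a user-written stage) might look like, the filter below reads records from its input stream with the READTO pipeline command and writes them with OUTPUT; the stage name UPCASE is only illustrative:

/* UPCASE REXX -- sketch of a user-written stage */
signal on error                  /* leave the loop when READTO or OUTPUT reports an error */
do forever
   'readto record'               /* wait for the next input record */
   'output' translate(record)    /* pass the record on in upper case */
end
error: exit RC*(RC<>12)          /* return code 12 merely indicates end of file */

Stored as UPCASE REXX, the stage can then be used like a built-in program, for example: PIPE < input txt | upcase | console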

Example

A simple example reads a disk file and separates records containing the string "Hello" from those that do not. The selected records are modified by appending the string "World!" to each of them; the other records are translated to upper case. The two streams are then combined and the records are written to a new output file.

PIPE (end ?)
  < input txt
| a: locate /Hello/
| insert / World!/ after
| i: faninany
| > newfile txt a
? a:
| xlate upper
| i:

In this example, the < stage reads the input disk file and passes the records to the next stage in the pipeline. The locate stage separates the input stream into two output streams. The primary output of locate (records containing Hello) passes the records to the insert stage. The insert stage modifies the input records as specified in its arguments and passes them to its output. The output is connected to faninany, which combines records from all of its input streams into a single output stream. That output is written to the new disk file.

The secondary output of locate (marked by the second occurrence of the a: label) contains the records that did not meet the selection criterion. These records are translated to upper case (by the xlate stage) and passed to the secondary input stream of faninany (marked by the second occurrence of the i: label).

The pipeline topology in this example consists of two connected pipelines. The end character (the ? in this example) separates the individual pipelines in the pipeline set. Records read from the input file pass through either of the two routes of the pipeline topology. Because neither of the routes contains stages that need to buffer records, CMS Pipelines ensures that records arrive at faninany in the order in which they passed through locate.

The example pipeline is presented in 'portrait form' with the individual stages on separate lines. When a pipeline is typed as a CMS command, all stages are written on a single line.
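
When the pipeline is issued from a REXX EXEC instead, the portrait form can be kept by building the command with REXX continuation; the following is a sketch (the EXEC name is only illustrative):

/* HELLO EXEC -- issue the example pipeline from REXX */
'PIPE (end ?)',
   '< input txt',
   '| a: locate /Hello/',
   '| insert / World!/ after',
   '| i: faninany',
   '| > newfile txt a',
   '? a:',
   '| xlate upper',
   '| i:'

Each trailing comma continues the clause, so REXX assembles the pieces into the same single-line PIPE command before passing it to CMS.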

Features

The concept of a simple pipeline is extended in several ways: a stage can have more than one input and output stream, and labels together with an end character allow a set of pipelines to be connected into a network (as locate and faninany illustrate in the example above); user-written REXX stages can be mixed freely with the built-in programs.

CMS Pipelines offers several features to improve the robustness of programs: the complete pipeline specification is checked and each stage verifies its arguments before any records flow, so errors are reported before processing starts rather than partway through, and the return codes of the individual stages are reflected in the return code of the PIPE command.

History

John Hartmann, of IBM Denmark, started development of CMS Pipelines in 1980. [1] IBM marketed the product separately during the 1980s and integrated it into VM/ESA in late 1991. With each release of VM, the CMS Pipelines code was upgraded as well, until it was functionally frozen at the 1.1.10 level in VM/ESA 2.3 in 1997. Since then, the latest level of CMS Pipelines has been available for download from the CMS Pipelines homepage for users who wish to explore new function.

Since z/VM 6.4, which became available on November 11, 2016, the current level of CMS Pipelines is again included in the z/VM releases.

An implementation of CMS Pipelines for TSO was released in 1995 as BatchPipeWorks in the BatchPipes/MVS product. The up-to-date TSO implementation was available as a Service Offering from IBM Denmark until 2010.

Both versions are maintained from a single source code base and are commonly referred to as CMS/TSO Pipelines. The specification is available in the Author's Edition. [2]


References

  1. Melinda Varian, VM and the VM Community.
  2. CMS/TSO Pipelines, Author's Edition.