BatchPipes

On IBM mainframes, BatchPipes is a batch job processing utility that runs under the MVS/ESA operating system and its successors, OS/390 and z/OS. [1]

IBM mainframes are large computer systems produced by IBM since 1952. During the 1960s and 1970s, IBM dominated the large computer market. Current mainframe computers in IBM's line of business computers are developments of the basic design of the IBM System/360.

Computerized batch processing, since the 1964 introduction of the IBM System/360, has primarily referred to the scripted running of one or more programs, as directed by Job Control Language, with no human interaction other than, if JCL-requested, the mounting of one or more pre-determined input and/or output computer tapes.

OS/390 is an IBM operating system for the System/390 IBM mainframe computers.

Core function

In traditional processing, if data records are written out to a sequential (QSAM or BSAM) data set on disk or tape, they cannot be read concurrently by another job: the "writer" and "reader" cannot run at the same time. This is termed file-level interlock or data-set-level interlock.

In IBM mainframe operating systems, Basic sequential access method (BSAM) is an access method to read and write datasets sequentially. BSAM is available on OS/360, OS/VS2, MVS, z/OS, and related operating systems.

In the context of IBM mainframe computers, a data set or dataset is a computer file having a record organization. Use of this term began with OS/360 and is still used by its successors, including the current z/OS. Documentation for these systems historically preferred this term rather than file.

With BatchPipes, an installation can arrange for the data to be "piped" between the two jobs. The jobs can then run concurrently, and it is usually possible to avoid the time taken to write the data to secondary storage and read it back. Used judiciously, these two characteristics together reduce the combined elapsed time of the two jobs, as measured from the start of the writer job to the end of the reader job.

BatchPipes maintains a short queue of records being passed between the writer and the reader. The writer adds records to the back of the queue and the reader takes them from the front. This is termed record-level interlock, and it allows the reader and the writer to run concurrently.
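The record-level interlock described above resembles a bounded producer/consumer queue. The following Python sketch is only an analogy, not BatchPipes itself: the names `run_piped_jobs`, `writer`, `reader`, and `END_OF_PIPE`, and the queue depth of 4, are all hypothetical, and threads stand in for the two batch jobs.

```python
import queue
import threading

# Hypothetical sentinel: signals that the writer has closed its end of the pipe.
END_OF_PIPE = object()


def writer(pipe, records):
    """Writer job: add each record to the back of the queue."""
    for record in records:
        pipe.put(record)  # blocks while the short queue is full
    pipe.put(END_OF_PIPE)


def reader(pipe, received):
    """Reader job: take records from the front of the queue."""
    while True:
        record = pipe.get()  # blocks while the queue is empty
        if record is END_OF_PIPE:
            break
        received.append(record)


def run_piped_jobs(records, depth=4):
    """Run the writer and the reader concurrently over a short bounded queue."""
    pipe = queue.Queue(maxsize=depth)  # depth is an assumed value, not a BatchPipes setting
    received = []
    jobs = [
        threading.Thread(target=writer, args=(pipe, records)),
        threading.Thread(target=reader, args=(pipe, received)),
    ]
    for job in jobs:
        job.start()
    for job in jobs:
        job.join()
    return received
```

Because the queue is bounded, a fast writer waits for the reader and a fast reader waits for the writer, so neither job needs the complete data set to exist before the other starts.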

A sort is a special case: all the input records must be read before the first output record can be written. Hence there can be no overlap between the input and output phases of a sort. But the input phase can be overlapped with the previous job's output phase. Similarly, the output phase of a sort can be overlapped with a downstream job that reads the sorted data.
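The ordering constraint on a sort can be illustrated with a small Python sketch. It is an analogy under stated assumptions: generators are interleaved lazily rather than run as truly concurrent jobs, and the names `upstream_writer`, `sort_stage`, `downstream_reader`, and `run_sort_pipeline` are hypothetical. The event log shows that every record is written upstream before the sort emits anything, while the sort's output is consumed record by record downstream.

```python
def upstream_writer(records, log):
    """Upstream job: emit records one at a time."""
    for record in records:
        log.append(("written", record))
        yield record


def sort_stage(incoming, log):
    """Sort: consume all input first, then emit the sorted output."""
    buffered = []
    for record in incoming:  # input phase, interleaved with the upstream writer
        buffered.append(record)
    buffered.sort()
    for record in buffered:  # output phase, interleaved with the downstream reader
        log.append(("sorted_out", record))
        yield record


def downstream_reader(incoming, log):
    """Downstream job: read each sorted record as it arrives."""
    result = []
    for record in incoming:
        log.append(("read", record))
        result.append(record)
    return result


def run_sort_pipeline(records):
    """Chain writer -> sort -> reader and return the result with the event log."""
    log = []
    result = downstream_reader(sort_stage(upstream_writer(records, log), log), log)
    return result, log
```

In the log, the sort's input phase interleaves with the writer and its output phase interleaves with the reader, but the two phases of the sort itself never overlap.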

The Sort/Merge utility is a mainframe program to sort records in a file into a specified order, merge pre-sorted files into a sorted file, or copy selected records. Internally, these utilities use one or more of the standard sorting algorithms, often with proprietary fine-tuned code.

Advanced pipe topologies

More complex topologies than "one reader one writer" are possible.

Criticism

One of the key implementation considerations is scheduling the reader and writer jobs to run together. In practical batch schedules this might not be feasible. Furthermore, if any job in the pipeline fails, recovery must cover more than that single job. For these reasons some installations have found it difficult to implement BatchPipes.

BatchPipePlex

BatchPipes can use the IBM mainframe Coupling Facility to pipe data between different members of a Parallel Sysplex, using the BatchPipePlex facility.

In IBM mainframe computers, a Coupling Facility or CF is a piece of computer hardware which allows multiple processors to access the same data.

In computing, a Parallel Sysplex is a cluster of IBM mainframes acting together as a single system image with z/OS. Used for disaster recovery, Parallel Sysplex combines data sharing and parallel computing to allow a cluster of up to 32 systems to share a workload for high performance and high availability.

BatchPipeWorks

BatchPipes includes a set of pipeline stages based on IBM's CMS Pipelines product developed for the VM/ESA operating system. These stages provide additional processing, without the need for additional batch jobs in the pipeline.

History

BatchPipes Version 1 was developed in the late 1980s and early 1990s simply as a technique to speed up MVS/ESA batch processing. In 1997 the functionality of BatchPipes was integrated into a larger IBM product, SmartBatch, which incorporated two BMC Corporation product features: DataAccelerator and BatchAccelerator. However, SmartBatch was discontinued in April 2000.

APT International, based in Monaco, produced a competing product trademarked as WARP. A few months after the launch of this product, IBM renamed its OS/2 product OS/2 Warp 4, which conflicted with the marketing of WARP, the only competitor to BatchPipes. This resulted in 7 years of litigation at the Tribunal de grande instance de Paris. [2] [3]

Subsequently, BatchPipes Version 2 was released, incorporating BatchPipes Version 1 and some additional features from SmartBatch: BatchPipePlex and BatchPipeWorks. BatchPipes Version 2 is still a marketed IBM product.

Related Research Articles

Multiple Virtual Storage, more commonly called MVS, was the most commonly used operating system on the System/370 and System/390 IBM mainframe computers. It was developed by IBM, but is unrelated to IBM's other mainframe operating systems, e.g., VSE, VM, TPF.

z/OS: 64-bit operating system for IBM mainframes

z/OS is a 64-bit operating system for IBM mainframes, produced by IBM. It derives from and is the successor to OS/390, which in turn followed a string of MVS versions. Like OS/390, z/OS combines a number of formerly separate, related products, some of which are still optional. z/OS offers the attributes of modern operating systems but also retains much of the functionality originating in the 1960s and each subsequent decade that is still found in daily use. z/OS was first introduced in October 2000.

Time Sharing Option (TSO) is an interactive time-sharing environment for IBM mainframe operating systems, including OS/360 MVT, OS/VS2 (SVS), MVS, OS/390, and z/OS.

CICS: transaction management system by IBM

Customer Information Control System (CICS) is a family of mixed language application servers that provide online transaction management and connectivity for applications on IBM mainframe systems under z/OS and z/VSE.

Spooling

In computing, spooling is a specialized form of multi-programming for the purpose of copying data between different devices. In contemporary systems it is usually used for mediating between a computer application and a slow peripheral, such as a printer. Spooling allows programs to "hand off" work to be done by the peripheral and then proceed to other tasks, or to not begin until input has been transcribed. A dedicated program, the spooler, maintains an orderly sequence of jobs for the peripheral and feeds it data at its own rate. Conversely, for slow input peripherals, such as a card reader, a spooler can maintain a sequence of computational jobs waiting for data, starting each job when all of the relevant input is available; see batch processing. The spool itself refers to the sequence of jobs, or the storage area where they are held. In many cases the spooler is able to drive devices at their full rated speed with minimal impact on other processing.

CMS Pipelines: implementation of the pipeline concept for VM/CMS systems

CMS Pipelines implements the pipeline concept under the VM/CMS operating system. The programs in a pipeline operate on a sequential stream of records. A program writes records that are read by the next program in the pipeline. Any program can be combined with any other because reading and writing is done through a device independent interface.

In software engineering, a pipeline consists of a chain of processing elements, arranged so that the output of each element is the input of the next; the name is by analogy to a physical pipeline. Usually some amount of buffering is provided between consecutive elements. The information that flows in these pipelines is often a stream of records, bytes, or bits, and the elements of a pipeline may be called filters; this is also called the pipes and filters design pattern. Connecting elements into a pipeline is analogous to function composition.

This article discusses support programs included in or available for OS/360 and successors. IBM categorizes some of these programs as utilities and others as service aids; the boundaries are not always consistent or obvious. Many, but not all, of these programs match the types in utility software.

The Job Entry Subsystem (JES) is a component of IBM's mainframe operating systems that is responsible for managing batch workloads. In modern times, there are two distinct implementations of the Job Entry Subsystem, called JES2 and JES3. They are designed to provide efficient execution of batch jobs.

The System Display and Search Facility (SDSF) component of IBM's mainframe operating system, z/OS, is an interactive user interface that allows users and administrators to view and control various aspects of the mainframe's operation and system resources. The information displayed in SDSF includes batch job output, Unix processes, scheduling environments, and the status of external devices such as printers and network lines. SDSF is primarily used to access the batch and system log files and dumps.

The Houston Automatic Spooling Priority Program, commonly known as HASP, is an extension of the IBM OS/360 operating system and its successors providing extended support for "job management, data management, task management, and remote job entry."

Operating System/Virtual Storage 1, or OS/VS1, is a discontinued IBM mainframe computer operating system designed to be run on IBM System/370 hardware. It was the successor to the Multiprogramming with a Fixed number of Tasks (MFT) option of System/360's operating system OS/360. OS/VS1, in comparison to its predecessor, supported virtual memory. OS/VS1 was generally available during the 1970s and 1980s, and it is no longer supported by IBM.

In IBM mainframe operating systems from the OS/360 and successors line, a Unit Control Block (UCB) is a memory structure, or a control block, that describes any single input/output peripheral device (unit), or an exposure (alias), to the operating system. Certain data within the UCB also instructs the Input/Output Supervisor (IOS) to use certain closed subroutines in addition to normal IOS processing for additional physical device control.

An access method is a function of a mainframe operating system that enables access to data on disk, tape or other external devices. Access methods were introduced in 1963 in the IBM OS/360 operating system. They provide an application programming interface (API) for programmers to transfer data to or from a device, and could be compared to device drivers in non-mainframe operating systems, but typically provide a greater level of functionality.

The history of operating systems running on IBM mainframes is a notable chapter in the history of mainframe operating systems, because of IBM's long-standing position as the world's largest hardware supplier of mainframe computers.

OS/360 and successors: operating system for IBM mainframes

OS/360, officially known as IBM System/360 Operating System, is a discontinued batch processing operating system developed by IBM for their then-new System/360 mainframe computer, announced in 1964; it was heavily influenced by the earlier IBSYS/IBJOB and Input/Output Control System (IOCS) packages. It was one of the earliest operating systems to require the computer hardware to include at least one direct access storage device.

In the original S/360 and S/370 architectures, each processor had its own set of I/O channels and addressed I/O devices with a 12-bit cuu address, containing a 4-bit channel number and an 8-bit unit (device) number to be sent on the channel bus in order to select the device; the operating system had to be configured to reflect the processor and cuu address for each device. The operating system had logic to queue pending I/O on each channel and to handle selection of alternate channels. Initiating an I/O to a channel on a different processor required causing a shoulder tap interrupt on the other processor so that it could initiate the I/O.

Distributed Data Management Architecture

Distributed Data Management Architecture (DDM) is IBM's open, published software architecture for creating, managing and accessing data on a remote computer. DDM was initially designed to support record-oriented files; it was extended to support hierarchical directories, stream-oriented files, queues, and system command processing; it was further extended to be the base of IBM's Distributed Relational Database Architecture (DRDA); and finally, it was extended to support data description and conversion. Defined in the period from 1980 to 1993, DDM specifies necessary components, messages, and protocols, all based on the principles of object-orientation. DDM is not, in itself, a piece of software; the implementation of DDM takes the form of client and server products. As an open architecture, products can implement subsets of DDM architecture and products can extend DDM to meet additional requirements. Taken together, DDM products implement a distributed file system.

References