Virtual Storage Access Method

Virtual Storage Access Method (VSAM) [1] is an IBM direct-access storage device (DASD) file storage access method, first used in the OS/VS1, OS/VS2 Release 1 (SVS) and Release 2 (MVS) operating systems, later used throughout the Multiple Virtual Storage (MVS) architecture and now in z/OS. Originally a record-oriented filesystem, [NB 2] VSAM comprises four [NB 2] data set organizations: key-sequenced (KSDS), relative record (RRDS), entry-sequenced (ESDS) and linear (LDS). [2] The KSDS, RRDS and ESDS organizations contain records, while the LDS organization (added later to VSAM) contains a sequence of pages with no intrinsic record structure, for use as a memory-mapped file.

Overview

An IBM Redbook named "VSAM PRIMER" (especially when used with the "Virtual Storage Access Method (VSAM) Options for Advanced Applications" manual) explains the concepts needed to make use of VSAM. [3] IBM uses the term data set in official documentation as a synonym for file, and direct-access storage device (DASD) for devices with random access to data locations, such as disk drives, as opposed to devices such as tape drives that can only be read sequentially.

VSAM records can be of fixed or variable length. They are organised in fixed-size blocks called control intervals (CIs), [4] [5] and then into larger divisions called control areas (CAs). Control interval sizes are measured in bytes (for example, 4 kilobytes), while control area sizes are measured in disk tracks or cylinders. Control intervals are the units of transfer between disk and computer, so a read request will read one complete control interval. Control areas are the units of allocation, so when a VSAM data set is defined, an integral number of control areas will be allocated.
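The allocation arithmetic described above can be sketched in Python. This is only an illustration: the function name and the figures passed to it (records per CI, CIs per CA) are assumptions chosen for the example, since the real number of CIs in a CA depends on the CI size and on the track and cylinder capacity of the device.

```python
import math

def control_areas_needed(record_count, records_per_ci, cis_per_ca):
    """Return (control intervals, control areas) needed to hold the records.

    Simplifications (assumptions): every CI holds the same number of
    records, and no free space is reserved.
    """
    cis = math.ceil(record_count / records_per_ci)
    cas = math.ceil(cis / cis_per_ca)  # space is allocated in whole CAs
    return cis, cas

# Hypothetical figures: 10,000 records, 40 records per CI, 180 CIs per CA.
cis, cas = control_areas_needed(10_000, 40, 180)
print(cis, cas)
```

Because allocation is rounded up to whole control areas, the last control area is usually only partly used.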

The Access Method Services utility program IDCAMS is commonly used to manipulate ("delete and define") VSAM data sets. Custom programs can access VSAM data sets through Data Definition (DD) statements in Job Control Language (JCL), via dynamic allocation, or in online regions such as Customer Information Control System (CICS).

Both IMS/DB [ citation needed ] and Db2 [2] :41 [6] are implemented on top of VSAM and use its underlying data structures.

Files

The physical organization of VSAM data sets differs considerably from the organizations used by other access methods, as follows.

A VSAM file is defined as a cluster of VSAM components, e.g., for KSDS a DATA component and an INDEX component.

Control intervals and control areas

VSAM components consist of fixed-length physical blocks grouped into fixed-length control intervals [4] [5] (CI) and control areas (CA). The sizes of the CI and CA are determined by Access Method Services (AMS), and the way in which they are used is normally not visible to the user. There is a fixed number of control intervals in each control area.

A control interval normally contains multiple records. The records are stored within the control interval starting from the low address upwards. Control information is stored at the other end of the control interval, starting from the high address and moving downwards. The space between the records and the control information is free space. The control information comprises two types of entry: a control interval descriptor field (CIDF) which is always present, and record descriptor fields (RDF) which are present when there are records within the control interval and describe the length of the associated record. Free space within a CI is always contiguous.
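The layout can be modelled with a short Python sketch. The field sizes used here (a 4-byte CIDF and a 3-byte RDF) are assumptions that do not come from the text above, and the model keeps one RDF per record, which is a simplification. The function name `ci_layout` is hypothetical.

```python
CIDF_LEN = 4  # assumed size of the control interval descriptor field
RDF_LEN = 3   # assumed size of one record descriptor field

def ci_layout(ci_size, record_lengths):
    """Return the byte ranges of records, free space and control information.

    Records grow upward from offset 0; control information grows downward
    from the high end of the CI; whatever lies between is free space.
    """
    data_end = sum(record_lengths)
    control_len = CIDF_LEN + RDF_LEN * len(record_lengths)
    control_start = ci_size - control_len
    if data_end > control_start:
        raise ValueError("records do not fit in the control interval")
    return {
        "records": (0, data_end),
        "free_space": (data_end, control_start),
        "control_info": (control_start, ci_size),
    }

print(ci_layout(4096, [100, 250, 80]))
```

In this model the free space is always one contiguous range, matching the description above.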

When records are inserted into a control interval, they are placed in the correct order relative to other records. This may require records to be moved out of the way inside the control interval. Conversely, when a record is deleted, later records are moved down so that the free space remains contiguous. If there is not enough free space in a control interval for a record to be inserted, the control interval is split. Roughly half the records are stored in the original control interval while the remaining records are moved into a new control interval. The new control interval is taken from a pool of free control intervals within the same control area as the original control interval. If there is no remaining free control interval within that control area, the control area itself is split and the control intervals are distributed equally between the old and the new control areas.
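The insertion and split behaviour can be sketched in Python as follows. This is a toy model under stated assumptions: capacity is counted in records rather than bytes, a control area is a plain list of control intervals, an empty list stands for a free control interval, and the index update and the control area split itself are left out. All names are hypothetical.

```python
import bisect

def insert_record(control_area, ci_index, key, max_records):
    """Insert key into control_area[ci_index], splitting the CI if it is full.

    Assumes max_records >= 1 and that each CI is a sorted list of keys.
    """
    ci = control_area[ci_index]
    if len(ci) < max_records:
        bisect.insort(ci, key)  # keep records in key order
        return "inserted"
    # The CI is full: look for a free CI in the same control area.
    free_index = next((i for i, c in enumerate(control_area) if not c), None)
    if free_index is None:
        return "control area split required"
    half = len(ci) // 2
    control_area[free_index] = ci[half:]  # upper half moves to the free CI
    del ci[half:]                         # lower half stays where it was
    new_ci = control_area[free_index]
    bisect.insort(ci if key < new_ci[0] else new_ci, key)
    return "control interval split"

ca = [[10, 20, 30, 40], [], []]
print(insert_record(ca, 0, 25, max_records=4), ca)
```

After a split, both control intervals are about half full, which leaves room for further insertions near the same keys.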

You can use three types of record-orientated file organization with VSAM (the contents of linear data sets have no record structure):

Sequential organization

Records in an ESDS are stored in the order in which they are written and are accessed by address. [7] [8] [9] Records are loaded irrespective of their contents, and their byte addresses cannot be changed. An ESDS may also have an index defined over it to enable access via keys, by defining an alternate index.
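A minimal Python sketch of entry-sequenced behaviour follows. It is a toy model, not VSAM's on-disk format: it ignores control intervals and control information, and the class and method names are hypothetical. The byte address it returns plays the role of what VSAM documentation calls a relative byte address (RBA).

```python
class EntrySequencedStore:
    """Toy append-only store whose records are addressed by byte offset."""

    def __init__(self):
        self._data = bytearray()
        self._lengths = {}  # byte address -> record length

    def append(self, record: bytes) -> int:
        """Add a record at the end and return its byte address."""
        address = len(self._data)  # new records always go at the end
        self._data += record
        self._lengths[address] = len(record)
        return address

    def read(self, address: int) -> bytes:
        """Return the record that starts at the given byte address."""
        length = self._lengths[address]
        return bytes(self._data[address:address + length])

store = EntrySequencedStore()
first = store.append(b"first")
second = store.append(b"second!")
print(first, second, store.read(second))
```

Because nothing is ever inserted in the middle, an address handed out by `append` stays valid for the life of the store.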

Indexed organization

A KSDS has two parts: the index component and the data component. These may be stored on separate disk volumes.

While a basic KSDS only has one key (the primary key), alternate indices may be defined to permit the use of additional fields as secondary keys. An alternate index (AIX) is itself a KSDS.

The data structure used by a KSDS is nowadays known as a B+ tree. [10] [11]
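The idea of locating a record through the index component can be sketched in Python. The sketch makes several assumptions: it uses a single index level in which each entry is the highest key of one data control interval, whereas a real B+ tree has several levels, and it models each data control interval as a dictionary. The function names are hypothetical.

```python
import bisect

def find_control_interval(high_keys, key):
    """Return the position of the data CI that could hold key, or None.

    high_keys[i] is the highest key stored in data CI i, in ascending order.
    """
    i = bisect.bisect_left(high_keys, key)  # first CI whose high key >= key
    return i if i < len(high_keys) else None

def lookup(high_keys, data_cis, key):
    """Use the index to choose one CI, then search only that CI."""
    i = find_control_interval(high_keys, key)
    if i is None:
        return None
    return data_cis[i].get(key)

high_keys = ["D", "K", "Z"]
data_cis = [{"A": 1, "D": 2}, {"F": 3, "K": 4}, {"M": 5, "Z": 6}]
print(lookup(high_keys, data_cis, "F"))
```

Only one data control interval is examined per lookup, which is what makes keyed access cheap compared with scanning the data component.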

Relative organization

Records in an RRDS are accessed by relative record number, that is, by their ordinal position in the data set. Unlike a KSDS or an ESDS, an RRDS cannot have an alternate index defined over it.

Linear organization

An LDS is an unstructured VSAM data set with a control interval size that is a multiple of 4 KB. It is used by certain system services.

Data access techniques

There are four types of access techniques for VSAM data:

Sharing data

VSAM data can be shared between CICS regions using VSAM Record-Level Sharing (RLS). This adds record caching and, more importantly, record locking. Logging and commit processing remain the responsibility of CICS, which means that sharing of VSAM data outside a CICS environment is severely restricted.

Sharing between CICS regions and batch jobs requires Transactional VSAM, DFSMStvs. This is an optional program that builds on VSAM RLS by adding logging and two-phase commit, using underlying z/OS system services. This permits generalised sharing of VSAM data.

History

VSAM was introduced as a replacement for older access methods [14] and was intended to add function, to be easier to use and to overcome problems of performance and device-dependence. VSAM was introduced in the 1970s when IBM announced virtual storage operating systems (DOS/VS, OS/VS1 and OS/VS2) for its new System/370 series, as successors of the DOS/360 and OS/360 operating systems running on its System/360 computer series. While backwards compatibility was maintained, the older access methods suffered from performance problems due to the address translation required for virtual storage.

The KSDS organization was designed to replace ISAM, the Indexed Sequential Access Method. Changes in disk technology had meant that searching for data in ISAM data sets had become very inefficient. It was also difficult to move ISAM data sets as there were embedded pointers to physical disk locations which became invalid if the data set was moved. IBM also provided a compatibility interface to allow programs coded to use ISAM to use a KSDS instead.

The RRDS organization was designed to replace BDAM, the Basic Direct Access Method. In some cases, BDAM data sets contained embedded pointers which prevented them from being moved. However, most BDAM data sets did not and the incentive to move from BDAM to VSAM RRDS was much less compelling than that to move from ISAM to VSAM KSDS.

Linear data sets were added later, followed by VSAM RLS and then Transactional VSAM.

Notes

  1. No longer used.
  2. With the exception of catalogs, page spaces and swap [NB 1] spaces, which unauthorized applications could access only via specialized OS services. VSAM has also long been available in VSE and is used in z/VSE.

References

  1. "New Life for Legacy Systems at LaBarge". Datamation. May 11, 2007.
  2. Lovelace, Mary; Dovidauskas, Jose; Salla, Alvaro; Sokal, Valeria (August 2022). "1.3.2 Record management". VSAM Demystified (PDF). Redbooks (3 ed.). IBM. p. 5.
  3. "VSAM Primer".
  4. "VSAM – Components".
  5. "Control Interval Size Limitations". IBM. 27 March 2014.
  6. "User's Guide" (PDF).
  7. "VSAM: introductory".
  8. "Server Functionality". Sequential (VSAM ESDS – Entry Sequenced Dataset)
  9. "ABCs of z/OS System Programming Volume 3". CiteSeerX 10.1.1.469.8853. An ESDS VSAM data set contains records in the order in which they were entered
  10. "US Patent for Providing record-level alternate-index upgrade locking".
  11. "What is VSAM?". This index is called a B+ tree.
  12. "Local shared resources (LSR) or nonshared resources". IBM.
  13. "Sharing VSAM Data Sets". IBM.com (IBM Knowledge Center). describes considerations for sharing VSAM data sets for NSR or LSR/GSR
  14. OS/Virtual Storage 1 Features Supplement (PDF) (First ed.). IBM. August 1972. GC20-1752-0.