Tape library

In computer storage, a tape library is a physical area that holds magnetic data tapes. In an earlier era, tape libraries were maintained by people known as tape librarians and by computer operators, and the proper operation of the library was crucial to the running of batch processing jobs. Although tape libraries of this era were not automated, tape management system software could assist in running them.

Subsequently, tape libraries became physically automated, and as such are sometimes called tape silos, tape robots, or tape jukeboxes. These are storage devices that contain one or more tape drives, a number of slots to hold tape cartridges, a barcode reader to identify tape cartridges, and an automated method for loading tapes (a robot). Such solutions are mostly used for backups and for digital archiving. Additionally, the area where tapes that are not currently in a silo are stored is also called a tape library. One of the earliest examples was the IBM 3850 Mass Storage System (MSS), announced in 1974.

In either era, tape libraries can contain millions of tapes.

Manual era

A manual magnetic tape library, common in the 1960s and 1970s. Rolling carts are used by staff to transfer tapes between the racks in the library and the computer room where the tape drives reside.

Tapes and batch processing

In the mainframe computer era, especially on IBM mainframes, the most common tape format in use was the 9-track tape. [1] Some large application systems could require scores of different tapes as part of their batch job runs. [2]

In the data processing applications of the era, the master files for such things as employee payroll information, supplies and stores inventory, or customer accounts were typically kept on tape. [3] [4] Batch jobs to update these master files would take the existing tape master file as input and write out a new tape master file as output. [5] In addition, the set of update transactions themselves might constitute a second input tape. [3] The master file output of one update job would then be the master file input to the next time the job is run, perhaps a day, a week, or a month later. [1] The tapes representing a few past iterations of a master file would typically be retained, in case a problem with the latest version were to be discovered and the job had to be rerun. [1]
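
The update cycle described above can be sketched as a sequential merge of two sorted inputs. The following is a minimal illustration in Python, assuming that both the old master file and the transaction file are sorted by record key. The record layout, the name update_master, and the add/change/delete transaction codes are hypothetical and are not drawn from any particular system of the era.

```python
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, int]            # (key, balance): hypothetical layout
Transaction = Tuple[str, str, int]  # (key, code, amount); code is "A", "C" or "D"


def update_master(old_master: Iterable[Record],
                  transactions: Iterable[Transaction]) -> Iterator[Record]:
    """Yield the records of a new master file.

    Both inputs must be sorted by key and are read strictly front to
    back, the way a program would read them from tape.
    """
    masters = iter(old_master)
    txns = iter(transactions)
    master = next(masters, None)
    txn = next(txns, None)

    while master is not None or txn is not None:
        # Work on the lowest key still pending on either input.
        if txn is None or (master is not None and master[0] <= txn[0]):
            key, balance = master
            present = True
            master = next(masters, None)
        else:
            # The key appears only in the transactions (a new record).
            key, balance, present = txn[0], 0, False

        # Apply every transaction that carries this key, in order.
        while txn is not None and txn[0] == key:
            _, code, amount = txn
            if code == "A" and not present:
                balance, present = amount, True      # add a new record
            elif code == "C" and present:
                balance += amount                    # change an existing record
            elif code == "D" and present:
                present = False                      # delete an existing record
            else:
                raise ValueError(f"invalid transaction {txn!r}")
            txn = next(txns, None)

        if present:
            yield key, balance


old = [("1001", 500), ("1002", 250), ("1004", 75)]
updates = [("1002", "C", -50), ("1003", "A", 300), ("1004", "D", 0)]
new_master = list(update_master(old, updates))
```

Because the old master is only read and never modified, the input reel remains available as a fallback, which is consistent with the practice of retaining a few past iterations in case a job has to be rerun.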

Role of tape libraries and librarians

Tape Retention / Scratch Control form, in triplicate

Mainframe computer installations often had a separate room, the tape library, to house their racks and cabinets of tapes. [1] The typical workflow for running a batch job was to go into the library, pull certain tapes off the racks there and load them onto a rolling cart, move the cart into the computer area, mount the tapes onto tape drives for a production run, take the tapes off the drives when the run was over, move the cart back to the library, and put the tapes back on the library racks. Such tape libraries existed at most computer installations. [6]

Even a modestly sized computer installation could have hundreds of tapes, [4] and library sizes of several thousand reels of tapes were commonplace. [6] And they could be much larger: by the mid-1970s, the U.S. Census Bureau and NASA each had tape libraries with around one million tape reels in them. [2] The person in charge of all this was typically called the tape librarian. [1] [4]

In this era, there were no automated tape delivery and mounting systems, and so this work had to be done by computer operators. [6] They were the ones responsible for mounting tapes onto tape drives as part of running a job. [1] Even careful computer operators could sometimes mount the wrong tape as input to a job or present the reels of a multi-tape dataset out of order. [2] Overwriting a tape that was meant to be preserved was another potential mistake. [4]

It was the tape librarian's responsibility to set up procedures for the handling of tapes to minimize the chances of errors taking place. [4] As one book of the era wrote, "keeping track of the whereabouts of the tapes is a formidable and responsible job." [1]

Supporting software

Tape management systems of this era were software packages whose purpose was to help facilitate tape library operations and management. They kept track of data sets on tape, and produced reports indicating whether a data set should be retained on, or could be scratched from, a tape; they aided in the setup and running of scheduled production jobs, through such things as tape pull lists and pre-printed external gummed tape labels; and they kept track of the physical inventory of tape reels. The most popular of these packages was UCC-1 from University Computing Company, [7] a product that was also known as the Tape Management System. [8] It made several appearances on Datapro Research Corporation's Software Honor Roll. [7] Another was Valu-Lib from Value Computing, Inc., [9] [10] and a third was TLMS II from Capex Corporation. [11]
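
The retain-or-scratch decision that such packages reported on can be illustrated with a small sketch. This is not a reconstruction of any of the products named above. It assumes a simple rule in which a tape becomes eligible for scratching once a fixed retention period has elapsed since its data set was written, and the TapeRecord fields, the scratch_report name, and the sample volume identifiers are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass
class TapeRecord:
    volume: str       # reel identifier (made-up format)
    dataset: str      # name of the data set written on the reel
    created: date     # date the data set was written
    retain_days: int  # assumed fixed retention period in days


def scratch_report(catalog: List[TapeRecord], today: date) -> Dict[str, List[str]]:
    """Split a catalog into volumes to retain and volumes that may be scratched."""
    report: Dict[str, List[str]] = {"retain": [], "scratch": []}
    for tape in catalog:
        age_days = (today - tape.created).days
        # A tape is scratch-eligible only after its retention period has passed.
        status = "scratch" if age_days > tape.retain_days else "retain"
        report[status].append(tape.volume)
    return report


catalog = [
    TapeRecord("000101", "PAYROLL.MASTER", date(1977, 1, 3), 30),
    TapeRecord("000102", "PAYROLL.MASTER", date(1977, 2, 21), 30),
]
report = scratch_report(catalog, today=date(1977, 3, 1))
# report == {"retain": ["000102"], "scratch": ["000101"]}
```

A real system would also need to account for multi-reel data sets and for tapes held for other reasons, which this sketch leaves out.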

As use of the mainframe continued on into the following century, tape library management, both manual and automatic, was one element of the offerings of the Data Facility Storage Management Subsystem (MVS) from IBM. [12]

Automated era

Large StorageTek Powderhorn tape library, showing tape cartridges with barcodes packed on shelves in the front and a robot arm moving in the back
Small ADIC Scalar 100 tape library, showing a robot visible on the bottom with two IBM LTO2 tape drives behind it

Design

Physically automated tape library devices can store immense amounts of data, ranging from 20 terabytes [13] up to 2.1 exabytes [14] as of 2016. Such capacity is thousands of times that of a typical hard drive and well in excess of what is possible with network-attached storage. Typical entry-level solutions cost around US$10,000, [15] while high-end solutions can start at as much as US$200,000 [16] and cost well in excess of US$1 million for a fully expanded and configured library.

For large-scale data storage, they are a cost-effective solution, with cost per gigabyte as low as 2 US cents. [17] The tradeoff for their larger capacity is slower access time, since reaching data usually involves mechanical manipulation of tapes. Access to data in a library takes from several seconds to several minutes.

Because of their slow sequential access and huge capacity, tape libraries are primarily used for backups and as the final stage of digital archiving. A typical application of the latter would be an organization's extensive transaction record for legal or auditing purposes. Another example is hierarchical storage management (HSM), in which a tape library is used to hold rarely used files from file systems.

Software support

There are several large-scale library-management packages available commercially. Open-source implementations include AMANDA, Bacula, and the minimal mtx program.

Barcode labels

Tape libraries commonly have the capability of optically scanning barcode labels which are attached to each tape, allowing them to automatically maintain an inventory of which tapes are where within the library. Preprinted barcode labels are commercially available or custom labels may be generated using commercial or free software. The barcode label is frequently part of the tape label, information recorded at the beginning of the medium to uniquely identify the tape.
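
The inventory that a library builds from its barcode scans can be pictured as a mapping from slot numbers to labels. The sketch below is only an illustration of that idea, not the firmware or software of any actual library; the LibraryInventory class, its method names, and the sample barcode values are hypothetical.

```python
from typing import Dict, Optional


class LibraryInventory:
    """Tracks which barcode label, if any, was last scanned in each slot."""

    def __init__(self, slot_count: int) -> None:
        # Slots are numbered from 1; None marks an empty or unread slot.
        self.slots: Dict[int, Optional[str]] = {
            n: None for n in range(1, slot_count + 1)
        }

    def record_scan(self, slot: int, barcode: Optional[str]) -> None:
        """Store the result of scanning one slot (None if no label was read)."""
        if slot not in self.slots:
            raise KeyError(f"no such slot: {slot}")
        self.slots[slot] = barcode

    def locate(self, barcode: str) -> Optional[int]:
        """Return the slot holding the given tape, or None if it is not present."""
        for slot, label in self.slots.items():
            if label == barcode:
                return slot
        return None


inventory = LibraryInventory(slot_count=4)
inventory.record_scan(1, "A00001")   # made-up label values
inventory.record_scan(3, "A00002")
slot = inventory.locate("A00002")    # 3
missing = inventory.locate("A00009") # None
```

Keeping the mapping keyed by slot mirrors how a scan proceeds, one slot at a time; a lookup by barcode then tells the robot which slot to fetch from.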

Autoloaders

Dell PowerVault 124T Autoloader

Smaller tape libraries with only one drive are known as autoloaders. [18] The term autoloader is also sometimes used synonymously with stacker, [19] a device in which the media are loaded necessarily in a sequential manner. [20]

Other types of autoloaders may operate with optical discs (such as compact discs or DVDs) or floppy disks. [citation needed]

References

  1. Popkin, Gary S.; Pike, Arthur H. (1977). Introduction to Data Processing. Boston: Houghton Mifflin Company. pp. 149–151, 260–263. ISBN 0-395-20628-6.
  2. McCracken, Daniel D. (1976). A Simplified Guide to Structured COBOL Programming. New York: John Wiley & Sons. pp. 259, 264. ISBN 0-471-58284-0.
  3. McQuillen, Kevin (1975). System/360–370 Assembler Language (OS). Fresno, California: Mike Murach & Associates. p. 302. LCCN 74-29645.
  4. Stern, Nancy; Stern, Robert A. (1980). Structured COBOL Programming (3rd ed.). New York: John Wiley & Sons. pp. 494, 496, 498–499. ISBN 0-471-04913-1.
  5. Ashley, Ruth; Fernandez, Judi N. (1978). Job Control Language: A Self-Teaching Guide. New York: John Wiley & Sons. p. 43. ISBN 0-471-03205-0.
  6. Conway, Richard; Gries, David (1973). An Introduction to Programming: A Structured Approach using PL/1 and PL/C. Cambridge, Massachusetts: Winthrop. pp. 333–334.
  7. Leavitt, Don (January 17, 1977). "Users Put 38 Packages on Honor Roll". Computerworld. p. 23.
  8. "UCC-1 Tape Management Updated with Release 4.7". Computerworld. July 4, 1983. p. 35.
  9. "'Valu-Lib' Can Run Tape Library, Can Interface With Scheduler". Computerworld. May 16, 1973. p. 15.
  10. "'Valu Lib' Update Released For IBM 4300s, Series/36". Computerworld. December 19, 1983. p. 32.
  11. "uncertain". Infosystems. Vol. uncertain. Hitchcock Publishing Company. 1980. p. 90. Archived from the original on March 25, 2023. Retrieved February 22, 2023.
  12. "Introduction to tape library management". IBM. April 5, 2023. Retrieved November 1, 2023.
  13. "HP StorageWorks MSL2024 Tape Library - overview". March 18, 2006. Archived from the original on March 18, 2006. Retrieved June 19, 2018.
  14. Oracle "StorageTek SL8500 Modular Library System".
  15. HP Small & Medium Business Online Store: HP StorageWorks MSL2024 Tape Libraries
  16. . Cites cost as "From $195,830. (US)"
  17. "The Costs Of Storage". Forbes.
  18. "SNIA Dictionary". Storage Network Industry Association. Retrieved 2010-01-30. tape autoloader...[Storage System] A tape device that provides automated access to multiple tape cartridges, typically via a single tape drive.
  19. "Ten common backup/restore related questions". Sun Microsystems, Inc. Retrieved 2010-01-30. What is a stacker (autoloader) vs a jukebox?
  20. "SNIA Dictionary". Storage Network Industry Association. Retrieved 2010-01-30. media stacker...[Data Recovery] A robotic media handler in which media must be moved sequentially by the robot.