Hierarchical file system

Example of a directory structure in a hierarchical file system

In computing, a hierarchical file system is a file system that uses directories to organize files into a tree structure. [1]

In a hierarchical file system, directories contain information about both files and other directories, called subdirectories, which in turn can point to further subdirectories, and so on. [2] The result is a tree structure, or hierarchy, generally portrayed with the root at the top. The root directory is the base of the hierarchy and is usually stored at a fixed location on disk.
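
The tree described above can be illustrated with a minimal in-memory sketch. The `Directory` class, the `walk` function, and the example names (`home`, `alice`, `etc`, `hosts`) are hypothetical and chosen only for illustration; real file systems store this information in on-disk structures rather than in objects like these.

```python
from dataclasses import dataclass, field


@dataclass
class Directory:
    """A directory node holding file names and named subdirectories."""
    name: str
    files: set[str] = field(default_factory=set)
    subdirs: dict[str, "Directory"] = field(default_factory=dict)

    def mkdir(self, name: str) -> "Directory":
        """Create the named subdirectory, or return it if it already exists."""
        return self.subdirs.setdefault(name, Directory(name))


def walk(directory: Directory, prefix: str = "") -> list[str]:
    """Return every file in the tree as a '/'-separated path from the root."""
    here = f"{prefix}/{directory.name}" if directory.name else prefix
    paths = [f"{here}/{f}" for f in sorted(directory.files)]
    for name in sorted(directory.subdirs):
        paths.extend(walk(directory.subdirs[name], here))
    return paths


root = Directory("")  # assumption: the root has an empty name in this sketch
home = root.mkdir("home")
alice = home.mkdir("alice")
alice.files.add("notes.txt")
root.mkdir("etc").files.add("hosts")

listing = walk(root)
print(listing)
```

Each directory refers only to its own files and immediate subdirectories; the full hierarchy emerges from following those references downward from the root.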

A hierarchical file system contrasts with a flat file system, where information about all files is stored in a single directory, and there are no subdirectories.

Almost all file systems today are hierarchical. What is usually called a file system is a specific implementation of a hierarchical system. For example, NTFS, HPFS, and ext4 all implement a hierarchical system, with different features for buffering, file allocation, and file recovery.

Concepts

Path

A file path is a string of characters that contains the location of a file in a computer's file structure. [3] [4] That is, it represents the directory nodes visited from the root directory to the file as a list of node names, with the items in the list separated by path separators. The path separator is > on Multics, [5] / on Unix-like systems, [6] and \ on MS-DOS 2.0 and later, Windows, and OS/2 systems.

An absolute path begins at the root directory; that is, it begins with a path separator character, which at the beginning of a path represents the root directory. A path consisting only of a path separator character refers to the root directory. [7]

Working directory

The working directory is a directory dynamically associated with each process. Files named by a relative path are located relative to the working directory rather than from the root directory. At logon, the user's working directory is set to their home directory; it can be changed afterwards by using a command. [8]

A relative path represents the directory nodes visited from the working directory to the file, rather than from the root directory to the file. A relative path does not begin with a path separator character. [7]
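
The distinction between absolute and relative paths can be sketched with Python's standard `pathlib` module, using Unix-style paths with `/` as the separator. The working directory `/home/alice` and the file names are assumptions made for the example only; `PurePosixPath` manipulates path strings without touching any real file system.

```python
from pathlib import PurePosixPath

# Hypothetical working directory of a process.
cwd = PurePosixPath("/home/alice")

absolute = PurePosixPath("/etc/hosts")        # begins with the separator
relative = PurePosixPath("docs/report.txt")   # does not begin with it

# A relative path is interpreted starting from the working directory.
resolved = cwd / relative

# Joining an absolute path onto the working directory leaves it unchanged,
# because an absolute path already starts at the root.
unchanged = cwd / absolute

# A path consisting only of the separator refers to the root directory.
root_only = PurePosixPath("/")

print(absolute.is_absolute(), relative.is_absolute())
print(resolved)
print(resolved.parts)
```

The `parts` attribute shows the list of node names visited from the root, which is the list that the path string encodes.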

History

Multics

Multics was the first operating system to provide a hierarchical file system. [9] [10] File names of any length or format, multiple names for a particular file or directory, and symbolic links (symlinks) between directories were among the many features Multics introduced with its hierarchical file system. [11]

IBM OS/360

OS/360 has had a hierarchical system since its earliest days. [12]

The system catalog is layered on top of a flat file system, where a disk file's "directory entry" is stored in each volume's Volume Table of Contents (VTOC), and files can be referenced either through the catalog or by specifying the volume serial number directly. Specifying a dataset name without specifying a serial number in JCL implies a catalog search. The catalog stores only a pointer to the volume. [13]

The OS/360 catalog has a number of differences from other hierarchical file systems. The "root directory", a dataset named SYSCTLG, may contain file pointers or links to subdirectories called CVOLS. [13] The path separator is a period (.). All paths are absolute paths; a leading . is always assumed and cannot be coded. Path components (between periods) are limited to eight characters. The catalog is not restricted to disk (DASD) files, but can also catalog magnetic tape datasets. Not all volumes indicated in the catalog have to be online, and the system allows disk and tape volumes to be mounted on request following a catalog search.

Unlike most other systems, it is possible to create a file without cataloging it. Cataloging is requested through JCL or a utility program.

A cataloged dataset name might look like PAYROLL.WEEKLY.HOURS.
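
The naming rules stated above (a period as separator, no leading period, components of at most eight characters) can be sketched as a small validation routine. The function name `split_dataset_name` is hypothetical, and the sketch checks only the rules given in this article; real systems impose further restrictions that are not modelled here.

```python
MAX_COMPONENT_LENGTH = 8  # per the text: components are limited to eight characters


def split_dataset_name(name: str) -> list[str]:
    """Split a cataloged dataset name into its period-separated components.

    Only the rules described in the article are checked; this is an
    illustrative sketch, not a complete validator.
    """
    if name.startswith("."):
        raise ValueError("a leading '.' is implied and cannot be coded")
    components = name.split(".")
    for component in components:
        if not 1 <= len(component) <= MAX_COMPONENT_LENGTH:
            raise ValueError(f"invalid component length: {component!r}")
    return components


components = split_dataset_name("PAYROLL.WEEKLY.HOURS")
print(components)
```

Each component corresponds to one level of the catalog hierarchy, so `PAYROLL.WEEKLY.HOURS` names an entry three levels below the implied root.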

IBM virtual storage operating systems replaced SYSCTLG with a VSAM Catalog. Cataloging is mandatory for VSAM datasets but, as before, non-VSAM datasets may or may not be cataloged. The program "Access Method Services" (IDCAMS) maintains the VSAM Catalog.

Personal computers

The CP/M operating system uses a flat file system, with a directory containing information on a maximum of 64 files, which was adequate when a floppy disk held only 128 KB. IBM PC DOS and MS-DOS 1.0 inherited the same structure. DOS 2.0, which supported hard disk drives, introduced a hierarchical file system. A hierarchical file system was chosen over simply expanding the flat directory for performance reasons: "A flat DOS file structure with a single directory and 10 times as many files would logically require 10 times as long to search." [2] OS/2 and Windows also support a hierarchical file system, using the same path syntax as DOS.
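
The performance argument in the quotation can be illustrated with a rough sketch that counts comparisons, assuming each directory is searched by a simple linear scan. The directory sizes (640 files, arranged either flat or as 10 subdirectories of 64 files) and the names `FILEnnn` and `DIRn` are hypothetical, chosen only to echo the "10 times as many files" figure; the sketch does not model how DOS actually stored or searched directories.

```python
def linear_lookup(entries: list[str], name: str) -> int:
    """Return how many comparisons a linear scan needs to find name."""
    for count, entry in enumerate(entries, start=1):
        if entry == name:
            return count
    raise FileNotFoundError(name)


# Flat layout: one directory holding all 640 files.
flat = [f"FILE{i:03d}" for i in range(640)]
flat_cost = linear_lookup(flat, "FILE639")

# Hierarchical layout: 10 subdirectories of 64 files each.
dir_names = [f"DIR{d}" for d in range(10)]
tree = {
    f"DIR{d}": [f"FILE{i:03d}" for i in range(d * 64, (d + 1) * 64)]
    for d in range(10)
}
tree_cost = (
    linear_lookup(dir_names, "DIR9")          # find the subdirectory
    + linear_lookup(tree["DIR9"], "FILE639")  # then find the file inside it
)

print(flat_cost, tree_cost)
```

Under these assumptions the worst-case search of the flat directory takes 640 comparisons, while the two-level search takes 10 + 64 = 74.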

References

  1. Sharma, Vivek; Varshney, Manish; Sharma, Shantanu (2010). Design and Implementation of Operating System. University Science Press. p. 346. ISBN 9789380386416. Retrieved February 22, 2023.
  2. Stephens, Mark (June 6, 1988). "Netware, LAN Manager Spar in Network Software Contest". InfoWorld. Vol. 10, no. 23. Retrieved February 22, 2023.
  3. "HTML File Paths". W3Schools. Retrieved February 22, 2023.
  4. "General | File Paths". Codecademy. Retrieved March 20, 2024.
  5. New User's Introduction to Multics - Part I (PDF). Honeywell Information Systems. November 1979. p. 6-4. CH24-00.
  6. Ritchie, Dennis M.; Thompson, Ken (July 1974). "The UNIX Time-Sharing System". Communications of the ACM. 17 (7): 365–375. doi:10.1145/361011.361061.
  7. Rosenthal, Hanaan (2007). AppleScript: The Comprehensive Guide to Scripting and Automation on Mac OS X (2nd ed.). Apress. p. 703. ISBN 978-1-59059-653-1. Retrieved February 23, 2023.
  8. "What is a Current Directory?". www.computerhope.com. Retrieved March 20, 2024.
  9. "Multics Glossary -F-". www.multicians.org. Retrieved April 11, 2018.
  10. R. C. Daley and P. G. Neumann, "A general-purpose file system for secondary storage", AFIPS '65 (Fall, part I) Proceedings of the November 30 – December 1, 1965
  11. "What is MULTICS?". GeeksforGeeks. January 1, 2024. Retrieved March 20, 2024.
  12. IBM Corporation (1965). IBM Operating System/360 Concepts and facilities (PDF). pp. 10–11. Retrieved February 24, 2020.
  13. Stephens, David. "The Best Mainframe DASD and Dataset Innovations". Longpela Mainframe Quarterly. Longpela Expertise. Retrieved February 24, 2023.