Unix filesystem

Last updated
Version 7 Unix filesystem layout: subdirectories of "/" and "/usr" Version 7 UNIX SIMH PDP11 Filesystem Layout.png
Version 7 Unix filesystem layout: subdirectories of "/" and "/usr"
An overview of a Unix filesystem layout Standard-unix-filesystem-hierarchy.svg
An overview of a Unix filesystem layout

In Unix and operating systems inspired by it, the file system is considered a central component of the operating system. [1] It was also one of the first parts of the system to be designed and implemented by Ken Thompson in the first experimental version of Unix, dated 1969. [2]

Contents

As in other operating systems, the filesystem provides information storage and retrieval, and one of several forms of interprocess communication, in that the many small programs that traditionally form a Unix system can store information in files so that other programs can read them, although pipes complemented it in this role starting with the Third Edition. Also, the filesystem provides access to other resources through so-called device files that are entry points to terminals, printers, and mice.

The rest of this article uses Unix as a generic name to refer to both the original Unix operating system and its many workalikes.

Principles

The filesystem appears as one rooted tree of directories. [1] Instead of addressing separate volumes such as disk partitions, removable media, and network shares as separate trees (as done in DOS and Windows: each drive has a drive letter that denotes the root of its file system tree), such volumes can be mounted on a directory, causing the volume's file system tree to appear as that directory in the larger tree. [1] The root of the entire tree is denoted /.

In the original Bell Labs Unix, a two-disk setup was customary, where the first disk contained startup programs, while the second contained users' files and programs. This second disk was mounted at the empty directory named usr on the first disk, causing the two disks to appear as one filesystem, with the second disk’s contents viewable at /usr.

Unix directories do not contain files. Instead, they contain the names of files paired with references to so-called inodes, which in turn contain both the file and its metadata (owner, permissions, time of last access, etc., but no name). Multiple names in the file system may refer to the same file, a feature termed a hard link . [1] The mathematical traits of hard links make the file system a limited type of directed acyclic graph, although the directories still form a tree, as they typically may not be hard-linked. (As originally envisioned in 1969, the Unix file system would in fact be used as a general graph with hard links to directories providing navigation, instead of path names. [2] )

File types

The original Unix file system supported three types of files: ordinary files, directories, and "special files", also termed device files. [1] The Berkeley Software Distribution (BSD) and System V each added a file type to be used for interprocess communication: BSD added sockets, [3] while System V added FIFO files.

BSD also added symbolic links (often termed "symlinks") to the range of file types, which are files that refer to other files, and complement hard links. [3] Symlinks were modeled after a similar feature in Multics, [4] and differ from hard links in that they may span filesystems and that their existence is independent of the target object. Other Unix systems may support additional types of files. [5]

Conventional directory layout

Certain conventions exist for locating some kinds of files, such as programs, system configuration files, and users' home directories. These were first documented in the hier(7) man page since Version 7 Unix; [6] subsequent versions, derivatives and clones typically have a similar man page. [7] [8] [9] [10] [11] [12]

The details of the directory layout have varied over time. Although the file system layout is not part of the Single UNIX Specification, several attempts exist to standardize (parts of) it, such as the System V Application Binary Interface, the Intel Binary Compatibility Standard, the Common Operating System Environment, and Linux Foundation's Filesystem Hierarchy Standard (FHS). [13]

Here is a generalized overview of common locations of files on a Unix operating system:

Directory or fileDescription
/The slash / character alone denotes the root of the filesystem tree.
/bin
Stands for binaries and contains certain fundamental utilities, such as ls or cp, that are needed to mount /usr, when that is a separate filesystem, or to run in one-user (administrative) mode when /usr cannot be mounted. In System V.4, this is a symlink to /usr/bin. Otherwise, it needs to be on the root filesystem itself.
/boot
Contains all the files needed for successful booting process. In Research Unix, this was one file rather than a directory. [14] Nowadays usually on the root filesystem itself, unless the system, bootloader etc. require otherwise.
/dev
Stands for devices. Contains file representations of peripheral devices and pseudo-devices. See also: Linux Assigned Names and Numbers Authority. Needs to be on the root filesystem itself.
/etc
Contains system-wide configuration files and system databases; the name stands for et cetera [14] but now a better expansion is editable-text-configurations. Originally also contained "dangerous maintenance utilities" such as init, [6] but these have typically been moved to /sbin or elsewhere. Needs to be on the root filesystem itself.
/home
Contains user home directories on Linux and some other systems. In the original version of Unix, home directories were in /usr instead. [15] Some systems use or have used different locations still: macOS has home directories in /Users, older versions of BSD put them in /u, FreeBSD has /usr/home.
/lib
Originally essential libraries: C libraries, but not Fortran ones. [14] On modern systems, it contains the shared libraries needed by programs in /bin, and possibly loadable kernel module or device drivers. Linux distributions may have variants /lib32 and /lib64 for multi-architecture support.
/media
Default mount point for removable devices, such as USB sticks, media players, etc. By common sense, the directory itself, whose subdirectories are mountpoints, is on the root partition itself.
/mnt
Stands for mount. Empty directory commonly used by system administrators as a temporary mount point. By common sense, the directory itself, whose subdirectories are mountpoints, is on the root partition itself.
/opt
Contains locally installed software. Originated in System V, which has a package manager that installs software to this directory (one subdirectory per package). [16]
/proc
procfs virtual filesystem showing information about processes as files.
/root
The home directory for the superuser root - that is, the system administrator. This account's home directory is usually on the initial filesystem, and hence not in /home (which may be a mount point for another filesystem) in case specific maintenance needs to be performed, during which other filesystems are not available. Such a case could occur, for example, if a hard disk drive suffers failures and cannot be properly mounted.
/sbin
Stands for "system (or superuser) binaries" and contains fundamental utilities, such as init, usually needed to start, maintain and recover the system. Needs to be on the root partition itself.
/srv
Server data (data for services provided by system).
/sys
In some Linux distributions, contains a sysfs virtual filesystem, containing information related to hardware and the operating system. On BSD systems, commonly a symlink to the kernel sources in /usr/src/sys.
/tmp
A place for temporary files not expected to survive a reboot. Many systems clear this directory upon startup or use tmpfs to implement it.
/unix
The Unix kernel in Research Unix and System V. [14] With the addition of virtual memory support to 3BSD, this got renamed /vmunix.
/usr
The "user file system": originally the directory holding user home directories, [15] but already by the Third Edition of Research Unix, ca. 1973, reused to split the operating system's programs over two disks (one of them a 256K fixed-head drive) so that basic commands would either appear in /bin or /usr/bin. [17] It now holds executables, libraries, and shared resources that are not system critical, such as the X Window System, window managers, scripting languages, etc. In older Unix systems, user home directories might still appear in /usr alongside directories containing programs, although by 1984 this depended on local customs. [14]
/include
Stores the development headers used throughout the system. Header files are mostly used by the #include directive in C language, which historically is how the name of this directory was chosen.
/lib
Stores the needed libraries and data files for programs stored within /usr or elsewhere.
/libexec
Holds programs meant to be executed by other programs rather than by users directly. E.g., the Sendmail executable may be found in this directory. [18] Not present in the FHS until 2011; [19] Linux distributions have traditionally moved the contents of this directory into /usr/lib, where they also resided in 4.3BSD.
/local
Resembles /usr in structure, but its subdirectories are used for additions not part of the operating system distribution, such as custom programs or files from a BSD Ports collection. Usually has subdirectories such as /usr/local/lib or /usr/local/bin.
/share
Architecture-independent program data. On Linux and modern BSD derivatives, this directory has subdirectories such as man for manpages, that used to appear directly under /usr in older versions.
/var
Stands for variable. A place for files that might change frequently - especially in size, for example e-mail sent to users on the system, or process-ID lock files.
/log
Contains system log files.
/mail
The place where all incoming mail is stored. Users (other than root) can access their own mail only. Often, this directory is a symbolic link to /var/spool/mail.
/spool
Spool directory. Contains print jobs, mail spools and other queued tasks.
/src
The place where the uncompiled source code of some programs is.
/tmp
The /var/tmp directory is a place for temporary files which should be preserved between system reboots.

See also

Related Research Articles

<span class="mw-page-title-main">Disk partitioning</span> Creation of separate accessible storage areas on a secondary computer storage device

Disk partitioning or disk slicing is the creation of one or more regions on secondary storage, so that each region can be managed separately. These regions are called partitions. It is typically the first step of preparing a newly installed disk, before any file system is created. The disk stores the information about the partitions' locations and sizes in an area known as the partition table that the operating system reads before any other part of the disk. Each partition then appears to the operating system as a distinct "logical" disk that uses part of the actual disk. System administrators use a program called a partition editor to create, resize, delete, and manipulate the partitions. Partitioning allows the use of different filesystems to be installed for different kinds of files. Separating user data from system data can prevent the system partition from becoming full and rendering the system unusable. Partitioning can also make backing up easier. A disadvantage is that it can be difficult to properly size partitions, resulting in having one partition with too much free space and another nearly totally allocated.

fsck System tool for checking the consistency of a file system

The system utility fsck is a tool for checking the consistency of a file system in Unix and Unix-like operating systems, such as Linux, macOS, and FreeBSD. The equivalent programs on MS-DOS and Microsoft Windows are CHKDSK, SFC, and SCANDISK.

In computing, tar is a computer software utility for collecting many files into one archive file, often referred to as a tarball, for distribution or backup purposes. The name is derived from "tape archive", as it was originally developed to write data to sequential I/O devices with no file system of their own, such as devices that use magnetic tape. The archive data sets created by tar contain various file system parameters, such as name, timestamps, ownership, file-access permissions, and directory organization. POSIX abandoned tar in favor of pax, yet tar sees continued widespread use.

In computing, a symbolic link is a file whose purpose is to point to a file or directory by specifying a path thereto.

<span class="mw-page-title-main">GoboLinux</span> Linux distribution

GoboLinux is a Linux distribution whose most prominent feature is a reorganization of the traditional Linux file system. Rather than following the Filesystem Hierarchy Standard like most Unix-like systems, each program in a GoboLinux system has its own subdirectory tree, where all of its files may be found. Thus, a program "Foo" has all of its specific files and libraries in /Programs/Foo, under the corresponding version of this program at hand. For example, the commonly known GCC compiler suite version 8.1.0, would reside under the directory /Programs/GCC/8.1.0.

The Unix file system (UFS) is a family of file systems supported by many Unix and Unix-like operating systems. It is a distant descendant of the original filesystem used by Version 7 Unix.

A home directory is a file system directory on a multi-user operating system containing files for a given user of the system. The specifics of the home directory are defined by the operating system involved; for example, Linux / BSD (FHS) systems use /home/⟨username⟩ or /usr/home/⟨username⟩ and Windows systems between 2000 and Server 2003 keep home directories in a folder named Documents and Settings.

The Filesystem Hierarchy Standard (FHS) is a reference describing the conventions used for the layout of Unix-like systems. It has been made popular by its use in Linux distributions, but it is used by other Unix-like systems as well. It is maintained by the Linux Foundation. The latest version is 3.0, released on 3 June 2015.

ln (Unix) Unix file management utility

The ln command is a standard Unix command utility used to create a hard link or a symbolic link (symlink) to an existing file or directory. The use of a hard link allows multiple filenames to be associated with the same file since a hard link points to the inode of a given file, the data of which is stored on disk. On the other hand, symbolic links are special files that refer to other files by name.

<span class="mw-page-title-main">Virtual file system</span> Abstract layer on top of a more concrete file system

A virtual file system (VFS) or virtual filesystem switch is an abstract layer on top of a more concrete file system. The purpose of a VFS is to allow client applications to access different types of concrete file systems in a uniform way. A VFS can, for example, be used to access local and network storage devices transparently without the client application noticing the difference. It can be used to bridge the differences in Windows, classic Mac OS/macOS and Unix filesystems, so that applications can access files on local file systems of those types without having to know what type of file system they are accessing.

The inode is a data structure in a Unix-style file system that describes a file-system object such as a file or a directory. Each inode stores the attributes and disk block locations of the object's data. File-system object attributes may include metadata, as well as owner and permission data.

fstab is a system file commonly found in the directory /etc on Unix and Unix-like computer systems. In Linux, it is part of the util-linux package. The fstab file typically lists all available disk partitions and other types of file systems and data sources that may not necessarily be disk-based, and indicates how they are to be initialized or otherwise integrated into the larger file system structure.

cp (Unix) Unix command utility

In computing, cp is a command in various Unix and Unix-like operating systems for copying files and directories. The command has three principal modes of operation, expressed by the types of arguments presented to the program for copying a file to another file, one or more files to a directory, or for copying entire directories to another directory.

file (command) Standard Unix program

The file command is a standard program of Unix and Unix-like operating systems for recognizing the type of data contained in a computer file.

The proc filesystem (procfs) is a special filesystem in Unix-like operating systems that presents information about processes and other system information in a hierarchical file-like structure, providing a more convenient and standardized method for dynamically accessing process data held in the kernel than traditional tracing methods or direct access to kernel memory. Typically, it is mapped to a mount point named /proc at boot time. The proc file system acts as an interface to internal data structures about running processes in the kernel. In Linux, it can also be used to obtain information about the kernel and to change certain kernel parameters at runtime (sysctl).

In computing, the sticky bit is a user ownership access right flag that can be assigned to files and directories on Unix-like systems.

mount (Unix) Unix command to mount a filesystem

In computing, mount is a command in various operating systems. Before a user can access a file on a Unix-like machine, the file system on the device which contains the file needs to be mounted with the mount command. Frequently mount is used for SD card, USB storage, DVD and other removable storage devices. The command is also available in the EFI shell.

The following tables compare general and technical information for a number of file systems.

In Unix-like operating systems, a device file or special file is an interface to a device driver that appears in a file system as if it were an ordinary file. There are also special files in DOS, OS/2, and Windows. These special files allow an application program to interact with a device by using its device driver via standard input/output system calls. Using standard system calls simplifies many programming tasks, and leads to consistent user-space I/O mechanisms regardless of device features and functions.

<span class="mw-page-title-main">Unix</span> Family of computer operating systems

Unix is a family of multitasking, multi-user computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, and others.

References

  1. 1 2 3 4 5 Ritchie, D.M.; Thompson, K. (July 1978). "The UNIX Time-Sharing System". Bell System Tech. J. 57 (6): 1905–1929. CiteSeerX   10.1.1.112.595 . doi:10.1002/j.1538-7305.1978.tb02136.x.
  2. 1 2 Ritchie, Dennis M. (1979). The Evolution of the Unix Time-sharing System. Language Design and Programming Methodology Conf.
  3. 1 2 Leffler, Samuel J.; McKusick, Marshall Kirk; Karels, Michael J.; Quarterman, John S. (October 1989). The Design and Implementation of the 4.3BSD UNIX Operating System . Addison-Wesley. ISBN   978-0-201-06196-3.
  4. McKusick, Marshall Kirk; et al. "A Fast Filesystem for Unix" (PDF). Freebsd.org. CSRG, UC Berkeley. Retrieved 16 November 2016.
  5. stat(2)    Linux Programmer's Manual – System Calls
  6. 1 2 hier(7)    Version 7 Unix Programmer's Manual
  7. hier(7)    FreeBSD Miscellaneous Information Manual
  8. hier(7)    OpenBSD Miscellaneous Information Manual
  9. "hier(7) man page for 2.9.1 BSD".
  10. "hier(7) man page for ULTRIX 4.2".
  11. "hier(7) man page for SunOS 4.1.3".
  12. hier(7)    Linux Programmer's Manual – Overview, Conventions and Miscellanea
  13. George Kraft IV (1 November 2000). "Where to Install My Products on Linux?". Linux Journal . Retrieved 13 November 2014.
  14. 1 2 3 4 5 Kernighan, Brian W.; Pike, Rob (1984). The UNIX Programming Environment. Prentice-Hall. pp.  63–65. Bibcode:1984upe..book.....K.
  15. 1 2 Ritchie, Dennis. "Unix Notes from 1972" . Retrieved 14 January 2018.
  16. System V Application Binary Interface Edition 4.1 (1997-03-18)
  17. M. D. McIlroy (1987). "A Research Unix reader: annotated excerpts from the Programmer's Manual, 1971–1986" (PDF). Bell Labs. CSTR 139.
  18. "Chapter 7. sendmail". UNICOS/mp Networking Facilities Administration. Cray . Retrieved 14 September 2013.
  19. "fhs-spec revision 44".