Everything is a file

Last updated

"Everything is a file" is an idea that Unix, and its derivatives, handle input/output to and from resources such as documents, hard-drives, modems, keyboards, printers and even some inter-process and network communications as simple streams of bytes exposed through the filesystem name space. [1] Exceptions include semaphores, processes and threads.

The advantage of this approach is that the same set of tools, utilities and APIs can be used on a wide range of resources and a number of file types. When a file is opened, a file descriptor is created, using the file path as an addressing system. The file descriptor is then a byte stream I/O interface on which file operations are performed. File descriptors are also created for objects such as anonymous pipes and network sockets  and therefore a more accurate description of this feature is Everything is a file descriptor. [2] [3]

Additionally, a range of pseudo and virtual filesystems exists which exposes internal kernel data, such as information about processes, to user space in a hierarchical file-like structure. [4] These are mounted into the single file hierarchy.

An example of this purely virtual filesystem is under /proc that exposes many system properties as files. All of these files, in the broader sense of the word, have standard Unix file attributes such as an owner and access permissions, and can be queried by the same classic Unix tools and filters. However, this is not universally considered a fast or portable approach. Some operating systems do not mount /proc by default due to security or speed concerns, relying on system calls instead. [5] It is, though, used heavily by Linux shell utilities, [6] [7] such as the procps ps implementation and BusyBox, which is widely installed on embedded systems. [8] Android Toolbox program depend on it as well. [9]

Another example is sysfs, which is usually mounted to /sys, which exposes kernel data structures. [10] sysfs provides functionality similar to the sysctl mechanism found in BSD operating systems, with the difference that sysfs is implemented as a virtual file system instead of being a purpose-built kernel mechanism. [11] The philosophy behind sysfs is to represent each value with a dedicated file. In addition each file has a maximum size of PAGE_SIZE bytes.

For a kernel module there are three possibilities to use a file below /sys:

The standard sysfs API uses a dedicated terminology: A file is called an attribute, the function executed upon reading an attribute is called show and the one for writing an attribute store. [12]

Sysfs was derived from procfs between Linux kernel versions 2.5-2.6, initially as a dedicated filesystem to debug a new driver model. Both sysfs and procfs are memory-based. Sysfs contains directories for block devices, physical bus types, device classes (such as those used for graphics, networking, input or printing), firmware-specific objects and attributes, kernel modules and the power subsystem. [13]

For example, writing mem to /sys/power/state will trigger a suspend-to-RAM procedure. [14]

Another example of files with specific behaviors are /dev/null and /dev/zero device files. Writes to them will be discarded. [15] This can, for example, be used to redirect unneeded standard streams.

See also

Related Research Articles

ext2, or second extended file system, is a file system for the Linux kernel. It was initially designed by French software developer Rémy Card as a replacement for the extended file system (ext). Having been designed according to the same principles as the Berkeley Fast File System from BSD, it was the first commercial-grade filesystem for Linux.

The Filesystem Hierarchy Standard (FHS) is a reference describing the conventions used for the layout of Unix-like systems. It has been made popular by its use in Linux distributions, but it is used by other Unix-like systems as well. It is maintained by the Linux Foundation. The latest version is 3.0, released on 3 June 2015.

chroot Operation that changes the apparent root directory in Unix-like systems

chroot is an operation on Unix and Unix-like operating systems that changes the apparent root directory for the current running process and its children. A program that is run in such a modified environment cannot name files outside the designated directory tree. The term "chroot" may refer to the chroot(2) system call or the chroot(8) wrapper program. The modified environment is called a chroot jail.

9P is a network protocol developed for the Plan 9 from Bell Labs distributed operating system as the means of connecting the components of a Plan 9 system. Files are key objects in Plan 9. They represent windows, network connections, processes, and almost anything else available in the operating system.

stat (system call) Unix system call

stat is a Unix system call that returns file attributes about an inode. The semantics of stat vary between operating systems. As an example, Unix command ls uses this system call to retrieve information on files that includes:

The proc filesystem (procfs) is a special filesystem in Unix-like operating systems that presents information about processes and other system information in a hierarchical file-like structure, providing a more convenient and standardized method for dynamically accessing process data held in the kernel than traditional tracing methods or direct access to kernel memory. Typically, it is mapped to a mount point named /proc at boot time. The proc file system acts as an interface to internal data structures about running processes in the kernel. In Linux, it can also be used to obtain information about the kernel and to change certain kernel parameters at runtime (sysctl).

sysfs is a pseudo file system provided by the Linux kernel that exports information about various kernel subsystems, hardware devices, and associated device drivers from the kernel's device model to user space through virtual files. In addition to providing information about various devices and kernel subsystems, exported virtual files are also used for their configuration.

inotify is a Linux kernel subsystem created by John McCutchan, which monitors changes to the filesystem, and reports those changes to applications. It can be used to automatically update directory views, reload configuration files, log changes, backup, synchronize, and upload. The inotifywait and inotifywatch commands allow using the inotify subsystem from the command line. One major use is in desktop search utilities like Beagle, where its functionality permits reindexing of changed files without scanning the filesystem for changes every few minutes, which would be very inefficient.

In computer networking, STREAMS is the native framework in Unix System V for implementing character device drivers, network protocols, and inter-process communication. In this framework, a stream is a chain of coroutines that pass messages between a program and a device driver. STREAMS originated in Version 8 Research Unix, as Streams.

In computing, ioctl is a system call for device-specific input/output operations and other operations which cannot be expressed by regular file semantics. It takes a parameter specifying a request code; the effect of a call depends completely on the request code. Request codes are often device-specific. For instance, a CD-ROM device driver which can instruct a physical device to eject a disc would provide an ioctl request code to do so. Device-independent request codes are sometimes used to give userspace access to kernel functions which are only used by core system software or still under development.

A Unix architecture is a computer operating system system architecture that embodies the Unix philosophy. It may adhere to standards such as the Single UNIX Specification (SUS) or similar POSIX IEEE standard. No single published standard describes all Unix architecture computer operating systems — this is in part a legacy of the Unix wars.

Extended file attributes are file system features that enable users to associate computer files with metadata not interpreted by the filesystem, whereas regular attributes have a purpose strictly defined by the filesystem. Unlike forks, which can usually be as large as the maximum file size, extended attributes are usually limited in size to a value significantly smaller than the maximum file size. Typical uses include storing the author of a document, the character encoding of a plain-text document, or a checksum, cryptographic hash or digital certificate, and discretionary access control information.

The following tables compare general and technical information for a number of file systems.

The Linux booting process involves multiple stages and is in many ways similar to the BSD and other Unix-style boot processes, from which it derives. Although the Linux booting process depends very much on the computer architecture, those architectures share similar stages and software components, including system startup, bootloader execution, loading and startup of a Linux kernel image, and execution of various startup scripts and daemons. Those are grouped into 4 steps: system startup, bootloader stage, kernel stage, and init process. When a Linux system is powered up or reset, its processor will execute a specific firmware/program for system initialization, such as the power-on self-test, invoking the reset vector to start a program at a known address in flash/ROM, then load the bootloader into RAM for later execution. In IBM PC–compatible personal computers (PCs), this firmware/program is either a BIOS or a UEFI monitor, and is stored in the mainboard. In embedded Linux systems, this firmware/program is called boot ROM. After being loaded into RAM, the bootloader will execute to load the second-stage bootloader. The second-stage bootloader will load the kernel image into memory, decompress and initialize it, and then pass control to this kernel image. The second-stage bootloader also performs several operation on the system such as system hardware check, mounting the root device, loading the necessary kernel modules, etc. Finally, the first user-space process starts, and other high-level system initializations are performed.

In computer science, a synthetic file system or a pseudo file system is a hierarchical interface to non-file objects that appear as if they were regular files in the tree of a disk-based or long-term-storage file system. These non-file objects may be accessed with the same system calls or utility programs as regular files and directories. The common term for both regular files and the non-file objects is node.

GVfs is GNOME's userspace virtual filesystem designed to work with the I/O abstraction of GIO, a library available in GLib since version 2.15.1. It installs several modules that are automatically used by applications using the APIs of libgio. There is also FUSE support that allows applications not using GIO to access the GVfs filesystems.

In Unix-like operating systems, a device file, device node, or special file is an interface to a device driver that appears in a file system as if it were an ordinary file. There are also special files in DOS, OS/2, and Windows. These special files allow an application program to interact with a device by using its device driver via standard input/output system calls. Using standard system calls simplifies many programming tasks, and leads to consistent user-space I/O mechanisms regardless of device features and functions.

Toybox is a free and open-source software implementation of over 200 Unix command line utilities such as ls, cp, and mv. The Toybox project was started in 2006, and became a 0BSD licensed BusyBox alternative. Toybox is used for most of Android's command-line tools in all currently supported Android versions, and is also used to build Android on Linux and macOS. All of the tools are tested on Linux, and many of them also work on BSD and macOS.

In the Linux kernel, kernfs is a set of functions that contain the functionality required for creating the pseudo file systems used internally by various kernel subsystems so that they may use virtual files. For example, sysfs provides a set of virtual files by exporting information about hardware devices and associated device drivers from the kernel's device model to user space.

References

  1. In UNIX Everything is a File Archived January 10, 2015, at the Wayback Machine
  2. "Linus Torvalds - 'everything is a file descriptor or a process'". Yarchive.net. Retrieved 2015-08-28.
  3. "Ghosts of Unix Past". Lwn.net. Retrieved 2015-08-28.
  4. Benvenuti, Christian (2006). "3. User-Space-to-Kernel Interface". Understanding Linux network internals (Nachdr. ed.). Beijing Köln: O'Reilly. p. 58. ISBN   9780596002558.
  5. "8. procfs: Gone But Not Forgotten". Freebsd.org. Retrieved 2015-08-28.
  6. Xiao, Yang; Li, Frank Haizhon; Chen, Hui (2011). Handbook of security and networks. Hackensack (NJ): World scientific. p. 160. ISBN   9789814273039.
  7. "27. Upgrading and customizing the kernel". Red Hat Linux Networking and System Administration. John Wiley & Sons. 2007. p. 662. ISBN   9780471777311.
  8. "busybox - BusyBox: The Swiss Army Knife of Embedded Linux". Git.busybox.net. Retrieved 2015-08-28.
  9. "platform_system_core/ps.c at master · android/platform_system_core · GitHub". GitHub.com. 2015-03-09. Retrieved 2015-08-28.
  10. Mochel, Patrick; Murphy, Mike (16 August 2011). "sysfs - _The_ filesystem for exporting kernel objects". kernel.org. Archived from the original on 13 March 2024. Retrieved 15 June 2024.
  11. Vskills, Team. "SysFS and proc". Tutorial. Retrieved 2024-06-15.
  12. "sysfs, procfs, sysctl, debugfs and other similar kernel interfaces". John's Blog. 2013-11-20. Retrieved 2024-06-15.
  13. Patrick Mochel, The sysfs subsystem, 2005 Linux Symposium, pp. 314-317
  14. Wysocki, Rafael J. "System Power Management Sleep States". kernel.org. Retrieved 15 June 2024.
  15. "null(4) - Linux manual page". www.man7.org. Retrieved 2024-06-15.