Union mount

Last updated

In computer operating systems, union mounting is a way of combining multiple directories into one that appears to contain their combined contents. [1] Union mounting is supported in Linux, BSD and several of its successors, and Plan 9, with similar but subtly different behavior.

Contents

As an example application of union mounting, consider the need to update the information contained on a CD-ROM or DVD. While a CD-ROM is not writable, one can overlay the CD's mount point with a writable directory in a union mount. Then, updating files in the union directory will cause them to end up in the writable directory, giving the illusion that the CD-ROM's contents have been updated. [1] [2]

Implementations

Plan 9

In the Plan 9 operating system from Bell Labs (mid-1980s onward), union mounting is a central concept, replacing several older Unix conventions with union directories; for example, several directories containing executables, unioned together at a single /bin directory, replace the PATH variable for command lookup in the shell. [3]

Plan 9 union semantics are greatly simplified compared to the implementations for POSIX-style operating systems: the union of two directories is simply the concatenation of their contents, so a directory listing of the union may display duplicate names. Also, no effort is made to recursively merge subdirectories, leading to an extremely simple implementation. [4] Directories are unioned in a controllable order; u/name, where u is a union directory, denotes the file called name in the first constituent directory that contains such a file. [4]

Unix and BSD

Unix/POSIX implementations of unions have different requirements from the Plan 9 implementation due to constraints in the traditional Unix file system behavior, which greatly complicates their implementation and often leads to compromises. [5] Problems that union mounting on Unix-like operating systems encounters include:

Early attempts to add unioning to Unix filesystems included the 3-d filesystem (Bell Labs) and the Translucent File Service in SunOS (Sun Microsystems, 1988 [2] ). An implementation of union mounting was added to the BSD version of Unix in version 4.4 (1994), taking inspiration from these earlier attempts, Plan 9 and the stackable file systems in Spring (Sun, 1994). [1] 4.4BSD implements the stack-of-directories approach outlined above. As in Plan 9, operations traverse this stack top-down to resolve names, but unlike Plan 9, BSD union mounts are recursive, so that the contents of subdirectories appear merged in the union directory. Also unlike the Plan 9 version, all layers except the top are read-only: modifying files in the union causes their contents to first be copied into the top layer of the stack, where the modifications are then applied. Deletion of files is implemented by writing a special type of file called a whiteout to the top directory, which has the effect of marking the file name as non-existent and hiding files with the same name in the lower layers of the stack. [1] Whiteouts require support from the underlying file system. [4]

Linux

Union mounting was implemented for Linux 0.99 in 1993; this initial implementation was called the Inheriting File System, but was abandoned by its developer because of its complexity. [2] The next major implementation was UnionFS, which grew out of the FiST project at Stony Brook University. [6] [5] An attempt to replace UnionFS, aufs, was released in 2006, followed in 2009 by OverlayFS. [2] In 2014 OverlayFS union mount implementation was added to the standard Linux kernel source code. [7]

Similarly, GlusterFS offers the ability to mount different filesystems distributed across a network, rather than being located on the same machine. [8]

MergerFS, originally released in 2014, is an actively developed open-source FUSE plugin, allowing pooling of arbitrary directories. [9]

Related Research Articles

<span class="mw-page-title-main">Plan 9 from Bell Labs</span> Distributed operating system

Plan 9 from Bell Labs is a distributed operating system which originated from the Computing Science Research Center (CSRC) at Bell Labs in the mid-1980s and built on UNIX concepts first developed there in the late 1960s. Since 2000, Plan 9 has been free and open-source. The final official release was in early 2015.

ext2, or second extended file system, is a file system for the Linux kernel. It was initially designed by French software developer Rémy Card as a replacement for the extended file system (ext). Having been designed according to the same principles as the Berkeley Fast File System from BSD, it was the first commercial-grade filesystem for Linux.

Linux has several filesystem drivers for the File Allocation Table (FAT) filesystem format. These are commonly known by the names used in the mount command to invoke particular drivers in the kernel: msdos, vfat, and umsdos.

<span class="mw-page-title-main">GoboLinux</span> Linux distribution

GoboLinux is a Linux distribution whose most prominent feature is a reorganization of the traditional Linux file system. Rather than following the Filesystem Hierarchy Standard like most Unix-like systems, each program in a GoboLinux system has its own subdirectory tree, where all of its files may be found. Thus, a program "Foo" has all of its specific files and libraries in /Programs/Foo, under the corresponding version of this program at hand. For example, the commonly known GCC compiler suite version 8.1.0, would reside under the directory /Programs/GCC/8.1.0.

The Unix file system (UFS) is a family of file systems supported by many Unix and Unix-like operating systems. It is a distant descendant of the original filesystem used by Version 7 Unix.

<span class="mw-page-title-main">Virtual file system</span> Abstract layer on top of a more concrete file system

A virtual file system (VFS) or virtual filesystem switch is an abstract layer on top of a more concrete file system. The purpose of a VFS is to allow client applications to access different types of concrete file systems in a uniform way. A VFS can, for example, be used to access local and network storage devices transparently without the client application noticing the difference. It can be used to bridge the differences in Windows, classic Mac OS/macOS and Unix filesystems, so that applications can access files on local file systems of those types without having to know what type of file system they are accessing.

Unionfs is a filesystem service for Linux, FreeBSD and NetBSD which implements a union mount for other file systems. It allows files and directories of separate file systems, known as branches, to be transparently overlaid, forming a single coherent file system. Contents of directories which have the same path within the merged branches will be seen together in a single merged directory, within the new, virtual filesystem.

Filesystem in Userspace (FUSE) is a software interface for Unix and Unix-like computer operating systems that lets non-privileged users create their own file systems without editing kernel code. This is achieved by running file system code in user space while the FUSE module provides only a bridge to the actual kernel interfaces.

rm (Unix) Unix command utility

rm is a basic command on Unix and Unix-like operating systems used to remove objects such as computer files, directories and symbolic links from file systems and also special files such as device nodes, pipes and sockets, similar to the del command in MS-DOS, OS/2, and Microsoft Windows. The command is also available in the EFI shell.

The proc filesystem (procfs) is a special filesystem in Unix-like operating systems that presents information about processes and other system information in a hierarchical file-like structure, providing a more convenient and standardized method for dynamically accessing process data held in the kernel than traditional tracing methods or direct access to kernel memory. Typically, it is mapped to a mount point named /proc at boot time. The proc file system acts as an interface to internal data structures about running processes in the kernel. In Linux, it can also be used to obtain information about the kernel and to change certain kernel parameters at runtime (sysctl).

sync is a standard system call in the Unix operating system, which commits all data from the kernel filesystem buffers to non-volatile storage, i.e., data which has been scheduled for writing via low-level I/O system calls. Higher-level I/O layers such as stdio may maintain separate buffers of their own.

The following tables compare general and technical information for a number of file systems.

chattr is the command in Linux that allows a user to set certain attributes of a file. lsattr is the command that displays the attributes of a file.

Btrfs is a computer storage format that combines a file system based on the copy-on-write (COW) principle with a logical volume manager, developed together. It was founded by Chris Mason in 2007 for use in Linux, and since November 2013, the file system's on-disk format has been declared stable in the Linux kernel.

aufs implements a union mount for Linux file systems. The name originally stood for AnotherUnionFS until version 2.

libguestfs is a C library and a set of tools for accessing and modifying virtual disk images used in platform virtualization. The tools can be used for viewing and editing virtual machines (VMs) managed by libvirt and files inside VMs, scripting changes to VMs, creating VMs, and much else besides. It was created because of security issues, when virtual disk images are mounted directly on the host system.

<span class="mw-page-title-main">Rump kernel</span> Software run in userspace that offers kernel functionality

The NetBSD rump kernel is the first implementation of the "anykernel" concept where drivers either can be compiled into or run in the monolithic kernel or in user space on top of a light-weight kernel. The NetBSD drivers can be used on top of the rump kernel on a wide range of POSIX operating systems, such as the Hurd, Linux, NetBSD, DragonFly BSD, Solaris kernels and even Cygwin, along with the file system utilities built with the rump libraries. The rump kernels can also run without POSIX directly on top of the Xen hypervisor, an L4 microkernel using the Genode OS Framework or even on "OS-less" bare metal.

OverlayFS is a union mount filesystem implementation for Linux. It combines multiple different underlying mount points into one, resulting in single directory structure that contains underlying files and sub-directories from all sources. Common applications overlay a read/write partition over a read-only partition, such as with LiveCDs and IoT devices with limited flash memory write cycles.

References

  1. 1 2 3 4 Pendry, Jan-Simon; Marshall Kirk McKusick (December 1995). "Union Mounts in 4.4BSD-Lite". Proceedings of the USENIX Technical Conference on UNIX and Advanced Computing Systems: 25–33. Retrieved 25 November 2007.
  2. 1 2 3 4 5 6 Aurora, Valerie; Henson (March 2009). "Unioning file systems: Architecture, features, and design choices". LWN.net. Retrieved 21 December 2009.
  3. Pike, R.; Presotto, D.; Thompson, K.; Trickey, H.; Winterbottom, P. "The Use of Name Spaces in Plan 9". Random Contrarian Insurgent Organization web site cat-v.org. Bell Labs. Retrieved 27 October 2016.
  4. 1 2 3 Aurora, Valerie; Henson (March 2009). "Union file systems: Implementations, part I". LWN.net. Retrieved 21 December 2009.
  5. 1 2 3 4 5 Wright, Charles P.; Jay Dave; Puja Gupta; Harikesavan Krishnan; Erez Zadok; Mohammad Nayyer Zubair. "Versatility and Unix Semantics in a Fan-Out Unification File System". Stony Brook University Technical Report FSL-04-01b. Retrieved 25 November 2007.
  6. Aurora, Valerie; Henson (April 2009). "Unioning file systems: Implementations, part 2". LWN.net. Retrieved 21 December 2009.
  7. Larabel, Michael (29 September 2014). "OverlayFS Proposed for the Linux 3.18 Kernel". Phoronix.com . Retrieved 12 October 2015.
  8. "About". GlusterNews. 14 November 2009. Archived from the original on 7 April 2013. Retrieved 4 March 2013. GlusterFS is an open source, distributed file system capable of scaling to several petabytes (actually, 72 brontobytes!) and handling thousands of clients. GlusterFS clusters together storage building blocks over Infiniband RDMA or TCP/IP interconnect, aggregating disk and memory resources and managing data in a single global namespace. GlusterFS is based on a stackable user space design and can deliver exceptional performance for diverse workloads.
  9. "MergerFS project on GitHub". github.com . Retrieved 15 September 2021.