The Andrew File System (AFS) is a distributed file system that uses a set of trusted servers to present a homogeneous, location-transparent file name space to all client workstations. It was developed at Carnegie Mellon University as part of the Andrew Project[1] and was originally named "Vice".[2] "Andrew" refers to Andrew Carnegie and Andrew Mellon. Its primary use is in distributed computing.
AFS[3] has several benefits over traditional networked file systems, particularly in the areas of security and scalability. One enterprise AFS deployment at Morgan Stanley exceeds 25,000 clients.[4] AFS uses Kerberos for authentication and implements access control lists on directories for users and groups. Each client caches files on the local filesystem for increased speed on subsequent requests for the same file. This also allows limited filesystem access in the event of a server crash or a network outage.
AFS uses the weak consistency model.[5] Read and write operations on an open file are directed only to the locally cached copy. When a modified file is closed, the changed portions are copied back to the file server. Cache consistency is maintained by a callback mechanism: when a file is cached, the server notes this and promises to inform the client if the file is updated by someone else. Callbacks are discarded and must be re-established after any client, server, or network failure, including a timeout. Re-establishing a callback involves a status check and does not require re-reading the file itself.
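The interplay of whole-file caching, write-back on close, and callbacks can be pictured with a small sketch. This is an illustrative model only, not the actual AFS protocol or wire format; all class and method names here are hypothetical.

```python
# Illustrative sketch of AFS-style caching with callbacks (not the real protocol).

class Server:
    def __init__(self):
        self.files = {}          # path -> contents
        self.callbacks = {}      # path -> clients promised a callback

    def fetch(self, client, path):
        # Hand out the whole file and promise to notify this client of changes.
        self.callbacks.setdefault(path, set()).add(client)
        return self.files.get(path, b"")

    def store(self, writer, path, data):
        # Whole-file write-back on close; break callbacks held by other clients.
        self.files[path] = data
        for client in self.callbacks.pop(path, set()):
            if client is not writer:
                client.break_callback(path)
        self.callbacks[path] = {writer}

class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}          # path -> locally cached copy

    def open(self, path):
        # Use the cached copy while the callback promise is still valid.
        if path not in self.cache:
            self.cache[path] = self.server.fetch(self, path)
        return self.cache[path]

    def close(self, path, data):
        # Reads and writes were local; the modified file is shipped back on close.
        self.cache[path] = data
        self.server.store(self, path, data)

    def break_callback(self, path):
        # The server revoked its promise: drop the stale copy; refetch on next open.
        self.cache.pop(path, None)

srv = Server()
a, b = Client(srv), Client(srv)
a.close("/afs/example/notes.txt", b"v1")
assert b.open("/afs/example/notes.txt") == b"v1"
a.close("/afs/example/notes.txt", b"v2")   # breaks b's callback
assert b.open("/afs/example/notes.txt") == b"v2"
```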
A consequence of the file locking strategy is that AFS does not support large shared databases or record updating within files shared between client systems. This was a deliberate design decision based on the perceived needs of the university computing environment. For example, in the original email system for the Andrew Project, the Andrew Message System, a single file per message is used, like maildir, rather than a single file per mailbox, like mbox. See AFS and buffered I/O Problems for handling shared databases.
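Under whole-file, write-on-close semantics, two clients updating records inside one shared mailbox file would overwrite each other, whereas writing each message as its own file keeps every update independent. The snippet below is a minimal illustration of the one-file-per-message approach; it does not reproduce the Andrew Message System's actual on-disk format.

```python
# Maildir-style delivery sketch: each message is an independent whole file,
# so concurrent deliveries from different clients never rewrite a shared file.
import os, time, uuid

def deliver(mailbox_dir, body):
    os.makedirs(mailbox_dir, exist_ok=True)
    # A unique name per message avoids write conflicts between clients.
    name = f"{int(time.time())}.{uuid.uuid4().hex}"
    with open(os.path.join(mailbox_dir, name), "w") as f:
        f.write(body)
    return name

deliver("/tmp/inbox", "Subject: hello\n\nfirst message\n")
deliver("/tmp/inbox", "Subject: hi\n\nsecond message\n")
```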
A significant feature of AFS is the volume, a tree of files, sub-directories and AFS mountpoints (links to other AFS volumes). Volumes are created by administrators and linked at a specific named path in an AFS cell. Once created, users of the filesystem may create directories and files as usual without concern for the physical location of the volume. A volume may have a quota assigned to it in order to limit the amount of space consumed. As needed, AFS administrators can move that volume to another server and disk location without the need to notify users; the operation can even occur while files in that volume are being used.
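The volume abstraction can be modeled as a container that owns a quota and can be moved between servers without changing the path users see. The following is a hypothetical sketch of that idea, not the AFS administration interface.

```python
# Illustrative model of an AFS-style volume: quota enforcement and
# transparent relocation; names and structure are hypothetical.

class Volume:
    def __init__(self, name, server, quota_kb):
        self.name, self.server, self.quota_kb = name, server, quota_kb
        self.files = {}                      # relative path -> size in KB

    def used_kb(self):
        return sum(self.files.values())

    def write(self, path, size_kb):
        if self.used_kb() + size_kb > self.quota_kb:
            raise OSError("quota exceeded")
        self.files[path] = size_kb

    def move_to(self, new_server):
        # Users keep addressing the same /afs path; only the physical
        # location recorded for the volume changes.
        self.server = new_server

home = Volume("user.alice", server="fs1.example.edu", quota_kb=500_000)
home.write("thesis.tex", 120)
home.move_to("fs2.example.edu")              # no client-visible change
```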
AFS volumes can be replicated to read-only cloned copies. When accessing files in a read-only volume, a client system will retrieve data from a particular read-only copy. If that copy becomes unavailable at some point, clients will look for any of the remaining copies. Again, users of that data are unaware of the location of the read-only copy; administrators can create and relocate such copies as needed. The AFS command suite guarantees that all read-only volumes contain exact copies of the original read-write volume at the time the read-only copy was created.
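Client selection among read-only replicas can be pictured as trying one copy and silently falling back to the others. The sketch below is purely illustrative; the server names and functions are made up.

```python
# Illustrative read-only replica failover: the caller never learns which
# physical copy served the data.

def read_from_replicas(path, replicas, fetch):
    last_error = None
    for server in replicas:
        try:
            return fetch(server, path)       # first reachable copy wins
        except ConnectionError as err:
            last_error = err                 # try the next read-only copy
    raise last_error or FileNotFoundError(path)

def fetch(server, path):
    if server == "fs1.example.edu":
        raise ConnectionError("replica down")
    return f"contents of {path} from {server}"

print(read_from_replicas("/afs/example.edu/sw/README",
                         ["fs1.example.edu", "fs2.example.edu"], fetch))
```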
The file name space on an Andrew workstation is partitioned into a shared and local name space. The shared name space (usually mounted as /afs on the Unix filesystem) is identical on all workstations. The local name space is unique to each workstation. It only contains temporary files needed for workstation initialization and symbolic links to files in the shared name space.
The Andrew File System heavily influenced Version 4 of Sun Microsystems' popular Network File System (NFS). Additionally, a variant of AFS, the DCE Distributed File System (DFS), was adopted by the Open Software Foundation in 1989 as part of their Distributed Computing Environment. Finally, AFS (version two) was the predecessor of the Coda file system.
Besides the original, a few other implementations were developed. OpenAFS was built from source released by Transarc (IBM) in 2000.[6] The Transarc software was later deprecated and lost support. Arla was an independent implementation of AFS developed at the Royal Institute of Technology in Stockholm in the late 1990s and early 2000s.[7][8]
A fourth implementation of an AFS client has existed in the Linux kernel source code since at least version 2.6.10.[9] Committed by Red Hat, it is a fairly simple implementation that remains incomplete as of January 2024.[10]
The following Access Control List (ACL) permissions can be granted on a directory: Lookup (l), which allows a user to list the contents of the directory and examine its ACL; Insert (i), which allows a user to add new files or subdirectories to the directory; Delete (d), which allows a user to remove files and subdirectories from the directory; and Administer (a), which allows a user to change the ACL for the directory.
Permissions that affect files and subdirectories include: Read (r), which allows a user to read the contents of files in the directory; Write (w), which allows a user to modify files in the directory; and Lock (k), which allows programs to lock files in the directory.
Additionally, AFS includes Application ACLs (A)-(H), which have no effect on access to files.
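A simple way to picture these directory ACLs is as a set of permission letters attached to (directory, principal) pairs. The sketch below models only the seven standard bits listed above; it is not the fileserver implementation or the AFS command suite.

```python
# Illustrative model of AFS directory ACLs with the seven standard bits:
# l (lookup), i (insert), d (delete), a (administer) act on the directory;
# r (read), w (write), k (lock) act on the files inside it.

STANDARD_BITS = set("rlidwka")

class DirACL:
    def __init__(self):
        self.entries = {}                    # principal -> set of permission bits

    def grant(self, principal, bits):
        unknown = set(bits) - STANDARD_BITS
        if unknown:
            raise ValueError(f"unknown permission bits: {unknown}")
        self.entries.setdefault(principal, set()).update(bits)

    def allows(self, principal, bit):
        return bit in self.entries.get(principal, set())

acl = DirACL()
acl.grant("alice", "rlidwka")                # full rights for the owner
acl.grant("system:anyuser", "rl")            # world-readable directory
assert acl.allows("system:anyuser", "r")
assert not acl.allows("system:anyuser", "w")
```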
ext2, or second extended file system, is a file system for the Linux kernel. It was initially designed by French software developer Rémy Card as a replacement for the extended file system (ext). Having been designed according to the same principles as the Berkeley Fast File System from BSD, it was the first commercial-grade filesystem for Linux.
ext3, or third extended filesystem, is a journaled file system that is commonly used with the Linux kernel. It used to be the default file system for many popular Linux distributions but generally has been supplanted by its successor, ext4. The main advantage of ext3 over its predecessor, ext2, is journaling, which improves reliability and eliminates the need to check the file system after an improper ("unclean") shutdown.
Network File System (NFS) is a distributed file system protocol originally developed by Sun Microsystems (Sun) in 1984, allowing a user on a client computer to access files over a computer network much like local storage is accessed. NFS, like many other protocols, builds on the Open Network Computing Remote Procedure Call system. NFS is an open IETF standard defined in a Request for Comments (RFC), allowing anyone to implement the protocol.
Coda is a distributed file system developed as a research project at Carnegie Mellon University since 1987 under the direction of Mahadev Satyanarayanan. It descended directly from an older version of Andrew File System (AFS-2) and offers many similar features. The InterMezzo file system was inspired by Coda.
Apache Subversion is a version control system distributed as open source under the Apache License. Software developers use Subversion to maintain current and historical versions of files such as source code, web pages, and documentation. Its goal is to be a mostly compatible successor to the widely used Concurrent Versions System (CVS).
A log-structured filesystem is a file system in which data and metadata are written sequentially to a circular buffer, called a log. The design was first proposed in 1988 by John K. Ousterhout and Fred Douglis and first implemented in 1992 by Ousterhout and Mendel Rosenblum for the Unix-like Sprite distributed operating system.
OpenSSI is an open-source single-system image clustering system. It allows a collection of computers to be treated as one large system, allowing applications running on any one machine access to the resources of all the machines in the cluster.
In computing, the Global File System 2 (GFS2) is a shared-disk file system for Linux computer clusters. GFS2 allows all members of a cluster to have direct concurrent access to the same shared block storage, in contrast to distributed file systems which distribute data throughout the cluster. GFS2 can also be used as a local file system on a single computer.
The Distributed Computing Environment (DCE) is a software system developed in the early 1990s from the work of the Open Software Foundation (OSF), a consortium founded in 1988 that included Apollo Computer, IBM, Digital Equipment Corporation, and others. The DCE supplies a framework and a toolkit for developing client/server applications. The framework includes a remote procedure call mechanism (DCE/RPC), a naming (directory) service, a time service, an authentication service, and a distributed file system (DCE/DFS).
In computing, a file system or filesystem governs file organization and access. A local file system is a capability of an operating system that services the applications running on the same computer. A distributed file system is a protocol that provides file access between networked computers.
Filesystem in Userspace (FUSE) is a software interface for Unix and Unix-like computer operating systems that lets non-privileged users create their own file systems without editing kernel code. This is achieved by running file system code in user space while the FUSE module provides only a bridge to the actual kernel interfaces.
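As an illustration of the user-space side, a minimal read-only filesystem can be written in a few lines with the third-party fusepy binding (assuming fusepy is installed and a FUSE kernel module is available). This is a sketch of the general idea, not part of FUSE itself.

```python
# Minimal read-only in-memory filesystem using the fusepy binding (assumed
# installed as the "fuse" module); run with a mountpoint as the argument.
import errno, stat, sys
from fuse import FUSE, FuseOSError, Operations

FILES = {"/hello.txt": b"hello from user space\n"}

class HelloFS(Operations):
    def getattr(self, path, fh=None):
        if path == "/":
            return dict(st_mode=stat.S_IFDIR | 0o755, st_nlink=2)
        if path in FILES:
            return dict(st_mode=stat.S_IFREG | 0o444, st_nlink=1,
                        st_size=len(FILES[path]))
        raise FuseOSError(errno.ENOENT)

    def readdir(self, path, fh):
        return [".", ".."] + [name.lstrip("/") for name in FILES]

    def read(self, path, size, offset, fh):
        return FILES[path][offset:offset + size]

if __name__ == "__main__":
    # The FUSE kernel module forwards VFS calls to these Python methods.
    FUSE(HelloFS(), sys.argv[1], foreground=True, ro=True)
```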
A diskless node is a workstation or personal computer without disk drives, which employs network booting to load its operating system from a server.
Most file systems include attributes of files and directories that control the ability of users to read, change, navigate, and execute the contents of the file system. In some cases, menu options or functions may be made visible or hidden depending on a user's permission level; this kind of user interface is referred to as permission-driven.
Lustre is a type of parallel distributed file system, generally used for large-scale cluster computing. The name Lustre is a portmanteau word derived from Linux and cluster. Lustre file system software is available under the GNU General Public License and provides high performance file systems for computer clusters ranging in size from small workgroup clusters to large-scale, multi-site systems. Since June 2005, Lustre has consistently been used by at least half of the top ten, and more than 60 of the top 100 fastest supercomputers in the world, including the world's No. 1 ranked TOP500 supercomputer in November 2022, Frontier, as well as previous top supercomputers such as Fugaku, Titan and Sequoia.
In computing, a directory is a file system cataloging structure which contains references to other computer files, and possibly other directories. On many computers, directories are known as folders, or drawers, analogous to a workbench or the traditional office filing cabinet. The name derives from books like a telephone directory that lists the phone numbers of all the people living in a certain area.
ext4 is a journaling file system for Linux, developed as the successor to ext3.
A clustered file system (CFS) is a file system which is shared by being simultaneously mounted on multiple servers. There are several approaches to clustering, most of which do not employ a clustered file system. Clustered file systems can provide features like location-independent addressing and redundancy which improve reliability or reduce the complexity of the other parts of the cluster. Parallel file systems are a type of clustered file system that spread data across multiple storage nodes, usually for redundancy or performance.
An automounter is any program or software facility which automatically mounts filesystems in response to access operations by user programs. An automounter system utility, when notified of file and directory access attempts under selectively monitored subdirectory trees, dynamically and transparently makes local or remote devices accessible.
A journaling file system is a file system that keeps track of changes not yet committed to the file system's main part by recording the goal of such changes in a data structure known as a "journal", which is usually a circular log. In the event of a system crash or power failure, such file systems can be brought back online more quickly with a lower likelihood of becoming corrupted.
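A write-ahead journal of this kind can be sketched in a few lines: record the intended change, apply it, then mark the record complete, so that replay after a crash redoes only finished intentions. This is a toy model, not how ext3 or any real journaling filesystem lays out its log.

```python
# Toy write-ahead journal: log the intent, apply it, then mark it committed.
# On recovery, only fully committed records are replayed.
import json

class Journal:
    def __init__(self):
        self.log = []                        # stands in for the on-disk circular log

    def append(self, record):
        self.log.append(json.dumps(record))

class ToyFS:
    def __init__(self):
        self.journal = Journal()
        self.blocks = {}                     # the filesystem's "main part"

    def write_block(self, block_no, data):
        self.journal.append({"op": "write", "block": block_no, "data": data})
        self.blocks[block_no] = data         # apply to the main structures
        self.journal.append({"op": "commit", "block": block_no})

    def recover(self):
        # Replay writes whose commit record made it into the log before the crash.
        committed = {json.loads(r)["block"] for r in self.journal.log
                     if json.loads(r)["op"] == "commit"}
        for raw in self.journal.log:
            rec = json.loads(raw)
            if rec["op"] == "write" and rec["block"] in committed:
                self.blocks[rec["block"]] = rec["data"]

fs = ToyFS()
fs.write_block(7, "metadata update")
fs.recover()                                 # idempotent replay after a "crash"
```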