AdvFS

Developer(s): Digital Equipment Corporation
Full name: Tru64 UNIX Advanced File System
Introduced: 1993, with OSF/1
Structures
Bad blocks: Table
Limits
Max. volume size: 16 TiB
Max. file size: 16 TiB
Max. filename length: 255 bytes
Other
Supported operating systems: Tru64 UNIX

AdvFS, also known as Tru64 UNIX Advanced File System, is a file system developed in the late 1980s to mid-1990s [1] by Digital Equipment Corporation for their OSF/1 version of the Unix operating system (later Digital UNIX/Tru64 UNIX). [2] In June 2008, it was released as free software under the GPL-2.0-only license. [3] AdvFS has been used in high-availability systems where fast recovery from downtime is essential. [4] :428

Functionality

AdvFS is built around a storage pool, called a file domain, and logical file systems, called file sets. A file domain is composed of any number of block devices, which can be partitions or LVM or LSM devices. A file set is a logical file system created in a single file domain. Administrators can add volumes to, or remove volumes from, an active file domain; removal requires enough free space on the volumes that remain in the file domain. This was one of the trickier original features to implement, because all data and metadata residing on the disk being removed first had to be migrated, online, to other disks.
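The removal rule described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not AdvFS code: the names Volume, FileDomain, add_volume and remove_volume are hypothetical, space is counted in abstract blocks, and the greedy placement is chosen for simplicity rather than taken from the real allocator.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    """Hypothetical stand-in for one block device in a file domain."""
    name: str
    capacity: int                                 # total blocks
    extents: dict = field(default_factory=dict)   # file name -> blocks held here

    @property
    def used(self) -> int:
        return sum(self.extents.values())

    @property
    def free(self) -> int:
        return self.capacity - self.used


class FileDomain:
    """Toy storage pool: a set of volumes that can grow or shrink."""

    def __init__(self, volumes):
        self.volumes = {v.name: v for v in volumes}

    def add_volume(self, volume):
        self.volumes[volume.name] = volume

    def remove_volume(self, name):
        victim = self.volumes[name]
        others = [v for v in self.volumes.values() if v.name != name]
        # Removal is refused unless the remaining volumes can absorb the data.
        if sum(v.free for v in others) < victim.used:
            raise ValueError("not enough free space on remaining volumes")
        # Migrate every extent off the victim before detaching it.
        for fname, blocks in list(victim.extents.items()):
            remaining = blocks
            for target in others:
                if remaining == 0:
                    break
                moved = min(remaining, target.free)
                if moved:
                    target.extents[fname] = target.extents.get(fname, 0) + moved
                    remaining -= moved
            del victim.extents[fname]
        del self.volumes[name]
```

In this model, removing a volume either moves all of its blocks to the other volumes and then detaches it, or raises an error and leaves the domain unchanged.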

File sets can be balanced, meaning that the file content of a file set is distributed evenly across the physical volumes. Individual files in a file set can be striped across the available volumes.
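As a rough illustration of what striping a file across volumes means, the sketch below maps a byte offset in a file to a volume and an offset within that volume's share of the file. It shows generic round-robin striping, not the AdvFS on-disk layout; the function name stripe_location and the stripe size used in the example are assumptions.

```python
def stripe_location(offset, stripe_size, n_volumes):
    """Map a file byte offset to (volume index, offset within that volume).

    Generic round-robin striping: consecutive stripe units of the file are
    placed on consecutive volumes, wrapping around after the last one.
    """
    if offset < 0 or stripe_size <= 0 or n_volumes <= 0:
        raise ValueError("offset must be >= 0; sizes and counts must be > 0")
    stripe_index = offset // stripe_size          # which stripe unit of the file
    volume = stripe_index % n_volumes             # volume holding that unit
    units_before = stripe_index // n_volumes      # earlier units on the same volume
    offset_in_volume = units_before * stripe_size + offset % stripe_size
    return volume, offset_in_volume


# With a 64-byte stripe unit over 3 volumes, byte 200 lies in stripe unit 3,
# which wraps around to volume 0.
location = stripe_location(200, 64, 3)
```

Sequential reads of a striped file touch the volumes in turn, which is why striping can spread I/O load across devices.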

Administrators can take a snapshot (or clone) of any active or inactive file set. This allows for easy online backups and editing.

Because block devices can be added to or removed from a file domain while it has active users, administrators can migrate to larger devices, or away from potentially failing hardware, without a system shutdown.

Features

Under Linux, AdvFS supports an additional syncv system call to atomically commit changes to multiple files. [6]

History

AdvFS was developed by Digital Equipment Corporation engineers in the late 1980s to mid-1990s [1] in Bellevue, WA (DECwest). They had previously worked there on the cancelled MICA and OZIX projects.

It was first delivered on the DEC OSF/1 system (later Digital UNIX/Tru64 UNIX). Over time, development moved to teams located in Bellevue, WA and Nashua, NH. AdvFS versions were always one version number behind the operating system version: DEC OSF/1 v3.2 had AdvFS v2.x, Digital UNIX 4.0 had AdvFS v3.x and Tru64 UNIX 5.x had AdvFS v4.x. It is generally considered [citation needed] that only AdvFS v4 had matured to production-level stability, with a sufficient set of tools to get administrators out of any kind of trouble. [2] The original team had enough confidence in its log-based recovery to release it without an "fsck"-style recovery utility, on the assumption that the file system journal would always be allocated on mirrored drives.

In 1996, Lee and Thekkath [7] described the use of AdvFS on top of a novel disk virtualisation layer known as Petal. In a later paper, [8] Thekkath et al. described their own file system (Frangipani), built on top of Petal, and compared its performance with that of AdvFS running on the same storage layer.

Shapiro and Miller [9] compared the performance of files stored in AdvFS to Oracle RDBMS version 7.3.4 BLOB storage.

The Compaq Sierra Parallel File System (PFS) was a cluster file system built on multiple local AdvFS file systems. Testing carried out at Lawrence Livermore National Laboratory (LLNL) in 2000–2001 found that while the underlying AdvFS file system had adequate performance (albeit with high CPU utilisation), the PFS clustering layer on top of it performed poorly. [10]

On June 23, 2008, Hewlett-Packard released the AdvFS source code [3] at SourceForge under the GPL-2.0-only license, rather than the then recently released GPLv3, in order to be compatible with the Linux kernel, which is also licensed under GPL-2.0-only. [11]

References

  1. "Revision history?". SourceForge.net. Retrieved 2008-06-25.
  2. Steven M. Hancock (January 2001). Tru64 Unix File System Administration Handbook. Digital Press. p. 258. ISBN 978-1-55558-227-2.
  3. Press release concerning the release of the AdvFS source code
  4. Brady, Don. Designing GIS for high availability and high performance. High Performance Computing in the Asia-Pacific Region, 2000. Proceedings. The Fourth International Conference/Exhibition on. pp. 423–431. doi:10.1109/HPC.2000.846591. AdvFS is a journaled, local file system that provides higher availability, and greater flexibility and recovery than traditional UNIX file systems. The recovery takes just a few seconds for AdvFS...
  5. Amir H. Majidimehr (1996). Optimizing UNIX for Performance. Prentice Hall PTR. p. 69. ISBN 978-0-13-111551-4. Log-structured file system implementations include the AIX Journalled File System (JFS), the DEC Advanced File System (AdvFS), and the SUN UFS with Transaction Logging in Solaris DiskSuite.
  6. Verma, Rajat, et al. "Failure-atomic updates of application data in a Linux file system." 13th USENIX Conference on File and Storage Technologies (FAST 15). 2015. Online version
  7. Lee, Edward K., and Chandramohan A. Thekkath. "Petal: Distributed virtual disks." ACM SIGPLAN Notices. Vol. 31. No. 9. ACM, 1996. Available online
  8. Chandramohan A. Thekkath, Timothy Mann, and Edward K. Lee. 1997. Frangipani: a scalable distributed file system. SIGOPS Oper. Syst. Rev. 31, 5 (October 1997), 224-237. Also in: Chandramohan A. Thekkath, Timothy Mann, and Edward K. Lee. 1997. Frangipani: a scalable distributed file system. In Proceedings of the sixteenth ACM symposium on Operating systems principles (SOSP '97), William M. Waite (Ed.). ACM, New York, NY, USA, 224-237. Online version
  9. Shapiro, Michael, and Ethan Miller. "Managing databases with binary large objects." Mass Storage Systems, 1999. 16th IEEE Symposium on. IEEE, 1999. Available online
  10. Uselton, A C. The Performance of PFS, the Compaq Sierra Product’s Parallel File System. United States: N. p., 2001. Web. doi:10.2172/15006183. Available online
  11. Linus Torvalds (2000-09-08). "Linux-2.4.0-test8". lkml.iu.edu. Retrieved 2015-11-21. The only one of any note that I'd like to point out directly is the clarification in the COPYING file, making it clear that it's only _that_particular version of the GPL that is valid for the kernel. This should not come as any surprise, as that's the same license that has been there since 0.12 or so, but I thought I'd make that explicit