Versioning file system

A versioning file system is any computer file system which allows a computer file to exist in several versions at the same time. Thus it is a form of revision control. The most common versioning file systems keep a number of old copies of the file. Some limit the number of changes per minute or per hour to avoid storing large numbers of trivial changes. Others instead take periodic snapshots whose contents can be accessed using methods similar to those for normal file access.
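
The rate-limiting idea mentioned above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the class name ThrottledVersionStore, the min_interval_seconds parameter, and the policy of silently dropping intermediate contents are all hypothetical and are not taken from any particular file system.

```python
class ThrottledVersionStore:
    """Toy model of one file that keeps old copies, rate-limited in time.

    Hypothetical policy: when the file is overwritten, the previous
    content is preserved as an old version only if at least
    `min_interval_seconds` have passed since the last preserved version.
    """

    def __init__(self, min_interval_seconds=60.0):
        self.min_interval = min_interval_seconds
        self.versions = []           # (timestamp, content) pairs, oldest first
        self.current = None          # content of the live file
        self._last_saved_at = None   # time the last old version was preserved

    def write(self, content, now):
        """Overwrite the file at time `now` (seconds, supplied by the caller)."""
        if self.current is not None:
            due = (self._last_saved_at is None
                   or now - self._last_saved_at >= self.min_interval)
            if due:
                # Preserve the content that is about to be overwritten.
                self.versions.append((now, self.current))
                self._last_saved_at = now
        self.current = content
```

With a 60-second interval, writes at 0, 10, 20 and 100 seconds preserve only the first content and the content that was live at 100 seconds; the intermediate write is overwritten without a stored copy.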

Similar technologies

Backup

A versioning file system is similar to a periodic backup, with several key differences.

In comparison to revision control systems

Versioning file systems provide some of the features of revision control systems. However, unlike most revision control systems, they are transparent to users, not requiring a separate "commit" step to record a new revision.

Journaling file system

Versioning file systems should not be confused with journaling file systems. Whereas a journaling file system keeps a log of the changes made to a file before committing those changes to the file system (and overwriting the prior version), a versioning file system keeps previous copies of a file when saving new changes. The two features serve different purposes and are not mutually exclusive.

Object storage

Some object storage implementations, such as Amazon S3, offer object versioning.

Implementations

ITS

An early implementation of versioning, possibly the first, was in MIT's ITS. In ITS, a filename consisted of two six-character parts; if the second part was numeric (consisted only of digits), it was treated as a version number. When specifying a file to open for read or write, one could supply a second part of ">"; when reading, this meant to open the highest-numbered version of the file; when writing, it meant to increment the highest existing version number and create the new version for writing.
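
The ">" convention described above can be sketched as a small resolution function. This is an illustrative model only: the function name resolve_its_name, the representation of a directory as a set of (first, second) name pairs, and the choice to start at version 1 when no numeric version exists yet are assumptions, not documented ITS behavior.

```python
def resolve_its_name(existing, first, second, mode):
    """Resolve an ITS-style two-part file name.

    existing: set of (first, second) name pairs already in the directory.
    mode: "read" or "write".
    A second part of ">" selects a version; any other value is used as is.
    """
    if second != ">":
        return (first, second)

    # Only all-digit second parts count as version numbers.
    versions = [int(s) for (f, s) in existing
                if f == first and s.isascii() and s.isdigit()]

    if mode == "read":
        if not versions:
            raise FileNotFoundError(f"no numbered version of {first}")
        return (first, str(max(versions)))      # highest-numbered version
    if mode == "write":
        # Assumption: numbering starts at 1 when no version exists yet.
        return (first, str(max(versions, default=0) + 1))
    raise ValueError(f"unknown mode: {mode!r}")
```

For a directory holding FOO 1, FOO 7 and FOO TXT, reading FOO > resolves to FOO 7 and writing FOO > resolves to FOO 8, while FOO TXT is left untouched because its second part is not numeric.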

Another early implementation of versioning was in TENEX, which became TOPS-20. [1]

Files-11 (RSX-11 and OpenVMS)

A powerful example of a file versioning system is built into the RSX-11 and OpenVMS operating systems from Digital Equipment Corporation. In essence, whenever an application opens a file for writing, the file system automatically creates a new instance of the file, with a version number appended to the name. Version numbers start at 1 and count upward as new instances of a file are created. When an application opens a file for reading, it can either specify the exact file name including the version number, or just the file name without the version number, in which case the most recent instance of the file is opened. The "purge" DCL/CCL command can be used at any time to manage the number of versions in a specific directory. By default, all but the highest-numbered version of each file in the current directory are deleted; this behavior can be overridden with the /keep=n switch and/or by specifying directory paths and/or filename patterns. VMS systems are often scripted to purge user directories on a regular schedule; end users sometimes misconstrue this as a property of the versioning system.
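
The behavior described above (a new version on every open for writing, reads defaulting to the newest version, and a purge that keeps the n highest versions) can be modeled in a few lines. This is a toy in-memory sketch, not the Files-11 on-disk logic: the class VersionedDirectory and its method names are hypothetical, and the "name;version" notation is used here only as a display convention.

```python
class VersionedDirectory:
    """In-memory model of a directory whose files carry version numbers."""

    def __init__(self):
        self._files = {}   # file name -> {version number: content}

    def open_for_write(self, name, content):
        """Create a new version of `name`; numbering starts at 1."""
        versions = self._files.setdefault(name, {})
        new_version = max(versions, default=0) + 1
        versions[new_version] = content
        return f"{name};{new_version}"

    def read(self, name, version=None):
        """Read an exact version, or the most recent one if none is given."""
        versions = self._files[name]
        if version is None:
            version = max(versions)
        return versions[version]

    def purge(self, keep=1):
        """Delete all but the `keep` highest-numbered versions of each file."""
        if keep < 1:
            raise ValueError("keep must be at least 1")
        for versions in self._files.values():
            for old in sorted(versions)[:-keep]:
                del versions[old]

    def listing(self):
        """Return 'name;version' strings ordered by name, then version."""
        pairs = sorted((n, v) for n, vs in self._files.items() for v in vs)
        return [f"{n};{v}" for n, v in pairs]
```

Calling purge() with no argument mirrors the default of keeping only the highest-numbered version, while purge(keep=2) corresponds to the effect described for /keep=2.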

Linux

On February 8, 2004, Kiran-Kumar Muniswamy-Reddy, Charles P. Wright, Andrew Himmer, and Erez Zadok (all from Stony Brook University) proposed Versionfs, a versioning file system designed to be versatile and user-oriented. The system was developed on Linux, where it was first run. [2]

LMFS

The Lisp Machine File System supports versioning. This was provided by implementations from MIT, LMI, Symbolics and Texas Instruments. One such operating system was Symbolics Genera.

macOS

Starting with Lion (10.7), macOS has a feature called Versions which allows Time Machine-like saving and browsing of past versions of documents for applications written to use Versions. This functionality, however, takes place at the application layer, not the filesystem layer; [3] Lion and later releases do not incorporate a true versioning file system.

SCO OpenServer

HTFS, adopted as the primary filesystem for SCO OpenServer in 1995, supports file versioning. Versioning is enabled on a per-directory basis by setting the directory's setuid bit, which is inherited when subdirectories are created. If versioning is enabled, a new file version is created when a file or directory is removed, or when an existing file is opened with truncation. Non-current versions remain in the filesystem namespace, under the name of the original file but with a suffix attached consisting of a semicolon and version sequence number. All but the current version are hidden from directory reads (unless the SHOWVERSIONS environment variable is set), but versions are otherwise accessible for all normal operations. The environment variable and general accessibility allow versions to be managed with the usual filesystem utilities, though there is also an "undelete" command that can be used to purge and restore files, enable and disable versioning on directories, etc.
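
The naming and visibility rules described above can be sketched as follows. This is a user-space illustration, not HTFS code: the helpers split_version and list_directory are hypothetical, and the environment is passed in as a plain mapping so the example stays self-contained.

```python
def split_version(entry):
    """Split 'name;N' into (name, N); return (entry, None) for a current file."""
    base, sep, suffix = entry.rpartition(";")
    if sep and base and suffix.isascii() and suffix.isdigit():
        return base, int(suffix)
    return entry, None


def list_directory(entries, environ):
    """Return the entries a directory read would show.

    Non-current versions (names ending in ';' plus a sequence number) are
    hidden unless SHOWVERSIONS is present in `environ`.
    """
    show_versions = "SHOWVERSIONS" in environ
    return [e for e in entries
            if show_versions or split_version(e)[1] is None]
```

Given the entries report.txt, report.txt;1, report.txt;2 and notes, a listing with an empty environment shows only report.txt and notes, while one with SHOWVERSIONS set shows all four names. The hidden names remain valid paths in this model, which matches the point that ordinary utilities can still operate on them.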

Others

The following are not versioning filesystems, but allow similar functionality.

See also

References

  1. Daniel G. Bobrow, Jerry D. Burchfiel, Daniel L. Murphy, Raymond S. Tomlinson, TENEX, A Paged Time Sharing System for the PDP-10 (Communications of the ACM, Vol. 15, pp. 135-143, March 1972)
  2. Kiran-Kumar Muniswamy-Reddy, Charles P. Wright, Andrew Himmer, and Erez Zadok (8 February 2004). "A Versatile and User-Oriented Versioning File System". FiST: Stackable File System Language and Templates.
  3. "Mac OS X Lion file versions, part 2". 6 August 2011. Retrieved 28 April 2012.
  4. Version Control with Subversion: Next Generation Open Source Version Control
  5. "About Apple File System". Apple Developer Documentation. Retrieved 2021-06-09.
  6. "Snapshots, Clones, and Seed Devices", "snapshots" sub-bullet. http://www.oracle.com/technetwork/articles/servers-storage-admin/advanced-btrfs-1734952.html
  7. pDumpFS Homepage
  8. "Git Internals". Git is fundamentally a content-addressable filesystem with a VCS user interface written on top of it.
  9. "What is Gitfs". Presslabs. 24 July 2015. Retrieved 2022-03-07.