Versioning file system

Last updated • 7 min readFrom Wikipedia, The Free Encyclopedia

A versioning file system is any computer file system which allows a computer file to exist in several versions at the same time. Thus it is a form of revision control. Most common versioning file systems keep a number of old copies of the file. Some limit the number of changes per minute or per hour to avoid storing large numbers of trivial changes. Others instead take periodic snapshots whose contents can be accessed using methods similar as those for normal file access.

Contents

Similar technologies

Backup

A versioning file system is similar to a periodic backup, with several key differences.

In comparison to revision control systems

Versioning file systems provide some of the features of revision control systems. However, unlike most revision control systems, they are transparent to users, not requiring a separate "commit" step to record a new revision.

Journaling file system

Versioning file systems should not be confused with journaling file systems. Whereas journaling file systems work by keeping a log of the changes made to a file before committing those changes to that file system (and overwriting the prior version), a versioning file system keeps previous copies of a file when saving new changes. The two features serve different purposes and are not mutually exclusive.

Object storage

Some object storage implementations offers object versioning, such as Amazon S3.

Implementations

ITS

An early implementation of versioning, possibly the first, was in MIT's ITS. In ITS, a filename consisted of two six-character parts; if the second part was numeric (consisted only of digits), it was treated as a version number. When specifying a file to open for read or write, one could supply a second part of ">"; when reading, this meant to open the highest-numbered version of the file; when writing, it meant to increment the highest existing version number and create the new version for writing.

Another early implementation of versioning was in TENEX, which became TOPS-20. [1]

Files-11 (RSX-11 and OpenVMS)

A powerful example of a file versioning system is built into the RSX-11 and OpenVMS operating system from Digital Equipment Corporation. In essence, whenever an application opens a file for writing, the file system automatically creates a new instance of the file, with a version number appended to the name. Version numbers start at 1 and count upward as new instances of a file are created. When an application opens a file for reading, it can either specify the exact file name including version number, or just the file name without the version number, in which case the most recent instance of the file is opened. The "purge" DCL/CCL command can be used at any time to manage the number of versions in a specific directory. By default, all but the highest numbered versions of all files in the current directory will be deleted; this behavior can be overridden with the /keep=n switch and/or by specifying directory path(s) and/or filename patterns. VMS systems are often scripted to purge user directories on a regular schedule; this is sometimes misconstrued by end-users as a property of the versioning system.

Linux

On February 8, 2004, Kiran-Kumar Muniswamy-Reddy, Charles P. Wright, Andrew Himmer, and Erez Zadok (all from Stony Brook University) proposed a stackable file system Versionfs, providing a versioning layer on top of any other Linux file systems. [3]

LMFS

The Lisp Machine File System supports versioning. This was provided by implementations from MIT, LMI, Symbolics and Texas Instruments. Such an operating system was Symbolics Genera.

macOS

Starting with Lion (10.7), macOS has a feature called Versions which allows Time Machine-like saving and browsing of past versions of documents for applications written to use Versions. This functionality, however, takes place at the application layer, not the filesystem layer; [4] Lion and later releases do not incorporate a true versioning file system.

SCO OpenServer

HTFS, adopted as the primary filesystem for SCO OpenServer in 1995, supports file versioning. Versioning is enabled on a per-directory basis by setting the directory's setuid bit, which is inherited when subdirectories are created. If versioning is enabled, a new file version is created when a file or directory is removed, or when an existing file is opened with truncation. Non-current versions remain in the filesystem namespace, under the name of the original file but with a suffix attached consisting of a semicolon and version sequence number. All but the current version are hidden from directory reads (unless the SHOWVERSIONS environment variable is set), but versions are otherwise accessible for all normal operations. The environment variable and general accessibility allow versions to be managed with the usual filesystem utilities, though there is also an "undelete" command that can be used to purge and restore files, enable and disable versioning on directories, etc.

Others

The following are not versioning filesystems, but allow similar functionality.

See also

Related Research Articles

XFS is a high-performance 64-bit journaling file system created by Silicon Graphics, Inc (SGI) in 1993. It was the default file system in SGI's IRIX operating system starting with its version 5.3. XFS was ported to the Linux kernel in 2001; as of June 2014, XFS is supported by most Linux distributions; Red Hat Enterprise Linux uses it as its default file system.

<span class="mw-page-title-main">Apache Subversion</span> Free and open-source software versioning and revision control system

Apache Subversion is a version control system distributed as open source under the Apache License. Software developers use Subversion to maintain current and historical versions of files such as source code, web pages, and documentation. Its goal is to be a mostly compatible successor to the widely used Concurrent Versions System (CVS).

In information technology, a backup, or data backup is a copy of computer data taken and stored elsewhere so that it may be used to restore the original after a data loss event. The verb form, referring to the process of doing so, is "back up", whereas the noun and adjective form is "backup". Backups can be used to recover data after its loss from data deletion or corruption, or to recover data from an earlier time. Backups provide a simple form of disaster recovery; however not all backup systems are able to reconstitute a computer system or other complex configuration such as a computer cluster, active directory server, or database server.

<span class="mw-page-title-main">File system</span> Computer filing system

In computing, a file system or filesystem governs file organization and access. A local file system is a capability of an operating system that services the applications running on the same computer. A distributed file system is a protocol that provides file access between networked computers.

Filesystem in Userspace (FUSE) is a software interface for Unix and Unix-like computer operating systems that lets non-privileged users create their own file systems without editing kernel code. This is achieved by running file system code in user space while the FUSE module provides only a bridge to the actual kernel interfaces.

<span class="mw-page-title-main">Shadow Copy</span> Microsoft technology for storage snapshots

Shadow Copy is a technology included in Microsoft Windows that can create backup copies or snapshots of computer files or volumes, even when they are in use. It is implemented as a Windows service called the Volume Shadow Copy service. A software VSS provider service is also included as part of Windows to be used by Windows applications. Shadow Copy technology requires either the Windows NTFS or ReFS filesystems in order to create and store shadow copies. Shadow Copies can be created on local and external volumes by any Windows component that uses this technology, such as when creating a scheduled Windows Backup or automatic System Restore point.

<span class="mw-page-title-main">Git</span> Distributed version control software system

Git is a distributed version control system that tracks versions of files. It is often used to control source code by programmers collaboratively developing software.

File attributes are a type of meta-data that describe and may modify how files and/or directories in a filesystem behave. Typical file attributes may, for example, indicate or specify whether a file is visible, modifiable, compressed, or encrypted. The availability of most file attributes depends on support by the underlying filesystem where attribute data must be stored along with other control structures. Each attribute can have one of two states: set and cleared. Attributes are considered distinct from other metadata, such as dates and times, filename extensions or file system permissions. In addition to files, folders, volumes and other file system objects may have attributes.

<span class="mw-page-title-main">CHKDSK</span> System tool in DOS, OS/2 and Windows

In computing, CHKDSK is a system tool and command in DOS, Digital Research FlexOS, IBM/Toshiba 4690 OS, IBM OS/2, Microsoft Windows and related operating systems. It verifies the file system integrity of a volume and attempts to fix logical file system errors. It is similar to the fsck command in Unix and similar to Microsoft ScanDisk, which co-existed with CHKDSK in Windows 9x and MS-DOS 6.x.

<span class="mw-page-title-main">Snapshot (computer storage)</span> Recorded state of a computer storage system at a particular point in time

In computer systems, a snapshot is the state of a system at a particular point in time. The term was coined as an analogy to that in photography.

NILFS or NILFS2 is a log-structured file system implementation for the Linux kernel. It was developed by Nippon Telegraph and Telephone Corporation (NTT) CyberSpace Laboratories and a community from all over the world. NILFS was released under the terms of the GNU General Public License (GPL).

Content-addressable storage (CAS), also referred to as content-addressed storage or fixed-content storage, is a way to store information so it can be retrieved based on its content, not its name or location. It has been used for high-speed storage and retrieval of fixed content, such as documents stored for compliance with government regulations. Content-addressable storage is similar to content-addressable memory.

The following tables describe attributes of notable version control and software configuration management (SCM) software systems that can be used to compare and contrast the various systems.

The following tables compare general and technical information for a number of file systems.

ext4 is a journaling file system for Linux, developed as the successor to ext3.

<span class="mw-page-title-main">Time Machine (macOS)</span> Backup software application developed by Apple and distributed as part of macOS

Time Machine is the backup mechanism of macOS, the desktop operating system developed by Apple. The software is designed to work with both local storage devices and network-attached disks, and is commonly used with external disk drives connected using either USB or Thunderbolt. It was first introduced in Mac OS X 10.5 Leopard, which was released in October 2007 and incrementally refined in subsequent releases of macOS. Time Machine was revamped in macOS 11 Big Sur to support APFS, thereby enabling "faster, more compact, and more reliable backups" than were possible previously.

chattr is the command in Linux that allows a user to set certain attributes of a file. lsattr is the command that displays the attributes of a file.

Btrfs is a computer storage format that combines a file system based on the copy-on-write (COW) principle with a logical volume manager, developed together. It was founded by Chris Mason in 2007 for use in Linux, and since November 2013, the file system's on-disk format has been declared stable in the Linux kernel.

Resilient File System (ReFS), codenamed "Protogon", is a Microsoft proprietary file system introduced with Windows Server 2012 with the intent of becoming the "next generation" file system after NTFS.

Bcachefs is a copy-on-write (COW) file system for Linux-based operating systems. Its primary developer, Kent Overstreet, first announced it in 2015, and it was added to the Linux kernel beginning with 6.7. It is intended to compete with the modern features of ZFS or Btrfs, and the speed and performance of ext4 or XFS.

References

  1. Daniel G. Bobrow, Jerry D. Burchfiel, Daniel L. Murphy, Raymond S. Tomlinson, TENEX, A Paged Time Sharing System for the PDP-10 (Communications of the ACM, Vol. 15, pp. 135-143, March 1972)
  2. linux-tux3 on GitHub.
  3. Kiran-Kumar Muniswamy-Reddy; Charles P. Wright; Andrew Himmer; Erez Zadok (8 February 2004). A Versatile and User-Oriented Versioning File System. Third USENIX Conference on File and Storage Technologies (FAST 2004).
  4. "Mac OS X Lion file versions, part 2". 6 August 2011. Retrieved 28 April 2012.
  5. Version Control with Subversion: Next Generation Open Source Version Control
  6. "About Apple File System". Apple Developer Documentation. Retrieved 2021-06-09.
  7. http://www.oracle.com/technetwork/articles/servers-storage-admin/advanced-btrfs-1734952.html Snapshots, Clones, and Seed Devices" "snapshots" sub bullet.
  8. pDumpFS Homepage
  9. "Git Internals". Git is fundamentally a content-addressable filesystem with a VCS user interface written on top of it.
  10. "What is Gitfs". Presslabs. 24 July 2015. Retrieved 2022-03-07.