Quantcast File System

Quantcast File System (QFS)
Developer(s) Sriram Rao, Michael Ovsiannikov, Quantcast
Stable release 2.2.6 / May 6, 2023 [1]
Written in C++
Type Distributed File System
License Apache License 2.0
Website github.com/quantcast/qfs

Quantcast File System (QFS) is an open-source distributed file system software package for large-scale MapReduce or other batch-processing workloads. It was designed as an alternative to the Apache Hadoop Distributed File System (HDFS), intended to deliver better performance and cost-efficiency for large-scale processing clusters.


Design

QFS is software that runs on a cluster of hundreds or thousands of commodity Linux servers and allows other software layers to interact with them as if they were one giant hard drive. It has three components: a metaserver, which keeps the file system's directory structure and chunk locations in memory; chunk servers, which store the file data in chunks on their local drives; and a client library, which applications link against in order to access the file system.
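The division of labor can be illustrated with a toy model. Everything in the sketch below is invented for illustration (the struct names, layout table, and readFile function are hypothetical, not the QFS API); it shows only the key property of the design: the client resolves a path through the metaserver and then moves file data directly to and from the chunk servers, so the data itself never passes through the metaserver.

```cpp
// Toy sketch of the metaserver / chunk server / client library split.
// All names here are invented for illustration; this is not the QFS API.
#include <iostream>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct ChunkServer {                    // holds raw chunk data on local drives
    std::map<int, std::string> chunks;  // chunk id -> chunk contents
};

struct MetaServer {                     // holds only metadata, all in memory
    // path -> ordered list of (chunk server index, chunk id)
    std::map<std::string, std::vector<std::pair<int, int>>> layout;
};

// The client library first asks the metaserver where a file's chunks live,
// then reads the chunks directly from the chunk servers: file bytes never
// flow through the metaserver.
std::string readFile(const MetaServer& meta,
                     const std::vector<ChunkServer>& servers,
                     const std::string& path) {
    std::string out;
    for (auto [srv, id] : meta.layout.at(path))
        out += servers[srv].chunks.at(id);
    return out;
}

int main() {
    std::vector<ChunkServer> servers(3);
    servers[0].chunks[1] = "hello ";
    servers[2].chunks[7] = "world";
    MetaServer meta;
    meta.layout["/demo.txt"] = {{0, 1}, {2, 7}};
    std::cout << readFile(meta, servers, "/demo.txt") << "\n";  // hello world
}
```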

In a cluster of hundreds or thousands of machines, the odds are low that every machine will be running and reachable at any given moment, so fault tolerance is the central design challenge. QFS meets it with Reed–Solomon error correction. When QFS writes a file, it by default stripes the data across nine physically different machines: six stripes hold the data and three hold parity information. Any three of the nine stripes can become unavailable; as long as any six remain readable, QFS can reconstruct the original data. [2] The result is fault tolerance at a cost of a 50% expansion of the data, since nine stripes are stored for every six stripes' worth of content.
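To make the arithmetic concrete, here is a minimal, self-contained sketch of a (9, 6) Reed–Solomon code over GF(2^8). It is non-systematic and unoptimized, so it is not how QFS's production encoder actually works; it only demonstrates the property described above: six data bytes are treated as the coefficients of a degree-5 polynomial, the polynomial is evaluated at nine distinct field points to produce nine stripes, and any six surviving stripes suffice to solve for the six coefficients.

```cpp
// Minimal non-systematic Reed-Solomon (9,6) demo over GF(2^8).
// Illustrative only; QFS's real encoder is systematic and optimized.
#include <cassert>
#include <iostream>
#include <map>
#include <utility>
#include <vector>

static int EXP[512], LOG[256];

void initTables() {                       // GF(2^8) log/antilog tables
    int v = 1;
    for (int i = 0; i < 255; ++i) {
        EXP[i] = v;
        LOG[v] = i;
        v <<= 1;
        if (v & 0x100) v ^= 0x11D;        // primitive polynomial
    }
    for (int i = 255; i < 512; ++i) EXP[i] = EXP[i - 255];
}

int mul(int a, int b) { return (a && b) ? EXP[LOG[a] + LOG[b]] : 0; }
int inv(int a) { return EXP[255 - LOG[a]]; }
int powg(int x, int n) { int r = 1; while (n--) r = mul(r, x); return r; }

const int DATA = 6, TOTAL = 9;            // 6 data + 3 parity stripes

// Encode 6 data bytes as a degree-5 polynomial evaluated at points 1..9.
std::vector<int> encode(const std::vector<int>& d) {
    std::vector<int> out;
    for (int x = 1; x <= TOTAL; ++x) {
        int acc = 0;
        for (int i = DATA - 1; i >= 0; --i) acc = mul(acc, x) ^ d[i]; // Horner
        out.push_back(acc);
    }
    return out;
}

// Recover the 6 data bytes from any 6 surviving stripes by solving the
// 6x6 Vandermonde system with Gaussian elimination (addition is XOR).
std::vector<int> decode(const std::map<int, int>& survivors) {
    std::vector<std::vector<int>> A;
    std::vector<int> b;
    for (auto [idx, val] : survivors) {
        if ((int)b.size() == DATA) break;
        std::vector<int> row;
        for (int j = 0; j < DATA; ++j) row.push_back(powg(idx + 1, j));
        A.push_back(row);
        b.push_back(val);
    }
    for (int c = 0; c < DATA; ++c) {
        int p = c;
        while (!A[p][c]) ++p;             // pivot exists: Vandermonde matrix
        std::swap(A[p], A[c]); std::swap(b[p], b[c]);
        int s = inv(A[c][c]);
        for (int j = 0; j < DATA; ++j) A[c][j] = mul(s, A[c][j]);
        b[c] = mul(s, b[c]);
        for (int r = 0; r < DATA; ++r)
            if (r != c && A[r][c]) {
                int f = A[r][c];
                for (int j = 0; j < DATA; ++j) A[r][j] ^= mul(f, A[c][j]);
                b[r] ^= mul(f, b[c]);
            }
    }
    return b;
}

int main() {
    initTables();
    std::vector<int> data = {'Q', 'F', 'S', '1', '2', '3'};
    auto stripes = encode(data);
    std::map<int, int> survivors;         // lose stripes 0, 4, and 8
    for (int i = 0; i < TOTAL; ++i)
        if (i != 0 && i != 4 && i != 8) survivors[i] = stripes[i];
    assert(decode(survivors) == data);
    std::cout << "recovered from 6 of 9 stripes\n";
}
```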

QFS is written in C++, operates within a fixed memory footprint, and uses direct input/output (direct I/O).
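The direct I/O technique can be illustrated at the system-call level. The Linux-specific sketch below is not QFS code, and the 4096-byte alignment is an assumption (the required alignment depends on the device and filesystem); it simply opens a file with O_DIRECT so that transfers bypass the kernel page cache, which is why the buffer, offset, and length must all be aligned.

```cpp
// Minimal Linux O_DIRECT sketch: writes one aligned block while bypassing
// the page cache. Not QFS code; error handling is kept to a minimum.
#include <fcntl.h>      // open, O_DIRECT (GNU extension; g++ enables it)
#include <unistd.h>     // pwrite, close
#include <cstdio>       // perror
#include <cstdlib>      // posix_memalign, free
#include <cstring>      // memset

int main() {
    const size_t kAlign = 4096;        // assumed device/filesystem alignment
    void* buf = nullptr;
    if (posix_memalign(&buf, kAlign, kAlign) != 0) return 1;  // aligned buffer
    std::memset(buf, 'q', kAlign);

    // O_DIRECT moves data straight between the user buffer and the device,
    // skipping the kernel page cache; buffer address, file offset, and
    // transfer length must all be multiples of the alignment.
    int fd = open("/tmp/qfs_directio_demo",
                  O_CREAT | O_WRONLY | O_DIRECT, 0644);
    if (fd < 0) { perror("open"); std::free(buf); return 1; }

    if (pwrite(fd, buf, kAlign, 0) < 0) perror("pwrite");

    close(fd);
    std::free(buf);
    return 0;
}
```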

History

QFS evolved from the Kosmos File System (KFS), an open source project started by Kosmix in 2005. Quantcast adopted KFS in 2007, built its own improvements on it over the next several years, and released QFS 1.0 as an open source project in September 2012. [3]

Related Research Articles

RAID is a data storage virtualization technology that combines multiple physical disk drive components into one or more logical units for the purposes of data redundancy, performance improvement, or both. This is in contrast to the previous concept of highly reliable mainframe disk drives referred to as "single large expensive disk" (SLED).

Google File System is a proprietary distributed file system developed by Google to provide efficient, reliable access to data using large clusters of commodity hardware. Google File System was replaced by Colossus in 2010.

Lustre is a type of parallel distributed file system, generally used for large-scale cluster computing. The name Lustre is a portmanteau word derived from Linux and cluster. Lustre file system software is available under the GNU General Public License and provides high performance file systems for computer clusters ranging in size from small workgroup clusters to large-scale, multi-site systems. Since June 2005, Lustre has consistently been used by at least half of the top ten, and more than 60 of the top 100 fastest supercomputers in the world, including the world's No. 1 ranked TOP500 supercomputer in November 2022, Frontier, as well as previous top supercomputers such as Fugaku, Titan and Sequoia.

Apache Hadoop is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model. Hadoop was originally designed for computer clusters built from commodity hardware, which is still the common use. It has since also found use on clusters of higher-end hardware. All the modules in Hadoop are designed with a fundamental assumption that hardware failures are common occurrences and should be automatically handled by the framework.

A clustered file system (CFS) is a file system which is shared by being simultaneously mounted on multiple servers. There are several approaches to clustering, most of which do not employ a clustered file system. Clustered file systems can provide features like location-independent addressing and redundancy which improve reliability or reduce the complexity of the other parts of the cluster. Parallel file systems are a type of clustered file system that spread data across multiple storage nodes, usually for redundancy or performance.

Ceph is a free and open-source software-defined storage platform that provides object storage, block storage, and file storage built on a common distributed cluster foundation. Ceph provides completely distributed operation without a single point of failure and scalability to the exabyte level, and is freely available. Since version 12 (Luminous), Ceph does not rely on any other conventional filesystem and directly manages HDDs and SSDs with its own storage backend BlueStore and can expose a POSIX filesystem.


A computer cluster is a set of computers that work together so that they can be viewed as a single system. Unlike grid computers, computer clusters have each node set to perform the same task, controlled and scheduled by software. The newest manifestation of cluster computing is cloud computing.

A reliable multicast is any computer networking protocol that provides a reliable sequence of packets to multiple recipients simultaneously, making it suitable for applications such as multi-receiver file transfer.

CloudStore was Kosmix's C++ implementation of the Google File System. It parallels the Hadoop project, which is implemented in the Java programming language. CloudStore supports incremental scalability, replication, checksumming for data integrity, client side fail-over and access from C++, Java and Python. There is a FUSE module so that the file system can be mounted on Linux.

Sector/Sphere is an open source software suite for high-performance distributed data storage and processing. It can be broadly compared to Google's GFS and MapReduce technology. Sector is a distributed file system targeting data storage over a large number of commodity computers. Sphere is the programming architecture framework that supports in-storage parallel data processing for data stored in Sector. Sector/Sphere operates in a wide area network (WAN) setting.

Moose File System (MooseFS) is an open-source, POSIX-compliant distributed file system developed by Core Technology. MooseFS aims to be a fault-tolerant, highly available, high-performance, scalable, general-purpose network distributed file system for data centers. Initially proprietary software, it was released to the public as open source on May 30, 2008.

XtreemFS is an object-based, distributed file system for wide area networks. Its most notable feature is full fault tolerance while maintaining POSIX file system semantics; fault tolerance is achieved with Paxos-based lease negotiation algorithms, which are used to replicate files and metadata. Support for SSL and X.509 certificates makes XtreemFS usable over public networks.


Apache ZooKeeper is an open-source server for highly reliable distributed coordination of cloud applications. It is a project of the Apache Software Foundation.

BeeGFS is a parallel file system developed and optimized for high-performance computing. BeeGFS includes a distributed metadata architecture for scalability and flexibility. It is best known for its high data throughput.

HPCC (High-Performance Computing Cluster), also known as DAS (Data Analytics Supercomputer), is an open source, data-intensive computing system platform developed by LexisNexis Risk Solutions. The HPCC platform incorporates a software architecture implemented on commodity computing clusters to provide high-performance, data-parallel processing for applications utilizing big data. The HPCC platform includes system configurations to support both parallel batch data processing (Thor) and high-performance online query applications using indexed data files (Roxie). The HPCC platform also includes a data-centric declarative programming language for parallel data processing called ECL.

In computing, a distributed file system (DFS) or network file system is any file system that allows access to files from multiple hosts sharing via a computer network. This makes it possible for multiple users on multiple machines to share files and storage resources.

A distributed file system for cloud is a file system that allows many clients to have access to data and supports operations on that data. Each data file may be partitioned into several parts called chunks. Each chunk may be stored on different remote machines, facilitating the parallel execution of applications. Typically, data is stored in files in a hierarchical tree, where the nodes represent directories. There are several ways to share files in a distributed architecture: each solution must be suitable for a certain type of application, depending on how complex the application is. Meanwhile, the security of the system must be ensured. Confidentiality, availability, and integrity are the main requirements for a secure system.


Apache Spark is an open-source unified analytics engine for large-scale data processing. Spark provides an interface for programming clusters with implicit data parallelism and fault tolerance. Originally developed at the University of California, Berkeley's AMPLab, the Spark codebase was later donated to the Apache Software Foundation, which has maintained it since.

LizardFS is an open source distributed file system that is POSIX-compliant and licensed under GPLv3. It was released in 2013 as a fork of MooseFS. LizardFS also offers paid technical support, including cluster setup, configuration, and active cluster monitoring.

The MapR File System is a clustered file system that supports both very large-scale and high-performance uses. MapR FS supports a variety of interfaces including conventional read/write file access via NFS and a FUSE interface, as well as via the HDFS interface used by many systems such as Apache Hadoop and Apache Spark. In addition to file-oriented access, MapR FS supports access to tables and message streams using the Apache HBase and Apache Kafka APIs, as well as via a document database interface.

References

  1. Release 2.2.6
  2. "QFS improves performance of Hadoop file system - Strata". Archived from the original on 2012-11-19. Retrieved 2012-12-06.
  3. "Quantcast releases bigger, faster, stronger Hadoop file system — Tech News and Analysis". Archived from the original on 2012-12-03. Retrieved 2012-12-06.