| Company type | Privately funded |
|---|---|
| Industry | Software, computer storage |
| Founded | 2005 |
| Founders | Anand Avati, Anand Babu Periasamy |
| Headquarters | Sunnyvale, California, U.S. |
| Number of locations | 2 |
| Key people | Anand Babu (AB) Periasamy (CTO), Hitesh Chellani (CEO) |
| Products | Cloud storage |
| Number of employees | 60 |
| Website | www |
Gluster Inc. (formerly known as Z Research [1] [2] [3]) was a software company that provided an open-source platform for scale-out public and private cloud storage. The company was privately funded and headquartered in Sunnyvale, California, with an engineering center in Bangalore, India. Gluster was funded by Nexus Venture Partners and Index Ventures. Gluster was acquired by Red Hat on October 7, 2011. [4]
The name Gluster comes from the combination of the terms GNU and cluster. [2] Despite the similarity in names, Gluster is not related to the Lustre file system and does not incorporate any Lustre code. Gluster based its product on GlusterFS, an open-source software-based network-attached filesystem that deploys on commodity hardware. [5] The initial version of GlusterFS was written by Anand Babu Periasamy, Gluster's founder and CTO. [6] In May 2010 Ben Golub became the president and chief executive officer. [7] [8]
Red Hat became the primary author and maintainer of the GlusterFS open-source project after acquiring the Gluster company in October 2011. [4] The product was first marketed as Red Hat Storage Server, but in early 2015 it was renamed Red Hat Gluster Storage, since Red Hat had by then also acquired the Ceph file system technology. [9]
Red Hat Gluster Storage is in the retirement phase of its lifecycle, with an end-of-support date of December 31, 2024. [10]
The GlusterFS architecture aggregates compute, storage, and I/O resources into a global namespace. Each server plus attached commodity storage (configured as direct-attached storage, JBOD, or using a storage area network) is considered to be a node. Capacity is scaled by adding additional nodes or adding additional storage to each node. Performance is increased by deploying storage among more nodes. High availability is achieved by replicating data n-way between nodes.
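This scaling model maps directly onto the standard gluster command-line workflow. The sketch below simply shells out to the gluster CLI and assumes a hypothetical two-node pool with bricks under /data/brick1 (all hostnames and paths are placeholders): it joins the nodes into a trusted storage pool and creates a two-way replicated volume, so each file is stored on both nodes.

```python
import subprocess

def gluster(*args):
    """Run a gluster CLI command on a pool node (requires root)."""
    subprocess.run(["gluster", *args], check=True)

# Add a second server to the trusted storage pool
# (hostnames and brick paths here are placeholders).
gluster("peer", "probe", "node2.example.com")

# Create a volume replicated 2-way across the two nodes, then start it.
# Capacity can be scaled later with "gluster volume add-brick".
gluster("volume", "create", "gv0", "replica", "2",
        "node1.example.com:/data/brick1/gv0",
        "node2.example.com:/data/brick1/gv0")
gluster("volume", "start", "gv0")
```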
For public cloud deployments, GlusterFS offers an Amazon Web Services (AWS) Amazon Machine Image (AMI), which is deployed on Elastic Compute Cloud (EC2) instances rather than on physical servers, with Amazon Elastic Block Store (EBS) as the underlying storage. [11] In this environment, capacity is scaled by deploying more EBS storage units, performance is scaled by deploying more EC2 instances, and availability is scaled by n-way replication between AWS availability zones.
A typical on-premises or private cloud deployment consists of GlusterFS installed as a virtual appliance on top of multiple commodity servers running hypervisors such as KVM, Xen, or VMware, or on bare metal. [12]
| Original author(s) | Gluster |
|---|---|
| Developer(s) | Red Hat, Inc. |
| Stable release | |
| Operating system | Linux, OS X, FreeBSD, NetBSD, OpenSolaris |
| Type | Distributed file system |
| License | GNU General Public License v3 [14] |
| Website | www |
GlusterFS is a scale-out network-attached storage file system. It has found applications in cloud computing, streaming media services, and content delivery networks. GlusterFS was developed originally by Gluster, Inc. and then by Red Hat, Inc., as a result of Red Hat acquiring Gluster in 2011. [15]
In June 2012, Red Hat Storage Server was announced as a commercially supported integration of GlusterFS with Red Hat Enterprise Linux. [16] In April 2014 Red Hat bought Inktank Storage, the company behind the Ceph distributed file system, and re-branded the GlusterFS-based Red Hat Storage Server as "Red Hat Gluster Storage". [17]
GlusterFS aggregates various storage servers over Ethernet or InfiniBand RDMA interconnects into one large parallel network file system. It is free software, with some parts licensed under the GNU General Public License (GPL) v3 while others are dual licensed under either GPL v2 or the Lesser General Public License (LGPL) v3. GlusterFS is based on a stackable user-space design.
GlusterFS has a client and a server component. Servers are typically deployed as storage bricks, with each server running a glusterfsd daemon to export a local file system as a volume. The glusterfs client process, which connects to servers with a custom protocol over TCP/IP, InfiniBand, or Sockets Direct Protocol, creates composite virtual volumes from multiple remote servers using stackable translators. By default, files are stored whole, but striping of files across multiple remote volumes is also possible. The client may mount the composite volume using the GlusterFS native protocol via the FUSE mechanism, or using the NFS v3 protocol via a built-in server translator, or access the volume through the gfapi client library. The client may re-export a native-protocol mount, for example via the kernel NFSv4 server, Samba, or the object-based OpenStack Storage (Swift) protocol using the "UFO" (Unified File and Object) translator.
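As an illustration of the native client path, the sketch below mounts a volume over FUSE and then uses it as an ordinary POSIX filesystem; the server name, volume name, and mount point are placeholders, and the mount command requires root.

```python
import os
import subprocess

# Mount the composite volume via the FUSE-based native protocol
# (server, volume, and mount point are placeholders; requires root).
subprocess.run(["mount", "-t", "glusterfs",
                "node1.example.com:/gv0", "/mnt/gluster"], check=True)

# Once mounted, the volume behaves like any local POSIX filesystem;
# the client-side translators decide which bricks hold the data.
with open("/mnt/gluster/hello.txt", "w") as f:
    f.write("stored on the cluster\n")
print(os.listdir("/mnt/gluster"))
```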
Most of the functionality of GlusterFS is implemented as translators, including file-based mirroring and replication, file-based striping, file-based load balancing, volume failover, scheduling and disk caching, storage quotas, and volume snapshots with user serviceability (since GlusterFS version 3.6).
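Many of these translators are toggled per volume from the management CLI. As a brief sketch (the volume name gv0 and the directory path are placeholders), quotas and snapshots can be enabled like this:

```python
import subprocess

def gluster(*args):
    subprocess.run(["gluster", *args], check=True)

# Enable the quota translator and cap a directory at 10 GB.
gluster("volume", "quota", "gv0", "enable")
gluster("volume", "quota", "gv0", "limit-usage", "/projects", "10GB")

# Take a named point-in-time snapshot (GlusterFS 3.6 or later;
# snapshots require bricks backed by thinly provisioned LVM).
gluster("snapshot", "create", "snap1", "gv0")
```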
The GlusterFS server is intentionally kept simple: it exports an existing directory as-is, leaving it up to client-side translators to structure the store. The clients themselves are stateless, do not communicate with each other, and are expected to have translator configurations consistent with each other. GlusterFS relies on an elastic hashing algorithm, rather than using either a centralized or distributed metadata model. The user can add, delete, or migrate volumes dynamically, which helps to avoid configuration coherency problems. This allows GlusterFS to scale up to several petabytes on commodity hardware by avoiding bottlenecks that normally affect more tightly coupled distributed file systems.
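The effect of the elastic hashing approach can be shown with a deliberately simplified toy model: every client hashes a file's name and maps it to a brick, so any client can locate any file without consulting a metadata server. (The real distribute translator assigns hash ranges to bricks on a per-directory basis, but the principle is the same.)

```python
import hashlib

# Hypothetical brick list; every GlusterFS client sees the same volume
# layout, so all clients compute the same placement independently.
BRICKS = ["node1:/data/brick1", "node2:/data/brick1", "node3:/data/brick1"]

def locate(filename: str) -> str:
    """Toy placement: hash the file name onto one of the bricks.
    No central metadata lookup is needed, only the shared brick list."""
    digest = int(hashlib.md5(filename.encode()).hexdigest(), 16)
    return BRICKS[digest % len(BRICKS)]

for name in ("a.txt", "b.txt", "c.txt"):
    print(name, "->", locate(name))
```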
GlusterFS provides data reliability and availability through various kinds of replication: replicated volumes and geo-replication. [18] Replicated volumes ensure that at least one copy of each file exists across the bricks, so that if one brick fails, the data is still stored and accessible. Geo-replication provides a master-slave model of replication, in which volumes are copied across geographically distinct locations. This happens asynchronously and is useful for availability in case of a whole data center failure.
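A geo-replication session between a local volume and a volume at a remote site is managed through the same CLI. In the sketch below, all host and volume names are placeholders, and passwordless SSH from the primary to the remote site is assumed to be configured already.

```python
import subprocess

def gluster(*args):
    subprocess.run(["gluster", *args], check=True)

# Create and start an asynchronous geo-replication session from the
# local volume gv0 to a volume at a geographically remote site.
remote = "backup.example.com::gv0-backup"
gluster("volume", "geo-replication", "gv0", remote, "create", "push-pem")
gluster("volume", "geo-replication", "gv0", remote, "start")
gluster("volume", "geo-replication", "gv0", remote, "status")
```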
GlusterFS has been used as the foundation for academic research [19] [20] and a survey article. [21]
Red Hat markets the software for three markets: "on-premises", public cloud and "private cloud". [22]
Network-attached storage (NAS) is a file-level computer data storage server connected to a computer network providing data access to a heterogeneous group of clients. The term "NAS" can refer to both the technology and systems involved, or a specialized device built for such functionality.
In computing, the Global File System 2 or GFS2 is a shared-disk file system for Linux computer clusters. GFS2 allows all members of a cluster to have direct concurrent access to the same shared block storage, in contrast to distributed file systems which distribute data throughout the cluster. GFS2 can also be used as a local file system on a single computer.
Filesystem in Userspace (FUSE) is a software interface for Unix and Unix-like computer operating systems that lets non-privileged users create their own file systems without editing kernel code. This is achieved by running file system code in user space while the FUSE module provides only a bridge to the actual kernel interfaces.
Google File System is a proprietary distributed file system developed by Google to provide efficient, reliable access to data using large clusters of commodity hardware. The Google File System was replaced by Colossus in 2010.
GPFS is high-performance clustered file system software developed by IBM. It can be deployed in shared-disk or shared-nothing distributed parallel modes, or a combination of these. It is used by many of the world's largest commercial companies, as well as some of the supercomputers on the TOP500 list. For example, it is the file system of Summit at Oak Ridge National Laboratory, which was the fastest supercomputer in the world in the November 2019 TOP500 list. Summit is a 200-petaflops system composed of more than 9,000 POWER9 processors and 27,000 NVIDIA Volta GPUs. The storage file system is called Alpine.
A clustered file system (CFS) is a file system which is shared by being simultaneously mounted on multiple servers. There are several approaches to clustering, most of which do not employ a clustered file system. Clustered file systems can provide features like location-independent addressing and redundancy which improve reliability or reduce the complexity of the other parts of the cluster. Parallel file systems are a type of clustered file system that spread data across multiple storage nodes, usually for redundancy or performance.
Ceph is a free and open-source software-defined storage platform that provides object storage, block storage, and file storage built on a common distributed cluster foundation. Ceph provides completely distributed operation without a single point of failure and scalability to the exabyte level, and is freely available. Since version 12 (Luminous), Ceph does not rely on any other conventional filesystem and directly manages HDDs and SSDs with its own storage backend BlueStore and can expose a POSIX filesystem.
oVirt is a free, open-source virtualization management platform. It was founded by Red Hat as a community project on which Red Hat Virtualization is based. It allows centralized management of virtual machines, compute, storage, and networking resources from an easy-to-use web-based front-end with platform-independent access. KVM on the x86-64, PowerPC64, and s390x architectures is the only hypervisor supported, but there is an ongoing effort to support the ARM architecture in a future release.
Moose File System (MooseFS) is an open-source, POSIX-compliant distributed file system developed by Core Technology. MooseFS aims to be a fault-tolerant, highly available, high-performing, scalable, general-purpose network distributed file system for data centers. Initially proprietary software, it was released to the public as open source on May 30, 2008.
Tahoe-LAFS is a free and open, secure, decentralized, fault-tolerant, distributed data store and distributed file system. It can be used as an online backup system, or to serve as a file or Web host similar to Freenet, depending on the front-end used to insert and access files in the Tahoe system. Tahoe can also be used in a RAID-like fashion using multiple disks to make a single large Redundant Array of Inexpensive Nodes (RAIN) pool of reliable data storage.
RozoFS is a distributed file system released as free software under the GNU GPL v2. RozoFS uses erasure coding for redundancy.
In computing, a distributed file system (DFS) or network file system is any file system that allows access to files from multiple hosts sharing via a computer network. This makes it possible for multiple users on multiple machines to share files and storage resources.
Software-defined storage (SDS) is a marketing term for computer data storage software for policy-based provisioning and management of data storage independent of the underlying hardware. Software-defined storage typically includes a form of storage virtualization to separate the storage hardware from the software that manages it. The software enabling a software-defined storage environment may also provide policy management for features such as data deduplication, replication, thin provisioning, snapshots and backup.
Red Hat Gluster Storage, formerly Red Hat Storage Server, is a computer storage product from Red Hat. It is based on open source technologies such as GlusterFS and Red Hat Enterprise Linux.
A distributed file system for cloud is a file system that allows many clients to have access to data and supports operations on that data. Each data file may be partitioned into several parts called chunks. Each chunk may be stored on different remote machines, facilitating the parallel execution of applications. Typically, data is stored in files in a hierarchical tree, where the nodes represent directories. There are several ways to share files in a distributed architecture: each solution must be suitable for a certain type of application, depending on how complex the application is. Meanwhile, the security of the system must be ensured. Confidentiality, availability and integrity are the main keys for a secure system.
ObjectiveFS is a distributed file system developed by Objective Security Corp. It is a POSIX-compliant file system built with an object store backend. It was initially released with an AWS S3 backend, and later added support for Google Cloud Storage and object store devices. It was released for beta in early 2013, and the first version was officially released on August 11, 2013.
Dell Technologies PowerFlex is a commercial software-defined storage product from Dell Technologies that creates a server-based storage area network (SAN) from local server storage using x86 servers. It converts this direct-attached storage into shared block storage that runs over an IP-based network.
LizardFS is an open-source distributed file system that is POSIX-compliant and licensed under GPLv3. It was released in 2013 as a fork of MooseFS. LizardFS also offers paid technical support, including cluster setup and configuration and active cluster monitoring.
ONTAP, also known as Data ONTAP, Clustered Data ONTAP (cDOT), or Data ONTAP 7-Mode, is NetApp's proprietary operating system used in storage disk arrays such as NetApp FAS and AFF, ONTAP Select, and Cloud Volumes ONTAP. With the release of version 9.0, NetApp simplified the name by dropping the word "Data" and removed the 7-Mode image; ONTAP 9 is therefore the successor of Clustered Data ONTAP 8.
According to one of its co-founders: "Z Research was officially formed in June 2005 by AB (Anand Babu), aka 'rooty', who is the CTO, and myself, with the goal of commoditizing supercomputing and superstorage, and in the process validating yet another business model around 'Free Software', thus evangelizing 'Free Software' and promoting the idea that building businesses around 'Free Software' is the way forward."