Gluster

From Wikipedia, the free encyclopedia
Gluster, Inc.
Company type: Privately funded
Industry: Software, computer storage
Founded: 2005
Founders: Anand Avati, Anand Babu Periasamy
Headquarters: Sunnyvale, California
Number of locations: 2
Key people: Anand Babu (AB) Periasamy (CTO) and Hitesh Chellani (CEO)
Products: Cloud storage
Number of employees: 60
Website: www.gluster.org

Gluster, Inc. (formerly known as Z RESEARCH) [1] [2] [3] was a software company that provided an open-source platform for scale-out public and private cloud storage. The company was privately funded, with backing from Nexus Venture Partners and Index Ventures, and was headquartered in Sunnyvale, California, with an engineering center in Bangalore, India. Red Hat acquired Gluster on October 7, 2011. [4]

History

The name Gluster comes from the combination of the terms GNU and cluster. [2] Despite the similarity in names, Gluster is not related to the Lustre file system and does not incorporate any Lustre code. Gluster based its product on GlusterFS, an open-source software-based network-attached filesystem that deploys on commodity hardware. [5] The initial version of GlusterFS was written by Anand Babu Periasamy, Gluster's founder and CTO. [6] In May 2010 Ben Golub became the president and chief executive officer. [7] [8]

Red Hat became the primary author and maintainer of the GlusterFS open-source project after acquiring the Gluster company in October 2011. [4] The product was first marketed as Red Hat Storage Server, but was renamed Red Hat Gluster Storage in early 2015, since Red Hat had also acquired the Ceph file system technology. [9]

Red Hat Gluster Storage is in the retirement phase of its life cycle, with an end-of-support date of December 31, 2024. [10]

Architecture

The GlusterFS architecture aggregates compute, storage, and I/O resources into a global namespace. Each server plus its attached commodity storage (configured as direct-attached storage, JBOD, or using a storage area network) is considered to be a node. Capacity is scaled by adding nodes or by adding storage to each node. Performance is increased by spreading storage across more nodes. High availability is achieved by replicating data n-way between nodes.
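The relationship between node count, per-node storage, and n-way replication can be illustrated with a little arithmetic. The following is a minimal sketch, not a sizing tool: the function name `usable_capacity_tb` and its parameters are hypothetical, and it assumes identical nodes and ignores file system overhead.

```python
def usable_capacity_tb(nodes: int, raw_tb_per_node: float, replicas: int = 1) -> float:
    """Estimate usable capacity when every file is stored `replicas` times.

    Hypothetical helper for illustration only: assumes all nodes hold the
    same amount of raw storage and ignores file system overhead.
    """
    if nodes < 1 or replicas < 1:
        raise ValueError("nodes and replicas must be at least 1")
    if replicas > nodes:
        raise ValueError("cannot keep more replicas than there are nodes")
    # Raw capacity grows with the node count; each extra replica divides it.
    return nodes * raw_tb_per_node / replicas


# Scaling out: adding two nodes to a 2-way replicated pool of 10 TB nodes.
before = usable_capacity_tb(nodes=4, raw_tb_per_node=10.0, replicas=2)
after = usable_capacity_tb(nodes=6, raw_tb_per_node=10.0, replicas=2)
print(before, after)
```

Under these assumptions, capacity grows linearly with the number of nodes, while a higher replica count trades usable capacity for availability.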

Public cloud deployment

For public cloud deployments, GlusterFS offers an Amazon Web Services (AWS) Amazon Machine Image (AMI), which is deployed on Elastic Compute Cloud (EC2) instances rather than on physical servers; the underlying storage is Amazon's Elastic Block Store (EBS). [11] In this environment, capacity is scaled by deploying more EBS storage units, performance is scaled by deploying more EC2 instances, and availability is scaled by n-way replication between AWS availability zones.

Private cloud deployment

A typical on-premises or private cloud deployment consists of GlusterFS installed either as a virtual appliance on top of multiple commodity servers running hypervisors such as KVM, Xen, or VMware, or directly on bare metal. [12]

GlusterFS

Original author(s): Gluster
Developer(s): Red Hat, Inc.
Stable release: 10.1 [13] / 19 January 2022
Operating system: Linux, OS X, FreeBSD, NetBSD, OpenSolaris
Type: Distributed file system
License: GNU General Public License v3 [14]
Website: www.gluster.org

GlusterFS is a scale-out network-attached storage file system. It has found applications including cloud computing, streaming media services, and content delivery networks. GlusterFS was developed originally by Gluster, Inc. and then by Red Hat, Inc., as a result of Red Hat acquiring Gluster in 2011. [15]

In June 2012, Red Hat Storage Server was announced as a commercially supported integration of GlusterFS with Red Hat Enterprise Linux. [16] In April 2014, Red Hat bought Inktank Storage, the company behind the Ceph distributed file system, and subsequently re-branded the GlusterFS-based Red Hat Storage Server as "Red Hat Gluster Storage". [17]

Design

GlusterFS aggregates various storage servers over an Ethernet or InfiniBand RDMA interconnect into one large parallel network file system. It is free software, with some parts licensed under the GNU General Public License (GPL) v3 while others are dual licensed under either GPL v2 or the Lesser General Public License (LGPL) v3. GlusterFS is based on a stackable user space design.

GlusterFS has a client and a server component. Servers are typically deployed as storage bricks, with each server running a glusterfsd daemon to export a local file system as a volume. The glusterfs client process, which connects to servers with a custom protocol over TCP/IP, InfiniBand or Sockets Direct Protocol, creates composite virtual volumes from multiple remote servers using stackable translators. By default, files are stored whole, but striping of files across multiple remote volumes is also possible. The client may mount the composite volume using the GlusterFS native protocol via the FUSE mechanism, mount it over the NFS v3 protocol through a built-in server translator, or access it via the gfapi client library. The client may also re-export a native-protocol mount, for example via the kernel NFSv4 server, Samba, or the object-based OpenStack Storage (Swift) protocol using the "UFO" (Unified File and Object) translator.

Most of the functionality of GlusterFS is implemented as translators, including file-based mirroring and replication, file-based striping, file-based load balancing, volume failover, scheduling and disk caching, storage quotas, and volume snapshots with user serviceability (since GlusterFS version 3.6).
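The idea of stackable translators can be illustrated with a small, self-contained sketch. It is not GlusterFS code and does not use any GlusterFS API: the class names (`Translator`, `Storage`, `Quota`, `ReadCache`) are hypothetical, and real translators are loadable modules that handle many more file operations. The sketch only shows how each layer wraps the one below it and adds a single feature.

```python
class Translator:
    """One layer in the stack; by default it forwards calls to the layer below."""

    def __init__(self, child=None):
        self.child = child

    def write(self, path, data):
        return self.child.write(path, data)

    def read(self, path):
        return self.child.read(path)


class Storage(Translator):
    """Bottom layer: an in-memory dict standing in for a brick's local file system."""

    def __init__(self):
        super().__init__()
        self.files = {}

    def write(self, path, data):
        self.files[path] = data

    def read(self, path):
        return self.files[path]


class Quota(Translator):
    """Rejects writes that would push the total stored bytes over a limit."""

    def __init__(self, child, limit_bytes):
        super().__init__(child)
        self.limit_bytes = limit_bytes
        self.sizes = {}

    def write(self, path, data):
        projected = sum(self.sizes.values()) - self.sizes.get(path, 0) + len(data)
        if projected > self.limit_bytes:
            raise OSError("quota exceeded")
        self.child.write(path, data)
        self.sizes[path] = len(data)


class ReadCache(Translator):
    """Serves repeated reads from memory and invalidates entries on write."""

    def __init__(self, child):
        super().__init__(child)
        self.cache = {}
        self.hits = 0

    def write(self, path, data):
        self.cache.pop(path, None)
        self.child.write(path, data)

    def read(self, path):
        if path in self.cache:
            self.hits += 1
            return self.cache[path]
        data = self.child.read(path)
        self.cache[path] = data
        return data


# Compose the stack: cache on top of quota on top of storage.
brick = Storage()
stack = ReadCache(Quota(brick, limit_bytes=16))
stack.write("/a.txt", b"hello")
stack.read("/a.txt")  # first read goes down to the storage layer
stack.read("/a.txt")  # second read is answered by the cache layer
```

Because every layer exposes the same interface, features can be added, removed, or reordered by changing only how the stack is composed.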

The GlusterFS server is intentionally kept simple: it exports an existing directory as-is, leaving it up to client-side translators to structure the store. The clients themselves are stateless, do not communicate with each other, and are expected to have translator configurations consistent with each other. GlusterFS relies on an elastic hashing algorithm, rather than using either a centralized or distributed metadata model. The user can add, delete, or migrate volumes dynamically, which helps to avoid configuration coherency problems. This allows GlusterFS to scale up to several petabytes on commodity hardware by avoiding bottlenecks that normally affect more tightly coupled distributed file systems.
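Hash-based placement without a metadata server can be sketched in a few lines. This is a simplified illustration of the general technique, not GlusterFS's actual elastic hashing algorithm: the function `brick_for`, the brick names, and the use of MD5 as the hash are all assumptions made for the example.

```python
import hashlib
from typing import Sequence


def brick_for(path: str, bricks: Sequence[str]) -> str:
    """Pick the brick for a file by hashing its path (illustrative only).

    The 32-bit hash space is split into equal contiguous ranges, one per
    brick. MD5 is a stand-in hash chosen for this sketch.
    """
    if not bricks:
        raise ValueError("at least one brick is required")
    digest = hashlib.md5(path.encode("utf-8")).digest()
    value = int.from_bytes(digest[:4], "big")  # 32-bit hash value
    span = 2**32 // len(bricks)                # width of each brick's range
    index = min(value // span, len(bricks) - 1)  # clamp the rounding remainder
    return bricks[index]


# Hypothetical brick names; any client with the same list gets the same answer.
bricks = ["server1:/export", "server2:/export", "server3:/export"]
for name in ["/videos/a.mp4", "/docs/report.pdf", "/logs/2011-10-07.log"]:
    print(name, "->", brick_for(name, bricks))
```

Since the location is computed from the file name and the brick list alone, clients need no lookup service and do not have to talk to each other, provided they all share the same configuration.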

GlusterFS provides data reliability and availability through two kinds of replication: replicated volumes and geo-replication. [18] Replicated volumes keep copies of each file on more than one brick, so if one brick fails, at least one copy remains stored and accessible. Geo-replication follows a master-slave model in which volumes are copied across geographically distinct locations. It runs asynchronously and is useful for availability in case of a whole data center failure.
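The behaviour of a replicated volume when a brick fails can be modelled with a toy in-memory example. This is a sketch under simplifying assumptions, not GlusterFS's implementation: the class `ReplicatedVolume` and the brick names are hypothetical, writes are applied synchronously to every online brick, and healing of a brick that comes back online is not modelled.

```python
class ReplicatedVolume:
    """Toy n-way replicated volume; each brick is modelled as a dict."""

    def __init__(self, brick_names):
        self.bricks = {name: {} for name in brick_names}
        self.online = set(brick_names)

    def write(self, path, data):
        if not self.online:
            raise OSError("no bricks online")
        # Store a full copy of the file on every brick that is reachable.
        for name in self.online:
            self.bricks[name][path] = data

    def fail(self, name):
        """Simulate a brick going offline."""
        self.online.discard(name)

    def read(self, path):
        # Any surviving brick that holds a copy can serve the read.
        for name in sorted(self.online):
            if path in self.bricks[name]:
                return self.bricks[name][path]
        raise FileNotFoundError(path)


volume = ReplicatedVolume(["brick-a", "brick-b"])
volume.write("/report.txt", b"quarterly numbers")
volume.fail("brick-a")
print(volume.read("/report.txt"))  # still served, from brick-b
```

Geo-replication differs from this model in that copies are shipped to the remote site asynchronously rather than written in step with the original.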

GlusterFS has been used as the foundation for academic research [19] [20] and a survey article. [21]

Red Hat markets the software for three deployment types: "on-premises", public cloud and "private cloud". [22]

References

  1. "About Us". gluster.com. 2008. Archived from the original on 2010-09-09. Retrieved 2022-07-31.
  2. Raj, Chandan (2011-09-20). "California based Indian Entrepreneurs powering petabytes of cloud storage, the Gluster story". YourStory. Bengaluru, India: Scribd. Retrieved 2022-07-31.
  3. Chellani, Hitesh (2007-05-12). "Roadmap and support questions". gluster-devel (Mailing list). Retrieved 31 July 2022. Z Research was officially formed in June 2005 by AB (Anand Babu) aka "rooty" who is the CTO and myself with the goal of commoditizing Supercomputing and Superstorage and in the process validating yet another a business model around "Free Software", thus evangelizing "Free Software" and promoting the fact building businesses around "Free Software" is the way forward.
  4. "Red Hat to Acquire Gluster". redhat.com. October 4, 2011. Archived from the original on May 30, 2013. Retrieved 2013-08-16.
  5. "Gluster: Open source scale-out NAS". InfoStor.com. 2011-02-17. Retrieved 2013-08-16.
  6. Kovar, Joseph F. (21 June 2010). "Page 17 - 2010 Storage Superstars: 25 You Need To Know". Crn.com. Retrieved 2013-08-16.
  7. Jason Kincaid (May 18, 2010). "Former Plaxo CEO Ben Golub Joins Gluster, An Open Source Storage Platform Startup". TechCrunch. Retrieved August 20, 2013.
  8. "Former Plaxo CEO takes top spot at Gluster". Silicon Valley Business Journal. May 19, 2010. Retrieved August 20, 2013.
  9. "New product names. Same Great features". Archived from the original on April 2, 2015. Retrieved October 27, 2016.
  10. Red Hat access website (2022-10-10). "Red Hat Gluster Storage Life Cycle".
  11. Nathan Eddy (2011-02-11). "Gluster Introduces NAS Virtual Appliances for VMware, Amazon Web Services". Eweek.com. Retrieved 2013-08-16.
  12. "Gluster Virtual Storage Appliance". Storage Switzerland, LLC. Retrieved 1 September 2013.
  13. "glusterfs-10.1 released". 19 January 2022. Retrieved 11 September 2022.
  14. "Gluster 3.1: Understanding the GlusterFS License". Gluster Documentation. Gluster.org. Archived from the original on 3 May 2016. Retrieved 30 April 2014.
  15. Timothy Prickett Morgan (4 October 2011). "Red Hat snatches storage Gluster file system for $136m". The Register. Retrieved 3 July 2016.
  16. Timothy Prickett Morgan (27 June 2012). "Red Hat Storage Server NAS takes on Lustre, NetApp". The Register. Retrieved 30 May 2013.
  17. "Red Hat Storage. New product names. Same great features". redhat.com. 20 March 2015. Archived from the original on 2 April 2015. Retrieved 20 March 2015.
  18. "GlusterFS Documentation". Retrieved January 28, 2018.
  19. Noronha, Ranjit; Panda, Dhabaleswar K (9–12 September 2008). IMCa: A High Performance Caching Front-End for GlusterFS on InfiniBand (PDF). 37th International Conference on Parallel Processing, 2008. ICPP '08. IEEE. doi:10.1109/ICPP.2008.84. Retrieved 14 June 2011.
  20. Kwidama, Sevickson (2007–2008), Streaming and storing CineGrid data: A study on optimization methods (PDF), University of Amsterdam System and Network Engineering, archived from the original (PDF) on 2014-03-08, retrieved 10 June 2011
  21. Klaver, Jeroen; van der Jagt, Roel (14 July 2010), Distributed file system on the SURFnet network Report (PDF), University of Amsterdam System and Network Engineering, retrieved 9 June 2012 [dead link]
  22. "Red Hat Storage Server". Web site. Red Hat. Retrieved 30 May 2013.