Linux-HA

Heartbeat
Initial release: 1999
Final release: 3.0.6 / February 2015
Written in: C, Python
Operating system: Linux, several UNIX variants
Type: Cluster messaging layer
License: GNU General Public License v2, GNU Lesser General Public License v2.1
Website: Archived 2009-05-08 at the Wayback Machine

The Linux-HA (High-Availability Linux) project provides a high-availability (clustering) solution for Linux, FreeBSD, OpenBSD, Solaris and Mac OS X that promotes reliability, availability, and serviceability (RAS). [1]

The project's main software product is Heartbeat, a GPL-licensed portable cluster management program for high-availability clustering. Its most important features are:

History

The project originated from a mailing list started in November 1997. Eventually Harald Milz wrote an unusual Linux-HA HOWTO. Unlike most HOWTOs, it did not explain how to configure or use existing software; it was a collection of HA techniques that one could use when writing HA software for Linux.

Alan Robertson was inspired by this description and thought that he could write some of the software himself, as a seed crystal to help jump-start the project. He got this initial software running on 18 March 1998. [2] He created the first web site for the project on 19 October 1998, [3] and the first version of the software was released on 15 November 1998. [4] The first production customer of the software was Rudy Pawul of ISO-NE. The ISO-NE web site went into production in the second half of 1999.

At this point, the software was limited to two nodes and very simple takeover semantics, with no resource monitoring. [1]

These limitations were addressed in version 2 of the software, which added n-node clusters, resource monitoring, dependencies, and policies. Version 2.0.0 came out on 29 July 2005. [5] This release was another important milestone, as it was the first version to which the Linux-HA community at large made very large contributions (in terms of code size). This series of releases brought the project to feature parity with, or superiority to, commercial HA software.

After version 2.1.4, the cluster resource manager component (responsible for starting and stopping resources and monitoring resource and node failure) was split off into a separate project called Pacemaker, [6] and the resource agents and other "glue" infrastructure were moved to separate packages. Thus with the version 3 series, the name Heartbeat should be used for the cluster messaging layer only. [7]

Notes

  1. Alan Robertson. The Evolution of The LinuxHA project. IBM Linux Technology Center, 2010.
  2. "Linux-HA heart beats!". Lists.linux-ha.org. Archived from the original on 2008-11-19. Retrieved 2016-03-04.
  3. "MAC addr takeover". Lists.linux-ha.org. 1998-10-16. Archived from the original on 2011-07-19. Retrieved 2016-03-04.
  4. "Heartbeat Software Now Available". Archived from the original on 2005-11-16. Retrieved 2017-04-28.
  5. "[Linux-HA] Heartbeat, DRBD, Named-chroot, Fedora Core 4". Lists.linux-ha.org. Archived from the original on 2008-07-05. Retrieved 2016-03-04.
  6. "Project History". ClusterLabs.org. Retrieved 2016-03-04.
  7. "Heartbeat". Linux-HA.org. 2010-01-25. Archived from the original on 2016-03-04. Retrieved 2016-03-04.
