OpenSSI

OpenSSI
Developer(s) OpenSSI Team [1]
Stable release
1.9.3 [1] / 1 September 2007
Preview release
1.9.6 [1] / 18 February 2010
Operating system Linux
Type Cluster software
License GPL v2
Website https://openssi.org (archive.org)

OpenSSI is an open-source single-system image clustering system. It allows a collection of computers to be treated as one large system, giving processes running on any one machine access to the resources of all the machines in the cluster. [2] [3]


OpenSSI is based on the Linux operating system and was released as an open source project by Compaq in 2001. [4] It is the final stage of a long process of development, stretching back to LOCUS, developed in the early 1980s.

Description

OpenSSI allows a cluster of individual computers (nodes) to be treated as one large system. Processes running on any node have full access to the resources of all nodes. Processes can be migrated from node to node automatically to balance system utilization. Inbound network connections can be directed to the least loaded node available.

OpenSSI is designed to be used for both high-performance and high-availability clusters. It is possible to create an OpenSSI cluster with no single point of failure; for example, the file system can be mirrored between two nodes, so that if one node crashes, processes accessing the file fail over to the other node. Alternatively, the cluster can be designed in such a manner that every node has direct access to the file system.

Features

Single process space

OpenSSI provides a single process space – every process is visible from every node, and can be managed from any node using the normal Linux commands (ps, kill, renice and so on). The Linux /proc virtual filesystem shows all running processes on all nodes.

The implementation of the single process space is accomplished using the VPROC abstraction invented by Locus for the OSF/1 AD operating system.
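
The essence of the VPROC approach is that a process's cluster-wide identity is separated from its node-local state, so operations such as signalling can be applied locally or forwarded to the node where the process actually runs. The sketch below illustrates that split; the structure layout, field names and node numbering are invented for the example and are not taken from the OpenSSI or OSF/1 AD sources.

    /* Illustrative sketch of the VPROC idea only; not actual OpenSSI code.   */
    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    struct vproc {
        pid_t pid;          /* cluster-wide process id                        */
        int   exec_node;    /* node the process is currently running on       */
    };

    static int this_node = 1;               /* assumed local node number       */

    static int vproc_kill(struct vproc *v, int sig)
    {
        if (v->exec_node == this_node)
            return kill(v->pid, sig);       /* process is local: signal it     */
        printf("forward signal %d for pid %d to node %d\n",
               sig, (int)v->pid, v->exec_node);  /* stand-in for the RPC layer */
        return 0;
    }

    int main(void)
    {
        struct vproc p = { getpid(), 2 };   /* pretend this pid runs on node 2 */
        return vproc_kill(&p, 0);
    }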

Migration

OpenSSI allows migration of running processes between nodes. When running processes are migrated they continue to have access to any open files, IPC objects or network connections.

Processes can be manually migrated, either by the process calling the special OpenSSI migrate(2) system call, or by writing a node number to a special file in the process's /proc directory.
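
A minimal sketch of the second method is shown below. The article does not name the special file, so the name "goto" used here is only an assumption for illustration; consult the OpenSSI documentation for the real interface.

    /* Ask the cluster to move a process to another node by writing its target
     * node number into a special /proc file.  The file name "goto" is assumed. */
    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>

    static int request_migration(pid_t pid, int target_node)
    {
        char path[64];
        FILE *f;

        snprintf(path, sizeof(path), "/proc/%d/goto", (int)pid);
        f = fopen(path, "w");
        if (!f)
            return -1;
        fprintf(f, "%d\n", target_node);
        return fclose(f);
    }

    int main(void)
    {
        /* ask for the calling process itself to be moved to node 2 */
        return request_migration(getpid(), 2) == 0 ? 0 : 1;
    }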

Processes may also, if the user chooses, be migrated automatically in order to balance load across the cluster. OpenSSI uses an algorithm developed by the MOSIX project to determine the load on each node.
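
The balancing decision can be pictured as follows: migrate a process only when another node's load is sufficiently lower than the local one. This is a toy sketch; the real MOSIX-derived metric and its information-dissemination scheme are considerably more elaborate.

    /* Toy load-balancing decision, not the MOSIX algorithm itself. */
    #include <stdio.h>

    struct node_load { int node; double load; };

    /* Return the node to migrate to, or -1 to stay put. */
    static int pick_target(const struct node_load *nodes, int n,
                           int local_node, double threshold)
    {
        double local = 0.0, best = 1e9;
        int target = -1, i;

        for (i = 0; i < n; i++) {
            if (nodes[i].node == local_node)
                local = nodes[i].load;
            else if (nodes[i].load < best) {
                best = nodes[i].load;
                target = nodes[i].node;
            }
        }
        /* migrate only if the remote node is clearly less loaded */
        return (target != -1 && best + threshold < local) ? target : -1;
    }

    int main(void)
    {
        struct node_load loads[] = { {1, 2.5}, {2, 0.4}, {3, 1.1} };
        printf("migrate from node 1 to node %d\n", pick_target(loads, 3, 1, 0.5));
        return 0;
    }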

Single root

OpenSSI provides a single root for the cluster – from any node the same files and directories are available. OpenSSI uses several mechanisms to provide the single root – CFS (the OpenSSI Cluster File System), SAN cluster filesystems and parallel mounts of network file systems.

OpenSSI uses the context dependent symbolic link (CDSL) feature, inspired by HP's TruCluster system, to allow access to node-specific files in a manner transparent to non cluster-aware applications. A CDSL may point to different files on each node in the cluster.
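
Conceptually, a CDSL is a symbolic link whose target contains a placeholder that each node expands to its own identity, so the same path leads to a different per-node file on every node. The sketch below illustrates that resolution step; the "{node}" marker and the /cluster/node<N> directory layout are assumptions made for the example, not OpenSSI's actual convention.

    /* Illustration of CDSL resolution; placeholder syntax and layout assumed. */
    #include <stdio.h>
    #include <string.h>

    static void resolve_cdsl(const char *target, int node, char *out, size_t len)
    {
        const char *p = strstr(target, "{node}");
        if (!p) {                       /* ordinary symlink: nothing to expand */
            snprintf(out, len, "%s", target);
            return;
        }
        snprintf(out, len, "%.*s%d%s", (int)(p - target), target, node,
                 p + strlen("{node}"));
    }

    int main(void)
    {
        char path[256];
        /* e.g. /etc/sysconfig -> /cluster/node{node}/etc/sysconfig */
        resolve_cdsl("/cluster/node{node}/etc/sysconfig", 2, path, sizeof(path));
        printf("node 2 sees %s\n", path);   /* /cluster/node2/etc/sysconfig */
        return 0;
    }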

CFS

CFS, the OpenSSI Cluster File System, provides transparent inter-node access to an underlying real file system on one node.

CFS is stacked on top of the real file system and co-ordinates access from different nodes using a token mechanism. One node has physical access to the underlying file system and performs all read and write operations. At any one time, one node owns a token representing a part of the underlying file, which implies that that part of the file is in the cache of the owning node. If another node tries to access that part of the file, the token is stolen and the cache contents are copied to the stealing node. The OpenSSI CFS implementation is remarkably similar to that used by HP TruCluster. [5]
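
The token mechanism can be sketched roughly as follows: each token covers a range of the file, and acquiring a token held elsewhere forces the current owner to give up its cached copy of that range. The structures and function names below are invented for the illustration and are not CFS code.

    /* Rough sketch of token-based coherence as described above. */
    #include <stdio.h>

    struct range_token {
        long start, end;     /* byte range of the file covered by this token */
        int  owner_node;     /* node whose cache currently holds this range  */
    };

    /* Called on behalf of 'node' before it may read or write [start,end). */
    static void acquire_token(struct range_token *t, int node)
    {
        if (t->owner_node == node)
            return;                               /* already cached locally    */
        printf("steal token [%ld,%ld) from node %d: copy its cache to node %d\n",
               t->start, t->end, t->owner_node, node);
        t->owner_node = node;                     /* range now cached on 'node' */
    }

    int main(void)
    {
        struct range_token t = { 0, 4096, 1 };    /* node 1 holds the first page */
        acquire_token(&t, 2);                     /* node 2 accesses it: steal    */
        acquire_token(&t, 2);                     /* second access: no transfer   */
        return 0;
    }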

CFS is also used to co-ordinate access to shared memory segments.

CFS can be used in a fault-tolerant configuration by using shared disk subsystems (dual-ported SCSI or SAN), or by using DRBD. If the node that is currently accessing the file system directly crashes, the CFS mount fails over to another node that is directly connected to the disk, and the cluster then accesses the file system via that node.

SAN clustered file systems

OpenSSI can use SAN-based clustered file systems for its root, provided they present a POSIX-compatible file system interface. Lustre and the Global File System have both been tested.

With a clustered file system, each node mounts the file system in parallel and access to the files goes directly from the node to the file system.

NFS

OpenSSI mounts NFS file systems in parallel on each node. Every node accesses the NFS server directly.

Single I/O space

OpenSSI provides cluster-wide access to all I/O devices on the system, with some limitations: it is not possible for a node to mount a block device from another node.

The udev device manager is used to manage the /dev directory. Each node runs its own copy of udev to create the appropriate device nodes in a subdirectory of /dev: /dev/1 for node 1, /dev/2 for node 2, and so on.
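
Because each node's device nodes live under a per-node subdirectory, a process on any node can name a device that physically belongs to another node simply by path. A trivial sketch, where the device name ttyS0 is just an example:

    /* Open a serial port that belongs to node 2, via the per-node /dev layout. */
    #include <fcntl.h>
    #include <stdio.h>

    int main(void)
    {
        int fd = open("/dev/2/ttyS0", O_RDWR);
        if (fd < 0)
            perror("open");
        return fd < 0;
    }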

Single IPC space

OpenSSI provides internode access to all the standard Linux inter-process communication mechanisms: shared memory, semaphores, System V message queues, pipes, and Unix domain sockets.

To implement cluster-wide shared memory – distributed shared memory – OpenSSI uses the CFS token system. At any one time a memory segment may be readable by one or more nodes, or writable by exactly one node. If a node without write access to a segment tries to write to it, the segment is marked unreadable on all other nodes and writable on the writing node. If a node without read access tries to read a segment, the current contents are copied from a node where they are valid; if the segment was writable there, it is downgraded to readable.
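
In protocol terms this behaves like a simple multiple-reader/single-writer invalidation scheme. The sketch below represents the node sets as bitmasks; it is only an illustration of the state transitions described above, not OpenSSI's implementation.

    /* Illustrative coherence state for one shared-memory segment (or page). */
    #include <stdio.h>

    struct seg_state {
        unsigned readers;   /* bit i set: node i holds a valid, readable copy */
        int      writer;    /* node with write access, or -1                  */
    };

    static void on_write(struct seg_state *s, int node)
    {
        if (s->writer == node)
            return;
        s->readers = 1u << node;    /* invalidate every other node's copy */
        s->writer  = node;
    }

    static void on_read(struct seg_state *s, int node)
    {
        if (s->readers & (1u << node))
            return;
        /* copy the current contents from a node holding a valid copy and  */
        /* demote the former writer to an ordinary reader                  */
        s->readers |= 1u << node;
        s->writer   = -1;
    }

    int main(void)
    {
        struct seg_state s = { 1u << 0, -1 };   /* node 0 holds the only copy   */
        on_write(&s, 1);                        /* node 1 writes: others invalid */
        on_read(&s, 2);                         /* node 2 reads: copy, demote    */
        printf("readers=%#x writer=%d\n", s.readers, s.writer);
        return 0;
    }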

Cluster IP address

OpenSSI uses LVS to provide fault-tolerant, load-balanced IP services. Inbound network connections are received by a director node, which redirects them to the least loaded server node. (A node may be both a director and a server.) In the event of a director node failure, another director node takes over and the system continues to accept inbound connections.
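
A simplified picture of the director's dispatch decision is sketched below, choosing the live server with the fewest active connections. Real deployments rely on the LVS schedulers themselves (for example least-connection scheduling); this is only an illustration of the idea.

    /* Toy sketch of the director role; not LVS code. */
    #include <stdio.h>

    struct server { int node; int alive; int active_conns; };

    static int pick_server(const struct server *s, int n)
    {
        int best = -1;
        for (int i = 0; i < n; i++) {
            if (!s[i].alive)
                continue;                       /* skip failed server nodes */
            if (best < 0 || s[i].active_conns < s[best].active_conns)
                best = i;
        }
        return best < 0 ? -1 : s[best].node;
    }

    int main(void)
    {
        struct server pool[] = { {1, 1, 12}, {2, 1, 3}, {3, 0, 0} };
        printf("redirect connection to node %d\n", pick_server(pool, 3));
        return 0;
    }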

Distributions

The OpenSSI software is available for various Linux distributions. The OpenSSI kernel is distribution-independent, but various distribution-specific user-level components must be modified, for example the init process and the system startup scripts.

In 2010 the most recent supported Linux distributions were:

  1. Fedora Core 3
  2. Debian Sarge

Since 2008, work was in progress to port OpenSSI to the Debian Etch and Lenny distributions. [6]

History

The origins of OpenSSI date back to the early 1980s, when the LOCUS distributed operating system was developed at UCLA. The group that created LOCUS went on to form the Locus Computing Corporation in 1982 and produced versions of the technology derived from it under several names. In the mid-1990s this work culminated in the UnixWare NonStop Clusters product at Tandem Computers, which by 1995 had taken over the team of the former Locus Computing Corporation and the rights to the technology. Compaq purchased Tandem Computers in 1997. NonStop Clusters was commercialized by the Santa Cruz Operation as an add-on for UnixWare. When the SCO Group stopped selling the product, the developers (brought in by the Tandem acquisition and by then working at Compaq) ported the NonStop Clusters code to Linux and released it as open source in 2001 under the name OpenSSI. Employees continued development for some time after Compaq was acquired by Hewlett-Packard in 2002. Over the next decade OpenSSI was enhanced by independent contributors.

Related Research Articles

Tru64 UNIX – Computer operating system

Tru64 UNIX is a discontinued 64-bit UNIX operating system for the Alpha instruction set architecture (ISA), currently owned by Hewlett-Packard (HP). Previously, Tru64 UNIX was a product of Compaq, and before that, Digital Equipment Corporation (DEC), where it was known as Digital UNIX.

Beowulf cluster – Type of computing cluster

A Beowulf cluster is a computer cluster of what are normally identical, commodity-grade computers networked into a small local area network with libraries and programs installed which allow processing to be shared among them. The result is a high-performance parallel computing cluster from inexpensive personal computer hardware.

MOSIX is a proprietary distributed operating system. Although early versions were based on older UNIX systems, since 1999 it has focused on Linux clusters and grids. In a MOSIX cluster/grid there is no need to modify or to link applications with any library, to copy files or log in to remote nodes, or even to assign processes to different nodes – it is all done automatically, like in an SMP.

openMosix – Distributed operating system

openMosix was a free cluster management system that provided single-system image (SSI) capabilities, e.g. automatic work distribution among nodes. It allowed program processes to migrate to machines in the node's network that would be able to run that process faster. It was particularly useful for running parallel applications having low to moderate input/output (I/O). It was released as a Linux kernel patch, but was also available on specialized Live CDs. openMosix development has been halted by its developers, but the LinuxPMI project is continuing development of the former openMosix code.

In distributed computing, a single system image (SSI) cluster is a cluster of machines that appears to be one single system. The concept is often considered synonymous with that of a distributed operating system, but a single image may be presented for more limited purposes, just job scheduling for instance, which may be achieved by means of an additional layer of software over conventional operating system images running on each node. The interest in SSI clusters is based on the perception that they may be simpler to use and administer than more specialized clusters.

Kerrighed is an open source single-system image (SSI) cluster software project. The project started in October 1998 in the Paris research group of the French National Institute for Research in Computer Science and Control (INRIA). From 2006 to 2011, the project was mainly developed by Kerlabs. In January 2012 the Linux clustering mission of Kerlabs was adopted by a new company, We Cluster, Inc., headquartered in Pacific Grove, California. On January 18, 2012, Kerrighed 3.0 was ported to Ubuntu 12.04 with Linux kernel v3.2.

udev is a device manager for the Linux kernel. As the successor of devfsd and hotplug, udev primarily manages device nodes in the /dev directory. At the same time, udev also handles all user space events raised when hardware devices are added into the system or removed from it, including firmware loading as required by certain devices.

Lustre is a type of parallel distributed file system, generally used for large-scale cluster computing. The name Lustre is a portmanteau word derived from Linux and cluster. Lustre file system software is available under the GNU General Public License and provides high performance file systems for computer clusters ranging in size from small workgroup clusters to large-scale, multi-site systems. Since June 2005, Lustre has consistently been used by at least half of the top ten, and more than 60 of the top 100 fastest supercomputers in the world, including the world's No. 1 ranked TOP500 supercomputer in November 2022, Frontier, as well as previous top supercomputers such as Fugaku, Titan and Sequoia.

In database computing, Oracle Real Application Clusters (RAC) — an option for the Oracle Database software produced by Oracle Corporation and introduced in 2001 with Oracle9i — provides software for clustering and high availability in Oracle database environments. Oracle Corporation includes RAC with the Enterprise Edition, provided the nodes are clustered using Oracle Clusterware.

HAL is a software subsystem for UNIX-like operating systems providing hardware abstraction.

A diskless shared-root cluster is a way to manage several machines at the same time. Instead of each having its own operating system (OS) on its local disk, there is only one image of the OS available on a server, and all the nodes use the same image.

A clustered file system is a file system which is shared by being simultaneously mounted on multiple servers. There are several approaches to clustering, most of which do not employ a clustered file system. Clustered file systems can provide features like location-independent addressing and redundancy which improve reliability or reduce the complexity of the other parts of the cluster. Parallel file systems are a type of clustered file system that spread data across multiple storage nodes, usually for redundancy or performance.

In Unix-like operating systems, a device file or special file is an interface to a device driver that appears in a file system as if it were an ordinary file. There are also special files in DOS, OS/2, and Windows. These special files allow an application program to interact with a device by using its device driver via standard input/output system calls. Using standard system calls simplifies many programming tasks, and leads to consistent user-space I/O mechanisms regardless of device features and functions.

Computer cluster – Set of computers configured in a distributed computing system

A computer cluster is a set of computers that work together so that they can be viewed as a single system. Unlike grid computers, computer clusters have each node set to perform the same task, controlled and scheduled by software.

LOCUS is a discontinued distributed operating system developed at UCLA during the 1980s. It was notable for providing an early implementation of the single-system image idea, where a cluster of machines appeared to be one larger machine.

Locus Computing Corporation was formed in 1982 by Gerald J. Popek, Charles S. Kline and Gregory I. Thiel to commercialize the technologies developed for the LOCUS distributed operating system at UCLA. Locus was notable for commercializing single-system image software and producing the Merge package which allowed the use of DOS and Windows 3.1 software on Unix systems.

NonStop Clusters (NSC) was an add-on package for SCO UnixWare that allowed creation of fault-tolerant single-system image clusters of machines running UnixWare. NSC was one of the first commercially available highly available clustering solutions for commodity hardware.

systemd – Suite of system components for Linux

Systemd is a software suite that provides an array of system components for Linux operating systems. The main aim is to unify service configuration and behavior across Linux distributions. Its primary component is a "system and service manager" – an init system used to bootstrap user space and manage user processes. It also provides replacements for various daemons and utilities, including device management, login management, network connection management, and event logging. The name systemd adheres to the Unix convention of naming daemons by appending the letter d. It also plays on the term "System D", which refers to a person's ability to adapt quickly and improvise to solve problems.

Proxmox Virtual Environment – Linux distribution for server virtualization

Proxmox Virtual Environment is a hyper-converged infrastructure open-source software. It is a hosted hypervisor that can run operating systems including Linux and Windows on x64 hardware. It is a Debian-based Linux distribution with a modified Ubuntu LTS kernel and allows deployment and management of virtual machines and containers. Proxmox VE includes a web console and command-line tools, and provides a REST API for third-party tools. Two types of virtualization are supported: container-based with LXC, and full virtualization with KVM. It includes a web-based management interface.

References

  1. OpenSSI (Single System Image) Clusters for Linux, archived from the original on 2013-12-26, retrieved 2023-07-01.
  2. Ferri, Richard; Watson, Brian J. (2003-08-01), "Introducing the OpenSSI Project", Sys Admin Magazine (samag.com), archived from the original on 2004-10-27, retrieved 2008-07-06.
  3. Pfister, Gregory F. (1998), In Search of Clusters, Upper Saddle River, NJ: Prentice Hall PTR, ISBN 978-0-13-899709-0, OCLC 38300954.
  4. Orlowski, Andrew (2001-11-14), "Compaq cavalry rescues Linux clusters", The Register, retrieved 2008-10-06.
  5. Fafrak, Scott; Lola, Jim A.; Nichols, Brad; Yates, Gregory (2003), TruCluster Server Handbook, Digital Press, pp. 342–345, ISBN 1-55558-259-1.
  6. OpenSSI on Debian Etch, prerelease, archived from the original on 2013-04-26, retrieved 2023-07-01.