VMScluster

A VMScluster, originally known as a VAXcluster, is a computer cluster involving a group of computers running the OpenVMS operating system. Whereas tightly coupled multiprocessor systems run a single copy of the operating system, a VMScluster is loosely coupled: each machine runs its own copy of OpenVMS, but the disk storage, lock manager, and security domain are all cluster-wide, providing a single system image abstraction. Machines can join or leave a VMScluster without affecting the rest of the cluster. For enhanced availability, VMSclusters support the use of dual-ported disks connected to two machines or storage controllers simultaneously. With OpenVMS now ported to Alpha and IA-64 machines, the facility originally named VAXclustering was renamed to VMSclustering.

Initial release

Digital Equipment Corporation first announced VAXclusters in May 1983. At this stage, clustering required specialised communications hardware, as well as some major changes to low-level subsystems in VMS. The software and hardware were designed jointly. VAXcluster support was first added in VAX/VMS V4.0, which was released in 1984. This version only supported clustering over CI. Later releases of version 4 supported clustering over a LAN. A LAN-based cluster is often called a LAVc, for Local Area VAXcluster. Among other things, it allows a satellite node, possibly diskless, to bootstrap over the network using the system disk of a boot node.

At the center of each cluster was a star coupler, to which every node (computer) and data storage device in the cluster was connected by one or two pairs of CI (Computer Interconnect) cables. Each pair of cables had a transmission rate of 70 megabits per second, a high speed for that era. Using two pairs gave an aggregate transmission rate of 140 megabits per second, with redundancy in case one cable failed; the star couplers also had redundant wiring for better availability.

Each CI cable connected to its computer via a CI Port, which could send and receive packets without any CPU involvement. To send a packet, a CPU had only to create a small data structure in memory and append it to a "send" queue; similarly, the CI Port would append each incoming message to a "receive" queue. Tests showed that a VAX-11/780 could send and receive 3000 messages per second, even though it was nominally a 1-MIPS machine. The closely related Mass Storage Control Protocol (MSCP) allowed similarly high performance from the mass storage subsystem. In addition, MSCP packets were easily transported over the CI, which allowed remote access to storage devices.
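The queue-based hand-off described above can be illustrated with a minimal Python sketch. It is a toy model only: the names Packet, Port, cpu_send and port_service are hypothetical, and the descriptor layout is an assumption rather than the actual CI Port data structure.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Packet:
    """Small descriptor the CPU builds in memory (hypothetical layout)."""
    dest: str
    payload: bytes


@dataclass
class Port:
    """Toy stand-in for a CI Port with its send and receive queues."""
    name: str
    send_queue: deque = field(default_factory=deque)
    receive_queue: deque = field(default_factory=deque)


def cpu_send(port, dest, payload):
    # The CPU's whole job: build a descriptor and append it to the send queue.
    port.send_queue.append(Packet(dest, payload))


def port_service(ports):
    """Simulate the port hardware draining send queues without CPU help."""
    moved = 0
    for port in ports.values():
        while port.send_queue:
            pkt = port.send_queue.popleft()
            # Delivery appends the packet to the destination's receive queue.
            ports[pkt.dest].receive_queue.append(pkt)
            moved += 1
    return moved


ports = {"NODE_A": Port("NODE_A"), "NODE_B": Port("NODE_B")}
cpu_send(ports["NODE_A"], "NODE_B", b"hello")
cpu_send(ports["NODE_A"], "NODE_B", b"world")
delivered = port_service(ports)
received = [p.payload for p in ports["NODE_B"].receive_queue]
print(delivered, received)
```

The point of the design is that the sending CPU does a constant amount of work per message, which is consistent with a 1-MIPS machine sustaining thousands of messages per second.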

VAXclustering was the first clustering system to achieve commercial success, and was a major selling point for VAX systems.

Later developments

In 1986, DEC added VAXclustering support to their MicroVAX minicomputers, running over Ethernet instead of special-purpose hardware. While not giving the high-availability advantages of the CI hardware, these Local Area VAXclusters provided an attractive expansion path for buyers of low-end minicomputers.

OpenVMS V5.0 and later supported "mixed interconnect" VAXclusters (using both CI and Ethernet), and VAXclustering over DSSI (Digital Systems and Storage Interconnect), SCSI and FDDI, among other transports. Eventually, as high-bandwidth wide area networking became available, clustering was extended to allow satellite data links and long-distance terrestrial links. This allowed the creation of disaster-tolerant clusters: by spreading a single VAXcluster across several geographically separate sites, the cluster could survive infrastructure failures and natural disasters.

VAXclustering was greatly aided by the introduction of terminal servers using the LAT protocol. Because ordinary serial terminals could reach the host nodes via Ethernet, any terminal could connect quickly and easily to any host node. This made it much simpler to fail over user terminals from one node of the cluster to another.

Clustering over TCP/IP is supported in OpenVMS version 8.4, which was released in 2010. With Gigabit Ethernet now common and 10 Gigabit Ethernet being introduced, standard networking cables and cards are quite sufficient to support VMSclustering.

Features

OpenVMS supports up to 96 nodes in a single cluster and allows mixed-architecture clusters, in which VAX and Alpha systems, or Alpha and Itanium systems, can coexist. Various organizations have demonstrated triple-architecture clusters and cluster configurations with up to 150 nodes, but these configurations are not officially supported.

Unlike many other clustering solutions, VMScluster offers transparent and fully distributed read-write with record-level locking, which means that the same disk and even the same file can be accessed by several cluster nodes at once; the locking occurs only at the level of a single record of a file, which would usually be one line of text or a single record in a database. This allows the construction of high-availability multiply redundant database servers.
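As a rough illustration of record-level locking, the Python sketch below models a cluster-wide lock table keyed by file and record number. RecordLockTable, the node names and the file name are hypothetical; this is a simplified model under those assumptions, not the OpenVMS implementation.

```python
class RecordLockTable:
    """Toy cluster-wide lock table keyed by (file, record number)."""

    def __init__(self):
        self._owners = {}  # (file, record) -> node currently holding the lock

    def lock(self, node, file, record):
        key = (file, record)
        owner = self._owners.get(key)
        if owner is not None and owner != node:
            return False  # another node holds this one record
        self._owners[key] = node
        return True

    def unlock(self, node, file, record):
        key = (file, record)
        if self._owners.get(key) == node:
            del self._owners[key]


table = RecordLockTable()
# Two nodes work in the same file at once, on different records.
a_rec1 = table.lock("NODE_A", "ORDERS.DAT", 1)
b_rec2 = table.lock("NODE_B", "ORDERS.DAT", 2)
# Only a request for the very same record is refused.
b_rec1 = table.lock("NODE_B", "ORDERS.DAT", 1)
table.unlock("NODE_A", "ORDERS.DAT", 1)
b_rec1_retry = table.lock("NODE_B", "ORDERS.DAT", 1)
print(a_rec1, b_rec2, b_rec1, b_rec1_retry)
```

Contention is limited to the individual record, so unrelated updates to the same file never block each other.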

Cluster connections can span upwards of 500 miles (800 km), allowing member nodes to be located in different buildings on an office campus, or in different cities.

Host-based volume shadowing allows volumes (of the same or of different sizes) to be shadowed (mirrored) across multiple controllers and multiple hosts, allowing the construction of disaster-tolerant environments.

Application programmers have full access to the distributed lock manager (DLM), which lets applications coordinate arbitrary resources and activities across all cluster nodes. File-level coordination is one use, but the resources and operations coordinated through the DLM can be anything the application defines.
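The sketch below shows, in Python, how an application might coordinate an arbitrary named resource through a lock manager. The six mode names (NL, CR, CW, PR, PW, EX) and their compatibility follow the commonly documented OpenVMS lock modes, which this article does not itself describe. ToyLockManager and its request and release methods are hypothetical and are not the OpenVMS system service interface; queuing of blocked requests and lock conversion are left out.

```python
# Which requested modes can be granted alongside an already-held mode.
COMPATIBLE = {
    "NL": {"NL", "CR", "CW", "PR", "PW", "EX"},  # null
    "CR": {"NL", "CR", "CW", "PR", "PW"},        # concurrent read
    "CW": {"NL", "CR", "CW"},                    # concurrent write
    "PR": {"NL", "CR", "PR"},                    # protected read
    "PW": {"NL", "CR"},                          # protected write
    "EX": {"NL"},                                # exclusive
}


class ToyLockManager:
    """Grants a lock only if it is compatible with every lock already held."""

    def __init__(self):
        self._granted = {}  # resource name -> list of (node, mode)

    def request(self, node, resource, mode):
        holders = self._granted.setdefault(resource, [])
        if all(mode in COMPATIBLE[held] for _, held in holders):
            holders.append((node, mode))
            return True
        return False  # a real lock manager would queue the request instead

    def release(self, node, resource):
        holders = self._granted.get(resource, [])
        self._granted[resource] = [h for h in holders if h[0] != node]


dlm = ToyLockManager()
# The resource name is application-defined; it need not be a file.
r1 = dlm.request("NODE_A", "nightly_batch", "PR")
r2 = dlm.request("NODE_B", "nightly_batch", "PR")
r3 = dlm.request("NODE_C", "nightly_batch", "EX")
dlm.release("NODE_A", "nightly_batch")
dlm.release("NODE_B", "nightly_batch")
r4 = dlm.request("NODE_C", "nightly_batch", "EX")
print(r1, r2, r3, r4)
```

Two readers share the resource, the exclusive request is refused while they hold it, and it succeeds once both have released.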

Because rolling upgrades and multiple system disks are supported, cluster configurations can be maintained online and upgraded incrementally. The cluster continues to provide application and data access while a subset of the member nodes is upgraded to newer software versions. [1] [2] Cluster uptimes are frequently measured in years, with the current longest uptime being at least sixteen years. [3]
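The idea of a rolling upgrade can be sketched in a few lines of Python. The function rolling_upgrade, the node names and the version labels "old" and "new" are placeholders chosen for illustration; the sketch assumes one node is taken out of service at a time and ignores quorum and other membership rules.

```python
def rolling_upgrade(versions, target):
    """Upgrade nodes one at a time and record how many stayed in service."""
    versions = dict(versions)  # work on a copy
    serving_during_steps = []
    for node in sorted(versions):
        if versions[node] == target:
            continue  # already at the target version
        # Every other member keeps serving while this node is upgraded.
        still_serving = [n for n in versions if n != node]
        serving_during_steps.append(len(still_serving))
        versions[node] = target
    return versions, serving_during_steps


cluster = {"NODE_A": "old", "NODE_B": "old", "NODE_C": "new"}
final, serving = rolling_upgrade(cluster, "new")
print(final, serving)
```

At every step at least two of the three nodes remain available, so the cluster as a whole never goes offline.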

References

  1. "VSI OpenVMS Cluster Systems" (PDF). August 2019.
  2. "VSI Products - Clusters".
  3. "Uptimes Project breakdown for VMSclusters".
