Shared-nothing architecture

Last updated

A shared-nothing architecture (SN) is a distributed computing architecture in which each update request is satisfied by a single node (processor/memory/storage unit) in a computer cluster. The intent is to eliminate contention among nodes. Nodes do not share (independently access) the same memory or storage.

Contents

One alternative architecture is shared everything, in which requests are satisfied by arbitrary combinations of nodes. This may introduce contention, as multiple nodes may seek to update the same data at the same time. It also contrasts with shared-disk and shared-memory architectures.

SN eliminates single points of failure, allowing the overall system to continue operating despite failures in individual nodes and allowing individual nodes to upgrade hardware or software without a system-wide shutdown. [1]

A SN system can scale simply by adding nodes, since no central resource bottlenecks the system. [2] In databases, a term for the part of a database on a single node is a shard . A SN system typically partitions its data among many nodes. A refinement is to replicate commonly used but infrequently modified data across many nodes, allowing more requests to be resolved on a single node.

History

Michael Stonebraker at the University of California, Berkeley used the term in a 1986 database paper. [3] Teradata delivered the first SN database system in 1983. [4] Tandem Computers NonStop systems, a shared-nothing implementation of hardware and software was released to market in 1976. [5] [6] Tandem Computers later released NonStop SQL, a shared-nothing relational database, in 1984. [7]

Applications

Shared-nothing is popular for web development.

Shared-nothing architectures are prevalent for data warehousing applications, although requests that require data from multiple nodes can dramatically reduce throughput. [8]

See also

Related Research Articles

A distributed database is a database in which data is stored across different physical locations. It may be stored in multiple computers located in the same physical location ; or maybe dispersed over a network of interconnected computers. Unlike parallel systems, in which the processors are tightly coupled and constitute a single database system, a distributed database system consists of loosely coupled sites that share no physical components.

<span class="mw-page-title-main">Ingres (database)</span> Database software

Ingres Database is a proprietary SQL relational database management system intended to support large commercial and government applications.

Scalability is the property of a system to handle a growing amount of work. One definition for software systems specifies that this may be done by adding resources to the system.

Tandem Computers, Inc. was the dominant manufacturer of fault-tolerant computer systems for ATM networks, banks, stock exchanges, telephone switching centers, 911 systems, and other similar commercial transaction processing applications requiring maximum uptime and zero data loss. The company was founded by Jimmy Treybig in 1974 in Cupertino, California. It remained independent until 1997, when it became a server division within Compaq. It is now a server division within Hewlett Packard Enterprise, following Hewlett-Packard's acquisition of Compaq and the split of Hewlett-Packard into HP Inc. and Hewlett Packard Enterprise.

NonStop is a series of server computers introduced to market in 1976 by Tandem Computers Inc., beginning with the NonStop product line. It was followed by the Tandem Integrity NonStop line of lock-step fault tolerant computers, now defunct. The original NonStop product line is currently offered by Hewlett Packard Enterprise since Hewlett-Packard Company's split in 2015. Because NonStop systems are based on an integrated hardware/software stack, Tandem and later HPE also developed the NonStop OS operating system for them.

MapReduce is a programming model and an associated implementation for processing and generating big data sets with a parallel, distributed algorithm on a cluster.

In database computing, Oracle Real Application Clusters (RAC) — an option for the Oracle Database software produced by Oracle Corporation and introduced in 2001 with Oracle9i — provides software for clustering and high availability in Oracle database environments. Oracle Corporation includes RAC with the Enterprise Edition, provided the nodes are clustered using Oracle Clusterware.

<span class="mw-page-title-main">Database machine</span>

A database machines or back end processor is a computer or special hardware that stores and retrieves data from a database. It is specially designed for database access and is tightly coupled to the main (front-end) computer(s) by a high-speed channel, whereas a database server is a general-purpose computer that holds a database and it's loosely coupled via a local area network to its clients.

In computing, the term data warehouse appliance (DWA) was coined by Foster Hinshaw for a computer architecture for data warehouses (DW) specifically marketed for big data analysis and discovery that is simple to use and has a high performance for the workload. A DWA includes an integrated set of servers, storage, operating systems, and databases.

<span class="mw-page-title-main">Britton Lee, Inc.</span> American relational database company

Britton Lee Inc. was a pioneering relational database company. Renamed ShareBase, it was acquired by Teradata in June, 1990.

<span class="mw-page-title-main">Computer cluster</span> Set of computers configured in a distributed computing system

A computer cluster is a set of computers that work together so that they can be viewed as a single system. Unlike grid computers, computer clusters have each node set to perform the same task, controlled and scheduled by software. The newest manifestation of cluster computing is cloud computing.

A database shard, or simply a shard, is a horizontal partition of data in a database or search engine. Each shard is held on a separate database server instance, to spread load.

<span class="mw-page-title-main">Greenplum</span>

Greenplum is a big data technology based on MPP architecture and the Postgres open source database technology. The technology was created by a company of the same name headquartered in San Mateo, California around 2005. Greenplum was acquired by EMC Corporation in July 2010.

<span class="mw-page-title-main">Michael Stonebraker</span> American computer scientist (born 1943)

Michael Ralph Stonebraker is a computer scientist specializing in database systems. Through a series of academic prototypes and commercial startups, Stonebraker's research and products are central to many relational databases. He is also the founder of many database companies, including Ingres Corporation, Illustra, Paradigm4, StreamBase Systems, Tamr, Vertica and VoltDB, and served as chief technical officer of Informix. For his contributions to database research, Stonebraker received the 2014 Turing Award, often described as "the Nobel Prize for computing."

Volt Active Data is an in-memory database designed by Michael Stonebraker, Sam Madden, and Daniel Abadi.

Data-intensive computing is a class of parallel computing applications which use a data parallel approach to process large volumes of data typically terabytes or petabytes in size and typically referred to as big data. Computing applications that devote most of their execution time to computational requirements are deemed compute-intensive, whereas applications are deemed data-intensive require large volumes of data and devote most of their processing time to I/O and manipulation of data.

H-Store is an experimental database management system (DBMS). It was designed for online transaction processing applications. H-Store was developed by a team at Brown University, Carnegie Mellon University, the Massachusetts Institute of Technology, and Yale University in 2007 by researchers Michael Stonebraker, Sam Madden, Andy Pavlo and Daniel Abadi.

<span class="mw-page-title-main">Oracle NoSQL Database</span> Distributed database

Oracle NoSQL Database is a NoSQL-type distributed key-value database from Oracle Corporation. It provides transactional semantics for data manipulation, horizontal scalability, and simple administration and monitoring.

NewSQL is a class of relational database management systems that seek to provide the scalability of NoSQL systems for online transaction processing (OLTP) workloads while maintaining the ACID guarantees of a traditional database system.

Database scalability is the ability of a database to handle changing demands by adding/removing resources. Databases use a host of techniques to cope.

References

  1. Wright, Dave (2014-09-17). "The Advantages of a Shared Nothing Architecture for Truly Non-Disruptive Upgrades". netapp.com. Retrieved 2019-10-31.
  2. Blankenhorn, Dana (February 27, 2006). "Shared nothing coming to open source". ZDNet. Retrieved June 21, 2012.
  3. Michael Stonebraker (1986). "The Case for Shared Nothing Architecture" (PDF). Database Engineering. 9 (1).
  4. "Teradata History". Teradata.com. Retrieved 2013-06-16.
  5. ""Tandem History: An Introduction"". Center Magazine: A Newsletter for Tandem Employees. 6 (1). Winter 1986.
  6. "History of TANDEM COMPUTERS, INC. – FundingUniverse". www.fundinguniverse.com. Retrieved 2023-03-01.
  7. "NonStop SQL, A Distributed, High-Performance, High-Availability Implementation of SQL, Tandem Technical Report TR-87.4" (PDF). Archived from the original (PDF) on 2012-03-16. Retrieved 2012-10-11.
  8. "Article on Shared Nothing from the point of view of a Shared Nothing Vendor" (PDF).