Developer | National Partnership for Advanced Computational Infrastructure, SDSC, UCSD |
---|---|
OS family | Linux (Unix-like) |
Working state | Active |
Source model | Open source |
Latest release | 7.0 [1] (Manzanita) / 1 December 2017 |
Available in | English |
Kernel type | Monolithic (Linux) |
License | Various |
Official website | www |
Rocks Cluster Distribution (originally NPACI Rocks) is a Linux distribution intended for high-performance computing (HPC) clusters. It was started by the National Partnership for Advanced Computational Infrastructure (NPACI) and the San Diego Supercomputer Center (SDSC) in 2000. [2] It was initially funded in part by an NSF grant (2000–07) [3] and subsequently by a follow-up NSF grant through 2011. [4]
Rocks was initially based on the Red Hat Linux (RHL) distribution; modern versions are based on CentOS, with a modified Anaconda installer that simplifies mass installation onto many computers. Rocks includes many tools (such as the Message Passing Interface (MPI)) which are not part of CentOS but are integral components that turn a group of computers into a cluster.
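The role MPI plays is illustrative: the nodes of a cluster run cooperating copies of the same program that communicate over the network. Below is a minimal sketch of such a program in C, assuming an MPI implementation (for example Open MPI) and its compiler wrapper are available on the cluster; the file and program names are placeholders, not part of Rocks itself.

```c
/* hello_mpi.c - minimal MPI program of the kind an HPC cluster is built to run.
 * Assumes an MPI implementation (e.g. Open MPI) is installed on the nodes. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size, name_len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);                  /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* this process's id within the job */
    MPI_Comm_size(MPI_COMM_WORLD, &size);    /* total number of processes */
    MPI_Get_processor_name(name, &name_len); /* hostname of the node we run on */

    printf("Process %d of %d running on %s\n", rank, size, name);

    MPI_Finalize();                          /* shut down the MPI runtime */
    return 0;
}
```

Compiled with the MPI wrapper compiler (for example `mpicc hello_mpi.c -o hello_mpi`) and launched across nodes with `mpirun -np 4 ./hello_mpi`, each process reports its rank and the node it landed on, which is the basic pattern parallel applications on such clusters follow.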
Installations can be customized with additional software packages at install time by using special user-supplied CDs (called "Roll CDs"). The Rolls extend the system by integrating seamlessly and automatically into the management and packaging mechanisms used by the base software, greatly simplifying the installation and configuration of large numbers of computers. [5] Over a dozen Rolls have been created, including the Sun Grid Engine (SGE) roll, the Condor roll, the Lustre roll, the Java roll, and the Ganglia roll.
By October 2010, Rocks was used by academic, government, and commercial organizations and was employed in 1,376 registered clusters on every continent except Antarctica. [6] The largest registered academic cluster, with 8,632 CPUs, is GridKa, [7] operated by the Karlsruhe Institute of Technology in Karlsruhe, Germany. There are also a number of registered clusters with fewer than 10 CPUs, representing either the early stages in the construction of larger systems or clusters used for courses in cluster design. This easy scalability was a major goal in the development of Rocks, both for the researchers involved [2] and for the NSF:
Broader impact mirrors intellectual merit, and specifically lies in Rocks' new capabilities enabling management of very large clusters such as those emerging from the NSF Track 2 program, the ease of configuration of clusters supporting virtualization capabilities and generally the continuing effect of Rocks on installation and use of Linux clusters across NSF communities.
— SDCI: NMI: Improvement: The Rocks Cluster Toolkit and Extensions to Build User-Defined Cyberenvironments [4]
Release date | Rocks version | CentOS version |
---|---|---|
Dec 2017 | Rocks 7.0 | CentOS 7.4 |
May 2015 | Rocks 6.2 | CentOS 6.6 |
Apr 2014 | Rocks 6.1.1 | CentOS 6.5 |
Nov 2012 | Rocks 6.1 | CentOS 6.3 |
A Linux distribution is an operating system made from a software collection that includes the Linux kernel, and often a package management system. Linux users usually obtain their operating system by downloading one of the Linux distributions, which are available for a wide variety of systems ranging from embedded devices and personal computers to powerful supercomputers.
Grid computing is the use of widely distributed computer resources to reach a common goal. A computing grid can be thought of as a distributed system with non-interactive workloads that involve many files. Grid computing is distinguished from conventional high-performance computing systems such as cluster computing in that grid computers have each node set to perform a different task/application. Grid computers also tend to be more heterogeneous and geographically dispersed than cluster computers. Although a single grid can be dedicated to a particular application, commonly a grid is used for a variety of purposes. Grids are often constructed with general-purpose grid middleware software libraries. Grid sizes can be quite large.
A Beowulf cluster is a computer cluster of what are normally identical, commodity-grade computers networked into a small local area network with libraries and programs installed which allow processing to be shared among them. The result is a high-performance parallel computing cluster from inexpensive personal computer hardware.
A web hosting service is a type of Internet hosting service that hosts websites for clients, i.e. it offers the facilities required for them to create and maintain a site and makes it accessible on the World Wide Web. Companies providing web hosting services are sometimes called web hosts.
Yellow Dog Linux (YDL) is a discontinued free and open-source operating system for high-performance computing on multi-core processor computer architectures, focusing on GPU systems and computers using the POWER7 processor. The original developer was Terra Soft Solutions, which was acquired by Fixstars in October 2008. Yellow Dog Linux was first released in the spring of 1999 for Apple Macintosh PowerPC-based computers. The most recent version, Yellow Dog Linux 7, was released on August 6, 2012. Yellow Dog Linux lent its name to the popular YUM Linux software updater, derived from YDL's YUP and thus called Yellowdog Updater, Modified.
A live CD is a complete bootable computer installation including operating system which runs directly from a CD-ROM or similar storage device into a computer's memory, rather than loading from a hard disk drive. A live CD allows users to run an operating system for any purpose without installing it or making any changes to the computer's configuration. Live CDs can run on a computer without secondary storage, such as a hard disk drive, or with a corrupted hard disk drive or file system, allowing data recovery.
The Berkeley Open Infrastructure for Network Computing (BOINC) is an open-source middleware system for volunteer computing. Developed originally to support SETI@home, it became the platform for many other applications in areas as diverse as medicine, molecular biology, mathematics, linguistics, climatology, environmental science, and astrophysics, among others. The purpose of BOINC is to enable researchers to utilize the processing resources of personal computers and other devices around the world.
Oracle Grid Engine, previously known as Sun Grid Engine (SGE), CODINE or GRD, was a grid computing computer cluster software system, acquired as part of a purchase of Gridware, then improved and supported by Sun Microsystems and later Oracle. There have been open source versions and multiple commercial versions of this technology, initially from Sun, later from Oracle and then from Univa Corporation.
Storage Resource Broker (SRB) is data grid management computer software used in computational science research projects. SRB is a logical distributed file system based on a client-server architecture which presents users with a single global logical namespace or file hierarchy. Essentially, the software enables a user to use a single mechanism to work with multiple data sources.
United Devices, Inc. was a privately held, commercial volunteer computing company that focused on the use of grid computing to manage high-performance computing systems and enterprise cluster management. Its products and services allowed users to "allocate workloads to computers and devices throughout enterprises, aggregating computing power that would normally go unused." It operated under the name Univa UD for a time, after merging with Univa on September 17, 2007.
TeraGrid was an e-Science grid computing infrastructure combining resources at eleven partner sites. The project started in 2001 and operated from 2004 through 2011.
The Texas Advanced Computing Center (TACC) at the University of Texas at Austin, United States, is an advanced computing research center that provides comprehensive advanced computing resources and support services to researchers in Texas and across the U.S. The mission of TACC is to enable discoveries that advance science and society through the application of advanced computing technologies. Specializing in high-performance computing, scientific visualization, data analysis and storage systems, software, research and development, and portal interfaces, TACC deploys and operates advanced computational infrastructure to enable the research activities of faculty, staff, and students of UT Austin. TACC also provides consulting, technical documentation, and training to support researchers who use these resources. TACC staff members conduct research and development in applications and algorithms, computing systems design/architecture, and programming tools and environments.
Software remastering is software development that recreates system software and applications while incorporating customizations, with the intent that it is copied and run elsewhere for "off-label" usage. The term comes from remastering in media production, where it is similarly distinguished from mere copying.
Univa was a software company that developed workload management and cloud management products for compute-intensive applications in the data center and across public, private, and hybrid clouds, before being acquired by Altair Engineering in September 2020.
A computer cluster is a set of computers that work together so that they can be viewed as a single system. Unlike grid computers, computer clusters have each node set to perform the same task, controlled and scheduled by software.
nanoHUB.org is a science and engineering gateway comprising community-contributed resources and geared toward education, professional networking, and interactive simulation tools for nanotechnology. Funded by the United States National Science Foundation (NSF), it is a product of the Network for Computational Nanotechnology (NCN). NCN supports research efforts in nanoelectronics; nanomaterials; nanoelectromechanical systems (NEMS); nanofluidics; nanomedicine; nanobiology; and nanophotonics.
The Slurm Workload Manager, formerly known as Simple Linux Utility for Resource Management (SLURM), or simply Slurm, is a free and open-source job scheduler for Linux and Unix-like kernels, used by many of the world's supercomputers and computer clusters.
In computing, a system virtual machine is a virtual machine (VM) that provides a complete system platform and supports the execution of a complete operating system (OS). These usually emulate an existing architecture and are built either to provide a platform for running programs where the real hardware is not available for use, or to run multiple instances of virtual machines for more efficient use of computing resources in terms of energy consumption and cost effectiveness, or both. A VM was originally defined by Popek and Goldberg as "an efficient, isolated duplicate of a real machine".