Catamount (operating system)

Last updated
The Cray XT3 supercomputer which uses Catamount on its compute nodes Shaguar.jpg
The Cray XT3 supercomputer which uses Catamount on its compute nodes

Catamount is an operating system for supercomputers. [1]

Catamount is a lightweight kernel that provides basic functionality and aims for efficiency. The roots of Catamount go back to 1991 when SUNMOS was developed by Sandia National Laboratories and the University of New Mexico as a lightweight operating system. The Cray XT3 uses Catamount on compute nodes and Linux on server nodes. [1]

A case study by the IEEE assessed the performance of a particle transport code on AWE's Cray XT3 8,000-core supercomputer while running images of the Catamount and the Cray Linux Environment (CLE) operating systems. This work demonstrated that by running a number of small benchmarks on a test machine it is possible to speculate as to the performance impact of upgrading from one operating system to another on the system as a whole. The study's findings enabled researchers to minimise system downtime while exploring software upgrades. The results show that benchmark tests run on less than 256 cores would suggest that the impact of upgrading the operating system to CLE was less than 10%. [2]

See also

Related Research Articles

<span class="mw-page-title-main">Supercomputer</span> Type of extremely powerful computer

A supercomputer is a computer with a high level of performance as compared to a general-purpose computer. The performance of a supercomputer is commonly measured in floating-point operations per second (FLOPS) instead of million instructions per second (MIPS). Since 2017, supercomputers have existed which can perform over 1017 FLOPS (a hundred quadrillion FLOPS, 100 petaFLOPS or 100 PFLOPS). For comparison, a desktop computer has performance in the range of hundreds of gigaFLOPS (1011) to tens of teraFLOPS (1013). Since November 2017, all of the world's fastest 500 supercomputers run on Linux-based operating systems. Additional research is being conducted in the United States, the European Union, Taiwan, Japan, and China to build faster, more powerful and technologically superior exascale supercomputers.

UNICOS is a range of Unix and after it Linux operating system (OS) variants developed by Cray for its supercomputers. UNICOS is the successor of the Cray Operating System (COS). It provides network clustering and source code compatibility layers for some other Unixes. UNICOS was originally introduced in 1985 with the Cray-2 system and later ported to other Cray models. The original UNICOS was based on UNIX System V Release 2, and had many Berkeley Software Distribution (BSD) features added to it.

In computing, floating point operations per second is a measure of computer performance, useful in fields of scientific computations that require floating-point calculations. For such cases, it is a more accurate measure than measuring instructions per second.

Cray Inc., a subsidiary of Hewlett Packard Enterprise, is an American supercomputer manufacturer headquartered in Seattle, Washington. It also manufactures systems for data storage and analytics. Several Cray supercomputer systems are listed in the TOP500, which ranks the most powerful supercomputers in the world.

<span class="mw-page-title-main">ASCI Red</span> Supercomputer

ASCI Red was the first computer built under the Accelerated Strategic Computing Initiative (ASCI), the supercomputing initiative of the United States government created to help the maintenance of the United States nuclear arsenal after the 1992 moratorium on nuclear testing.

SUNMOS is an operating system jointly developed by Sandia National Laboratories and the Computer Science Department at the University of New Mexico. The goal of the project, started in 1991, is to develop a highly portable, yet efficient, operating system for massively parallel-distributed memory systems.

<span class="mw-page-title-main">Edinburgh Parallel Computing Centre</span> Supercomputing centre at the University of Edinburgh

EPCC, formerly the Edinburgh Parallel Computing Centre, is a supercomputing centre based at the University of Edinburgh. Since its foundation in 1990, its stated mission has been to accelerate the effective exploitation of novel computing throughout industry, academia and commerce.

Red Storm is a supercomputer architecture designed for the US Department of Energy’s National Nuclear Security Administration Advanced Simulation and Computing Program. Cray, Inc developed it based on the contracted architectural specifications provided by Sandia National Laboratories. The architecture was later commercially produced as the Cray XT3.

<span class="mw-page-title-main">Cray XT3</span> Distributed memory massively parallel MIMD supercomputer

The Cray XT3 is a distributed memory massively parallel MIMD supercomputer designed by Cray Inc. with Sandia National Laboratories under the codename Red Storm. Cray turned the design into a commercial product in 2004. The XT3 derives much of its architecture from the previous Cray T3E system, and also from the Intel ASCI Red supercomputer.

<span class="mw-page-title-main">Cray XMT</span>

Cray XMT is a scalable multithreaded shared memory supercomputer architecture by Cray, based on the third generation of the Tera MTA architecture, targeted at large graph problems. Presented in 2005, it supersedes the earlier unsuccessful Cray MTA-2. It uses the Threadstorm3 CPUs inside Cray XT3 blades. Designed to make use of commodity parts and existing subsystems for other commercial systems, it alleviated the shortcomings of Cray MTA-2's high cost of fully custom manufacture and support. It brought various substantial improvements over Cray MTA-2, most notably nearly tripling the peak performance, and vastly increased maximum CPU count to 8,192 and maximum memory to 128 TB, with a data TLB of maximal 512 TB.

HECToR was a British academic national supercomputer service funded by EPSRC, Natural Environment Research Council (NERC) and BBSRC for the UK academic community. The HECToR service was run by partners including EPCC, Science and Technology Facilities Council (STFC) and Numerical Algorithms Group (NAG).

<span class="mw-page-title-main">TOP500</span> Database project devoted to the ranking of computers

The TOP500 project ranks and details the 500 most powerful non-distributed computer systems in the world. The project was started in 1993 and publishes an updated list of the supercomputers twice a year. The first of these updates always coincides with the International Supercomputing Conference in June, and the second is presented at the ACM/IEEE Supercomputing Conference in November. The project aims to provide a reliable basis for tracking and detecting trends in high-performance computing and bases rankings on HPL benchmarks, a portable implementation of the high-performance LINPACK benchmark written in Fortran for distributed-memory computers.

<span class="mw-page-title-main">Cray XT5</span> Family of supercomputers

The Cray XT5 is an updated version of the Cray XT4 supercomputer, launched on November 6, 2007. It includes a faster version of the XT4's SeaStar2 interconnect router called SeaStar2+, and can be configured either with XT4 compute blades, which have four dual-core AMD Opteron processor sockets, or XT5 blades, with eight sockets supporting dual or quad-core Opterons. The XT5 uses a 3-dimensional torus network topology.

Compute Node Linux (CNL) is a runtime environment based on the Linux kernel for the Cray XT3, Cray XT4, Cray XT5, Cray XT6, Cray XE6 and Cray XK6 supercomputer systems based on SUSE Linux Enterprise Server. CNL forms part of the Cray Linux Environment. As of November 2011 systems running CNL were ranked 3rd, 6th and 8th among the fastest supercomputers in the world.

The National Center for Computational Sciences (NCCS) is a United States Department of Energy (DOE) Leadership Computing Facility that houses the Oak Ridge Leadership Computing Facility (OLCF), a DOE Office of Science User Facility charged with helping researchers solve challenging scientific problems of global interest with a combination of leading high-performance computing (HPC) resources and international expertise in scientific computing.

<span class="mw-page-title-main">Jaguar (supercomputer)</span> Japans next fastest Intel x86 based supercomputer

Jaguar or OLCF-2 was a petascale supercomputer built by Cray at Oak Ridge National Laboratory (ORNL) in Oak Ridge, Tennessee. The massively parallel Jaguar had a peak performance of just over 1,750 teraFLOPS. It had 224,256 x86-based AMD Opteron processor cores, and operated with a version of Linux called the Cray Linux Environment. Jaguar was a Cray XT5 system, a development from the Cray XT4 supercomputer.

A lightweight kernel (LWK) operating system is one used in a large computer with many processor cores, termed a parallel computer.

<span class="mw-page-title-main">Titan (supercomputer)</span> American supercomputer

Titan or OLCF-3 was a supercomputer built by Cray at Oak Ridge National Laboratory for use in a variety of science projects. Titan was an upgrade of Jaguar, a previous supercomputer at Oak Ridge, that uses graphics processing units (GPUs) in addition to conventional central processing units (CPUs). Titan was the first such hybrid to perform over 10 petaFLOPS. The upgrade began in October 2011, commenced stability testing in October 2012 and it became available to researchers in early 2013. The initial cost of the upgrade was US$60 million, funded primarily by the United States Department of Energy.

<span class="mw-page-title-main">Supercomputer operating system</span> Use of Operative System by type of extremely powerful computer

A supercomputer operating system is an operating system intended for supercomputers. Since the end of the 20th century, supercomputer operating systems have undergone major transformations, as fundamental changes have occurred in supercomputer architecture. While early operating systems were custom tailored to each supercomputer to gain speed, the trend has been moving away from in-house operating systems and toward some form of Linux, with it running all the supercomputers on the TOP500 list in November 2017. In 2021, top 10 computers run for instance Red Hat Enterprise Linux (RHEL), or some variant of it or other Linux distribution e.g. Ubuntu.

XK7 is a supercomputing platform, produced by Cray, launched on October 29, 2012. XK7 is the second platform from Cray to use a combination of central processing units ("CPUs") and graphical processing units ("GPUs") for computing; the hybrid architecture requires a different approach to programming to that of CPU-only supercomputers. Laboratories that host XK7 machines host workshops to train researchers in the new programming languages needed for XK7 machines. The platform is used in Titan, the world's second fastest supercomputer in the November 2013 list as ranked by the TOP500 organization. Other customers include the Swiss National Supercomputing Centre which has a 272 node machine and Blue Waters has a machine that has Cray XE6 and XK7 nodes that performs at approximately 1 petaFLOPS (1015 floating-point operations per second).

References

  1. 1 2 An Evaluation of the Oak Ridge National Laboratory Cray XT3 by Sadaf R. Alam etal International Journal of High Performance Computing Applications February 2008 vol. 22 no. 1 52-80
  2. Hammond, S.D.; Mudalige, G.R.; Smith, J.A.; Davis, J.A.; Jarvis, S.A.; Holt, J.; Miller, I.; Herdman, J.A.; Vadgama, A. (2010). "To upgrade or not to upgrade? Catamount vs. Cray Linux Environment". 2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and PHD Forum (IPDPSW). pp. 1–8. doi:10.1109/IPDPSW.2010.5470885. ISBN   978-1-4244-6533-0. S2CID   15273643.