Sierra (supercomputer)

Sierra
Active: Since 2018 [1]
Operators: National Nuclear Security Administration
Location: Lawrence Livermore National Laboratory
Architecture: IBM POWER9 CPUs, Nvidia Tesla V100 GPUs, Mellanox EDR InfiniBand [2]
Power: 11 MW
Memory: 2–2.4 PiB [1]
Speed: 125 petaflops (peak) [2]
Ranking: TOP500: 6
Purpose: Nuclear weapon simulations [3]
Website: hpc.llnl.gov/hardware/compute-platforms/sierra

Sierra or ATS-2 is a supercomputer built for the Lawrence Livermore National Laboratory for use by the National Nuclear Security Administration as the second Advanced Technology System. It is primarily used for predictive applications in nuclear weapon stockpile stewardship, helping to assure the safety, reliability, and effectiveness of the United States' nuclear weapons.


Sierra is very similar in architecture to the Summit supercomputer built for the Oak Ridge National Laboratory. The nodes in Sierra are Witherspoon IBM S922LC OpenPOWER servers with two GPUs per CPU and four GPUs per node. These nodes are connected with EDR InfiniBand. In 2019 Sierra was upgraded with IBM Power System AC922 nodes. [4] [5]

Sierra is composed of 4,474 nodes, 4,284 of which are compute nodes. Each node has 256 GB of RAM, 44 IBM POWER9 cores spread across two physical sockets, and four Nvidia Tesla V100 GPUs, each providing 16 GB of VRAM. This gives the complete system 8,948 CPUs, 17,896 GPUs, 1.14 PB of RAM, and 286 TB of VRAM. [6]
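The system-wide totals follow directly from the per-node figures. The following is a minimal sketch of that arithmetic, assuming decimal units (1 TB = 1,000 GB, 1 PB = 1,000,000 GB) and that all 4,474 nodes share the same configuration; the variable names are illustrative only.

```python
# Per-node figures as quoted in the paragraph above.
NODES = 4474
CPUS_PER_NODE = 2        # two physical sockets per node
GPUS_PER_NODE = 4
RAM_GB_PER_NODE = 256
VRAM_GB_PER_GPU = 16

total_cpus = NODES * CPUS_PER_NODE
total_gpus = NODES * GPUS_PER_NODE

# Assumption: decimal units (1 PB = 1,000,000 GB; 1 TB = 1,000 GB).
total_ram_pb = NODES * RAM_GB_PER_NODE / 1_000_000
total_vram_tb = total_gpus * VRAM_GB_PER_GPU / 1_000

print(total_cpus)                  # 8948
print(total_gpus)                  # 17896
print(f"{total_ram_pb:.3f} PB")    # 1.145 PB
print(f"{total_vram_tb:.1f} TB")   # 286.3 TB
```

Under these assumptions the results agree with the quoted totals, with the RAM figure of 1.14 PB corresponding to 1.145 PB truncated to two decimal places.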

Sierra has consistently appeared on the Top500 list, peaking at #2 in November 2018. It ranked #6 on the June 2023 list and #10 on the November 2023 list. Only 4.6 petaflops of its performance come from its CPUs, with the large majority (120.9 petaflops) coming from the Tesla GPUs. [6]
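The split between CPU and GPU performance can be expressed as a share of the total. This short sketch uses only the two figures quoted above and assumes they simply add up to the system's peak; the variable names are illustrative.

```python
# Performance figures as quoted in the paragraph above, in petaflops.
cpu_pflops = 4.6
gpu_pflops = 120.9

# Assumption: CPU and GPU contributions sum to the system peak.
total_pflops = cpu_pflops + gpu_pflops
gpu_share = gpu_pflops / total_pflops

print(f"{total_pflops:.1f} petaflops total")   # 125.5 petaflops total
print(f"{gpu_share:.1%} from GPUs")            # 96.3% from GPUs
```

The sum of 125.5 petaflops is consistent with the roughly 125 petaflops peak quoted for the system, and it shows that the GPUs account for about 96% of that figure.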


Related Research Articles

Supercomputer: Type of extremely powerful computer

A supercomputer is a type of computer with a high level of performance as compared to a general-purpose computer. The performance of a supercomputer is commonly measured in floating-point operations per second (FLOPS) instead of million instructions per second (MIPS). Since 2017, supercomputers have existed that can perform over 10^17 FLOPS (a hundred quadrillion FLOPS, 100 petaFLOPS or 100 PFLOPS). For comparison, a desktop computer has performance in the range of hundreds of gigaFLOPS (10^11) to tens of teraFLOPS (10^13). Since November 2017, all of the world's fastest 500 supercomputers run on Linux-based operating systems. Additional research is being conducted in the United States, the European Union, Taiwan, Japan, and China to build faster, more powerful and technologically superior exascale supercomputers.

In computing, floating point operations per second is a measure of computer performance, useful in fields of scientific computations that require floating-point calculations. For such cases, it is a more accurate measure than measuring instructions per second.

IBM Blue Gene: Series of supercomputers by IBM

Blue Gene was an IBM project aimed at designing supercomputers that can reach operating speeds in the petaFLOPS (PFLOPS) range, with low power consumption.

MareNostrum: Supercomputer in the Barcelona Supercomputing Center

MareNostrum is the main supercomputer in the Barcelona Supercomputing Center. It is the most powerful supercomputer in Spain, one of thirteen supercomputers in the Spanish Supercomputing Network and one of the seven supercomputers of the European infrastructure PRACE.

Roadrunner (supercomputer): Former supercomputer built by IBM

Roadrunner was a supercomputer built by IBM for the Los Alamos National Laboratory in New Mexico, USA. The US$100-million Roadrunner was designed for a peak performance of 1.7 petaflops. It achieved 1.026 petaflops on May 25, 2008, to become the world's first TOP500 LINPACK sustained 1.0 petaflops system.

TOP500: Database project devoted to the ranking of computers

The TOP500 project ranks and details the 500 most powerful non-distributed computer systems in the world. The project was started in 1993 and publishes an updated list of the supercomputers twice a year. The first of these updates always coincides with the International Supercomputing Conference in June, and the second is presented at the ACM/IEEE Supercomputing Conference in November. The project aims to provide a reliable basis for tracking and detecting trends in high-performance computing and bases rankings on HPL benchmarks, a portable implementation of the high-performance LINPACK benchmark written in Fortran for distributed-memory computers.

The Green500 is a biannual ranking of supercomputers, from the TOP500 list of supercomputers, in terms of energy efficiency. The list measures performance per watt using the TOP500 measure of high performance LINPACK benchmarks at double-precision floating-point format.

Sequoia (supercomputer): IBM supercomputer at Lawrence Livermore National Laboratory

IBM Sequoia was a petascale Blue Gene/Q supercomputer constructed by IBM for the National Nuclear Security Administration as part of the Advanced Simulation and Computing Program (ASC). It was delivered to the Lawrence Livermore National Laboratory (LLNL) in 2011 and was fully deployed in June 2012. Sequoia was dismantled in 2020; its last position on the top500.org list was #22 in the November 2019 list.

The National Center for Computational Sciences (NCCS) is a United States Department of Energy (DOE) Leadership Computing Facility that houses the Oak Ridge Leadership Computing Facility (OLCF), a DOE Office of Science User Facility charged with helping researchers solve challenging scientific problems of global interest with a combination of leading high-performance computing (HPC) resources and international expertise in scientific computing.

Tianhe-1: Supercomputer

Tianhe-I, Tianhe-1, or TH-1 is a supercomputer capable of an Rmax of 2.5 petaFLOPS. Located at the National Supercomputing Center of Tianjin, China, it was the fastest computer in the world from October 2010 to June 2011 and was one of the few petascale supercomputers in the world.

Supercomputing in Europe: Overview of supercomputing in Europe

Several centers for supercomputing exist across Europe, and distributed access to them is coordinated by European initiatives to facilitate high-performance computing. One such initiative, the HPC Europa project, fits within the Distributed European Infrastructure for Supercomputing Applications (DEISA), which was formed in 2002 as a consortium of eleven supercomputing centers from seven European countries. Operating within the CORDIS framework, HPC Europa aims to provide access to supercomputers across Europe.

Titan (supercomputer): American supercomputer

Titan or OLCF-3 was a supercomputer built by Cray at Oak Ridge National Laboratory for use in a variety of science projects. Titan was an upgrade of Jaguar, a previous supercomputer at Oak Ridge, that used graphics processing units (GPUs) in addition to conventional central processing units (CPUs). Titan was the first such hybrid to perform over 10 petaFLOPS. The upgrade began in October 2011, stability testing commenced in October 2012, and the system became available to researchers in early 2013. The initial cost of the upgrade was US$60 million, funded primarily by the United States Department of Energy.

POWER9: 2017 family of multi-core microprocessors by IBM

POWER9 is a family of superscalar, multithreading, multi-core microprocessors produced by IBM, based on the Power ISA. It was announced in August 2016. The POWER9-based processors are being manufactured using a 14 nm FinFET process, in 12- and 24-core versions, for scale out and scale up applications, and possibly other variations, since the POWER9 architecture is open for licensing and modification by the OpenPOWER Foundation members.

NVLink: High speed chip interconnect

NVLink is a wire-based serial multi-lane near-range communications link developed by Nvidia. Unlike PCI Express, a device can consist of multiple NVLinks, and devices use mesh networking to communicate instead of a central hub. The protocol was first announced in March 2014 and uses a proprietary high-speed signaling interconnect (NVHS).

Volta is the codename, but not the trademark, for a GPU microarchitecture developed by Nvidia, succeeding Pascal. It was first announced on a roadmap in March 2013, although the first product was not announced until May 2017. The architecture is named after 18th–19th century Italian chemist and physicist Alessandro Volta. It was Nvidia's first chip to feature Tensor Cores, specially designed cores that have superior deep learning performance over regular CUDA cores. The architecture is produced with TSMC's 12 nm FinFET process. The Ampere microarchitecture is the successor to Volta.

Summit (supercomputer): Supercomputer developed by IBM

Summit or OLCF-4 is a supercomputer developed by IBM for use at Oak Ridge Leadership Computing Facility (OLCF), a facility at the Oak Ridge National Laboratory. It is capable of 200 petaFLOPS, making it the fifth-fastest supercomputer in the world after Frontier (OLCF-5), Fugaku, LUMI, and Leonardo, with Frontier being the fastest. It held the number 1 position from November 2018 to June 2020. Its current LINPACK benchmark is clocked at 148.6 petaFLOPS.

Nvidia DGX: Line of Nvidia produced servers and workstations

Nvidia DGX is a line of Nvidia-produced servers and workstations which specialize in using GPGPU to accelerate deep learning applications. The typical design of a DGX system is based upon a rackmount chassis with motherboard that carries high performance x86 server CPUs. The main component of a DGX system is a set of 4 to 16 Nvidia Tesla GPU modules on an independent system board. DGX systems have large heatsinks and powerful fans to adequately cool thousands of watts of thermal output. The GPU modules are typically integrated into the system using a version of the SXM socket or by a PCIe x16 slot.

Leonardo (supercomputer): Supercomputer in Italy

Leonardo is a petascale supercomputer located at the CINECA datacenter in Bologna, Italy. The system consists of an Atos BullSequana XH2000 computer, with close to 14,000 Nvidia Ampere GPUs and 200 Gbit/s Nvidia Mellanox HDR InfiniBand connectivity. Inaugurated in November 2022, Leonardo is capable of 250 petaflops, making it one of the top five fastest supercomputers in the world. It debuted on the TOP500 in November 2022, ranking fourth in the world and second in Europe.

Hewlett Packard Enterprise El Capitan is an upcoming exascale supercomputer hosted at the Lawrence Livermore National Laboratory in Livermore, United States, and projected to become operational in 2024. It is based on the Cray EX Shasta architecture. When deployed, El Capitan is projected to displace Frontier as the world's fastest supercomputer.

Michael Gschwind: American computer scientist

Michael Karl Gschwind is an American computer scientist who currently is a director and principal engineer at Meta Platforms in Menlo Park, California. He is recognized for his seminal contributions to the design and exploitation of general-purpose programmable accelerators, as an early advocate of sustainability in computer design and as a prolific inventor.

References

  1. Morgan, Timothy Prickett (2017-10-05). "The Clever Machinations Of Livermore's Sierra Supercomputer". The Next Platform.
  2. "Installation of Sierra Supercomputer Steams Along at LLNL". HPCwire. 2017-11-20.
  3. Smith, Ryan (2014-11-17). "NVIDIA Volta, IBM POWER9 Land Contracts For New US Government Supercomputers". AnandTech.
  4. "Sierra - IBM Power System AC922, IBM POWER9 22C 3.1GHz, NVIDIA Volta GV100, Dual-rail Mellanox EDR Infiniband | TOP500". www.top500.org. Retrieved 2020-12-15.
  5. "IBM Power System AC922 - Details". www.ibm.com. Retrieved 2020-12-15.
  6. "Sierra | HPC @ LLNL". hpc.llnl.gov. Retrieved 2023-07-26.