IBM Roadrunner


[Image: Roadrunner components]
Active: Operational in 2008; final completion in 2009
Sponsors: IBM
Operators: National Nuclear Security Administration
Location: Los Alamos National Laboratory
Architecture: 12,960 IBM PowerXCell 8i CPUs, 6,480 AMD Opteron dual-core processors, InfiniBand
Power: 2.35 MW
Operating system: Red Hat Enterprise Linux
Space: 296 racks, 560 m² (6,000 sq ft)
Memory: 103.6 TiB
Storage: 1,000,000 TiB
Speed: 1.042 petaFLOPS
Cost: US$100 million [1]
Ranking: TOP500: 10, June 2011
Purpose: Modeling the decay of the U.S. nuclear arsenal
Legacy: First TOP500 Linpack sustained 1.0 petaflops, May 25, 2008
Web site: www.lanl.gov/roadrunner/

Roadrunner was a supercomputer built by IBM for the Los Alamos National Laboratory in New Mexico, USA. The US$100-million Roadrunner was designed for a peak performance of 1.7 petaflops. It achieved 1.026 petaflops on May 25, 2008, to become the world's first TOP500 LINPACK sustained 1.0 petaflops system. [2] [3]

In November 2008, it reached a top performance of 1.456 petaFLOPS, retaining its top spot in the TOP500 list. [4] It was also the fourth-most energy-efficient supercomputer in the world on the Supermicro Green500 list, with an operational rate of 444.94 megaflops per watt of power used. The hybrid Roadrunner design was then reused for several other energy-efficient supercomputers. [5] Roadrunner was decommissioned by Los Alamos on March 31, 2013. [6] In its place, Los Alamos commissioned a supercomputer called Cielo, which had been installed in 2010. Cielo was smaller and more energy-efficient than Roadrunner, and cost $54 million. [6]

Overview

IBM built the computer for the U.S. Department of Energy's (DOE) National Nuclear Security Administration. [7] [8] It was a hybrid design with 12,960 IBM PowerXCell 8i [9] and 6,480 AMD Opteron dual-core processors [10] in specially designed blade servers connected by InfiniBand. The Roadrunner used Red Hat Enterprise Linux along with Fedora [11] as its operating systems and was managed with xCAT distributed computing software. It also used the Open MPI Message Passing Interface implementation. [12]

Roadrunner occupied approximately 296 server racks [13] covering 560 square metres (6,000 sq ft) [14] and became operational in 2008. It was decommissioned on March 31, 2013. [13] The DOE used the computer to simulate how nuclear materials age, in order to predict whether the USA's aging arsenal of nuclear weapons is both safe and reliable. Other uses for Roadrunner included applications in the science, financial, automotive and aerospace industries.

Hybrid design

Roadrunner differed from other contemporary supercomputers in that it continued the hybrid approach [13] to supercomputer design introduced by Seymour Cray in 1964 with the Control Data Corporation CDC 6600 and carried forward in 1969 with the CDC 7600, which was an order of magnitude faster. In that earlier architecture, however, the peripheral processors were used only for operating system functions, and all applications ran on the single central processor. Most previous supercomputers had used only one processor architecture, since this was thought to be easier to design and program for. To realize the full potential of Roadrunner, all software had to be written specially for its hybrid architecture.

The hybrid design consisted of dual-core Opteron server processors manufactured by AMD using the standard AMD64 architecture. Attached to each Opteron core was a PowerXCell 8i processor manufactured by IBM using Power Architecture and Cell technology. As a supercomputer, Roadrunner was considered an Opteron cluster with Cell accelerators, as each node consisted of a Cell attached to an Opteron core, with the Opterons connected to each other. [15]

Development

Roadrunner was in development from 2002 and went online in 2006. Because of its novel design and complexity, it was constructed in three phases and became fully operational in 2008. Its predecessor was a machine named Dark Horse, also developed at Los Alamos. [16] Dark Horse was one of the earliest hybrid-architecture systems; it was originally based on ARM and later moved to the Cell processor. It was entirely a 3D design, integrating 3D memory, networking, processors and a number of other technologies.

Phase 1

The first phase of Roadrunner consisted of building a standard Opteron-based cluster while evaluating the feasibility of constructing and programming the future hybrid version. This Phase 1 Roadrunner reached 71 teraflops and was in full operation at Los Alamos National Laboratory in 2006.

Phase 2

Phase 2, known as AAIS (Advanced Architecture Initial System), consisted of building a small hybrid version of the finished system using an older version of the Cell processor. This phase was used to build prototype applications for the hybrid architecture. It went online in January 2007.

Phase 3

The goal of Phase 3 was to reach sustained performance in excess of 1 petaflops. Additional Opteron nodes and new PowerXCell processors were added to the design. These PowerXCell processors were five times as powerful as the Cell processors used in Phase 2. The system was built to full scale at IBM’s Poughkeepsie, New York facility, [1] where it broke the 1 petaflops barrier on its fourth attempt on May 25, 2008. The complete system was moved to its permanent location in New Mexico in the summer of 2008. [1]

Technical specifications

Processors

Roadrunner used two different models of processors. The first was the AMD Opteron 2210, running at 1.8 GHz. Opterons were used both in the computational nodes, feeding the Cells with useful data, and in the system operations and communication nodes, passing data between computing nodes and helping the operators run the system. Roadrunner had a total of 6,912 Opteron processors, with 6,480 used for computation and 432 for operation. The Opterons were connected together by HyperTransport links. Each Opteron had two cores, for a total of 13,824 cores.

The second processor was the IBM PowerXCell 8i, running at 3.2 GHz. These processors have one general-purpose core (PPE) and eight special performance cores (SPE) for floating-point operations. Roadrunner had a total of 12,960 PowerXCell processors, with 12,960 PPE cores and 103,680 SPE cores, for a total of 116,640 cores.
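
The core counts quoted above follow directly from the processor counts. As a rough cross-check, the arithmetic can be sketched in Python. The constant and function names below are hypothetical and chosen only for illustration; every number comes from the two paragraphs above.

```python
# Processor counts as stated in the text (not independently measured).
OPTERONS_COMPUTE = 6_480      # Opterons used for computation
OPTERONS_OPERATIONS = 432     # Opterons used for system operation
CORES_PER_OPTERON = 2         # dual-core Opteron 2210

POWERXCELLS = 12_960          # PowerXCell 8i processors
PPE_PER_CELL = 1              # one general-purpose core per Cell
SPE_PER_CELL = 8              # eight special performance cores per Cell


def opteron_totals():
    """Return (Opteron processors, Opteron cores)."""
    chips = OPTERONS_COMPUTE + OPTERONS_OPERATIONS
    return chips, chips * CORES_PER_OPTERON


def cell_totals():
    """Return (PPE cores, SPE cores, all Cell cores)."""
    ppe = POWERXCELLS * PPE_PER_CELL
    spe = POWERXCELLS * SPE_PER_CELL
    return ppe, spe, ppe + spe


if __name__ == "__main__":
    chips, cores = opteron_totals()
    ppe, spe, cell_cores = cell_totals()
    print(f"Opterons: {chips:,} processors, {cores:,} cores")
    print(f"PowerXCell: {ppe:,} PPE + {spe:,} SPE = {cell_cores:,} cores")
```

Running the script reproduces the totals given in the text: 6,912 Opterons with 13,824 cores, and 116,640 Cell cores.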

TriBlade

[Figure: A schematic description of the TriBlade module.]

Logically, a TriBlade consists of two dual-core Opterons with 16 GB RAM and four PowerXCell 8i CPUs with 16 GB Cell RAM. [10]

Physically, a TriBlade consists of one LS21 Opteron blade, an expansion blade, and two QS22 Cell blades. The LS21 has two 1.8 GHz dual-core Opterons with 16 GB of memory for the whole blade, providing 8 GB for each CPU. Each QS22 has two PowerXCell 8i CPUs running at 3.2 GHz and 8 GB of memory, which gives 4 GB for each CPU. The expansion blade connects the two QS22 blades to the LS21 via four PCIe x8 links, two links for each QS22. It also provides outside connectivity via an InfiniBand 4x DDR adapter. The expansion blade is connected to the Opteron blade via HyperTransport. A single TriBlade is therefore four slots wide, and three TriBlades fit into one BladeCenter H chassis.

Connected Unit (CU)

A Connected Unit consists of 60 BladeCenter H chassis filled with TriBlades, for a total of 180 TriBlades. All TriBlades are connected to a 288-port Voltaire ISR2012 InfiniBand switch. Each CU also has access to the Panasas file system through twelve System x3755 servers. [10]

CU system information: [10]

Roadrunner cluster

[Figure: A schematic overview of the tiered composition of the Roadrunner supercomputer cluster.]

The final cluster is made up of 18 connected units, which are connected via eight additional (second-stage) InfiniBand ISR2012 switches. Each CU is connected through twelve uplinks to each second-stage switch, which makes a total of 96 uplink connections per CU. [10]
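
The tiered composition described above (TriBlade, BladeCenter H chassis, Connected Unit, full cluster) can be cross-checked with a short calculation. The Python sketch below is illustrative only: the class, constant and function names are hypothetical, and all counts are taken from the TriBlade, Connected Unit and cluster descriptions in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriBlade:
    """Logical contents of one TriBlade, per the description above."""
    opterons: int = 2          # dual-core Opterons on the LS21 blade
    cells: int = 4             # PowerXCell 8i CPUs on the two QS22 blades
    opteron_ram_gb: int = 16   # memory on the LS21 blade
    cell_ram_gb: int = 16      # memory across the two QS22 blades


# Counts stated in the text.
TRIBLADES_PER_CHASSIS = 3
CHASSIS_PER_CU = 60
CONNECTED_UNITS = 18
SECOND_STAGE_SWITCHES = 8
UPLINKS_PER_CU_PER_SWITCH = 12


def cluster_totals(tb: TriBlade = TriBlade()) -> dict:
    """Derive per-CU and cluster-wide totals from the stated counts."""
    triblades_per_cu = TRIBLADES_PER_CHASSIS * CHASSIS_PER_CU
    triblades = triblades_per_cu * CONNECTED_UNITS
    return {
        "triblades_per_cu": triblades_per_cu,
        "opterons_per_cu": triblades_per_cu * tb.opterons,
        "cells_per_cu": triblades_per_cu * tb.cells,
        "uplinks_per_cu": SECOND_STAGE_SWITCHES * UPLINKS_PER_CU_PER_SWITCH,
        "triblades": triblades,
        "opterons": triblades * tb.opterons,
        "cells": triblades * tb.cells,
        "ram_gb": triblades * (tb.opteron_ram_gb + tb.cell_ram_gb),
    }


if __name__ == "__main__":
    for name, value in cluster_totals().items():
        print(f"{name:>18}: {value:,}")
```

Under these counts the cluster comes to 3,240 TriBlades holding 6,480 compute Opterons and 12,960 PowerXCell 8i processors, which agrees with the processor totals quoted elsewhere in the article, and each CU has 96 uplinks to the second stage.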

Overall system information: [10]

Shutdown

IBM Roadrunner was shut down on March 31, 2013. [13] While the supercomputer was one of the fastest in the world, its energy efficiency was relatively low: Roadrunner delivered 444 megaflops per watt, versus the 886 megaflops per watt of a comparable supercomputer. [17] Before the supercomputer was dismantled, researchers planned to spend one month performing memory and data routing experiments to aid in designing future supercomputers. [13]
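
The efficiency figure can be roughly reproduced from the speed and power values listed at the top of this article (1.042 petaFLOPS and 2.35 MW). The Python sketch below is only a plausibility check. Pairing those two values is an assumption, since the article does not say they were measured together, and the function names are hypothetical.

```python
# Figures quoted in this article; pairing speed with power is an assumption.
LINPACK_FLOPS = 1.042e15        # speed: 1.042 petaFLOPS
POWER_WATTS = 2.35e6            # power: 2.35 MW
ROADRUNNER_MFLOPS_PER_W = 444   # efficiency quoted for Roadrunner
COMPARABLE_MFLOPS_PER_W = 886   # efficiency quoted for a comparable system


def mflops_per_watt(flops: float, watts: float) -> float:
    """Energy efficiency in megaflops per watt."""
    return flops / watts / 1e6


def efficiency_ratio(other: float, own: float) -> float:
    """How many times more efficient the other system is."""
    return other / own


if __name__ == "__main__":
    derived = mflops_per_watt(LINPACK_FLOPS, POWER_WATTS)
    ratio = efficiency_ratio(COMPARABLE_MFLOPS_PER_W, ROADRUNNER_MFLOPS_PER_W)
    print(f"Derived efficiency: {derived:.1f} MFLOPS/W")
    print(f"Comparable system is {ratio:.2f}x as efficient")
```

The derived value lands at about 443 megaflops per watt, close to the quoted 444, and the comparable system works out to be roughly twice as efficient.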

After IBM Roadrunner was dismantled, its electronics were to be shredded. [18] Los Alamos was to carry out the majority of the supercomputer's destruction, citing the classified nature of its calculations. Some of its parts were to be retained for historical purposes. [18]

Related Research Articles

In computing, floating point operations per second is a measure of computer performance, useful in fields of scientific computations that require floating-point calculations. For such cases it is a more accurate measure than measuring instructions per second.

IBM Blue Gene series of supercomputers by IBM

Blue Gene is an IBM project aimed at designing supercomputers that can reach operating speeds in the PFLOPS (petaFLOPS) range, with low power consumption.

Cell is a multi-core microprocessor microarchitecture that combines a general-purpose PowerPC core of modest performance with streamlined coprocessing elements which greatly accelerate multimedia and vector processing applications, as well as many other forms of dedicated computation.

MareNostrum supercomputer

MareNostrum is the name of the main supercomputer in the Barcelona Supercomputing Center (BSC). It is the most powerful supercomputer in Spain, one of thirteen supercomputers in the Spanish Supercomputing Network and one of the seven supercomputers of the European infrastructure PRACE.

IBM BladeCenter

The IBM BladeCenter was IBM's blade server architecture, until it was replaced by Flex System. The x86 division was later sold to Lenovo in 2014.

Torrenza was an initiative announced by Advanced Micro Devices (AMD) in 2006 to improve support for the integration of specialized coprocessors in systems based on AMD Opteron microprocessors. Torrenza does not refer to a specific product or specific technology, though the primary focus is on the integration of coprocessor devices directly connected to the Opteron processors' HyperTransport links, and other co-processors connected via PCI Express. The initiative's stated goals include improving technical and technology support for third-party developers of coprocessing devices, reducing the cost of implementing HyperTransport interfaces on these devices, and improving the performance of the integrated system. It can be argued, that the original idea behind Torrenza was successfully implemented in form of Heterogeneous System Architecture by AMD and the other members of the HSA Foundation.

The first commercial Cell microprocessor, the Cell BE, was designed for the Sony PlayStation 3. IBM designed the PowerXCell 8i for use in the Roadrunner supercomputer.

Magerit

Magerit is the name of the one of the most powerful supercomputers in Spain. It also reached the second best Spanish position in the TOP500 list of supercomputers. This computer is installed in CeSViMa, a research center of the Technical University of Madrid.

In computing, performance per watt is a measure of the energy efficiency of a particular computer architecture or computer hardware. Literally, it measures the rate of computation that can be delivered by a computer for every watt of power consumed. This rate is typically measured by performance on the LINPACK benchmark when trying to compare between computing systems.

Cray XT5

The Cray XT5 is an updated version of the Cray XT4 supercomputer, launched on November 6, 2007. It includes a faster version of the XT4's SeaStar2 interconnect router called SeaStar2+, and can be configured either with XT4 compute blades, which have four dual-core AMD Opteron processor sockets, or XT5 blades, with eight sockets supporting dual or quad-core Opterons. The XT5 uses a 3-dimensional torus network topology.

Sequoia (supercomputer)

IBM Sequoia is a petascale Blue Gene/Q supercomputer constructed by IBM for the National Nuclear Security Administration as part of the Advanced Simulation and Computing Program (ASC). It was delivered to the Lawrence Livermore National Laboratory (LLNL) in 2011 and was fully deployed in June 2012.

QPACE is a massively parallel and scalable supercomputer designed for applications in lattice quantum chromodynamics.

The National Center for Computational Sciences (NCCS) is a United States Department of Energy Leadership Computing Facility. The NCCS provides resources for calculation and simulation in fields including astrophysics, materials, and climate research. This research is intended to enhance American competitiveness in industry. The NCCS, founded in 1992 and located at Oak Ridge National Laboratory (ORNL), currently manages a 2.33-petaflop Cray XT5 supercomputer named Jaguar for use in open research by academic and corporate researchers. Jaguar was named the world's fastest computer at SC09, a position it held until October 2010. Founded in 1992, the NCCS is a managed activity of the Advanced Scientific Computing Research program of the Department of Energy Office of Science (DOE-SC).

Jaguar (supercomputer) supercomputer that used to be at Oak Ridge National Laboratory

Jaguar or OLCF-2 was a petascale supercomputer built by Cray at Oak Ridge National Laboratory (ORNL) in Oak Ridge, Tennessee. The massively parallel Jaguar had a peak performance of just over 1,750 teraFLOPS. It had 224,256 x86-based AMD Opteron processor cores, and operated with a version of Linux called the Cray Linux Environment. Jaguar was a Cray XT5 system, a development from the Cray XT4 supercomputer.

K computer supercomputer in Kobe

The K computer – named for the Japanese word "kei" (京), meaning 10 quadrillion (10^16) – is a supercomputer manufactured by Fujitsu, currently installed at the Riken Advanced Institute for Computational Science campus in Kobe, Hyōgo Prefecture, Japan. The K computer is based on a distributed memory architecture with over 80,000 compute nodes. It is used for a variety of applications, including climate research, disaster prevention and medical research. The K computer's operating system is based on the Linux kernel, with additional drivers designed to make use of the computer's hardware.

National Computer Center for Higher Education (France)

The National Computer Center for Higher Education (CINES), based in Montpellier, is a public institution of French administrative character placed under the supervision of the Ministry of Research and Innovation (MESRI), and created by decree in 1999.

Supercomputing in Europe

Several centers for supercomputing exist across Europe, and distributed access to them is coordinated by European initiatives to facilitate high-performance computing. One such initiative, the HPC Europa project, fits within the Distributed European Infrastructure for Supercomputing Applications (DEISA), which was formed in 2002 as a consortium of eleven supercomputing centers from seven European countries. Operating within the CORDIS framework, HPC Europa aims to provide access to supercomputers across Europe.

The Cray XK6 is an enhanced version of the Cray XE6 supercomputer, announced in May 2011. The XK6 uses the same "blade" architecture of the XE6, with each XK6 blade comprising four compute "nodes". Each node consists of a 16-core AMD Opteron 6200 processor with 16 or 32 GB of DDR3 RAM and an Nvidia Tesla X2090 GPGPU with 6 GB of GDDR5 RAM, the two connected via PCI Express 2.0. Two Gemini router ASICs are shared between the nodes on a blade, providing a 3-dimensional torus network topology between nodes. This means that it has 576 GB of Graphics memory and over 1500 CPU cores, several orders of magnitude more powerful than the best publicly available computer on the market.

Supercomputer architecture

Approaches to supercomputer architecture have taken dramatic turns since the earliest systems were introduced in the 1960s. Early supercomputer architectures pioneered by Seymour Cray relied on compact innovative designs and local parallelism to achieve superior computational peak performance. However, in time the demand for increased computational power ushered in the age of massively parallel systems.

References

  1. "Fact Sheet & Background: Roadrunner Smashes the Petaflop Barrier". IBM. 9 June 2008. Retrieved April 1, 2013.
  2. Gaudin, Sharon (2008-06-09). "IBM's Roadrunner smashes 4-minute mile of supercomputing". Computerworld. Archived from the original on 2008-12-24. Retrieved 2008-06-10.
  3. Fildes, Jonathan (2008-06-09). "Supercomputer sets petaflop pace". BBC News. Retrieved 2008-06-09.
  4. "TOP500 Supercomputer Sites". top500.org. 11 November 2008.
  5. "The Green500 List — June 2009". The Green500.
  6. Montoya, Susan (30 March 2013). "End of the Line for Roadrunner Supercomputer". The Associated Press.
  7. "IBM to Build World's First Cell Broadband Engine Based Supercomputer". IBM. 2006-09-06. Retrieved 2008-05-31.
  8. "IBM Selected to Build New DOE Supercomputer". NNSA. 2006-09-06. Archived from the original on 2008-06-18. Retrieved 2008-05-31.
  9. "International Supercomputing Conference to Host First Panel Discussion on Breaking the Petaflops Barrier". TOP500 Supercomputing Sites. 9 June 2008. Archived from the original on 11 October 2008. Retrieved 11 October 2015.
  10. Koch, Ken (2008-03-13). "Roadrunner Platform Overview" (PDF). Los Alamos National Laboratory. Retrieved 2008-05-31.
  11. Borrett, Ann (2007). "Roadrunner - Integrated Hybrid Node" (PDF).
  12. Squyres, Jeff. "Open MPI: 10^15 Flops Can't Be Wrong" (PDF). Open MPI. Retrieved 2008-11-22.
  13. Brodkin, Jon. "World's top supercomputer from '09 is now obsolete, will be dismantled". Ars Technica. Retrieved March 31, 2013.
  14. "Los Alamos computer breaks petaflop barrier". IBM. 2008-06-09. Retrieved 2008-06-12.
  15. Barker, Kevin J.; Davis, Kei; Hoisie, Adolfy; Kerbyson, Darren J.; Lang, Mike; Pakin, Scott; Sancho, Jose C. (2008). "Entering the petaflop era: The architecture and performance of Roadrunner" (PDF). International Conference for High Performance Computing, Networking, Storage and Analysis: 1–11. doi:10.1109/SC.2008.5217926. ISBN 978-1-4244-2834-2. Retrieved 2013-04-02.
  16. Poole, Steve (September 2006). "DarkHorse: a Proposed PetaScale Architecture" (PDF). Los Alamos National Laboratory. Retrieved 11 October 2015.
  17. "Top500 List - November 2012". TOP500. Retrieved April 2, 2013.
  18. "World's first petascale supercomputer will be shredded to bits". Ars Technica. Retrieved April 1, 2013.

Records
World's most powerful supercomputer, June 2008 – November 2009
Preceded by: Blue Gene/L (478.20 teraflops)
Succeeded by: Jaguar (1.75 petaflops)