Frontier (supercomputer)

Frontier
Active:
  • Deployment: Sep. 2021
  • Completion: May 2022
Operators: Oak Ridge National Laboratory and U.S. Department of Energy
Location: Oak Ridge Leadership Computing Facility
Power: 22.7 MW [1]
Operating system: HPE Cray OS
Space: 680 m² (7,300 sq ft)
Speed: 1.194 exaFLOPS (Rmax) / 1.67982 exaFLOPS (Rpeak) [1]
Cost: US$600 million (estimated)
Purpose: Scientific research and development
Website: www.olcf.ornl.gov/frontier/

Hewlett Packard Enterprise Frontier, or OLCF-5, is the world's first exascale supercomputer. It is hosted at the Oak Ridge Leadership Computing Facility (OLCF) in Tennessee, United States, and became operational in 2022. As of December 2023, Frontier is the world's fastest supercomputer. It is based on the Cray EX and is the successor to Summit (OLCF-4). In its first TOP500 listing, in June 2022, Frontier achieved an Rmax of 1.102 exaFLOPS, which is 1.102 quintillion floating-point operations per second, using AMD CPUs and GPUs. [2] [3] [4] [5] [6]

Measured at 62.68 gigaflops/watt, Frontier TDS topped the Green500 list of the most efficient supercomputers [6] until the Flatiron Institute's Henri supercomputer overtook it in November 2022. [7]

Design

Frontier uses 9,472 AMD Epyc 7713 "Trento" 64-core 2 GHz CPUs (606,208 cores) and 37,888 Instinct MI250X GPUs (8,335,360 cores). The GPUs can perform double-precision operations at the same speed as single-precision operations. [8]

"Trento" is an optimized version of the third-generation Epyc CPU ("Milan"), which is based on the Zen 3 microarchitecture. [9]

Frontier occupies 74 19-inch (48 cm) rack cabinets. [10] Each cabinet hosts 64 blades, each consisting of 2 nodes.

Blades are interconnected by HPE Slingshot 64-port switches that provide 12.8 terabits/second of bandwidth. Groups of blades are linked in a dragonfly topology with at most three hops between any two nodes. Cabling is either optical or copper, customized to minimize cable length. Total cabling runs 145 km (90 mi). Frontier is liquid-cooled, allowing 5x the density of air-cooled architectures. [8]
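The three-hop bound follows from how a dragonfly is wired: switches inside a group are fully connected, and every pair of groups shares a direct link, so a worst-case route is one local hop, one global hop, and one more local hop. The following Python sketch builds a small toy dragonfly and measures its diameter by breadth-first search. The group size, the link counts, the port-assignment rule, and the names `build_dragonfly` and `diameter` are illustrative assumptions, not Frontier's actual Slingshot configuration, and hops are counted between switches rather than between compute nodes.

```python
from collections import deque


def build_dragonfly(a: int, h: int) -> dict:
    """Toy dragonfly: a switches per group, h global links per switch."""
    g = a * h + 1  # group count at which every pair of groups shares one link
    adj = {(grp, sw): set() for grp in range(g) for sw in range(a)}
    for grp in range(g):
        # Local links: every switch in a group connects to every other one.
        for s1 in range(a):
            for s2 in range(s1 + 1, a):
                adj[(grp, s1)].add((grp, s2))
                adj[(grp, s2)].add((grp, s1))
        # Global links: port p of a group leads to group (grp + p + 1) mod g.
        for port in range(a * h):
            peer_grp = (grp + port + 1) % g
            peer_port = a * h - 1 - port  # the peer's port that leads back to grp
            u, v = (grp, port // h), (peer_grp, peer_port // h)
            adj[u].add(v)
            adj[v].add(u)
    return adj


def diameter(adj: dict) -> int:
    """Longest shortest path in switch-to-switch hops, via BFS from every switch."""
    worst = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        worst = max(worst, max(dist.values()))
    return worst


toy = build_dragonfly(a=4, h=2)  # 9 groups of 4 switches, illustrative sizes only
print(len(toy), "switches, diameter", diameter(toy))
```

For this toy network of 36 switches the measured diameter is 3, matching the local, global, local pattern described above.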

Each node consists of one CPU, 4 GPUs and 4 terabytes of flash memory. Each GPU has 128 GB of RAM soldered onto it. [8]
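The counts quoted in this section can be cross-checked against one another. The short Python sketch below multiplies out the cabinet, blade, and node figures and compares the results with the stated CPU and GPU totals. The per-GPU core count and the aggregate GPU memory and flash totals are derived here from the quoted figures rather than stated in the article, so treat them as back-of-the-envelope values.

```python
# Figures quoted in the article text.
CABINETS = 74
BLADES_PER_CABINET = 64
NODES_PER_BLADE = 2
CPUS_PER_NODE = 1
GPUS_PER_NODE = 4
CORES_PER_CPU = 64
GPU_RAM_GB = 128
FLASH_TB_PER_NODE = 4
QUOTED_GPU_CORES = 8_335_360

nodes = CABINETS * BLADES_PER_CABINET * NODES_PER_BLADE
cpus = nodes * CPUS_PER_NODE
gpus = nodes * GPUS_PER_NODE
cpu_cores = cpus * CORES_PER_CPU

# Derived values (not stated in the article).
gpu_cores_per_gpu = QUOTED_GPU_CORES // gpus
total_gpu_ram_gb = gpus * GPU_RAM_GB
total_flash_tb = nodes * FLASH_TB_PER_NODE

print(f"nodes: {nodes:,}  CPUs: {cpus:,}  GPUs: {gpus:,}")
print(f"CPU cores: {cpu_cores:,}  GPU cores per GPU: {gpu_cores_per_gpu}")
print(f"GPU RAM: {total_gpu_ram_gb:,} GB  node flash: {total_flash_tb:,} TB")
```

The products reproduce the quoted 9,472 CPUs, 37,888 GPUs, and 606,208 CPU cores, which suggests the cabinet layout and the component counts describe the same configuration.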

Frontier has coherent interconnects between CPUs and GPUs, allowing GPU memory to be accessed coherently by code running on the Epyc CPUs. [11]

Frontier uses an internal 75 TB/s read / 35 TB/s write / 15 billion IOPS flash storage system, along with the 700 PB Orion site-wide Lustre filesystem. [12]

Frontier consumes 21 megawatts (MW), compared to its predecessor Summit's 13 MW.
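Energy efficiency in gigaflops per watt is sustained performance divided by power draw. As a rough sketch, the Python snippet below applies that ratio to two pairs of figures quoted in this article. Pairing each Rmax value with a power figure is an assumption, since the article does not say the values were measured together, and the helper name `gflops_per_watt` is hypothetical. The 62.68 gigaflops/watt figure quoted for Frontier TDS belongs to a separate listing and is not expected to match.

```python
def gflops_per_watt(rmax_exaflops: float, power_mw: float) -> float:
    """Convert exaFLOPS and megawatts to gigaflops per watt."""
    flops = rmax_exaflops * 1e18  # exa = 10^18
    watts = power_mw * 1e6        # mega = 10^6
    return flops / watts / 1e9    # giga = 10^9


# Assumed pairings of figures quoted in the article.
debut = gflops_per_watt(1.102, 21.0)    # June 2022 Rmax with the 21 MW figure
infobox = gflops_per_watt(1.194, 22.7)  # infobox Rmax with the infobox power

print(f"debut figures:   {debut:.1f} gigaflops/watt")
print(f"infobox figures: {infobox:.1f} gigaflops/watt")
```

Both pairings come out at roughly 52 to 53 gigaflops/watt for the full system.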

History

One of the largest challenges during development was power consumption. Early estimates indicated that hundreds of thousands of GPUs would be needed to reach 1 exaFLOPS, with a total power consumption of 150-500 MW. High efficiency was therefore a primary target of the project. [8]

Oak Ridge partnered with HPE Cray and AMD to build the system at a cost of US$600 million. It began deployment in 2021 [13] and reached full capability in 2022. [14] It clocked 1.1 exaflops Rmax in May 2022, making it the world's fastest supercomputer as measured in the June 2022 edition of the TOP500 list, replacing Fugaku. [1] [15]

Upon its release, the supercomputer topped the Green500 list for most efficient supercomputer, measured at 62.68 gigaflops/watt. [6] ORNL Director Thomas Zacharia said: "Frontier is ushering in a new era of exascale computing to solve the world’s biggest scientific challenges." He added: "This milestone offers just a preview of Frontier’s unmatched capability as a tool for scientific discovery. It is the result of more than a decade of collaboration among the national laboratories, academia and private industry, including DOE's Exascale Computing Project, which is deploying the applications, software technologies, hardware and integration necessary to ensure impact at the exascale." [12]

Related Research Articles

Supercomputer: Type of extremely powerful computer

A supercomputer is a type of computer with a high level of performance as compared to a general-purpose computer. The performance of a supercomputer is commonly measured in floating-point operations per second (FLOPS) instead of million instructions per second (MIPS). Since 2017, supercomputers have existed which can perform over 10^17 FLOPS (a hundred quadrillion FLOPS, 100 petaFLOPS or 100 PFLOPS). For comparison, a desktop computer has performance in the range of hundreds of gigaFLOPS (10^11) to tens of teraFLOPS (10^13). Since November 2017, all of the world's fastest 500 supercomputers run on Linux-based operating systems. Additional research is being conducted in the United States, the European Union, Taiwan, Japan, and China to build faster, more powerful and technologically superior exascale supercomputers.

Floating point operations per second is a measure of computer performance in computing, useful in fields of scientific computations that require floating-point calculations.

MareNostrum: Supercomputer in the Barcelona Supercomputing Center

MareNostrum is the main supercomputer in the Barcelona Supercomputing Center. It is the most powerful supercomputer in Spain, one of thirteen supercomputers in the Spanish Supercomputing Network and one of the seven supercomputers of the European infrastructure PRACE.

The Oak Ridge Leadership Computing Facility (OLCF), formerly the National Leadership Computing Facility, is a designated user facility operated by Oak Ridge National Laboratory and the Department of Energy. It contains several supercomputers, the largest of which is an HPE OLCF-5 named Frontier, which was ranked 1st on the TOP500 list of world's fastest supercomputers as of June 2023. It is located in Oak Ridge, Tennessee.

TOP500: Database project devoted to the ranking of computers

The TOP500 project ranks and details the 500 most powerful non-distributed computer systems in the world. The project was started in 1993 and publishes an updated list of the supercomputers twice a year. The first of these updates always coincides with the International Supercomputing Conference in June, and the second is presented at the ACM/IEEE Supercomputing Conference in November. The project aims to provide a reliable basis for tracking and detecting trends in high-performance computing and bases rankings on HPL benchmarks, a portable implementation of the high-performance LINPACK benchmark written in Fortran for distributed-memory computers.

The Green500 is a biannual ranking of supercomputers, from the TOP500 list of supercomputers, in terms of energy efficiency. The list measures performance per watt using the TOP500 measure of high performance LINPACK benchmarks at double-precision floating-point format.

The National Center for Computational Sciences (NCCS) is a United States Department of Energy (DOE) Leadership Computing Facility that houses the Oak Ridge Leadership Computing Facility (OLCF), a DOE Office of Science User Facility charged with helping researchers solve challenging scientific problems of global interest with a combination of leading high-performance computing (HPC) resources and international expertise in scientific computing.

Jaguar (supercomputer): Cray supercomputer at Oak Ridge National Laboratory

Jaguar or OLCF-2 was a petascale supercomputer built by Cray at Oak Ridge National Laboratory (ORNL) in Oak Ridge, Tennessee. The massively parallel Jaguar had a peak performance of just over 1,750 teraFLOPS. It had 224,256 x86-based AMD Opteron processor cores, and operated with a version of Linux called the Cray Linux Environment. Jaguar was a Cray XT5 system, a development from the Cray XT4 supercomputer.

Exascale computing refers to computing systems capable of calculating at least "10^18 IEEE 754 Double Precision (64-bit) operations (multiplications and/or additions) per second (exaFLOPS)"; it is a measure of supercomputer performance.

Tsubame (supercomputer): Series of supercomputers

Tsubame is a series of supercomputers that operates at the GSIC Center at the Tokyo Institute of Technology in Japan, designed by Satoshi Matsuoka.

History of supercomputing

The history of supercomputing goes back to the 1960s when a series of computers at Control Data Corporation (CDC) were designed by Seymour Cray to use innovative designs and parallelism to achieve superior computational peak performance. The CDC 6600, released in 1964, is generally considered the first supercomputer. However, some earlier computers were considered supercomputers for their day, such as the IBM NORC (1954) and, in the early 1960s, the UNIVAC LARC (1960), the IBM 7030 Stretch (1962), and the Manchester Atlas (1962), all of which were of comparable power.

Supercomputing in Europe: Overview of supercomputing in Europe

Several centers for supercomputing exist across Europe, and distributed access to them is coordinated by European initiatives to facilitate high-performance computing. One such initiative, the HPC Europa project, fits within the Distributed European Infrastructure for Supercomputing Applications (DEISA), which was formed in 2002 as a consortium of eleven supercomputing centers from seven European countries. Operating within the CORDIS framework, HPC Europa aims to provide access to supercomputers across Europe.

Titan (supercomputer): American supercomputer

Titan or OLCF-3 was a supercomputer built by Cray at Oak Ridge National Laboratory for use in a variety of science projects. Titan was an upgrade of Jaguar, a previous supercomputer at Oak Ridge, that uses graphics processing units (GPUs) in addition to conventional central processing units (CPUs). Titan was the first such hybrid to perform over 10 petaFLOPS. The upgrade began in October 2011, commenced stability testing in October 2012 and it became available to researchers in early 2013. The initial cost of the upgrade was US$60 million, funded primarily by the United States Department of Energy.

XK7 is a supercomputing platform, produced by Cray, launched on October 29, 2012. XK7 is the second platform from Cray to use a combination of central processing units ("CPUs") and graphical processing units ("GPUs") for computing; the hybrid architecture requires a different approach to programming to that of CPU-only supercomputers. Laboratories that host XK7 machines host workshops to train researchers in the new programming languages needed for XK7 machines. The platform is used in Titan, the world's second fastest supercomputer in the November 2013 list as ranked by the TOP500 organization. Other customers include the Swiss National Supercomputing Centre, which has a 272-node machine, and Blue Waters, whose machine combines Cray XE6 and XK7 nodes and performs at approximately 1 petaFLOPS (10^15 floating-point operations per second).

Summit (supercomputer): Supercomputer developed by IBM

Summit or OLCF-4 is a supercomputer developed by IBM for use at Oak Ridge Leadership Computing Facility (OLCF), a facility at the Oak Ridge National Laboratory, capable of 200 petaFLOPS thus making it the 5th fastest supercomputer in the world after Frontier (OLCF-5), Fugaku, LUMI, and Leonardo, with Frontier being the fastest. It held the number 1 position from November 2018 to June 2020. Its current LINPACK benchmark is clocked at 148.6 petaFLOPS.

Epyc: AMD brand for server microprocessors

Epyc is a brand of multi-core x86-64 microprocessors designed and sold by AMD, based on the company's Zen microarchitecture. Introduced in June 2017, they are specifically targeted for the server and embedded system markets.

Fugaku (supercomputer): Japanese supercomputer

Fugaku (Japanese: 富岳) is a petascale supercomputer at the Riken Center for Computational Science in Kobe, Japan. It started development in 2014 as the successor to the K computer and made its debut in 2020. It is named after an alternative name for Mount Fuji.

Aurora (supercomputer): Planned supercomputer

Aurora is a supercomputer that was sponsored by the United States Department of Energy (DOE) and designed by Intel and Cray for the Argonne National Laboratory. It has been the second fastest supercomputer in the world since 2023. It is expected that after optimizing its performance it will exceed 2 ExaFLOPS, making it the fastest computer ever.

LUMI: Supercomputer in Finland

LUMI is a petascale supercomputer located at the CSC data center in Kajaani, Finland. As of January 2023, the computer is the fastest supercomputer in Europe.

Hewlett Packard Enterprise El Capitan is an upcoming exascale supercomputer, hosted at the Lawrence Livermore National Laboratory in Livermore, United States, and projected to become operational in 2024. It is based on the Cray EX Shasta architecture. When deployed, El Capitan is projected to displace Frontier as the world's fastest supercomputer.

References

  1. "TOP500 November 2023". November 11, 2023. Retrieved November 16, 2023.
  2. Wells, Jack (March 19, 2018). "Powering the Road to National HPC Leadership". OpenPOWER Summit 2018. Archived from the original on August 4, 2020. Retrieved March 25, 2018.
  3. Bethea, Katie (February 13, 2018). "Frontier: OLCF'S Exascale Future – Oak Ridge Leadership Computing Facility". Oak Ridge National Laboratory - Leadership Computing Facility. Archived from the original on March 10, 2018.
  4. "DOE Under Secretary for Science Dabbar's Exascale Update". insideHPC. October 9, 2020. Archived from the original on October 28, 2020.
  5. Don Clark (May 30, 2022). "U.S. Retakes Top Spot in Supercomputer Race". The New York Times. Archived from the original on June 1, 2022. Retrieved June 1, 2022.
  6. Larabel, Michael (May 30, 2022). "AMD-Powered Frontier Supercomputer Tops Top500 At 1.1 Exaflops, Tops Green500 Too". Phoronix. Archived from the original on June 6, 2022. Retrieved June 1, 2022.
  7. Anton Shilov (November 15, 2022). "Nvidia Steals AMD's Supercomputer Efficiency World Record". Tom's Hardware. Archived from the original on February 6, 2023. Retrieved December 10, 2022.
  8. Choi, Charles Q. (June 24, 2022). "The Beating Heart of the World's First Exascale Supercomputer". IEEE Spectrum. Archived from the original on August 14, 2022. Retrieved August 14, 2022.
  9. "Crusher Quick-Start Guide — OLCF User Documentation". docs.olcf.ornl.gov. Archived from the original on November 8, 2022. Retrieved November 8, 2022.
  10. "FRONTIER Spec Sheet". Archived from the original on May 31, 2022. Retrieved May 31, 2022.
  11. "AMD Preparing More Linux Code For The Frontier Supercomputer". Archived from the original on May 28, 2021.
  12. "Frontier supercomputer debuts as world's fastest, breaking exascale barrier". Oak Ridge National Laboratory. May 30, 2022. Archived from the original on June 1, 2022. Retrieved June 17, 2022.
  13. "US Closes in on Exascale: Frontier Installation Is Underway". HPC Wire. September 29, 2021. Archived from the original on January 3, 2022. Retrieved January 3, 2022.
  14. "First Look At Oak Ridge's "Frontier" Exascaler, Contrasted To Argonne's "Aurora"". Next Platform. October 4, 2021. Archived from the original on January 3, 2022. Retrieved January 3, 2022.
  15. "US Takes Supercomputer Top Spot with First True Exascale Machine". Archived from the original on May 31, 2022. Retrieved May 31, 2022.
Records
World's most powerful supercomputer: May 2022 to present (incumbent)
Preceded by: Fugaku