Active | |
---|---|
Sponsors | U.S. Department of Energy |
Operators | Lawrence Livermore National Laboratory and U.S. Department of Energy |
Location | Livermore Computing Complex |
Architecture | HPE Cray Shasta |
Power | 30 MW [1] |
Operating system | TOSS |
Space | TBA |
Memory | 5.4375 petabytes [2] |
Storage | TBA |
Speed | 1.742 exaFLOPS (Rmax) / 2.746 exaFLOPS (Rpeak) [2] |
Cost | US$600 million (estimated cost) |
Purpose | Scientific research and development, stockpile stewardship [3] |
Hewlett Packard Enterprise El Capitan is an exascale supercomputer hosted at the Lawrence Livermore National Laboratory in Livermore, California, United States, that became operational in 2024. It is based on the Cray EX Shasta architecture. El Capitan displaced Frontier as the world's fastest supercomputer in the 64th edition of the TOP500 list (November 2024). It is the third exascale system deployed by the United States, and its primary purpose is to support the stockpile stewardship program of the US National Nuclear Security Administration.
El Capitan uses a combined 11,039,616 CPU and GPU cores: 43,808 AMD 4th Gen EPYC "Genoa" 24-core 1.8 GHz CPUs (1,051,392 cores) and 43,808 AMD Instinct MI300A GPUs (9,988,224 cores). Each MI300A integrates 24 Zen 4-based CPU cores and a CDNA 3-based GPU on a single organic package, along with 128 GB of HBM3 memory. [4]
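The core totals quoted above can be cross-checked with simple arithmetic. The sketch below uses only the figures from the text; the per-device GPU figure it derives is an inference from those totals (it assumes the GPU "cores" are spread evenly across the 43,808 devices), not a number stated in the article.

```python
# Cross-check of the core counts quoted in the text.
# All constants come from the article; variable names are illustrative.
APUS = 43_808                  # MI300A devices counted in the core totals
CPU_CORES_PER_APU = 24         # Zen 4 CPU cores per MI300A
CPU_CORES_TOTAL = 1_051_392    # CPU cores quoted in the text
GPU_CORES_TOTAL = 9_988_224    # GPU "cores" quoted in the text
COMBINED_TOTAL = 11_039_616    # combined figure quoted in the text

# CPU cores implied by the device count.
cpu_cores = APUS * CPU_CORES_PER_APU

# GPU "cores" per device, assuming an even split across devices.
gpu_cores_per_apu, remainder = divmod(GPU_CORES_TOTAL, APUS)

combined = cpu_cores + GPU_CORES_TOTAL

print(f"CPU cores:          {cpu_cores:,}")
print(f"GPU cores per APU:  {gpu_cores_per_apu} (remainder {remainder})")
print(f"Combined cores:     {combined:,}")
```

The three quoted totals are mutually consistent, and the GPU total divides evenly across the devices.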
Blades are interconnected by an HPE Slingshot 64-port switch that provides 12.8 terabits/second of bandwidth. Groups of blades are linked in a dragonfly topology with at most three hops between any two nodes. Cabling is either optical or copper, customized to minimize cable length. Total cabling runs 145 km (90 mi).
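As a rough illustration of the interconnect figures above, the sketch below divides the quoted switch bandwidth across its ports and converts the cabling length to miles. It assumes the 12.8 terabits/second is shared evenly across all 64 ports; the text does not say how the figure is measured (for example, per direction or in aggregate), so the per-port result is an estimate only.

```python
# Back-of-the-envelope figures for the switch and cabling described in the text.
SWITCH_PORTS = 64
SWITCH_BANDWIDTH_GBPS = 12_800   # 12.8 terabits/second, expressed in gigabits/second
CABLING_KM = 145
KM_PER_MILE = 1.609344           # exact definition of the international mile

# Assumption: bandwidth is divided evenly among the ports.
per_port_gbps = SWITCH_BANDWIDTH_GBPS // SWITCH_PORTS

cabling_miles = CABLING_KM / KM_PER_MILE

print(f"Bandwidth per port: {per_port_gbps} Gb/s")
print(f"Total cabling:      {cabling_miles:.1f} mi")
```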
El Capitan uses an APU architecture, where the CPU and GPU share an internal on-chip coherent interconnect.
El Capitan occupies 7,500 square feet (700 m²) of floor space, roughly the area of two tennis courts. [5] It is made up of at least 87 compute racks, which house the compute nodes and the "Rabbit" NVM-Express fast storage arrays. According to The Next Platform: "El Capitan has a total of 11,136 nodes in liquid-cooled Cray EX racks, with four MI300A compute engines per node and a total of 44,544 devices across the system. Each device has 128 GB of HBM3 main memory shared across the CPU and GPU chiplets, which runs at 5.2 GHz and delivers an aggregate 5.3 TB/sec of aggregate bandwidth into and out of the CPU and GPU chiplets." [6]
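The node, device, and memory figures in the quotation can be checked against each other and against the infobox. The sketch below is an illustration under one assumption: that the infobox's "5.4375 petabytes" uses binary prefixes (1 PB taken as 1024² GB). The text does not explain why the 44,544 devices quoted here exceed the 43,808 used in the core counts, so the code only reports the difference.

```python
# Consistency check of the node, device, and memory figures quoted in the text.
NODES = 11_136
APUS_PER_NODE = 4
HBM_GB_PER_APU = 128
APUS_IN_CORE_COUNT = 43_808     # device count used for the core totals

apus = NODES * APUS_PER_NODE    # total MI300A devices
total_hbm_gb = apus * HBM_GB_PER_APU

# Assumption: binary prefixes reproduce the infobox memory figure.
total_hbm_pb_binary = total_hbm_gb / 1024**2
# Decimal prefixes, for comparison.
total_hbm_pb_decimal = total_hbm_gb / 1000**2

# Devices (and whole nodes) not included in the core totals.
extra_apus = apus - APUS_IN_CORE_COUNT
extra_nodes = extra_apus // APUS_PER_NODE

print(f"Devices:             {apus:,}")
print(f"HBM3 (binary PB):    {total_hbm_pb_binary}")
print(f"HBM3 (decimal PB):   {total_hbm_pb_decimal}")
print(f"Devices not counted: {extra_apus} ({extra_nodes} nodes)")
```

Under the binary-prefix assumption, the quoted node count and per-device memory reproduce the infobox memory figure exactly.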
El Capitan was ordered as part of the Department of Energy's CORAL-2 initiative to replace Sierra, an IBM/NVIDIA machine deployed in 2018. The original design envisioned hundreds of thousands of GPUs and 40 MW of power. [citation needed] LLNL partnered with HPE Cray and AMD to build the system. [7]
Three El Capitan prototypes, named rzVernal, Tioga, and Tenaya, were powerful enough to be listed on the TOP500 supercomputer list in June 2023. [8] rzVernal reached 4.1 petaFLOPS. [9] In early July 2023, the first components of El Capitan were installed at Lawrence Livermore, with complete installation expected by mid-2024. [10]
By November 18, 2024, El Capitan was operational and verified as the world's fastest supercomputer, achieving 1.742 exaFLOPS. [11]
In February 2025, El Capitan officially launched at the Lawrence Livermore National Laboratory (LLNL) in California. The lab noted that the supercomputer cost $600 million to build and will handle various sensitive and classified tasks related to the U.S. stockpile of nuclear weapons. [12]
El Capitan was officially dedicated on January 9, 2025. The ceremony was attended by Antonio Neri and Lisa Su, the CEOs of Hewlett Packard Enterprise (HPE) and Advanced Micro Devices (AMD), respectively. [1]
During the event, both CEOs discussed the implications of El Capitan for their companies' AI initiatives. Neri said, "There is complete leverage," highlighting the parallels between El Capitan and the systems used for training artificial intelligence. Su elaborated, "It's basically the same building blocks, as Antonio said, configured in a different way," emphasizing that the technology developed for El Capitan can be adapted to their AI efforts. [1]