ASCI Red

Active: Two-thirds operational March 1997, fully operational June 1997, [1] decommissioned 2006 [2]
Sponsors: Intel Corporation [1]
Operators: Sandia National Laboratories, US Department of Energy
Location: Sandia National Laboratories, United States
Power: 850 kW
Operating system: Cougar / TOS (a Mach kernel derivative)
Space: 1,600 sq ft (150 m²) [3]
Memory: 1212 gigabytes
Speed: 1.3 teraflops (peak) [1]
Ranking: TOP500: 1, June 2000 [4]
Purpose: Nuclear materials testing, other
Legacy: First supercomputer to achieve over 1.0 teraflops on the LINPACK test
Website: web.archive.org

ASCI Red (also known as ASCI Option Red or TFLOPS) was the first computer built under the Accelerated Strategic Computing Initiative (ASCI), [5] [6] the supercomputing initiative of the United States government created to help the maintenance of the United States nuclear arsenal after the 1992 moratorium on nuclear testing.

ASCI Red was built by Intel and installed at Sandia National Laboratories in late 1996. The design was based on the Intel Paragon computer. The original goals were met: to deliver a true teraflop machine by the end of 1996, and for it to be capable of running an ASCI application using all memory and nodes by September 1997. [7] The US government used it from 1997 to 2005, and it was the world's fastest supercomputer until late 2000. [4] [6] It was the first ASCI machine that the Department of Energy acquired, [6] and also the first supercomputer to score above one teraflops on the LINPACK benchmark, a test that measures a computer's calculation speed. Later upgrades to ASCI Red allowed it to perform above two teraflops.

ASCI Red earned a reputation for reliability that some veterans say has never been beaten. Sandia director Bill Camp said that ASCI Red had the best reliability of any supercomputer ever built, and “was supercomputing’s high-water mark in longevity, price, and performance.” [8]

ASCI Red was decommissioned in 2006. [2]

System structure

The ASCI Red supercomputer was a distributed memory MIMD (Multiple Instruction, Multiple Data) message-passing computer. The design provided high degrees of scalability for I/O, memory, compute nodes, storage capacity, and communications; standard parallel interfaces also made it possible to port parallel applications to the machine. The machine was structured into four partitions: Compute, Service, I/O, and System. Parallel applications executed in the Compute Partition, which contained nodes optimized for floating-point performance. The compute nodes had only the features required for efficient computation; they were not intended for general interactive services. The Service Partition provided an integrated, scalable host that supported interactive users (log-in sessions), application development, and system administration. The I/O Partition supported disk I/O, a scalable parallel file system, and network services. The System Partition supported initial booting and system Reliability, Availability, and Serviceability (RAS) capabilities. [7]

The Service Partition helped integrate the different parts of ASCI Red. It provided a scalable host for users and was used for general system administration; [1] it comprised the log-in sessions, tools for application development, and utilities for network connections. [5] The I/O Partition provided a file system and network services. [5] The Compute Partition contained nodes designed for floating-point performance, and this is where the actual computing took place. [5] Each compute node accommodated two 200 MHz Pentium Pro processors, each with a 16 KB level-1 cache and a 256 KB level-2 cache; these were later upgraded to two 333 MHz Pentium II OverDrive processors, each with a 32 KB level-1 cache and a 512 KB level-2 cache. [9] According to Intel, ASCI Red was also the first large-scale supercomputer to be built entirely of common, commercially available components. [10]

All of ASCI Red's partitions were interconnected to form one supercomputer, but none of the nodes supported global shared memory. Each node worked in its own memory and shared data with the others through "explicit message-passing". [11]
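
The idea of private memory plus explicit message passing can be illustrated with a small single-process simulation. This is only a sketch: the `Node` class, its `send`/`recv` methods, and the `global_sum` helper are hypothetical names invented for the illustration and are not taken from ASCI Red's software. Each simulated node keeps its own data, and the only way to combine results is to send a message to another node's inbox.

```python
from collections import deque


class Node:
    """One simulated node with private memory and an inbox (hypothetical)."""

    def __init__(self, rank, values):
        self.rank = rank
        self.memory = list(values)  # private: other nodes never read this directly
        self.inbox = deque()        # messages delivered by the simulated interconnect

    def send(self, dest, payload):
        # Explicit message: the payload is placed in the receiver's inbox.
        dest.inbox.append((self.rank, payload))

    def recv(self):
        # Take the oldest pending message as a (sender_rank, payload) pair.
        return self.inbox.popleft()


def global_sum(nodes):
    """Combine per-node partial sums by sending them to node 0."""
    root = nodes[0]
    for node in nodes[1:]:
        node.send(root, sum(node.memory))  # each node reduces its own data locally
    total = sum(root.memory)
    while root.inbox:
        _sender, partial = root.recv()
        total += partial
    return total


# Four nodes, each holding four consecutive integers from 0..15.
nodes = [Node(rank, range(rank * 4, rank * 4 + 4)) for rank in range(4)]
result = global_sum(nodes)
print(result)  # 120, the sum of 0..15
```

On a real distributed-memory machine the inbox would be replaced by the interconnect and a message-passing library, but the division of labour is the same: local work on local data, with explicit sends and receives for everything shared.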

Technical specifications

The computer occupied almost 1,600 square feet (150 m²) of floor space [3] and was made up of 104 cabinets. Of those, 76 held compute nodes (processors), 8 held switches, and 20 held disks. It had a total of 1212 GB of RAM and 9298 processors. The original machine used Intel Pentium Pro processors clocked at 200 MHz, which were later upgraded to specially packaged Pentium II Xeon processors clocked at 333 MHz. Overall, it required 850 kW of power (not including air conditioning). What set ASCI Option Red apart from its predecessors was its high I/O bandwidth. Previous supercomputers had multi-GFLOPS performance, yet their slow I/O could bottleneck the system. Intel's TFLOPS PFS was a highly efficient parallel file system that could sustain transfer speeds of up to 1 GB/s, relieving that bottleneck. [12]
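
The figures in this section can be cross-checked with a little arithmetic. The sketch below uses only the numbers quoted above; the per-processor memory figure is a simple average (the text does not say how memory was distributed across partitions), it assumes 1 GB = 1024 MB, and the time to write out all of memory treats the 1 GB/s file system rate as a sustained best case.

```python
# Figures quoted in the text above.
CABINETS = {"compute": 76, "switch": 8, "disk": 20}
TOTAL_RAM_GB = 1212
PROCESSORS = 9298
PFS_BANDWIDTH_GB_PER_S = 1.0  # "up to 1 GB/s", so treated here as a best case

# The three cabinet types should add up to the stated total of 104.
total_cabinets = sum(CABINETS.values())

# Average memory per processor, assuming 1 GB = 1024 MB (an assumption).
ram_per_processor_mb = TOTAL_RAM_GB * 1024 / PROCESSORS

# Time to stream the whole of RAM through the parallel file system.
full_memory_dump_s = TOTAL_RAM_GB / PFS_BANDWIDTH_GB_PER_S

print(total_cabinets)                   # 104
print(round(ram_per_processor_mb, 1))   # roughly 133.5 MB per processor
print(full_memory_dump_s / 60)          # about 20 minutes
```

Even at the full 1 GB/s, writing out the entire memory would take on the order of twenty minutes, which gives a sense of why I/O bandwidth mattered for a machine of this size.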

First to TFLOPS

In December 1996, three quarters of ASCI Red was measured at a world-record 1.06 TFLOPS on MP LINPACK. The machine held the record for fastest supercomputer in the world for several consecutive years, peaking at 2.38 TFLOPS after a processor and memory upgrade in 1999. [4] [7] The system used Pentium Pro processors when initially constructed and when it first recorded performance above one TFLOPS. In that configuration, when fully built, it recorded 1.6 TFLOPS. Upgrades later in 1999, to specially packaged Pentium II Xeon processors, pushed performance to 3.1 TFLOPS. [8]
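
A rough theoretical peak can be estimated as processors × clock rate × floating-point operations per cycle. The sketch below assumes one floating-point operation per cycle per processor, which is an assumption made for this illustration and is not stated in the source; `peak_tflops` is a hypothetical helper name. Under that assumption the 333 MHz configuration comes out close to the 3.1 TFLOPS quoted above, which would suggest that figure is a theoretical peak rather than a LINPACK result, although the source does not say so.

```python
PROCESSORS = 9298      # processor count quoted in the technical specifications
FLOPS_PER_CYCLE = 1    # assumption for this sketch, not stated in the source


def peak_tflops(clock_mhz, processors=PROCESSORS, flops_per_cycle=FLOPS_PER_CYCLE):
    """Theoretical peak in TFLOPS: processors x clock (Hz) x flops per cycle."""
    return processors * clock_mhz * 1e6 * flops_per_cycle / 1e12


original = peak_tflops(200)  # Pentium Pro configuration
upgraded = peak_tflops(333)  # upgraded configuration

# Fraction of the estimated peak reached by the 2.38 TFLOPS LINPACK result.
linpack_fraction = 2.38 / upgraded

print(round(original, 2))
print(round(upgraded, 2))
print(round(linpack_fraction, 2))
```

Measured LINPACK results are always below the theoretical peak, so a ratio below one here is expected; the exact value depends on the flops-per-cycle assumption.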

Operating system

The different partitions of ASCI Red ran different operating systems. Users worked in an environment called "Teraflops OS", an operating system (once called Paragon OS) originally developed for the Intel Paragon XP/S supercomputer. [5] The Compute Partition ran an operating system named Cougar. [11] Cougar was a collaboration between Sandia Labs and the University of New Mexico: a lightweight OS based on PUMA and SUNMOS, two systems that were also designed for use on the Paragon supercomputer. [11] It consisted of a lightweight kernel, the Process Control Thread, and other utilities and libraries. The Linux 2.4 kernel was also ported to the system and a custom CNIC driver was written, but the heavyweight OS did not perform as well as the Cougar lightweight kernel on many benchmarks. [11]

References

  1. Thomas, Robert. "ASCI Red Homepage". Sandia National Laboratories. Archived from the original on September 26, 2011. Retrieved October 30, 2011.
  2. "Sandia's ASCI Red, world's first teraflop supercomputer, is decommissioned". sandia.gov. June 29, 2006. Archived from the original on September 29, 2013. Retrieved May 26, 2014.
  3. Mattson, Timothy. "An Overview of the Intel TFLOPS Supercomputer" (PDF). MIT. Retrieved October 30, 2011.
  4. "TOP500.org Ranking History for ASCI Red". TOP500 Supercomputer Sites. Retrieved October 29, 2011.
  5. Mattson, Timothy. "The ASCI Option Red Supercomputer". Archived from the original on May 28, 2010. Retrieved October 27, 2011.
  6. Garg, Sharad (2001). "Performance Evaluation of Parallel File Systems for PC Clusters and ASCI Red". Proceedings 2001 IEEE International Conference on Cluster Computing. IEEE. pp. 172–177. doi:10.1109/CLUSTR.2001.959973. ISBN 0-7695-1116-3. S2CID 13224481.
  7. "7X Performance Results – Final Report: ASCI Red vs. Red Storm" (PDF). Retrieved November 17, 2011.
  8. "Sandia's ASCI Red, world's first teraflop supercomputer, is decommissioned" (PDF). Retrieved January 8, 2013.
  9. "TOP500.org feature page on the ASCI Red of the Sandia National Laboratory". Archived from the original on January 9, 2016. Retrieved January 8, 2016.
  10. Warren, Michael (November 1997). "Pentium Pro Inside: I. A Treecode at 430 Gigaflops on ASCI Red, II. Price/Performance of $50/Mflop on Loki and Hyglac". Proceedings of the ACM/IEEE Conference. IEEE: 61. doi:10.1109/SC.1997.10057. S2CID 13167835.
  11. Brightwell; Riesen; Underwood; Hudson; Bridges; MacCabe (2003). "A performance comparison of Linux and a lightweight kernel". Proceedings IEEE International Conference on Cluster Computing CLUSTR-03. pp. 251–258. doi:10.1109/CLUSTR.2003.1253322. ISBN 0-7695-2066-9. S2CID 7454194.
  12. Garg, Sharad (1998). "TFLOPS PFS: Architecture and Design of A Highly Efficient Parallel File System". Proceedings of the IEEE/ACM SC98 Conference. IEEE. p. 2. doi:10.1109/SC.1998.10003. ISBN 0-8186-8707-X. S2CID 8683745.

Records
World's most powerful supercomputer: June 1997 – June 2000
Preceded by: CP-PACS/2048 (368.20 gigaflops)
Succeeded by: ASCI White (4.938 teraflops)