|Active||Became operational October 29, 2012; decommissioned August 2, 2019|
|Sponsors||US DOE and NOAA (<10%)|
|Location||Oak Ridge National Laboratory|
|Architecture||18,688 AMD Opteron 6274 16-core CPUs|
18,688 Nvidia Tesla K20X GPUs
|Operating system||Cray Linux Environment|
|Space||404 m² (4,352 ft²)|
|Memory||693.5 TiB (584 TiB CPU and 109.5 TiB GPU)|
|Storage||40 PB, 1.4 TB/s IO Lustre filesystem|
|Speed||17.59 petaFLOPS (LINPACK)|
27 petaFLOPS theoretical peak
|Cost||US$97 million (equivalent to $124 million in 2022)|
|Ranking||TOP500 : 4th, June 2017|
|Legacy||Ranked 1 on TOP500 when built.|
First GPU based supercomputer to perform over 10 petaFLOPS
Titan or OLCF-3 was a supercomputer built by Cray at Oak Ridge National Laboratory for use in a variety of science projects. Titan was an upgrade of Jaguar, a previous supercomputer at Oak Ridge, that used graphics processing units (GPUs) in addition to conventional central processing units (CPUs). Titan was the first such hybrid to perform over 10 petaFLOPS. The upgrade began in October 2011, stability testing commenced in October 2012, and the system became available to researchers in early 2013. The initial cost of the upgrade was US$60 million, funded primarily by the United States Department of Energy.
Titan was eclipsed at Oak Ridge by Summit in 2019, which was built by IBM and features fewer nodes with much greater GPU capability per node as well as local per-node non-volatile caching of file data from the system's parallel file system.
Titan employed AMD Opteron CPUs in conjunction with Nvidia Tesla GPUs to improve energy efficiency while providing an order of magnitude increase in computational power over Jaguar. It used 18,688 CPUs paired with an equal number of GPUs to perform at a theoretical peak of 27 petaFLOPS; in the LINPACK benchmark used to rank supercomputers' speed, it performed at 17.59 petaFLOPS. This was enough to take first place in the November 2012 list by the TOP500 organization, but Tianhe-2 overtook it on the June 2013 list.
Titan was available for any scientific purpose; access depended on the importance of the project and its potential to exploit the hybrid architecture. Any selected programs also had to be executable on other supercomputers to avoid sole dependence on Titan. Six vanguard programs were the first selected. They dealt mostly with molecular scale physics or climate models, while 25 others were queued behind them. The inclusion of GPUs compelled authors to alter their programs. The modifications typically increased the degree of parallelism, given that GPUs offer many more simultaneous threads than CPUs. The changes often yielded greater performance even on CPU-only machines.
Plans to create a supercomputer capable of 20 petaFLOPS at the Oak Ridge Leadership Computing Facility (OLCF) at Oak Ridge National Laboratory (ORNL) originated as far back as 2005, when Jaguar was built. Titan was itself to be replaced by an approximately 200 petaFLOPS system in 2016 as part of ORNL's plan to operate an exascale (1000 petaFLOPS to 1 exaFLOPS) machine by 2020. The initial plan to build a new 15,000 square meter (160,000 ft²) building for Titan was discarded in favor of using Jaguar's existing infrastructure. The precise system architecture was not finalized until 2010, although a deal with Nvidia to supply the GPUs was signed in 2009. Titan was first announced at the private ACM/IEEE Supercomputing Conference (SC10) on November 16, 2010, and was publicly announced on October 11, 2011, as the first phase of the Titan upgrade began.
Jaguar had received various upgrades since its creation. It began with the Cray XT3 platform that yielded 25 teraFLOPS. By 2008, Jaguar had been expanded with more cabinets and upgraded to the XT4 platform, reaching 263 teraFLOPS. In 2009, it was upgraded to the XT5 platform, hitting 1.4 petaFLOPS. Its final upgrades brought Jaguar to 1.76 petaFLOPS.
Titan was funded primarily by the US Department of Energy through ORNL. Funding was sufficient to purchase the CPUs but not all of the GPUs, so the National Oceanic and Atmospheric Administration agreed to fund the remaining nodes in return for computing time. ORNL scientific computing chief Jeff Nichols noted that Titan cost approximately $60 million upfront, of which the NOAA contribution was less than $10 million, but precise figures were covered by non-disclosure agreements. The full term of the contract with Cray included $97 million, excluding potential upgrades.
The yearlong conversion began October 9, 2011. Between October and December, 96 of Jaguar's 200 cabinets, each containing 24 XT5 blades (two 6-core CPUs per node, four nodes per blade), were upgraded to XK7 blades (one 16-core CPU per node, four nodes per blade) while the remainder of the machine remained in use. In December, computation was moved to the 96 XK7 cabinets while the remaining 104 cabinets were upgraded to XK7 blades. ORNL's external ESnet connection was upgraded from 10 Gbit/s to 100 Gbit/s and the system interconnect (the network over which CPUs communicate with each other) was updated. The Seastar design used in Jaguar was upgraded to the Gemini interconnect used in Titan, which connects the nodes into a direct 3D torus interconnect network. Gemini uses wormhole flow control internally. The system memory was doubled to 584 TiB. 960 of the XK7 nodes (10 cabinets) were fitted with a Fermi based GPU as Kepler GPUs were not then available; these 960 nodes were referred to as TitanDev and used to test code. This first phase of the upgrade increased the peak performance of Jaguar to 3.3 petaFLOPS. Beginning on September 13, 2012, Nvidia K20X GPUs were fitted to all of Jaguar's XK7 compute blades, including the 960 TitanDev nodes. In October, the task was completed and the computer was finally renamed Titan.
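To make the 3D torus topology concrete, the following minimal sketch computes a node's direct neighbors and the hop distance between two nodes. The function names and the 4 by 4 by 4 size are hypothetical choices for illustration, not Titan's actual Gemini configuration or routing logic.

```python
def torus_neighbors(coord, dims):
    """Return the six direct neighbors of a node in a 3D torus.

    coord: (x, y, z) position of the node.
    dims:  (X, Y, Z) size of the torus in each dimension (illustrative).
    """
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            n = list(coord)
            # The modulo supplies the wraparound link that turns a mesh into a torus.
            n[axis] = (n[axis] + step) % dims[axis]
            neighbors.append(tuple(n))
    return neighbors


def torus_hops(a, b, dims):
    """Minimum hop count between two nodes, taking the shorter way around each ring."""
    return sum(min(abs(p - q), d - abs(p - q)) for p, q, d in zip(a, b, dims))


dims = (4, 4, 4)  # assumption: illustrative size only, not Titan's real dimensions
print(torus_neighbors((0, 0, 0), dims))
print(torus_hops((0, 0, 0), (3, 2, 1), dims))
```

Because every ring wraps around, a node on the "edge" of the grid still has six neighbors, and the worst-case distance along any axis is half the ring length instead of its full length.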
In March 2013, Nvidia launched the GTX Titan, a consumer graphics card that uses the same GPU die as the K20X GPUs in Titan. Titan underwent acceptance testing in early 2013 but only completed 92% of the tests, short of the required 95%. The problem was discovered to be excess gold in the female edge connectors of the motherboards' PCIe slots causing cracks in the motherboards' solder. The cost of repair was borne by Cray and between 12 and 16 cabinets were repaired each week. Throughout the repairs users were given access to the available CPUs. On March 11, they gained access to 8,972 GPUs. ORNL announced on April 8 that the repairs were complete and acceptance test completion was announced on June 11, 2013.
Titan's hardware had a theoretical peak performance of 27 petaFLOPS with "perfect" software. On November 12, 2012, the TOP500 organization, which ranks the world's supercomputers by LINPACK performance, ranked Titan first at 17.59 petaFLOPS, displacing IBM Sequoia. Titan also ranked third on the Green500, which ranks the same 500 supercomputers in terms of energy efficiency. In the June 2013 TOP500 ranking, Titan fell to second place behind Tianhe-2 and to twenty-ninth on the Green500 list. Titan did not re-test for the June 2013 ranking, because it would still have ranked second even at its theoretical peak of 27 petaFLOPS.
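The gap between the theoretical peak and the LINPACK result can be expressed as a benchmark efficiency. The short sketch below uses only the two figures quoted above; the function name is a hypothetical helper.

```python
PEAK_PFLOPS = 27.0       # theoretical peak quoted in the text
LINPACK_PFLOPS = 17.59   # measured LINPACK result quoted in the text


def linpack_efficiency(measured, peak):
    """Fraction of the theoretical peak achieved on the benchmark."""
    return measured / peak


efficiency = linpack_efficiency(LINPACK_PFLOPS, PEAK_PFLOPS)
print(f"{efficiency:.1%}")
```

On these figures, roughly two thirds of the theoretical peak was reached on the benchmark.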
Titan used Jaguar's 200 cabinets, covering 404 square meters (4,352 ft²), with replaced internals and upgraded networking. Reusing Jaguar's power and cooling systems saved approximately $20 million. Power was provided to each cabinet at three-phase 480 V, which required thinner cables than the US standard 208 V and saved $1 million in copper. At its peak, Titan drew 8.2 MW, 1.2 MW more than Jaguar, but ran almost ten times as fast in terms of floating point calculations. In the event of a power failure, carbon fiber flywheel power storage could keep the networking and storage infrastructure running for up to 16 seconds. After 2 seconds without power, diesel generators fired up, taking approximately 7 seconds to reach full power; they could then provide power indefinitely. The generators were designed only to keep the networking and storage components powered so that a reboot was much quicker; they were not capable of powering the processing infrastructure.
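The cable and backup-power figures can be checked with a rough calculation. The sketch below assumes a balanced three-phase load, a unity power factor, and the peak draw spread evenly over the 200 cabinets; none of these assumptions come from the source, so the currents are illustrative only.

```python
import math


def line_current(power_w, line_voltage_v, power_factor=1.0):
    """Line current of a balanced three-phase load: I = P / (sqrt(3) * V * pf)."""
    return power_w / (math.sqrt(3) * line_voltage_v * power_factor)


# Assumption: 8.2 MW peak draw divided evenly across 200 cabinets.
cabinet_w = 8.2e6 / 200
i_480 = line_current(cabinet_w, 480)
i_208 = line_current(cabinet_w, 208)
print(f"{i_480:.1f} A at 480 V, {i_208:.1f} A at 208 V")

# Backup timeline from the text: flywheels last up to 16 s,
# generators start after 2 s and need about 7 s to reach full power.
margin_s = 16 - (2 + 7)
print(f"flywheel margin: {margin_s} s")
```

For the same power, the current at 208 V is 480/208 (about 2.3) times the current at 480 V, which is why the higher voltage allows thinner conductors. The flywheels also outlast the generator start-up by several seconds.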
Titan had 18,688 nodes (4 nodes per blade, 24 blades per cabinet), each containing a 16-core AMD Opteron 6274 CPU with 32 GB of DDR3 ECC memory and an Nvidia Tesla K20X GPU with 6 GB GDDR5 ECC memory. There were a total of 299,008 processor cores, and a total of 693.5 TiB of CPU and GPU RAM.
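The core count and memory totals follow directly from the per-node figures. The sketch below reproduces them, assuming the per-node "GB" values are binary gigabytes (GiB), which is what makes the totals come out in whole TiB.

```python
NODES = 18_688
CORES_PER_CPU = 16
CPU_MEM_GIB = 32   # assumption: the text's "32 GB" is treated as 32 GiB
GPU_MEM_GIB = 6    # assumption: likewise for the GPU's "6 GB"

total_cores = NODES * CORES_PER_CPU
cpu_tib = NODES * CPU_MEM_GIB / 1024
gpu_tib = NODES * GPU_MEM_GIB / 1024

print(total_cores)                          # CPU cores across all nodes
print(cpu_tib, gpu_tib, cpu_tib + gpu_tib)  # memory totals in TiB
```

Under that assumption the totals are 584 TiB of CPU memory and 109.5 TiB of GPU memory, 693.5 TiB combined.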
Initially, Titan used Jaguar's 10 PB of Lustre storage with a transfer speed of 240 GB/s, but in April 2013, the storage was upgraded to 40 PB with a transfer rate of 1.4 TB/s. GPUs were selected for their vastly higher parallel processing efficiency over CPUs. Although the GPUs had a slower clock speed than the CPUs, each GPU contained 2,688 CUDA cores at 732 MHz, resulting in a faster overall system. Consequently, the CPUs' cores were used to allocate tasks to the GPUs rather than directly processing the data as in conventional supercomputers.
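A back-of-the-envelope estimate shows how the CUDA core count and clock speed lead to the quoted peak of roughly 27 petaFLOPS. Only the node count, core count, and GPU clock come from the text. The operations per cycle, the double-precision ratio, and the CPU clock are assumptions based on commonly published specifications for these parts, so the result is an approximation.

```python
NODES = 18_688

# From the text: 2,688 CUDA cores per GPU at 732 MHz.
CUDA_CORES = 2_688
GPU_CLOCK_HZ = 732e6

# Assumptions not stated in the text:
FLOPS_PER_CORE_CYCLE = 2         # one fused multiply-add counted as 2 operations
DP_RATIO = 1 / 3                 # double-precision rate relative to single precision
CPU_CORES = 16
CPU_CLOCK_HZ = 2.2e9             # assumed Opteron 6274 base clock
CPU_DP_FLOPS_PER_CORE_CYCLE = 4  # assumed double-precision operations per core per cycle

gpu_sp = CUDA_CORES * GPU_CLOCK_HZ * FLOPS_PER_CORE_CYCLE  # single precision, per GPU
gpu_dp = gpu_sp * DP_RATIO                                 # double precision, per GPU
cpu_dp = CPU_CORES * CPU_CLOCK_HZ * CPU_DP_FLOPS_PER_CORE_CYCLE

system_dp_pflops = NODES * (gpu_dp + cpu_dp) / 1e15
print(f"GPU: {gpu_dp / 1e12:.2f} TFLOPS, CPU: {cpu_dp / 1e12:.2f} TFLOPS per node")
print(f"System: {system_dp_pflops:.1f} PFLOPS")
```

Under these assumptions each GPU contributes about nine times the double-precision throughput of its paired CPU, which is consistent with the CPUs mainly orchestrating work for the GPUs.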
Titan ran the Cray Linux Environment: a full version of Linux on the login nodes that users accessed directly, and a smaller, more efficient version on the compute nodes.
Titan's components were air-cooled by heat sinks, but the air was chilled before being pumped through the cabinets. Fan noise was so loud that hearing protection was required for people spending more than 15 minutes in the machine room. The system had a cooling capacity of 23.2 MW (6600 tons) and worked by chilling water to 5.5 °C (42 °F), which in turn cooled recirculated air.
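The paired units in the cooling figures can be cross-checked with two small conversions. The sketch assumes the standard definition of a ton of refrigeration (12,000 BTU/h, about 3.517 kW), which the source does not spell out; the function names are hypothetical.

```python
KW_PER_TON = 3.517  # assumption: 1 ton of refrigeration is about 3.517 kW


def tons_to_mw(tons):
    """Convert tons of refrigeration to megawatts of cooling capacity."""
    return tons * KW_PER_TON / 1000


def celsius_to_fahrenheit(c):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32


print(round(tons_to_mw(6600), 1))          # cooling capacity in MW
print(round(celsius_to_fahrenheit(5.5)))   # chilled water temperature in °F
```

Both conversions agree with the rounded values given in the text.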
Researchers also had access to EVEREST (Exploratory Visualization Environment for Research and Technology) to better understand the data that Titan output. EVEREST is a visualization room with a 10 by 3 meter (33 by 10 ft) screen and a smaller, secondary screen. The screens are 37 and 33 megapixels respectively with stereoscopic 3D capability.
In 2009, the Oak Ridge Leadership Computing Facility, which managed Titan, narrowed the fifty applications for first use of the supercomputer down to six "vanguard" codes chosen for the importance of the research and for their ability to fully utilize the system. The six vanguard projects to use Titan were:
In 2013, thirty-one codes were planned to run on Titan, typically four or five at any one time.
The code of many projects had to be modified to suit the GPU processing of Titan, but each code was required to be executable on CPU-based systems so that projects did not become solely dependent on Titan. OLCF formed the Center for Accelerated Application Readiness (CAAR) to aid with the adaptation process. It held developer workshops at Nvidia headquarters to educate users about the architecture, compilers and applications on Titan. CAAR worked on compilers with Nvidia and code vendors to integrate directives for GPUs into their programming languages. Researchers could thus express parallelism in their code with their existing programming language, typically Fortran, C or C++, and the compiler could express it to the GPUs. Dr. Bronson Messer, a computational astrophysicist, said of the task: "an application using Titan to the utmost must also find a way to keep the GPU busy, remembering all the while that the GPU is fast, but less flexible than the CPU." Moab Cluster Suite was used to prioritize jobs to nodes to keep utilization high; it improved efficiency from 70% to approximately 95% in the tested software. Some projects found that the changes increased efficiency of their code on non-GPU machines; the performance of Denovo doubled on CPU-based machines.
The amount of code alteration required to run on the GPUs varied by project. According to Dr. Messer of NRDF, only a small percentage of his code ran on GPUs because the calculations were relatively simple but processed repeatedly and in parallel. NRDF is written in CUDA Fortran, a version of Fortran with CUDA extensions for the GPUs. Chimera's third "head" was the first to run on the GPUs as the nuclear burning could most easily be simulated by GPU architecture. Other aspects of the code were planned to be modified in time. On Jaguar, the project modeled 14 or 15 nuclear species but Messer anticipated simulating up to 200 species, allowing far greater precision when comparing the simulation to empirical observation.