HECToR

Active: October 2007 – March 2014
Sponsors: EPSRC, NERC and BBSRC
Operators: Partners including EPCC, STFC and NAG
Location: University of Edinburgh, Scotland, United Kingdom
Architecture: Cray XE6, 90,112 cores
Operating system: Cray Linux Environment
Memory: 90 terabytes
Storage: >1 petabyte
Speed: 800 teraflops
Purpose: UK academic community use
Web site: www.hector.ac.uk

HECToR (High End Computing Terascale Resource) was a British national supercomputer service for the UK academic community, funded by the Engineering and Physical Sciences Research Council (EPSRC), the Natural Environment Research Council (NERC) and the Biotechnology and Biological Sciences Research Council (BBSRC). The service was run by partners including EPCC, the Science and Technology Facilities Council (STFC) and the Numerical Algorithms Group (NAG).[1]


The supercomputer itself was located at the University of Edinburgh in Scotland. The first phase came online in October 2007 and, by the time the service was decommissioned in March 2014, the machine had been upgraded to its Phase 3 configuration, a Cray XE6 with a peak performance of over 800 teraflops.[2] Its successor is called ARCHER.[3]

Hardware

HECToR's hardware configuration was progressively upgraded over the lifetime of the service.

Phase 1

HECToR's initial configuration, known as Phase 1, featured 60 Cray XT4 cabinets containing 1,416 compute blades, giving a total of 11,328 2.8 GHz AMD Opteron processor cores, connected to 576 terabytes of RAID backing storage, later increased to 934 terabytes. The peak performance of the system was 59 teraflops.[4]

In August 2008, 28 Cray X2 "Black Widow" vector compute nodes were added to the system. Each node had four vector processors, giving a total of 112 processors. Each processor was capable of 25.6 gigaflops, giving a peak performance of 2.87 teraflops, and each four-processor node shared 32 gigabytes of memory.[4]

Phase 2a

In the summer of 2009, the XT4 cabinets were upgraded with quad-core 2.3 GHz Opteron processors, each with 8 GB of memory. This doubled the number of processor cores to 22,656 and increased total system memory to 45.3 terabytes. Peak performance rose to 208 teraflops.[2]
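The quoted peak is consistent with the usual formula, cores × clock × flops per cycle, if one assumes 4 double-precision floating-point operations per core per clock, typical of the quad-core, K10-generation Opterons (this per-core rate is an inference, not stated in the HECToR sources):

\[
22{,}656 \times 2.3\times 10^{9} \times 4 \approx 2.08\times 10^{14}\ \text{flop/s} \approx 208\ \text{teraflops}.
\]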

Phase 2b

The Phase 2b upgrade, performed in 2010, involved the installation of a new 20-cabinet Cray XT6 system featuring 12-core Opteron 6100 processors, giving a total of 44,544 cores and a peak performance of over 360 teraflops. At the same time, the existing XT4 system was reduced to approximately half its original size. A further upgrade later in 2010 replaced the SeaStar2 interconnect with a new interconnect technology, codenamed Gemini.[5]

Phase 3

The Phase 3 upgrade took place in November and December 2011. It extended the XT6 system to 30 cabinets containing 704 compute blades and upgraded the processors to 16-core, 2.3 GHz Interlagos Opterons, giving a total of 90,112 cores. The operating system was also upgraded to CLE 4.0.[6]
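The core count is consistent with standard Cray XE6 packaging, four compute nodes per blade and two processor sockets per node (a detail taken from Cray's XE6 specifications rather than the HECToR sources), and the quoted peak again follows from the same formula assuming 4 double-precision flops per core per cycle for Interlagos:

\[
704 \times 4 \times 2 \times 16 = 90{,}112\ \text{cores}, \qquad
90{,}112 \times 2.3\times 10^{9} \times 4 \approx 8.3\times 10^{14}\ \text{flop/s} \approx 829\ \text{teraflops}.
\]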

Software

HECToR's operating system was Cray Linux Environment (CLE), formerly known as UNICOS/lc. A variety of applications, compilers and utilities were available to users.

HECToR supported four compiler suites. Compilation for the HECToR backend (compute) nodes was carried out through the Cray wrapper scripts ftn (Fortran), cc (C) and CC (C++), which invoked whichever compiler suite the user had selected.
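As a minimal sketch of how a user program would be built and launched: the sources above name only the ftn, cc and CC wrappers, so the MPI code below and the aprun launch line reflect standard practice on Cray XT/XE systems generally rather than HECToR-specific documentation.

    #include <mpi.h>
    #include <stdio.h>

    /* Minimal MPI program. On a Cray system it is compiled with the
     * 'cc' wrapper script, which invokes the currently selected
     * compiler suite and links the MPI library automatically. */
    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's rank */
        MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total process count */
        printf("Hello from rank %d of %d\n", rank, size);
        MPI_Finalize();
        return 0;
    }

It would be built and run along these lines, aprun being the application launcher used on Cray XT/XE machines to place jobs on the compute nodes:

    cc -o hello hello.c    # wrapper picks the active compiler suite
    aprun -n 64 ./hello    # launch on 64 compute-node cores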


References

  1. Randerson, James (2 January 2008). "Inside the UK's fastest machine". The Guardian.
  2. "HECToR Hardware". www.hector.ac.uk. UoE HPCX Ltd. Retrieved 7 May 2010.
  3. ARCHER web site: http://www.archer.ac.uk
  4. "HECToR Phase 1 Hardware Configuration". www.hector.ac.uk. UoE HPCX Ltd. Retrieved 7 May 2010.
  5. "HECToR Phase 2b Upgrade". HECToRNews (7). UoE HPCX Ltd. December 2009.
  6. "HECToR Phase 3 Upgrade". www.hector.ac.uk. UoE HPCX Ltd. 18 October 2011. Retrieved 28 February 2012.

Coordinates: 55°55′18″N 3°10′26″W (55.9217°N, 3.174°W)