Magerit

Magerit 3
Active 2006
Location CeSViMa (UPM)
Spain
Architecture 64-bit Power ISA:
PowerPC (2006-2010)
IBM POWER7 (2011-2019);
x86-64 (Xeon) (2019 onward)
Operating system SUSE Linux Enterprise Server; CentOS
Memory 13.056,0 TB
Speed 182.78 TFlops [1]
Ranking TOP500: 136, June 2011
Website http://www.cesvima.upm.es

Magerit is one of the most powerful supercomputers in Spain. It also achieved the second-best position ever reached by a Spanish supercomputer in the TOP500 list. It is installed at CeSViMa, a research center of the Technical University of Madrid.

Magerit was first installed in 2006, when it ranked as the 9th fastest supercomputer in Europe and the 34th in the world, [2] the second-best position of a Spanish supercomputer in the list. It also reached the 275th position in the first published Green500 list. [3] It is no longer in the TOP500.

The second version, installed in 2011, ranked 1st in Spain, 44th in Europe and 136th in the world. [4] It also reached the 18th position in the Green500 list. [5]

Magerit (for *Materit or *Mageterit) is the oldest recorded name of the present-day city of Madrid. The name comes from the Arabic name of a fortress built on the Manzanares River in the 9th century AD, and means "place of abundant water". [6]

History

Computer Science School, first location of Magerit supercomputer

First steps (2005)

Magerit was created as a collaboration between the Technical University of Madrid and IBM. The computer is housed in the newly created CeSViMa. This first version had only 124 nodes and was housed temporarily in the Computer Science School of Madrid. Funding was provided by the Spanish Ministry of Education and Science and the Autonomous Region of Madrid.

Joining the Spanish Supercomputing Network (2006–2007)

In late 2006 CeSViMa joined the Spanish Supercomputing Network (Red Española de Supercomputación, or RES, in Spanish) and the supercomputer was upgraded. The new configuration had 1204 nodes and reached a speed of 14 TFLOPS. [7] This is considered the first version because of its inclusion in the TOP500 list in 34th position, the second-best position of a Spanish supercomputer in the list.

In 2007 the first users began working on the machine: those admitted by the access committee of the Spanish Supercomputing Network (under the agreement, the Network schedules 68% of the resources) and those admitted by the local (CeSViMa) access committee (which manages the other 32%).

Migration and small upgrades (2008–2010)

CeSViMa Building in the Scientific and Technology Park of UPM, current location

In May 2008, CeSViMa and the Magerit supercomputer moved to a new building (40°24′15.65″N 03°50′4.75″W) on the same campus, only 500 meters from the previous location at the Computer Science School.

The computer was upgraded with a new communication switch, a new storage subsystem and the replacement of some blades with a newer version. This upgrade increased the power of the supercomputer by nearly 2 TFLOPS, to 15.95 TFLOPS. [8] It did not prevent Magerit from dropping out of the TOP500 list in November 2008.

In this configuration, 59.7% of the supercomputer's CPU time was assigned via the RES access committee and 40.3% via CeSViMa policies.

One year later, in 2009, the operating system and other system software were upgraded (migrating to SUSE Linux Enterprise Server 10).

During 2010, CeSViMa acquired a new mass storage system with 1 petabyte of capacity, operated in parallel with Magerit's own storage. [citation needed]

Upgrade (2011)

In the first half of 2011, the supercomputer was fully upgraded: all compute nodes and interconnection networks were replaced with the latest technologies in only one month (a record time). [9]

This configuration reached the 136th [10] position in the TOP500 list [11] and the 18th [12] position in the related Green500 list (both widely used as reference rankings for supercomputers), making it both the most powerful and the most energy-efficient supercomputer in Spain. [9]

Under the new distribution of use, 80% is managed by the CeSViMa-UPM access committee and 20% by the Spanish Supercomputing Network. Although the percentage managed by RES is lower, the resources donated to the network increased 4–5 times.

The upgrade did not include the storage subsystem, which remains as upgraded in 2008. A small upgrade is planned for the next few years to adapt the storage system to the new requirements.

Architecture

Three versions of the supercomputer can be distinguished:

First version (2005–2010)

First version of Magerit Supercomputer (Photo 2009)

This setup achieved the second-best position of a Spanish supercomputer in the TOP500 list (34th, November 2006). When this version entered production, it ranked 2nd in Spain, 9th in Europe and 34th in the world in the TOP500 list [2] and 275th in the first Green500 list. [13]

The final setup of this version (reached after the 2008 upgrade) was a cluster of 1204 eServer BladeCenter nodes (1036 JS20 and 168 JS21, both 64-bit PowerPC) running SUSE Linux Enterprise Server 9. [14]

The system had a distributed storage system with a capacity of 190 TB under GPFS. Access to this shared storage was provided by a high-bandwidth switch that allowed peaks of 1 Tbit/s.

All the nodes were interconnected by Myrinet, a low-latency (2.6–3.2 μs [15]) and high-bandwidth network. This network was used only for the MPI messages of users' tasks.

Finally, an auxiliary Ethernet network was deployed for administration tasks.

Second version (2011)

Second version of Magerit Supercomputer (photo 2011)

This setup made Magerit the most powerful supercomputer in Spain. [4] When it entered production in 2011, it ranked first in Spain, 44th in Europe and 136th in the world. [11]

The system kept the cluster architecture, with 245 PS702 nodes, each with 16 cores in two 64-bit POWER7 processors (eight cores each) at 3.0 GHz, 32 GB of RAM and 300 GB of local hard disk. Each core provides 18.38 GFLOPS.
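The per-node figures above can be multiplied out to get rough totals for the whole cluster. The short Python sketch below does only that arithmetic; the totals are derived values rather than figures stated in this article, and the sketch assumes that the per-core performance simply adds up across all cores.

```python
# Per-node figures taken from the paragraph above.
NODES = 245
CORES_PER_NODE = 16
GFLOPS_PER_CORE = 18.38
RAM_GB_PER_NODE = 32
DISK_GB_PER_NODE = 300


def cluster_totals(nodes, cores_per_node, gflops_per_core, ram_gb, disk_gb):
    """Return aggregate cores, TFLOPS, RAM (GB) and local disk (GB).

    Assumes performance scales linearly with the number of cores,
    which is a simplification.
    """
    cores = nodes * cores_per_node
    return {
        "cores": cores,
        "tflops": cores * gflops_per_core / 1000.0,
        "ram_gb": nodes * ram_gb,
        "disk_gb": nodes * disk_gb,
    }


totals = cluster_totals(NODES, CORES_PER_NODE, GFLOPS_PER_CORE,
                        RAM_GB_PER_NODE, DISK_GB_PER_NODE)
print(totals)
```

Under these assumptions the cluster has 3920 cores, about 72 TFLOPS, 7840 GB of RAM and 73,500 GB of local disk in total.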

The interconnect was replaced with an InfiniBand network offering high bandwidth (40 Gbit/s) and low latency (0.3 μs). The system kept two independent Gigabit Ethernet networks for auxiliary tasks: deployment of images and access to the storage subsystem.

The storage system remained the same (192 TB under GPFS), with a bandwidth near 1 Tbit/s.

The upgrade included an update of the software: the operating system (SLES 11 SP1), the deployment system (xCAT, eXtreme Cluster Administration Toolkit) and all software and libraries used in the system.
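To give a feel for what the latency and bandwidth figures mean for message passing, the following Python sketch estimates a one-way transfer time with the simple linear model "latency plus size divided by bandwidth". The function name `transfer_time_us` and the model itself are illustrative assumptions; real transfers also depend on protocol overhead, link encoding and contention, so the numbers are best-case estimates only.

```python
def transfer_time_us(message_bytes, latency_us=0.3, bandwidth_gbit_s=40.0):
    """Estimate a one-way transfer time in microseconds.

    Simple linear model: fixed latency plus message size over bandwidth.
    The defaults are the InfiniBand figures quoted in the text above.
    """
    bits = message_bytes * 8
    # 1 Gbit/s equals 1000 bits per microsecond.
    return latency_us + bits / (bandwidth_gbit_s * 1000.0)


small = transfer_time_us(8)            # a single 8-byte value
large = transfer_time_us(1024 * 1024)  # a 1 MiB buffer
print(f"8 B: {small:.4f} us, 1 MiB: {large:.4f} us")
```

In this model a small message is dominated by latency, while a large one is dominated by bandwidth, which is why both figures matter for MPI workloads.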

Third version (2019)

Magerit was upgraded [16] with Lenovo ThinkSystem SD530 nodes.

Use

Magerit processes batch jobs with large processing requirements, such as models of the universe, simulations of materials and climate models. One example is the Cajal Blue Brain project (the Spanish participation in the Blue Brain Project).

These jobs are organized by a queue manager. Because of the characteristics of the jobs (they run on hundreds of CPUs for a few days), more conventional forms of access to the resources are not practical. The supercomputer must run jobs without interruption all year round.

Using a queue manager for batch jobs allows global scheduling of the resources, which increases their utilization and ensures fairness among users.
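The idea of a batch queue with fair sharing can be illustrated with a toy simulation in Python. This is only a sketch: the article does not say which queue manager or scheduling policy Magerit uses, and the `Job` class, the `schedule` function, the user names and the job sizes below are all hypothetical. The policy shown starts, among the waiting jobs that fit in the free CPUs, the job of the user who has been granted the fewest CPU-hours so far.

```python
import heapq
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Job:
    user: str
    cpus: int
    hours: float


def schedule(jobs, total_cpus):
    """Simulate a batch queue; return (start_hour, job) pairs in start order."""
    pending = list(jobs)
    usage = defaultdict(float)  # CPU-hours already granted to each user
    running = []                # min-heap of (end_hour, sequence, cpus)
    free, now, seq, started = total_cpus, 0.0, 0, []

    while pending:
        # Among jobs that fit right now, prefer the user with the least usage.
        fitting = [j for j in pending if j.cpus <= free]
        if fitting:
            job = min(fitting, key=lambda j: usage[j.user])
            pending.remove(job)
            usage[job.user] += job.cpus * job.hours
            free -= job.cpus
            heapq.heappush(running, (now + job.hours, seq, job.cpus))
            seq += 1
            started.append((now, job))
        elif running:
            # Nothing fits: advance the clock to the next job completion.
            now, _, cpus = heapq.heappop(running)
            free += cpus
        else:
            raise ValueError("a job requests more CPUs than the machine has")
    return started


# Hypothetical workload on a hypothetical 512-CPU machine.
plan = schedule(
    [Job("alice", 256, 48), Job("alice", 256, 24),
     Job("bob", 128, 72), Job("carol", 512, 12)],
    total_cpus=512,
)
for start, job in plan:
    print(f"t={start:5.1f} h  {job.user:5s}  {job.cpus} CPUs for {job.hours} h")
```

Production schedulers add features this sketch leaves out, such as priorities, reservations and backfilling, which is why the largest job here has to wait until the machine is completely empty.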

Access to resources

The system is available to any person, institution or company that requests access via:

References

  1. "Magerit-3". Centro de Supercomputación y Visualización de Madrid.
  2. "November 2006 - TOP500 Supercomputer Sites". top500.org.
  3. Magerit Green500 certificate in November 2007 [permanent dead link]
  4. "June 2011 - TOP500 Supercomputer Sites". top500.org.
  5. "The Green500 List :: Environmentally Responsible Supercomputing :: The Green500 June 2011". 5 July 2011. Archived from the original on 5 July 2011.
  6. "Madrid History – Museums – Suggested Itineraries Madrid". Indigoguide.com. Archived from the original on 1 January 2011. Retrieved 3 February 2010.
  7. "BladeCenter JS20 Cluster, PPC 970, 2.2 GHz, Myrinet - TOP500 Supercomputing Sites". 11 October 2008. Archived from the original on 11 October 2008.
  8. "BladeCenter JS20 Cluster, PPC 970, 2.2 GHz, Myrinet - TOP500 Supercomputer Sites". top500.org.
  9. UPM press release: "Technical University of Madrid installs the most powerful supercomputer in Spain". Archived 16 August 2011 at the Wayback Machine.
  10. "Magerit - BladeCenter PS702 Express, Power7 3.3GHz, Infiniband - TOP500 Supercomputer Sites". top500.org.
  11. "Magerit - BladeCenter PS702 Express, Power7 3.3GHz, Infiniband - TOP500 Supercomputer Sites". top500.org.
  12. "The Green500 List :: Environmentally Responsible Supercomputing :: Green500 Certificates". Archived from the original on 24 April 2012. Retrieved 18 February 2012.
  13. Green500: Magerit's Green500 certificate in November 2007 [permanent dead link]
  14. Centro de Supercomputación y Visualización de Madrid. "HPC - Computación de alto rendimiento". Archived from the original on 19 July 2011. Retrieved 23 June 2011.
  15. "Myricom Myrinet-2000: Performance Measurement". myricom. Retrieved 29 March 2011.
  16. "Plataforma de Contratación del Sector Público" (in Spanish). Contrataciondelestado.es. Retrieved 21 February 2023.
  17. "Solicitudes de servicios – CeSViMa". cesvima.upm.es.