Graph500

Last updated July 21, 2024

The Graph500 is a ranking of supercomputer systems focused on data-intensive workloads. The project was announced at the International Supercomputing Conference in June 2010, and the first list was published at the ACM/IEEE Supercomputing Conference in November 2010. New versions of the list are published twice a year. The main performance metric used to rank the supercomputers is GTEPS (giga-traversed edges per second, that is, billions of graph edges traversed per second).

Benchmark

The benchmark used in Graph500 stresses the communication subsystem of the system rather than counting double-precision floating-point operations.^[1] It is based on a breadth-first search (BFS) in a large undirected graph (a Kronecker graph model with an average degree of 16). There are three computation kernels in the benchmark: the first kernel generates the graph and compresses it into a sparse structure, CSR or CSC (Compressed Sparse Row/Column); the second kernel runs a parallel BFS from a set of random vertices (64 search iterations per run); the third kernel runs a single-source shortest paths (SSSP) computation. Six possible sizes (scales) of graph are defined: toy (2²⁶ vertices; 17 GB of RAM), mini (2²⁹; 137 GB), small (2³²; 1.1 TB), medium (2³⁶; 17.6 TB), large (2³⁹; 140 TB), and huge (2⁴²; 1.1 PB of RAM).^[4]
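The memory figures quoted for the six sizes can be reproduced with simple arithmetic. The sketch below is illustrative only and rests on two assumptions that the text does not state: that the figure of 16 is the number of edges generated per vertex, and that each edge is stored as a pair of 64-bit vertex IDs (16 bytes). The names `edge_list_bytes` and `human` are hypothetical helpers, not part of any Graph500 code.

```python
# Assumptions (not stated in the article): 16 edges are generated per vertex,
# and each edge is stored as two 64-bit vertex IDs, i.e. 16 bytes.
EDGES_PER_VERTEX = 16
BYTES_PER_EDGE = 16

# Problem classes and their scales (log2 of the vertex count), from the article.
PROBLEM_CLASSES = {
    "toy": 26, "mini": 29, "small": 32,
    "medium": 36, "large": 39, "huge": 42,
}


def edge_list_bytes(scale: int) -> int:
    """Bytes needed for the raw edge list of a graph with 2**scale vertices."""
    return (2 ** scale) * EDGES_PER_VERTEX * BYTES_PER_EDGE


def human(nbytes: float) -> str:
    """Format a byte count using decimal (SI) units, one decimal place."""
    units = ["B", "kB", "MB", "GB", "TB", "PB"]
    i = 0
    while nbytes >= 1000 and i < len(units) - 1:
        nbytes /= 1000
        i += 1
    return f"{nbytes:.1f} {units[i]}"


if __name__ == "__main__":
    for name, scale in PROBLEM_CLASSES.items():
        print(f"{name:>6}: scale {scale}, {human(edge_list_bytes(scale))}")
```

Under these assumptions the computed sizes agree with the quoted ones to within rounding (for example, about 17.2 GB for toy and 1.1 PB for huge), which suggests the listed figures describe the edge list rather than the full working set of a run.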

The reference implementation of the benchmark contains several versions:^[5]

serial high-level in GNU Octave
serial low-level in C
parallel C version using OpenMP
two versions for Cray-XMT
basic MPI version (with MPI-1 functions)
optimized MPI version (with MPI-2 one-sided communications)
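To make the first two kernels concrete, here is a minimal serial sketch in Python. It is not taken from the reference implementation: it uses a tiny hand-written edge list instead of a Kronecker generator, and the function names `build_csr` and `bfs` are hypothetical. The TEPS figure it prints uses one simple counting convention (undirected edges inside the searched component divided by elapsed time); the official rules may count differently.

```python
from collections import deque
import time


def build_csr(num_vertices, edges):
    """Kernel-1-style step: pack an undirected edge list into CSR arrays."""
    degree = [0] * num_vertices
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # row_ptr[i]..row_ptr[i + 1] delimits the neighbours of vertex i in col_idx.
    row_ptr = [0] * (num_vertices + 1)
    for i in range(num_vertices):
        row_ptr[i + 1] = row_ptr[i] + degree[i]
    col_idx = [0] * row_ptr[-1]
    cursor = row_ptr[:-1].copy()  # next free slot for each vertex
    for u, v in edges:
        col_idx[cursor[u]] = v
        cursor[u] += 1
        col_idx[cursor[v]] = u
        cursor[v] += 1
    return row_ptr, col_idx


def bfs(row_ptr, col_idx, root):
    """Kernel-2-style step: return the BFS parent array and adjacency entries scanned."""
    parent = [-1] * (len(row_ptr) - 1)
    parent[root] = root  # the root is its own parent
    queue = deque([root])
    scanned = 0
    while queue:
        u = queue.popleft()
        for k in range(row_ptr[u], row_ptr[u + 1]):
            v = col_idx[k]
            scanned += 1
            if parent[v] == -1:
                parent[v] = u
                queue.append(v)
    return parent, scanned


if __name__ == "__main__":
    # Toy graph: a 4-cycle (0-1-3-2-0) plus a separate edge 4-5.
    edges = [(0, 1), (0, 2), (1, 3), (2, 3), (4, 5)]
    row_ptr, col_idx = build_csr(6, edges)

    start = time.perf_counter()
    parent, scanned = bfs(row_ptr, col_idx, root=0)
    elapsed = time.perf_counter() - start

    # Each undirected edge in the searched component is scanned from both ends.
    traversed = scanned // 2
    teps = traversed / elapsed if elapsed > 0 else float("inf")
    print(f"parents: {parent}")
    print(f"traversed edges: {traversed}, TEPS: {teps:.3g}")
```

Dividing the TEPS value by 10⁹ gives GTEPS. The parallel versions listed above have to distribute the CSR arrays and the frontier across nodes, which is where the communication subsystem becomes the bottleneck.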

The implementation strategy that won the championship on the Japanese K computer is described by Ueno et al.^[6]

Top 10 ranking

According to the June 2024 release of the list, Fugaku ranks highest in the BFS results section, while in the SSSP results section the Wuhan Supercomputer ranks highest, followed by Pengcheng Cloudbrain-II and then Fugaku. The table below shows the BFS results:^[7]

Rank	Country	Site	Machine (architecture)	Number of nodes	Number of cores	Problem scale	GTEPS
1	Japan	RIKEN Advanced Institute for Computational Science	Supercomputer Fugaku (Fujitsu A64FX)	152064	7299072	42	166029
2	China	Wuhan	Kunpeng 920+Tesla A100	252	6999552	41	115357.6
3	USA	Frontier	HPE Cray EX235a	9248	8730112	40	29654.6
4	China	Pengcheng Lab	Pengcheng Cloudbrain-II (Kunpeng 920+Ascend 910)	488	93696	40	25242.9
5	USA	DOE/SC/Argonne National Laboratory	HPE Cray EX - Intel Exascale Compute Blade	4096	25591808	40	24250.2
6	China	National Supercomputing Center in Wuxi	Sunway TaihuLight (Sunway MPP)	40768	10599680	40	23755.7

Spain has a new supercomputer in Barcelona, MareNostrum 5 ACC, which is ranked 8th.

2022

According to the November 2022 release of the list:^[8]

Rank	Country	Site	Machine (architecture)	Number of nodes	Number of cores	Problem scale	GTEPS
1	Japan	RIKEN Advanced Institute for Computational Science	Supercomputer Fugaku (Fujitsu A64FX)	158976	7630848	41	102955
2	China	Pengcheng Lab	Pengcheng Cloudbrain-II (Kunpeng 920+Ascend 910)	488	93696	40	25242.9
3	China	National Supercomputing Center in Wuxi	Sunway TaihuLight (Sunway MPP)	40768	10599680	40	23755.7
4	Japan	Information Technology Center, University of Tokyo	Wisteria/BDEC-01 (PRIMEHPC FX1000)	7680	368640	37	16118
5	Japan	Japan Aerospace Exploration Agency	TOKI-SORA (PRIMEHPC FX1000)	5760	276480	36	10813
6	EU	EuroHPC/CSC	LUMI-C (HPE Cray EX)	1492	190976	38	8467.71
7	US	Oak Ridge National Laboratory	OLCF Summit (IBM POWER9)	2048	86016	40	7665.7
8	Germany	Leibniz Rechenzentrum	SuperMUC-NG (ThinkSystem SD530 Xeon Platinum 8174 24C 3.1GHz Intel Omni-Path)	4096	196608	39	6279.47
9	Germany	Zuse Institute Berlin	Lise (Intel Omni-Path)	1270	121920	38	5423.94
10	China	National Engineering Research Center for Big Data Technology and System	DepGraph Supernode (DepGraph (+GPU Tesla A100))	1	128	33	4623.379

2020

The Arm-based Fugaku took the top spot on the list.^[9]

2016

According to the June 2016 release of the list:^[10]

Rank	Site	Machine (architecture)	Number of nodes	Number of cores	Problem scale	GTEPS
1	Riken Advanced Institute for Computational Science	K computer (Fujitsu custom)	82944	663552	40	38621.4
2	National Supercomputing Center in Wuxi	Sunway TaihuLight (NRCPC - Sunway MPP)	40768	10599680	40	23755.7
3	Lawrence Livermore National Laboratory	IBM Sequoia (Blue Gene/Q)	98304	1572864	41	23751
4	Argonne National Laboratory	IBM Mira (Blue Gene/Q)	49152	786432	40	14982
5	Forschungszentrum Jülich	JUQUEEN (Blue Gene/Q)	16384	262144	38	5848
6	CINECA	Fermi (Blue Gene/Q)	8192	131072	37	2567
7	Changsha, China	Tianhe-2 (NUDT custom)	8192	196608	36	2061.48
8	CNRS/IDRIS-GENCI	Turing (Blue Gene/Q)	4096	65536	36	1427
8	Science and Technology Facilities Council – Daresbury Laboratory	Blue Joule (Blue Gene/Q)	4096	65536	36	1427
8	University of Edinburgh	DIRAC (Blue Gene/Q)	4096	65536	36	1427
8	EDF R&D	Zumbrota (Blue Gene/Q)	4096	65536	36	1427
8	Victorian Life Sciences Computation Initiative	Avoca (Blue Gene/Q)	4096	65536	36	1427

2014

According to the June 2014 release of the list:^[2]

Rank	Site	Machine (architecture)	Number of nodes	Number of cores	Problem scale	GTEPS
1	RIKEN Advanced Institute for Computational Science	K computer (Fujitsu custom)	65536	524288	40	17977.1
2	Lawrence Livermore National Laboratory	IBM Sequoia (Blue Gene/Q)	65536	1048576	40	16599
3	Argonne National Laboratory	IBM Mira (Blue Gene/Q)	49152	786432	40	14328
4	Forschungszentrum Jülich	JUQUEEN (Blue Gene/Q)	16384	262144	38	5848
5	CINECA	Fermi (Blue Gene/Q)	8192	131072	37	2567
6	Changsha, China	Tianhe-2 (NUDT custom)	8192	196608	36	2061.48
7	CNRS/IDRIS-GENCI	Turing (Blue Gene/Q)	4096	65536	36	1427
7	Science and Technology Facilities Council - Daresbury Laboratory	Blue Joule (Blue Gene/Q)	4096	65536	36	1427
7	University of Edinburgh	DIRAC (Blue Gene/Q)	4096	65536	36	1427
7	EDF R&D	Zumbrota (Blue Gene/Q)	4096	65536	36	1427
7	Victorian Life Sciences Computation Initiative	Avoca (Blue Gene/Q)	4096	65536	36	1427

2013

According to the June 2013 release of the list:^[11]

Rank	Site	Machine (architecture)	Number of nodes	Number of cores	Problem scale	GTEPS
1	Lawrence Livermore National Laboratory	IBM Sequoia (Blue Gene/Q)	65536	1048576	40	15363
2	Argonne National Laboratory	IBM Mira (Blue Gene/Q)	49152	786432	40	14328
3	Forschungszentrum Jülich	JUQUEEN (Blue Gene/Q)	16384	262144	38	5848
4	RIKEN Advanced Institute for Computational Science	K computer (Fujitsu custom)	65536	524288	40	5524.12
5	CINECA	Fermi (Blue Gene/Q)	8192	131072	37	2567
6	Changsha, China	Tianhe-2 (NUDT custom)	8192	196608	36	2061.48
7	CNRS/IDRIS-GENCI	Turing (Blue Gene/Q)	4096	65536	36	1427
7	Science and Technology Facilities Council - Daresbury Laboratory	Blue Joule (Blue Gene/Q)	4096	65536	36	1427
7	University of Edinburgh	DIRAC (Blue Gene/Q)	4096	65536	36	1427
7	EDF R&D	Zumbrota (Blue Gene/Q)	4096	65536	36	1427
7	Victorian Life Sciences Computation Initiative	Avoca (Blue Gene/Q)	4096	65536	36	1427


References

[1] The Exascale Report (March 15, 2012). "The Case for the Graph 500 – Really Fast or Really Productive? Pick One". Inside HPC.
[2] "June 2014 | Graph 500". Archived from the original on June 28, 2014. Retrieved June 26, 2014.
[3] Ueno, Koji; Suzumura, Toyotaro; Maruyama, Naoya; Fujisawa, Katsuki; Matsuoka, Satoshi (2016). "Extreme scale breadth-first search on supercomputers". 2016 IEEE International Conference on Big Data (Big Data). pp. 1040–1047. doi:10.1109/BigData.2016.7840705. ISBN 978-1-4673-9005-7. S2CID 8680200.
[4] Performance Evaluation of Graph500 on Large-Scale Distributed Environment // IEEE IISWC 2011, Austin, TX; presentation
[5] "Graph500: адекватный рейтинг" [Graph500: an adequate rating] (in Russian). Open Systems #1 2011.
[6] Ueno, K.; Suzumura, T.; Maruyama, N.; Fujisawa, K.; Matsuoka, S. (December 1, 2016). "Extreme scale breadth-first search on supercomputers". 2016 IEEE International Conference on Big Data (Big Data). pp. 1040–1047. doi:10.1109/BigData.2016.7840705. ISBN 978-1-4673-9005-7. S2CID 8680200.
[7] "Complete Results - Graph 500". 2024. Retrieved July 20, 2024.
[8] "November 2022; Graph 500". June 14, 2017. Retrieved November 18, 2022.
[9] "Fujitsu and RIKEN Take First Place in Graph500 Ranking with Supercomputer Fugaku". HPCwire. June 23, 2020. Retrieved August 8, 2020.
[10] "June 2016 | Graph 500". Archived from the original on June 24, 2016. Retrieved July 6, 2016.
[11] "June 2013 | Graph 500". Archived from the original on June 21, 2013. Retrieved June 19, 2013.

External links

Official website
June 2014 Graph 500
Introducing the Graph 500, paper by Sandia

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
