| Valentina Salapura | |
| --- | --- |
| Alma mater | Technische Universität Wien (PhD), University of Zagreb (M.S.) |
| Awards | Gordon Bell Prize (2006) [1] |
| Scientific career | |
| Fields | High-performance computing, computer architecture |
| Institutions | IBM, AMD, Google |
Valentina Salapura is a researcher and expert in high-performance computing (HPC), supercomputing, and computer architecture. [2] She has contributed to designing and developing advanced computing systems, focusing on scalable architectures, parallel processing, and energy-efficient computing. Her work spans both academic research and industry applications.
Salapura earned her PhD in computer science from the Technische Universität Wien in Vienna, Austria. She also holds M.S. degrees in computer science and electrical engineering from the University of Zagreb in Croatia. [3] Her early academic work focused on optimizing computer architectures and developing computing methodologies. [citation needed]
Salapura has held positions at IBM, AMD, and Google. [4] At IBM, she contributed to the development of the Blue Gene supercomputer and was involved in the architecture of the POWER8 processor. [5]
Later, Salapura joined AMD Research, where she worked on distributed computing and supercomputing technologies, leading the development of high-performance computing (HPC) software libraries and the software architecture for the Frontier system, the world's first exascale supercomputer.
Salapura's work in HPC includes integrating heterogeneous computing and accelerators into hyperscale data centers. Her research into energy efficiency in computing emphasizes the design of systems that balance high performance with minimal energy consumption. [6]
Salapura was a leader in the development of the Blue Gene systems, contributing to the design of the Blue Gene/L, Blue Gene/P, and Blue Gene/Q supercomputers, as well as to Frontier.
Blue Gene/L employed low-frequency, low-power embedded PowerPC cores with floating-point accelerators. This design traded individual processor speed for higher power efficiency, making it suitable for massively parallel applications. The system reduced power consumption by utilizing many low-power cores to perform computations simultaneously. [7] [8] [9] [10]
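The tradeoff can be sketched with a back-of-the-envelope comparison. The per-core figures in the sketch below are hypothetical, chosen only to illustrate the principle rather than actual Blue Gene/L specifications:

```python
# Rough, illustrative comparison of two hypothetical designs with equal
# aggregate throughput: a few fast, power-hungry cores versus many slow,
# low-power cores. All per-core numbers are invented for illustration.

def aggregate(cores: int, gflops_per_core: float, watts_per_core: float):
    """Return total GFLOPS, total watts, and GFLOPS per watt."""
    total_gflops = cores * gflops_per_core
    total_watts = cores * watts_per_core
    return total_gflops, total_watts, total_gflops / total_watts

# Hypothetical "few fast cores" design: high clock rate, high power draw.
fast = aggregate(cores=1_000, gflops_per_core=8.0, watts_per_core=100.0)

# Hypothetical "many slow cores" design: embedded-class, low power draw.
slow = aggregate(cores=16_000, gflops_per_core=0.5, watts_per_core=2.0)

print(f"fast design: {fast[0]:,.0f} GFLOPS, {fast[1]:,.0f} W, {fast[2]:.2f} GFLOPS/W")
print(f"slow design: {slow[0]:,.0f} GFLOPS, {slow[1]:,.0f} W, {slow[2]:.2f} GFLOPS/W")
```

Both hypothetical designs deliver the same aggregate throughput, but the many-core design does so at a fraction of the power, which is the rationale behind Blue Gene/L's use of low-frequency embedded cores.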
Blue Gene/P improved upon its predecessor by increasing the density of processor cores. Each rack contained 1,024 nodes with a total of 4,096 processor cores. The design focused on maximizing power efficiency, with Blue Gene/P installations ranking near the top of the Green500 lists in 2007–2008 for their energy efficiency. [11] [12] [13] [14] [15]
The Blue Gene/Q system, particularly the Sequoia installation at Lawrence Livermore National Laboratory, achieved 16.32 petaflops of performance using 1,572,864 cores. This system was the first supercomputer to utilize more than one million cores. It was primarily water-cooled and consisted of 96 racks, 98,304 compute nodes, and 1.6 petabytes of memory. Sequoia was significantly more power-efficient than its predecessors. [16] [17]
Frontier, developed by Hewlett Packard Enterprise and AMD and installed at Oak Ridge National Laboratory, became the world's first exascale supercomputer in May 2022. Frontier can achieve 1.194 exaflops in the High-Performance Linpack (HPL) benchmark. The system uses 8,699,904 CPU and GPU cores and features HPE's Slingshot 11 network for data transfer. Frontier is cooled by a water system that pumps 60,000 gallons per minute. [citation needed]
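For a rough sense of scale, the figures quoted above can be reduced to an average per-core rate by simple division. The sketch below uses the core counts and benchmark results cited in this article; the average glosses over the large difference in throughput between Frontier's CPU and GPU cores.

```python
# Average HPL throughput per core, using the figures cited above.
# Frontier's total mixes CPU and GPU cores, which differ enormously in
# individual throughput, so this is only a crude average.

sequoia_flops = 16.32e15        # 16.32 petaflops
sequoia_cores = 1_572_864

frontier_flops = 1.194e18       # 1.194 exaflops
frontier_cores = 8_699_904

print(f"Sequoia:  {sequoia_flops / sequoia_cores / 1e9:.1f} GFLOPS per core")
print(f"Frontier: {frontier_flops / frontier_cores / 1e9:.1f} GFLOPS per core")
```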
Salapura has contributed to multiple subfields, including HPC, supercomputing, and distributed systems. Her work has been used to advance quantum chromodynamics simulations.
She has also worked in cloud computing, [18] focusing on virtualization and resiliency. Her early work on processor architecture and microarchitecture design has influenced subsequent advancements.
Salapura has received several awards, including the ACM Gordon Bell Prize in 2006 for her work on Blue Gene/L. [1] She is a co-inventor on more than 500 patents and was named a Fellow of the IEEE in 2012 for her contributions to multiprocessor systems. [19]
At ISC 2023, Salapura explored the emerging dynamic between hyperscaler and HPC ecosystems.
At ICCD 2012, Salapura discussed the adoption of cloud computing and its implications for virtualization and resiliency.
At the 2007 Grace Hopper Conference, Salapura discussed the shift from faster single processors to multiprocessor systems.