ScaLAPACK

The ScaLAPACK (or Scalable LAPACK) library includes a subset of LAPACK routines redesigned for distributed memory MIMD parallel computers. It is currently written in a Single-Program-Multiple-Data style using explicit message passing for interprocessor communication. It assumes matrices are laid out in a two-dimensional block cyclic decomposition. [1] [2] [3]
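
The 2-D block-cyclic layout deals consecutive blocks of nb rows and columns out to the process grid in round-robin fashion. The sketch below is a hypothetical helper, not a ScaLAPACK routine: it shows the index arithmetic for one dimension, which ScaLAPACK applies independently to rows and columns.

```c
/* Illustrative sketch: map a global index to its owning process and
 * local index under a 1-D block-cyclic distribution with block size nb
 * over nprocs processes. Hypothetical helper, not a ScaLAPACK routine. */
#include <stdio.h>

static void block_cyclic_map(int g, int nb, int nprocs,
                             int *owner, int *local)
{
    int block = g / nb;               /* global block index           */
    *owner = block % nprocs;          /* process that owns this block */
    *local = (block / nprocs) * nb    /* local index on that process  */
           + g % nb;
}

int main(void)
{
    int owner, local;
    /* Global index 7, block size 2, 3 processes: blocks 0,1,2,3,...
     * are dealt to processes 0,1,2,0,... so index 7 (in block 3)
     * lands on process 0 at local index 3. */
    block_cyclic_map(7, 2, 3, &owner, &local);
    printf("global 7 -> process %d, local %d\n", owner, local);
    return 0;
}
```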

ScaLAPACK is designed for heterogeneous computing and is portable to any computer that supports MPI or PVM.

ScaLAPACK depends on PBLAS operations in the same way LAPACK depends on BLAS.

As of version 2.0, the code base directly includes PBLAS and BLACS and has dropped support for PVM.

After two decades of use, ScaLAPACK was no longer well suited to modern accelerated architectures, and a successor library, SLATE, was created to replace it. SLATE is written in C++ and was designed primarily to serve as a dense linear algebra library for the United States Department of Energy and for the high-performance computing community at large. [4] [5]

Examples
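
The following is a minimal sketch rather than a complete application: it solves a dense distributed system Ax = b with the driver routine pdgesv, assuming an MPI launcher and a ScaLAPACK installation exposing the BLACS C interface (Cblacs_*) and the underscored Fortran bindings (pdgesv_, descinit_, numroc_). Since ScaLAPACK ships no standard C header, the prototypes are declared by hand.

```c
/* Minimal ScaLAPACK sketch: solve A x = b with pdgesv on a 2x2 process
 * grid. Launch with, e.g., "mpirun -np 4 ./a.out" and link against
 * ScaLAPACK plus its BLAS/LAPACK/MPI dependencies. */
#include <stdio.h>
#include <stdlib.h>

/* BLACS C interface */
extern void Cblacs_pinfo(int *mypnum, int *nprocs);
extern void Cblacs_get(int icontxt, int what, int *val);
extern void Cblacs_gridinit(int *icontxt, const char *order, int nprow, int npcol);
extern void Cblacs_gridinfo(int icontxt, int *nprow, int *npcol, int *myrow, int *mycol);
extern void Cblacs_gridexit(int icontxt);
extern void Cblacs_exit(int notdone);

/* Underscored Fortran bindings */
extern int  numroc_(const int *n, const int *nb, const int *iproc,
                    const int *isrcproc, const int *nprocs);
extern void descinit_(int *desc, const int *m, const int *n,
                      const int *mb, const int *nb, const int *irsrc,
                      const int *icsrc, const int *ictxt, const int *lld,
                      int *info);
extern void pdgesv_(const int *n, const int *nrhs, double *a,
                    const int *ia, const int *ja, const int *desca,
                    int *ipiv, double *b, const int *ib, const int *jb,
                    const int *descb, int *info);

int main(void)
{
    const int n = 8, nrhs = 1, nb = 2;  /* global size, RHS count, block size */
    const int izero = 0, ione = 1;
    int iam, nprocs, ictxt, nprow = 2, npcol = 2, myrow, mycol, info;

    Cblacs_pinfo(&iam, &nprocs);
    Cblacs_get(-1, 0, &ictxt);                     /* default system context */
    Cblacs_gridinit(&ictxt, "Row", nprow, npcol);  /* 2x2 process grid       */
    Cblacs_gridinfo(ictxt, &nprow, &npcol, &myrow, &mycol);
    if (myrow < 0) {                               /* not in the grid        */
        Cblacs_exit(0);
        return 0;
    }

    /* Local extents of the block-cyclically distributed A and b */
    int mloc = numroc_(&n, &nb, &myrow, &izero, &nprow);
    int nloc = numroc_(&n, &nb, &mycol, &izero, &npcol);
    int bloc = numroc_(&nrhs, &nb, &mycol, &izero, &npcol);
    int lld  = mloc > 1 ? mloc : 1;

    int desca[9], descb[9];
    descinit_(desca, &n, &n, &nb, &nb, &izero, &izero, &ictxt, &lld, &info);
    descinit_(descb, &n, &nrhs, &nb, &nb, &izero, &izero, &ictxt, &lld, &info);

    double *a = malloc((size_t)lld * nloc * sizeof *a);
    double *b = malloc((size_t)lld * (bloc > 0 ? bloc : 1) * sizeof *b);
    int *ipiv = malloc((size_t)(mloc + nb) * sizeof *ipiv);

    /* Fill the local pieces; a real code would map global indices to
     * local ones. Per-process seeds keep the global random matrix
     * nonsingular with overwhelming probability. */
    srand((unsigned)iam + 1);
    for (int k = 0; k < lld * nloc; ++k)
        a[k] = (double)rand() / RAND_MAX;
    for (int k = 0; k < lld * (bloc > 0 ? bloc : 1); ++k)
        b[k] = 1.0;

    /* LU-factor A and solve; on success b holds the solution x */
    pdgesv_(&n, &nrhs, a, &ione, &ione, desca, ipiv,
            b, &ione, &ione, descb, &info);
    if (iam == 0)
        printf("pdgesv returned info = %d\n", info);

    free(a); free(b); free(ipiv);
    Cblacs_gridexit(ictxt);
    Cblacs_exit(0);
    return 0;
}
```

Build details vary between packages; with many MPI distributions something like `mpicc example.c -lscalapack -llapack -lblas` suffices, but library names differ by installation.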

Related Research Articles

In numerical analysis and scientific computing, a sparse matrix or sparse array is a matrix in which most of the elements are zero. There is no strict definition regarding the proportion of zero-value elements for a matrix to qualify as sparse, but a common criterion is that the number of non-zero elements is roughly equal to the number of rows or columns. By contrast, if most of the elements are non-zero, the matrix is considered dense. The number of zero-valued elements divided by the total number of elements is sometimes referred to as the sparsity of the matrix.
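
As an illustration of why sparsity is exploited, the minimal sketch below stores a matrix in the common compressed sparse row (CSR) format and multiplies it by a vector, touching only the non-zero entries.

```c
/* Minimal sketch of compressed sparse row (CSR) storage: only the
 * non-zero values are kept, along with their column indices and the
 * offset at which each row starts. */
#include <stdio.h>

int main(void)
{
    /* The 4x4 matrix
     *   [5 0 0 0]
     *   [0 8 0 0]
     *   [0 0 3 0]
     *   [0 6 0 0]
     * has 4 non-zeros out of 16 entries (sparsity 12/16 = 0.75). */
    double val[]    = {5, 8, 3, 6};     /* non-zero values, row by row            */
    int    colind[] = {0, 1, 2, 1};     /* column of each value                   */
    int    rowptr[] = {0, 1, 2, 3, 4};  /* row i occupies [rowptr[i], rowptr[i+1]) */

    /* Multiply by x = (1,1,1,1): y = A x visits only the non-zeros. */
    double x[4] = {1, 1, 1, 1}, y[4] = {0};
    for (int i = 0; i < 4; ++i)
        for (int k = rowptr[i]; k < rowptr[i + 1]; ++k)
            y[i] += val[k] * x[colind[k]];
    for (int i = 0; i < 4; ++i)
        printf("y[%d] = %g\n", i, y[i]);   /* prints 5, 8, 3, 6 */
    return 0;
}
```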

LINPACK is a software library for performing numerical linear algebra on digital computers. It was written in Fortran by Jack Dongarra, Jim Bunch, Cleve Moler, and Gilbert Stewart, and was intended for use on supercomputers in the 1970s and early 1980s. It has been largely superseded by LAPACK, which runs more efficiently on modern architectures.

Jack Joseph Dongarra is an American computer scientist and mathematician. He is a University Distinguished Professor of Computer Science in the Electrical Engineering and Computer Science Department at the University of Tennessee. He also holds positions as a Distinguished Research Staff member in the Computer Science and Mathematics Division at Oak Ridge National Laboratory and as a Turing Fellow in the School of Mathematics at the University of Manchester, and is an adjunct professor in the Computer Science Department at Rice University. He served as a faculty fellow at the Texas A&M University Institute for Advanced Study (2014–2018). Dongarra is the founding director of the Innovative Computing Laboratory at the University of Tennessee. He received the Turing Award in 2021.

LAPACK is a standard software library for numerical linear algebra. It provides routines for solving systems of linear equations and linear least squares, eigenvalue problems, and singular value decomposition. It also includes routines to implement the associated matrix factorizations such as LU, QR, Cholesky and Schur decomposition. LAPACK was originally written in FORTRAN 77, but moved to Fortran 90 in version 3.2 (2008). The routines handle both real and complex matrices in both single and double precision. LAPACK relies on an underlying BLAS implementation to provide efficient and portable computational building blocks for its routines.
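
As a flavor of the interface, the minimal sketch below calls the driver routine dgesv from C to solve a small linear system, assuming the usual underscored Fortran binding (as in reference LAPACK or OpenBLAS) and column-major storage.

```c
/* Minimal sketch: solve a 3x3 system A x = b with LAPACK's dgesv,
 * called through the common underscored Fortran binding (exact binding
 * conventions vary by platform). LAPACK expects column-major storage. */
#include <stdio.h>

extern void dgesv_(const int *n, const int *nrhs, double *a, const int *lda,
                   int *ipiv, double *b, const int *ldb, int *info);

int main(void)
{
    int n = 3, nrhs = 1, ipiv[3], info;
    /* A stored column by column: first column is {2,1,0}, etc. */
    double a[9] = {2, 1, 0,   1, 3, 1,   0, 1, 2};
    double b[3] = {3, 5, 3};  /* overwritten with the solution x */

    dgesv_(&n, &nrhs, a, &n, ipiv, b, &n, &info);
    if (info == 0)
        printf("x = (%g, %g, %g)\n", b[0], b[1], b[2]);  /* (1, 1, 1) */
    else
        printf("dgesv failed, info = %d\n", info);
    return 0;
}
```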

Basic Linear Algebra Subprograms (BLAS) is a specification that prescribes a set of low-level routines for performing common linear algebra operations such as vector addition, scalar multiplication, dot products, linear combinations, and matrix multiplication. They are the de facto standard low-level routines for linear algebra libraries; the routines have bindings for both C and Fortran. Although the BLAS specification is general, BLAS implementations are often optimized for speed on a particular machine, so using them can bring substantial performance benefits. BLAS implementations can take advantage of special floating-point hardware such as vector registers or SIMD instructions.
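
For illustration, the sketch below performs a Level 3 BLAS matrix multiplication through the CBLAS interface, assuming a cblas.h header such as the one shipped with OpenBLAS or ATLAS.

```c
/* Minimal sketch of a Level 3 BLAS call, C = alpha*A*B + beta*C,
 * through the CBLAS interface (link with -lopenblas or -lcblas,
 * depending on the installation). */
#include <stdio.h>
#include <cblas.h>

int main(void)
{
    double a[4] = {1, 2, 3, 4};   /* 2x2 matrices, row-major */
    double b[4] = {5, 6, 7, 8};
    double c[4] = {0, 0, 0, 0};

    cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                2, 2, 2,          /* M, N, K       */
                1.0, a, 2,        /* alpha, A, lda */
                b, 2,             /* B, ldb        */
                0.0, c, 2);       /* beta, C, ldc  */

    printf("C = [%g %g; %g %g]\n", c[0], c[1], c[2], c[3]);  /* 19 22; 43 50 */
    return 0;
}
```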

Numerical linear algebra, sometimes called applied linear algebra, is the study of how matrix operations can be used to create computer algorithms which efficiently and accurately provide approximate answers to questions in continuous mathematics. It is a subfield of numerical analysis, and a type of linear algebra. Computers use floating-point arithmetic and cannot exactly represent irrational data, so when a computer algorithm is applied to a matrix of data, it can sometimes increase the difference between a number stored in the computer and the true number that it is an approximation of. Numerical linear algebra uses properties of vectors and matrices to develop computer algorithms that minimize the error introduced by the computer, and is also concerned with ensuring that the algorithm is as efficient as possible.

MUMPS is a software application for the solution of large sparse systems of linear algebraic equations on distributed memory parallel computers. It was developed in the European project PARASOL (1996–1999) by CERFACS, IRIT-ENSEEIHT and RAL. The software implements the multifrontal method, which is a version of Gaussian elimination for large sparse systems of equations, especially those arising from the finite element method. It is written in Fortran 90, parallelized with MPI, and uses BLAS and ScaLAPACK kernels for dense matrix computations. Since 1999, MUMPS has been supported by CERFACS, IRIT-ENSEEIHT, and INRIA.

Intel oneAPI Math Kernel Library is a library of optimized math routines for science, engineering, and financial applications. Core math functions include BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier transforms, and vector math.

Tensor software is a class of mathematical software designed for manipulation and calculation with tensors.

Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) is a matrix-free method for finding the largest eigenvalues and the corresponding eigenvectors of a symmetric generalized eigenvalue problem Ax = λBx.

The following tables provide a comparison of linear algebra software libraries, either specialized or general purpose libraries with significant linear algebra coverage.

Parallel Basic Linear Algebra Subprograms (PBLAS) is an implementation of Level 2 and 3 BLAS intended for distributed memory architectures. It provides a computational backbone for ScaLAPACK, a parallel implementation of LAPACK. It depends on Level 1 sequential BLAS operations for local computation and BLACS for communication between nodes.
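
A PBLAS call mirrors its BLAS counterpart but takes array descriptors in place of plain leading dimensions. The fragment below is a hedged sketch of a distributed matrix multiply with pdgemm, assuming a BLACS grid, array descriptors, and local arrays have already been set up as in the Examples section above.

```c
/* Sketch of a PBLAS Level 3 call: C = alpha*A*B + beta*C on distributed
 * matrices. Assumes desca, descb, descc and the local arrays were
 * initialized as in the pdgesv example above; pdgemm_ is the usual
 * underscored Fortran binding. */
extern void pdgemm_(const char *transa, const char *transb,
                    const int *m, const int *n, const int *k,
                    const double *alpha,
                    const double *a, const int *ia, const int *ja, const int *desca,
                    const double *b, const int *ib, const int *jb, const int *descb,
                    const double *beta,
                    double *c, const int *ic, const int *jc, const int *descc);

void multiply(int n, double *a, int *desca, double *b, int *descb,
              double *c, int *descc)
{
    const int ione = 1;
    const double one = 1.0, zero = 0.0;
    /* Collective operation: every process in the grid passes its local
     * piece plus the descriptor; PBLAS handles the communication. */
    pdgemm_("N", "N", &n, &n, &n, &one,
            a, &ione, &ione, desca,
            b, &ione, &ione, descb,
            &zero, c, &ione, &ione, descc);
}
```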

The LINPACK Benchmarks are a measure of a system's floating-point computing power. Introduced by Jack Dongarra, they measure how fast a computer solves a dense n by n system of linear equations Ax = b, which is a common task in engineering.

Message passing is an inherent element of all computer clusters. All computer clusters, ranging from homemade Beowulfs to some of the fastest supercomputers in the world, rely on message passing to coordinate the activities of the many nodes they encompass. Message passing in computer clusters built with commodity servers and switches is used by virtually every internet service.

Programming with Big Data in R (pbdR) is a series of R packages and an environment for statistical computing with big data using high-performance statistical computation. pbdR uses the same programming language as R, with S3/S4 classes and methods, which is widely used among statisticians and data miners for developing statistical software. The significant difference between pbdR and R code is that pbdR mainly focuses on distributed memory systems, where data are distributed across several processors and analyzed in batch mode, with communication between processors handled by MPI, which is widely used in large high-performance computing (HPC) systems. The R system, by contrast, mainly focuses on single multi-core machines and interactive data analysis, for example through a GUI.

In scientific computing, GotoBLAS and GotoBLAS2 are open source implementations of the BLAS API with many hand-crafted optimizations for specific processor types. GotoBLAS was developed by Kazushige Goto at the Texas Advanced Computing Center. As of 2003, it was used in seven of the world's ten fastest supercomputers.

jblas is a linear algebra library, created by Mikio Braun, for the Java programming language built upon BLAS and LAPACK. Unlike most other Java linear algebra libraries, jblas is designed to be used with native code through the Java Native Interface (JNI) and comes with precompiled binaries. When used on one of the targeted architectures, it will automatically select the correct binary to use and load it. This allows it to be used out of the box and avoids a potentially tedious compilation process. jblas provides an easier-to-use, high-level API on top of the archaic API provided by BLAS and LAPACK, removing much of the tediousness.

ILNumerics is a mathematical class library for Common Language Infrastructure (CLI) developers and a domain specific language (DSL) for the implementation of numerical algorithms on the .NET platform. While algebra systems with graphical user interfaces focus on prototyping of algorithms, implementation of such algorithms into distribution-ready applications is done using development environments and general purpose programming languages (GPL). ILNumerics is an extension to Visual Studio and aims at supporting the creation of technical applications based on .NET.

References

  1. Dongarra, J.; Walker, D. "The Design of Linear Algebra Libraries for High Performance Computers".
  2. Demmel, J.; Heath, M.; van der Vorst, H. "Parallel Numerical Linear Algebra".
  3. "2d block-cyclic data layout".
  4. "SLATE". icl.utk.edu. Retrieved 2024-02-02.
  5. Gates, Mark; Kurzak, Jakub; Charara, Ali; YarKhan, Asim; Dongarra, Jack (2019-11-17). "SLATE: design of a modern distributed and accelerated linear algebra library". Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC '19). New York, NY, USA: Association for Computing Machinery. pp. 1–18. doi:10.1145/3295500.3356223. ISBN 978-1-4503-6229-0.