Math Kernel Library

Intel oneAPI Math Kernel Library
Developer(s): Intel
Initial release: May 9, 2003 (2003-05-09)
Stable release: 2023.1 / March 29, 2023 [1]
Written in: C/C++, Intel DPC++ Compiler, Fortran
Operating system: Microsoft Windows, Linux, macOS
Platform: CPU, GPU [2]
Type: Library and framework
License: Freeware under the ISSL [3] [4]
Website: www.intel.com/content/www/us/en/developer/tools/oneapi/onemkl.html

Intel oneAPI Math Kernel Library (Intel oneMKL; formerly Intel Math Kernel Library or Intel MKL) is a library of optimized math routines for science, engineering, and financial applications. Core math functions include BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier transforms, and vector math. [5] [6]
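For illustration, the sketch below calls one of these BLAS routines, the double-precision matrix multiplication cblas_dgemm, through MKL's C interface (the mkl.h header). The matrix sizes are arbitrary, and the compile and link flags needed to pull in MKL are assumed to vary by platform and threading layer.

    /* A minimal sketch of calling an MKL BLAS routine from C: multiplies two
     * small matrices with cblas_dgemm (C = alpha*A*B + beta*C). The matrix
     * sizes here are illustrative only. */
    #include <stdio.h>
    #include "mkl.h"   /* MKL umbrella header; declares the CBLAS interface */

    int main(void) {
        const MKL_INT m = 2, n = 2, k = 2;
        double A[] = {1.0, 2.0,
                      3.0, 4.0};        /* m x k, row-major */
        double B[] = {5.0, 6.0,
                      7.0, 8.0};        /* k x n, row-major */
        double C[] = {0.0, 0.0,
                      0.0, 0.0};        /* m x n result */

        /* C = 1.0 * A * B + 0.0 * C */
        cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    m, n, k, 1.0, A, k, B, n, 0.0, C, n);

        printf("C = [%g %g; %g %g]\n", C[0], C[1], C[2], C[3]);
        return 0;
    }

Because cblas_dgemm follows the standard CBLAS binding, the same call also works against other BLAS implementations (for example OpenBLAS) by including their CBLAS header instead of mkl.h.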

The library supports Intel CPUs and GPUs [2] and is available for Windows, Linux and macOS operating systems. [5] [6] [7]

Intel oneAPI Math Kernel Library is not to be confused with the oneAPI Math Kernel Library (oneMKL) Interfaces project, an open-source wrapper layer that allows Intel MKL routines to be used from Data Parallel C++. [8]

History and licensing

Intel launched the Math Kernel Library on May 9, 2003, under the name blas.lib. [9] The project's development teams are located in Russia and the United States.

The library was available in standalone form, free of charge, under the terms of the Intel Simplified Software License, [3] which allows redistribution. [10]

In April 2020, MKL became part of oneAPI. Commercial support for oneMKL is available when it is purchased as part of the oneAPI Base Toolkit.

Following Apple’s transition away from Intel CPUs, support for macOS is scheduled for removal in 2024.

Performance and vendor lock-in

MKL, and other programs generated by the Intel C++ Compiler and the Intel DPC++ Compiler, improve performance with a technique called function multi-versioning: a function is compiled or written for many of the x86 instruction set extensions, and at run time a "master function" uses the CPUID instruction to select the version most appropriate for the current CPU. However, when the master function detects a non-Intel CPU, it almost always chooses the most basic (and slowest) version, regardless of which instruction sets the CPU claims to support. This behavior earned the dispatcher the nickname "cripple AMD" routine in 2009. [11] As of 2020, Intel's MKL remains the numeric library installed by default alongside many pre-compiled mathematical applications on Windows (such as NumPy and SymPy). [12] [13] MATLAB, which relies on MKL, implemented a workaround starting with Release 2020a that ensures MKL's AVX2 code paths are also used on non-Intel (AMD) CPUs. [14]
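The dispatch pattern itself can be sketched in a few lines of C. The code below is not Intel's implementation: the kernel names are hypothetical, and the GCC/Clang builtins __builtin_cpu_init, __builtin_cpu_is and __builtin_cpu_supports stand in for raw CPUID queries. The vendor test mirrors the behavior described above, in which the optimized path is taken only when the processor both reports the feature and identifies itself as Intel.

    /* A sketch of CPUID-based function multi-versioning (not Intel's code).
     * Kernel names are hypothetical; the __builtin_cpu_* builtins wrap the
     * CPUID instruction on x86. */
    #include <stdio.h>

    /* Baseline kernel: works on any x86-64 CPU. */
    static double dot_generic(const double *x, const double *y, int n) {
        double s = 0.0;
        for (int i = 0; i < n; i++) s += x[i] * y[i];
        return s;
    }

    /* Stand-in for an AVX2-optimized kernel (a real library would use
     * intrinsics or hand-written assembly here). */
    static double dot_avx2(const double *x, const double *y, int n) {
        double s = 0.0;
        for (int i = 0; i < n; i++) s += x[i] * y[i];
        return s;
    }

    typedef double (*dot_fn)(const double *, const double *, int);

    /* "Master function": chooses an implementation at run time. The vendor
     * check reproduces the dispatch policy described above: the fast path is
     * taken only when the CPU reports AVX2 AND identifies itself as Intel. */
    static dot_fn resolve_dot(void) {
        __builtin_cpu_init();
        if (__builtin_cpu_is("intel") && __builtin_cpu_supports("avx2"))
            return dot_avx2;
        return dot_generic;   /* everything else gets the baseline path */
    }

    int main(void) {
        dot_fn dot = resolve_dot();
        double x[4] = {1, 2, 3, 4}, y[4] = {5, 6, 7, 8};
        printf("dot = %g\n", dot(x, y, 4));
        return 0;
    }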

Details

Functional categories

Intel MKL's functional categories include linear algebra (BLAS, LAPACK, ScaLAPACK), sparse solvers, fast Fourier transforms, and vector math. [15]
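As a small illustration of the vector math category, the sketch below applies the element-wise sine routine vdSin from MKL's vector math (VM) component to a short array; the array length and build details are illustrative only.

    /* A minimal sketch of MKL's vector math category: vdSin evaluates sine
     * element-wise over an input array. */
    #include <stdio.h>
    #include "mkl.h"

    int main(void) {
        const MKL_INT n = 4;
        double a[] = {0.0, 0.5, 1.0, 1.5};   /* input arguments */
        double r[4];                         /* results: r[i] = sin(a[i]) */

        vdSin(n, a, r);

        for (int i = 0; i < n; i++)
            printf("sin(%g) = %g\n", a[i], r[i]);
        return 0;
    }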

MKL previously included Deep Neural Network (DNN) functions, but these were removed in version 2020. [16] Their successor is the Intel oneAPI Deep Neural Network Library.

Related Research Articles

<span class="mw-page-title-main">Single instruction, multiple data</span> Type of parallel processing

Single instruction, multiple data (SIMD) is a type of parallel processing in Flynn's taxonomy. SIMD can be internal (part of the hardware design) and can be directly accessible through an instruction set architecture (ISA), but it should not be confused with an ISA. SIMD describes computers with multiple processing elements that perform the same operation on multiple data points simultaneously.
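As a concrete illustration of "the same operation on multiple data points", the sketch below uses the AVX intrinsic _mm256_add_pd to add four pairs of double-precision values with a single instruction; it assumes an AVX-capable x86 CPU and a compiler with AVX code generation enabled (for example -mavx on GCC or Clang).

    /* A minimal SIMD sketch: one AVX instruction adds four pairs of doubles
     * at once, instead of four separate scalar additions. */
    #include <stdio.h>
    #include <immintrin.h>

    int main(void) {
        __m256d a = _mm256_set_pd(4.0, 3.0, 2.0, 1.0);  /* four packed doubles */
        __m256d b = _mm256_set_pd(8.0, 7.0, 6.0, 5.0);
        __m256d c = _mm256_add_pd(a, b);                /* element-wise add */

        double out[4];
        _mm256_storeu_pd(out, c);                       /* unaligned store */
        printf("%g %g %g %g\n", out[0], out[1], out[2], out[3]);
        return 0;
    }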

<span class="mw-page-title-main">LAPACK</span> Software library for numerical linear algebra

LAPACK is a standard software library for numerical linear algebra. It provides routines for solving systems of linear equations and linear least squares, eigenvalue problems, and singular value decomposition. It also includes routines to implement the associated matrix factorizations such as LU, QR, Cholesky and Schur decomposition. LAPACK was originally written in FORTRAN 77, but moved to Fortran 90 in version 3.2 (2008). The routines handle both real and complex matrices in both single and double precision. LAPACK relies on an underlying BLAS implementation to provide efficient and portable computational building blocks for its routines.
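A typical LAPACK task, solving a small linear system, is sketched below through the LAPACKE C binding: LAPACKE_dgesv performs an LU factorization with partial pivoting followed by back-substitution. The header and link options are assumed to come from a standard LAPACKE installation.

    /* A minimal sketch of solving A*x = b with LAPACK's dgesv through the
     * LAPACKE C binding: LU factorization with partial pivoting, then
     * back-substitution. */
    #include <stdio.h>
    #include <lapacke.h>

    int main(void) {
        double A[] = {2.0, 1.0,
                      1.0, 3.0};    /* 2x2 coefficient matrix, row-major */
        double b[] = {3.0, 5.0};    /* right-hand side; overwritten with x */
        lapack_int ipiv[2];         /* pivot indices from the LU step */

        lapack_int info = LAPACKE_dgesv(LAPACK_ROW_MAJOR, 2, 1,
                                        A, 2, ipiv, b, 1);
        if (info != 0) {
            fprintf(stderr, "dgesv failed: info = %d\n", (int)info);
            return 1;
        }
        printf("x = (%g, %g)\n", b[0], b[1]);   /* expected: (0.8, 1.4) */
        return 0;
    }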

Basic Linear Algebra Subprograms (BLAS) is a specification that prescribes a set of low-level routines for performing common linear algebra operations such as vector addition, scalar multiplication, dot products, linear combinations, and matrix multiplication. They are the de facto standard low-level routines for linear algebra libraries; the routines have bindings for both C and Fortran. Although the BLAS specification is general, BLAS implementations are often optimized for speed on a particular machine, so using them can bring substantial performance benefits. BLAS implementations will take advantage of special floating point hardware such as vector registers or SIMD instructions.

dnAnalytics is an open-source numerical library for .NET written in C# and F#. It features functionality similar to BLAS and LAPACK.

AMD Core Math Library (ACML) is an end-of-life software development library released by AMD, replaced by many open source libraries, including AMD libm 4.0. This library provides mathematical routines optimized for AMD processors.

Automatically Tuned Linear Algebra Software (ATLAS) is a software library for linear algebra. It provides a mature open source implementation of BLAS APIs for C and Fortran77.

Intel oneAPI DPC++/C++ Compiler and Intel C++ Compiler Classic are Intel’s C, C++, SYCL, and Data Parallel C++ (DPC++) compilers for Intel processor-based systems, available for Windows, Linux, and macOS operating systems.

Intel Integrated Performance Primitives is a multi-threaded software library of functions for multimedia and data processing applications, produced by Intel.

Intel Fortran Compiler, part of the Intel oneAPI HPC Toolkit, is a group of Fortran compilers from Intel for Windows, macOS, and Linux.

<span class="mw-page-title-main">OpenCL</span> Open standard for programming heterogenous computing systems, such as CPUs or GPUs

OpenCL is a framework for writing programs that execute across heterogeneous platforms consisting of central processing units (CPUs), graphics processing units (GPUs), digital signal processors (DSPs), field-programmable gate arrays (FPGAs) and other processors or hardware accelerators. OpenCL specifies programming languages for programming these devices and application programming interfaces (APIs) to control the platform and execute programs on the compute devices. OpenCL provides a standard interface for parallel computing using task- and data-based parallelism.

Intel Parallel Studio XE was a software development product developed by Intel that facilitated native code development on Windows, macOS and Linux in C++ and Fortran for parallel computing. Parallel programming enables software programs to take advantage of multi-core processors from Intel and other processor vendors.

IT++ is a C++ library of classes and functions for linear algebra, numerical optimization, signal processing, communications, and statistics. It is being developed by researchers in these areas and is widely used by researchers, both in the communications industry and universities. The IT++ library originates from the former Department of Information Theory at the Chalmers University of Technology, Gothenburg, Sweden.

In scientific computing, GotoBLAS and GotoBLAS2 are open source implementations of the BLAS API with many hand-crafted optimizations for specific processor types. GotoBLAS was developed by Kazushige Goto at the Texas Advanced Computing Center. As of 2003, it was used in seven of the world's ten fastest supercomputers.

OpenBLAS is an open-source implementation of the BLAS and LAPACK APIs with many hand-crafted optimizations for specific processor types. It is developed at the Lab of Parallel Software and Computational Science, ISCAS.

The following table compares notable software frameworks, libraries and computer programs for deep learning.

<span class="mw-page-title-main">ROCm</span> Parallel computing platform: GPGPU libraries and application programming interface

ROCm is an Advanced Micro Devices (AMD) software stack for graphics processing unit (GPU) programming. ROCm spans several domains: general-purpose computing on graphics processing units (GPGPU), high performance computing (HPC), and heterogeneous computing. It offers several programming models: HIP, OpenMP/Message Passing Interface (MPI), and OpenCL.

In scientific computing, BLIS is an open-source framework for implementing a superset of BLAS functionality for specific processor types; it has been awarded the J. H. Wilkinson Prize for Numerical Software. It exposes that functionality through two traditional application programming interfaces (APIs): the BLAS interface and the CBLAS interface. BLIS also includes two APIs native to the framework: a typed (BLAS-like) API and an object API. These native interfaces provide access to BLAS-like functionality that is not supported by, but closely related to, operations found in the BLAS. The framework is developed and supported by the Science of High-Performance Computing (SHPC) group of the Oden Institute for Computational Engineering and Sciences at The University of Texas at Austin and the Matthews Research Group at Southern Methodist University.

Agner Fog is a Danish evolutionary anthropologist and computer scientist. He is currently an associate professor of computer science at the Technical University of Denmark (DTU), where he has worked since 1995. He is best known for coining the term "Regality Theory" and for writing extensive optimization manuals for machines running the x86 architecture.

References

  1. "Intel® Math Kernel Library Release Notes and New Features". software.intel.com.
  2. 1 2 Intel® oneAPI Math Kernel Library (oneMKL) | Intel® Software
  3. 1 2 "Intel Simplified Software License".
  4. "OneMKL — oneAPI Specification 1.1-rev-1 documentation".
  5. 1 2 "Intel Math Kernel Library".
  6. 1 2 "Intel Math Kernel Library (MKL)".
  7. "MKL - Intel Math Kernel Library". 23 April 2012.
  8. "oneapi-src/oneMKL". oneAPI-SRC. 19 March 2021. oneMKL interfaces are an open-source implementation of the oneMKL Data Parallel C++ (DPC++) interface according to the oneMKL specification. It works with multiple devices (backends) using device-specific libraries underneath.
  9. "Intel launches library tool for developers. (Intel Math Kernel Library 6.0) (Brief Article)". Telecomworldwire. May 9, 2003. Retrieved November 29, 2009.
  10. "Intel Math Kernel Library Licensing FAQ".
  11. Agner Fog. "Agner's CPU blog - Intel's "cripple AMD" function".
  12. "Comment chain in: r/matlab - How-To force Matlab to use a fast codepath on AMD Ryzen/TR CPUs - up to 250% performance gains". reddit. Retrieved 2020-06-06.
  13. "High-Performance Computing Center Stuttgart - Knowledge Base - Libraries(Hawk)" . Retrieved 2020-06-06.
  14. "Crippled No Longer: Matlab Now Runs on AMD CPUs at Full Speed - ExtremeTech". www.extremetech.com. Retrieved 2020-10-29.
  15. admin (2019-11-14). "Developer Reference for Intel® Math Kernel Library - C". software.intel.com. Retrieved 2019-11-27.
  16. "Intel® Math Kernel Library Release Notes and New Features". Intel. Retrieved 10 October 2023.