Intel Cluster Ready

The Intel Cluster Ready certification is a marketing program from Intel. It is aimed at hardware and software vendors in the low-end and mid-range cluster market. To get certified, systems have to fulfill a minimum set of cluster-specific requirements. This way, vendors of parallel software can build their applications on a basic cluster platform, trusting certain components to be present. Other drivers, libraries and tools will have to be provided by the software vendor or its partners, or by a system integrator.

Description

The program was announced in June 2007. [1] The nodes of an Intel Cluster Ready compliant cluster are based on Xeon server processors and PC hardware, interconnected through Ethernet or InfiniBand. The operating system is a Linux distribution conforming to a specific file system layout. Also included are Intel's closed source but publicly available parallel libraries: the Message Passing Interface, Threading Building Blocks, and Math Kernel Library.

Intel only specifies the requirements a cluster has to fulfill to get certified. The specific implementation is the responsibility of the platform vendor. Intel's Cluster Checker checks the system's compliance. The vendor, the integrator and the end user can run it to verify the system, and it can also be used to troubleshoot an operational cluster.

While cluster hardware gets certified, software can be registered as well. Intel provides a minimal cluster infrastructure where software vendors can run their package, test scripts and test data. After successful completion, the application gets registered as being Intel Cluster Ready compliant. The Cluster Ready program is free for both hardware and software vendors.

The Intel Cluster Ready program does not primarily target the high-end clusters used for scientific calculations at universities and research institutes. It aims to commoditize the parallel systems used for industrial and commercial applications. According to IDC, more than half of the servers currently sold for technical applications are deployed as part of a cluster. For example, these systems are used for industrial computations, financial analyses, and modelling in engineering. IDC expects clustered systems to be responsible for more than three-quarters of the high-performance technical computing market.

AMD-based clusters

Although the Cluster Ready program is aimed at systems built on Intel's Xeon processors, the Cluster Checker can also be used to verify a system based on Advanced Micro Devices (AMD) processors. Intel's parallel libraries run on AMD hardware with the same performance and are fully supported by Intel. According to Werner Krotz-Vogel, Technical Marketing Engineer on Intel's Cluster Ready team, the Math Kernel Library (MKL) runs even faster than the open source library Automatically Tuned Linear Algebra Software (ATLAS) on an AMD Opteron system.

Related Research Articles

In computer architecture, 64-bit integers, memory addresses, or other data units are those that are 64 bits wide. Also, 64-bit central processing unit (CPU) and arithmetic logic unit (ALU) architectures are those that are based on processor registers, address buses, or data buses of that size. 64-bit microcomputers are computers in which 64-bit microprocessors are the norm. From the software perspective, 64-bit computing means the use of machine code with 64-bit virtual memory addresses. However, not all 64-bit instruction sets support full 64-bit virtual memory addresses; x86-64 and ARMv8, for example, support only 48 bits of virtual address, with the remaining 16 bits of the virtual address required to be all 0's or all 1's, and several 64-bit instruction sets support fewer than 64 bits of physical memory address.
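The 48-bit restriction described above can be illustrated with a minimal Python sketch. It checks only the rule as stated here, that the unused upper bits of a 64-bit address are all 0's or all 1's. The function name `upper_bits_valid` and the `va_bits` parameter are hypothetical and chosen for illustration; they do not come from any processor manual or library.

```python
def upper_bits_valid(addr: int, va_bits: int = 48) -> bool:
    """Check that the bits above `va_bits` are all 0's or all 1's.

    Hypothetical helper: it models only the rule stated in the text,
    not every detail of a real processor's address checks.
    """
    if not 0 <= addr < (1 << 64):
        raise ValueError("address must fit in 64 bits")
    unused = 64 - va_bits              # 16 unused bits when va_bits is 48
    top = addr >> va_bits              # value of the unused upper bits
    all_ones = (1 << unused) - 1       # 0xFFFF when 16 bits are unused
    return top == 0 or top == all_ones


print(upper_bits_valid(0x0000_7FFF_FFFF_FFFF))  # True: upper bits all 0's
print(upper_bits_valid(0xFFFF_8000_0000_0000))  # True: upper bits all 1's
print(upper_bits_valid(0x0001_0000_0000_0000))  # False: upper bits are mixed
```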

x86-64 Type of instruction set which is a 64-bit version of the x86 instruction set

x86-64 is a 64-bit version of the x86 instruction set, first announced in 1999. It introduced two new modes of operation, 64-bit mode and compatibility mode, along with a new 4-level paging mode.

Linux Standard Base

The Linux Standard Base (LSB) is a joint project by several Linux distributions under the organizational structure of the Linux Foundation to standardize the software system structure, including the Filesystem Hierarchy Standard used in the Linux kernel. The LSB is based on the POSIX specification, the Single UNIX Specification (SUS), and several other open standards, but extends them in certain areas.

Coprocessor

A coprocessor is a computer processor used to supplement the functions of the primary processor. Operations performed by the coprocessor may be floating point arithmetic, graphics, signal processing, string processing, cryptography or I/O interfacing with peripheral devices. By offloading processor-intensive tasks from the main processor, coprocessors can accelerate system performance. Coprocessors allow a line of computers to be customized, so that customers who do not need the extra performance do not need to pay for it.

x86 virtualization is the use of hardware-assisted virtualization capabilities on an x86/x86-64 CPU.

Basic Linear Algebra Subprograms (BLAS) is a specification that prescribes a set of low-level routines for performing common linear algebra operations such as vector addition, scalar multiplication, dot products, linear combinations, and matrix multiplication. They are the de facto standard low-level routines for linear algebra libraries; the routines have bindings for both C and Fortran. Although the BLAS specification is general, BLAS implementations are often optimized for speed on a particular machine, so using them can bring substantial performance benefits. BLAS implementations will take advantage of special floating point hardware such as vector registers or SIMD instructions.
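As a rough illustration of the kind of low-level operation the specification covers, the following pure-Python sketch computes a dot product and a linear combination of the form alpha*x + y. The function names `dot` and `axpy` are borrowed from the conventional BLAS routine names, but this is not a BLAS binding or an optimized implementation. Unlike the usual BLAS convention of updating `y` in place, this sketch returns a new list.

```python
from typing import List, Sequence


def dot(x: Sequence[float], y: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))


def axpy(alpha: float, x: Sequence[float], y: Sequence[float]) -> List[float]:
    """Linear combination alpha*x + y, returned as a new list."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return [alpha * a + b for a, b in zip(x, y)]


print(dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))   # 32.0
print(axpy(2.0, [1.0, 2.0], [10.0, 20.0]))     # [12.0, 24.0]
```

An optimized implementation would perform the same arithmetic with vector registers or SIMD instructions rather than an interpreted loop.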

Mesa, also called Mesa3D and The Mesa 3D Graphics Library, is an open source software implementation of OpenGL, Vulkan, and other graphics API specifications. Mesa translates these specifications to vendor-specific graphics hardware drivers.

coreboot

coreboot, formerly known as LinuxBIOS, is a software project aimed at replacing proprietary firmware found in most computers with a lightweight firmware designed to perform only the minimum number of tasks necessary to load and run a modern 32-bit or 64-bit operating system.

Free and open-source graphics device driver

A free and open-source graphics device driver is a software stack which controls computer-graphics hardware and supports graphics-rendering application programming interfaces (APIs) and is released under a free and open-source software license. Graphics device drivers are written for specific hardware to work within a specific operating system kernel and to support a range of APIs used by applications to access the graphics hardware. They may also control output to the display if the display driver is part of the graphics hardware. Most free and open-source graphics device drivers are developed by the Mesa project. The driver is made up of a compiler, a rendering API, and software which manages access to the graphics hardware.

In a computer, the Advanced Configuration and Power Interface (ACPI) provides an open standard that operating systems can use to discover and configure computer hardware components, to perform power management e.g. putting unused hardware components to sleep, to perform auto configuration e.g. Plug and Play, and to perform status monitoring. First released in December 1996, ACPI aims to replace Advanced Power Management (APM), the MultiProcessor Specification, the PCI BIOS specification, and the Plug and Play BIOS (PnP) Specification. ACPI brings the power management under the control of the operating system, as opposed to the previous BIOS-centric system that relied on platform-specific firmware to determine power management and configuration policies. The specification is central to the Operating System-directed configuration and Power Management (OSPM) system. ACPI defines a hardware abstraction interface between the system firmware, the computer hardware components, and the operating systems.

Stream processing is a computer programming paradigm, equivalent to dataflow programming, event stream processing, and reactive programming, that allows some applications to more easily exploit a limited form of parallel processing. Such applications can use multiple computational units, such as the floating point unit on a graphics processing unit or field-programmable gate arrays (FPGAs), without explicitly managing allocation, synchronization, or communication among those units.

Multi-core processor Microprocessor with more than one processing unit

A multi-core processor is a computer processor on a single integrated circuit with two or more separate processing units, called cores, each of which reads and executes program instructions. The instructions are ordinary CPU instructions but the single processor can run instructions on separate cores at the same time, increasing overall speed for programs that support multithreading or other parallel computing techniques. Manufacturers typically integrate the cores onto a single integrated circuit die or onto multiple dies in a single chip package. The microprocessors currently used in almost all personal computers are multi-core.

Intel C++ Compiler is a group of C, C++, SYCL and Data Parallel C++ (DPC++) compilers from Intel available for Windows, Mac, Linux, and FreeBSD.

Intel Fortran Compiler is a group of Fortran compilers from Intel for Windows, OS X, and Linux.

OpenCL Open standard for programming heterogeneous computing systems, such as CPUs or GPUs

OpenCL is a framework for writing programs that execute across heterogeneous platforms consisting of central processing units (CPUs), graphics processing units (GPUs), digital signal processors (DSPs), field-programmable gate arrays (FPGAs) and other processors or hardware accelerators. OpenCL specifies programming languages for programming these devices and application programming interfaces (APIs) to control the platform and execute programs on the compute devices. OpenCL provides a standard interface for parallel computing using task- and data-based parallelism.

Intel Parallel Studio XE was rebranded and repackaged by Intel when oneAPI toolkits were released in December 2020. Intel oneAPI Base Toolkit + Intel oneAPI HPC toolkit contain all the tools in Parallel Studio XE and more. One significant addition is a Data Parallel C++ (DPC++) compiler designed to allow developers to reuse code across hardware targets.

Intel Math Kernel Library is a library of optimized math routines for science, engineering, and financial applications. Core math functions include BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier transforms, and vector math.

Heterogeneous System Architecture (HSA) is a cross-vendor set of specifications that allow for the integration of central processing units and graphics processors on the same bus, with shared memory and tasks. The HSA is being developed by the HSA Foundation, which includes AMD and ARM. The platform's stated aim is to reduce communication latency between CPUs, GPUs and other compute devices, and make these various devices more compatible from a programmer's perspective, relieving the programmer of the task of planning the moving of data between devices' disjoint memories.

The Data Plane Development Kit (DPDK) is an open source software project managed by the Linux Foundation. It provides a set of data plane libraries and network interface controller polling-mode drivers for offloading TCP packet processing from the operating system kernel to processes running in user space. This offloading achieves higher computing efficiency and higher packet throughput than is possible using the interrupt-driven processing provided in the kernel.

Foreshadow Hardware vulnerability for Intel processors

Foreshadow, known as L1 Terminal Fault (L1TF) by Intel, is a vulnerability that affects modern microprocessors that was first discovered by two independent teams of researchers in January 2018, but was first disclosed to the public on 14 August 2018. The vulnerability is a speculative execution attack on Intel processors that may result in the disclosure of sensitive information stored in personal computers and third-party clouds. There are two versions: the first version (original/Foreshadow) targets data from SGX enclaves; and the second version (next-generation/Foreshadow-NG) targets virtual machines (VMs), hypervisors (VMM), operating systems (OS) kernel memory, and System Management Mode (SMM) memory. A listing of affected Intel hardware has been posted.

References

  1. "Intel Accelerates High Performance Computing Clusters" (Press release). Intel Corporation. June 27, 2007. Retrieved September 4, 2016.