Heterogeneous computing refers to systems that use more than one kind of processor or core. These systems gain performance or energy efficiency not just by adding the same type of processors, but by adding dissimilar coprocessors, usually incorporating specialized processing capabilities to handle particular tasks. [1]
In the context of computing, heterogeneity usually refers to different instruction-set architectures (ISAs): the main processor has one ISA, and the other processors have another, usually very different, architecture (possibly more than one), not just a different microarchitecture. Floating-point number processing is a special case of this that is not usually referred to as heterogeneous.
In the past, heterogeneous computing meant that different ISAs had to be handled differently. In a modern example, Heterogeneous System Architecture (HSA) systems [2] eliminate the difference for the user while using multiple processor types (typically CPUs and GPUs), usually on the same integrated circuit, to provide the best of both worlds: the GPU handles general-purpose processing (apart from its well-known 3D graphics rendering capabilities, it can also perform mathematically intensive computations on very large data sets), while the CPU runs the operating system and performs traditional serial tasks.
The level of heterogeneity in modern computing systems is gradually increasing as further scaling of fabrication technologies allows formerly discrete components to become integrated parts of a system-on-chip (SoC). For example, many new processors now include built-in logic for interfacing with other devices (SATA, PCI, Ethernet, USB, RFID, radios, UARTs, and memory controllers), as well as programmable functional units and hardware accelerators (GPUs, cryptography co-processors, programmable network processors, A/V encoders/decoders, etc.).
Recent findings show that a heterogeneous-ISA chip multiprocessor that exploits the diversity offered by multiple ISAs can outperform the best same-ISA homogeneous architecture by as much as 21%, with 23% energy savings and a 32% reduction in energy-delay product (EDP). [3] AMD's 2014 announcement of its pin-compatible ARM and x86 SoCs, codenamed Project Skybridge, [4] suggested that a heterogeneous-ISA (ARM+x86) chip multiprocessor was in the making.
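The energy-delay product mentioned above is the energy a workload consumes multiplied by its execution time, so a lower value is better. The following is a minimal sketch of that arithmetic; the function names and the measurements are hypothetical and chosen only for illustration, not taken from the cited study. Note that "as much as" figures for performance, energy and EDP are separate best-case values and need not multiply together for any single workload.

```python
def energy_delay_product(energy_joules: float, delay_seconds: float) -> float:
    """Energy-delay product: energy consumed multiplied by execution time."""
    return energy_joules * delay_seconds


def relative_reduction(baseline: float, candidate: float) -> float:
    """Fractional reduction of candidate versus baseline (0.25 means 25% lower)."""
    return (baseline - candidate) / baseline


# Hypothetical measurements for one workload (illustrative values only).
baseline_edp = energy_delay_product(energy_joules=10.0, delay_seconds=2.0)
candidate_edp = energy_delay_product(energy_joules=8.0, delay_seconds=1.75)

# Fraction by which the candidate design lowers EDP for this workload.
edp_reduction = relative_reduction(baseline_edp, candidate_edp)
print(f"EDP reduction: {edp_reduction:.0%}")
```

Because EDP multiplies the two quantities, a design that saves energy but runs much slower can still end up with a worse EDP than the baseline.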
A system with heterogeneous CPU topology uses the same ISA on all cores, but the cores themselves differ in speed. [5] The setup is more similar to a symmetric multiprocessor. (Although such systems are technically asymmetric multiprocessors, the cores do not differ in roles or device access.) There are typically two types of cores: a higher-performance core, usually known as a "big" core or P-core, and a more power-efficient core, usually known as a "small" core or E-core. The terms P-core and E-core are usually used in relation to Intel's implementation of heterogeneous computing, while the terms big and little cores are usually used in relation to the ARM architecture. Some processors have three categories of core (prime, performance and efficiency), with prime cores having higher performance than performance cores; in that case a prime core is known as "big", a performance core as "medium", and an efficiency core as "small". [6]
A common use of such topology is to provide better power efficiency, especially in mobile SoCs.
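The idea of steering work to the core type that suits it can be illustrated with a toy placement policy. This is only a sketch under stated assumptions: the `Task` type, the `latency_sensitive` hint and the `pick_core_type` function are hypothetical, and real operating-system schedulers weigh far more signals than a single flag.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    latency_sensitive: bool  # hypothetical hint supplied by the application


def pick_core_type(task: Task) -> str:
    """Toy policy: latency-sensitive work goes to a P-core, the rest to an E-core."""
    return "P-core" if task.latency_sensitive else "E-core"


# Hypothetical workload mix on a mobile SoC.
tasks = [
    Task("ui_render", latency_sensitive=True),
    Task("mail_sync", latency_sensitive=False),
    Task("video_decode", latency_sensitive=True),
]

# Map each task name to the core type chosen by the toy policy.
placement = {task.name: pick_core_type(task) for task in tasks}
for name, core in placement.items():
    print(f"{name} -> {core}")
```

Keeping background work such as the hypothetical `mail_sync` task on efficiency cores is what lets the faster cores stay idle, and therefore save power, most of the time.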
Heterogeneous computing systems present new challenges not found in typical homogeneous systems. [8] The presence of multiple processing elements raises all of the issues involved with homogeneous parallel processing systems, while the level of heterogeneity in the system can introduce non-uniformity in system development, programming practices, and overall system capability. Areas of heterogeneity can include: [9]
Heterogeneous computing hardware can be found in every domain of computing—from high-end servers and high-performance computing machines all the way down to low-power embedded devices including mobile phones and tablets.
Processor design is a subfield of computer science and computer engineering (fabrication) that deals with creating a processor, a key component of computer hardware.
In computer science, an instruction set architecture (ISA) is an abstract model that generally defines how software controls the CPU in a computer or a family of computers. A device or program that executes instructions described by that ISA, such as a central processing unit (CPU), is called an implementation of that ISA.
A system on a chip or system-on-chip is an integrated circuit that integrates most or all components of a computer or other electronic system. These components almost always include on-chip central processing unit (CPU), memory interfaces, input/output devices and interfaces, and secondary storage interfaces, often alongside other components such as radio modems and a graphics processing unit (GPU) – all on a single substrate or microchip. SoCs may contain digital and also analog, mixed-signal and often radio frequency signal processing functions.
SuperH is a 32-bit reduced instruction set computing (RISC) instruction set architecture (ISA) developed by Hitachi and currently produced by Renesas. It is implemented by microcontrollers and microprocessors for embedded systems.
HyperTransport (HT), formerly known as Lightning Data Transport, is a technology for interconnection of computer processors. It is a bidirectional serial/parallel high-bandwidth, low-latency point-to-point link that was introduced on April 2, 2001. The HyperTransport Consortium is in charge of promoting and developing HyperTransport technology.
A digital signal processor (DSP) is a specialized microprocessor chip, with its architecture optimized for the operational needs of digital signal processing. DSPs are fabricated on metal–oxide–semiconductor (MOS) integrated circuit chips. They are widely used in audio signal processing, telecommunications, digital image processing, radar, sonar and speech recognition systems, and in common consumer electronic devices such as mobile phones, disk drives and high-definition television (HDTV) products.
Reconfigurable computing is a computer architecture combining some of the flexibility of software with the high performance of hardware by processing with flexible hardware platforms like field-programmable gate arrays (FPGAs). The principal difference when compared to using ordinary microprocessors is the ability to add custom computational blocks using FPGAs. On the other hand, the main difference from custom hardware, i.e. application-specific integrated circuits (ASICs) is the possibility to adapt the hardware during runtime by "loading" a new circuit on the reconfigurable fabric, thus providing new computational blocks without the need to manufacture and add new chips to the existing system.
A coprocessor is a computer processor used to supplement the functions of the primary processor. Operations performed by the coprocessor may be floating-point arithmetic, graphics, signal processing, string processing, cryptography or I/O interfacing with peripheral devices. By offloading processor-intensive tasks from the main processor, coprocessors can accelerate system performance. Coprocessors allow a line of computers to be customized, so that customers who do not need the extra performance do not need to pay for it.
Xilinx, Inc. was an American technology and semiconductor company that primarily supplied programmable logic devices. The company is renowned for inventing the first commercially viable field-programmable gate array (FPGA). It also pioneered the first fabless manufacturing model.
The MicroBlaze is a soft microprocessor core designed for Xilinx field-programmable gate arrays (FPGA). As a soft-core processor, MicroBlaze is implemented entirely in the general-purpose memory and logic fabric of Xilinx FPGAs.
The PowerPC 400 family is a line of 32-bit embedded RISC processor cores based on the PowerPC or Power ISA instruction set architectures. The cores are designed to fit inside specialized applications ranging from system-on-a-chip (SoC) microcontrollers, network appliances, application-specific integrated circuits (ASICs) and field-programmable gate arrays (FPGAs) to set-top boxes, storage devices and supercomputers.
A multi-core processor (MCP) is a microprocessor on a single integrated circuit (IC) with two or more separate central processing units (CPUs), called cores to emphasize their multiplicity. Each core reads and executes program instructions, specifically ordinary CPU instructions. However, the MCP can run instructions on separate cores at the same time, increasing overall speed for programs that support multithreading or other parallel computing techniques. Manufacturers typically integrate the cores onto a single IC die, known as a chip multiprocessor (CMP), or onto multiple dies in a single chip package. As of 2024, the microprocessors used in almost all new personal computers are multi-core.
The Advanced Learning and Research Institute (ALaRI), a faculty of informatics, was established in 1999 at the University of Lugano to promote research and education in embedded systems. Within a few years, the Faculty of Informatics has become one of Switzerland's major destinations for teaching and research, ranking third after the two Federal Institutes of Technology, Zurich and Lausanne.
VideoCore is a series of low-power mobile multimedia processors originally developed by Alphamosaic Ltd and now owned by Broadcom. Alphamosaic marketed its first version as a two-dimensional DSP architecture that makes it flexible and efficient enough to decode a number of multimedia codecs in software while maintaining low power usage. The semiconductor intellectual property core has been found so far only on Broadcom SoCs.
Manycore processors are special kinds of multi-core processors designed for a high degree of parallel processing, containing numerous simpler, independent processor cores. Manycore processors are used extensively in embedded computers and high-performance computing.
Virtex is the flagship family of FPGA products currently developed by AMD, and originally by Xilinx before AMD acquired the company. Other current product lines include Kintex (mid-range) and Artix (low-cost), each including configurations and models optimized for different applications. In addition, AMD offers the Spartan low-cost series, which continues to be updated and is nearing production utilizing the same underlying architecture and process node as the larger 7-series devices.
Heterogeneous System Architecture (HSA) is a cross-vendor set of specifications that allow for the integration of central processing units and graphics processors on the same bus, with shared memory and tasks. The HSA is being developed by the HSA Foundation, which includes AMD and ARM. The platform's stated aim is to reduce communication latency between CPUs, GPUs and other compute devices, and make these various devices more compatible from a programmer's perspective, relieving the programmer of the task of planning the moving of data between devices' disjoint memories.
A vision processing unit (VPU) is an emerging class of microprocessor; it is a specific type of AI accelerator, designed to accelerate machine vision tasks.
Coherent Accelerator Processor Interface (CAPI) is a high-speed processor expansion bus standard for use in large data center computers, initially designed to be layered on top of PCI Express, for directly connecting central processing units (CPUs) to external accelerators like graphics processing units (GPUs), ASICs, FPGAs or fast storage. It offers low-latency, high-speed, direct memory access connectivity between devices of different instruction set architectures.
The ARM Cortex-A55 is a central processing unit implementing the ARMv8.2-A 64-bit instruction set designed by ARM Holdings' Cambridge design centre. The Cortex-A55 is a 2-wide decode in-order superscalar pipeline.