GP5 chip

The GP5 chip next to a pair of dice

The GP5 is a co-processor built to accelerate discrete belief propagation on factor graphs and other large-scale tensor product operations for machine learning. It is related to the Google Tensor Processing Unit, which it predates by a number of years.

It is designed to run as a co-processor alongside a host controller, such as an x86 CPU or an ARM, MIPS, or Tensilica core. It was developed as the culmination of DARPA's Analog Logic program [1].

The GP5 has a fairly exotic architecture, resembling neither a GPU nor a DSP. It is deeply pipelined and leverages massive parallelism at both fine and coarse granularity. The different algorithmic tasks involved in performing belief propagation updates are handled by independent, heterogeneous compute units.

The performance of the chip depends on the structure of the machine learning workload being evaluated. In typical cases, the GP5 is roughly 100 times faster and 100 times more energy efficient than a single core of a modern Core i7 performing a comparable task. It is roughly 10 times faster and 1000 times more energy efficient than a state-of-the-art GPU, and roughly 1000 times faster and 10 times more energy efficient than a state-of-the-art ARM processor. The chip was benchmarked on typical machine learning and inference workloads, including protein side-chain folding, turbo error correction decoding, stereo vision, and signal noise reduction.
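For readers unfamiliar with the workload, the following is a minimal Python sketch of one discrete belief propagation (sum-product) update at a pairwise factor. It is illustrative only and says nothing about how the GP5 implements the operation in hardware; the function name, the toy factor table, and the incoming message are all hypothetical.

```python
def factor_to_variable_message(factor, incoming):
    """Sum-product message from a pairwise factor f(x0, x1) to variable x1.

    factor[i][j] holds f(x0=i, x1=j); incoming[i] is the message from x0.
    """
    n_out = len(factor[0])
    # Multiply the factor by the incoming message, then sum out x0.
    message = [
        sum(factor[i][j] * incoming[i] for i in range(len(incoming)))
        for j in range(n_out)
    ]
    # Normalise so the message sums to one (helps numerical stability).
    total = sum(message)
    return [m / total for m in message]


# Hypothetical toy factor over two binary variables that favours agreement.
factor = [[0.9, 0.1],
          [0.1, 0.9]]
msg_from_x0 = [0.8, 0.2]  # hypothetical belief that x0 is most likely 0

msg_to_x1 = factor_to_variable_message(factor, msg_from_x0)
print(msg_to_x1)
```

Each update is a small multiply-then-sum over a table, and a large factor graph requires many such updates, which is why the workload lends itself to parallel hardware.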

Analog Devices, Inc. acquired the intellectual property for the GP5 when it acquired Lyric Semiconductor, Inc. in 2011.

Related Research Articles

System on a chip: type of integrated circuit

A system on chip is an integrated circuit that integrates all components of a computer or other electronic system. These components typically include a central processing unit (CPU), memory, input/output ports and secondary storage – all on a single substrate or microchip, the size of a coin. It may contain digital, analog, mixed-signal, and often radio frequency signal processing functions, depending on the application. As they are integrated on a single substrate, SoCs consume much less power and take up much less area than multi-chip designs with equivalent functionality. Because of this, SoCs are very common in the mobile computing and edge computing markets. Systems-on-chip are typically fabricated using metal–oxide–semiconductor (MOS) technology, and are commonly used in embedded systems and the Internet of Things.

Power management is a feature of some electrical appliances, especially copiers, computers, CPUs, GPUs and computer peripherals such as monitors and printers, that turns off the power or switches the system to a low-power state when inactive. In computing this is known as PC power management and is built around a standard called ACPI. This supersedes APM. All recent (consumer) computers have ACPI support.

In computing, especially digital signal processing, the multiply–accumulate operation is a common step that computes the product of two numbers and adds that product to an accumulator. The hardware unit that performs the operation is known as a multiplier–accumulator; the operation itself is also often called a MAC or a MAC operation. The MAC operation modifies an accumulator a: a ← a + (b × c).
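The multiply–accumulate step can be sketched in a few lines of Python. This is a software illustration of the arithmetic only, not a model of any particular hardware unit; the helper names `mac` and `dot` are hypothetical.

```python
def mac(a, b, c):
    """One multiply-accumulate step: return a + (b * c)."""
    return a + b * c


def dot(xs, ys):
    """Dot product built from repeated MAC steps on a single accumulator."""
    acc = 0
    for x, y in zip(xs, ys):
        acc = mac(acc, x, y)
    return acc


print(dot([1, 2, 3], [4, 5, 6]))  # 4 + 10 + 18 = 32
```

A dot product of length n takes n MAC steps, which is why MAC throughput is a common figure of merit for signal processing hardware.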

General-purpose computing on graphics processing units is the use of a graphics processing unit (GPU), which typically handles computation only for computer graphics, to perform computation in applications traditionally handled by the central processing unit (CPU). The use of multiple video cards in one computer, or large numbers of graphics chips, further parallelizes the already parallel nature of graphics processing. In addition, even a single GPU-CPU framework provides advantages that multiple CPUs on their own do not offer due to the specialization in each chip.

In electronic design a semiconductor intellectual property core, IP core, or IP block is a reusable unit of logic, cell, or integrated circuit layout design that is the intellectual property of one party. IP cores may be licensed to another party or can be owned and used by a single party alone. The term is derived from the licensing of the patent and/or source code copyright that exist in the design. IP cores can be used as building blocks within application-specific integrated circuit (ASIC) designs or field-programmable gate array (FPGA) logic designs.

Multi-core processor: microprocessor with more than one processing unit

A multi-core processor is a computer processor integrated circuit with two or more separate processing units, called cores, each of which reads and executes program instructions, as if the computer had several processors. The instructions are ordinary CPU instructions but the single processor can run instructions on separate cores at the same time, increasing overall speed for programs that support multithreading or other parallel computing techniques. Manufacturers typically integrate the cores onto a single integrated circuit die or onto multiple dies in a single chip package. The microprocessors currently used in almost all personal computers are multi-core. A multi-core processor implements multiprocessing in a single physical package. Designers may couple cores in a multi-core device tightly or loosely. For example, cores may or may not share caches, and they may implement message passing or shared-memory inter-core communication methods. Common network topologies to interconnect cores include bus, ring, two-dimensional mesh, and crossbar. Homogeneous multi-core systems include only identical cores; heterogeneous multi-core systems have cores that are not identical. Just as with single-processor systems, cores in multi-core systems may implement architectures such as VLIW, superscalar, vector, or multithreading.

Programmable system-on-chip: type of integrated circuit

PSoC is a family of microcontroller integrated circuits by Cypress Semiconductor. These chips include a CPU core and mixed-signal arrays of configurable integrated analog and digital peripherals.

Transistor count: the number of transistors in a device

The transistor count is the number of transistors on an integrated circuit (IC). It typically refers to the number of MOSFETs on an IC chip, as all modern ICs use MOSFETs. It is the most common measure of IC complexity. The rate at which MOS transistor counts have increased generally follows Moore's law, which observed that the transistor count doubles approximately every two years.
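The doubling rule can be turned into a rough back-of-the-envelope projection. The sketch below assumes an idealised, exact doubling every two years; the starting count of one million transistors is a hypothetical figure chosen only for illustration.

```python
def projected_transistor_count(initial_count, years, doubling_period=2.0):
    """Project a transistor count assuming one doubling per `doubling_period` years."""
    return initial_count * 2 ** (years / doubling_period)


# Hypothetical chip with one million transistors, projected ten years ahead:
# ten years is five doublings, so the count grows by a factor of 2**5 = 32.
print(projected_transistor_count(1_000_000, 10))  # 32000000.0
```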

Intrinsity was a privately held fabless semiconductor company based in Austin, Texas. It was founded in 1997 as EVSX on the remnants of Exponential Technology and changed its name to Intrinsity in 2000. It had around 100 employees and supplied tools and services for highly efficient semiconductor logic design, enabling high-performance microprocessors with fewer transistors and low power consumption. The acquisition of the firm by Apple Inc. was confirmed on April 27, 2010.

Actions Semiconductor Co. Ltd. is a Chinese fabless semiconductor company founded in 2000 and headquartered in Zhuhai, Guangdong province. The company has about 600 employees and designs SoCs for tablets, digital audio players, photo viewers and related products.

The ARM Cortex-A9 MPCore is a 32-bit processor core licensed by ARM Holdings implementing the ARMv7-A architecture. It is a multicore processor providing up to 4 cache-coherent cores.

Arm Holdings: British multinational semiconductor and software design company

Arm Holdings is a British multinational semiconductor and software design company, owned by SoftBank Group and its Vision Fund. With its headquarters in Cambridgeshire, in the United Kingdom, its primary business is in the design of ARM processors (CPUs), although it also designs software development tools under the DS-5, RealView and Keil brands, as well as systems and platforms, system-on-a-chip (SoC) infrastructure and software. As a "Holding" company, it also holds shares of other companies. It is considered to be market dominant for processors in mobile phones and tablet computers. The company is one of the best-known "Silicon Fen" companies.

Adapteva is a fabless semiconductor company focusing on low-power many-core microprocessor design. The company was the second company to announce a design with 1,000 specialized processing cores on a single integrated circuit.

ARM Cortex-A7: microprocessor core developed by Arm Holdings, supporting 1-4 cores

The ARM Cortex-A7 MPCore is a 32-bit microprocessor core licensed by ARM Holdings implementing the ARMv7-A architecture announced in 2011.

Heterogeneous computing refers to systems that use more than one kind of processor or cores. These systems gain performance or energy efficiency not just by adding the same type of processors, but by adding dissimilar coprocessors, usually incorporating specialized processing capabilities to handle particular tasks.

TensorFlow: machine learning software library

TensorFlow is a free and open-source software library for dataflow and differentiable programming across a range of tasks. It is a symbolic math library, and is also used for machine learning applications such as neural networks. It is used for both research and production at Google.

An AI accelerator is a class of specialized hardware accelerator or computer system designed to accelerate artificial intelligence applications, especially artificial neural networks, machine vision and machine learning. Typical applications include algorithms for robotics, internet of things and other data-intensive or sensor-driven tasks. They are often manycore designs and generally focus on low-precision arithmetic, novel dataflow architectures or in-memory computing capability. As of 2018, a typical AI integrated circuit chip contains billions of MOSFET transistors.

Apple A11: Apple SoC

The Apple A11 Bionic is a 64-bit ARM-based system on a chip (SoC), designed by Apple Inc. and manufactured by TSMC. It first appeared in the iPhone 8, iPhone 8 Plus, and iPhone X which were introduced on September 12, 2017. According to Apple the chip has two high-performance cores which are 25% faster than the Apple A10 and four high-efficiency cores which are up to 70% faster than the energy-efficient cores in the A10.

The Pixel Visual Core (PVC) is a series of ARM-based system in package (SiP) image processors designed by Google. The PVC is a fully programmable image, vision and AI multi-core domain-specific architecture (DSA) for mobile devices and, in future, for IoT. It first appeared in the Google Pixel 2 and 2 XL, which were introduced on October 19, 2017. It has also appeared in the Google Pixel 3 and 3 XL. Starting with the Pixel 4, this chip was replaced with the Pixel Neural Core.

The Apple A13 Bionic is a 64-bit ARM-based system on a chip (SoC), designed by Apple Inc. It appears in the iPhone 11 and iPhone 11 Pro.

References

  1. DARPA FA8750-07-C-0231