Huang's law

Huang's law is an observation in computer science and engineering that the performance of graphics processing units (GPUs) is advancing at a much faster rate than that of traditional central processing units (CPUs). The observation stands in contrast to Moore's law, which predicted that the number of transistors in a dense integrated circuit (IC) doubles about every two years. [1] Huang's law states that the performance of GPUs will more than double every two years. [2] The validity of the hypothesis has been questioned.

History

[Images: an RTX 4090, the most recent flagship card in Nvidia's GeForce series, with 82.58 TFLOPS of single-precision (FP32) compute, and its predecessor, the RTX 3090, with 35.58 TFLOPS (FP32), showing a massive leap in raw performance in just one generation]

The observation was made by Jensen Huang, the chief executive officer of Nvidia, at the company's 2018 GPU Technology Conference (GTC) held in San Jose, California. [3] He observed that Nvidia's GPUs were "25 times faster than five years ago", whereas Moore's law would have predicted only a ten-fold increase. [2] As microchip components became smaller, it became harder for chip advancement to keep pace with Moore's law. [4]

In 2006 Nvidia's GPUs had a 4x performance advantage over CPUs. By 2018 an Nvidia GPU was 20 times faster than a comparable CPU node, with GPU performance growing by roughly 1.7x each year. Moore's law would predict a doubling every two years; Nvidia's GPU performance instead more than tripled every two years, fulfilling Huang's law. [5]
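
As a rough sketch (not taken from the cited sources), the growth figures quoted above can be related to one another with simple compound-growth arithmetic:

    import math

    def annual_factor(total_factor, years):
        # Constant yearly growth factor implied by a total gain over a span of years.
        return total_factor ** (1 / years)

    def doubling_time_years(yearly_factor):
        # Years needed to double at a constant yearly growth factor.
        return math.log(2) / math.log(yearly_factor)

    print(annual_factor(25, 5))      # ~1.90x per year, for "25 times faster than five years ago"
    print(annual_factor(10, 5))      # ~1.58x per year, for the ten-fold increase attributed to Moore's law
    print(doubling_time_years(1.7))  # ~1.3 years to double at 1.7x per year
    print(1.7 ** 2)                  # ~2.9x over two years at that rate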

Huang's law claims that a synergy between hardware, software, and artificial intelligence makes the new "law" possible. [note 1] "The innovation isn't just about chips," Huang said. "It's about the entire stack." He said that graphics processors in particular are important to the new paradigm. [3] Eliminating bottlenecks can speed up the process and create an advantage in reaching the goal. "Nvidia is a one trick pony," Huang has said. [7] According to Huang: "Accelerated computing is liberating, … Let's say you have an airplane that has to deliver a package. It takes 12 hours to deliver it. Instead of making the plane go faster, concentrate on how to deliver the package faster, look at 3D printing at the destination." The object "… is to deliver the goal faster." [7]

For artificial intelligence tasks, Huang said that training the convolutional network AlexNet took six days on two of Nvidia's GTX 580 processors but only 18 minutes on a modern DGX-2 AI server, a speed-up factor of roughly 500. Whereas Moore's law focuses on transistor counts, Huang's law describes a combination of advances in architecture, interconnects, memory technology, and algorithms. [2] [6]
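
As an illustrative check (a sketch, not from the cited sources), converting the quoted training times to minutes gives a ratio close to the stated factor of 500:

    baseline_minutes = 6 * 24 * 60          # six days of training on two GTX 580s
    dgx2_minutes = 18                       # quoted training time on a DGX-2
    print(baseline_minutes / dgx2_minutes)  # 480.0, roughly the quoted speed-up of 500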

Reception

Bharath Ramsundar wrote that deep learning is being coupled with "[i]mprovements in custom architecture". For example, machine learning systems have been implemented in the blockchain world, where Bitmain assaulted "many cryptocurrencies by designing custom mining ASICs (application-specific integrated circuits)", something that had previously been considered infeasible. "Nvidia's grand achievement however is in making the case that these improvement[s] in architectures are not merely isolated victories for specific applications but perhaps broadly applicable to all of computer science." Ramsundar suggested that broad harnessing of GPUs and the GPU stack (as opposed to the CPU stack) can deliver "dramatic growth in deep learning architecture". The "magic" of the promise of Huang's law is that, as nascent deep-learning-powered software becomes more widely available, improvements from GPU scaling, and from architectural improvements more generally, will concretely improve the "performance and behavior of modern software stacks." [8]

There has been criticism. Journalist Joel Hruska, writing in ExtremeTech in 2020, said "there is no such thing as Huang's Law", calling it an "illusion" that rests on the gains made possible by Moore's law, and argued that it is too soon to determine whether such a law exists. [9] The research nonprofit Epoch found that, between 2006 and 2021, GPU price performance (in terms of FLOPS per dollar) tended to double approximately every 2.5 years, much slower than predicted by Huang's law. [10]
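
For scale, a small sketch (assuming constant exponential growth over the 2006-2021 window that Epoch studied) contrasts what the two doubling times imply:

    span_years = 2021 - 2006
    print(2 ** (span_years / 2.5))  # ~64x total if price performance doubles every 2.5 years (Epoch's estimate)
    print(2 ** (span_years / 2.0))  # ~181x total if it doubled every two years, the minimum Huang's law implies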

Notes

  1. Contrary to other reports, it is said that "Huang's Law … is a term coined by The Wall Street Journal journalist Christopher Mims." [2] [6]

Related Research Articles

<span class="mw-page-title-main">AMD</span> American multinational semiconductor company

Advanced Micro Devices, Inc. (AMD) is an American multinational corporation and semiconductor company based in Santa Clara, California, that develops computer processors and related technologies for business and consumer markets.

<span class="mw-page-title-main">Nvidia</span> American multinational technology company

Nvidia Corporation is an American multinational corporation and technology company headquartered in Santa Clara, California, and incorporated in Delaware. It is a software and fabless company which designs and supplies graphics processing units (GPUs), application programming interfaces (APIs) for data science and high-performance computing, and system-on-a-chip units (SoCs) for the mobile computing and automotive markets. Nvidia is also a dominant supplier of artificial intelligence (AI) hardware and software.

<span class="mw-page-title-main">GeForce</span> Brand of GPUs by Nvidia

GeForce is a brand of graphics processing units (GPUs) designed by Nvidia and marketed for the performance market. As of the GeForce 40 series, there have been eighteen iterations of the design. The first GeForce products were discrete GPUs designed for add-on graphics boards, intended for the high-margin PC gaming market, and later diversification of the product line covered all tiers of the PC graphics market, ranging from cost-sensitive GPUs integrated on motherboards, to mainstream add-in retail boards. Most recently, GeForce technology has been introduced into Nvidia's line of embedded application processors, designed for electronic handhelds and mobile handsets.

<span class="mw-page-title-main">Coprocessor</span> Type of computer processor

A coprocessor is a computer processor used to supplement the functions of the primary processor. Operations performed by the coprocessor may be floating-point arithmetic, graphics, signal processing, string processing, cryptography or I/O interfacing with peripheral devices. By offloading processor-intensive tasks from the main processor, coprocessors can accelerate system performance. Coprocessors allow a line of computers to be customized, so that customers who do not need the extra performance do not need to pay for it.

<span class="mw-page-title-main">Graphics processing unit</span> Specialized electronic circuit; graphics accelerator

A graphics processing unit (GPU) is a specialized electronic circuit initially designed to accelerate computer graphics and image processing. After their initial design, GPUs were found to be useful for non-graphic calculations involving embarrassingly parallel problems due to their parallel structure. Other non-graphical uses include the training of neural networks and cryptocurrency mining.

<span class="mw-page-title-main">Northbridge (computing)</span> PC chip handling onboard control tasks

In computing, a northbridge is a microchip that comprises the core logic chipset architecture on motherboards to handle high-performance tasks, especially for older personal computers. It is connected directly to a CPU via the front-side bus (FSB), and is usually used in conjunction with a slower southbridge to manage communication between the CPU and other parts of the motherboard.

The transistor count is the number of transistors in an electronic device. It is the most common measure of integrated circuit complexity. The rate at which MOS transistor counts have increased generally follows Moore's law, which observes that transistor count doubles approximately every two years. However, being directly proportional to the area of a chip, transistor count does not represent how advanced the corresponding manufacturing technology is: a better indication of this is transistor density.

<span class="mw-page-title-main">CUDA</span> Parallel computing platform and programming model

Compute Unified Device Architecture (CUDA) is a proprietary parallel computing platform and application programming interface (API) that allows software to use certain types of graphics processing units (GPUs) for accelerated general-purpose processing, an approach called general-purpose computing on GPUs (GPGPU). The CUDA API is an extension of the C programming language that adds the ability to specify thread-level parallelism and GPU device-specific operations (such as moving data between the CPU and the GPU). CUDA is a software layer that gives direct access to the GPU's virtual instruction set and parallel computational elements for the execution of compute kernels. In addition to drivers and runtime kernels, the CUDA platform includes compilers, libraries and developer tools to help programmers accelerate their applications.

<span class="mw-page-title-main">Larrabee (microarchitecture)</span> Canceled Intel GPGPU chip

Larrabee is the codename for a cancelled GPGPU chip that Intel was developing separately from its current line of integrated graphics accelerators. It is named after either Mount Larrabee or Larrabee State Park in Whatcom County, Washington, near the town of Bellingham. The chip was to be released in 2010 as the core of a consumer 3D graphics card, but these plans were cancelled due to delays and disappointing early performance figures. The project to produce a GPU retail product directly from the Larrabee research project was terminated in May 2010 and its technology was passed on to the Xeon Phi. The Intel MIC multiprocessor architecture announced in 2010 inherited many design elements from the Larrabee project, but does not function as a graphics processing unit; the product is intended as a co-processor for high performance computing.

AMD FireStream was AMD's brand name for their Radeon-based product line targeting stream processing and/or GPGPU in supercomputers. Originally developed by ATI Technologies around the Radeon X1900 XTX in 2006, the product line was previously branded as both ATI FireSTREAM and AMD Stream Processor. The AMD FireStream can also be used as a floating-point co-processor for offloading CPU calculations, which is part of the Torrenza initiative. The FireStream line has been discontinued since 2012, when GPGPU workloads were entirely folded into the AMD FirePro line.

<span class="mw-page-title-main">Tegra</span> System on a chip by Nvidia

Tegra is a system on a chip (SoC) series developed by Nvidia for mobile devices such as smartphones, personal digital assistants, and mobile Internet devices. The Tegra integrates an ARM architecture central processing unit (CPU), graphics processing unit (GPU), northbridge, southbridge, and memory controller onto one package. Early Tegra SoCs were designed as efficient multimedia processors. The line later emphasized performance for gaming and machine learning applications without sacrificing power efficiency, before shifting toward platforms for vehicular automation under the "Nvidia Drive" brand, used on reference boards and their semiconductors, and the "Nvidia Jetson" brand, used on boards suited to AI applications in, for example, robots, drones, and other high-level automation.

Project Denver is the codename of a central processing unit designed by Nvidia that implements the ARMv8-A 64/32-bit instruction sets using a combination of a simple hardware decoder and software-based binary translation, where "Denver's binary translation layer runs in software, at a lower level than the operating system, and stores commonly accessed, already optimized code sequences in a 128 MB cache stored in main memory". Denver uses a very wide in-order superscalar pipeline. Its design makes it suitable for integration with other SIP cores into one die constituting a system on a chip (SoC).

Heterogeneous computing refers to systems that use more than one kind of processor or core. These systems gain performance or energy efficiency not just by adding the same type of processors, but by adding dissimilar coprocessors, usually incorporating specialized processing capabilities to handle particular tasks.

<span class="mw-page-title-main">Volta (microarchitecture)</span> GPU microarchitecture by Nvidia

Volta is the codename, but not the trademark, for a GPU microarchitecture developed by Nvidia, succeeding Pascal. It was first announced on a roadmap in March 2013, although the first product was not announced until May 2017. The architecture is named after 18th–19th century Italian chemist and physicist Alessandro Volta. It was Nvidia's first chip to feature Tensor Cores, specially designed cores that have superior deep learning performance over regular CUDA cores. The architecture is produced with TSMC's 12 nm FinFET process. The Ampere microarchitecture is the successor to Volta.

<span class="mw-page-title-main">High Bandwidth Memory</span> Type of memory used on processors that require high transfer rate memory

High Bandwidth Memory (HBM) is a computer memory interface for 3D-stacked synchronous dynamic random-access memory (SDRAM), initially from Samsung, AMD and SK Hynix. It is used in conjunction with high-performance graphics accelerators, network devices, and high-performance datacenter AI ASICs, as on-package cache in CPUs and on-package RAM in upcoming CPUs, and in FPGAs and some supercomputers. The first HBM memory chip was produced by SK Hynix in 2013, and the first devices to use HBM were the AMD Fiji GPUs in 2015.

An AI accelerator, deep learning processor, or neural processing unit (NPU) is a class of specialized hardware accelerator or computer system designed to accelerate artificial intelligence and machine learning applications, including artificial neural networks and machine vision. Typical applications include algorithms for robotics, Internet of Things, and other data-intensive or sensor-driven tasks. They are often manycore designs and generally focus on low-precision arithmetic, novel dataflow architectures or in-memory computing capability. As of 2024, a typical AI integrated circuit chip contains tens of billions of MOSFET transistors.

<span class="mw-page-title-main">AMD Instinct</span> Brand name by AMD; data center GPUs for high-performance-computing, machine learning

AMD Instinct is AMD's brand of data center GPUs. It replaced AMD's FirePro S brand in 2016. Compared to the Radeon brand of mainstream consumer/gamer products, the Instinct product line is intended to accelerate deep learning, artificial neural network, and high-performance computing/GPGPU applications.

HEAVY.AI is an American software company that uses graphics processing units (GPUs) and central processing units (CPUs) to query and visualize big data. The company was founded in 2013 by Todd Mostak and Thomas Graham and is headquartered in San Francisco, California.

<span class="mw-page-title-main">Hopper (microarchitecture)</span> GPU microarchitecture designed by Nvidia

Hopper is a graphics processing unit (GPU) microarchitecture developed by Nvidia. It is designed for datacenters and is used alongside the Ada Lovelace microarchitecture. It is the latest generation of the Nvidia Tesla line.

Nvidia GTC is a global artificial intelligence (AI) conference that brings together developers, engineers, researchers, inventors, and IT professionals. Topics focus on AI, computer graphics, data science, machine learning and autonomous machines. Each conference begins with a keynote from Nvidia CEO and founder Jensen Huang, followed by a variety of sessions and talks with experts from around the world.

References

  1. Drum, Kevin. "Moore's Law is dead. Long live Huang's Law".
  2. Mims, Christopher (September 19, 2020). "Huang's Law Is the New Moore's Law, and Explains Why Nvidia Wants Arm". Wall Street Journal. (subscription required) Reprinted in "Huang's Law is New Moore's Law and explains why Nvidia wants an arm". Yahoo!. September 29, 2020.
  3. Perry, Tekla S. (May 2018). "Move Over, Moore's Law: Make Way for Huang's Law". IEEE Spectrum. IEEE. Retrieved September 24, 2020. "Graphics processors are on a supercharged development path that eclipses Moore's Law. … GPUs are also advancing more quickly than CPUs because they rely upon a parallel architecture, Jesse Clayton, an Nvidia senior manager, pointed out in another session."
  4. Tibken, Shara (January 9, 2019). "CES 2019: Moore's Law is dead, says Nvidia's CEO". CNET. CBS Interactive. Retrieved September 24, 2020.
  5. Woodie, Alex (March 27, 2018). "Nvidia Riding High as GPU Workloads and Capabilities Soar". HPCwire. Retrieved September 24, 2020.
  6. Goetting, Brittany (September 20, 2020). "Advancing AI And 'Huang's Law' Could Be Why NVIDIA Moved To Acquire Arm". HotHardware. Retrieved September 24, 2020.
  7. Hayes, Caroline (October 11, 2018). "Jensen Huang: Moore's law is dead – long live AI". Electronics Weekly. Metropolis International. Retrieved September 24, 2020. "... there are two dynamics controlling the computing industry today – the end of Moore's law and software that can write itself, artificial intelligence, or AI. ... We can study where bottlenecks are. New software systems make the application go faster, not just the chip."
  8. Ramsundar, Bharath (April 7, 2018). "The Advent of Huang's Law". Retrieved September 24, 2020.
  9. Hruska, Joel (September 22, 2020). "There's No Such Thing as 'Huang's Law,' Despite Nvidia's AI Lead". ExtremeTech.
  10. Hobbhahn, Marius; Besiroglu, Tamay (2022). "Trends in GPU price-performance". Epoch. https://epochai.org/blog/trends-in-gpu-price-performance