Megahertz myth

The megahertz myth, or in more recent cases the gigahertz myth, is the misconception that clock rate alone (measured, for example, in megahertz or gigahertz) is enough to compare the performance of different microprocessors. While clock rate is a valid way of comparing different speed grades of the same model and type of processor, other factors such as the number of execution units, pipeline depth, cache hierarchy, branch prediction, and instruction set can greatly affect performance when different processors are compared. For example, one processor may take two clock cycles to add two numbers and another clock cycle to multiply the result by a third number, whereas another processor may do the same calculation in two clock cycles. Comparisons between different types of processors are difficult because performance varies with the type of task. A benchmark is a more thorough way of measuring and comparing computer performance.

The myth started around 1984 with comparisons between the Apple II and the IBM PC. The argument was that the IBM computer was five times faster than the Apple II, as its Intel 8088 processor had a clock speed roughly 4.7 times that of the MOS Technology 6502 used in the latter. However, what really matters is not how finely divided a machine's instructions are, but how long it takes to complete a given task. Consider the LDA # (Load Accumulator Immediate) instruction. On a 6502 that instruction requires two clock cycles, or 2 μs at 1 MHz. Although the 4.77 MHz 8088's clock cycles are shorter, the equivalent instruction (MOV AL,#) needs at least 4 of them, [1] so it takes at least 4 / 4.77 MHz = 0.84 μs. So, at best, that instruction runs only a little more than 2 times as fast on the original IBM PC as on the Apple II.
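The arithmetic above can be reproduced with a minimal Python sketch. The cycle counts and clock rates come from the text and from reference note 1 (4 cycles when the 8088 instruction is already in the prefetch queue, 8 when it is not). The function name `instruction_time_us` is a hypothetical helper introduced only for this illustration.

```python
def instruction_time_us(cycles: int, clock_mhz: float) -> float:
    """Return the time one instruction takes, in microseconds.

    Dividing a cycle count by a clock rate in MHz gives microseconds.
    """
    return cycles / clock_mhz


# 6502 at 1 MHz: LDA # always takes 2 clock cycles.
t_6502 = instruction_time_us(2, 1.0)

# 8088 at 4.77 MHz: 4 cycles if the instruction was already prefetched,
# 8 cycles if it was not (per reference note 1).
t_8088_best = instruction_time_us(4, 4.77)
t_8088_worst = instruction_time_us(8, 4.77)

print(f"6502: {t_6502:.2f} us")
print(f"8088: {t_8088_best:.2f} us (best) to {t_8088_worst:.2f} us (worst)")
print(f"8088 speedup, worst case: {t_6502 / t_8088_worst:.2f}x")
print(f"8088 speedup, best case:  {t_6502 / t_8088_best:.3f}x")
```

For this one instruction, the 4.77-fold clock advantage shrinks to a speedup of roughly 1.2 to 2.4 times, depending on the state of the prefetch queue. A single instruction is not a benchmark, so the result illustrates the gap between clock ratio and speedup rather than measuring either machine.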

History

Pentium 1 series processors

Background

The CISC-based x86 CPU architecture, which Intel introduced in 1978, was used as the standard for the DOS-based IBM PC, and developments of it still continue to dominate the Microsoft Windows market. An IBM RISC-based architecture was used for the PowerPC CPU, which was released in 1992. In 1994, Apple Computer introduced Macintosh computers using these PowerPC CPUs. Initially, this architecture met hopes for performance, and different ranges of PowerPC CPUs were developed, often delivering different performance at the same clock rate. Similarly, at this time the Intel 80486 was selling alongside the Pentium, which delivered almost twice the performance of the 80486 at the same clock rate. [2]

Rise of the myth

The myth arose because the clock rate was commonly taken as a simple measure of processor performance, and was promoted in advertising and by enthusiasts without taking other factors into account. The term came into use in the context of comparing PowerPC-based Apple Macintosh computers with Intel-based PCs. Marketing based on the myth led to the clock rate being given higher priority than actual performance, and led AMD to introduce model numbers that gave a notional clock rate based on comparative performance, to overcome a perceived deficiency in its actual clock rates. [3]

Modern adaptations of the myth

With the advent of multithreading and multi-core processors, the myth has given rise to further misconceptions about measuring the performance of multi-core processors. Some people believe that a quad-core processor running at 3 GHz delivers an overall performance of 12 GHz worth of CPU. Others say that the overall performance is in fact 3 GHz, with each core running at 750 MHz. Both of these ideas are incorrect. Often the same user making these comparisons will be comparing multiple brands of CPU, which will not do the same amount of work per cycle in any case. While micro-architecture traits such as pipeline depth play the same role in performance, the design of parallel processing brings other factors into the picture, such as software efficiency.

It is true that a poorly written program will run poorly even on a single-core [4] system, but even a well-written program that was designed to run sequentially will generally perform no better on a multi-core system than on a single-core one when run by itself. Not all programs use all available cores, not all tasks can or need to be split over several cores, and using several cores may introduce additional complexity in the program, so multithreading is usually done only when the benefits of doing so are obvious. A system's overall performance cannot be judged by simply comparing the number of processor cores and clock rates; the software running on the system is also a major factor in observed speed. The myth of the importance of clock rate has confused many people as to how they judge the speed of a computer system.
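The point that not every task can be split across cores can be made quantitative with Amdahl's law. The article does not name this model; it is brought in here as a standard idealized illustration, and it ignores synchronization overhead, memory contention, and differences in work done per cycle. The function name `amdahl_speedup` and the sample parallel fractions are assumptions chosen for the sketch.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup over a single core under Amdahl's law.

    parallel_fraction is the share of the run time that can be spread
    evenly across cores; the remainder runs on one core only.
    """
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Hypothetical workloads on a quad-core processor.
for fraction in (0.0, 0.5, 0.9, 1.0):
    speedup = amdahl_speedup(fraction, cores=4)
    print(f"parallel fraction {fraction:.1f}: {speedup:.2f}x over one core")
```

Under this model a quad-core processor reaches four times the single-core throughput only when the whole workload is parallel. A purely sequential program gains nothing, which is why neither "4 x 3 GHz = 12 GHz" nor "3 GHz split four ways" describes what a user will observe.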

Challenges to the myth

Comparisons between PowerPC and Pentium had become a staple of Apple presentations. At the New York City Macworld Expo keynote on July 18, 2001, Steve Jobs described an 867 MHz G4 as completing a task in 45 seconds while a 1.7 GHz Pentium 4 took 82 seconds for the same task, saying that "the name that we've given it is the megahertz myth". [5] He then introduced senior hardware VP Jon Rubinstein, who gave a tutorial describing how shorter pipelines gave better performance at half the clock rate. The online cartoon The Joy of Tech subsequently presented a series of cartoons inspired by Rubinstein's tutorial. [6]
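The keynote figures can be turned into a rough count of clock cycles spent on the task. This is only a sketch: it takes the quoted clock rates and run times at face value, assumes both processors ran at their nominal clock for the whole task, and reflects one vendor-chosen workload. The helper name `cycles_used` is hypothetical.

```python
def cycles_used(clock_hz: float, seconds: float) -> float:
    """Clock cycles elapsed while running a task at a fixed clock rate."""
    return clock_hz * seconds


# Figures quoted in the 2001 keynote.
g4_cycles = cycles_used(867e6, 45)   # 867 MHz G4, 45 seconds
p4_cycles = cycles_used(1.7e9, 82)   # 1.7 GHz Pentium 4, 82 seconds

ratio = p4_cycles / g4_cycles
print(f"G4:        {g4_cycles:.3e} cycles")
print(f"Pentium 4: {p4_cycles:.3e} cycles")
print(f"The Pentium 4 spent about {ratio:.1f} times as many cycles on the task")
```

On these numbers the Pentium 4 used roughly 3.6 times as many clock cycles as the G4 to finish the same job, which is the sense in which the higher clock rate did not translate into higher performance in that demonstration.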

Processor speed limits

Pentium 4 processors had high clock speeds, resulting in high temperatures and high power use.

From approximately 1995 to 2005, Intel advertised its mainstream Pentium processors primarily on the basis of clock speed, in comparison to competing products from AMD. Press articles had predicted that computer processors might eventually run as fast as 10 to 20 gigahertz within the next several decades.

This continued until about 2005, when the Pentium Extreme Edition was reaching thermal dissipation limits at speeds of nearly 4 gigahertz. The processor could not go faster without complex changes to the cooling design, such as microfluidic cooling channels embedded within the chip itself to remove heat rapidly.

This was followed by the introduction of the Core 2 desktop processor in 2006, a major change from previous Intel desktop processors that allowed a decrease of nearly 50% in processor clock while retaining the same performance.

Core 2 had its beginnings in the Pentium M mobile processor, where energy efficiency was more important than raw power, and initially offered power-saving options not available in the Pentium 4 and Pentium D.

Higher frequencies

In the years following the demise of the NetBurst microarchitecture and its 3+ GHz CPUs, microprocessor clock speeds slowly increased again after initially dropping by about 1 GHz. Several years' advances in manufacturing processes and power management (specifically, the ability to set clock speeds on a per-core basis) allowed for clock speeds as high as or higher than those of the old NetBurst Pentium 4s and Pentium Ds, but with much higher efficiency and performance. As of 2018, many Intel microprocessors are able to exceed a base clock speed of 4 GHz (the Intel Core i7-7700K and i3-7350K have a base clock speed of 4.20 GHz, for example).

In 2011, AMD was first able to break the 4 GHz barrier for x86 microprocessors with the debut of the initial Bulldozer based AMD FX CPUs. In June 2013, AMD released the FX-9590 which can reach speeds of up to 5.0 GHz, but similar issues with power usage and heat output returned.

Neither Intel nor AMD produced the first microprocessor in the industry to break the 4 GHz and 5 GHz barriers. The IBM z10 achieved 4.4 GHz in 2008, and the IBM z196 achieved 5.2 GHz in 2010, followed by the zEC12 achieving 5.5 GHz in autumn 2012.

Modern microprocessors show that core count and clock rate alone do not fully determine performance. AMD's Ryzen 7 7700X has 8 cores, a base clock speed of 4.5 GHz, and a maximum speed of 5.4 GHz. Intel's directly competing microprocessor, the Core i5-13600K, uses a hybrid architecture (similar in concept to ARM's big.LITTLE) and has a much lower base clock of 3.5 GHz on its 6 performance cores and 2.6 GHz on its 8 efficiency cores. The performance of these two microprocessors is extremely close even though they differ substantially in clock speeds and number of cores. [7] [8]

References

  1. The 8088 has a loosely-coupled Execution Unit (EU) and Bus Interface Unit (BIU), with a prefetch queue; in the 8088, to execute the MOV AL,# instruction, similar in function to the LDA # instruction of the 6502, the EU requires 4 clock cycles, but the BIU requires 8 clock cycles. (It is a 2-byte instruction, and the BIU requires 4 clock cycles to read or write 1 byte, assuming no wait states.) Therefore, if the instruction is already in the prefetch queue, it takes 4 clock cycles to execute; if the instruction has not been prefetched, it takes 8 clock cycles; and if the BIU is in the process of prefetching the instruction when the EU begins to execute it, it takes 5 to 7 clock cycles. In contrast, the 6502, which has a much simpler fetch-execute pipeline, always takes the same number of clock cycles to execute a given instruction in any context.
  2. "Analysis: x86 Vs PPC". Retrieved 2008-09-18.
  3. Tony Smith (February 28, 2002). "Megahertz myth: Technology". The Guardian. Retrieved 2008-09-18.
  4. "single-core". Archived from the original on 2009-08-05. Retrieved 2017-09-04.
  5. "A video of Megahertz Myth presentation". YouTube. Archived from the original on 2021-12-21.
  6. Nitrozac and Snaggy (2001-10-11). "The Megahertz Myth". The Joy of Tech. Retrieved 2011-11-21.
  7. https://www.amd.com/en/products/cpu/amd-ryzen-7-7700x
  8. https://www.intel.com/content/www/us/en/products/sku/230493/intel-core-i513600k-processor-24m-cache-up-to-5-10-ghz/specifications.html