BogoMips

Last updated September 26, 2023

BogoMips (from "bogus" and MIPS) is a crude measurement of CPU speed made by the Linux kernel when it boots to calibrate an internal busy-loop.^[1] An often-quoted definition of the term is "the number of million times per second a processor can do absolutely nothing".^[2]^[3]

History

In 1993, Lars Wirzenius posted a Usenet message^[5] on comp.os.linux explaining why BogoMips was introduced in the Linux kernel:

[...]

MIPS is short for Millions of Instructions Per Second. It is a measure for the computation speed of a processor. Like most such measures, it is more often abused than used properly (it is very difficult to justly compare MIPS for different kinds of computers).

BogoMips are Linus's own invention. The linux kernel version 0.99.11 (dated 11 July 1993) needed a timing loop (the time is too short and/or needs to be too exact for a non-busy-loop method of waiting), which must be calibrated to the processor speed of the machine. Hence, the kernel measures at boot time how fast a certain kind of busy loop runs on a computer. "Bogo" comes from "bogus", i.e, something which is a fake. Hence, the BogoMips value gives some indication of the processor speed, but it is way too unscientific to be called anything but BogoMips.

The reasons (there are two) it is printed during boot-up is that a) it is slightly useful for debugging and for checking that the computer's caches and turbo button work, and b) Linus loves to chuckle when he sees confused people on the news.

[...]

Proper BogoMips ratings

As a very approximate guide, the BogoMips value can be estimated from the following table by multiplying the CPU's clock speed by the listed factor. The given rating is typical for that CPU with the then-current and applicable Linux version. The index is the ratio of "BogoMips per clock speed" for a given CPU to the same figure for an Intel 386DX CPU, for comparison purposes.^[6]^[7]

System	Rating	Index
Intel 8088	clock × 0.004	0.02
Intel/AMD 386SX	clock × 0.14	0.8
Intel/AMD 386DX	clock × 0.18	1 (definition)
Motorola 68030	clock × 0.25	1.4
Cyrix/IBM 486	clock × 0.34	1.8
Intel Pentium	clock × 0.40	2.2
Intel 486	clock × 0.50	2.8
AMD 5x86	clock × 0.50	2.8
MIPS R4000/R4400	clock × 0.50	2.8
ARM9	clock × 0.50	2.8
Motorola 68040	clock × 0.67	3.7
PowerPC 603	clock × 0.67	3.7
Intel StrongARM	clock × 0.66	3.7
NexGen Nx586	clock × 0.75	4.2
PowerPC 601	clock × 0.84	4.7
Alpha 21064/21064A	clock × 0.99	5.5
Alpha 21066/21066A	clock × 0.99	5.5
Alpha 21164/21164A	clock × 0.99	5.5
Intel Pentium Pro	clock × 0.99	5.5
Cyrix 5x86/6x86	clock × 1.00	5.6
Intel Pentium II/III	clock × 1.00	5.6
AMD K7/Athlon	clock × 1.00	5.6
Intel Celeron	clock × 1.00	5.6
Intel Itanium	clock × 1.00	5.6
R4600	clock × 1.00	5.6
Hitachi SH-4	clock × 1.00	5.6
Raspberry Pi (Model B)	clock × 1.00	5.6
Intel Itanium 2	clock × 1.49	8.3
Alpha 21264	clock × 1.99	11.1
VIA Centaur	clock × 1.99	11.1
AMD K5/K6/K6-2/K6-III	clock × 2.00	11.1
AMD Duron/Athlon XP	clock × 2.00	11.1
AMD Sempron	clock × 2.00	11.1
UltraSparc II	clock × 2.00	11.1
Intel Pentium MMX	clock × 2.00	11.1
Intel Pentium 4	clock × 2.00	11.1
Intel Pentium M	clock × 2.00	11.1
Intel Core Duo	clock × 2.00	11.1
Intel Core 2 Duo	clock × 2.00	11.1
Intel Atom N455	clock × 2.00	11.1
Centaur C6-2	clock × 2.00	11.1
PowerPC 604/604e/750	clock × 2.00	11.1
Intel Pentium III Coppermine	clock × 2.00	11.1
Intel Pentium III Xeon	clock × 2.00	11.1
Motorola 68060	clock × 2.00	11.1
Intel Xeon MP (32-bit) (hyper-threading)	clock × 3.97	22.1
IBM S390	not enough data (yet)
ARM	not enough data (yet)

With Linux kernel 2.2.14, a CPU caching setting was moved from after the BogoMips calculation to before it. Although the BogoMips algorithm itself was not changed, from that kernel onward the BogoMips rating for then-current Pentium CPUs was twice what it had been before the change. The changed BogoMips outcome had no effect on real processor performance.^[citation needed]

In Linux, BogoMips can be easily obtained by searching the cpuinfo file:^[7]

$ grep -i bogomips /proc/cpuinfo

Computation of BogoMips

With kernel 2.6.x, BogoMips is implemented in the /usr/src/linux/init/calibrate.c kernel source file, which computes the kernel timing parameter loops_per_jiffy: the number of delay-loop iterations per jiffy (one kernel timer tick). The explanation from the source code:

  /*
   * A simple loop like
   *	while ( jiffies < start_jiffies+1)
   *		start = read_current_timer();
   * will not do. As we don't really know whether jiffy switch
   * happened first or timer_value was read first. And some asynchronous
   * event can happen between these two events introducing errors in lpj.
   *
   * So, we do
   * 1. pre_start <- When we are sure that jiffy switch hasn't happened
   * 2. check jiffy switch
   * 3. start <- timer value before or after jiffy switch
   * 4. post_start <- When we are sure that jiffy switch has happened
   *
   * Note, we don't know anything about order of 2 and 3.
   * Now, by looking at post_start and pre_start difference, we can
   * check whether any asynchronous event happened or not
   */

loops_per_jiffy is used to implement the udelay (delay in microseconds) and ndelay (delay in nanoseconds) functions, which some drivers need in order to wait for hardware. Because these functions busy-wait, the calling CPU does no other useful work while ndelay or udelay is executing. For the i386 architecture, delay_loop is implemented in /usr/src/linux/arch/i386/lib/delay.c as:

  /* simple loop based delay: */
  static void delay_loop(unsigned long loops)
  {
  	int d0;

  	__asm__ __volatile__(
  		"\tjmp 1f\n"
  		".align 16\n"
  		"1:\tjmp 2f\n"
  		".align 16\n"
  		"2:\tdecl %0\n\tjns 2b"
  		: "=&a" (d0)
  		: "0" (loops));
  }

This is equivalent to the following assembler code:

  ; input:  eax = d0 (loop count)
  ; output: eax = -1 (the loop exits once the counter goes negative)
  	jmp start
  .align 16
  start:
  	jmp body
  .align 16
  body:
  	decl eax
  	jns body

which can be rewritten as C pseudocode:

  static void delay_loop(long loops)
  {
  	long d0 = loops;
  	do {
  		--d0;
  	} while (d0 >= 0);
  }

Complete details about BogoMips, along with hundreds of reference entries, can be found in the (outdated) BogoMips mini-Howto.^[4]

Timer-based delays

In 2012, ARM contributed a new udelay implementation that allows the system timer built into many ARMv7 CPUs to be used instead of a busy-wait loop. This implementation was released in version 3.6 of the Linux kernel.^[8] Timer-based delays are more robust on systems that use frequency scaling to dynamically adjust the processor's speed at runtime, as loops_per_jiffy values may not necessarily scale linearly. Also, since the timer frequency is known in advance, no calibration is needed at boot time.

One side effect of this change is that the BogoMIPS value will reflect the timer frequency, not the CPU's core frequency. Typically the timer frequency is much lower than the processor's maximum frequency, and some users may be surprised to see an unusually low BogoMIPS value when comparing against systems that use traditional busy-wait loops.


References

[1] Van Dorst, Wim (January 1996). "The Quintessential Linux Benchmark". Linux Journal. Retrieved 2008-08-22.
[2] Eric S. Raymond and Geoff Mackenzie, published on the Internet in the early 1990s, untraceable origin.
[3] Raymond, Eric S. "Hackers Jargon File".
[4] Van Dorst, Wim (2 March 2006). "BogoMips Mini-Howto" (V38 ed.). Retrieved 2008-08-22.
[5] Wirzenius, Lars. "Re: printing & BogoMips".
[6] Bekman, Stas. "What is a BogoMip?".
[7] "BogoMips mini-Howto".
[8] Deacon, Will. "ARM: 7452/1: delay: allow timer-based delay implementation to be selected".

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
