BogoMips (from "bogus" and MIPS) is a crude measurement of CPU speed made by the Linux kernel when it boots to calibrate an internal busy-loop. [1] An often-quoted definition of the term is "the number of million times per second a processor can do absolutely nothing". [2] [3]
BogoMips can be used to check whether a processor's value falls within the range expected for similar processors, because it reflects the processor's clock frequency and the effect of any CPU cache. It is not usable for performance comparisons among different CPUs. [4]
In 1993, Lars Wirzenius posted a message to the Usenet group comp.os.linux explaining the reasons for its introduction in the Linux kernel. [5]
As a very approximate guide, the BogoMips value can be estimated from the following table. The given rating is typical for that CPU with the then current and applicable Linux version. The index is the ratio of "BogoMips per clock speed" for a given CPU to the same ratio for an Intel 386DX CPU, for comparison purposes. [6] [7]
System | Rating | Index |
---|---|---|
Intel 8088 | clock × 0.004 | 0.02 |
Intel/AMD 386SX | clock × 0.14 | 0.8 |
Intel/AMD 386DX | clock × 0.18 | 1 (definition) |
Motorola 68030 | clock × 0.25 | 1.4 |
Cyrix/IBM 486 | clock × 0.34 | 1.8 |
Intel Pentium | clock × 0.40 | 2.2 |
Intel 486 | clock × 0.50 | 2.8 |
AMD 5x86 | clock × 0.50 | 2.8 |
MIPS R4000/R4400 | clock × 0.50 | 2.8 |
ARM9 | clock × 0.50 | 2.8 |
Motorola 68040 | clock × 0.67 | 3.7 |
PowerPC 603 | clock × 0.67 | 3.7 |
Intel StrongARM | clock × 0.66 | 3.7 |
NexGen Nx586 | clock × 0.75 | 4.2 |
PowerPC 601 | clock × 0.84 | 4.7 |
Alpha 21064/21064A | clock × 0.99 | 5.5 |
Alpha 21066/21066A | clock × 0.99 | 5.5 |
Alpha 21164/21164A | clock × 0.99 | 5.5 |
Intel Pentium Pro | clock × 0.99 | 5.5 |
Cyrix 5x86/6x86 | clock × 1.00 | 5.6 |
Intel Pentium II/III | clock × 1.00 | 5.6 |
AMD K7/Athlon | clock × 1.00 | 5.6 |
Intel Celeron | clock × 1.00 | 5.6 |
Intel Itanium | clock × 1.00 | 5.6 |
R4600 | clock × 1.00 | 5.6 |
Hitachi SH-4 | clock × 1.00 | 5.6 |
Raspberry Pi (Model B) | clock × 1.00 | 5.6 |
Intel Itanium 2 | clock × 1.49 | 8.3 |
Alpha 21264 | clock × 1.99 | 11.1 |
VIA Centaur | clock × 1.99 | 11.1 |
AMD K5/K6/K6-2/K6-III | clock × 2.00 | 11.1 |
AMD Duron/Athlon XP | clock × 2.00 | 11.1 |
AMD Sempron | clock × 2.00 | 11.1 |
UltraSparc II | clock × 2.00 | 11.1 |
Intel Pentium MMX | clock × 2.00 | 11.1 |
Intel Pentium 4 | clock × 2.00 | 11.1 |
Intel Pentium M | clock × 2.00 | 11.1 |
Intel Core Duo | clock × 2.00 | 11.1 |
Intel Core 2 Duo | clock × 2.00 | 11.1 |
Intel Atom N455 | clock × 2.00 | 11.1 |
Centaur C6-2 | clock × 2.00 | 11.1 |
PowerPC 604/604e/750 | clock × 2.00 | 11.1 |
Intel Pentium III Coppermine | clock × 2.00 | 11.1 |
Intel Pentium III Xeon | clock × 2.00 | 11.1 |
Motorola 68060 | clock × 2.00 | 11.1 |
Intel Xeon MP (32-bit) (hyper-threading) | clock × 3.97 | 22.1 |
IBM S390 | not enough data (yet) | |
ARM | not enough data (yet) | |
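
The arithmetic behind the table can be sketched in a few lines of C. This is only an illustration of how the Rating and Index columns relate: the function names are hypothetical, and the only values taken from the table are the per-MHz factors (for example 0.18 for the 386DX reference and 0.50 for the Intel 486).

```c
#include <assert.h>
#include <stdio.h>

/* Reference factor from the table: BogoMips per MHz for an Intel 386DX. */
#define FACTOR_386DX 0.18

/* Rating column: clock speed in MHz multiplied by the table factor. */
double estimate_bogomips(double clock_mhz, double factor)
{
    assert(clock_mhz > 0.0 && factor > 0.0);
    return clock_mhz * factor;
}

/* Index column: a CPU's factor relative to the 386DX factor. */
double bogomips_index(double factor)
{
    assert(factor > 0.0);
    return factor / FACTOR_386DX;
}

/* Print one worked row; the name and clock are caller-supplied examples. */
void print_estimate(const char *name, double clock_mhz, double factor)
{
    printf("%s at %.0f MHz: about %.2f BogoMips (index %.1f)\n",
           name, clock_mhz,
           estimate_bogomips(clock_mhz, factor),
           bogomips_index(factor));
}
```

For instance, `print_estimate("Intel 486", 33.0, 0.50)` reports about 16.50 BogoMips with an index of 2.8, matching the Intel 486 row. The 33 MHz clock is just an example input.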
With the 2.2.14 Linux kernel, a caching setting of the CPU state was moved from after the BogoMips calculation to before it. Although the BogoMips algorithm itself was not changed, from that kernel onward the BogoMips rating for then current Pentium CPUs was twice the rating before the change. The changed BogoMips outcome had no effect on real processor performance. [citation needed]
In Linux, BogoMips can be easily obtained by searching the cpuinfo file: [7]
    $ grep -i bogomips /proc/cpuinfo
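
The same lookup can be done from a program. The following is a minimal sketch, not an established API: the function names are hypothetical, and it assumes each relevant line has the form `name : value`. The match is case-insensitive, mirroring the `-i` flag of the grep command.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/*
 * Parse one cpuinfo line such as "bogomips\t: 4800.00".
 * Returns 1 and stores the value in *out on success, 0 otherwise.
 */
int parse_bogomips_line(const char *line, double *out)
{
    assert(line != NULL && out != NULL);
    if (strncasecmp(line, "bogomips", 8) != 0)
        return 0;                      /* some other field */
    const char *colon = strchr(line, ':');
    if (colon == NULL)
        return 0;                      /* malformed line */
    return sscanf(colon + 1, "%lf", out) == 1;
}

/*
 * Print the BogoMips value of every CPU entry in the given file.
 * Returns the number of entries found, or -1 if the file cannot be opened.
 */
int print_bogomips(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    char line[256];
    double value;
    int found = 0;
    while (fgets(line, sizeof line, fp) != NULL) {
        if (parse_bogomips_line(line, &value)) {
            printf("cpu entry %d: %.2f BogoMips\n", found, value);
            found++;
        }
    }
    fclose(fp);
    return found;
}
```

Calling `print_bogomips("/proc/cpuinfo")` on a Linux system prints one line per CPU entry. The sample value 4800.00 in the comment is made up for illustration.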
With kernel 2.6.x, BogoMips is implemented in the kernel source file `/usr/src/linux/init/calibrate.c`, which computes the kernel timing parameter `loops_per_jiffy` (see jiffy). The explanation from the source code:
    /*
     * A simple loop like
     *     while ( jiffies < start_jiffies+1)
     *         start = read_current_timer();
     * will not do. As we don't really know whether jiffy switch
     * happened first or timer_value was read first. And some asynchronous
     * event can happen between these two events introducing errors in lpj.
     *
     * So, we do
     * 1. pre_start <- When we are sure that jiffy switch hasn't happened
     * 2. check jiffy switch
     * 3. start <- timer value before or after jiffy switch
     * 4. post_start <- When we are sure that jiffy switch has happened
     *
     * Note, we don't know anything about order of 2 and 3.
     * Now, by looking at post_start and pre_start difference, we can
     * check whether any asynchronous event happened or not
     */
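
Once `loops_per_jiffy` is known, the BogoMips figure follows from it by integer arithmetic. The sketch below mirrors, from memory, the expression the kernel uses in its boot message, so the constants are an assumption to check against a given kernel version. The function names are hypothetical, and `hz` stands in for the kernel's tick rate `HZ` so that the code can run in user space.

```c
#include <assert.h>
#include <stdio.h>

/* Whole-number part of the BogoMips figure for a given loops_per_jiffy. */
unsigned long bogomips_whole(unsigned long lpj, unsigned long hz)
{
    assert(hz > 0 && hz <= 5000);   /* keeps both divisors non-zero */
    return lpj / (500000 / hz);
}

/* Two-digit fractional part (hundredths) of the BogoMips figure. */
unsigned long bogomips_hundredths(unsigned long lpj, unsigned long hz)
{
    assert(hz > 0 && hz <= 5000);
    return (lpj / (5000 / hz)) % 100;
}

/* Print the figure in the "NNNN.NN BogoMIPS (lpj=...)" style. */
void print_bogomips(unsigned long lpj, unsigned long hz)
{
    printf("%lu.%02lu BogoMIPS (lpj=%lu)\n",
           bogomips_whole(lpj, hz), bogomips_hundredths(lpj, hz), lpj);
}
```

With the made-up inputs `lpj = 2400000` and `hz = 1000`, `print_bogomips` prints `4800.00 BogoMIPS (lpj=2400000)`.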
`loops_per_jiffy` is used to implement the `udelay` (delay in microseconds) and `ndelay` (delay in nanoseconds) functions. These functions are needed by some drivers to wait for hardware. Note that a busy-waiting technique is used, so the kernel is effectively blocked while executing `ndelay`/`udelay`. For the i386 architecture, `delay_loop` is implemented in `/usr/src/linux/arch/i386/lib/delay.c` as:
    /* simple loop based delay: */
    static void delay_loop(unsigned long loops)
    {
        int d0;

        __asm__ __volatile__(
            "\tjmp 1f\n"
            ".align 16\n"
            "1:\tjmp 2f\n"
            ".align 16\n"
            "2:\tdecl %0\n\tjns 2b"
            :"=&a" (d0)
            :"0" (loops));
    }
This is equivalent to the following assembler code:

    ; input: eax = d0
    ; output: eax = -1 (the loop exits once the decrement makes eax negative)
        jmp start
    .align 16
    start:
        jmp body
    .align 16
    body:
        decl eax
        jns body
which can be rewritten as C pseudocode:

    static void delay_loop(long loops)
    {
        long d0 = loops;
        do {
            --d0;
        } while (d0 >= 0);
    }
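
To show how a calibrated `loops_per_jiffy` turns a requested delay into a loop count, here is a simplified user-space sketch. It is not the kernel's implementation, which uses different fixed-point arithmetic. The names `loops_for_usecs` and `udelay_sketch` are hypothetical, `hz` stands in for the kernel tick rate, and the `delay_loop` here is a portable stand-in for the assembler version.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of delay-loop iterations for a delay of usecs microseconds.
 * Loops per second = lpj * hz; the result is rounded up so the delay is
 * never shorter than requested. Very large inputs can overflow 64 bits.
 */
uint64_t loops_for_usecs(uint64_t usecs, uint64_t lpj, uint64_t hz)
{
    assert(hz > 0);
    return (usecs * lpj * hz + 999999u) / 1000000u;
}

/* Portable stand-in for the delay loop; volatile keeps the compiler
 * from optimising the otherwise empty loop away. */
void delay_loop(uint64_t loops)
{
    volatile uint64_t d0 = loops;
    while (d0 > 0)
        --d0;
}

/* User-space analogue of udelay(): busy-wait for roughly usecs microseconds. */
void udelay_sketch(uint64_t usecs, uint64_t lpj, uint64_t hz)
{
    delay_loop(loops_for_usecs(usecs, lpj, hz));
}
```

For example, with the made-up calibration `lpj = 2400000` at `hz = 1000`, a 10 µs delay corresponds to 24000 loop iterations. The delay is only as accurate as the calibration, which is why the loop count must be measured on the machine that runs it.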
Further details about BogoMips, along with hundreds of reference entries, can be found in the (outdated) BogoMips mini-Howto. [4]
In 2012, ARM contributed a new `udelay` implementation allowing the system timer built into many ARMv7 CPUs to be used instead of a busy-wait loop. This implementation was released in version 3.6 of the Linux kernel. [8] Timer-based delays are more robust on systems that use frequency scaling to dynamically adjust the processor's speed at runtime, as `loops_per_jiffy` values may not necessarily scale linearly. Also, since the timer frequency is known in advance, no calibration is needed at boot time.
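
The idea of a timer-based delay can be illustrated in user space. The sketch below is an analogy, not the ARM kernel code: it polls the POSIX monotonic clock through `clock_gettime` where the kernel would read a hardware timer, and the names `now_ns` and `timer_udelay` are hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose clock_gettime under strict C modes */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Current reading of the monotonic clock in nanoseconds. */
uint64_t now_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;   /* silence unused warning when assertions are disabled */
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/* Busy-wait until at least usecs microseconds have elapsed on the clock. */
void timer_udelay(uint64_t usecs)
{
    uint64_t deadline = now_ns() + usecs * 1000u;
    while (now_ns() < deadline)
        ;   /* spin: the clock, not a loop count, decides when to stop */
}
```

Because the stopping condition is a clock reading, the delay stays correct if the CPU frequency changes while waiting, and nothing has to be calibrated beforehand.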
One side effect of this change is that the BogoMIPS value will reflect the timer frequency, not the CPU's core frequency. Typically the timer frequency is much lower than the processor's maximum frequency, and some users may be surprised to see an unusually low BogoMIPS value when comparing against systems that use traditional busy-wait loops.
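
As a rough worked example, assume that the timer-based implementation sets `loops_per_jiffy` to the timer frequency divided by the tick rate HZ, and that the kernel reports BogoMIPS as `loops_per_jiffy` × HZ / 500000. Both are assumptions to verify against the kernel source. The reported value then depends only on the timer frequency $f_\text{timer}$; the 24 MHz timer below is a hypothetical figure:

$$
\text{BogoMIPS} = \frac{\text{loops\_per\_jiffy} \times \text{HZ}}{500\,000}
= \frac{f_\text{timer}}{500\,000},
\qquad
f_\text{timer} = 24\ \text{MHz} \;\Rightarrow\; \text{BogoMIPS} = \frac{24\,000\,000}{500\,000} = 48
$$

Under these assumptions the value stays at 48 whatever the core clock is, which is far below what a busy-wait calibration would report on a processor running at several hundred MHz or more.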