This article needs additional citations for verification .(November 2010) |
In UNIX computing, the system load is a measure of the amount of computational work that a computer system performs. The load average represents the average system load over a period of time. It conventionally appears in the form of three numbers which represent the system load during the last one-, five-, and fifteen-minute periods.
All Unix and Unix-like systems generate a dimensionless metric of three "load average" numbers in the kernel. Users can easily query the current result from a Unix shell by running the uptime
command:
$ uptime 14:34:03 up 10:43, 4 users, load average: 0.06, 0.11, 0.09
The w
and top
commands show the same three load average numbers, as do a range of graphical user interface utilities.
In operating systems based on the Linux kernel, this information can be easily accessed by reading the /proc/loadavg
file.
To explore this kind of information in dept, according to the Linux's Filesystem Hierarchy Standard, architecture-dependent information are exposed on the file /proc/stat
. [1] [2] [3]
An idle computer has a load number of 0 (the idle process is not counted). Each process using or waiting for CPU (the ready queue or run queue) increments the load number by 1. Each process that terminates decrements it by 1. Most UNIX systems count only processes in the running (on CPU) or runnable (waiting for CPU) states. However, Linux also includes processes in uninterruptible sleep states (usually waiting for disk activity), which can lead to markedly different results if many processes remain blocked in I/O due to a busy or stalled I/O system. [4] This, for example, includes processes blocking due to an NFS server failure or too slow media (e.g., USB 1.x storage devices). Such circumstances can result in an elevated load average, which does not reflect an actual increase in CPU use (but still gives an idea of how long users have to wait).
Systems calculate the load average as the exponentially damped/weighted moving average of the load number. The three values of load average refer to the past one, five, and fifteen minutes of system operation. [5]
Mathematically speaking, all three values always average all the system load since the system started up. They all decay exponentially, but they decay at different speeds: they decay exponentially by e after 1, 5, and 15 minutes respectively. Hence, the 1-minute load average consists of 63% (more precisely: 1 - 1/e) of the load from the last minute and 37% (1/e) of the average load since start up, excluding the last minute. For the 5- and 15-minute load averages, the same 63%/37% ratio is computed over 5 minutes and 15 minutes, respectively. Therefore, it is not technically accurate that the 1-minute load average only includes the last 60 seconds of activity, as it includes 37% of the activity from the past, but it is correct to state that it includes mostly the last minute.
For single-CPU systems that are CPU bound, one can think of load average as a measure of system utilization during the respective time period. For systems with multiple CPUs, one must divide the load by the number of processors in order to get a comparable measure.
For example, one can interpret a load average of "1.73 0.60 7.98" on a single-CPU system as:
This means that this system (CPU, disk, memory, etc.) could have handled all the work scheduled for the last minute if it were 1.73 times as fast.
In a system with four CPUs, a load average of 3.73 would indicate that there were, on average, 3.73 processes ready to run, and each one could be scheduled into a CPU.
On modern UNIX systems, the treatment of threading with respect to load averages varies. Some systems treat threads as processes for the purposes of load average calculation: each thread waiting to run will add 1 to the load. However, other systems, especially systems implementing so-called M:N threading, use different strategies such as counting the process exactly once for the purpose of load (regardless of the number of threads), or counting only threads currently exposed by the user-thread scheduler to the kernel, which may depend on the level of concurrency set on the process. Linux appears to count each thread separately as adding 1 to the load. [6]
The comparative study of different load indices carried out by Ferrari et al. [7] reported that CPU load information based upon the CPU queue length does much better in load balancing compared to CPU utilization. The reason CPU queue length did better is probably because when a host is heavily loaded, its CPU utilization is likely to be close to 100%, and it is unable to reflect the exact load level of the utilization. In contrast, CPU queue lengths can directly reflect the amount of load on a CPU. As an example, two systems, one with 3 and the other with 6 processes in the queue, are both very likely to have utilizations close to 100%, although they obviously differ.[ original research? ]
On Linux systems, the load-average is not calculated on each clock tick, but driven by a variable value that is based on the HZ frequency setting and tested on each clock tick. This setting defines the kernel clock tick rate in Hertz (times per second), and it defaults to 100 for 10ms ticks. Kernel activities use this number of ticks to time themselves. Specifically, the timer.c::calc_load() function, which calculates the load average, runs every LOAD_FREQ = (5*HZ+1) ticks, or about every five seconds:
unsignedlongavenrun[3];staticinlinevoidcalc_load(unsignedlongticks){unsignedlongactive_tasks;/* fixed-point */staticintcount=LOAD_FREQ;count-=ticks;if(count<0){count+=LOAD_FREQ;active_tasks=count_active_tasks();CALC_LOAD(avenrun[0],EXP_1,active_tasks);CALC_LOAD(avenrun[1],EXP_5,active_tasks);CALC_LOAD(avenrun[2],EXP_15,active_tasks);}}
The avenrun array contains 1-minute, 5-minute and 15-minute average. The CALC_LOAD
macro and its associated values are defined in sched.h:
#define FSHIFT 11 /* nr of bits of precision */#define FIXED_1 (1<<FSHIFT) /* 1.0 as fixed-point */#define LOAD_FREQ (5*HZ+1) /* 5 sec intervals */#define EXP_1 1884 /* 1/exp(5sec/1min) as fixed-point */#define EXP_5 2014 /* 1/exp(5sec/5min) */#define EXP_15 2037 /* 1/exp(5sec/15min) */#define CALC_LOAD(load,exp,n) \ load *= exp; \ load += n*(FIXED_1-exp); \ load >>= FSHIFT;
The "sampled" calculation of load averages is a somewhat common behavior; FreeBSD, too, only refreshes the value every five seconds. The interval is usually taken to not be exact so that they do not collect processes that are scheduled to fire at a certain moment. [8]
A post on the Linux mailing list considers its +1 tick insufficient to avoid Moire artifacts from such collection, and suggests an interval of 4.61 seconds instead. [9] This change is common among Android system kernels, although the exact expression used assumes an HZ of 100. [10]
Other commands for assessing system performance include:
uptime
– the system reliability and load average top
– for an overall system view vmstat
– vmstat reports information about runnable or blocked processes, memory, paging, block I/O, traps, and CPU. htop
– interactive process viewerdool
(formerly dstat
), [11] atop
– helps correlate all existing resource data for processes, memory, paging, block I/O, traps, and CPU activity. iftop
– interactive network traffic viewer per interfacenethogs
– interactive network traffic viewer per processiotop
– interactive I/O viewer [12] iostat
– for storage I/O statistics netstat
– for network statistics mpstat
– for CPU statisticstload
– load average graph for terminal xload
– load average graph for X/proc/loadavg
– text file containing load averageIn computing, a context switch is the process of storing the state of a process or thread, so that it can be restored and resume execution at a later point, and then restoring a different, previously saved, state. This allows multiple processes to share a single central processing unit (CPU), and is an essential feature of a multiprogramming or multitasking operating system. In a traditional CPU, each process - a program in execution - utilizes the various CPU registers to store data and hold the current state of the running process. However, in a multitasking operating system, the operating system switches between processes or threads to allow the execution of multiple processes simultaneously. For every switch, the operating system must save the state of the currently running process, followed by loading the next process state, which will run on the CPU. This sequence of operations that stores the state of the running process and the loading of the following running process is called a context switch.
Mach is a kernel developed at Carnegie Mellon University by Richard Rashid and Avie Tevanian to support operating system research, primarily distributed and parallel computing. Mach is often considered one of the earliest examples of a microkernel. However, not all versions of Mach are microkernels. Mach's derivatives are the basis of the operating system kernel in GNU Hurd and of Apple's XNU kernel used in macOS, iOS, iPadOS, tvOS, and watchOS.
OS-9 is a family of real-time, process-based, multitasking, multi-user operating systems, developed in the 1980s, originally by Microware Systems Corporation for the Motorola 6809 microprocessor. It was purchased by Radisys Corp in 2001, and was purchased again in 2013 by its current owner Microware LP.
In computing, a process is the instance of a computer program that is being executed by one or many threads. There are many different process models, some of which are light weight, but almost all processes are rooted in an operating system (OS) process which comprises the program code, assigned system resources, physical and logical access permissions, and data structures to initiate, control and coordinate execution activity. Depending on the OS, a process may be made up of multiple threads of execution that execute instructions concurrently.
QNX is a commercial Unix-like real-time operating system, aimed primarily at the embedded systems market.
In computer science, a semaphore is a variable or abstract data type used to control access to a common resource by multiple threads and avoid critical section problems in a concurrent system such as a multitasking operating system. Semaphores are a type of synchronization primitive. A trivial semaphore is a plain variable that is changed depending on programmer-defined conditions.
Processor affinity, or CPU pinning or "cache affinity", enables the binding and unbinding of a process or a thread to a central processing unit (CPU) or a range of CPUs, so that the process or thread will execute only on the designated CPU or CPUs rather than any CPU. This can be viewed as a modification of the native central queue scheduling algorithm in a symmetric multiprocessing operating system. Each item in the queue has a tag indicating its kin processor. At the time of resource allocation, each task is allocated to its kin processor in preference to others.
In computing, scheduling is the action of assigning resources to perform tasks. The resources may be processors, network links or expansion cards. The tasks may be threads, processes or data flows.
Uptime is a measure of system reliability, expressed as the percentage of time a machine, typically a computer, has been working and available. Uptime is the opposite of downtime.
In modern computers many processes run at once. Active processes are placed in an array called a run queue, or runqueue. The run queue may contain priority values for each process, which will be used by the scheduler to determine which process to run next. To ensure each program has a fair share of resources, each one is run for some time period (quantum) before it is paused and placed back into the run queue. When a program is stopped to let another run, the program with the highest priority in the run queue is then allowed to execute.
top is a task manager program, found in many Unix-like operating systems, that displays information about CPU and memory utilization.
Signals are standardized messages sent to a running program to trigger specific behavior, such as quitting or error handling. They are a limited form of inter-process communication (IPC), typically used in Unix, Unix-like, and other POSIX-compliant operating systems.
Micro-Controller Operating Systems is a real-time operating system (RTOS) designed by Jean J. Labrosse in 1991. It is a priority-based preemptive real-time kernel for microprocessors, written mostly in the programming language C. It is intended for use in embedded systems.
The proc filesystem (procfs) is a special filesystem in Unix-like operating systems that presents information about processes and other system information in a hierarchical file-like structure, providing a more convenient and standardized method for dynamically accessing process data held in the kernel than traditional tracing methods or direct access to kernel memory. Typically, it is mapped to a mount point named /proc at boot time. The proc file system acts as an interface to internal data structures about running processes in the kernel. In Linux, it can also be used to obtain information about the kernel and to change certain kernel parameters at runtime (sysctl).
In computer science, the event loop is a programming construct or design pattern that waits for and dispatches events or messages in a program. The event loop works by making a request to some internal or external "event provider", then calls the relevant event handler. The event loop is also sometimes referred to as the message dispatcher, message loop, message pump, or run loop.
CPU time is the amount of time for which a central processing unit (CPU) was used for processing instructions of a computer program or operating system, as opposed to elapsed time, which includes for example, waiting for input/output (I/O) operations or entering low-power (idle) mode. The CPU time is measured in clock ticks or seconds. Often, it is useful to measure CPU time as a percentage of the CPU's capacity, which is called the CPU usage. CPU time and CPU usage have two main uses.
The Completely Fair Scheduler (CFS) is a process scheduler that was merged into the 2.6.23 release of the Linux kernel and is the default scheduler of the tasks of the SCHED_NORMAL
class. It handles CPU resource allocation for executing processes, and aims to maximize overall CPU utilization while also maximizing interactive performance.
nmon is a computer performance system monitor tool for the AIX and Linux operating systems. The nmon tool has two modes a) displays the performance stats on-screen in a condensed format or b) the same stats are saved to a comma-separated values (CSV) data file for later graphing and analysis to aid the understanding of computer resource use, tuning options and bottlenecks.
A process is a program in execution, and an integral part of any modern-day operating system (OS). The OS must allocate resources to processes, enable processes to share and exchange information, protect the resources of each process from other processes and enable synchronization among processes. To meet these requirements, the OS must maintain a data structure for each process, which describes the state and resource ownership of that process, and which enables the OS to exert control over each process.
The Brain Fuck Scheduler (BFS) is a process scheduler designed for the Linux kernel in August 2009 based on earliest eligible virtual deadline first scheduling (EEVDF), as an alternative to the Completely Fair Scheduler (CFS) and the O(1) scheduler. BFS was created by an experienced kernel programmer Con Kolivas.
...Dag Wieers ceased development of Dstat...