O(1) scheduler

Last updated December 19, 2024

An O(1) scheduler (pronounced "O of 1 scheduler", "Big O of 1 scheduler", or "constant time scheduler") is a kernel scheduling design that can schedule processes within a constant amount of time, regardless of how many processes are running on the operating system. This is an improvement over previously used O(n) schedulers, which schedule processes in an amount of time that scales linearly based on the amounts of inputs.

Overview

The Linux scheduler was overhauled completely with the release of kernel 2.6 in 2003.^[1]^[2] The new scheduler was called the O(1) scheduler. The algorithm used by the O(1) scheduler relies on active and expired arrays of processes to achieve constant scheduling time. Each process is given a fixed time quantum, after which it is preempted and moved to the expired array. Once all the tasks from the active array have exhausted their time quantum and have been moved to the expired array, an array switch takes place. Because the arrays are accessed only via pointer, switching them is as fast as swapping two pointers.^[3] This switch makes the active array the new empty expired array, while the expired array becomes the active array.

About O(1) notation

An algorithm operates over an input, and the size of that input usually determines its running time. Big O notation is used to denote the growth rate of an algorithm's execution time based on the amount of input. For example, the running time of an O(n) algorithm increases linearly as the input size n grows.^[4] The running time of an O(n²) algorithm grows quadratically. If it is possible to establish a constant upper bound on the running time of an algorithm, it is considered to be O(1) (one might say it runs in "constant time"). That is, an O(1) algorithm is guaranteed to complete in a certain amount of time regardless of the size of the input.^[5]

Improvement in Linux scheduler performance

The Linux 2.6.8.1 scheduler did not contain any algorithms that run in worse than O(1) time.^[6] That is, every part of the scheduler is guaranteed to execute within a certain constant amount of time regardless of how many tasks are on the system. This allows the Linux kernel to efficiently handle massive numbers of tasks without increasing overhead costs as the number of tasks grows. There are two key data structures in the Linux 2.6.8.1 scheduler that allow for it to perform its duties in O(1) time, and its design revolves around them: runqueues and priority arrays.

Issues

The main issue with this algorithm is the complex heuristics used to mark a task as interactive or non-interactive. The algorithm tries to identify interactive processes by analyzing average sleep time (the amount of time the process spends waiting for input). Processes that sleep for long periods of time probably are waiting for user input, so the scheduler assumes they're interactive. The scheduler gives a priority bonus to interactive tasks (for better throughput) while penalizing non-interactive tasks by lowering their priorities. All the calculations to determine the interactivity of tasks are complex and subject to potential miscalculations,^{[ citation needed ]} causing non-interactive behavior from an interactive process.

Replacement

In 2.6.23 (October 2007), the Completely Fair Scheduler was introduced, replacing the O(1) Scheduler. According to Ingo Molnar, the author of the CFS, its core design can be summed up in single sentence: "CFS basically models an 'ideal, precise multitasking CPU' on real hardware."^[7]

Related Research Articles

<span class="mw-page-title-main">Computer multitasking</span> Concurrent execution of multiple processes

In computing, multitasking is the concurrent execution of multiple tasks over a certain period of time. New tasks can interrupt already started ones before they finish, instead of waiting for them to end. As a result, a computer executes segments of multiple tasks in an interleaved manner, while the tasks share common processing resources such as central processing units (CPUs) and main memory. Multitasking automatically interrupts the running program, saving its state and loading the saved state of another program and transferring control to it. This "context switch" may be initiated at fixed time intervals, or the running program may be coded to signal to the supervisory software when it can be interrupted.

In computing, a context switch is the process of storing the state of a process or thread, so that it can be restored and resume execution at a later point, and then restoring a different, previously saved, state. This allows multiple processes to share a single central processing unit (CPU), and is an essential feature of a multiprogramming or multitasking operating system. In a traditional CPU, each process - a program in execution - utilizes the various CPU registers to store data and hold the current state of the running process. However, in a multitasking operating system, the operating system switches between processes or threads to allow the execution of multiple processes simultaneously. For every switch, the operating system must save the state of the currently running process, followed by loading the next process state, which will run on the CPU. This sequence of operations that stores the state of the running process and loads the following running process is called a context switch.

An operating system (OS) is system software that manages computer hardware and software resources, and provides common services for computer programs.

In computing, a process is the instance of a computer program that is being executed by one or many threads. There are many different process models, some of which are light weight, but almost all processes are rooted in an operating system (OS) process which comprises the program code, assigned system resources, physical and logical access permissions, and data structures to initiate, control and coordinate execution activity. Depending on the OS, a process may be made up of multiple threads of execution that execute instructions concurrently.

In computer science, a thread of execution is the smallest sequence of programmed instructions that can be managed independently by a scheduler, which is typically a part of the operating system. In many cases, a thread is a component of a process.

In computer science, algorithmic efficiency is a property of an algorithm which relates to the amount of computational resources used by the algorithm. Algorithmic efficiency can be thought of as analogous to engineering productivity for a repeating or continuous process.

<span class="mw-page-title-main">Load (computing)</span> Amount of computational work that a computer system performs

In UNIX computing, the system load is a measure of the amount of computational work that a computer system performs. The load average represents the average system load over a period of time. It conventionally appears in the form of three numbers which represent the system load during the last one-, five-, and fifteen-minute periods.

In computing, scheduling is the action of assigning resources to perform tasks. The resources may be processors, network links or expansion cards. The tasks may be threads, processes or data flows.

In modern computers many processes run at once. Active processes are placed in an array called a run queue, or runqueue. The run queue may contain priority values for each process, which will be used by the scheduler to determine which process to run next. To ensure each program has a fair share of resources, each one is run for some time period (quantum) before it is paused and placed back into the run queue. When a program is stopped to let another run, the program with the highest priority in the run queue is then allowed to execute.

nice is a program found on Unix and Unix-like operating systems such as Linux. It directly maps to a kernel call of the same name. nice is used to invoke a utility or shell script with a particular CPU priority, thus giving the process more or less CPU time than other processes. A niceness of -20 is the lowest niceness, or highest priority. The default niceness for processes is inherited from its parent process and is usually 0.

In computing, preemption is the act of temporarily interrupting an executing task, with the intention of resuming it at a later time. This interrupt is done by an external scheduler with no assistance or cooperation from the task. This preemptive scheduler usually runs in the most privileged protection ring, meaning that interruption and then resumption are considered highly secure actions. Such changes to the currently executing task of a processor are known as context switching.

<span class="mw-page-title-main">CPU time</span> Time used by a computer

CPU time is the amount of time that a central processing unit (CPU) was used for processing instructions of a computer program or operating system. CPU time is measured in clock ticks or seconds. Sometimes it is useful to convert CPU time into a percentage of the CPU capacity, giving the CPU usage.

AQuoSA is an open architecture for the provisioning of adaptive Quality of Service functionality into the Linux kernel. The project features a flexible, portable, lightweight and open architecture for supporting QoS related services on the top of a general-purpose operating system as Linux. The architecture is well founded on formal scheduling analysis and control theoretical results.

The Completely Fair Scheduler (CFS) was a process scheduler that was merged into the 2.6.23 release of the Linux kernel. It was the default scheduler of the tasks of the SCHED_NORMAL class and handled CPU resource allocation for executing processes, aiming to maximize overall CPU utilization while also maximizing interactive performance.

Con Kolivas is a Greek-Australian anaesthetist. He has worked as a computer programmer on the Linux kernel and on the development of the cryptographic currency mining software CGMiner. His Linux contributions include patches for the kernel to improve its desktop performance, particularly reducing I/O impact.

The Brain Fuck Scheduler (BFS) is a process scheduler designed for the Linux kernel in August 2009 based on earliest eligible virtual deadline first scheduling (EEVDF), as an alternative to the Completely Fair Scheduler (CFS) and the O(1) scheduler. BFS was created by Con Kolivas.

The Slurm Workload Manager, formerly known as Simple Linux Utility for Resource Management (SLURM), or simply Slurm, is a free and open-source job scheduler for Linux and Unix-like kernels, used by many of the world's supercomputers and computer clusters.

SCHED_DEADLINE is a CPU scheduler available in the Linux kernel since version 3.14, based on the earliest deadline first (EDF) and constant bandwidth server (CBS) algorithms, supporting resource reservations: each task scheduled under such policy is associated with a budget Q, and a period P, corresponding to a declaration to the kernel that Q time units are required by that task every P time units, on any processor. This makes SCHED_DEADLINE particularly suitable for real-time applications, like multimedia or industrial control, where P corresponds to the minimum time elapsing between subsequent activations of the task, and Q corresponds to the worst-case execution time needed by each activation of the task.

The O(n) scheduler is the scheduler used in the Linux kernel between versions 2.4 and 2.6. Since version 2.6.0, it has been replaced by the O(1) scheduler and in 2.6.23 by the current Completely Fair Scheduler (CFS).

Earliest eligible virtual deadline first (EEVDF) is a dynamic priority proportional share scheduling algorithm for soft real-time systems.

References

↑ "Introducing the 2.6 Kernel | Linux Journal". www.linuxjournal.com. Retrieved 2019-07-19.
↑ Chandandeep Singh Pabla (August 1, 2009). "Completely Fair Scheduler". linux journal. Retrieved 2014-09-09.
↑ Robert Love. "The Linux Process Scheduler" . Retrieved 2014-09-09.
↑ dws. "An informal introduction to O(N) notation" . Retrieved 2014-09-09.
↑ Rob Bell. "A Beginner's Guide to Big O Notation" . Retrieved 2014-09-09.
↑ Josh Aas. "Understanding the Linux 2.6.8.1 CPU Scheduler" (PDF). GitHub . Retrieved 2014-09-09.
↑ <mingo@elte.hu>, Ingo Molnar. "Linux-Kernel Archive: Re: fair clock use in CFS". lkml.iu.edu. Retrieved 2018-08-30.

External links

Understanding the Linux 2.6.8.1 CPU Scheduler; Josh Aas, 17 February 2005
HybridThreads (Hthreads) Archived 2008-05-11 at the Wayback Machine ; a HW/SW co-designed POSIX-compatible OS featuring an O(1) scheduler implemented in hardware
Inside the Linux scheduler; by M. Tim Jones, an IBM developerWorks article
David Mosberger. "A closer look at the Linux O(1) scheduler". HP Research Labs. Archived from the original on 16 March 2012.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] "Introducing the 2.6 Kernel | Linux Journal". www.linuxjournal.com. Retrieved 2019-07-19.

[2] Chandandeep Singh Pabla (August 1, 2009). "Completely Fair Scheduler". linux journal. Retrieved 2014-09-09.

[3] Robert Love. "The Linux Process Scheduler" . Retrieved 2014-09-09.

[4] ws. "An informal introduction to O(N) notation" . Retrieved 2014-09-09.

[5] Rob Bell. "A Beginner's Guide to Big O Notation" . Retrieved 2014-09-09.

[6] Josh Aas. "Understanding the Linux 2.6.8.1 CPU Scheduler" (PDF). GitHub . Retrieved 2014-09-09.

[7] <mingo@elte.hu>, Ingo Molnar. "Linux-Kernel Archive: Re: fair clock use in CFS". lkml.iu.edu. Retrieved 2018-08-30.

[1]

[2]

[3]

[4]

[5]

[6]

[7]