Completely fair queueing

Last updated September 23, 2024

Completely Fair Queuing (CFQ) is an I/O scheduler for the Linux kernel which was written in 2003 by Jens Axboe.^[1]

Description

CFQ places synchronous requests submitted by processes into a number of per-process queues and then allocates timeslices for each of the queues to access the disk. The length of the time slice and the number of requests a queue is allowed to submit depends on the I/O priority of the given process. Asynchronous requests for all processes are batched together in fewer queues, one per priority. While CFQ does not do explicit anticipatory I/O scheduling, it achieves the same effect of having good aggregate throughput for the system as a whole, by allowing a process queue to idle at the end of synchronous I/O thereby "anticipating" further close I/O from that process. It can be considered a natural extension of granting I/O time slices to a process.

History

Prior to the integration

In February 2003 Andrea Arcangeli put forward his idea for a Stochastic Fair Queueing I/O scheduler to Jens Axboe who then implemented it. Jens Axboe made improvements to his first implementation, calling the new version the Completely Fair Queueing scheduler, and produced a patch to apply it to the 2.5.60 development series kernel.

Kernel 2.6.6 (10 May 2004)

The CFQ I/O scheduler was first integrated into the mainline kernel as an optional I/O scheduler. It was possible to change the scheduler at boot time with the 'elevator' parameter to kernel.

Kernel 2.6.9 (19 October 2004)

Red Hat Enterprise Linux 4 used this I/O scheduler as the default even though it used a kernel based on a 2.6.9.^[2]

Kernel 2.6.10 (24 December 2004)

The second release of the CFQ scheduler dubbed CFQv2 is included in the 2.6.10, improvements include better responsiveness and the elimination of some starvation issues which were present in the earlier version. The scheduler now is also switchable at run time by modifying the /sys/block/<block_device>/queue/scheduler variable in the sysfs filesystem.

Kernel 2.6.13 (27 June 2005)

CFQ scheduler moved to a new time sliced design dubbed CFQv3. Among other things, it implements ioprio_get(2) and ioprio_set(2) which allows user to set per-process I/O priorities, usually using ionice(1) command (although using nice(1) also modifies I/O priorities somewhat).

Kernel 2.6.18 (20 September 2006)

CFQ became the default scheduler, replacing the anticipatory scheduler.^[3]

Kernel 5.0 (2019-03-03)

CFQ has been removed.^[4]^[5] CFQ evolved into Budget Fair Queueing (BFQ).^[6]^[7]

Related Research Articles

In computing, scheduling is the action of assigning resources to perform tasks. The resources may be processors, network links or expansion cards. The tasks may be threads, processes or data flows.

In computing, a futex is a kernel system call that programmers can use to implement basic locking, or as a building block for higher-level locking abstractions such as semaphores and POSIX mutexes or condition variables.

QIO is a term used in several computer operating systems designed by the former Digital Equipment Corporation (DEC) of Maynard, Massachusetts.

In computer science, asynchronous I/O is a form of input/output processing that permits other processing to continue before the I/O operation has finished. A name used for asynchronous I/O in the Windows API is overlapped I/O.

OS-level virtualization is an operating system (OS) virtualization paradigm in which the kernel allows the existence of multiple isolated user space instances, including containers, zones, virtual private servers (OpenVZ), partitions, virtual environments (VEs), virtual kernels, and jails. Such instances may look like real computers from the point of view of programs running in them. A computer program running on an ordinary operating system can see all resources of that computer. Programs running inside a container can only see the container's contents and devices assigned to the container.

OpenVZ is an operating-system-level virtualization technology for Linux. It allows a physical server to run multiple isolated operating system instances, called containers, virtual private servers (VPSs), or virtual environments (VEs). OpenVZ is similar to Solaris Containers and LXC.

Anticipatory scheduling is an algorithm for scheduling hard disk input/output. It seeks to increase the efficiency of disk utilization by "anticipating" future synchronous read operations.

Jens Axboe is a Linux kernel hacker.

Input/output (I/O) scheduling is the method that computer operating systems use to decide in which order I/O operations will be submitted to storage volumes. I/O scheduling is sometimes called disk scheduling.

The Completely Fair Scheduler (CFS) was a process scheduler that was merged into the 2.6.23 release of the Linux kernel. It was the default scheduler of the tasks of the SCHED_NORMAL class and handled CPU resource allocation for executing processes, aiming to maximize overall CPU utilization while also maximizing interactive performance.

The NOOP scheduler is the simplest I/O scheduler for the Linux kernel. This scheduler was developed by Jens Axboe.

Deadline Scheduler is an I/O scheduler, or disk scheduler, for the Linux kernel. It was written in 2002 by Jens Axboe.

Fusion-io, Inc. was a computer hardware and software systems company based in Cottonwood Heights, Utah, that designed and manufactured products using flash memory technology. The Fusion ioMemory was marketed for applications such as databases, virtualization, cloud computing, big data. Their ioDrive product was considered around 2011 to be one of the fastest storage devices on the market.

The Brain Fuck Scheduler (BFS) is a process scheduler designed for the Linux kernel in August 2009 based on earliest eligible virtual deadline first scheduling (EEVDF), as an alternative to the Completely Fair Scheduler (CFS) and the O(1) scheduler. BFS was created by Con Kolivas.

SCHED_DEADLINE is a CPU scheduler available in the Linux kernel since version 3.14, based on the earliest deadline first (EDF) and constant bandwidth server (CBS) algorithms, supporting resource reservations: each task scheduled under such policy is associated with a budget Q, and a period P, corresponding to a declaration to the kernel that Q time units are required by that task every P time units, on any processor. This makes SCHED_DEADLINE particularly suitable for real-time applications, like multimedia or industrial control, where P corresponds to the minimum time elapsing between subsequent activations of the task, and Q corresponds to the worst-case execution time needed by each activation of the task.

The O(n) scheduler is the scheduler used in the Linux kernel between versions 2.4 and 2.6. Since version 2.6.0, it has been replaced by the O(1) scheduler and in 2.6.23 by the current Completely Fair Scheduler (CFS).

Earliest deadline first (EDF) or least time to go is a dynamic priority scheduling algorithm used in real-time operating systems to place processes in a priority queue. Whenever a scheduling event occurs the queue will be searched for the process closest to its deadline. This process is the next to be scheduled for execution.

bcache is a cache mechanism in the Linux kernel's block layer, which is used for accessing secondary storage devices. It allows one or more fast storage devices, such as flash-based solid-state drives (SSDs), to act as a cache for one or more slower storage devices, such as hard disk drives (HDDs); this effectively creates hybrid volumes and provides performance improvements.

dm-cache is a component of the Linux kernel's device mapper, which is a framework for mapping block devices onto higher-level virtual block devices. It allows one or more fast storage devices, such as flash-based solid-state drives (SSDs), to act as a cache for one or more slower storage devices such as hard disk drives (HDDs); this effectively creates hybrid volumes and provides secondary storage performance improvements.

io_uring is a Linux kernel system call interface for storage device asynchronous I/O operations addressing performance issues with similar interfaces provided by functions like read /write or aio_read /aio_write etc. for operations on data accessed by file descriptors.

References

↑ "Source code of the IO scheduler, (contains copyright information in header)" . Retrieved 28 December 2017.
↑ D. John Shakshober (June 2005). "Choosing an I/O Scheduler for Red Hat® Enterprise Linux® 4 and the 2.6 Kernel". Red Hat magazine. Archived from the original on 27 August 2007. Retrieved 20 November 2011.
↑ Jens Axboe (June 2006). "Linux Kernel 2.6.18 - Make CFQ the default IO scheduler" . Retrieved 20 March 2016.
↑ Jens Axboe (2018-10-12). "block: remove legacy IO schedulers" . Retrieved 2020-10-25.
↑ Linus Torvalds (2018-12-28). "Merge tag 'for-4.21/block-20181221' of git.kernel.dk/linux-block" . Retrieved 2020-10-25.
↑ "Budget Fair Queueing I/O Scheduler".
↑ "BFQ I/O Scheduler Queued For Linux 4.12 - Phoronix". www.phoronix.com.

Sources

Short detail about CFQ (archived from here)
Coming in 2.6.10

External links

Documentation of CFQ from the Linux source code
CFQ - Tuning guide from RedHat's knowledge base

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] "Source code of the IO scheduler, (contains copyright information in header)" . Retrieved 28 December 2017.

[2] D. John Shakshober (June 2005). "Choosing an I/O Scheduler for Red Hat® Enterprise Linux® 4 and the 2.6 Kernel". Red Hat magazine. Archived from the original on 27 August 2007. Retrieved 20 November 2011.

[3] Jens Axboe (June 2006). "Linux Kernel 2.6.18 - Make CFQ the default IO scheduler" . Retrieved 20 March 2016.

[4] Jens Axboe (2018-10-12). "block: remove legacy IO schedulers" . Retrieved 2020-10-25.

[5] Linus Torvalds (2018-12-28). "Merge tag 'for-4.21/block-20181221' of git.kernel.dk/linux-block" . Retrieved 2020-10-25.

[6] "Budget Fair Queueing I/O Scheduler".

[7] "BFQ I/O Scheduler Queued For Linux 4.12 - Phoronix". www.phoronix.com.

[1]

[2]

[3]

[4]

[5]

[6]

[7]