Network scheduler

Last updated
Packets queuing in a FIFO (first in, first out) data structure. Data Queue.svg
Packets queuing in a FIFO (first in, first out) data structure.

A network scheduler, also called packet scheduler, queueing discipline, qdisc or queueing algorithm, is an arbiter on a node in packet switching communication network. It manages the sequence of network packets in the transmit and receive queues of the network interface controller. There are several network schedulers available for the different operating systems, that implement many of the existing network scheduling algorithms.

Arbiters are electronic devices that allocate access to shared resources.

In telecommunications networks, a node is either a redistribution point or a communication endpoint. The definition of a node depends on the network and protocol layer referred to. A physical network node is an active electronic device that is attached to a network, and is capable of creating, receiving, or transmitting information over a communications channel. A passive distribution point such as a distribution frame or patch panel is consequently not a node.

Packet switching a method of grouping data which is transmitted over a digital network into packets

Packet switching is a method of grouping data that is transmitted over a digital network into packets. Packets are made of a header and a payload. Data in the header are used by networking hardware to direct the packet to its destination where the payload is extracted and used by application software. Packet switching is the primary basis for data communications in computer networks worldwide.

Contents

The network scheduler logic decides which network packet to forward next. The network scheduler is associated with a queuing system, storing the network packets temporarily until they are transmitted. Systems may have a single or multiple queues in which case each may hold the packets of one flow, classification, or priority.

In packet switching networks, traffic flow, packet flow or network flow is a sequence of packets from a source computer to a destination, which may be another host, a multicast group, or a broadcast domain. RFC 2722 defines traffic flow as "an artificial logical equivalent to a call or connection." RFC 3697 defines traffic flow as "a sequence of packets sent from a particular source to a particular unicast, anycast, or multicast destination that the source desires to label as a flow. A flow could consist of all packets in a specific transport connection or a media stream. However, a flow is not necessarily 1:1 mapped to a transport connection." Flow is also defined in RFC 3917 as "a set of IP packets passing an observation point in the network during a certain time interval."

In some cases it may not be possible to schedule all transmissions within the constraints of the system. In these cases the network scheduler is responsible for deciding which traffic to forward and what gets dropped.

Packet loss occurs when one or more packets of data travelling across a computer network fail to reach their destination. Packet loss is either caused by errors in data transmission, typically across wireless networks, or network congestion. Packet loss is measured as a percentage of packets lost with respect to packets sent.

Terminology and responsibilities

A network scheduler may have responsibility in implementation of specific network traffic control initiatives. Network traffic control is an umbrella term for all measures aimed at reducing congest, latency and packet loss. Specifically, active queue management (AQM) is the selective dropping of queued network packets to achieve the larger goal of preventing excessive network congestion. The scheduler must choose which packets to drop. Traffic shaping smooths the bandwidth requirements of traffic flows by delaying transmission packets when they are queued in bursts. The scheduler decides the timing for the transmitted packets. quality of service (QoS) is the prioritization of traffic based on service class (Differentiated services) or reserved connection (Integrated services).

In computer networking, network traffic control is the process of managing, controlling or reducing the network traffic, particularly Internet bandwidth, e.g. by the network scheduler. It is used by network administrators, to reduce congestion, latency and packet loss. This is part of bandwidth management. In order to use these tools effectively, it is necessary to measure the network traffic to determine the causes of network congestion and attack those problems specifically.

In routers and switches, active queue management (AQM) is the policy of dropping packets inside a buffer associated with a network interface controller (NIC) before that buffer becomes full, often with the goal of reducing network congestion or improving end-to-end latency. This task is performed by the network scheduler, which for this purpose uses various algorithms such as random early detection (RED), Explicit Congestion Notification (ECN), or controlled delay (CoDel). RFC 7567 recommends active queue management as a best practice.

Traffic shaping is a bandwidth management technique used on computer networks which delays some or all datagrams to bring them into compliance with a desired traffic profile. Traffic shaping is used to optimize or guarantee performance, improve latency, or increase usable bandwidth for some kinds of packets by delaying other kinds. It is often confused with traffic policing, the distinct but related practice of packet dropping and packet marking.

Algorithms

In the course of time many network queueing disciplines have been developed. Each of these provides specific reordering or dropping of network packets inside various transmit or receive buffers. [1] [2] Queuing disciplines are commonly used as attempts to compensate for various networking conditions, like reducing the latency for certain classes of network packets, and are generally used as part of QoS measures. [3] [4] [5]

In computer science, a data buffer is a region of a physical memory storage used to temporarily store data while it is being moved from one place to another. Typically, the data is stored in a buffer as it is retrieved from an input device or just before it is sent to an output device. However, a buffer may be used when moving data between processes within a computer. This is comparable to buffers in telecommunication. Buffers can be implemented in a fixed memory location in hardware—or by using a virtual data buffer in software, pointing at a location in the physical memory. In all cases, the data stored in a data buffer are stored on a physical storage medium. A majority of buffers are implemented in software, which typically use the faster RAM to store temporary data, due to the much faster access time compared with hard disk drives. Buffers are typically used when there is a difference between the rate at which data is received and the rate at which it can be processed, or in the case that these rates are variable, for example in a printer spooler or in online video streaming. In the distributed computing environment, data buffer is often implemented in the form of burst buffer that provides distributed buffering service.

Examples of algorithms suitable for managing network traffic include:

Class-based queuing (CBQ) is a queuing discipline for the network scheduler that allows traffic to share bandwidth equally, after being grouped by classes. The classes can be based upon a variety of parameters, such as priority, interface, or originating program.

In network routing, CoDel for controlled delay is a scheduling algorithm for the network scheduler developed by Van Jacobson and Kathleen Nichols. It is designed to overcome bufferbloat in networking hardware, such as routers, by setting limits on the delay network packets experience as they pass through buffers in this equipment. CoDel aims to improve on the overall performance of the random early detection (RED) algorithm by addressing some of its fundamental misconceptions, as perceived by Jacobson, and by being easier to manage.

Several of the above have been implemented as Linux kernel modules [18] and are freely available.

Bufferbloat

Bufferbloat is a phenomenon in packet-switched networks in which excess buffering of packets causes high latency and packet delay variation. Bufferbloat can be addressed by a network scheduler that strategically discards packets to avoid an unnecessarily high buffering backlog. Examples include CoDel and Random early detection.

Implementations

Linux kernel

The Linux kernel's packet scheduler is part of the network stack, together with netfilter, nftables, and Berkeley Packet Filter. Simplified Structure of the Linux Kernel.svg
The Linux kernel's packet scheduler is part of the network stack, together with netfilter, nftables, and Berkeley Packet Filter.

The Linux kernel packet scheduler is an integral part of the Linux kernel's network stack and manages the transmit and receive ring buffers of all NICs, by working on the layer 2 of the OSI model and handling Ethernet frames, for example.

The packet scheduler is configured using the utility called tc (short for "traffic control"). As the default queuing discipline, the packet scheduler uses a FIFO implementation called pfifo_fast, [19] although systemd since its version 217 changes the default queuing discipline to fq_codel. [20]

The ifconfig and ip utilities enable system administrators to configure the buffer sizes txqueuelen and rxqueuelen for each device separately in terms of number of Ethernet frames regardless of their size. The Linux kernel's network stack contains several other buffers, which are not managed by the network scheduler. [lower-alpha 1]

Berkeley Packet Filter filters can be attached to the packet scheduler's classifiers. The eBPF functionality brought by version 4.1 of the Linux kernel in 2015 extends the classic BPF programmable classifiers to eBPF. [21] These can be compiled using the LLVM eBPF backend and loaded into a running kernel using the tc utility. [22]

BSD and OpenBSD

ALTQ is the implementation of a network scheduler for BSDs. As of OpenBSD version 5.5 ALTQ was replaced by the HFSC scheduler.

See also

Notes

  1. The overall size of all buffers has been the point of critique by the Bufferbloat project, which provided a partial solution with CoDel that has been primarily tested in OpenWrt.

Related Research Articles

Quality of service (QoS) is the description or measurement of the overall performance of a service, such as a telephony or computer network or a cloud computing service, particularly the performance seen by the users of the network. To quantitatively measure quality of service, several related aspects of the network service are often considered, such as packet loss, bit rate, throughput, transmission delay, availability, jitter, etc.

In computing, scheduling is the method by which work is assigned to resources that complete the work. The work may be virtual computation elements such as threads, processes or data flows, which are in turn scheduled onto hardware resources such as processors, network links or expansion cards.

Network interface controller hardware component that connects a computer to a computer network

A network interface controller is a computer hardware component that connects a computer to a computer network.

Explicit Congestion Notification (ECN) is an extension to the Internet Protocol and to the Transmission Control Protocol and is defined in RFC 3168 (2001). ECN allows end-to-end notification of network congestion without dropping packets. ECN is an optional feature that may be used between two ECN-enabled endpoints when the underlying network infrastructure also supports it.

Network congestion in data networking and queueing theory is the reduced quality of service that occurs when a network node or link is carrying more data than it can handle. Typical effects include queueing delay, packet loss or the blocking of new connections. A consequence of congestion is that an incremental increase in offered load leads either only to a small increase or even a decrease in network throughput.

Deficit Round Robin (DRR), also Deficit Weighted Round Robin (DWRR), is a scheduling algorithm for the network scheduler. DRR is, like weighted fair queuing (WFQ), a packet-based implementation of the ideal Generalized Processor Sharing (GPS) policy. It was proposed by M. Shreedhar and G. Varghese in 1995 as an efficient and fair algorithm.

Netfilter is a framework provided by the Linux kernel that allows various networking-related operations to be implemented in the form of customized handlers. Netfilter offers various functions and operations for packet filtering, network address translation, and port translation, which provide the functionality required for directing packets through a network and prohibiting packets from reaching sensitive locations within a network.

The token bucket is an algorithm used in packet switched computer networks and telecommunications networks. It can be used to check that data transmissions, in the form of packets, conform to defined limits on bandwidth and burstiness. It can also be used as a scheduling algorithm to determine the timing of transmissions that will comply with the limits set for the bandwidth and burstiness: see network scheduler.

Random early detection

Random early detection (RED), also known as random early discard or random early drop is a queuing discipline for a network scheduler suited for congestion avoidance.

Transmission Control Protocol (TCP) uses a network congestion-avoidance algorithm that includes various aspects of an additive increase/multiplicative decrease (AIMD) scheme, with other schemes such as slow start and congestion window to achieve congestion avoidance. The TCP congestion-avoidance algorithm is the primary basis for congestion control in the Internet. Per the end-to-end principle, congestion control is largely a function of internet hosts, not the network itself. There are several variations and versions of the algorithm implemented in protocol stacks of operating systems of computers that connect to the Internet.

l7-filter

l7-filter is a software package which provides a classifier for Linux's Netfilter subsystem which can categorize Internet Protocol packets based on their application layer data. The major goal of this tool is to make possible the identification of peer-to-peer programs, which use unpredictable port numbers. There are two versions for this software. The first is implemented as a kernel module for Linux 2.4 and 2.6. The second experimental version was released in December 2006 which runs as a user-space program and relies on netfilter's user-space libraries for the classification process.

Fair queuing is a family of scheduling algorithms used in some process and network schedulers. The algorithm is designed to achieve fairness when a limited resource is shared, for example to prevent flows with large packets or processes that generate small jobs from consuming more throughput or CPU time than other flows or processes.

Bandwidth management is the process of measuring and controlling the communications on a network link, to avoid filling the link to capacity or overfilling the link, which would result in network congestion and poor performance of the network. Bandwidth is measured in bits per second (bit/s) or bytes per second (B/s).

The Berkeley Packet Filter (BPF) provides a raw interface to data link layers, permitting raw link-layer packets to be sent and received. It is available on most Unix-like operating systems. In addition, if the driver for the network interface supports promiscuous mode, it allows the interface to be put into that mode so that all packets on the network can be received, even those destined to other hosts.

Blue is a scheduling discipline for the network scheduler developed by graduate student Wu-chang Feng for Professor Kang G. Shin at the University of Michigan and others at the Thomas J. Watson Research Center of IBM in 1999.

Bufferbloat is a cause of high latency in packet-switched networks caused by excess buffering of packets. Bufferbloat can also cause packet delay variation, as well as reduce the overall network throughput. When a router or switch is configured to use excessively large buffers, even very high-speed networks can become practically unusable for many interactive applications like voice over IP (VoIP), online gaming, and even ordinary web surfing.

Open vSwitch

Open vSwitch, sometimes abbreviated as OVS, is an open-source implementation of a distributed virtual multilayer switch. The main purpose of Open vSwitch is to provide a switching stack for hardware virtualization environments, while supporting multiple protocols and standards used in computer networks.

References

  1. "Traffic Control HOWTO: Classless Queuing Disciplines (qdiscs)". tldp.org. Retrieved November 24, 2013.
  2. Saravanan Radhakrishnan (September 30, 1999). "QoS Support in Linux: Queuing Disciplines". qos.ittc.ku.edu. Retrieved March 18, 2014.
  3. "Traffic Control HOWTO: Components of Linux Traffic Control". tldp.org. Retrieved November 24, 2013.
  4. "Traffic Control HOWTO: Traditional Elements of Traffic Control". tldp.org. Retrieved November 24, 2013.
  5. "Queuing Disciplines: Order of Packet Transmission and Dropping" (PDF). tau.ac.il. October 25, 2006. Retrieved March 18, 2014.
  6. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.108.4477&rep=rep1&type=pdf
  7. "Let them run CAKE". LWN.net.
  8. "DRR Linux kernel network scheduler module". kernel.org . Retrieved 2013-09-07.
  9. "FavorQueue: a Parameterless Active Queue Management to Improve TCP Traffic Performance" (PDF).
  10. "Heavy-Hitter Filter qdisc". kernel.org.
  11. "HTB Linux kernel network scheduler module". kernel.org . Retrieved 2013-09-07.
  12. "QFQ Linux kernel network scheduler module". kernel.org . Retrieved 2013-09-07.
  13. "Fair Queue packet scheduler committed to Linux kernel 3.12".
  14. "Network emulator Linux kernel network scheduler module". kernel.org . Retrieved 2013-09-07.
  15. "Proportional Integral controller Enhanced (PIE)". kernel.org.
  16. "SFQ Linux kernel network scheduler module". kernel.org . Retrieved 2013-09-07.
  17. "TBF Linux kernel network scheduler module". kernel.org . Retrieved 2013-09-07.
  18. "The Linux kernel network scheduler". kernel.org. 2012-12-26. Retrieved 2013-09-07.
  19. "Linux Advanced Routing and Traffic Control HOWTO, Section 9.2.1. pfifo_fast". lartc.org. 2012-05-19. Retrieved 2014-09-19.
  20. "systemd System and Service Manager: NEWS file". freedesktop.org. 2015-05-22. Retrieved 2015-06-09.
  21. "Linux kernel 4.1, Section 11. Networking". kernelnewbies.org. 2015-06-21.
  22. "BPF and XDP Reference Guide". Cilium documentation web site.