Fair queuing

Last updated

Fair queuing is a family of scheduling algorithms used in some process and network schedulers. The algorithm is designed to achieve fairness when a limited resource is shared, for example to prevent flows with large packets or processes that generate small jobs from consuming more throughput or CPU time than other flows or processes.

Contents

Fair queuing is implemented in some advanced network switches and routers.

History

The term fair queuing was coined by John Nagle in 1985 while proposing round-robin scheduling in the gateway between a local area network and the internet to reduce network disruption from badly-behaving hosts. [1] [2] [3]

A byte-weighted version was proposed by Alan Demers, Srinivasan Keshav and Scott Shenker in 1989, and was based on the earlier Nagle fair queuing algorithm. [4] [5] The byte-weighted fair queuing algorithm aims to mimic a bit-per-bit multiplexing by computing theoretical departure date for each packet.

The concept has been further developed into weighted fair queuing, and the more general concept of traffic shaping, where queuing priorities are dynamically controlled to achieve desired flow quality of service goals or accelerate some flows.

Principle

Fair queuing uses one queue per packet flow and services them in rotation, such that each flow can "obtain an equal fraction of the resources". [1] [2]

The advantage over conventional first in first out (FIFO) or priority queuing is that a high-data-rate flow, consisting of large packets or many data packets, cannot take more than its fair share of the link capacity.

Fair queuing is used in routers, switches, and statistical multiplexers that forward packets from a buffer. The buffer works as a queuing system, where the data packets are stored temporarily until they are transmitted.

With a link data-rate of R, at any given time the N active data flows (the ones with non-empty queues) are serviced each with an average data rate of R/N. In a short time interval the data rate may fluctuate around this value since the packets are delivered sequentially in turn.

Fairness

In the context of network scheduling, fairness has multiple definitions. Nagel's article uses round-robin scheduling of packets, [2] which is fair in terms of the number of packets, but not on the bandwidth use when packets have varying size. Several formal notions of fairness measure have been defined including max-min fairness, worst-case fairness, [6] and fairness index. [7]

Generalisation to weighted sharing

The initial idea gives to each flow the same rate. A natural extension consists in letting the user specify the portion of bandwidth allocated to each flow leading to weighted fair queuing and generalized processor sharing.

A byte-weighted fair queuing algorithm

This algorithm attempts to emulate the fairness of bitwise round-robin sharing of link resources among competing flows. Packet-based flows, however, must be transmitted packetwise and in sequence. The byte-weighted fair queuing algorithm selects transmission order for the packets by modeling the finish time for each packet as if they could be transmitted bitwise round robin. The packet with the earliest finish time according to this modeling is the next selected for transmission.

The complexity of the algorithm is O(log(n)), where n is the number of queues/flows.

Algorithm details

Modeling of actual finish time, while feasible, is computationally intensive. The model needs to be substantially recomputed every time a packet is selected for transmission and every time a new packet arrives into any queue.

To reduce computational load, the concept of virtual time is introduced. Finish time for each packet is computed on this alternate monotonically increasing virtual timescale. While virtual time does not accurately model the time packets complete their transmissions, it does accurately model the order in which the transmissions must occur to meet the objectives of the full-featured model. Using virtual time, it is unnecessary to recompute the finish time for previously queued packets. Although the finish time, in absolute terms, for existing packets is potentially affected by new arrivals, finish time on the virtual time line is unchanged - the virtual time line warps with respect to real time to accommodate any new transmission.

The virtual finish time for a newly queued packet is given by the sum of the virtual start time plus the packet's size. The virtual start time is the maximum between the previous virtual finish time of the same queue and the current instant.

With a virtual finishing time of all candidate packets (i.e., the packets at the head of all non-empty flow queues) computed, fair queuing compares the virtual finishing time and selects the minimum one. The packet with the minimum virtual finishing time is transmitted.

Pseudocode

Shared variables     const N             // Nb of queues      queues[1..N]        // queues     lastVirFinish[1..N] // last virtual finish instant
receive(packet)      queueNum := chooseQueue(packet)      queues[queueNum].enqueue(packet)      updateTime(packet, queueNum)
updateTime(packet, queueNum)     // virStart is the virtual start of service     virStart := max(now(), lastVirFinish[queueNum])     packet.virFinish := packet.size + virStart     lastVirFinish[queueNum] := packet.virFinish
send()      queueNum := selectQueue()      packet := queues[queueNum].dequeue()      return packet
selectQueue()      it := 1      minVirFinish = while it ≤ N do          queue := queues[it]          ifnot queue.empty and queue.head.virFinish < minVirFinish then              minVirFinish = queue.head.virFinish              queueNum := it           it := it + 1      return queueNum

The function receive() is executed each time a packet is received, and send() is executed each time a packet to send must be selected, i.e. when the link is idle and the queues are not empty. This pseudo-code assumes there is a function now() that returns the current virtual time, and a function chooseQueue() that selects the queue where the packet is enqueued.

The function selectQueue() selects the queue with the minimal virtual finish time. For the sake of readability, the pseudo-code presented here does a linear search. But maintaining a sorted list can be implemented in logarithmic time, leading to a O(log(n)) complexity, but with more complex code.

See also

Related Research Articles

Network throughput refers to the rate of message delivery over a communication channel, such as Ethernet or packet radio, in a communication network. The data that these messages contain may be delivered over physical or logical links, or through network nodes. Throughput is usually measured in bits per second, and sometimes in data packets per second or data packets per time slot.

In computing, scheduling is the action of assigning resources to perform tasks. The resources may be processors, network links or expansion cards. The tasks may be threads, processes or data flows.

<span class="mw-page-title-main">Round-robin scheduling</span> Algorithm employed by process and network schedulers in computing

Round-robin (RR) is one of the algorithms employed by process and network schedulers in computing. As the term is generally used, time slices are assigned to each process in equal portions and in circular order, handling all processes without priority. Round-robin scheduling is simple, easy to implement, and starvation-free. Round-robin scheduling can be applied to other scheduling problems, such as data packet scheduling in computer networks. It is an operating system concept.

Network congestion in data networking and queueing theory is the reduced quality of service that occurs when a network node or link is carrying more data than it can handle. Typical effects include queueing delay, packet loss or the blocking of new connections. A consequence of congestion is that an incremental increase in offered load leads either only to a small increase or even a decrease in network throughput.

Weighted round robin (WRR) is a network scheduler for data flows, but also used to schedule processes.

Deficit Round Robin (DRR), also Deficit Weighted Round Robin (DWRR), is a scheduling algorithm for the network scheduler. DRR is, like weighted fair queuing (WFQ), a packet-based implementation of the ideal Generalized Processor Sharing (GPS) policy. It was proposed by M. Shreedhar and G. Varghese in 1995 as an efficient and fair algorithm.

<span class="mw-page-title-main">Leaky bucket</span> Network traffic shaping and policing algorithm

The leaky bucket is an algorithm based on an analogy of how a bucket with a constant leak will overflow if either the average rate at which water is poured in exceeds the rate at which the bucket leaks or if more water than the capacity of the bucket is poured in all at once. It can be used to determine whether some sequence of discrete events conforms to defined limits on their average and peak rates or frequencies, e.g. to limit the actions associated with these events to these rates or delay them until they do conform to the rates. It may also be used to check conformance or limit to an average rate alone, i.e. remove any variation from the average.

<span class="mw-page-title-main">Random early detection</span> Algorithm

Random early detection (RED), also known as random early discard or random early drop is a queuing discipline for a network scheduler suited for congestion avoidance.

<span class="mw-page-title-main">Statistical time-division multiplexing</span>

Statistical multiplexing is a type of communication link sharing, very similar to dynamic bandwidth allocation (DBA). In statistical multiplexing, a communication channel is divided into an arbitrary number of variable bitrate digital channels or data streams. The link sharing is adapted to the instantaneous traffic demands of the data streams that are transferred over each channel. This is an alternative to creating a fixed sharing of a link, such as in general time division multiplexing (TDM) and frequency division multiplexing (FDM). When performed correctly, statistical multiplexing can provide a link utilization improvement, called the statistical multiplexing gain.

Generalized processor sharing (GPS) is an ideal scheduling algorithm for process schedulers and network schedulers. It is related to the fair-queuing principle which groups packets into classes and shares the service capacity between them. GPS shares this capacity according to some fixed weights.

In computer science, an input queue is a collection of processes in storage that are waiting to be brought into memory to run a program. Input queues are mainly used in Operating System Scheduling which is a technique for distributing resources among processes. Input queues not only apply to operating systems (OS), but may also be applied to scheduling inside networking devices. The purpose of scheduling is to ensure resources are being distributed fairly and effectively; therefore, it improves the performance of the system.

Scott J. Shenker is an American computer scientist, and professor of computer science at the University of California, Berkeley. He is also the leader of the Extensible Internet Group at the International Computer Science Institute in Berkeley, California.

Proportional-fair scheduling is a compromise-based scheduling algorithm. It is based upon maintaining a balance between two competing interests: Trying to maximize the total throughput of the network while at the same time allowing all users at least a minimal level of service. This is done by assigning each data flow a data rate or a scheduling priority that is inversely proportional to its anticipated resource consumption.

Maximum throughput scheduling is a procedure for scheduling data packets in a packet-switched best-effort network, typically a wireless network, in view to maximize the total throughput of the network, or the system spectral efficiency in a wireless network. This is achieved by giving scheduling priority to the least "expensive" data flows in terms of consumed network resources per transferred amount of information.

Bandwidth management is the process of measuring and controlling the communications on a network link, to avoid filling the link to capacity or overfilling the link, which would result in network congestion and poor performance of the network. Bandwidth is described by bit rate and measured in units of bits per second (bit/s) or bytes per second (B/s).

Weighted fair queueing (WFQ) is a network scheduling algorithm. WFQ is both a packet-based implementation of the generalized processor sharing (GPS) policy, and a natural extension of fair queuing (FQ). Whereas FQ shares the link's capacity in equal subparts, WFQ allows schedulers to specify, for each flow, which fraction of the capacity will be given.

Multipath routing is a routing technique simultaneously using multiple alternative paths through a network. This can yield a variety of benefits such as fault tolerance, increased bandwidth, and improved security.

<span class="mw-page-title-main">Network scheduler</span> Arbiter on a node in packet switching communication network

A network scheduler, also called packet scheduler, queueing discipline (qdisc) or queueing algorithm, is an arbiter on a node in a packet switching communication network. It manages the sequence of network packets in the transmit and receive queues of the protocol stack and network interface controller. There are several network schedulers available for the different operating systems, that implement many of the existing network scheduling algorithms.

Time-Sensitive Networking (TSN) is a set of standards under development by the Time-Sensitive Networking task group of the IEEE 802.1 working group. The TSN task group was formed in November 2012 by renaming the existing Audio Video Bridging Task Group and continuing its work. The name changed as a result of the extension of the working area of the standardization group. The standards define mechanisms for the time-sensitive transmission of data over deterministic Ethernet networks.

Enhanced Transmission Selection (ETS) is a network scheduler scheduling algorithm that has been defined by the Data Center Bridging Task Group of the IEEE 802.1 Working Group. It is a hierarchical scheduler that combines static priority scheduling and a bandwidth sharing algorithms.

References

  1. 1 2 John Nagle: "On packet switches with infinite storage," RFC 970, IETF, December 1985.
  2. 1 2 3 Nagle, J. B. (1987). "On Packet Switches with Infinite Storage". IEEE Transactions on Communications . 35 (4): 435–438. CiteSeerX   10.1.1.649.5380 . doi:10.1109/TCOM.1987.1096782.
  3. Phillip Gross (January 1986), Proceedings of the 16-17 January 1986 DARPA Gateway Algorithms and Data Structures Task Force (PDF), IETF, pp. 5, 98, retrieved 2015-03-04, Nagle presented his "fair queuing" scheme, in which gateways maintain separate queues for each sending host. In this way, hosts with pathological implementations can not usurp more than their fair share of the gateway's resources. This invoked spirited and interested discussion.
  4. Demers, Alan; Keshav, Srinivasan; Shenker, Scott (1989). "Analysis and simulation of a fair queueing algorithm". ACM SIGCOMM Computer Communication Review. 19 (4): 1–12. doi: 10.1145/75247.75248 .
  5. Demers, Alan; Keshav, Srinivasan; Shenker, Scott (1990). "Analysis and Simulation of a Fair Queueing Algorithm" (PDF). Internetworking: Research and Experience. 1: 3–26.
  6. Bennett, J. C. R.; Hui Zhang (1996). "WF/sup 2/Q: Worst-case fair weighted fair queueing". Proceedings of IEEE INFOCOM '96. Conference on Computer Communications. Vol. 1. p. 120. doi:10.1109/INFCOM.1996.497885. ISBN   978-0-8186-7293-4. S2CID   17558577.
  7. Ito, Y.; Tasaka, S.; Ishibashi, Y. (2002). "Variably weighted round robin queueing for core IP routers". Conference Proceedings of the IEEE International Performance, Computing, and Communications Conference (Cat. No.02CH37326). p. 159. doi:10.1109/IPCCC.2002.995147. ISBN   978-0-7803-7371-6. S2CID   60787008.