Latency (engineering)

Last updated

Latency, from a general point of view, is a time delay between the cause and the effect of some physical change in the system being observed. Lag, as it is known in gaming circles, refers to the latency between the input to a simulation and the visual or auditory response, often occurring because of network delay in online games. [1]

Contents

Latency is physically a consequence of the limited velocity at which any physical interaction can propagate. The magnitude of this velocity is always less than or equal to the speed of light. Therefore, every physical system with any physical separation (distance) between cause and effect will experience some sort of latency, regardless of the nature of the stimulation to which it has been exposed.

The precise definition of latency depends on the system being observed or the nature of the simulation. In communications, the lower limit of latency is determined by the medium being used to transfer information. In reliable two-way communication systems, latency limits the maximum rate at which information can be transmitted, as there is often a limit on the amount of information that is in-flight at any given moment. Perceptible latency has a strong effect on user satisfaction and usability in the field of human–machine interaction. [2]

Communications

Online games are sensitive to latency (lag), since fast response times to new events occurring during a game session are rewarded while slow response times may carry penalties. Due to a delay in transmission of game events, a player with a high latency internet connection may show slow responses in spite of appropriate reaction time. This gives players with low-latency connections a technical advantage.

Capital markets

Joel Hasbrouck and Gideon Saar (2011) measure latency to execute financial transactions based on three components: the time it takes for information to reach the trader, execution of the trader's algorithms to analyze the information and decide a course of action, and the generated action to reach the exchange and get implemented. Hasbrouck and Saar contrast this with the way in which latencies are measured by many trading venues that use much more narrow definitions, such as the processing delay measured from the entry of the order (at the vendor's computer) to the transmission of an acknowledgment (from the vendor's computer). [3] Trading using computers has developed to the point where millisecond improvements in network speeds offer a competitive advantage for financial institutions. [4]

Packet-switched networks

Network latency in a packet-switched network is measured as either one-way (the time from the source sending a packet to the destination receiving it), or round-trip delay time (the one-way latency from the source to the destination plus the one-way latency from the destination back to the source). Round-trip latency is more often quoted, because it can be measured from a single point. Many software platforms provide a service called ping that can be used to measure round-trip latency. Ping uses the Internet Control Message Protocol (ICMP) echo request which causes the recipient to send the received packet as an immediate response, thus it provides a rough way of measuring round-trip delay time. Ping cannot perform accurate measurements, [5] principally because ICMP is intended only for diagnostic or control purposes, and differs from real communication protocols such as TCP. Furthermore, routers and internet service providers might apply different traffic shaping policies to different protocols. [6] [7] For more accurate measurements it is better to use specific software, for example: hping, Netperf or Iperf.

However, in a non-trivial network, a typical packet will be forwarded over multiple links and gateways, each of which will not begin to forward the packet until it has been completely received. In such a network, the minimal latency is the sum of the transmission delay of each link, plus the forwarding latency of each gateway. In practice, minimal latency also includes queuing and processing delays. Queuing delay occurs when a gateway receives multiple packets from different sources heading toward the same destination. Since typically only one packet can be transmitted at a time, some of the packets must queue for transmission, incurring additional delay. Processing delays are incurred while a gateway determines what to do with a newly received packet. Bufferbloat can also cause increased latency that is an order of magnitude or more. The combination of propagation, serialization, queuing, and processing delays often produces a complex and variable network latency profile.

Latency limits total throughput in reliable two-way communication systems as described by the bandwidth-delay product.

Fiber optics

Latency in optical fiber is largely a function of the speed of light. This would equate to a latency of 3.33  μs for every kilometer of path length. The index of refraction of most fiber optic cables is about 1.5, meaning that light travels about 1.5 times as fast in a vacuum as it does in the cable. This works out to about 5.0 μs of latency for every kilometer. In shorter metro networks, higher latency can be experienced due to extra distance in building risers and cross-connects. To calculate the latency of a connection, one has to know the distance traveled by the fiber, which is rarely a straight line, since it has to traverse geographic contours and obstacles, such as roads and railway tracks, as well as other rights-of-way.

Due to imperfections in the fiber, light degrades as it is transmitted through it. For distances of greater than 100 kilometers, amplifiers or regenerators are deployed. Latency introduced by these components needs to be taken into account.

Satellite transmission

Satellites in geostationary orbits are far enough away from Earth that communication latency becomes significant – about a quarter of a second for a trip from one ground-based transmitter to the satellite and back to another ground-based transmitter; close to half a second for two-way communication from one Earth station to another and then back to the first. Low Earth orbit is sometimes used to cut this delay, at the expense of more complicated satellite tracking on the ground and requiring more satellites in the satellite constellation to ensure continuous coverage.

Audio

Audio latency is the delay between when an audio signal enters and when it emerges from a system. Potential contributors to latency in an audio system include analog-to-digital conversion, buffering, digital signal processing, transmission time, digital-to-analog conversion and the speed of sound in air.

Video

Video latency refers to the degree of delay between the time a transfer of a video stream is requested and the actual time that transfer begins. Networks that exhibit relatively small delays are known as low-latency networks, while their counterparts are known as high-latency networks.

Workflow

Any individual workflow within a system of workflows can be subject to some type of operational latency. It may even be the case that an individual system may have more than one type of latency, depending on the type of participant or goal-seeking behavior. This is best illustrated by the following two examples involving air travel.

From the point of view of a passenger, latency can be described as follows. Suppose John Doe flies from London to New York. The latency of his trip is the time it takes him to go from his house in England to the hotel he is staying at in New York. This is independent of the throughput of the London-New York air link whether there were 100 passengers a day making the trip or 10000, the latency of the trip would remain the same.

From the point of view of flight operations personnel, latency can be entirely different. Consider the staff at the London and New York airports. Only a limited number of planes are able to make the transatlantic journey, so when one lands they must prepare it for the return trip as quickly as possible. It might take, for example:

Assuming the above are done consecutively, minimum plane turnaround time is:

35 + 15 + 10 + 30 = 90

However, cleaning, refueling and loading the cargo can be done at the same time. Passengers can only be loaded after cleaning is complete. The reduced latency, then, is:

35 + 10 = 45
15
30
Minimum latency = 45

The people involved in the turnaround are interested only in the time it takes for their individual tasks. When all of the tasks are done at the same time, however, it is possible to reduce the latency to the length of the longest task. If some steps have prerequisites, it becomes more difficult to perform all steps in parallel. In the example above, the requirement to clean the plane before loading passengers results in a minimum latency longer than any single task.

Mechanics

Any mechanical process encounters limitations modeled by Newtonian physics. The behavior of disk drives provides an example of mechanical latency. Here, it is the time seek time for the actuator arm to be positioned above the appropriate track and then rotational latency for the data encoded on a platter to rotate from its current position to a position under the disk read-and-write head.

Computer hardware and software systems

Computers run instructions in the context of a process. In the context of computer multitasking, the execution of the process can be postponed if other processes are also executing. In addition, the operating system can schedule when to perform the action that the process is commanding. For example, suppose a process commands that a computer card's voltage output be set high-low-high-low and so on at a rate of 1000 Hz. The operating system schedules the process for each transition (high-low or low-high) based on a hardware clock such as the High Precision Event Timer. The latency is the delay between the events generated by the hardware clock and the actual transitions of voltage from high to low or low to high.

Many desktop operating systems have performance limitations that create additional latency. The problem may be mitigated with real-time extensions and patches such as PREEMPT RT.

On embedded systems, the real-time execution of instructions is often supported by a real-time operating system.

Note that in software systems, benchmarking against "average" and "median" latency can be misleading because few outlier numbers can distort them. Instead, software architects and software developers should use "99th percentile". [8]

Simulations

In simulation applications, latency refers to the time delay, often measured in milliseconds, between initial input and output clearly discernible to the simulator trainee or simulator subject. Latency is sometimes also called transport delay. Some authorities[ who? ] distinguish between latency and transport delay by using the term latency in the sense of the extra time delay of a system over and above the reaction time of the vehicle being simulated, but this requires detailed knowledge of the vehicle dynamics and can be controversial.

In simulators with both visual and motion systems, it is particularly important that the latency of the motion system not be greater than of the visual system, or symptoms of simulator sickness may result. This is because, in the real world, motion cues are those of acceleration and are quickly transmitted to the brain, typically in less than 50 milliseconds; this is followed some milliseconds later by a perception of change in the visual scene. The visual scene change is essentially one of change of perspective or displacement of objects such as the horizon, which takes some time to build up to discernible amounts after the initial acceleration which caused the displacement. A simulator should, therefore, reflect the real-world situation by ensuring that the motion latency is equal to or less than that of the visual system and not the other way round.

See also

Related Research Articles

Quality of service (QoS) is the description or measurement of the overall performance of a service, such as a telephony or computer network, or a cloud computing service, particularly the performance seen by the users of the network. To quantitatively measure quality of service, several related aspects of the network service are often considered, such as packet loss, bit rate, throughput, transmission delay, availability, jitter, etc.

Network throughput refers to the rate of message delivery over a communication channel, such as Ethernet or packet radio, in a communication network. The data that these messages contain may be delivered over physical or logical links, or through network nodes. Throughput is usually measured in bits per second, and sometimes in data packets per second or data packets per time slot.

In telecommunication and computer engineering, the queuing delay or queueing delay is the time a job waits in a queue until it can be executed. It is a key component of network delay. In a switched network, queuing delay is the time between the completion of signaling by the call originator and the arrival of a ringing signal at the call receiver. Queuing delay may be caused by delays at the originating switch, intermediate switches, or the call receiver servicing switch. In a data network, queuing delay is the sum of the delays between the request for service and the establishment of a circuit to the called data terminal equipment (DTE). In a packet-switched network, queuing delay is the sum of the delays encountered by a packet between the time of insertion into the network and the time of delivery to the address.

In technology, response time is the time a system or functional unit takes to react to a given input.

In telecommunications, round-trip delay (RTD) or round-trip time (RTT) is the amount of time it takes for a signal to be sent plus the amount of time it takes for acknowledgement of that signal having been received. This time delay includes propagation times for the paths between the two communication endpoints. In the context of computer networks, the signal is typically a data packet. RTT is commonly used interchangeably with ping time, which can be determined with the ping command. However, ping time may differ from experienced RTT with other protocols since the payload and priority associated with ICMP messages used by ping may differ from that of other traffic.

FAST TCP is a TCP congestion avoidance algorithm especially targeted at long-distance, high latency links, developed at the Netlab, California Institute of Technology and now being commercialized by FastSoft. FastSoft was acquired by Akamai Technologies in 2012.

<span class="mw-page-title-main">Satellite Internet access</span> Satellite-provided Internet

Satellite Internet access is Internet access provided through communication satellites; if it can sustain high speeds, it is termed satellite broadband. Modern consumer grade satellite Internet service is typically provided to individual users through geostationary satellites that can offer relatively high data speeds, with newer satellites using the Ku band to achieve downstream data speeds up to 506 Mbit/s. In addition, new satellite internet constellations are being developed in low-earth orbit to enable low-latency internet access from space.

Network performance refers to measures of service quality of a network as seen by the customer.

Transmission Control Protocol (TCP) uses a congestion control algorithm that includes various aspects of an additive increase/multiplicative decrease (AIMD) scheme, along with other schemes including slow start and a congestion window (CWND), to achieve congestion avoidance. The TCP congestion-avoidance algorithm is the primary basis for congestion control in the Internet. Per the end-to-end principle, congestion control is largely a function of internet hosts, not the network itself. There are several variations and versions of the algorithm implemented in protocol stacks of operating systems of computers that connect to the Internet.

TCP tuning techniques adjust the network congestion avoidance parameters of Transmission Control Protocol (TCP) connections over high-bandwidth, high-latency networks. Well-tuned networks can perform up to 10 times faster in some cases. However, blindly following instructions without understanding their real consequences can hurt performance as well.

Packet loss occurs when one or more packets of data travelling across a computer network fail to reach their destination. Packet loss is either caused by errors in data transmission, typically across wireless networks, or network congestion. Packet loss is measured as a percentage of packets lost with respect to packets sent.

<span class="mw-page-title-main">Computer network</span> Network that allows computers to share resources and communicate with each other

A computer network is a set of computers sharing resources located on or provided by network nodes. Computers use common communication protocols over digital interconnections to communicate with each other. These interconnections are made up of telecommunication network technologies based on physically wired, optical, and wireless radio-frequency methods that may be arranged in a variety of network topologies.

In telecommunication networks, the transmission time is the amount of time from the beginning until the end of a message transmission. In the case of a digital message, it is the time from the first bit until the last bit of a message has left the transmitting node. The packet transmission time in seconds can be obtained from the packet size in bit and the bit rate in bit/s as:

Latency refers to a short period of delay between when an audio signal enters a system, and when it emerges. Potential contributors to latency in an audio system include analog-to-digital conversion, buffering, digital signal processing, transmission time, digital-to-analog conversion, and the speed of sound in the transmission medium.

In computing, computer performance is the amount of useful work accomplished by a computer system. Outside of specific contexts, computer performance is estimated in terms of accuracy, efficiency and speed of executing computer program instructions. When it comes to high computer performance, one or more of the following factors might be involved:

<span class="mw-page-title-main">Network delay</span> Time required for data to traverse a network

Network delay is a design and performance characteristic of a telecommunications network. It specifies the latency for a bit of data to travel across the network from one communication endpoint to another. It is typically measured in multiples or fractions of a second. Delay may differ slightly, depending on the location of the specific pair of communicating endpoints. Engineers usually report both the maximum and average delay, and they divide the delay into several parts:

In capital markets, low latency is the use of algorithmic trading to react to market events faster than the competition to increase profitability of trades. For example, when executing arbitrage strategies the opportunity to "arb" the market may only present itself for a few milliseconds before parity is achieved. To demonstrate the value that clients put on latency, in 2007 a large global investment bank has stated that every millisecond lost results in $100m per annum in lost opportunity.

In computers, lag is delay (latency) between the action of the user (input) and the reaction of the server supporting the task, which has to be sent back to the client.

Bufferbloat is a cause of high latency and jitter in packet-switched networks caused by excess buffering of packets. Bufferbloat can also cause packet delay variation, as well as reduce the overall network throughput. When a router or switch is configured to use excessively large buffers, even very high-speed networks can become practically unusable for many interactive applications like voice over IP (VoIP), audio streaming, online gaming, and even ordinary web browsing.

CoDel is an active queue management (AQM) algorithm in network routing, developed by Van Jacobson and Kathleen Nichols and published as RFC8289. It is designed to overcome bufferbloat in networking hardware, such as routers, by setting limits on the delay network packets experience as they pass through buffers in this equipment. CoDel aims to improve on the overall performance of the random early detection (RED) algorithm by addressing some of its fundamental misconceptions, as perceived by Jacobson, and by being easier to manage.

References

  1. "Latency" Archived 2021-04-22 at the Wayback Machine Retrieved 2020-10-27.
  2. Souders, Steve. "Velocity and the Bottom Line" . Retrieved 23 February 2023.
  3. Hasbrouck, Joel; Saar, Gideon. "Low-Latency Trading" (PDF). p. 1. Archived from the original (PDF) on 11 November 2011. Retrieved 18 July 2011.
  4. "High-frequency trading: when milliseconds mean millions". The Telegraph. Retrieved 2018-03-25.
  5. "Don't misuse ping!". Archived from the original on 12 October 2017. Retrieved 29 April 2015.
  6. Shane Chen (2005). "Network Protocols Discussion / Traffic Shaping Strategies". knowplace.org. Archived from the original on 2007-01-09.
  7. "Basic QoS part 1 – Traffic Policing and Shaping on Cisco IOS Router". The CCIE R&S. 19 September 2012. Retrieved 29 April 2015.
  8. Foundations of Data Intensive Applications Large Scale Data Analytics Under the Hood. 2021. ISBN   9781119713012.

Further reading