Latency (engineering)

Latency is a time interval between the stimulation and response, or, from a more general point of view, a time delay between the cause and the effect of some physical change in the system being observed. [1] Latency is physically a consequence of the limited velocity with which any physical interaction can propagate. The magnitude of this velocity is always less than or equal to the speed of light. Therefore, every physical system will experience some sort of latency, regardless of the nature of stimulation that it has been exposed to.

The precise definition of latency depends on the system being observed and the nature of stimulation. In communications, the lower limit of latency is determined by the medium being used for communications. In reliable two-way communication systems, latency limits the maximum rate at which information can be transmitted, as there is often a limit on the amount of information that is "in-flight" at any one moment. In the field of human–machine interaction, perceptible latency has a strong effect on user satisfaction and usability.

Communication latency

Online games are sensitive to latency, since fast response times to new events occurring during a game session are rewarded while slow response times may carry penalties. Lag is the term used to describe latency in gaming. Due to a delay in the transmission of game events, a player with a high-latency Internet connection may show slow responses in spite of an appropriate reaction time. This gives players with low-latency connections a technical advantage.

Minimizing latency is of interest in the capital markets, [2] particularly where algorithmic trading is used to process market updates and turn around orders within milliseconds. Low-latency trading occurs on the networks used by financial institutions to connect to stock exchanges and electronic communication networks (ECNs) to execute financial transactions. [3] Joel Hasbrouck and Gideon Saar (2011) measure latency based on three components: the time it takes for information to reach the trader, for the trader's algorithms to analyze the information and decide on a course of action, and for the generated action to reach the exchange and be implemented. Hasbrouck and Saar contrast this with the way latencies are measured by many trading venues, which use much narrower definitions, such as the processing delay measured from the entry of the order (at the vendor's computer) to the transmission of an acknowledgement (from the vendor's computer). [4] Electronic trading now makes up 60% to 70% of the daily volume on the NYSE, and algorithmic trading close to 35%. [5] Trading using computers has developed to the point where millisecond improvements in network speeds offer a competitive advantage for financial institutions. [6]

Packet-switched networks

Network latency in a packet-switched network is measured either as one-way latency (the time from the source sending a packet to the destination receiving it) or as round-trip delay time (the one-way latency from source to destination plus the one-way latency from the destination back to the source). Round-trip latency is more often quoted because it can be measured from a single point. Note that round-trip latency excludes the amount of time that a destination system spends processing the packet. [citation needed] Many software platforms provide a service called ping that can be used to measure round-trip latency. Ping uses the Internet Control Message Protocol (ICMP) echo request, which causes the recipient to send the received packet back as an immediate response, and thus provides a rough way of measuring round-trip delay time. Ping cannot perform accurate measurements, [7] principally because ICMP is intended only for diagnostic or control purposes and differs from real communication protocols such as TCP. Furthermore, routers and internet service providers might apply different traffic shaping policies to different protocols. [8] [9] For more accurate measurements it is better to use dedicated software, for example hping, Netperf or Iperf.
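As an illustration of measuring round-trip delay from a single point without ICMP, the sketch below times a TCP connection setup, which takes roughly one round trip. It is only a rough approach under stated assumptions: the function name tcp_handshake_time_s is hypothetical, and the target is a throwaway listener on the loopback interface so that the example is self-contained. It is not a substitute for the dedicated tools named above.

```python
import socket
import time


def tcp_handshake_time_s(host, port, timeout_s=2.0):
    """Time how long it takes to complete a TCP connection to host:port."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout_s):
        return time.perf_counter() - start


# Throwaway listener on the loopback interface (an assumption made so the
# example runs anywhere; a real measurement would target a remote host).
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(16)              # backlog larger than the number of samples
port = listener.getsockname()[1]

samples = [tcp_handshake_time_s("127.0.0.1", port) for _ in range(5)]
listener.close()

print(f"min {min(samples) * 1e6:.0f} µs, max {max(samples) * 1e6:.0f} µs")
```

Taking several samples and reporting the minimum and maximum gives a feel for how variable the measurement is; on a loopback interface the values mostly reflect operating system overhead rather than network delay.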

However, in a non-trivial network, a typical packet will be forwarded over multiple links and gateways, each of which will not begin to forward the packet until it has been completely received. In such a network, the minimal latency is the sum of the transmission delay of each link plus the forwarding latency of each gateway. In practice, minimal latency also includes queuing and processing delays. Queuing delay occurs when a gateway receives multiple packets from different sources heading towards the same destination. Since typically only one packet can be transmitted at a time, some of the packets must queue for transmission, incurring additional delay. Processing delays are incurred while a gateway determines what to do with a newly received packet. Bufferbloat, the excess buffering of packets, can also increase latency by an order of magnitude or more. The combination of propagation, serialization, queuing, and processing delays often produces a complex and variable network latency profile.
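The way these delay components add up along a store-and-forward path can be sketched as follows. The Hop class, the three-hop path and all numeric values are hypothetical, and the propagation figure of 5 µs per kilometre assumes optical fibre.

```python
from dataclasses import dataclass

PROPAGATION_US_PER_KM = 5.0  # assumption: typical optical fibre


@dataclass
class Hop:
    """One link plus the gateway that forwards onto it (illustrative values)."""
    bandwidth_bps: float        # link bit rate
    length_km: float            # physical length of the link
    queuing_s: float = 0.0      # time spent waiting in the gateway's queue
    processing_s: float = 0.0   # time the gateway spends deciding what to do


def hop_delay_s(hop, packet_bits):
    """Serialization + propagation + queuing + processing for one hop."""
    serialization = packet_bits / hop.bandwidth_bps
    propagation = hop.length_km * PROPAGATION_US_PER_KM * 1e-6
    return serialization + propagation + hop.queuing_s + hop.processing_s


def path_latency_s(hops, packet_bits):
    """Each gateway receives the whole packet before forwarding, so delays add."""
    return sum(hop_delay_s(hop, packet_bits) for hop in hops)


path = [
    Hop(bandwidth_bps=100e6, length_km=2),
    Hop(bandwidth_bps=1e9, length_km=400, queuing_s=200e-6, processing_s=20e-6),
    Hop(bandwidth_bps=100e6, length_km=5),
]
packet_bits = 1500 * 8  # a 1500-byte packet

print(f"one-way latency: {path_latency_s(path, packet_bits) * 1e3:.3f} ms")
```

In this made-up path the long middle link dominates through propagation delay, while the slower access links contribute mostly serialization delay.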

Latency limits total throughput in reliable two-way communication systems, as described by the bandwidth-delay product. In data communications, the bandwidth-delay product is the product of a data link's capacity and its round-trip delay time. The result, an amount of data measured in bits, is equivalent to the maximum amount of data on the network circuit at any given time, i.e., data that has been transmitted but not yet acknowledged.
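A short calculation makes the relationship concrete. The link capacity, round-trip time and window size below are illustrative assumptions, and the function names are hypothetical.

```python
def bandwidth_delay_product_bits(bandwidth_bps, rtt_s):
    """Maximum amount of data that can be in flight on the link."""
    return bandwidth_bps * rtt_s


def max_throughput_bps(window_bits, rtt_s):
    """Upper bound on throughput when at most one window may be
    unacknowledged per round trip."""
    return window_bits / rtt_s


link_bps = 1e9   # assumed 1 Gbit/s link
rtt_s = 0.080    # assumed 80 ms round-trip time

bdp_bits = bandwidth_delay_product_bits(link_bps, rtt_s)
print(f"bandwidth-delay product: {bdp_bits / 8 / 1e6:.1f} MB in flight")

window_bits = 65_535 * 8  # assumed 64 KiB-class window with no scaling
limit_bps = max_throughput_bps(window_bits, rtt_s)
print(f"throughput ceiling: {limit_bps / 1e6:.2f} Mbit/s")
```

With these numbers the window is far smaller than the bandwidth-delay product, so the sender spends most of each round trip waiting for acknowledgements and the link stays mostly idle.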

Fiber optics

Latency in fiber optics is largely a function of the speed of light, which is 299,792,458 meters/second in vacuum. This would equate to a latency of about 3.34 µs for every kilometer of path length. The index of refraction of most fiber optic cables is about 1.5, meaning that light travels about 1.5 times as fast in a vacuum as it does in the cable. This works out to about 5.0 µs of latency for every kilometer. In shorter metro networks, higher latency can be experienced due to extra distance in building risers and cross-connects. To calculate the latency of a connection, one has to know the distance traveled by the fiber, which is rarely a straight line, since it has to traverse geographic contours and obstacles, such as roads and railway tracks, as well as other rights-of-way.
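The per-kilometer figures follow directly from the speed of light and the refractive index, as this small calculation shows. The function names and the 1,000 km example path are assumptions for illustration.

```python
C_VACUUM_M_PER_S = 299_792_458  # speed of light in vacuum


def latency_us_per_km(refractive_index=1.0):
    """Propagation delay over one kilometer, in microseconds."""
    speed_m_per_s = C_VACUUM_M_PER_S / refractive_index
    return 1000 / speed_m_per_s * 1e6


def one_way_latency_ms(path_km, refractive_index=1.5):
    """Propagation delay over the actual fiber path length, in milliseconds."""
    return path_km * latency_us_per_km(refractive_index) / 1000


print(f"vacuum: {latency_us_per_km():.2f} µs/km")
print(f"fiber (n = 1.5): {latency_us_per_km(1.5):.2f} µs/km")
print(f"1000 km of fiber: {one_way_latency_ms(1000):.2f} ms one way")
```

The path length passed to one_way_latency_ms should be the routed fiber distance rather than the straight-line distance between endpoints.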

Due to imperfections in the fiber, the light signal degrades as it is transmitted through it. For distances greater than 100 kilometers, amplifiers or regenerators are deployed. The latency introduced by these components needs to be taken into account.

Satellite transmission

Satellites in geostationary orbits are far enough away from Earth that communication latency becomes significant: about a quarter of a second for a trip from one ground-based transmitter to the satellite and back to another ground-based transmitter, and close to half a second for two-way communication from one Earth station to another and then back to the first. Low Earth orbit is sometimes used to cut this delay, at the expense of more complicated satellite tracking on the ground and a larger number of satellites in the constellation to ensure continuous coverage.
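These figures can be checked with a back-of-the-envelope calculation. The sketch assumes a geostationary altitude of about 35,786 km and ground stations directly beneath the satellite, which gives a lower bound; the 550 km low Earth orbit altitude is purely an illustrative assumption.

```python
C_KM_PER_S = 299_792.458   # speed of light in vacuum
GEO_ALTITUDE_KM = 35_786   # approximate geostationary altitude
LEO_ALTITUDE_KM = 550      # assumed example altitude for a low Earth orbit


def ground_to_ground_s(altitude_km):
    """Up to the satellite and back down, with stations directly below it."""
    return 2 * altitude_km / C_KM_PER_S


def two_way_s(altitude_km):
    """Request to the far station plus the response back to the first."""
    return 2 * ground_to_ground_s(altitude_km)


print(f"GEO one way: {ground_to_ground_s(GEO_ALTITUDE_KM) * 1e3:.0f} ms")
print(f"GEO two way: {two_way_s(GEO_ALTITUDE_KM) * 1e3:.0f} ms")
print(f"LEO one way: {ground_to_ground_s(LEO_ALTITUDE_KM) * 1e3:.1f} ms")
```

Real links are somewhat slower than this bound because ground stations are rarely directly beneath the satellite and equipment adds processing delay.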

Audio latency

Audio latency is the delay between when an audio signal enters and when it emerges from a system. Potential contributors to latency in an audio system include analog-to-digital conversion, buffering, digital signal processing, transmission time, digital-to-analog conversion and the speed of sound in air.
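Two of these contributors, buffering and the speed of sound in air, are easy to estimate. The sketch below assumes a speed of sound of about 343 m/s (dry air at roughly 20 °C); the buffer size, sample rate and listening distance are hypothetical example values.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumption: dry air at about 20 °C


def buffer_latency_ms(buffer_frames, sample_rate_hz):
    """Delay added by holding one buffer of audio before passing it on."""
    return buffer_frames / sample_rate_hz * 1000


def acoustic_latency_ms(distance_m):
    """Time for sound to travel from a loudspeaker to the listener."""
    return distance_m / SPEED_OF_SOUND_M_PER_S * 1000


# Example: one input buffer and one output buffer of 256 frames at 48 kHz,
# with the listener 3 m from the loudspeaker.
buffering_ms = 2 * buffer_latency_ms(256, 48_000)
air_ms = acoustic_latency_ms(3.0)

print(f"buffering: {buffering_ms:.2f} ms, air: {air_ms:.2f} ms, "
      f"total: {buffering_ms + air_ms:.2f} ms")
```

Conversion and signal processing delays are not modelled here and would add to the total.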

Operational latency

Any individual workflow within a system of workflows can be subject to some type of operational latency. It may even be the case that an individual system may have more than one type of latency, depending on the type of participant or goal-seeking behavior. This is best illustrated by the following two examples involving air travel.

From the point of view of a passenger, latency can be described as follows. Suppose John Doe flies from London to New York. The latency of his trip is the time it takes him to go from his house in England to the hotel he is staying at in New York. This is independent of the throughput of the London–New York air link: whether there were 100 passengers a day making the trip or 10,000, the latency of the trip would remain the same.

From the point of view of flight operations personnel, latency can be entirely different. Consider the staff at the London and New York airports. Only a limited number of planes are able to make the transatlantic journey, so when one lands they must prepare it for the return trip as quickly as possible. It might take, for example:

35 minutes to clean the plane
15 minutes to refuel the plane
10 minutes to load the passengers
30 minutes to load the cargo

Assuming the above are done consecutively, the minimum plane turnaround time is:

35 + 15 + 10 + 30 = 90 minutes

However, cleaning, refueling and loading the cargo can be done at the same time. Passengers can be loaded after cleaning is complete. The reduced latency, then, is:

Cleaning, then loading passengers: 35 + 10 = 45 minutes
Refueling: 15 minutes
Loading cargo: 30 minutes
Minimum latency = 45 minutes

The people involved in the turnaround are interested only in the time it takes for their individual tasks. When all of the tasks are done at the same time, however, it is possible to reduce the latency to the length of the longest task. If some steps have prerequisites, it becomes more difficult to perform all steps in parallel. In the example above, the requirement to clean the plane before loading passengers results in a minimum latency longer than any single task.
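The turnaround example amounts to finding the longest chain of dependent tasks. The sketch below reproduces both results; the task labels, and their mapping to the 35, 15, 10 and 30 minute figures, follow the order in which the text names the tasks, and the function names are hypothetical.

```python
# Durations in minutes; labels are illustrative.
durations = {
    "clean": 35,
    "refuel": 15,
    "load_passengers": 10,
    "load_cargo": 30,
}

# Passengers can only board once cleaning is complete.
prerequisites = {"load_passengers": ["clean"]}


def sequential_latency(durations):
    """Every task waits for the previous one to finish."""
    return sum(durations.values())


def parallel_latency(durations, prerequisites):
    """Tasks start as soon as their prerequisites finish (no cycles assumed)."""
    finish = {}

    def finish_time(task):
        if task not in finish:
            start = max(
                (finish_time(p) for p in prerequisites.get(task, [])),
                default=0,
            )
            finish[task] = start + durations[task]
        return finish[task]

    return max(finish_time(task) for task in durations)


print(sequential_latency(durations))               # 90
print(parallel_latency(durations, prerequisites))  # 45
```

Without the cleaning prerequisite, the parallel result would drop to 35 minutes, the length of the longest single task.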

Mechanical latency

Any mechanical process encounters limitations modeled by Newtonian physics. The behavior of disk drives provides an example of mechanical latency. Here, it is the time needed for the data encoded on a platter to rotate from its current position to a position adjacent to the read-write head, plus the seek time required for the actuator arm to position the read-write head above the appropriate track. These two components are known as rotational latency and seek time, respectively; the qualified terms are used because the basic term latency is also applied to the time required by a computer's electronics and software to perform polling, interrupts, and direct memory access.
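Rotational latency can be estimated from the spindle speed alone: on average the platter must turn half a revolution before the requested data reaches the head. The spindle speeds and the 9 ms average seek time below are hypothetical example values, not figures for any particular drive.

```python
def average_rotational_latency_ms(rpm):
    """On average the platter has to turn half a revolution."""
    seconds_per_revolution = 60.0 / rpm
    return seconds_per_revolution / 2 * 1000


def average_access_time_ms(rpm, average_seek_ms):
    """Seek time plus rotational latency (transfer time is ignored)."""
    return average_seek_ms + average_rotational_latency_ms(rpm)


for rpm in (5400, 7200, 15000):
    print(f"{rpm} rpm: {average_rotational_latency_ms(rpm):.2f} ms rotational, "
          f"{average_access_time_ms(rpm, 9.0):.2f} ms with a 9 ms seek")
```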

Computer hardware and operating system latency

Computers run sets of instructions called processes. In operating systems, the execution of a process can be postponed if other processes are also executing. In addition, the operating system can schedule when to perform the action that the process is commanding. For example, suppose a process commands that a computer card's voltage output be set high-low-high-low and so on at a rate of 1000 Hz. The operating system may choose to adjust the scheduling of each transition (high-low or low-high) based on an internal clock. The latency is the delay between the process instruction commanding the transition and the hardware actually transitioning the voltage from high to low or low to high.

On Microsoft Windows, it appears [original research?] that the timing of commands to hardware is not exact. Empirical data suggest that Windows (using the Windows sleep timer, which accepts millisecond sleep times) schedules on a 1024 Hz clock and delays 24 of every 1024 transitions per second to achieve an average update rate of 1000 Hz. [citation needed] This can have serious ramifications for discrete-time algorithms that rely on fairly consistent timing between updates, such as those found in control theory. The sleep function and similar Windows API calls were never designed for accurate timing. Certain multimedia-oriented API routines, such as timeGetTime() and its siblings, provide better timing consistency. However, consumer- and server-grade Windows (as of 2011, those based on the NT kernel) were not designed to be real-time operating systems. Drastically more accurate timing can be achieved by using dedicated hardware extensions and control-loop cards.

Linux may have the same problems with scheduling of hardware I/O. [citation needed] The problem in Linux is mitigated by support for POSIX real-time extensions, and by the possibility of using a kernel with the PREEMPT_RT patch applied.
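The scheduling imprecision described for general-purpose operating systems can be observed with a short measurement. The sketch below requests 1 ms sleeps and records how much longer each one actually took; the function name and sample count are arbitrary choices, and the results depend entirely on the operating system, its timer configuration and the system load.

```python
import time


def measure_sleep_overshoot(requested_s=0.001, samples=200):
    """Return, for each sleep, how much longer it took than requested."""
    overshoots = []
    for _ in range(samples):
        start = time.perf_counter()
        time.sleep(requested_s)
        elapsed = time.perf_counter() - start
        overshoots.append(elapsed - requested_s)
    return overshoots


overshoots = measure_sleep_overshoot()
mean_us = sum(overshoots) / len(overshoots) * 1e6
worst_us = max(overshoots) * 1e6
print(f"mean overshoot {mean_us:.0f} µs, worst {worst_us:.0f} µs")
```

A control loop that relies on such sleeps inherits this variation in its update interval, which is why the text points to real-time extensions or dedicated hardware when consistent timing matters.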

On embedded systems, the real-time execution of instructions is often supported by the low-level embedded operating system.

In simulators and simulation

In simulation applications, 'latency' refers to the time delay, normally measured in milliseconds (1/1,000 sec), between initial input and an output clearly discernible to the simulator trainee or simulator subject. Latency is sometimes also called transport delay.

References

  1. "What is Latency?" / "Latency" Retrieved 2015-02-22.
  2. TABB (2009). High Frequency Trading Technology: a TABB Anthology.
  3. Mackenzie, Michael; Grant, Jeremy (2009). "The dash to flash" (PDF). Financial Times. Retrieved 18 July 2011. extracting tiny slices of profit from trading small numbers of shares in companies, often between different trading platforms, with success relying on minimal variations in speed - or "latency", in the trading vernacular.
  4. Hasbrouck, Joel; Saar, Gideon. "Low-Latency Trading" (PDF). p. 1. Retrieved 18 July 2011.
  5. Heires, Katherine (July 2009). "Code Green: Goldman Sachs & UBS Cases Heighten Need to Keep Valuable Digital Assets From Walking Out The Door. Millions in Trading Profits May Depend On It" (PDF). Securities Industry News. Retrieved 18 July 2011.
  6. "High-frequency trading: when milliseconds mean millions". The Telegraph. Retrieved 2018-03-25.
  7. "Don't misuse ping!". Retrieved 29 April 2015.
  8. Shane Chen (2005). "Network Protocols Discussion / Traffic Shaping Strategies". knowplace.org. Archived from the original on 2007-01-09.
  9. "Basic QoS part 1 – Traffic Policing and Shaping on Cisco IOS Router". The CCIE R&S. Retrieved 29 April 2015.