I/O bound

Last updated

In computer science, I/O bound refers to a condition in which the time it takes to complete a computation is determined principally by the period spent waiting for input/output operations to be completed, which can be juxtaposed with being CPU bound. This circumstance arises when the rate at which data is requested is slower than the rate it is consumed or, in other words, more time is spent requesting data than processing it. [1]

Contents

I/O bound as an inherent problem in computing

The I/O bound state has been identified as a problem in computing almost since its inception. The Von Neumann architecture, which is employed by many computing devices, this involves multiple possible solutions such as implementing a logically separate central processor unit which along with storing the instructions of the program also retrieves actual data usually from main memory and makes use of this more accessible data for working. When the process is terminated it writes back the results to the original storage (usually the main memory).

Since data must be moved between the CPU and memory along a bus which has a limited data transfer rate, there exists a condition that is known as the Von Neumann bottleneck. Put simply, this means that the data bandwidth between the CPU and memory tends to limit the overall speed of computation. In terms of the actual technology that makes up a computer, the Von Neumann Bottleneck predicts that it is easier to make the CPU perform calculations faster than it is to supply it with data at the necessary rate for this to be possible.

In recent history, the Von Neumann bottleneck has become more apparent. The design philosophy of modern computers is based upon a physically separate CPU and main memory. It is possible to make the CPU run at a high data transfer rate because data is moved between locations inside them across tiny distances. The physical separation between CPU and main memory, however, requires a data bus to move data across comparatively long distances of centimetres or more. The problem of making this part of the system operate sufficiently fast to keep up with the CPU has been a great challenge to designers. [2]

I/O bound as a practical problem

The I/O bound state is considered undesirable because it means that the CPU must stall its operation while waiting for data to be loaded or unloaded from main memory or secondary storage. With faster computation speed being the primary goal of new computer designs and components such as the CPU and memory being expensive, there is a strong imperative to avoid I/O bound states and eliminating them can yield a more economic improvement in performance than upgrading the CPU or memory.

As CPU gets faster, processes tend to get more I/O-bound

Or in simpler terms:

As CPU gets faster, processes tend to not increase in speed in proportion to CPU speed because they get more I/O-bound.

This means that I/O bound processes are slower than non-I/O bound processes, not faster. This is due to increases in the rate of data processing in the core, while the rate at which data is transferred from storage to the processor does not increase with it. As CPU clock speed increases, allowing more instructions to be executed in a given time window, the limiting factor of effective execution is the rate at which instructions can be delivered to the processor from storage, and sent from the processor to their destination. In short, programs naturally shift to being more and more I/O bound. [3]

Comparison with CPU-bound

Assume we have one CPU-bound process and many I/O-bound processes. As the processes flow around the system, the following scenario may result. The CPU-bound process will get and hold the CPU. During this time, all the other processes will finish their I/O and will move into the ready queue, waiting for the CPU. While the processes wait in the ready queue, the I/O devices are idle. Eventually, the CPU-bound process finishes its CPU burst and moves to an I/O device. All the I/O-bound processes, which have short CPU bursts, execute quickly and move back to the I/O queues. At this point, the CPU sits idle. The CPU-bound process will then move back to the ready queue and be allocated the CPU. Again, all the I/O processes end up waiting in the ready queue until the CPU-bound process is done. There is a convoy effect as all the other processes wait for the one big process to get off the CPU. This effect results in lower CPU and device utilization than might be possible if the shorter processes were allowed to go first. [4]

See also

Related Research Articles

<span class="mw-page-title-main">Central processing unit</span> Central computer component which executes instructions

A central processing unit (CPU)—also called a central processor or main processor—is the most important processor in a given computer. Its electronic circuitry executes instructions of a computer program, such as arithmetic, logic, controlling, and input/output (I/O) operations. This role contrasts with that of external components, such as main memory and I/O circuitry, and specialized coprocessors such as graphics processing units (GPUs).

<span class="mw-page-title-main">Computer data storage</span> Storage of digital data readable by computers

Computer data storage is a technology consisting of computer components and recording media that are used to retain digital data. It is a core function and fundamental component of computers.

The control unit (CU) is a component of a computer's central processing unit (CPU) that directs the operation of the processor. A CU typically uses a binary decoder to convert coded instructions into timing and control signals that direct the operation of the other units.

In computing, a context switch is the process of storing the state of a process or thread, so that it can be restored and resume execution at a later point, and then restoring a different, previously saved, state. This allows multiple processes to share a single central processing unit (CPU), and is an essential feature of a multiprogramming or multitasking operating system. In a traditional CPU, each process - a program in execution - utilizes the various CPU registers to store data and hold the current state of the running process. However, in a multitasking operating system, the operating system switches between processes or threads to allow the execution of multiple processes simultaneously. For every switch, the operating system must save the state of the currently running process, followed by loading the next process state, which will run on the CPU. This sequence of operations that stores the state of the running process and the loading of the following running process is called a context switch.

<span class="mw-page-title-main">Intel 8088</span> Intel microprocessor model

The Intel 8088 microprocessor is a variant of the Intel 8086. Introduced on June 1, 1979, the 8088 has an eight-bit external data bus instead of the 16-bit bus of the 8086. The 16-bit registers and the one megabyte address range are unchanged, however. In fact, according to the Intel documentation, the 8086 and 8088 have the same execution unit (EU)—only the bus interface unit (BIU) is different. The 8088 was used in the original IBM PC and in IBM PC compatible clones.

Direct memory access (DMA) is a feature of computer systems that allows certain hardware subsystems to access main system memory independently of the central processing unit (CPU).

<span class="mw-page-title-main">Harvard architecture</span> Computer architecture where code and data each have a separate bus

The Harvard architecture is a computer architecture with separate storage and signal pathways for instructions and data. It is often contrasted with the von Neumann architecture, where program instructions and data share the same memory and pathways.

In computer science, algorithmic efficiency is a property of an algorithm which relates to the amount of computational resources used by the algorithm. An algorithm must be analyzed to determine its resource usage, and the efficiency of an algorithm can be measured based on the usage of different resources. Algorithmic efficiency can be thought of as analogous to engineering productivity for a repeating or continuous process.

<span class="mw-page-title-main">Front-side bus</span> Type of computer communication interface

The front-side bus (FSB) is a computer communication interface (bus) that was often used in Intel-chip-based computers during the 1990s and 2000s. The EV6 bus served the same function for competing AMD CPUs. Both typically carry data between the central processing unit (CPU) and a memory controller hub, known as the northbridge.

<span class="mw-page-title-main">Load (computing)</span> Amount of computational work that a computer system performs

In UNIX computing, the system load is a measure of the amount of computational work that a computer system performs. The load average represents the average system load over a period of time. It conventionally appears in the form of three numbers which represent the system load during the last one-, five-, and fifteen-minute periods.

In computing, scheduling is the action of assigning resources to perform tasks. The resources may be processors, network links or expansion cards. The tasks may be threads, processes or data flows.

<span class="mw-page-title-main">Spooling</span> Form of multitasking in computers

In computing, spooling is a specialized form of multi-programming for the purpose of copying data between different devices. In contemporary systems, it is usually used for mediating between a computer application and a slow peripheral, such as a printer. Spooling allows programs to "hand off" work to be done by the peripheral and then proceed to other tasks, or to not begin until input has been transcribed. A dedicated program, the spooler, maintains an orderly sequence of jobs for the peripheral and feeds it data at its own rate. Conversely, for slow input peripherals, such as a card reader, a spooler can maintain a sequence of computational jobs waiting for data, starting each job when all of the relevant input is available; see batch processing. The spool itself refers to the sequence of jobs, or the storage area where they are held. In many cases, the spooler is able to drive devices at their full rated speed with minimal impact on other processing.

<span class="mw-page-title-main">Von Neumann architecture</span> Computer architecture where code and data share a common bus

The von Neumann architecture—also known as the von Neumann model or Princeton architecture—is a computer architecture based on a 1945 description by John von Neumann, and by others, in the First Draft of a Report on the EDVAC. The document describes a design architecture for an electronic digital computer with these components:

In computer science, a task, job or process is said to be CPU-bound when the time it takes for it to complete is determined principally by the speed of the central processor. The term can also refer to the condition a computer running such a workload is in, in which its processor utilization is high, perhaps at 100% usage for many seconds or minutes, and interrupts generated by peripherals may be processed slowly or be indefinitely delayed.

In computer science, a data buffer is a region of a memory used to store data temporarily while it is being moved from one place to another. Typically, the data is stored in a buffer as it is retrieved from an input device or just before it is sent to an output device. However, a buffer may be used when data is moved between processes within a computer. That is comparable to buffers in telecommunication. Buffers can be implemented in a fixed memory location in hardware or by using a virtual data buffer in software that points at a location in the physical memory.

Computational RAM (C-RAM) is random-access memory with processing elements integrated on the same chip. This enables C-RAM to be used as a SIMD computer. It also can be used to more efficiently use memory bandwidth within a memory chip. The general technique of doing computations in memory is called Processing-In-Memory (PIM).

In computing, computer performance is the amount of useful work accomplished by a computer system. Outside of specific contexts, computer performance is estimated in terms of accuracy, efficiency and speed of executing computer program instructions. When it comes to high computer performance, one or more of the following factors might be involved:

<span class="mw-page-title-main">History of general-purpose CPUs</span> History of processors used in general purpose computers

The history of general-purpose CPUs is a continuation of the earlier history of computing hardware.

<span class="mw-page-title-main">Computer hardware</span> Physical components of a computer

Computer hardware includes the physical parts of a computer, such as the case, central processing unit (CPU), random access memory (RAM), monitor, mouse, keyboard, computer data storage, graphics card, sound card, speakers and motherboard.

<span class="mw-page-title-main">Input/output</span> Communication between an information processing system and the outside world

In computing, input/output is the communication between an information processing system, such as a computer, and the outside world, possibly a human or another information processing system. Inputs are the signals or data received by the system and outputs are the signals or data sent from it. The term can also be used as part of an action; to "perform I/O" is to perform an input or output operation.

References

  1. Identifying disk I/O-bound systems
  2. Erik Engbrecht's Blog: What does I/O bound really mean?
  3. slebetman. "why-i-o-bound-processes-are-faster" . Retrieved 9 August 2014.
  4. Silberschatz, Abraham; Galvin, Peter B.; Gagne, Greg (2013). Operating System Concepts. Vol. 9. John Wiley & Sons. ISBN   978-1-118-06333-0.