Vectored I/O

Last updated

In computing, vectored I/O, also known as scatter/gather I/O, is a method of input and output by which a single procedure call sequentially reads data from multiple buffers and writes it to a single data stream (gather), or reads data from a data stream and writes it to multiple buffers (scatter), as defined in a vector of buffers. Scatter/gather refers to the process of gathering data from, or scattering data into, the given set of buffers. Vectored I/O can operate synchronously or asynchronously. The main reasons for using vectored I/O are efficiency and convenience.

Contents

Vectored I/O has several potential uses:

Standards bodies document the applicable functions readv [1] and writev [2] in POSIX 1003.1-2001 and the Single UNIX Specification version 2. The Windows API has analogous functions ReadFileScatter and WriteFileGather; however, unlike the POSIX functions, they require the alignment of each buffer on a memory page. [3] Winsock provides separate WSASend and WSARecv functions without this requirement.

While working directly with a vector of buffers can be significantly harder than working with a single buffer, using higher-level APIs [4] for working efficiently can mitigate the difficulties.

Examples

The following example in the C programming language prints "Hello, Wikipedia Community!" to the standard output. Each word is saved into a single buffer and with only one call to writev(), all buffers are printed to the standard output.

#include<stdio.h>#include<stdlib.h>#include<string.h>#include<unistd.h>#include<sys/uio.h>intmain(intargc,char*argv[]){constcharbuf1[]="Hello, ";constcharbuf2[]="Wikipedia ";constcharbuf3[]="Community!\n";structiovecbufs[]={{.iov_base=(void*)buf1,.iov_len=strlen(buf1)},{.iov_base=(void*)buf2,.iov_len=strlen(buf2)},{.iov_base=(void*)buf3,.iov_len=strlen(buf3)},};if(writev(STDOUT_FILENO,bufs,sizeof(bufs)/sizeof(bufs[0]))==-1){perror("writev()");exit(EXIT_FAILURE);}returnEXIT_SUCCESS;}

See also

Related Research Articles

The Cyclone programming language was intended to be a safe dialect of the C language. It avoids buffer overflows and other vulnerabilities that are possible in C programs by design, without losing the power and convenience of C as a tool for system programming. It is no longer supported by its original developers, with the reference tooling not supporting 64-bit platforms. The Rust language is mentioned by the original developers for having integrated many of the same ideas Cyclone had.

<span class="mw-page-title-main">GNU Debugger</span> Source-level debugger

The GNU Debugger (GDB) is a portable debugger that runs on many Unix-like systems and works for many programming languages, including Ada, Assembly, C, C++, D, Fortran, Haskell, Go, Objective-C, OpenCL C, Modula-2, Pascal, Rust, and partially others.

Berkeley sockets is an application programming interface (API) for Internet sockets and Unix domain sockets, used for inter-process communication (IPC). It is commonly implemented as a library of linkable modules. It originated with the 4.2BSD Unix operating system, which was released in 1983.

In computing, tar is a computer software utility for collecting many files into one archive file, often referred to as a tarball, for distribution or backup purposes. The name is derived from "tape archive", as it was originally developed to write data to sequential I/O devices with no file system of their own, such as devices that use magnetic tape. The archive data sets created by tar contain various file system parameters, such as name, timestamps, ownership, file-access permissions, and directory organization. POSIX abandoned tar in favor of pax, yet tar sees continued widespread use.

In computer programming, standard streams are preconnected input and output communication channels between a computer program and its environment when it begins execution. The three input/output (I/O) connections are called standard input (stdin), standard output (stdout) and standard error (stderr). Originally I/O happened via a physically connected system console, but standard streams abstract this. When a command is executed via an interactive shell, the streams are typically connected to the text terminal on which the shell is running, but can be changed with redirection or a pipeline. More generally, a child process inherits the standard streams of its parent process.

The C programming language provides many standard library functions for file input and output. These functions make up the bulk of the C standard library header <stdio.h>. The functionality descends from a "portable I/O package" written by Mike Lesk at Bell Labs in the early 1970s, and officially became part of the Unix operating system in Version 7.

In computing, mmap(2) is a POSIX-compliant Unix system call that maps files or devices into memory. It is a method of memory-mapped file I/O. It implements demand paging because file contents are not immediately read from disk and initially use no physical RAM at all. The actual reads from disk are performed after a specific location is accessed, in a lazy manner. After the mapping is no longer needed, the pointers must be unmapped with munmap(2). Protection information—for example, marking mapped regions as executable—can be managed using mprotect(2), and special treatment can be enforced using madvise(2).

In Unix and Unix-like computer operating systems, a file descriptor is a process-unique identifier (handle) for a file or other input/output resource, such as a pipe or network socket.

The One Definition Rule (ODR) is an important rule of the C++ programming language that prescribes that classes/structs and non-inline functions cannot have more than one definition in the entire program and template and types cannot have more than one definition by translation unit. It is defined in the ISO C++ Standard 2003, at section 3.2. Some other programming languages have similar but differently defined rules towards the same objective.

In computer science, asynchronous I/O is a form of input/output processing that permits other processing to continue before the I/O operation has finished. A name used for asynchronous I/O in the Windows API is overlapped I/O.

In computing, ioctl is a system call for device-specific input/output operations and other operations which cannot be expressed by regular file semantics. It takes a parameter specifying a request code; the effect of a call depends completely on the request code. Request codes are often device-specific. For instance, a CD-ROM device driver which can instruct a physical device to eject a disc would provide an ioctl request code to do so. Device-independent request codes are sometimes used to give userspace access to kernel functions which are only used by core system software or still under development.

For most file systems, a program initializes access to a file in a file system using the open system call. This allocates resources associated to the file, and returns a handle that the process will use to refer to that file. In some cases the open is performed by the first access.

sizeof is a unary operator in the programming languages C and C++. It generates the storage size of an expression or a data type, measured in the number of char-sized units. Consequently, the construct sizeof (char) is guaranteed to be 1. The actual number of bits of type char is specified by the preprocessor macro CHAR_BIT, defined in the standard include file limits.h. On most modern computing platforms this is eight bits. The result of sizeof has an unsigned integer type that is usually denoted by size_t.

In computing, exec is a functionality of an operating system that runs an executable file in the context of an already existing process, replacing the previous executable. This act is also referred to as an overlay. It is especially important in Unix-like systems, although it also exists elsewhere. As no new process is created, the process identifier (PID) does not change, but the machine code, data, heap, and stack of the process are replaced by those of the new program.

setjmp.h is a header defined in the C standard library to provide "non-local jumps": control flow that deviates from the usual subroutine call and return sequence. The complementary functions setjmp and longjmp provide this functionality.

stdarg.h is a header in the C standard library of the C programming language that allows functions to accept an indefinite number of arguments. It provides facilities for stepping through a list of function arguments of unknown number and type. C++ provides this functionality in the header cstdarg.

select is a system call and application programming interface (API) in Unix-like and POSIX-compliant operating systems for examining the status of file descriptors of open input/output channels. The select system call is similar to the poll facility introduced in UNIX System V and later operating systems. However, with the c10k problem, both select and poll have been superseded by the likes of kqueue, epoll, /dev/poll and I/O completion ports.

Secure coding is the practice of developing computer software in such a way that guards against the accidental introduction of security vulnerabilities. Defects, bugs and logic flaws are consistently the primary cause of commonly exploited software vulnerabilities. Through the analysis of thousands of reported vulnerabilities, security professionals have discovered that most vulnerabilities stem from a relatively small number of common software programming errors. By identifying the insecure coding practices that lead to these errors and educating developers on secure alternatives, organizations can take proactive steps to help significantly reduce or eliminate vulnerabilities in software before deployment.

The write is one of the most basic routines provided by a Unix-like operating system kernel. It writes data from a buffer declared by the user to a given device, such as a file. This is the primary way to output data from a program by directly using a system call. The destination is identified by a numeric code. The data to be written, for instance a piece of text, is defined by a pointer and a size, given in number of bytes.

cat (Unix) Unix command utility

cat is a standard Unix utility that reads files sequentially, writing them to standard output. The name is derived from its function to (con)catenate files . It has been ported to a number of operating systems.

References

  1. readv Archived 2008-09-05 at the Wayback Machine in the Single Unix Specification
  2. writev Archived 2007-12-17 at the Wayback Machine in the Single Unix Specification
  3. ReadFileScatter in MSDN Library
  4. Vstr Archived 2017-03-05 at the Wayback Machine the Vectored String API