Vectored I/O

Last updated

In computing, vectored I/O, also known as scatter/gather I/O, is a method of input and output by which a single procedure call sequentially reads data from multiple buffers and writes it to a single data stream (gather), or reads data from a data stream and writes it to multiple buffers (scatter), as defined in a vector of buffers. Scatter/gather refers to the process of gathering data from, or scattering data into, the given set of buffers. Vectored I/O can operate synchronously or asynchronously. The main reasons for using vectored I/O are efficiency and convenience.

Contents

Vectored I/O has several potential uses:

Standards bodies document the applicable functions readv [1] and writev [2] in POSIX 1003.1-2001 and the Single UNIX Specification version 2. The Windows API has analogous functions ReadFileScatter and WriteFileGather; however, unlike the POSIX functions, they require the alignment of each buffer on a memory page. [3] Winsock provides separate WSASend and WSARecv functions without this requirement.

While working directly with a vector of buffers can be significantly harder than working with a single buffer, using higher-level APIs [4] for working efficiently can mitigate the difficulties.

Examples

The following example in the C programming language prints "Hello, Wikipedia Community!" to the standard output. Each word is saved into a single buffer and with only one call to writev(), all buffers are printed to the standard output.

#include<stdio.h>#include<stdlib.h>#include<string.h>#include<unistd.h>#include<sys/uio.h>intmain(intargc,char*argv[]){constcharbuf1[]="Hello, ";constcharbuf2[]="Wikipedia ";constcharbuf3[]="Community!\n";structiovecbufs[]={{.iov_base=(void*)buf1,.iov_len=strlen(buf1)},{.iov_base=(void*)buf2,.iov_len=strlen(buf2)},{.iov_base=(void*)buf3,.iov_len=strlen(buf3)},};if(writev(STDOUT_FILENO,bufs,sizeof(bufs)/sizeof(bufs[0]))==-1){perror("writev()");exit(EXIT_FAILURE);}returnEXIT_SUCCESS;}

Rust

Rust provides the write_vectored method on the Write trait. [5]

usestd::io::IoSlice;usestd::io::prelude::*;usestd::fs::File;fnmain()-> std::io::Result<()>{letdata1=[1;8];letdata2=[15;8];letio_slice1=IoSlice::new(&data1);letio_slice2=IoSlice::new(&data2);letmutbuffer=File::create("foo.txt")?;// Writes some prefix of the byte string, not necessarily all of it.buffer.write_vectored(&[io_slice1,io_slice2])?;Ok(())}

See also

Related Research Articles

The Cyclone programming language was intended to be a safe dialect of the C language. It avoids buffer overflows and other vulnerabilities that are possible in C programs by design, without losing the power and convenience of C as a tool for system programming. It is no longer supported by its original developers, with the reference tooling not supporting 64-bit platforms. The Rust language is mentioned by the original developers for having integrated many of the same ideas Cyclone had.

A Berkeley (BSD) socket is an application programming interface (API) for Internet domain sockets and Unix domain sockets, used for inter-process communication (IPC). It is commonly implemented as a library of linkable modules. It originated with the 4.2BSD Unix operating system, which was released in 1983.

In computer programming, standard streams are preconnected input and output communication channels between a computer program and its environment when it begins execution. The three input/output (I/O) connections are called standard input (stdin), standard output (stdout) and standard error (stderr). Originally I/O happened via a physically connected system console, but standard streams abstract this. When a command is executed via an interactive shell, the streams are typically connected to the text terminal on which the shell is running, but can be changed with redirection or a pipeline. More generally, a child process inherits the standard streams of its parent process.

The C programming language provides many standard library functions for file input and output. These functions make up the bulk of the C standard library header <stdio.h>. The functionality descends from a "portable I/O package" written by Mike Lesk at Bell Labs in the early 1970s, and officially became part of the Unix operating system in Version 7.

<span class="mw-page-title-main">Foreach loop</span> Control flow statement for traversing items in a collection

In computer programming, foreach loop is a control flow statement for traversing items in a collection. foreach is usually used in place of a standard for loop statement. Unlike other for loop constructs, however, foreach loops usually maintain no explicit counter: they essentially say "do this to everything in this set", rather than "do this x times". This avoids potential off-by-one errors and makes code simpler to read. In object-oriented languages, an iterator, even if implicit, is often used as the means of traversal.

In computing, a named pipe is an extension to the traditional pipe concept on Unix and Unix-like systems, and is one of the methods of inter-process communication (IPC). The concept is also found in OS/2 and Microsoft Windows, although the semantics differ substantially. A traditional pipe is "unnamed" and lasts only as long as the process. A named pipe, however, can last as long as the system is up, beyond the life of the process. It can be deleted if no longer used. Usually a named pipe appears as a file, and generally processes attach to it for IPC.

stat (system call) Unix system call

stat is a Unix system call that returns file attributes about an inode. The semantics of stat vary between operating systems. As an example, Unix command ls uses this system call to retrieve information on files that includes:

The bitap algorithm is an approximate string matching algorithm. The algorithm tells whether a given text contains a substring which is "approximately equal" to a given pattern, where approximate equality is defined in terms of Levenshtein distance – if the substring and pattern are within a given distance k of each other, then the algorithm considers them equal. The algorithm begins by precomputing a set of bitmasks containing one bit for each element of the pattern. Then it is able to do most of the work with bitwise operations, which are extremely fast.

typedef is a reserved keyword in the programming languages C, C++, and Objective-C. It is used to create an additional name (alias) for another data type, but does not create a new type, except in the obscure case of a qualified typedef of an array type where the typedef qualifiers are transferred to the array element type. As such, it is often used to simplify the syntax of declaring complex data structures consisting of struct and union types, although it is also commonly used to provide specific descriptive type names for integer data types of varying sizes.

In computing, ioctl is a system call for device-specific input/output operations and other operations which cannot be expressed by regular file semantics. It takes a parameter specifying a request code; the effect of a call depends completely on the request code. Request codes are often device-specific. For instance, a CD-ROM device driver which can instruct a physical device to eject a disc would provide an ioctl request code to do so. Device-independent request codes are sometimes used to give userspace access to kernel functions which are only used by core system software or still under development.

For most file systems, a program initializes access to a file in a file system using the open system call. This allocates resources associated to the file, and returns a handle that the process will use to refer to that file. In some cases the open is performed by the first access.

sizeof is a unary operator in the programming languages C and C++. It generates the storage size of an expression or a data type, measured in the number of char-sized units. Consequently, the construct sizeof (char) is guaranteed to be 1. The actual number of bits of type char is specified by the preprocessor macro CHAR_BIT, defined in the standard include file limits.h. On most modern computing platforms this is eight bits. The result of sizeof has an unsigned integer type that is usually denoted by size_t.

In client-server computing, a Unix domain socket is a Berkeley socket that allows data to be exchanged between two processes executing on the same Unix or Unix-like host computer. This is similar to an Internet domain socket that allows data to be exchanged between two processes executing on different host computers.

setjmp.h is a header defined in the C standard library to provide "non-local jumps": control flow that deviates from the usual subroutine call and return sequence. The complementary functions setjmp and longjmp provide this functionality.

select is a system call and application programming interface (API) in Unix-like and POSIX-compliant operating systems for examining the status of file descriptors of open input/output channels. The select system call is similar to the poll facility introduced in UNIX System V and later operating systems. However, with the c10k problem, both select and poll have been superseded by the likes of kqueue, epoll, /dev/poll and I/O completion ports.

Secure coding is the practice of developing computer software in such a way that guards against the accidental introduction of security vulnerabilities. Defects, bugs and logic flaws are consistently the primary cause of commonly exploited software vulnerabilities. Through the analysis of thousands of reported vulnerabilities, security professionals have discovered that most vulnerabilities stem from a relatively small number of common software programming errors. By identifying the insecure coding practices that lead to these errors and educating developers on secure alternatives, organizations can take proactive steps to help significantly reduce or eliminate vulnerabilities in software before deployment.

The write is one of the most basic routines provided by a Unix-like operating system kernel. It writes data from a buffer declared by the user to a given device, such as a file. This is the primary way to output data from a program by directly using a system call. The destination is identified by a numeric code. The data to be written, for instance a piece of text, is defined by a pointer and a size, given in number of bytes.

In the C++ programming language, input/output library refers to a family of class templates and supporting functions in the C++ Standard Library that implement stream-based input/output capabilities. It is an object-oriented alternative to C's FILE-based streams from the C standard library.

Although C++ is one of the most widespread programming languages, many prominent software engineers criticize C++ arguing that it is overly complex and fundamentally flawed. Among the critics have been: Robert Pike, Joshua Bloch, Linus Torvalds, Donald Knuth, Richard Stallman, and Ken Thompson. C++ has been widely adopted and implemented as a systems language through most of its existence. It has been used to build many pieces of important software such as operating systems, runtime systems, programming language interpreters, parsers, lexers, compilers, etc.

<span class="mw-page-title-main">Zig (programming language)</span> General-purpose programming language

Zig is an imperative, general-purpose, statically typed, compiled system programming language designed by Andrew Kelley. It is free and open-source software, released under an MIT License.

References

  1. readv Archived 2008-09-05 at the Wayback Machine in the Single Unix Specification
  2. writev Archived 2007-12-17 at the Wayback Machine in the Single Unix Specification
  3. ReadFileScatter in MSDN Library
  4. Vstr Archived 2017-03-05 at the Wayback Machine the Vectored String API
  5. "Write in std::io - Rust". doc.rust-lang.org. Retrieved 26 January 2025.