Read (system call)

Last updated

In modern POSIX compliant operating systems, a program that needs to access data from a file stored in a file system uses the read system call. The file is identified by a file descriptor that is normally obtained from a previous call to open. This system call reads in data in bytes, the number of which is specified by the caller, from the file and stores then into a buffer supplied by the calling process.

Contents

The read system call takes three arguments:

  1. The file descriptor of the file.
  2. The buffer where the read data is to be stored.
  3. The number of bytes to be read from the file.

POSIX usage

The read system call interface is standardized by the POSIX specification. Data from a file is read by calling the read function:

ssize_tread(intfd,void*buf,size_tcount);

The value returned is the number of bytes read (zero indicates end of file) and the file position is advanced by this number. It is not an error if this number is smaller than the number of bytes requested; this may happen for example because fewer bytes are actually available right now (maybe because we were close to end-of-file, or because we are reading from a pipe, or from a terminal), or because the system call was interrupted by a signal.

Alternatively, -1 is returned when an error occurs, in such a case errno is set appropriately and further it is left unspecified whether the file position (if any) changes.

See also

Related Research Articles

NT File System (NTFS) is a proprietary journaling file system developed by Microsoft in the 1990s.

Berkeley sockets is an application programming interface (API) for Internet sockets and Unix domain sockets, used for inter-process communication (IPC). It is commonly implemented as a library of linkable modules. It originated with the 4.2BSD Unix operating system, which was released in 1983.

In computing, tar is a computer software utility for collecting many files into one archive file, often referred to as a tarball, for distribution or backup purposes. The name is derived from "tape archive", as it was originally developed to write data to sequential I/O devices with no file system of their own, such as devices that use magnetic tape. The archive data sets created by tar contain various file system parameters, such as name, timestamps, ownership, file-access permissions, and directory organization. POSIX abandoned tar in favor of pax, yet tar sees continued widespread use.

In computer programming, standard streams are preconnected input and output communication channels between a computer program and its environment when it begins execution. The three input/output (I/O) connections are called standard input (stdin), standard output (stdout) and standard error (stderr). Originally I/O happened via a physically connected system console, but standard streams abstract this. When a command is executed via an interactive shell, the streams are typically connected to the text terminal on which the shell is running, but can be changed with redirection or a pipeline. More generally, a child process inherits the standard streams of its parent process.

The C programming language provides many standard library functions for file input and output. These functions make up the bulk of the C standard library header <stdio.h>. The functionality descends from a "portable I/O package" written by Mike Lesk at Bell Labs in the early 1970s, and officially became part of the Unix operating system in Version 7.

In computing, a bus error is a fault raised by hardware, notifying an operating system (OS) that a process is trying to access memory that the CPU cannot physically address: an invalid address for the address bus, hence the name. In modern use on most architectures these are much rarer than segmentation faults, which occur primarily due to memory access violations: problems in the logical address or permissions.

C dynamic memory allocation refers to performing manual memory management for dynamic memory allocation in the C programming language via a group of functions in the C standard library, namely malloc, realloc, calloc, aligned_alloc and free.

errno.h is a header file in the standard library of the C programming language. It defines macros for reporting and retrieving error conditions using the symbol errno.

<span class="mw-page-title-main">Commodore DOS</span> Operating system

Commodore DOS, also known as CBM DOS, is the disk operating system used with Commodore's 8-bit computers. Unlike most other DOSes, which are loaded from disk into the computer's own RAM and executed there, CBM DOS is executed internally in the drive: the DOS resides in ROM chips inside the drive, and is run there by one or more dedicated MOS 6502 family CPUs. Thus, data transfer between Commodore 8-bit computers and their disk drives more closely resembles a local area network connection than typical disk/host transfers.

java.nio is a collection of Java programming language APIs that offer features for intensive I/O operations. It was introduced with the J2SE 1.4 release of Java by Sun Microsystems to complement an existing standard I/O. NIO was developed under the Java Community Process as JSR 51. An extension to NIO that offers a new file system API, called NIO.2, was released with Java SE 7 ("Dolphin").

File locking is a mechanism that restricts access to a computer file, or to a region of a file, by allowing only one user or process to modify or delete it at a specific time and to prevent reading of the file while it's being modified or deleted.

Descriptors are an architectural feature of Burroughs large systems, including the current Unisys Clearpath/MCP systems. Apart from being stack- and tag-based, a notable architectural feature of these systems is that they are descriptor-based. Descriptors are the means of having data that does not reside on the stack such as arrays and objects. Descriptors are also used for string data as in compilers and commercial applications.

For most file systems, a program initializes access to a file in a file system using the open system call. This allocates resources associated to the file, and returns a handle that the process will use to refer to that file. In some cases the open is performed by the first access.

setjmp.h is a header defined in the C standard library to provide "non-local jumps": control flow that deviates from the usual subroutine call and return sequence. The complementary functions setjmp and longjmp provide this functionality.

select is a system call and application programming interface (API) in Unix-like and POSIX-compliant operating systems for examining the status of file descriptors of open input/output channels. The select system call is similar to the poll facility introduced in UNIX System V and later operating systems. However, with the c10k problem, both select and poll have been superseded by the likes of kqueue, epoll, /dev/poll and I/O completion ports.

S3 is a structured, imperative high-level computer programming language. It was developed by the UK company International Computers Limited (ICL) for its 2900 Series mainframes. It is a system programming language with syntax influenced by ALGOL 68 but with data types and operators aligned to those offered by the 2900 Series. It was the implementation language of the operating system VME.

In Unix-like operating systems, a device file, device node, or special file is an interface to a device driver that appears in a file system as if it were an ordinary file. There are also special files in DOS, OS/2, and Windows. These special files allow an application program to interact with a device by using its device driver via standard input/output system calls. Using standard system calls simplifies many programming tasks, and leads to consistent user-space I/O mechanisms regardless of device features and functions.

epoll is a Linux kernel system call for a scalable I/O event notification mechanism, first introduced in version 2.5.45 of the Linux kernel. Its function is to monitor multiple file descriptors to see whether I/O is possible on any of them. It is meant to replace the older POSIX select(2) and poll(2) system calls, to achieve better performance in more demanding applications, where the number of watched file descriptors is large (unlike the older system calls, which operate in O(n) time, epoll operates in O(1) time).

A close system call is a system call used to close a file descriptor by the kernel. For most file systems, a program terminates access to a file in a filesystem using the close system call. This flushes file buffers, updates file metadata, which may include and end-of-file indicator in the data; de-allocates resources associated with the file and updates the system wide table of files in use. Some programming languages maintain a data structure of files opened by their runtime library and may close when the program terminates. This practice is known as resource acquisition is initialization (RAII). Some operating systems will invoke the close on files held by a program if it terminates. Some operating systems will invoke the close syscall as part of an operating system recovery as a result of a system failure.

The write is one of the most basic routines provided by a Unix-like operating system kernel. It writes data from a buffer declared by the user to a given device, such as a file. This is the primary way to output data from a program by directly using a system call. The destination is identified by a numeric code. The data to be written, for instance a piece of text, is defined by a pointer and a size, given in number of bytes.

References