C file input/output

The C programming language provides many standard library functions for file input and output. These functions make up the bulk of the C standard library header <stdio.h>. [1] The functionality descends from a "portable I/O package" written by Mike Lesk at Bell Labs in the early 1970s, [2] and officially became part of the Unix operating system in Version 7. [3]

The I/O functionality of C is fairly low-level by modern standards; C abstracts all file operations into operations on streams of bytes, which may be "input streams" or "output streams". Unlike some earlier programming languages, C has no direct support for random-access data files; to read from a record in the middle of a file, the programmer must create a stream, seek to the middle of the file, and then read bytes in sequence from the stream.
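The seek-then-read pattern described above can be sketched as follows. This is a minimal illustration, assuming fixed-size records stored back to back in a binary file; the `struct record` layout and the `read_record` helper are hypothetical names introduced only for this example.

```c
#include <stdio.h>

/* Hypothetical fixed-size record layout, used only for this illustration. */
struct record {
    int  id;
    char name[28];
};

/* Reads the record at zero-based position `index` from a binary stream.
   Returns 0 on success and -1 on failure. */
int read_record(FILE *fp, long index, struct record *out)
{
    /* Seek to the byte offset of the requested record... */
    if (fseek(fp, index * (long)sizeof *out, SEEK_SET) != 0)
        return -1;

    /* ...then read its bytes in sequence from the stream. */
    return fread(out, sizeof *out, 1, fp) == 1 ? 0 : -1;
}
```

A caller would open the file in binary mode (for example with mode "rb") and pass the resulting stream to `read_record`. Records written this way are only portable between programs that agree on the structure's size and layout.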

The stream model of file I/O was popularized by Unix, which was developed concurrently with the C programming language itself. The vast majority of modern operating systems have inherited streams from Unix, and many languages in the C programming language family have inherited C's file I/O interface with few if any changes (for example, PHP).

Overview

The library uses streams to operate on physical devices such as keyboards, printers and terminals, and on any other type of file supported by the system. Streams are an abstraction that allows all of these to be handled in a uniform way: all streams have similar properties, independent of the individual characteristics of the physical media they are associated with. [4]
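As a small illustration of this uniformity, the sketch below copies bytes from one stream to another without knowing what either stream is attached to. The `copy_stream` helper is a hypothetical name chosen for this example, not part of the standard library.

```c
#include <stdio.h>

/* Copies every byte from `in` to `out` and returns the number of bytes
   copied, or -1 if a read or write error occurred. The function does not
   depend on what the streams are attached to. */
long copy_stream(FILE *in, FILE *out)
{
    long count = 0;
    int c;                      /* int, not char, so EOF can be told apart */

    while ((c = getc(in)) != EOF) {
        if (putc(c, out) == EOF)
            return -1;
        ++count;
    }
    return ferror(in) ? -1 : count;
}
```

The same function could be called as `copy_stream(stdin, stdout)` to echo terminal input, or with two streams returned by fopen to copy a file.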

Functions

Most of the C file input/output functions are defined in <stdio.h> (or in the C++ header cstdio, which contains the standard C functionality but in the std namespace).

| Category | Byte character | Wide character | Description |
|---|---|---|---|
| File access | fopen | | Opens a file (with a non-Unicode filename on Windows and possible UTF-8 filename on Linux) |
| | freopen | | Opens a different file with an existing stream |
| | fflush | | Synchronizes an output stream with the actual file |
| | fclose | | Closes a file |
| | setbuf | | Sets the buffer for a file stream |
| | setvbuf | | Sets the buffer and its size for a file stream |
| | fwide | | Switches a file stream between wide-character I/O and narrow-character I/O |
| Direct input/output | fread | | Reads from a file |
| | fwrite | | Writes to a file |
| Unformatted input/output | fgetc, getc | fgetwc, getwc | Reads a byte/wchar_t from a file stream |
| | fgets | fgetws | Reads a byte/wchar_t line from a file stream |
| | fputc, putc | fputwc, putwc | Writes a byte/wchar_t to a file stream |
| | fputs | fputws | Writes a byte/wchar_t string to a file stream |
| | getchar | getwchar | Reads a byte/wchar_t from stdin |
| | gets | | Reads a byte string from stdin until a newline or end of file is encountered (deprecated in C99, removed from C11) |
| | putchar | putwchar | Writes a byte/wchar_t to stdout |
| | puts | | Writes a byte string to stdout |
| | ungetc | ungetwc | Puts a byte/wchar_t back into a file stream |
| Formatted input/output | scanf, fscanf, sscanf | wscanf, fwscanf, swscanf | Reads formatted byte/wchar_t input from stdin, a file stream or a buffer |
| | vscanf, vfscanf, vsscanf | vwscanf, vfwscanf, vswscanf | Reads formatted byte/wchar_t input from stdin, a file stream or a buffer using a variable argument list |
| | printf, fprintf, sprintf, snprintf | wprintf, fwprintf, swprintf | Prints formatted byte/wchar_t output to stdout, a file stream or a buffer |
| | vprintf, vfprintf, vsprintf, vsnprintf | vwprintf, vfwprintf, vswprintf | Prints formatted byte/wchar_t output to stdout, a file stream or a buffer using a variable argument list |
| | perror | | Writes a description of the current error to stderr |
| File positioning | ftell, ftello | | Returns the current file position indicator (ftello is a POSIX extension) |
| | fseek, fseeko | | Moves the file position indicator to a specific location in a file (fseeko is a POSIX extension) |
| | fgetpos | | Gets the file position indicator |
| | fsetpos | | Moves the file position indicator to a specific location in a file |
| | rewind | | Moves the file position indicator to the beginning of a file |
| Error handling | clearerr | | Clears errors |
| | feof | | Checks for the end-of-file |
| | ferror | | Checks for a file error |
| Operations on files | remove | | Erases a file |
| | rename | | Renames a file |
| | tmpfile | | Returns a pointer to a temporary file |
| | tmpnam | | Returns a unique filename |
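Several of the functions listed above are commonly used together. The following sketch, under the assumption that the input is a text stream with one integer at the start of each line, reads lines with fgets, parses them with sscanf, and uses ferror to tell a read error from end-of-file. The `sum_lines` helper and its 128-byte line buffer are arbitrary choices made for this example.

```c
#include <stdio.h>

/* Reads the stream line by line and adds up the first integer found on
   each line; lines that do not start with an integer are skipped.
   Returns 0 on success, or -1 if a read error occurred.
   Note: a line longer than the buffer is split across several fgets
   calls, which this sketch does not try to detect. */
int sum_lines(FILE *fp, long *total)
{
    char line[128];             /* arbitrary limit chosen for this sketch */
    long value;

    *total = 0;
    while (fgets(line, sizeof line, fp) != NULL) {
        if (sscanf(line, "%ld", &value) == 1)
            *total += value;
    }

    /* fgets returns NULL on both end-of-file and error;
       ferror distinguishes the two cases. */
    return ferror(fp) ? -1 : 0;
}
```

Reading a whole line first and parsing it afterwards is often preferred over calling fscanf directly on the stream, because a malformed line is then consumed rather than left in the input.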

Constants

Constants defined in the <stdio.h> header include:

| Name | Notes |
|---|---|
| EOF | A negative integer of type int used to indicate end-of-file conditions |
| BUFSIZ | An integer which is the size of the buffer used by the setbuf() function |
| FILENAME_MAX | The size of a char array which is large enough to store the name of any file that can be opened |
| FOPEN_MAX | The number of files that may be open simultaneously; will be at least eight |
| _IOFBF | An abbreviation for "input/output fully buffered"; it is an integer which may be passed to the setvbuf() function to request block buffered input and output for an open stream |
| _IOLBF | An abbreviation for "input/output line buffered"; it is an integer which may be passed to the setvbuf() function to request line buffered input and output for an open stream |
| _IONBF | An abbreviation for "input/output not buffered"; it is an integer which may be passed to the setvbuf() function to request unbuffered input and output for an open stream |
| L_tmpnam | The size of a char array which is large enough to store a temporary filename generated by the tmpnam() function |
| NULL | A macro expanding to the null pointer constant; that is, a constant representing a pointer value which is guaranteed not to be a valid address of an object in memory |
| SEEK_CUR | An integer which may be passed to the fseek() function to request positioning relative to the current file position |
| SEEK_END | An integer which may be passed to the fseek() function to request positioning relative to the end of the file |
| SEEK_SET | An integer which may be passed to the fseek() function to request positioning relative to the beginning of the file |
| TMP_MAX | The maximum number of unique filenames generable by the tmpnam() function; will be at least 25 |
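The buffering and seek constants are passed as plain arguments, as the following sketch suggests. Both helper names are hypothetical. The size query relies on the common idiom of seeking to SEEK_END, which the C standard does not require binary streams to support meaningfully, so it should be read as an illustration rather than a portable guarantee.

```c
#include <stdio.h>

/* Requests full (block) buffering with a library-allocated buffer of
   BUFSIZ bytes. setvbuf must be called before any other operation on
   the stream. Returns 0 on success. */
int use_full_buffering(FILE *fp)
{
    return setvbuf(fp, NULL, _IOFBF, BUFSIZ);
}

/* Returns the size of an open binary stream in bytes, or -1 on failure.
   The original file position is restored before returning. */
long stream_size(FILE *fp)
{
    long saved = ftell(fp);
    long size;

    if (saved < 0 || fseek(fp, 0L, SEEK_END) != 0)
        return -1;
    size = ftell(fp);
    if (fseek(fp, saved, SEEK_SET) != 0)
        return -1;
    return size;
}
```

Passing _IOLBF or _IONBF instead of _IOFBF in the setvbuf call would request line buffering or no buffering, respectively.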

Variables

Variables defined in the <stdio.h> header include:

| Name | Notes |
|---|---|
| stdin | A pointer to a FILE which refers to the standard input stream, usually a keyboard. |
| stdout | A pointer to a FILE which refers to the standard output stream, usually a display terminal. |
| stderr | A pointer to a FILE which refers to the standard error stream, often a display terminal. |

Member types

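Data types defined in the <stdio.h> header include FILE, an object type that holds the information needed to control a stream; fpos_t, a type capable of recording a position within a file, as used by fgetpos and fsetpos; and size_t, the unsigned integer type of the result of the sizeof operator.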

Extensions

The POSIX standard defines several extensions to stdio in its Base Definitions, among which are the getline function, which reads a line into dynamically allocated memory, the fileno and fdopen functions, which establish the link between FILE objects and file descriptors, and a group of functions for creating FILE objects that refer to in-memory buffers. [5]
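Two of these extensions can be combined in a short sketch, assuming a POSIX.1-2008 system: fmemopen wraps an in-memory buffer in a FILE object, and getline reads from it line by line while allocating the line buffer itself. The `count_lines_in_buffer` helper is a hypothetical name used only here, and the code is not portable to platforms that lack these functions.

```c
#define _POSIX_C_SOURCE 200809L   /* must precede the includes to expose
                                     getline and fmemopen */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>            /* ssize_t, the return type of getline */

/* Counts the lines in a non-empty string by reading it through a
   read-only in-memory stream. Returns -1 if the stream cannot be opened. */
long count_lines_in_buffer(const char *text)
{
    /* The cast drops const; mode "r" means the buffer is only read. */
    FILE *fp = fmemopen((void *)text, strlen(text), "r");
    char *line = NULL;            /* getline allocates and grows this */
    size_t cap = 0;
    long lines = 0;

    if (fp == NULL)
        return -1;

    while (getline(&line, &cap, fp) != -1)
        ++lines;

    free(line);                   /* the caller of getline owns the buffer */
    fclose(fp);
    return lines;
}
```

Because getline grows its buffer as needed, the sketch avoids the fixed line-length limit that a plain fgets loop would impose.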

Example

The following C program opens a binary file called myfile, reads five bytes from it, and then closes the file.

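    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        /* unsigned char so that %02X prints each byte as two hex digits */
        unsigned char buffer[5];
        FILE *fp = fopen("myfile", "rb");

        if (fp == NULL) {
            perror("Failed to open file \"myfile\"");
            return EXIT_FAILURE;
        }

        /* fread returns the number of items read; fewer than 5 means
           an error or a premature end of file */
        if (fread(buffer, 1, 5, fp) < 5) {
            fputs("An error occurred while reading the file.\n", stderr);
            fclose(fp);   /* close the stream on the error path as well */
            return EXIT_FAILURE;
        }

        fclose(fp);

        printf("The bytes read were: ");
        for (int i = 0; i < 5; ++i) {
            printf("%02X ", buffer[i]);
        }
        putchar('\n');

        return EXIT_SUCCESS;
    }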

Alternatives to stdio

Several alternatives to stdio have been developed. Among these is the C++ iostream library, part of the ISO C++ standard. ISO C++ still requires the stdio functionality.

Other alternatives include Sfio [6] (A Safe/Fast I/O Library) from AT&T Bell Laboratories. Introduced in 1991, this library aimed to avoid the inconsistencies, unsafe practices and inefficiencies in the design of stdio. Among its features is the ability to insert callback functions into a stream to customize the handling of data read from or written to the stream. [7] It was released to the outside world in 1997, and its last release was on 1 February 2005. [8]

References

  1. ISO/IEC 9899:1999 specification. p. 274, § 7.19.
  2. Kernighan, Brian; Pike, Rob (1984). The UNIX Programming Environment. Englewood Cliffs: Prentice Hall. p. 200.
  3. McIlroy, M. D. (1987). A Research Unix reader: annotated excerpts from the Programmer's Manual, 1971–1986 (PDF) (Technical report). CSTR. Bell Labs. 139.
  4. "(stdio.h) - C++ Reference". C++. Retrieved 25 July 2021.
  5. stdio.h. Base Definitions Reference, The Single UNIX Specification, Version 4 from The Open Group.
  6. "Sfio: A Safe/Fast I/O Library". Archived from the original on 11 February 2006. Retrieved 16 March 2021.
  7. Korn, David G.; Vo, Kiem-Phong (1991). SFIO: Safe/Fast String/File IO. Proc. Summer USENIX Conf. CiteSeerX 10.1.1.51.6574.
  8. Fowler, Glenn S.; Korn, David G.; Vo, Kiem-Phong (2000). Extended Formatting with Sfio. Proc. Summer USENIX Conf.