File descriptor

Last updated November 17, 2024

In Unix and Unix-like computer operating systems, a file descriptor (FD, less frequently fildes) is a process-unique identifier (handle) for a file or other input/output resource, such as a pipe or network socket.

Overflow
Operations on file descriptors
Creating file descriptors
Deriving file descriptors
Operations on a single file descriptor
Operations on multiple file descriptors
Operations on the file descriptor table
Operations that modify process state
File locking
Sockets
Miscellaneous
at suffix operations
File descriptors as capabilities
See also
References

File descriptors typically have non-negative integer values, with negative values being reserved to indicate "no value" or error conditions.

File descriptors are a part of the POSIX API. Each Unix process (except perhaps daemons) should have three standard POSIX file descriptors, corresponding to the three standard streams:

Integer value	Name	< unistd.h > symbolic constant^[1]	< stdio.h > file stream^[2]
0	Standard input	STDIN_FILENO	stdin
1	Standard output	STDOUT_FILENO	stdout
2	Standard error	STDERR_FILENO	stderr

Overflow

In the traditional implementation of Unix, file descriptors index into a per-process file descriptor table maintained by the kernel, that in turn indexes into a system-wide table of files opened by all processes, called the file table. This table records the mode with which the file (or other resource) has been opened: for reading, writing, appending, and possibly other modes. It also indexes into a third table called the inode table that describes the actual underlying files.^[3] To perform input or output, the process passes the file descriptor to the kernel through a system call, and the kernel will access the file on behalf of the process. The process does not have direct access to the file or inode tables.

On Linux, the set of file descriptors open in a process can be accessed under the path /proc/PID/fd/, where PID is the process identifier. File descriptor /proc/PID/fd/0 is stdin, /proc/PID/fd/1 is stdout, and /proc/PID/fd/2 is stderr. As a shortcut to these, any running process can also access its own file descriptors through the folders /proc/self/fd and /dev/fd.^[4]

In Unix-like systems, file descriptors can refer to any Unix file type named in a file system. As well as regular files, this includes directories, block and character devices (also called "special files"), Unix domain sockets, and named pipes. File descriptors can also refer to other objects that do not normally exist in the file system, such as anonymous pipes and network sockets.

The FILE data structure in the C standard I/O library usually includes a low level file descriptor for the object in question on Unix-like systems. The overall data structure provides additional abstraction and is instead known as a file handle.

Operations on file descriptors

The following lists typical operations on file descriptors on modern Unix-like systems. Most of these functions are declared in the <unistd.h> header, but some are in the <fcntl.h> header instead.

Creating file descriptors

open()
creat()^[5]
socket()
accept()
socketpair()
pipe()
epoll_create() (Linux)
signalfd() (Linux)
eventfd() (Linux)
timerfd_create() (Linux)
memfd_create() (Linux)
userfaultfd() (Linux)
fanotify_init() (Linux)
inotify_init() (Linux)
clone() (with flag CLONE_PIDFD, Linux)
pidfd_open() (Linux)
open_by_handle_at() (Linux)
kqueue() (BSD)
pdfork() (kFreeBSD)

Deriving file descriptors

dirfd()
fileno()

Operations on a single file descriptor

read(), write()
readv(), writev()
pread(), pwrite()
recv(), send()
recvfrom(), sendto()
recvmsg(), sendmsg() (also used for sending FDs to other processes over a Unix domain socket)
recvmmsg(), sendmmsg()
lseek(), llseek()
fstat()
fstatvfs()
fchmod()
fchown()
ftruncate()
fsync()
fdatasync()
fdopendir()
fgetxattr(), fsetxattr() (Linux)
flistxattr(), fremovexattr() (Linux)
statx (Linux)
setns (Linux)
vmsplice() (Linux)
pidfd_send_signal() (Linux)
pdkill() (kFreeBSD)
waitid() (with P_PIDFD ID type, Linux)
fdopen() (stdio function:converts file descriptor to FILE*)
dprintf() (stdio function: prints to file descriptor)

Operations on multiple file descriptors

select(), pselect()
poll(), ppoll()
epoll_wait(), epoll_pwait(), epoll_pwait2() (Linux, takes a single epoll filedescriptor to wait on many other file descriptors)
epoll_ctl() (for Linux)
kqueue() (for BSD-based systems).
sendfile()
splice(), tee() (for Linux)
copy_file_range() (for Linux)

Operations on the file descriptor table

The fcntl() function is used to perform various operations on a file descriptor, depending on the command argument passed to it. There are commands to get and set attributes associated with a file descriptor, including F_GETFD, F_SETFD, F_GETFL and F_SETFL.

close()
closefrom() (BSD and Solaris only; deletes all file descriptors greater than or equal to specified number)
close_range() (for Linux)^[6]
dup() (duplicates an existing file descriptor guaranteeing to be the lowest number available file descriptor)
dup2(), dup3() (Close fd1 if necessary, and make file descriptor fd1 point to the open file of fd2)
fcntl (F_DUPFD)

Operations that modify process state

fchdir() (sets the process's current working directory based on a directory file descriptor)
mmap() (maps ranges of a file into the process's address space)

File locking

flock()
fcntl() (F_GETLK, F_SETLK and F_SETLKW)
lockf()

Sockets

connect()
bind()
listen()
accept() (creates a new file descriptor for an incoming connection)
getsockname()
getpeername()
getsockopt()
setsockopt()
shutdown() (shuts down one or both halves of a full duplex connection)

Miscellaneous

ioctl() (a large collection of miscellaneous operations on a single file descriptor, often associated with a device)

`at` suffix operations

A series of new operations has been added to many modern Unix-like systems, as well as numerous C libraries, to be standardized in a future version of POSIX.^[7] The at suffix signifies that the function takes an additional first argument supplying a file descriptor from which relative paths are resolved, the forms lacking the at suffix thus becoming equivalent to passing a file descriptor corresponding to the current working directory. The purpose of these new operations is to defend against a certain class of TOCTOU attacks.

openat()
faccessat()
fchmodat()
fchownat()
fstatat()
futimesat()
linkat()
mkdirat()
mknodat()
readlinkat()
renameat()
symlinkat()
unlinkat()
mkfifoat()
fdopendir()

File descriptors as capabilities

Unix file descriptors behave in many ways as capabilities. They can be passed between processes across Unix domain sockets using the sendmsg() system call. Note, however, that what is actually passed is a reference to an "open file description" that has mutable state (the file offset, and the file status and access flags). This complicates the secure use of file descriptors as capabilities, since when programs share access to the same open file description, they can interfere with each other's use of it by changing its offset or whether it is blocking or non-blocking, for example.^[8]^[9] In operating systems that are specifically designed as capability systems, there is very rarely any mutable state associated with a capability itself.

A Unix process' file descriptor table is an example of a C-list.

Related Research Articles

The Portable Operating System Interface is a family of standards specified by the IEEE Computer Society for maintaining compatibility between operating systems. POSIX defines system and user-level application programming interfaces (APIs), along with command line shells and utility interfaces, for software compatibility (portability) with variants of Unix and other operating systems. POSIX is also a trademark of the IEEE. POSIX is intended to be used by both application and system developers.

A Berkeley (BSD) socket is an application programming interface (API) for Internet domain sockets and Unix domain sockets, used for inter-process communication (IPC). It is commonly implemented as a library of linkable modules. It originated with the 4.2BSD Unix operating system, which was released in 1983.

In computing, particularly in the context of the Unix operating system and its workalikes, fork is an operation whereby a process creates a copy of itself. It is an interface which is required for compliance with the POSIX and Single UNIX Specification standards. It is usually implemented as a C standard library wrapper to the fork, clone, or other system calls of the kernel. Fork is the primary method of process creation on Unix-like operating systems.

Capability-based security is a concept in the design of secure computing systems, one of the existing security models. A capability is a communicable, unforgeable token of authority. It refers to a value that references an object along with an associated set of access rights. A user program on a capability-based operating system must use a capability to access an object. Capability-based security refers to the principle of designing user programs such that they directly share capabilities with each other according to the principle of least privilege, and to the operating system infrastructure necessary to make such transactions efficient and secure. Capability-based security is to be contrasted with an approach that uses traditional UNIX permissions and Access Control Lists.

The inode is a data structure in a Unix-style file system that describes a file-system object such as a file or a directory. Each inode stores the attributes and disk block locations of the object's data. File-system object attributes may include metadata, as well as owner and permission data.

In computing, mmap(2) is a POSIX-compliant Unix system call that maps files or devices into memory. It is a method of memory-mapped file I/O. It implements demand paging because file contents are not immediately read from disk and initially use no physical RAM at all. The actual reads from disk are performed after a specific location is accessed, in a lazy manner. After the mapping is no longer needed, the pointers must be unmapped with munmap(2). Protection information—for example, marking mapped regions as executable—can be managed using mprotect(2), and special treatment can be enforced using madvise(2).

stat is a Unix system call that returns file attributes about an inode. The semantics of stat vary between operating systems. As an example, Unix command ls uses this system call to retrieve information on files that includes:

The seven standard Unix file types are regular, directory, symbolic link, FIFO special, block special, character special, and socket as defined by POSIX. Different OS-specific implementations allow more types than what POSIX requires. A file's type can be identified by the ls -l command, which displays the type in the first character of the file-system permissions field.

File locking is a mechanism that restricts access to a computer file, or to a region of a file, by allowing only one user or process to modify or delete it at a specific time and to prevent reading of the file while it's being modified or deleted.

The proc filesystem (procfs) is a special filesystem in Unix-like operating systems that presents information about processes and other system information in a hierarchical file-like structure, providing a more convenient and standardized method for dynamically accessing process data held in the kernel than traditional tracing methods or direct access to kernel memory. Typically, it is mapped to a mount point named /proc at boot time. The proc file system acts as an interface to internal data structures about running processes in the kernel. In Linux, it can also be used to obtain information about the kernel and to change certain kernel parameters at runtime (sysctl).

In computer networking, STREAMS is the native framework in Unix System V for implementing character device drivers, network protocols, and inter-process communication. In this framework, a stream is a chain of coroutines that pass messages between a program and a device driver. STREAMS originated in Version 8 Research Unix, as Streams.

In computer science, the event loop is a programming construct or design pattern that waits for and dispatches events or messages in a program. The event loop works by making a request to some internal or external "event provider", then calls the relevant event handler.

select is a system call and application programming interface (API) in Unix-like and POSIX-compliant operating systems for examining the status of file descriptors of open input/output channels. The select system call is similar to the poll facility introduced in UNIX System V and later operating systems. However, with the c10k problem, both select and poll have been superseded by the likes of kqueue, epoll, /dev/poll and I/O completion ports.

ptrace is a system call found in Unix and several Unix-like operating systems. By using ptrace one process can control another, enabling the controller to inspect and manipulate the internal state of its target. ptrace is used by debuggers and other code-analysis tools, mostly as aids to software development.

Kqueue is a scalable event notification interface introduced in FreeBSD 4.1 in July 2000, also supported in NetBSD, OpenBSD, DragonFly BSD, and macOS. Kqueue was originally authored in 2000 by Jonathan Lemon, then involved with the FreeBSD Core Team. Kqueue makes it possible for software like nginx to solve the c10k problem. The term "kqueue" refers to its function as a "kernel event queue"

epoll is a Linux kernel system call for a scalable I/O event notification mechanism, first introduced in version 2.5.45 of the Linux kernel. Its function is to monitor multiple file descriptors to see whether I/O is possible on any of them. It is meant to replace the older POSIX select(2) and poll(2) system calls, to achieve better performance in more demanding applications, where the number of watched file descriptors is large (unlike the older system calls, which operate in O(n) time, epoll operates in O(1) time).

The write is one of the most basic routines provided by a Unix-like operating system kernel. It writes data from a buffer declared by the user to a given device, such as a file. This is the primary way to output data from a program by directly using a system call. The destination is identified by a numeric code. The data to be written, for instance a piece of text, is defined by a pointer and a size, given in number of bytes.

In Unix-like operating systems, dup and dup2 system calls create a copy of a given file descriptor. This new descriptor actually does not behave like a copy, but like an alias of the old one.

Namespaces are a feature of the Linux kernel that partition kernel resources such that one set of processes sees one set of resources, while another set of processes sees a different set of resources. The feature works by having the same namespace for a set of resources and processes, but those namespaces refer to distinct resources. Resources may exist in multiple namespaces. Examples of such resources are process IDs, host-names, user IDs, file names, some names associated with network access, and Inter-process communication.

poll is a POSIX system call to wait for one or more file descriptors to become ready for use.

References

↑ The Open Group. "The Open Group Base Specifications Issue 7, IEEE Std 1003.1-2008, 2016 Edition" . Retrieved 2017-09-21.
↑ The Open Group. "The Open Group Base Specifications Issue 7, IEEE Std 1003.1-2008, 2016 Edition". <stdio.h>. Retrieved 2017-09-21.
1 2 Bach, Maurice J. (1986). The Design of the UNIX Operating System (8 ed.). Prentice-Hall. pp. 92–96. ISBN 9780132017992.
↑ "Devices - What does the output of 'll /Proc/Self/Fd/' (From 'll /Dev/Fd') mean?".
↑ The Open Group. "The Open Group Base Specifications Issue 7, IEEE Std 1003.1-2008, 2018 Edition – creat" . Retrieved 2019-04-11.
↑ Stephen Kitt, Michael Kerrisk. "close_range(2) — Linux manual page" . Retrieved 2021-03-22.
↑ Extended API Set, Part 2 . The Open Group. October 2006. ISBN 1931624674.
↑ Brinkmann, Marcus (2009-02-04). "Building a bridge: library API's and file descriptors?". cap-talk. Archived from the original on 2012-07-30. Retrieved 2017-09-21.
↑ de Boyne Pollard, Jonathan (2007). "Don't set shared file descriptors to non-blocking I/O mode" . Retrieved 2017-09-21.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] The Open Group. "The Open Group Base Specifications Issue 7, IEEE Std 1003.1-2008, 2016 Edition" . Retrieved 2017-09-21.

[2] The Open Group. "The Open Group Base Specifications Issue 7, IEEE Std 1003.1-2008, 2016 Edition". <stdio.h>. Retrieved 2017-09-21.

[bach-3] 1 2 Bach, Maurice J. (1986). The Design of the UNIX Operating System (8 ed.). Prentice-Hall. pp. 92–96. ISBN 9780132017992.

[4] "Devices - What does the output of 'll /Proc/Self/Fd/' (From 'll /Dev/Fd') mean?".

[5] The Open Group. "The Open Group Base Specifications Issue 7, IEEE Std 1003.1-2008, 2018 Edition – creat" . Retrieved 2019-04-11.

[6] Stephen Kitt, Michael Kerrisk. "close_range(2) — Linux manual page" . Retrieved 2021-03-22.

[7] Extended API Set, Part 2 . The Open Group. October 2006. ISBN 1931624674.

[8] Brinkmann, Marcus (2009-02-04). "Building a bridge: library API's and file descriptors?". cap-talk. Archived from the original on 2012-07-30. Retrieved 2017-09-21.

[9] Boyne Pollard, Jonathan (2007). "Don't set shared file descriptors to non-blocking I/O mode" . Retrieved 2017-09-21.

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

v t e Object-capability security
Concepts	Principle of least privilege (PoLP) Confused deputy problem Ambient authority File descriptor C-list Object-capability model Capability-based security Capability-based addressing Zooko's triangle Petnames
Operating systems, kernels	Capsicum Fuchsia Genode GNOSIS → KeyKOS → EROS → CapROS Hydra iMAX 432 Midori NLTSS seL4 HarmonyOS (HarmonyOS NEXT) Phantom OS
Programming languages	Cajita E Joe-E Joule
File systems	Tahoe-LAFS
Specialised hardware	BiiN Cambridge CAP Flex IBM System/38 Intel iAPX 432 Plessey System 250

v t e Computer files
Types	Binary file / text file Data file File format List of file formats File signatures Magic number Open file formats Proprietary file formats Metafile Sidecar file Sparse file Swap file System file Temporary file Zero-byte file
Properties	Filename 8.3 filename Long filename Filename mangling Filename extension List of filename extensions File attribute Extended file attributes File size Hidden file / Hidden directory
Organisation	Directory/folder NTFS links Temporary folder Directory structure File system Filesystem Hierarchy Standard Grid file system Semantic file system Path
Operations	Open Close Read Write
Linking	File descriptor Hard link Shortcut Alias Shadow Symbolic link
Management	Backup File comparison File copying Data compression File manager Comparison of file managers File system fragmentation File-system permissions File transfer File sharing File synchronization File verification