Fork–exec

Last updated • 3 min readFrom Wikipedia, The Free Encyclopedia

Fork–exec is a commonly used technique in Unix whereby an executing process spawns a new program.

Description

fork() is the name of the system call that the parent process uses to "divide" itself ("fork") into two identical processes. After calling fork(), the created child process is an exact copy of the parent except for the return value of the fork() call. This includes open files, register state, and all memory allocations, which includes the program's executable code. In some cases the two continue to run the same binary, but often one (usually the child) switches to running another binary executable using the exec() system call.

When a process forks, a complete copy of the executing program is made into the new process. This new process is a child of the parent process, and has a new process identifier (PID). The fork() function returns the child's PID to the parent process. The fork() function returns 0 to the child process. This enables the two otherwise identical processes to distinguish themselves from each other.

The parent process can either continue execution or wait for the child process to complete. The child, after discovering that it is the child, will most often then replace itself completely with another program, so that the code and address space of the original program are lost. This replacement is, however, a choice of the architecture one builds the given program on, and is therefore not an obligatory step in the child process' life.

If the parent chooses to wait for the child to die, then the parent will receive the exit code of the program that the child executed. To prevent the child becoming a zombie the parent should call wait on its children, either periodically or upon receiving the SIGCHLD signal, which indicates a child process has terminated.

One can also asynchronously wait on their children to finish, by using a signal handler for SIGCHLD, if they need to ensure everything is cleaned up. Here's an example of a signal handler that catches any incoming SIGCHLD signals and handles multiple concurrent signals received.

voidcleanup(intsignal){while(waitpid((pid_t)(-1),0,WNOHANG)>0){}}

When the child process calls exec(), all data in the original program is lost, and it is replaced with a running copy of the new program. This is known as overlaying. Although all data are replaced, the file descriptors that were open in the parent are closed only if the program has explicitly marked them close-on-exec. This allows for the common practice of the parent creating a pipe prior to calling fork() and using it to communicate with the executed program.

Microsoft Windows does not support the fork–exec model, as it does not have a system call analogous to fork(). The spawn() family of functions declared in process.h can replace it in cases where the call to fork() is followed directly by exec().

When a fork syscall is made on WSL, lxss.sys does some of the initial work to prepare for copying the process. It then calls internal NT APIs to create the process with the correct semantics and create a thread in the process with an identical register context. Finally, it does some additional work to complete copying the process and resumes the new process so it can begin executing.

Jack Hammons of Microsoft [1]

Related Research Articles

On Unix and Unix-like computer operating systems, a zombie process or defunct process is a process that has completed execution but still has an entry in the process table: it is a process in the "terminated state". This occurs for the child processes, where the entry is still needed to allow the parent process to read its child's exit status: once the exit status is read via the wait system call, the zombie's entry is removed from the process table and it is said to be "reaped". A child process initially becomes a zombie, only then being removed from the resource table. Under normal system operation, zombies are immediately waited on by their parent and then reaped by the system. Processes that stay zombies for a long time are usually an error and can cause a resource leak. Generally, the only kernel resource they occupy is the process table entry, their process ID. However, zombies can also hold buffers open, consuming memory. Zombies can hold handles to file descriptors, which prevents the space for those files from being available to the filesystem. This effect can be seen by a difference between du and df. While du may show a large amount of free disk space, df will show a full partition. If the zombies are not cleaned, this can fill the root partition and crash the system.

DNIX is a discontinued Unix-like real-time operating system from the Swedish company Dataindustrier AB (DIAB). A version named ABCenix was developed for the ABC 1600 computer from Luxor. Daisy Systems also had a system named Daisy DNIX on some of their computer-aided design (CAD) workstations. It was unrelated to DIAB's product.

<span class="mw-page-title-main">System call</span> Way for programs to access kernel services

In computing, a system call is the programmatic way in which a computer program requests a service from the operating system on which it is executed. This may include hardware-related services, creation and execution of new processes, and communication with integral kernel services such as process scheduling. System calls provide an essential interface between a process and the operating system.

<span class="mw-page-title-main">Fork bomb</span> Type of denial-of-service software attack

In computing, a fork bomb is a denial-of-service (DoS) attack wherein a process continually replicates itself to deplete available system resources, slowing down or crashing the system due to resource starvation.

In computing, particularly in the context of the Unix operating system and its workalikes, fork is an operation whereby a process creates a copy of itself. It is an interface which is required for compliance with the POSIX and Single UNIX Specification standards. It is usually implemented as a C standard library wrapper to the fork, clone, or other system calls of the kernel. Fork is the primary method of process creation on Unix-like operating systems.

A child process in computing is a process created by another process. This technique pertains to multitasking operating systems, and is sometimes called a subprocess or traditionally a subtask.

In computing, a parent process is a process that has created one or more child processes.

<span class="mw-page-title-main">Daemon (computing)</span> Computer program that runs as a background process

In multitasking computer operating systems, a daemon is a computer program that runs as a background process, rather than being under the direct control of an interactive user. Traditionally, the process names of a daemon end with the letter d, for clarification that the process is in fact a daemon, and for differentiation between a daemon and a normal computer program. For example, syslogd is a daemon that implements system logging facility, and sshd is a daemon that serves incoming SSH connections.

Signals are standardized messages sent to a running program to trigger specific behavior, such as quitting or error handling. They are a limited form of inter-process communication (IPC), typically used in Unix, Unix-like, and other POSIX-compliant operating systems.

In computing, the exit status of a terminated process is an integer number that is made available to its parent process. In DOS, this may be referred to as an errorlevel.

File locking is a mechanism that restricts access to a computer file, or to a region of a file, by allowing only one user or process to modify or delete it at a specific time, and preventing reading of the file while it's being modified or deleted.

In computing, sigaction is a function API defined by POSIX to give the programmer access to what a program's behavior should be when receiving specific OS signals.

In computer science, the event loop is a programming construct or design pattern that waits for and dispatches events or messages in a program. The event loop works by making a request to some internal or external "event provider", then calls the relevant event handler.

In computing, exec is a functionality of an operating system that runs an executable file in the context of an already existing process, replacing the previous executable. This act is also referred to as an overlay. It is especially important in Unix-like systems, although it also exists elsewhere. As no new process is created, the process identifier (PID) does not change, but the machine code, data, heap, and stack of the process are replaced by those of the new program.

On many computer operating systems, a computer process terminates its execution by making an exit system call. More generally, an exit in a multithreading environment means that a thread of execution has stopped running. For resource management, the operating system reclaims resources that were used by the process. The process is said to be a dead process after it terminates.

Spawn in computing refers to a function that loads and executes a new child process. The current process may wait for the child to terminate or may continue to execute concurrent computing. Creating a new subprocess requires enough memory in which both the child process and the current program can execute.

In computer operating systems, a process may wait for another process to complete its execution. In most systems, a parent process can create an independently executing child process. The parent process may then issue a wait system call, which suspends the execution of the parent process while the child executes. When the child process terminates, it returns an exit status to the operating system, which is then returned to the waiting parent process. The parent process then resumes execution.

<span class="mw-page-title-main">Process management (computing)</span> Computer system for maintaining order among running programs

A process is a program in execution, and an integral part of any modern-day operating system (OS). The OS must allocate resources to processes, enable processes to share and exchange information, protect the resources of each process from other processes and enable synchronization among processes. To meet these requirements, The OS must maintain a data structure for each process, which describes the state and resource ownership of that process, and which enables the operating system to exert control over each process.

ptrace is a system call found in Unix and several Unix-like operating systems. By using ptrace one process can control another, enabling the controller to inspect and manipulate the internal state of its target. ptrace is used by debuggers and other code-analysis tools, mostly as aids to software development.

In Unix-like operating systems, dup and dup2 system calls create a copy of a given file descriptor. This new descriptor actually does not behave like a copy, but like an alias of the old one.

References

  1. "WSL System Calls". 8 June 2016.