Exec (system call)

Last updated

In computing, exec is a functionality of an operating system that runs an executable file in the context of an already existing process, replacing the previous executable. This act is also referred to as an overlay. It is especially important in Unix-like systems, although it also exists elsewhere. As no new process is created, the process identifier (PID) does not change, but the machine code, data, heap, and stack of the process are replaced by those of the new program.

Contents

The exec call is available for many programming languages including compilable languages and some scripting languages. In OS command interpreters, the exec built-in command replaces the shell process with the specified program. [1]

Nomenclature

Interfaces to exec and its implementations vary. Depending on programming language it may be accessible via one or more functions, and depending on operating system it may be represented with one or more actual system calls. For this reason exec is sometimes described as a collection of functions.

Standard names of such functions in C are execl, execle, execlp, execv, execve, and execvp (see below), but not "exec" itself. The Linux kernel has one corresponding system call named "execve", whereas all aforementioned functions are user-space wrappers around it.

Higher-level languages usually provide one call named exec.

Unix, POSIX, and other multitasking systems

C language prototypes

The POSIX standard declares exec functions in the unistd.h header file, in the C language. The same functions are declared in process.h for DOS (see below), OS/2, and Microsoft Windows.

intexecl(charconst*path,charconst*arg0,...);intexecle(charconst*path,charconst*arg0,...,charconst*envp[]);intexeclp(charconst*file,charconst*arg0,...);intexecv(charconst*path,charconst*argv[]);intexecve(charconst*path,charconst*argv[],charconst*envp[]);intexecvp(charconst*file,charconst*argv[]);intfexecve(intfd,char*constargv[],char*constenvp[]);

Some implementations provide these functions named with a leading underscore (e.g. _execl).

The base of each is exec (execute), followed by one or more letters:

e – An array of pointers to environment variables is explicitly passed to the new process image.
lCommand-line arguments are passed individually (a list) to the function.
p – Uses the PATH environment variable to find the file named in the file argument to be executed.
v – Command-line arguments are passed to the function as an array (vector) of pointers.
path

The argument specifies the path name of the file to execute as the new process image. Arguments beginning at arg0 are pointers to arguments to be passed to the new process image. The argv value is an array of pointers to arguments.

arg0

The first argument arg0 should be the name of the executable file. Usually it is the same value as the path argument. Some programs may incorrectly rely on this argument providing the location of the executable, but there is no guarantee of this nor is it standardized across platforms.

envp

Argument envp is an array of pointers to environment settings. The exec calls named ending with an e alter the environment for the new process image by passing a list of environment settings through the envp argument. This argument is an array of character pointers; each element (except for the final element) points to a null-terminated string defining an environment variable.

Each null-terminated string has the form:

name=value

where name is the environment variable name, and value is the value of that variable. The final element of the envp array must be null.

In the execl, execlp, execv, and execvp calls, the new process image inherits the current environment variables.

Effects

A file descriptor open when an exec call is made remains open in the new process image, unless was fcntl ed with FD_CLOEXEC or opened with O_CLOEXEC (the latter was introduced in POSIX.1-2001). This aspect is used to specify the standard streams (stdin, stdout and stderr) of the new program.

A successful overlay destroys the previous memory address space of the process, and all its memory areas, that were not shared, are reclaimed by the operating system. Consequently, all its data that were not passed to the new program, or otherwise saved, become lost.

Return value

A successful exec replaces the current process image, so it cannot return anything to the program that made the call. Processes do have an exit status, but that value is collected by the parent process.

If an exec function does return to the calling program, an error occurs, the return value is −1, and errno is set to one of the following values:

NameNotes
E2BIGThe argument list exceeds the system limit.
EACCESThe specified file has a locking or sharing violation.
ENOENTThe file or path name not found.
ENOMEMNot enough memory is available to execute the new process image.

DOS operating systems

DOS is not a multitasking operating system, but replacing the previous executable image has a great merit there due to harsh primary memory limitations and lack of virtual memory. The same API is used for overlaying programs in DOS and it has effects similar to ones on POSIX systems.

MS-DOS exec functions always load the new program into memory as if the "maximum allocation" in the program's executable file header is set to default value 0xFFFF. The EXEHDR utility can be used to change the maximum allocation field of a program. However, if this is done and the program is invoked with one of the exec functions, the program might behave differently from a program invoked directly from the operating-system command line or with one of the spawn functions (see below).

Command interpreters

Many Unix shells also offer a builtin exec command that replaces the shell process with the specified program. [1] Wrapper scripts often use this command to run a program (either directly or through an interpreter or virtual machine) after setting environment variables or other configuration. By using exec, the resources used by the shell program do not need to stay in use after the program is started. [2]

The exec command can also perform a redirection. In some shells it is even possible to use the exec command for redirection only, without making an actual overlay.

Alternatives

The traditional Unix system does not have the functionality to create a new process running a new executable program in one step, which explains the importance of exec for Unix programming. Other systems may use spawn as the main tool for running executables. Its result is equivalent to the fork–exec sequence of Unix-like systems. POSIX supports the posix_spawn routines as an optional extension that usually is implemented using vfork.

Other Systems

OS/360 and successors include a system call XCTL (transfer control) that performs a similar function to exec.

See also

Related Research Articles

<span class="mw-page-title-main">AWK</span> Programming language

AWK is a domain-specific language designed for text processing and typically used as a data extraction and reporting tool. Like sed and grep, it is a filter, and is a standard feature of most Unix-like operating systems.

<span class="mw-page-title-main">GNU Debugger</span> Source-level debugger

The GNU Debugger (GDB) is a portable debugger that runs on many Unix-like systems and works for many programming languages, including Ada, Assembly, C, C++, D, Fortran, Haskell, Go, Objective-C, OpenCL C, Modula-2, Pascal, Rust, and partially others.

<span class="mw-page-title-main">Unix shell</span> Command-line interpreter for Unix operating system

A Unix shell is a command-line interpreter or shell that provides a command line user interface for Unix-like operating systems. The shell is both an interactive command language and a scripting language, and is used by the operating system to control the execution of the system using shell scripts.

An environment variable is a user-definable value that can affect the way running processes will behave on a computer. Environment variables are part of the environment in which a process runs. For example, a running process can query the value of the TEMP environment variable to discover a suitable location to store temporary files, or the HOME or USERPROFILE variable to find the directory structure owned by the user running the process.

<span class="mw-page-title-main">Pointer (computer programming)</span> Object which stores memory addresses in a computer program

In computer science, a pointer is an object in many programming languages that stores a memory address. This can be that of another value located in computer memory, or in some cases, that of memory-mapped computer hardware. A pointer references a location in memory, and obtaining the value stored at that location is known as dereferencing the pointer. As an analogy, a page number in a book's index could be considered a pointer to the corresponding page; dereferencing such a pointer would be done by flipping to the page with the given page number and reading the text found on that page. The actual format and content of a pointer variable is dependent on the underlying computer architecture.

xargs is a command on Unix and most Unix-like operating systems used to build and execute commands from standard input. It converts input from standard input into arguments to a command.

stat (system call) Unix system call

stat is a Unix system call that returns file attributes about an inode. The semantics of stat vary between operating systems. As an example, Unix command ls uses this system call to retrieve information on files that includes:

<span class="mw-page-title-main">Comparison of command shells</span>

A command shell is a command-line interface to interact with and manipulate a computer's operating system.

In computer programming, an entry point is the place in a program where the execution of a program begins, and where the program has access to command line arguments.

typedef is a reserved keyword in the programming languages C, C++, and Objective-C. It is used to create an additional name (alias) for another data type, but does not create a new type, except in the obscure case of a qualified typedef of an array type where the typedef qualifiers are transferred to the array element type. As such, it is often used to simplify the syntax of declaring complex data structures consisting of struct and union types, although it is also commonly used to provide specific descriptive type names for integer data types of varying sizes.

For most file systems, a program initializes access to a file in a file system using the open system call. This allocates resources associated to the file, and returns a handle that the process will use to refer to that file. In some cases the open is performed by the first access.

In computing, a shebang is the character sequence consisting of the characters number sign and exclamation mark at the beginning of a script. It is also called sharp-exclamation, sha-bang, hashbang, pound-bang, or hash-pling.

In computer programming, DLL injection is a technique used for running code within the address space of another process by forcing it to load a dynamic-link library. DLL injection is often used by external programs to influence the behavior of another program in a way its authors did not anticipate or intend. For example, the injected code could hook system function calls, or read the contents of password textboxes, which cannot be done the usual way. A program used to inject arbitrary code into arbitrary processes is called a DLL injector.

setcontext is one of a family of C library functions used for context control. The setcontext family allows the implementation in C of advanced control flow patterns such as iterators, fibers, and coroutines. They may be viewed as an advanced version of setjmp/longjmp; whereas the latter allows only a single non-local jump up the stack, setcontext allows the creation of multiple cooperative threads of control, each with its own stack.

<span class="mw-page-title-main">CPU time</span> Time used by a computer

CPU time is the amount of time for which a central processing unit (CPU) was used for processing instructions of a computer program or operating system, as opposed to elapsed time, which includes for example, waiting for input/output (I/O) operations or entering low-power (idle) mode. The CPU time is measured in clock ticks or seconds. Often, it is useful to measure CPU time as a percentage of the CPU's capacity, which is called the CPU usage. CPU time and CPU usage have two main uses.

Spawn in computing refers to a function that loads and executes a new child process. The current process may wait for the child to terminate or may continue to execute concurrent computing. Creating a new subprocess requires enough memory in which both the child process and the current program can execute.

PATH is an environment variable on Unix-like operating systems, DOS, OS/2, and Microsoft Windows, specifying a set of directories where executable programs are located. In general, each executing process or user session has its own PATH setting.

Getopt is a C library function used to parse command-line options of the Unix/POSIX style. It is a part of the POSIX specification, and is universal to Unix-like systems. It is also the name of a Unix program for parsing command line arguments in shell scripts.

Different command-line argument parsing methods are used by different programming languages to parse command-line arguments.

In Unix-like operating systems, dup and dup2 system calls create a copy of a given file descriptor. This new descriptor actually does not behave like a copy, but like an alias of the old one.

References

  1. 1 2 "exec(3) - Linux manual page". man7.org. Retrieved 2016-10-14.
  2. "Shell Wrappers". Linux Documentation Project. 2014-03-10. Retrieved 2021-02-18.