Executable

Last updated
A hex dump of an executable real mode loader Binary executable file2.png
A hex dump of an executable real mode loader

In computing, executable code or an executable file or executable program, sometimes simply referred to as an executable, causes a computer "to perform indicated tasks according to encoded instructions", [1] as opposed to a data file that must be parsed by a program to be meaningful.

Computing Activity that uses computers

Computing is any activity that uses computers to manage, process, and communicate information for various purposes. It includes development of both hardware and software. Computing is a critical, integral component of modern industrial technology. Major computing disciplines include computer engineering, software engineering, computer science, information systems, and information technology.

Data (computing) quantities, characters, or symbols on which operations are performed by a computer

Data is any sequence of one or more symbols given meaning by specific act(s) of interpretation.

Contents

The exact interpretation depends upon the use - while "instructions" is traditionally taken to mean machine code instructions for a physical CPU, in some contexts a file containing bytecode or scripting language instructions may also be considered executable.

Machine code set of instructions executed directly by a computers central processing unit (CPU)

Machine code is a computer program written in machine language instructions that can be executed directly by a computer's central processing unit (CPU). Each instruction causes the CPU to perform a very specific task, such as a load, a store, a jump, or an ALU operation on one or more units of data in CPU registers or memory.

Central processing unit Central component of any computer system which executes input/output, arithmetical, and logical operations

A central processing unit (CPU), also called a central processor or main processor, is the electronic circuitry within a computer that carries out the instructions of a computer program by performing the basic arithmetic, logic, controlling, and input/output (I/O) operations specified by the instructions. The computer industry has used the term "central processing unit" at least since the early 1960s. Traditionally, the term "CPU" refers to a processor, more specifically to its processing unit and control unit (CU), distinguishing these core elements of a computer from external components such as main memory and I/O circuitry.

Bytecode, also termed portable code or p-code, is a form of instruction set designed for efficient execution by a software interpreter. Unlike human-readable source code, bytecodes are compact numeric codes, constants, and references that encode the result of compiler parsing and performing semantic analysis of things like type, scope, and nesting depths of program objects.

Generation of executable files

Executable files can be hand-coded in machine language, although it is far more convenient to develop software as source code in a high-level language that can be easily understood by humans. In some cases, source code might be specified in assembly language instead, which remains human-readable while being closely associated with machine code instructions. The high-level language is compiled into either an executable machine code file or a non-executable machine-code object file of some sort; the equivalent process on assembly language source code is called assembly. Several object files are linked to create the executable. Object files, executable or not, are typically in a container format, such as Executable and Linkable Format (ELF). This structures the generated machine code, for example dividing it into sections such as the .text (executable code), .data (static variables), and .rodata (static constants).

In computing, source code is any collection of code, possibly with comments, written using a human-readable programming language, usually as plain text. The source code of a program is specially designed to facilitate the work of computer programmers, who specify the actions to be performed by a computer mostly by writing source code. The source code is often transformed by an assembler or compiler into binary machine code understood by the computer. The machine code might then be stored for execution at a later time. Alternatively, source code may be interpreted and thus immediately executed.

Assembly language Low level programming language

An assembly language, often abbreviated asm, is any low-level programming language in which there is a very strong correspondence between the program's statements and the architecture's machine code instructions.

A compiler is a computer program that translates computer code written in one programming language into another programming language. The name compiler is primarily used for programs that translate source code from a high-level programming language to a lower level language to create an executable program.

In order to be executed by the system (such as an operating system, firmware, or boot loader), an executable file must conform to the system's application binary interface (ABI). Most simply a file is executed by loading the file into memory and simply jumping to the start of the address space and executing from there, but in more complicated interfaces executable files have additional metadata, specifying a separate entry point. For example, in ELF, the entry point is specified in the header in the e_entry field, which specifies the (virtual) memory address at which to start execution. In the GCC (GNU Compiler Collection) this field is set by the linker based on the _start symbol.

Firmware low-level software traditionally held in ROM

In computing, firmware is a specific class of computer software that provides the low-level control for the device's specific hardware. Firmware can either provide a standardized operating environment for the device's more complex software, or, for less complex devices, act as the device's complete operating system, performing all control, monitoring and data manipulation functions. Typical examples of devices containing firmware are embedded systems, consumer appliances, computers, computer peripherals, and others. Almost all electronic devices beyond the simplest contain some firmware.

Application binary interface

In computer software, an application binary interface (ABI) is an interface between two binary program modules; often, one of these modules is a library or operating system facility, and the other is a program that is being run by a user.

In computer programming, an entry point is where the first instructions of a program are executed, and where the program has access to command line arguments.

Executable files typically also include a runtime system, which implements runtime language features (such as task scheduling, exception handling, calling static constructors and destructors, etc.) and interactions with the operating system, notably passing arguments, environment, and returning an exit status, together with other startup and shutdown features such as releasing resources like file handles. For C, this is done by linking in the crt0 object, which contains the actual entry point and does setup and shutdown by calling the runtime library.

A runtime system, also called run-time system, runtime environment or run-time environment, primarily implements portions of an execution model. This is not to be confused with the runtime lifecycle phase of a program, during which the runtime system is in operation. Most languages have some form of runtime system that provides an environment in which programs run. This environment may address a number of issues including the layout of application memory, how the program accesses variables, mechanisms for passing parameters between procedures, interfacing with the operating system, and otherwise. The compiler makes assumptions depending on the specific runtime system to generate correct code. Typically the runtime system will have some responsibility for setting up and managing the stack and heap, and may include features such as garbage collection, threads or other dynamic features built into the language.

Exception handling is the process of responding to the occurrence, during computation, of exceptions – anomalous or exceptional conditions requiring special processing – often disrupting the normal flow of program execution. It is provided by specialized programming language constructs, computer hardware mechanisms like interrupts or operating system IPC facilities like signals.

The exit status of a process in computer programming is a small number passed from a child process to a parent process when it has finished executing a specific procedure or delegated task. In DOS, this may be referred to as an errorlevel.

Executable files thus normally contain significant additional machine code beyond that directly generated from the specific source code. In some cases it is desirable to omit this, for example for embedded systems development or simply to understand how compilation, linking, and loading work. In C this can be done by omitting the usual runtime, and instead explicitly specifying a linker script, which generates the entry point and handles startup and shutdown, such as calling main to start and returning exit status to kernel at end. [2]

See also

Related Research Articles

Executable and Linkable Format Standard file format for executables, object code, shared libraries, and core dumps

In computing, the Executable and Linkable Format, is a common standard file format for executable files, object code, shared libraries, and core dumps. First published in the specification for the application binary interface (ABI) of the Unix operating system version named System V Release 4 (SVR4), and later in the Tool Interface Standard, it was quickly accepted among different vendors of Unix systems. In 1999, it was chosen as the standard binary file format for Unix and Unix-like systems on x86 processors by the 86open project.

Java virtual machine runtime environment that can execute Java bytecode as a result of compiling computer programs written in the Java programming language

A Java virtual machine (JVM) is a virtual machine that enables a computer to run Java programs as well as programs written in other languages that are also compiled to Java bytecode. The JVM is detailed by a specification that formally describes what is required in a JVM implementation. Having a specification ensures interoperability of Java programs across different implementations so that program authors using the Java Development Kit (JDK) need not worry about idiosyncrasies of the underlying hardware platform.

Linker (computing) computer program producing executable programs

In computing, a linker or link editor is a computer utility program that takes one or more object files generated by a compiler or an assembler and combines them into a single executable file, library file, or another 'object' file.

Common Intermediate Language (CIL), formerly called Microsoft Intermediate Language (MSIL), is the intermediate language binary instruction set defined within the Common Language Infrastructure (CLI) specification. CIL instructions are executed by a CLI-compatible runtime environment such as the Common Language Runtime. Languages which target the CLI compile to CIL. CIL is object-oriented, stack-based bytecode. Runtimes typically just-in-time compile CIL instructions into native code.

In computer science, an interpreter is a computer program that directly executes instructions written in a programming or scripting language, without requiring them previously to have been compiled into a machine language program. An interpreter generally uses one of the following strategies for program execution:

  1. Parse the source code and perform its behavior directly;
  2. Translate source code into some efficient intermediate representation and immediately execute this;
  3. Explicitly execute stored precompiled code made by a compiler which is part of the interpreter system.

The Portable Executable (PE) format is a file format for executables, object code, DLLs, FON Font files, and others used in 32-bit and 64-bit versions of Windows operating systems. The PE format is a data structure that encapsulates the information necessary for the Windows OS loader to manage the wrapped executable code. This includes dynamic library references for linking, API export and import tables, resource management data and thread-local storage (TLS) data. On NT operating systems, the PE format is used for EXE, DLL, SYS, and other file types. The Extensible Firmware Interface (EFI) specification states that PE is the standard executable format in EFI environments.

Library (computing) collection of non-volatile resources used by computer programs, often to develop software; software component that can be reused by other software for a specific purpose

In computer science, a library is a collection of non-volatile resources used by computer programs, often for software development. These may include configuration data, documentation, help data, message templates, pre-written code and subroutines, classes, values or type specifications. In IBM's OS/360 and its successors they are referred to as partitioned data sets.

In computer science, self-modifying code is code that alters its own instructions while it is executing – usually to reduce the instruction path length and improve performance or simply to reduce otherwise repetitively similar code, thus simplifying maintenance. Self-modification is an alternative to the method of "flag setting" and conditional program branching, used primarily to reduce the number of times a condition needs to be tested. The term is usually only applied to code where the self-modification is intentional, not in situations where code accidentally modifies itself due to an error such as a buffer overflow.

In computing, just-in-time (JIT) compilation is a way of executing computer code that involves compilation during execution of a program – at run time – rather than prior to execution. Most often, this consists of source code or more commonly bytecode translation to machine code, which is then executed directly. A system implementing a JIT compiler typically continuously analyses the code being executed and identifies parts of the code where the speedup gained from compilation or recompilation would outweigh the overhead of compiling that code.

An object file is a file containing object code, meaning relocatable format machine code that is usually not directly executable. There are various formats for object files, and the same object code can be packaged in different object files. An object file may also work like a shared library.

Execution in computer and software engineering is the process by which a computer or virtual machine executes the instructions of a computer program. Each instruction of a program is a description of a specific action to be carried out in order for a specific problem to be solved; as instructions of a program and therefore the actions they describe are being carried out by an executing machine, specific effects are produced in accordance to the semantics of the instructions being executed.

A COM file is a type of simple executable file. On the Digital Equipment operating systems of the 1970s, .COM was used as a filename extension for text files containing commands to be issued to the operating system. With the introduction of CP/M, the type of files commonly associated with COM extension changed to that of executable files. This convention was later carried over to DOS. Even when complemented by the more general EXE file format for executables, the compact COM files remained viable and frequently used under DOS.

Relocation is the process of assigning load addresses for position-dependent code and data of a program and adjusting the code and data to reflect the assigned addresses. Prior to the advent of multiprocess systems, and still in many embedded systems, the addresses for objects were absolute starting at a known location, often zero. Since multiprocessing systems dynamically link and switch between programs it became necessary to be able to relocate objects using position-independent code. A linker usually performs relocation in conjunction with symbol resolution, the process of searching files and libraries to replace symbolic references or names of libraries with actual usable addresses in memory before running a program.

In computing, a dynamic linker is the part of an operating system that loads and links the shared libraries needed by an executable when it is executed, by copying the content of libraries from persistent storage to RAM, filling jump tables and relocating pointers. The specific operating system and executable format determine how the dynamic linker functions and how it is implemented.

This is a comparison of binary executable file formats which, once loaded by a suitable executable loader, can be directly executed by the CPU rather than become interpreted by software. In addition to the binary application code, the executables may contain headers and tables with relocation and fixup information as well as various kinds of meta data. Among those formats listed, the ones in most common use are PE, ELF, Mach-O and MZ.

Charm is a computer programming language devised in the early 1990s with similarities to the RTL/2, Pascal and C languages in addition to containing some unique features of its own. The Charm language is defined by a context-free grammar amenable to being processed by recursive descent parser as described in seminal books on compiler design.

In computer science, dynamic software updating (DSU) is a field of research pertaining to upgrading programs while they are running. DSU is not currently widely used in industry. However, researchers have developed a wide variety of systems and techniques for implementing DSU. These systems are commonly tested on real-world programs.

References

  1. "executable". Merriam-Webster's Online Dictionary. Merriam-Webster . Retrieved 2008-07-19.
  2. McKellar, Jessica (Mar 16, 2010). "Hello from a libc-free world! (Part 1)".