Crt0

Last updated

crt0 (also known as c0) is a set of execution startup routines linked into a C program that performs any initialization work required before calling the program's main function. After the main function completes the control returns to crt0, which calls the library function exit(0) to terminate the process.

Contents

Form and usage

Crt0 generally takes the form of an object file called crt0.o, often written in assembly language, which is automatically included by the linker into every executable file it builds. [1]

crt0 contains the most basic parts of the runtime library. As such, the exact work it performs depends on the program's compiler, operating system and C standard library implementation. [1] Beside the initialization work required by the environment and toolchain, crt0 can perform additional operations defined by the programmer, such as executing C++ global constructors and C functions carrying GCC's ((constructor)) attribute. [2] [3]

"crt" stands for "C runtime", and the zero stands for "the very beginning". However, when programs are compiled using GCC, it is also used for languages other than C. Alternative versions of crt0 are available for special usage scenarios; for example, to enable profiling with gprof, programs must be compiled with gcrt0 instead. [4]

Example crt0.s

This example is for Linux x86-64 with AT&T syntax, without an actual C runtime.

.text.globl_start_start:# _start is the entry point known to the linkerxor%ebp,%ebp# effectively RBP := 0, mark the end of stack framesmov(%rsp),%edi# get argc from the stack (implicitly zero-extended to 64-bit)lea8(%rsp),%rsi# take the address of argv from the stacklea16(%rsp,%rdi,8),%rdx# take the address of envp from the stackxor%eax,%eax# per ABI and compatibility with icccallmain# %edi, %rsi, %rdx are the three args (of which first two are C standard) to mainmov%eax,%edi# transfer the return of main to the first argument of _exitxor%eax,%eax# per ABI and compatibility with icccall_exit# terminate the program

See also

Related Research Articles

x86 Family of instruction set architectures

x86 is a family of complex instruction set computer (CISC) instruction set architectures initially developed by Intel based on the 8086 microprocessor and its 8-bit-external-bus variant, the 8088. The 8086 was introduced in 1978 as a fully 16-bit extension of 8-bit Intel's 8080 microprocessor, with memory segmentation as a solution for addressing more memory than can be covered by a plain 16-bit address. The term "x86" came into being because the names of several successors to Intel's 8086 processor end in "86", including the 80186, 80286, 80386 and 80486. Colloquially, their names were "186", "286", "386" and "486".

Common Intermediate Language (CIL), formerly called Microsoft Intermediate Language (MSIL) or Intermediate Language (IL), is the intermediate language binary instruction set defined within the Common Language Infrastructure (CLI) specification. CIL instructions are executed by a CIL-compatible runtime environment such as the Common Language Runtime. Languages which target the CLI compile to CIL. CIL is object-oriented, stack-based bytecode. Runtimes typically just-in-time compile CIL instructions into native code.

<span class="mw-page-title-main">Application binary interface</span> Binary interface between two program units

In computer software, an application binary interface (ABI) is an interface between two binary program modules. Often, one of these modules is a library or operating system facility, and the other is a program that is being run by a user.

Transmeta Corporation was an American fabless semiconductor company based in Santa Clara, California. It developed low power x86 compatible microprocessors based on a VLIW core and a software layer called Code Morphing Software.

A low-level programming language is a programming language that provides little or no abstraction from a computer's instruction set architecture; commands or functions in the language are structurally similar to a processor's instructions. Generally, this refers to either machine code or assembly language. Because of the low abstraction between the language and machine language, low-level languages are sometimes described as being "close to the hardware". Programs written in low-level languages tend to be relatively non-portable, due to being optimized for a certain type of system architecture.

x86 assembly language is the name for the family of assembly languages which provide some level of backward compatibility with CPUs back to the Intel 8008 microprocessor, which was launched in April 1972. It is used to produce object code for the x86 class of processors.

<span class="mw-page-title-main">D (programming language)</span> Multi-paradigm system programming language

D, also known as dlang, is a multi-paradigm system programming language created by Walter Bright at Digital Mars and released in 2001. Andrei Alexandrescu joined the design and development effort in 2007. Though it originated as a re-engineering of C++, D is now a very different language drawing inspiration from other high-level programming languages, notably Java, Python, Ruby, C#, and Eiffel.

In software engineering, a spinlock is a lock that causes a thread trying to acquire it to simply wait in a loop ("spin") while repeatedly checking whether the lock is available. Since the thread remains active but is not performing a useful task, the use of such a lock is a kind of busy waiting. Once acquired, spinlocks will usually be held until they are explicitly released, although in some implementations they may be automatically released if the thread being waited on blocks or "goes to sleep".

In computing, a bus error is a fault raised by hardware, notifying an operating system (OS) that a process is trying to access memory that the CPU cannot physically address: an invalid address for the address bus, hence the name. In modern use on most architectures these are much rarer than segmentation faults, which occur primarily due to memory access violations: problems in the logical address or permissions.

In computer programming, a runtime library is a set of low-level routines used by a compiler to invoke some of the behaviors of a runtime environment, by inserting calls to the runtime library into compiled executable binary. The runtime environment implements the execution model, built-in functions, and other fundamental behaviors of a programming language. During execution of that computer program, execution of those calls to the runtime library cause communication between the executable binary and the runtime environment. A runtime library often includes built-in functions for memory management or exception handling. Therefore, a runtime library is always specific to the platform and compiler.

In computer programming, an inline assembler is a feature of some compilers that allows low-level code written in assembly language to be embedded within a program, among code that otherwise has been compiled from a higher-level language such as C or Ada.

In computer science, a calling convention is an implementation-level (low-level) scheme for how subroutines or functions receive parameters from their caller and how they return a result. When some code calls a function, design choices have been taken for where and how parameters are passed to that function, and where and how results are returned from that function, with these transfers typically done via certain registers or within a stack frame on the call stack. There are design choices for how the tasks of preparing for a function call and restoring the environment after the function has completed are divided between the caller and the callee. Some calling convention specifies the way every function should get called. The correct calling convention should be used for every function call, to allow the correct and reliable execution of the whole program using these functions.

In assembly language programming, the function prologue is a few lines of code at the beginning of a function, which prepare the stack and registers for use within the function. Similarly, the function epilogue appears at the end of the function, and restores the stack and registers to the state they were in before the function was called.

In the x86 architecture, the CPUID instruction is a processor supplementary instruction allowing software to discover details of the processor. It was introduced by Intel in 1993 with the launch of the Pentium and SL-enhanced 486 processors.

On many computer operating systems, a computer process terminates its execution by making an exit system call. More generally, an exit in a multithreading environment means that a thread of execution has stopped running. For resource management, the operating system reclaims resources that were used by the process. The process is said to be a dead process after it terminates.

This article describes the calling conventions used when programming x86 architecture microprocessors.

<span class="mw-page-title-main">Cosmos (operating system)</span> Toolkit for building GUI and command-line based operating systems

C# Open Source Managed Operating System (Cosmos) is a toolkit for building GUI and command-line based operating systems, written mostly in the programming language C# and small amounts of a high-level assembly language named X#. Cosmos is a backronym, in that the acronym was chosen before the meaning. It is open-source software released under a BSD license.

A decompiler is a computer program that translates an executable file to high-level source code. It does therefore the opposite of a typical compiler, which translates a high-level language to a low-level language. While disassemblers translate an executable into assembly language, decompilers go a step further and translate the code into a higher level language such as C or Java, requiring more sophisticated techniques. Decompilers are usually unable to perfectly reconstruct the original source code, thus will frequently produce obfuscated code. Nonetheless, they remain an important tool in the reverse engineering of computer software.

Blocks are a non-standard extension added by Apple Inc. to Clang's implementations of the C, C++, and Objective-C programming languages that uses a lambda expression-like syntax to create closures within these languages. Blocks are supported for programs developed for Mac OS X 10.6+ and iOS 4.0+, although third-party runtimes allow use on Mac OS X 10.5 and iOS 2.2+ and non-Apple systems.

JIT spraying is a class of computer security exploit that circumvents the protection of address space layout randomization and data execution prevention by exploiting the behavior of just-in-time compilation. It has been used to exploit the PDF format and Adobe Flash.

References

  1. 1 2 "The C Runtime Initialization, crt0.o". embecosm.com. 2010. Retrieved 2013-12-30.
  2. "Program initialization: Creating a C library". osdev.org. 2014-02-25. Retrieved 2014-04-21.
  3. "Calling Global Constructors". osdev.org. 2014-04-08. Retrieved 2014-04-21.
  4. "Compiling a Program for Profiling: GNU gprof". sourceware.org. Retrieved 2013-12-30.