Ahead-of-time compilation

In computer science, ahead-of-time compilation (AOT compilation) is the act of compiling an (often) higher-level programming language into an (often) lower-level language before execution of a program, usually at build-time, to reduce the amount of work needed to be performed at run time.

It is most commonly associated with the act of compiling a higher-level programming language such as C or C++, or an intermediate representation such as Java bytecode or Common Intermediate Language (CIL) code, into native machine code so that the resulting binary file can execute natively, as the output of a standard native compiler does. In this context, it is often seen as the opposite of just-in-time (JIT) compilation.

More generally, the target language of an AOT compilation is not necessarily native machine code; usage of the term varies. Some academic papers use the term to mean compiling Java bytecode to C [1] or to refer to the timing at which an optimization pipeline is run. [2] An academic project [3] uses the term to mean pre-compiling JavaScript to a machine-dependent optimized IR for the V8 JavaScript engine [4] and to a machine-independent bytecode for JavaScriptCore. [5] Some industrial language implementations (e.g. Clojure [6] and the Hermes JavaScript engine [7]) use the term to mean pre-compiling the source language to VM-specific bytecode. The Angular web framework uses the term to mean converting its HTML templates and TypeScript to JavaScript. [8]

In fact, since all static compilation is technically performed ahead of time, the term is often used to emphasize cases where such pre-compiling brings significant performance advantages. Compiling Java to Java bytecode is hence rarely referred to as AOT, since it is usually a requirement, not an optimization.
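
As a small illustration of pre-compiling source to VM-specific bytecode, the following Python sketch separates a build step from a run step. It relies on the built-in `compile` function and the standard `marshal` module; the names `build`, `run`, and the `area` example are hypothetical and chosen only for illustration. Marshalled code objects are specific to the CPython version that produced them, so this is a sketch of the idea under that assumption, not a deployment recipe.

```python
import marshal

# Hypothetical program source that would normally live in a file.
SOURCE = "def area(w, h):\n    return w * h\n"


def build(source: str) -> bytes:
    """Build step: compile source text to a CPython code object and serialize it."""
    code = compile(source, "<area module>", "exec")
    return marshal.dumps(code)


def run(blob: bytes) -> dict:
    """Run step: load the precompiled code and execute it without re-parsing the source."""
    code = marshal.loads(blob)
    namespace: dict = {}
    exec(code, namespace)
    return namespace


blob = build(SOURCE)            # done once, ahead of time
module = run(blob)              # done at run time; no parsing or compiling of source here
result = module["area"](3, 4)
print(result)
```

The run step only deserializes and executes bytecode, which is the sense in which the parsing and compilation work has been moved to build time.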

Reduced runtime overhead

Some programming languages with a managed code runtime that can be compiled to an intermediate representation use just-in-time (JIT) compiling. A JIT compiler translates intermediate code into machine code while the program is running, which may slow an application's performance. Ahead-of-time compiling eliminates this step by performing the translation before execution rather than during it.
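
The difference in when compilation work happens can be sketched with a toy runtime. In this Python sketch, short expression strings stand in for intermediate code and CPython code objects stand in for machine code; the `Runtime` class, its methods, and the `PROGRAM` table are all hypothetical. The counter records how many compilations occur after the program has started running.

```python
# Hypothetical "intermediate code": expression strings keyed by function name.
PROGRAM = {"double": "x * 2", "square": "x * x"}


class Runtime:
    """Toy runtime that compiles functions either before or during execution."""

    def __init__(self, program):
        self.program = program
        self.compiled = {}
        self.running = False
        self.compiles_during_run = 0

    def _compile(self, name):
        # Count only the compilations that happen once the program is running.
        if self.running:
            self.compiles_during_run += 1
        code = compile(self.program[name], f"<{name}>", "eval")
        self.compiled[name] = lambda x, _code=code: eval(_code, {"x": x})

    def precompile_all(self):
        """AOT style: compile every function before execution starts."""
        for name in self.program:
            self._compile(name)

    def start(self):
        self.running = True

    def call(self, name, x):
        # JIT style: compile lazily on first use if nothing was precompiled.
        if name not in self.compiled:
            self._compile(name)
        return self.compiled[name](x)


def run_workload(runtime):
    runtime.start()
    return [runtime.call("double", 5), runtime.call("double", 6), runtime.call("square", 3)]


jit = Runtime(PROGRAM)
jit_results = run_workload(jit)

aot = Runtime(PROGRAM)
aot.precompile_all()
aot_results = run_workload(aot)

print(jit.compiles_during_run, aot.compiles_during_run)
```

Both runtimes return the same results; only the lazily compiling one pays for compilation while the workload is executing.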

Ahead-of-time compiling of dynamically typed languages to native machine code or other static VM bytecode is possible only in a limited number of cases. [citation needed] For example, the High Performance Erlang Project (HiPE) AOT compiler for the language Erlang can do this because of advanced static type reconstruction techniques and type speculations.

For most fully AOT-compiled programs and libraries, part of the runtime environment can be removed, reducing disk space, memory use, battery consumption, and startup time (there is no JIT warm-up phase). For this reason, AOT compilation can be useful on embedded or mobile devices.

Performance trade-offs

AOT compilers can perform complex and advanced code optimizations that would in most cases be considered much too costly for a JIT compiler. In contrast, AOT compilation usually cannot perform some optimizations available to a JIT compiler, such as runtime profile-guided optimization (PGO), pseudo-constant propagation, or indirect-virtual function inlining. An AOT compiler must target a specific architecture, while a JIT compiler can generate code that makes the best use of the actual CPU it is running on, even years after the software has been released.

Further, JIT compilers can speculatively optimize hot code by making assumptions about the code. The generated code can be deoptimized if a speculative assumption later proves wrong. Such an operation slows the running software until the code is optimized again by adaptive optimization. An AOT compiler cannot make such assumptions and needs to infer as much information as possible at compile time. It must resort to less specialized code because it cannot know what types will go through a method. Such problems can be alleviated by profile-guided optimization, but even then the generated code cannot be adapted dynamically to a changing runtime profile as a JIT compiler would do.
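
The speculate, guard, and deoptimize cycle described above can be modelled in a few lines. This Python sketch is a toy model only: the `SpeculativeAdder` class and its methods are hypothetical, the "fast path" merely stands in for machine code specialized for integers, and, unlike a real JIT compiler, the sketch never re-optimizes after a deoptimization.

```python
class SpeculativeAdder:
    """Toy model of JIT-style speculation on argument types."""

    def __init__(self):
        self.specialized = True   # speculation: arguments are always ints
        self.deopt_count = 0

    @staticmethod
    def _int_fast_path(a, b):
        # Stands in for machine code specialized for integer addition.
        return a + b

    @staticmethod
    def _generic(a, b):
        # Less specialized code: works for any operands that support +.
        return a + b

    def add(self, a, b):
        if self.specialized:
            # Guard: check that the speculative assumption still holds.
            if type(a) is int and type(b) is int:
                return self._int_fast_path(a, b)
            # Assumption proved wrong: deoptimize and fall back for good.
            self.specialized = False
            self.deopt_count += 1
        return self._generic(a, b)


adder = SpeculativeAdder()
first = adder.add(1, 2)          # guard passes, fast path taken
second = adder.add("a", "b")     # guard fails, triggers deoptimization
third = adder.add(3, 4)          # generic path from now on
print(first, second, third, adder.deopt_count)
```

An AOT compiler without profile information would, in this model, have to emit only the generic path, since it cannot observe which types actually reach the method.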

Storage trade-offs

References

  1. Jung, Dong-Heon; Park, Jong Kuk; Bae, Sung-Hwan; Lee, Jaemok; Moon, Soo-Mook (2006-10-22). "Efficient exception handling in Java bytecode-to-c ahead-of-time compiler for embedded systems". Proceedings of the 6th ACM & IEEE International conference on Embedded software - EMSOFT '06. EMSOFT '06. Seoul, Korea: Association for Computing Machinery. pp. 188–194. doi:10.1145/1176887.1176915. ISBN 978-1-59593-542-7. S2CID 15591457.
  2. Chambers, Craig (2002-01-14). "Staged compilation". Proceedings of the 2002 ACM SIGPLAN workshop on Partial evaluation and semantics-based program manipulation. PEPM '02. Portland, Oregon: Association for Computing Machinery. pp. 1–8. doi:10.1145/503032.503045. ISBN 978-1-58113-455-1. S2CID 18546907.
  3. Zhuykov, R.; Sharygin, E. (2017-01-01). "Ahead-of-time compilation of JavaScript programs". Programming and Computer Software. 43 (1): 51–59. doi:10.1134/S036176881701008X. ISSN 1608-3261. S2CID 2338849.
  4. ispras/v8-aotc, Ivannikov Institute for System Programming of the Russian Academy of Sciences, 2021-01-30, retrieved 2021-03-17
  5. ispras/webkit-aotc, Ivannikov Institute for System Programming of the Russian Academy of Sciences, 2021-01-30, retrieved 2021-03-17
  6. "Clojure - Ahead-of-time Compilation and Class Generation". clojure.org. Retrieved 2021-03-17.
  7. "Hermes: A new open source JavaScript engine optimized for mobile apps". Facebook Engineering. 2019-07-12. Retrieved 2021-03-17.
  8. "Angular". angular.io. Retrieved 2021-03-17.