Tracing just-in-time compilation is a technique used by virtual machines to optimize the execution of a program at runtime. This is done by recording a linear sequence of frequently executed operations, compiling it to native machine code, and executing it. This contrasts with traditional just-in-time (JIT) compilers, which work on a per-method basis.
Just-in-time compilation is a technique to increase the execution speed of programs by compiling parts of a program to machine code at runtime. One way to categorize JIT compilers is by their compilation scope: whereas method-based JIT compilers translate one method at a time to machine code, tracing JITs use frequently executed loops as their unit of compilation. Tracing JITs are based on the assumptions that programs spend most of their time in a few hot loops and that subsequent loop iterations often take similar paths. Virtual machines that have a tracing JIT are often mixed-mode execution environments, meaning that they have either an interpreter or a method compiler in addition to the tracing JIT.
A tracing JIT compiler goes through various phases at runtime. First, profiling information for loops is collected. After a hot loop has been identified, a special tracing phase is entered, which records all executed operations of that loop. This sequence of operations is called a trace. The trace is then optimized and compiled to machine code. When the loop is executed again, the compiled trace is called instead of interpreting the corresponding program code.
These phases are explained in detail below:
The goal of profiling is to identify hot loops. This is often done by counting the number of iterations of every loop. Once the count for a loop exceeds a certain threshold, the loop is considered hot, and the tracing phase is entered.
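A minimal sketch of this counting scheme, assuming a bytecode interpreter whose dispatch loop reports every backward jump (the end of one loop iteration); the threshold value and the start_tracing hook are illustrative, not taken from any particular VM:

HOT_THRESHOLD = 1000              # illustrative; real VMs tune this value

loop_counters = {}                # loop header position -> iteration count

def start_tracing(header_pc):     # stub for the tracing phase described next
    print(f"loop at pc={header_pc} is hot, tracing begins")

def on_backward_jump(header_pc):
    # The dispatch loop calls this on every jump back to a loop header,
    # i.e. at the end of each iteration. Once the counter crosses the
    # threshold, the loop is considered hot and the tracing phase starts.
    loop_counters[header_pc] = loop_counters.get(header_pc, 0) + 1
    if loop_counters[header_pc] == HOT_THRESHOLD:
        start_tracing(header_pc)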
In the tracing phase, execution of the loop proceeds normally, but in addition every executed operation is recorded into a trace. The recorded operations are typically stored in a trace tree, often in an intermediate representation (IR). Tracing follows function calls, which leads to them being inlined into the trace. Tracing continues until the loop reaches its end and jumps back to the start.
Since the trace is recorded by following one concrete execution path of the loop, later executions of that trace can diverge from that path. To identify the places where this can happen, special guard instructions are inserted into the trace. One example of such a place is an if statement. A guard is a quick check that determines whether the original condition still holds; if a guard fails, execution of the trace is aborted.
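The following sketch illustrates both recording and guard insertion for a toy instruction set; the operation names, the trace format, and the helper functions are invented for this example:

trace = []

def execute_add(env, dst, a, b, tracing):
    env[dst] = env[a] + env[b]
    if tracing:
        trace.append(("int_add", dst, a, b))   # record the executed operation

def execute_branch_if_gt(env, a, const, tracing):
    taken = env[a] > const
    if tracing:
        # A branch is not recorded as a branch. Instead, a guard asserts
        # that later executions of the trace take the same path that was
        # observed during recording; if it fails, the trace is left.
        kind = "guard_true" if taken else "guard_false"
        trace.append((kind, ("int_gt", a, const)))
    return taken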
Since tracing is done during execution, the trace can be made to contain runtime information (e.g. type information). This information can later be used in the optimization phase to increase code efficiency.
Traces are easy to optimize, since they represent only one execution path and therefore contain no control flow that needs handling. Typical optimizations include common-subexpression elimination, dead-code elimination, register allocation, invariant code motion, constant folding, and escape analysis. [1]
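As an illustration, constant folding over a linear trace needs only a single forward pass, precisely because there is no control flow to consider; the tuple-based trace format here is invented for the example:

def fold_constants(trace):
    known = {}                        # variable name -> constant value
    optimized = []
    for result, op, a, b in trace:
        a = known.get(a, a)           # substitute constants where known
        b = known.get(b, b)
        if isinstance(a, int) and isinstance(b, int):
            known[result] = {"int_add": a + b, "int_mul": a * b}[op]
            # the folded operation is dropped entirely (dead code)
        else:
            optimized.append((result, op, a, b))
    return optimized

print(fold_constants([("i2", "int_mul", 3, 4),
                      ("y2", "int_add", "y1", "i2")]))
# -> [('y2', 'int_add', 'y1', 12)]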
After the optimization, the trace is turned into machine code. Similarly to optimization, this is easy due to the linear nature of traces.
After the trace has been compiled to machine code, it can be executed in subsequent iterations of the loop. Trace execution continues until a guard fails.
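A sketch of this hand-off between compiled traces and the interpreter, assuming guard failures surface as an exception carrying the position at which interpretation must resume (all names are illustrative):

class GuardFailed(Exception):
    def __init__(self, resume_pc):
        self.resume_pc = resume_pc    # program point where the guard failed

def run(pc, env, compiled_traces, interpret_from):
    trace_fn = compiled_traces.get(pc)
    if trace_fn is not None:
        try:
            trace_fn(env)             # loops in native code until a guard fails
        except GuardFailed as gf:
            pc = gf.resume_pc         # fall back to normal interpretation
    return interpret_from(pc, env)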
Whereas the idea of JIT compilation reaches back to the 1960s, tracing JITs have come into wider use only recently. The first mention of an idea resembling today's tracing JITs was in 1970. [2] It was observed that compiled code could be derived from an interpreter at run time by simply storing the actions performed during interpretation.
The first implementation of tracing was Dynamo, "a software dynamic optimization system that is capable of transparently improving the performance of a native instruction stream as it executes on the processor". [3] To do this, the native instruction stream is interpreted until a "hot" instruction sequence is found. For this sequence an optimized version is generated, cached and executed.
Dynamo was later extended to DynamoRIO. One DynamoRIO-based project was a framework for interpreter construction that combines tracing and partial evaluation. It was used to "dynamically remove interpreter overhead from language implementations". [4]
In 2006, HotpathVM, the first tracing JIT compiler for a high-level language, was developed. [5] This VM was capable of dynamically identifying frequently executed bytecode instructions, which it traced and then compiled to machine code using static single assignment (SSA) construction. The motivation for HotpathVM was to have an efficient JVM for resource-constrained mobile devices.
Another example of a tracing JIT is TraceMonkey, one of Mozilla’s JavaScript implementations for Firefox (2009). [6] TraceMonkey compiles frequently executed loop traces in the dynamic language JavaScript at run-time and specializes the generated code for the actual dynamic types occurring on each path.
Another project that utilizes tracing JITs is PyPy. It enables the use of tracing JITs for language implementations written with PyPy's translation toolchain, thus improving the performance of any program executed with such an interpreter. This is achieved by tracing the interpreter itself rather than the program the interpreter executes. [7]
Tracing JITs have also been explored by Microsoft in the SPUR project for their Common Intermediate Language (CIL). SPUR is a generic tracer for CIL, which can also be used to trace through a JavaScript implementation. [8]
Consider the following Python program that computes a sum of squares of successive whole numbers until that sum exceeds 100000:
def square(x):
    return x * x

i = 0
y = 0
while True:
    y += square(i)
    if y > 100000:
        break
    i = i + 1
A trace for this program could look something like this:
loopstart(i1, y1)
i2 = int_mul(i1, i1)     # i*i
y2 = int_add(y1, i2)     # y += i*i
b1 = int_gt(y2, 100000)
guard_false(b1)
i3 = int_add(i1, 1)      # i = i+1
jump(i3, y2)
Note how the function call to square is inlined into the trace and how the if statement is turned into a guard_false.
In computing, a compiler is a computer program that translates computer code written in one programming language into another language. The name "compiler" is primarily used for programs that translate source code from a high-level programming language to a low-level programming language to create an executable program.
A Java virtual machine (JVM) is a virtual machine that enables a computer to run Java programs as well as programs written in other languages that are also compiled to Java bytecode. The JVM is detailed by a specification that formally describes what is required in a JVM implementation. Having a specification ensures interoperability of Java programs across different implementations so that program authors using the Java Development Kit (JDK) need not worry about idiosyncrasies of the underlying hardware platform.
In computing, a virtual machine (VM) is the virtualization or emulation of a computer system. Virtual machines are based on computer architectures and provide the functionality of a physical computer. Their implementations may involve specialized hardware, software, or a combination of the two. Virtual machines differ and are organized by their function; the two main categories are system virtual machines, which provide a substitute for a complete machine, and process virtual machines, which execute individual programs in a platform-independent environment.
In computer science, an interpreter is a computer program that directly executes instructions written in a programming or scripting language, without requiring them previously to have been compiled into a machine language program. An interpreter generally uses one of the following strategies for program execution: parsing the source code and executing its behavior directly; translating the source code into an efficient intermediate representation and immediately executing that; or executing stored precompiled bytecode made by a compiler that is part of the interpreter system.
Bytecode is a form of instruction set designed for efficient execution by a software interpreter. Unlike human-readable source code, bytecodes are compact numeric codes, constants, and references that encode the result of the compiler's parsing and semantic analysis of things like the type, scope, and nesting depths of program objects.
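CPython, for example, compiles source code to its own bytecode, which the standard dis module can display (the exact opcodes vary between CPython versions):

import dis

def square(x):
    return x * x

dis.dis(square)
# Prints the compact numeric opcodes with symbolic names, e.g. on
# CPython 3.12: LOAD_FAST x, LOAD_FAST x, BINARY_OP 5 (*), RETURN_VALUE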
In computing, just-in-time (JIT) compilation is compilation during execution of a program rather than before execution. This may consist of source code translation but is more commonly bytecode translation to machine code, which is then executed directly. A system implementing a JIT compiler typically continuously analyses the code being executed and identifies parts of the code where the speedup gained from compilation or recompilation would outweigh the overhead of compiling that code.
In compiler design, static single assignment form is a type of intermediate representation (IR) where each variable is assigned exactly once. SSA is used in most high-quality optimizing compilers for imperative languages, including LLVM, the GNU Compiler Collection, and many commercial compilers.
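A minimal illustration of the renaming idea, using ordinary Python assignments to stand in for IR instructions (the trace in the example above is already in this form, with names like i1, i2, i3):

# Original straight-line code: y is reassigned twice, so "the value
# of y" is ambiguous without looking at program position.
y = 1
y = y + 2
y = y * 3

# The same computation in SSA form: every variable is assigned exactly
# once, giving each intermediate value a single, unambiguous definition.
y1 = 1
y2 = y1 + 2
y3 = y2 * 3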
In compiler optimization, register allocation is the process of assigning local automatic variables and expression results to a limited number of processor registers.
In software engineering, profiling is a form of dynamic program analysis that measures, for example, the space (memory) or time complexity of a program, the usage of particular instructions, or the frequency and duration of function calls. Most commonly, profiling information serves to aid program optimization, and more specifically, performance engineering.
PyPy is an implementation of the Python programming language. PyPy often runs faster than the standard implementation CPython because PyPy uses a just-in-time compiler. Most Python code runs well on PyPy except for code that depends on CPython extensions, which either does not work or incurs some overhead when run in PyPy.
In computer programming, a programming language implementation is a system for executing computer programs. There are two general approaches to programming language implementation: interpretation and compilation.
Thread Level Speculation (TLS), also known as Speculative Multi-threading, or Speculative Parallelization, is a technique to speculatively execute a section of computer code that is anticipated to be executed later in parallel with the normal execution on a separate independent thread. Such a speculative thread may need to make assumptions about the values of input variables. If these prove to be invalid, then the portions of the speculative thread that rely on these input variables will need to be discarded and squashed. If the assumptions are correct the program can complete in a shorter time provided the thread was able to be scheduled efficiently.
In computer science, ahead-of-time compilation is the act of compiling an (often) higher-level programming language into an (often) lower-level language before execution of a program, usually at build-time, to reduce the amount of work needed to be performed at run time.
Dalvik is a discontinued process virtual machine (VM) in the Android operating system that executes applications written for Android. Dalvik was an integral part of the Android software stack in the Android versions 4.4 "KitKat" and earlier, which were commonly used on mobile devices such as mobile phones and tablet computers, as well as on some devices such as smart TVs and wearables. Dalvik is open-source software, originally written by Dan Bornstein, who named it after the fishing village of Dalvík in Eyjafjörður, Iceland.
Profile-guided optimization, also known as profile-directed feedback (PDF) or feedback-directed optimization (FDO), is a compiler optimization technique in computer programming that uses profiling to improve program runtime performance.
A trace tree is a data structure that is used in the runtime compilation of programming code. Trace trees are used in tracing just-in-time compilation where tracing is used during code execution to look for hot spots before compilation. When those hot spots are entered again the compiled code is run instead. Each statement executed is traced, including within other function calls, and the entire execution path is compiled. This is different from compiling individual functions. More information can be gained allowing better compiler optimizations, including the removal of some function call overhead. The interpreter is called to continue whenever compiled code makes calls to code outside the compilation contexts.
LuaJIT is a tracing just-in-time compiler for the Lua programming language. Mike Pall, the project's primary maintainer, resigned in 2015 and has since contributed only occasional patches to the future 2.1 version.
GraalVM is a Java Development Kit (JDK) written in Java. The open-source distribution of GraalVM is based on OpenJDK, and the enterprise distribution is based on Oracle JDK. As well as just-in-time (JIT) compilation, GraalVM can compile a Java application ahead of time. This allows for faster initialization, greater runtime performance, and decreased resource consumption, but the resulting executable can only run on the platform it was compiled for.
HipHop Virtual Machine (HHVM) is an open-source virtual machine based on just-in-time (JIT) compilation that serves as an execution engine for the Hack programming language. By using the principle of JIT compilation, Hack code is first transformed into intermediate HipHop bytecode (HHBC), which is then dynamically translated into x86-64 machine code, optimized, and natively executed. This contrasts with PHP's usual interpreted execution, in which the Zend Engine transforms PHP source code into opcodes that serve as a form of bytecode, and executes the opcodes directly on the Zend Engine's virtual CPU.
Android Runtime (ART) is an application runtime environment used by the Android operating system. Replacing Dalvik, the process virtual machine originally used by Android, ART performs the translation of the application's bytecode into native instructions that are later executed by the device's runtime environment.