Cranelift

Developer(s): Bytecode Alliance
Initial release: 2016
Stable release: 0.105.2 / February 28, 2024
Written in: Rust
Operating system: Cross-platform
Type: Compiler
License: Apache License with LLVM exceptions [1]
Website: cranelift.dev

Cranelift (formerly known as Cretonne) is an optimizing compiler backend that converts a target-independent intermediate representation into executable machine code. It is written in Rust. The project started in 2016 and is currently developed by the Bytecode Alliance. [2] [3] Unlike compiler backends such as LLVM, which focus more on ahead-of-time compilation, Cranelift focuses on just-in-time compilation, and short compile time is an explicit goal of the project. [4]


As of 2023, Cranelift supports instruction set architectures such as x86-64, AArch64, RISC-V, and IBM z/Architecture.

History

Before the backend framework was rewritten in 2020, the project used a single intermediate representation (IR), namely CLIF (Cranelift IR Format), across all compilation stages up to machine code emission. The old design relied on instruction legalization: the high-level IR was transformed repeatedly until each CLIF instruction corresponded to an instruction of the target machine, at which point the executable code could be emitted.
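The legalization idea described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the `Inst` enum, the `Dec` and `AddImm` instructions, and the pretend target that only supports `Const` and `Add` are hypothetical and are not Cranelift's real data structures or legalization rules. The sketch only shows the general shape of the technique, which is to rewrite within one IR until every instruction is one the target accepts.

```rust
// Toy single-IR legalizer. All names are hypothetical stand-ins,
// not Cranelift's actual CLIF types.
#[derive(Debug, Clone, PartialEq)]
enum Inst {
    Const(u32, i64),       // dst = imm
    Add(u32, u32, u32),    // dst = a + b
    AddImm(u32, u32, i64), // dst = a + imm  (illegal on the toy target)
    Dec(u32, u32),         // dst = a - 1    (illegal on the toy target)
}

// The pretend target can only encode `Const` and `Add`.
fn is_legal(inst: &Inst) -> bool {
    matches!(inst, Inst::Const(..) | Inst::Add(..))
}

// Rewrite the instruction list repeatedly until every instruction is legal.
// `next_reg` is the first register number that is free for temporaries.
fn legalize(mut code: Vec<Inst>, mut next_reg: u32) -> Vec<Inst> {
    while code.iter().any(|inst| !is_legal(inst)) {
        let mut out = Vec::with_capacity(code.len());
        for inst in code {
            match inst {
                // Dec expands to another illegal instruction, so it takes
                // a second pass before the program is fully legal.
                Inst::Dec(dst, a) => out.push(Inst::AddImm(dst, a, -1)),
                Inst::AddImm(dst, a, imm) => {
                    let tmp = next_reg;
                    next_reg += 1;
                    out.push(Inst::Const(tmp, imm));
                    out.push(Inst::Add(dst, a, tmp));
                }
                legal => out.push(legal),
            }
        }
        code = out;
    }
    code
}

fn main() {
    // r1 = r0 - 1, written with the illegal `Dec` instruction.
    let input = vec![Inst::Const(0, 5), Inst::Dec(1, 0)];
    for inst in legalize(input, 2) {
        println!("{:?}", inst);
    }
}
```

In this sketch `Dec` is first rewritten into `AddImm`, which is itself illegal and is expanded on the next pass, mirroring the "transform until legal" loop the paragraph describes.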

In 2020, the backend was rewritten to use a separate, machine-specific IR called VCode for the later compilation stages. After this rewrite, high-level CLIF input is optimized by a middle layer before being lowered into VCode, which is then further transformed by the remaining passes in the backend. [4] Peepmatic, a tool that generates a peephole optimizer from a domain-specific language (DSL), was also added, [5] though it was later removed. [6]
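As a rough illustration of the two-IR design, the sketch below lowers a tiny target-independent IR into a separate machine-specific one. The types `Clif`, `VCode`, `Value` and `VReg` here are hypothetical toys that merely borrow the names from the text, and the two-address `Add` form is an assumption chosen to make the machine-specific IR look different from its input. It is not how Cranelift's real lowering is implemented.

```rust
// Toy lowering from a target-independent IR to a machine-specific IR.
// These types are illustrative stand-ins, not Cranelift's real ones.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Value(u32); // SSA value in the high-level IR

#[derive(Debug, Clone, PartialEq)]
enum Clif {
    Iconst(Value, i64),        // dst = imm
    Iadd(Value, Value, Value), // dst = a + b (three-address, SSA)
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct VReg(u32); // virtual register, assigned to a real one later

#[derive(Debug, Clone, PartialEq)]
enum VCode {
    MovImm(VReg, i64), // dst = imm
    Mov(VReg, VReg),   // dst = src
    Add(VReg, VReg),   // dst += src (two-address, as on the toy target)
}

// Lower each high-level instruction into one or more machine-specific
// instructions that still use virtual registers.
fn lower(func: &[Clif]) -> Vec<VCode> {
    let mut out = Vec::new();
    for inst in func {
        match inst {
            Clif::Iconst(dst, imm) => out.push(VCode::MovImm(VReg(dst.0), *imm)),
            Clif::Iadd(dst, a, b) => {
                // A three-address add becomes a copy plus a two-address add.
                out.push(VCode::Mov(VReg(dst.0), VReg(a.0)));
                out.push(VCode::Add(VReg(dst.0), VReg(b.0)));
            }
        }
    }
    out
}

fn main() {
    let func = [
        Clif::Iconst(Value(0), 2),
        Clif::Iconst(Value(1), 3),
        Clif::Iadd(Value(2), Value(0), Value(1)),
    ];
    for inst in lower(&func) {
        println!("{:?}", inst);
    }
}
```

Later backend passes, such as register allocation, would then operate on the machine-specific form rather than on the original IR.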

In 2022, a new register allocator, adapted from the one in IonMonkey, was added. In addition, the backend framework was reworked to use ISLE (Instruction Selection/Lowering Expressions), another DSL in the project, which was created to ease the development of the instruction selection stage in backends. [7]
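Instruction selection by pattern rules can be sketched in plain Rust as follows. This is not ISLE syntax and not Cranelift's actual rule set: the `Expr` and `MInst` types, the `Selector` struct, and the fused `MulAdd` instruction are hypothetical, and the rules are hand-written `match` arms rather than code generated from a DSL. The sketch only shows the general idea that more specific tree patterns are tried before a generic fallback.

```rust
// Toy instruction selector using prioritized tree patterns.
// All types and rules are hypothetical and only illustrate the idea.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Arg(u32),
    Iconst(i64),
    Iadd(Box<Expr>, Box<Expr>),
    Imul(Box<Expr>, Box<Expr>),
}

#[derive(Debug, Clone, PartialEq)]
enum MInst {
    LoadArg { dst: u32, index: u32 },
    MovImm { dst: u32, imm: i64 },
    Add { dst: u32, a: u32, b: u32 },
    AddImm { dst: u32, a: u32, imm: i64 },
    Mul { dst: u32, a: u32, b: u32 },
    MulAdd { dst: u32, a: u32, b: u32, addend: u32 }, // dst = a * b + addend
}

#[derive(Default)]
struct Selector {
    next_reg: u32,
    out: Vec<MInst>,
}

impl Selector {
    // Allocate a fresh virtual register and emit the instruction defining it.
    fn emit(&mut self, make: impl FnOnce(u32) -> MInst) -> u32 {
        let dst = self.next_reg;
        self.next_reg += 1;
        self.out.push(make(dst));
        dst
    }

    // Returns the virtual register holding the value of `e`.
    fn lower(&mut self, e: &Expr) -> u32 {
        match e {
            Expr::Arg(index) => {
                let index = *index;
                self.emit(|dst| MInst::LoadArg { dst, index })
            }
            Expr::Iconst(imm) => {
                let imm = *imm;
                self.emit(|dst| MInst::MovImm { dst, imm })
            }
            Expr::Iadd(x, y) => match &**y {
                // Rule: (iadd x (iconst k)) => add-immediate.
                Expr::Iconst(imm) => {
                    let (a, imm) = (self.lower(x), *imm);
                    self.emit(|dst| MInst::AddImm { dst, a, imm })
                }
                // Rule: (iadd x (imul p q)) => fused multiply-add.
                Expr::Imul(p, q) => {
                    let addend = self.lower(x);
                    let a = self.lower(p);
                    let b = self.lower(q);
                    self.emit(|dst| MInst::MulAdd { dst, a, b, addend })
                }
                // Fallback rule: plain register-register add.
                _ => {
                    let a = self.lower(x);
                    let b = self.lower(y);
                    self.emit(|dst| MInst::Add { dst, a, b })
                }
            },
            Expr::Imul(x, y) => {
                let a = self.lower(x);
                let b = self.lower(y);
                self.emit(|dst| MInst::Mul { dst, a, b })
            }
        }
    }
}

// Convenience wrapper: returns the emitted instructions and the result register.
fn select(e: &Expr) -> (Vec<MInst>, u32) {
    let mut s = Selector::default();
    let result = s.lower(e);
    (s.out, result)
}

fn main() {
    // arg0 + arg1 * arg2
    let e = Expr::Iadd(
        Box::new(Expr::Arg(0)),
        Box::new(Expr::Imul(Box::new(Expr::Arg(1)), Box::new(Expr::Arg(2)))),
    );
    let (code, result) = select(&e);
    for inst in &code {
        println!("{:?}", inst);
    }
    println!("result in v{}", result);
}
```

Writing such rules by hand for every target quickly becomes repetitive, which is the kind of work a dedicated rule language is meant to reduce.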

In 2022, the mid-end gained optimization passes that make use of e-graphs. [8] These passes have been enabled by default since 2023.

Applications

Cranelift is used by multiple WebAssembly runtimes, such as Wasmtime and Wasmer, with performance comparable to LLVM-based runtimes. [9]

A Cranelift-based alternative code generator for the Rust compiler also exists. [10]

References

  1. "LICENSE.md". GitHub.
  2. "GitHub - bytecodealliance/cranelift at 399be07fcf00d0974d2a83d1af55f0bee9861d54". GitHub. Retrieved 26 January 2023.
  3. "wasmtime/cranelift at main · bytecodealliance/wasmtime". GitHub. Retrieved 26 January 2023.
  4. Fallin, Chris. "A New Backend for Cranelift, Part 1: Instruction Selection". cfallin.org. Retrieved 26 January 2023.
  5. "Introduce peepmatic: a peephole optimizations DSL and peephole optimizer compiler by fitzgen · Pull Request #1647 · bytecodealliance/wasmtime". GitHub.
  6. "Remove Peepmatic!!! by fitzgen · Pull Request #3543 · bytecodealliance/wasmtime". GitHub. Retrieved 2023-02-03.
  7. Fallin, Chris. "Cranelift's Instruction Selector DSL, ISLE: Term-Rewriting Made Practical". cfallin.org. Retrieved 26 January 2023.
  8. "egraph-based midend: draw the rest of the owl (productionized). by cfallin · Pull Request #4953 · bytecodealliance/wasmtime". GitHub. Retrieved 26 January 2023.
  9. Denis, Frank. "Performance of WebAssembly runtimes in 2023 | Frank DENIS random thoughts". 00f.net. Retrieved 26 January 2023.
  10. "Cranelift codegen backend for rust". GitHub. 26 January 2023. Retrieved 26 January 2023.