Bootstrappable builds is a process of compiling software that doesn't depend on (compiler) binaries that aren't themselves built from source by this process. [1] [2] [3]
This process can protect against compiler backdoors: if the build process doesn't depend on binary code that is difficult to audit, then a compiler backdoor cannot be hidden in compiler binaries anymore.
One way for a software distribution to tackle the issue is to reduce the size of the binaries used to bootstrap the distribution until they are no longer needed, or until they are small enough to be easily reviewed by humans. [4]
Many compilers for various programming languages are written in the language they target. For instance, the official Go compiler (gc) is written in Go.
So without alternative compilers such as GCC, which are written in another programming language (here C and C++), building the Go compiler would require a binary of a previous version of the Go compiler.
To achieve bootstrappable builds, it is often possible to find an older version of the compiler that can be built from source, and from there to write code that automatically builds each successive version until a recent one is reached. Identifying which version can build which is often not trivial, and the bootstrap procedure often results in very long compilation times. Sometimes this also requires maintaining older compiler versions and backporting support for newer CPU architectures to them so that those architectures can be bootstrapped. GCC 4.7, for example, is the last version that can be compiled using tcc, and it can then go on to compile newer versions of GCC. [5]
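Working out which version can build which can be treated as a path search over a "can build" relation. The following is only a minimal sketch under stated assumptions: the `can_build` table and the function name `bootstrap_chain` are hypothetical, and apart from the tcc to GCC 4.7 step mentioned above, the version labels are placeholders rather than verified facts about any real compiler.

```python
from collections import deque


def bootstrap_chain(can_build, seed, target):
    """Return the shortest chain of compilers from seed to target, or None.

    can_build maps a compiler to the compilers it is able to build
    (hypothetical data; a real distribution would have to establish it by testing).
    """
    previous = {seed: None}          # remembers which compiler built each one
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        if current == target:
            chain = []
            while current is not None:
                chain.append(current)
                current = previous[current]
            return chain[::-1]       # reverse so the chain starts at the seed
        for nxt in can_build.get(current, ()):
            if nxt not in previous:  # visit each compiler only once
                previous[nxt] = current
                queue.append(nxt)
    return None                      # no bootstrap path exists


# Placeholder relation: only "tcc builds gcc-4.7" comes from the text above.
can_build = {
    "tcc": ["gcc-4.7"],
    "gcc-4.7": ["gcc-newer"],
    "gcc-newer": ["gcc-recent"],
}

chain = bootstrap_chain(can_build, "tcc", "gcc-recent")
print(" -> ".join(chain))
```

A breadth-first search is used here because it yields the chain with the fewest intermediate compilers, which is one plausible way to limit the long compilation times of a bootstrap; a real procedure might weigh build time per step instead.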
This process can also be replaced or combined with other ways to bootstrap compilers.
For instance, it is also possible to write a new compiler for a language in a different language.
These techniques can be used to reduce the size of the binaries used to bootstrap a distribution.
As for building the first compiler that can build the subsequent compilers, it is possible to reduce the starting point to a single 357-byte binary [6] and from there use multiple stages in the bootstrapping procedure to build a C compiler, which can then build the other compilers and software. [7]
Software can depend on itself for compilation, and its first version may have been compiled in a way that isn't bootstrappable.
Gradle is one such case, as it depends on Scala, which had a proprietary dependency in its first release [8], and on Kotlin, which depends on itself and on Gradle to be compiled. [9]
The Bootstrappable Builds project was started in 2016 as a spin-off of the Reproducible Builds project. [3]
In 2022, Guix gained the ability to be built from the aforementioned 357-byte binary. [6]
The GNU Compiler Collection (GCC) is a collection of compilers from the GNU Project that support various programming languages, hardware architectures and operating systems. The Free Software Foundation (FSF) distributes GCC as free software under the GNU General Public License. GCC is a key component of the GNU toolchain which is used for most projects related to GNU and the Linux kernel. With roughly 15 million lines of code in 2019, GCC is one of the largest free programs in existence. It has played an important role in the growth of free software, as both a tool and an example.
The GNU Debugger (GDB) is a portable debugger that runs on many Unix-like systems and works for many programming languages, including Ada, Assembly, C, C++, D, Fortran, Haskell, Go, Objective-C, OpenCL C, Modula-2, Pascal, Rust, and partially others.
In computing, a linker or link editor is a computer system program that takes one or more object files and combines them into a single executable file, library file, or another "object" file.
MMIX is a 64-bit reduced instruction set computing (RISC) architecture designed by Donald Knuth, with significant contributions by John L. Hennessy and Richard L. Sites. Knuth has said that:
"MMIX is a computer intended to illustrate machine-level aspects of programming. In my books The Art of Computer Programming, it replaces MIX, the 1960s-style machine that formerly played such a role… I strove to design MMIX so that its machine language would be simple, elegant, and easy to learn. At the same time I was careful to include all of the complexities needed to achieve high performance in practice, so that MMIX could in principle be built and even perhaps be competitive with some of the fastest general-purpose computers in the marketplace."
The GNU Compiler for Java (GCJ) is a discontinued free compiler for the Java programming language. It was part of the GNU Compiler Collection.
In software development, Make is a command-line interface software tool that performs actions ordered by configured dependencies as defined in a configuration file called a makefile. It is commonly used for build automation to build executable code from source code. But, not limited to building, Make can perform any operation available via the operating system shell.
MinGW, formerly mingw32, is a free and open source software development environment to create Microsoft Windows applications.
An incremental compiler is a kind of incremental computation applied to the field of compilation. Quite naturally, whereas ordinary compilers make a so-called clean build, that is, (re)build all program modules, an incremental compiler recompiles only modified portions of a program.
Technical variations of Linux distributions include support for different hardware devices and systems or software package configurations. Organizational differences may be motivated by historical reasons. Other criteria include security, including how quickly security upgrades are available; ease of package management; and number of packages available.
In computer science, bootstrapping is the technique for producing a self-compiling compiler – that is, a compiler written in the source programming language that it intends to compile. An initial core version of the compiler is generated in a different language; successive expanded versions of the compiler are developed using this minimal subset of the language. The problem of compiling a self-compiling compiler has been called the chicken-or-egg problem in compiler design, and bootstrapping is a solution to this problem.
The Tiny C Compiler (TCC) is an x86, x86-64 and ARM processor C compiler initially written by Fabrice Bellard. It is designed to work on slow computers with little disk space. Windows operating system support was added in version 0.9.23. TCC is distributed under the GNU Lesser General Public License.
In C and related programming languages, long double refers to a floating-point data type that is often more precise than double precision, though the language standard only requires it to be at least as precise as double. As with C's other floating-point types, it may not necessarily map to an IEEE format.
Swiftfox was a web browser based on Mozilla Firefox. It was available for Linux platforms and distributed by Jason Halme. Swiftfox was a set of builds of Firefox optimized for different Intel and AMD microprocessors. Swiftfox was freely downloadable with open source code and proprietary binaries. Firefox extensions and plugins were compatible with Swiftfox, with notable exceptions. The name Swiftfox comes from the animal swift fox. Swiftfox differs from Firefox by a limited number of changes, and builds for different processors. Swiftfox was discontinued at some point prior to April 2017, and the project homepage now redirects to the creator's private Twitter account.
Swiftweasel was a fork of Mozilla Firefox available for the Linux platform only.
According to the Free Software Foundation Latin America, Linux-libre is a modified version of the Linux kernel that contains no binary blobs, obfuscated code, or code released under proprietary licenses. In the Linux kernel, those types of code are mostly used for proprietary firmware images. While generally redistributable, they do not give the user the freedom to audit, modify, or, consequently, redistribute their modified versions. The GNU Project keeps Linux-libre in synchronization with the mainline Linux kernel.
GNU Guix is a functional cross-platform package manager and a tool to instantiate and manage Unix-like operating systems, based on the Nix package manager. Configuration and package recipes are written in Guile Scheme. GNU Guix is the default package manager of the GNU Guix System distribution.
In computer programming, self-hosting is the use of a program as part of the toolchain or operating system that produces new versions of that same program—for example, a compiler that can compile its own source code. Self-hosting software is commonplace on personal computers and larger systems. Other programs that are typically self-hosting include kernels, assemblers, command-line interpreters and revision control software.
GNU Guix System or Guix System is a rolling release, free and open source Linux distribution built around the GNU Guix package manager. It enables a declarative operating system configuration and allows system upgrades that the user can rollback. It uses the GNU Shepherd init system and the Linux-libre kernel, with the support of the GNU Hurd kernel under development. On February 3, 2015, the Free Software Foundation added the distribution to its list of endorsed free Linux distributions. The Guix package manager and the Guix System drew inspiration from and were based on the Nix package manager and NixOS respectively.