In software engineering, retargeting is an attribute of software development tools that have been specifically designed to generate code for more than one computing platform.
A retargetable compiler is a compiler that has been designed to be relatively easy to modify to generate code for different CPU instruction set architectures. The history of this idea dates back to the 1950s when UNCOL was proposed as the universal intermediate language. The Pascal P-compiler is an example of an early widely used retargetable compiler.
The cost of producing a retargetable compiler that generates code of similar quality to a non-retargetable compiler (i.e., one designed to only ever produce code for a single processor) is higher, because it is not possible to exploit CPU-specific details throughout all phases of compilation. The benefit of a retargetable compiler is that its total cost over multiple CPUs is much lower than the combined cost of many individual non-retargetable compilers.
Some retargetable compilers, e.g. GCC, have become so widely ported and developed that they now support so many optimizations and machine-specific details that the quality of their generated code often surpasses that of non-retargetable compilers on many CPUs.
A general-purpose global optimizer followed by machine-specific peephole optimization is a commonly used implementation technique.[1] The optimization of code for some high-performance processors requires a detailed and specific knowledge of the architecture and how the instructions are executed. Unless developers invest the large amount of time necessary to write a code generator specifically for an architecture, the optimizations performed by a retargetable compiler will only be those applicable to generic processor characteristics.
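As a hedged illustration of the peephole step, the sketch below slides a two-instruction window over a toy instruction list and rewrites one known-redundant pattern; the Insn type, opcodes, and the pattern itself are invented for this example rather than taken from any particular compiler.

```c
#include <stddef.h>

typedef enum { MOV, ADD, NOP } Op;

typedef struct {
    Op  op;
    int dst;   /* destination register number */
    int src;   /* source register number      */
} Insn;

/* Slide a two-instruction window over the code and rewrite a known-redundant
 * pair: after "MOV a,b", a following "MOV b,a" has no effect, so it is
 * turned into a NOP that a later pass can delete. */
void peephole(Insn *code, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++) {
        Insn *first = &code[i], *second = &code[i + 1];
        if (first->op == MOV && second->op == MOV &&
            first->dst == second->src && first->src == second->dst) {
            second->op = NOP;
        }
    }
}
```

A real back end applies many machine-specific rewrites of this kind, pattern-matching on its own low-level intermediate representation rather than on a toy structure like this one.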
A retargetable compiler is a kind of cross compiler. Often (but not always) a retargetable compiler is portable (the compiler itself can run on several different CPUs) and self-hosting.
Examples of retargetable compilers include GCC, LLVM (with front ends such as Clang), and the Amsterdam Compiler Kit.
retdec ("Retargetable Decompiler") is an open-source retargetable machine-code decompiler based on LLVM.[citation needed]
Conversely, retargetable assemblers are capable of generating object files of different formats, which is useful in porting assembly language programs to various operating systems that run on the same CPU architecture (such as Windows and Linux on the x86 platform). NASM is one such assembler.
In computer programming, assembly language, often abbreviated asm, is any low-level programming language in which there is a very strong correspondence between the instructions in the language and the architecture's machine code instructions. Because assembly depends on the machine code instructions, every assembly language is designed for exactly one specific computer architecture. Assembly language may also be called symbolic machine code.
In computing, a compiler is a computer program that translates computer code written in one programming language into another language. The name "compiler" is primarily used for programs that translate source code from a high-level programming language to a lower level language to create an executable program.
The GNU Compiler Collection (GCC) is an optimizing compiler produced by the GNU Project supporting various programming languages, hardware architectures and operating systems. The Free Software Foundation (FSF) distributes GCC as free software under the GNU General Public License. GCC is a key component of the GNU toolchain and the standard compiler for most projects related to GNU and the Linux kernel. With roughly 15 million lines of code in 2019, GCC is one of the biggest open source programs in existence. It has played an important role in the growth of free software, as both a tool and an example.
In computer science, register transfer language (RTL) is a kind of intermediate representation (IR) that is very close to assembly language, such as that which is used in a compiler. It is used to describe data flow at the register-transfer level of an architecture. Academic papers and textbooks often use a form of RTL as an architecture-neutral assembly language. RTL is used as the name of a specific intermediate representation in several compilers, including the GNU Compiler Collection (GCC), Zephyr, and the European compiler projects CerCo and CompCert.
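As a rough sketch of what such an IR looks like, GCC can dump its RTL for a small C function (for example with the -fdump-rtl-expand option); the comment below shows a simplified form of the expression that the addition becomes, with the surrounding insn wrappers omitted and the pseudo-register numbers chosen arbitrarily.

```c
/* sum.c -- compiling with "gcc -c -fdump-rtl-expand sum.c" writes an RTL dump
 * in which the addition appears, simplified, roughly as:
 *
 *   (set (reg:SI 91)
 *        (plus:SI (reg:SI 92) (reg:SI 93)))
 *
 * i.e. "pseudo-register 91 := pseudo-register 92 + pseudo-register 93" at the
 * register-transfer level. */
int sum(int a, int b)
{
    return a + b;
}
```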
In computing, an optimizing compiler is a compiler that tries to minimize or maximize some attributes of an executable computer program. Common requirements are to minimize a program's execution time, memory footprint, storage size, and power consumption.
Single instruction, multiple data (SIMD) is a class of parallel computers in Flynn's taxonomy. It describes computers with multiple processing elements that perform the same operation on multiple data points simultaneously. Such machines exploit data level parallelism, but not concurrency: there are simultaneous (parallel) computations, but only a single process (instruction) at a given moment. SIMD is particularly applicable to common tasks such as adjusting the contrast in a digital image or adjusting the volume of digital audio. Most modern CPU designs include SIMD instructions to improve the performance of multimedia use. SIMD is not to be confused with SIMT, which utilizes threads.
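As a minimal sketch of this data-level parallelism, the function below uses the x86 SSE2 intrinsics (SSE2 itself is described further below) so that a single instruction adds four 32-bit integers at once; the function and variable names are illustrative.

```c
#include <emmintrin.h>   /* SSE2 intrinsics */

/* Add four pairs of 32-bit integers with a single SIMD addition.
 * a, b and out each point to at least four ints. */
void add4(const int *a, const int *b, int *out)
{
    __m128i va  = _mm_loadu_si128((const __m128i *)a);  /* load a[0..3]    */
    __m128i vb  = _mm_loadu_si128((const __m128i *)b);  /* load b[0..3]    */
    __m128i sum = _mm_add_epi32(va, vb);                /* 4 adds at once  */
    _mm_storeu_si128((__m128i *)out, sum);              /* store out[0..3] */
}
```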
In computing, executable code, an executable file, or an executable program, sometimes simply referred to as an executable or binary, causes a computer "to perform indicated tasks according to encoded instructions", as opposed to a data file that must be interpreted (parsed) by a program to be meaningful.
In computing, binary translation is a form of binary recompilation where sequences of instructions are translated from a source instruction set to the target instruction set. In some cases such as instruction set simulation, the target instruction set may be the same as the source instruction set, providing testing and debugging features such as instruction trace, conditional breakpoints and hot spot detection.
In computing, just-in-time (JIT) compilation is a way of executing computer code that involves compilation during execution of a program – at run time – rather than before execution. Most often, this consists of translating source code or, more commonly, bytecode to machine code, which is then executed directly. A system implementing a JIT compiler typically continuously analyses the code being executed and identifies parts of the code where the speedup gained from compilation or recompilation would outweigh the overhead of compiling that code.
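A minimal sketch of the underlying mechanism, assuming an x86-64 Linux system that permits pages mapped both writable and executable: machine code is written into memory at run time and then called directly. A real JIT would generate this code from bytecode or an IR rather than hard-coding it, and on systems that enforce W^X it would flip page permissions with mprotect after writing.

```c
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
    /* x86-64 machine code for: mov eax, 42 ; ret */
    unsigned char code[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };

    /* Allocate a page we can both write to and execute. */
    void *mem = mmap(NULL, sizeof code, PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return 1;

    memcpy(mem, code, sizeof code);        /* "compile" at run time           */
    int (*fn)(void) = (int (*)(void))mem;  /* cast is non-standard but common */
    printf("%d\n", fn());                  /* prints 42                       */

    munmap(mem, sizeof code);
    return 0;
}
```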
SSE2 is one of the Intel SIMD processor supplementary instruction sets first introduced by Intel with the initial version of the Pentium 4 in 2000. It extends the earlier SSE instruction set, and is intended to fully replace MMX. Intel extended SSE2 to create SSE3 in 2004. SSE2 added 144 new instructions to SSE, which has 70 instructions. Competing chip-maker AMD added support for SSE2 with the introduction of their Opteron and Athlon 64 ranges of AMD64 64-bit CPUs in 2003.
The LLVM compiler infrastructure project is a set of compiler and toolchain technologies, which can be used to develop a front end for any programming language and a back end for any instruction set architecture. LLVM is designed around a language-independent intermediate representation (IR) that serves as a portable, high-level assembly language that can be optimized with a variety of transformations over multiple passes.
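As a hedged illustration of that IR, compiling the small C function below with clang -S -emit-llvm produces LLVM IR along the lines shown in the comment; the actual output also carries a target data layout, attributes, and metadata.

```c
/* add.c -- "clang -S -emit-llvm -O1 add.c" emits LLVM IR roughly like:
 *
 *   define i32 @add(i32 %a, i32 %b) {
 *     %sum = add nsw i32 %a, %b
 *     ret i32 %sum
 *   }
 *
 * The same IR can then be lowered by any LLVM back end (x86, ARM, RISC-V, ...). */
int add(int a, int b)
{
    return a + b;
}
```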
In computer science, instruction scheduling is a compiler optimization used to improve instruction-level parallelism, which improves performance on machines with instruction pipelines. Put more simply, it tries to rearrange the order of instructions to avoid pipeline stalls without changing the meaning of the code.
An intermediate representation (IR) is the data structure or code used internally by a compiler or virtual machine to represent source code. An IR is designed to be conducive for further processing, such as optimization and translation. A "good" IR must be accurate – capable of representing the source code without loss of information – and independent of any particular source or target language. An IR may take one of several forms: an in-memory data structure, or a special tuple- or stack-based code readable by the program. In the latter case it is also called an intermediate language.
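As a small sketch of the tuple-based form, the statement x = (a + b) * c can be represented as a list of "quadruples"; the Quad type and the temporary names below are illustrative rather than taken from any particular compiler.

```c
#include <stddef.h>

/* One "quadruple": an operator, up to two arguments, and a result name. */
typedef struct {
    const char *op;
    const char *arg1;
    const char *arg2;     /* NULL when the operator is unary or a copy */
    const char *result;
} Quad;

/* Tuple-based IR for the statement:  x = (a + b) * c  */
Quad ir[] = {
    { "+", "a",  "b",  "t1" },   /* t1 = a + b  */
    { "*", "t1", "c",  "t2" },   /* t2 = t1 * c */
    { "=", "t2", NULL, "x"  },   /* x  = t2     */
};
```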
In computer science, instruction selection is the stage of a compiler backend that transforms its middle-level intermediate representation (IR) into a low-level IR. In a typical compiler, instruction selection precedes both instruction scheduling and register allocation; hence its output IR has an infinite set of pseudo-registers and may still be – and typically is – subject to peephole optimization. Otherwise, it closely resembles the target machine code, bytecode, or assembly language.
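A hypothetical sketch of such a lowering, continuing the quadruple example above: the middle-level operation t1 = a + b is selected into two x86-style instructions that still name pseudo-registers, which register allocation later replaces with physical registers. The select_add function and its textual output are invented for illustration.

```c
#include <stdio.h>

/* Hypothetical selection rule for an addition quadruple such as
 * { "+", "a", "b", "t1" }: emit two x86-style instructions that still
 * refer to pseudo-registers by name. */
void select_add(const char *arg1, const char *arg2, const char *result)
{
    printf("    mov %s, %s\n", result, arg1);   /* result = arg1          */
    printf("    add %s, %s\n", result, arg2);   /* result = result + arg2 */
}
```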
The Amsterdam Compiler Kit (ACK) is a retargetable compiler suite and toolchain written by Andrew Tanenbaum and Ceriel Jacobs, and was MINIX's native toolchain until the MINIX userland was largely replaced by that of NetBSD and Clang was adopted as the system compiler. The ACK was originally closed-source software, but in April 2003 it was released under an open source BSD license. It has frontends for the following programming languages: C, Pascal, Modula-2, Occam, and BASIC.
Intel C++ Compiler Classic and Intel oneAPI DPC++/C++ Compiler are Intel’s C, C++, SYCL, and Data Parallel C++ (DPC++) compilers for Intel processor-based systems, available for Microsoft Windows, Linux, and macOS operating systems.
Clang is a compiler front end for the C, C++, Objective-C and Objective-C++ programming languages, as well as the OpenMP, OpenCL, RenderScript, CUDA and HIP frameworks. It uses the LLVM compiler infrastructure as its back end and has been part of the LLVM release cycle since LLVM 2.6.
Advanced Vector Extensions (AVX) are extensions to the x86 instruction set architecture for microprocessors from Intel and AMD, proposed by Intel in March 2008 and first supported by Intel with the Sandy Bridge processor shipping in Q1 2011, and later by AMD with the Bulldozer processor shipping in Q3 2011. AVX provides new features, new instructions and a new coding scheme.
The FMA instruction set is an extension to the 128- and 256-bit Streaming SIMD Extensions instructions in the x86 microprocessor instruction set to perform fused multiply–add (FMA) operations. There are two variants: FMA4 and FMA3.
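As a minimal sketch of the FMA3 variant via compiler intrinsics, the function below performs four single-precision a*b + c operations with a single rounding step each; it assumes an FMA-capable CPU and, with GCC or Clang, compilation with -mfma.

```c
#include <immintrin.h>

/* Four single-precision fused multiply-adds: each lane computes a*b + c
 * with one rounding step instead of two. */
__m128 fused_madd(__m128 a, __m128 b, __m128 c)
{
    return _mm_fmadd_ps(a, b, c);   /* FMA3 intrinsic */
}
```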