Register transfer language

Last updated

In computer science, register transfer language (RTL) is a kind of intermediate representation (IR) that is very close to assembly language, such as that which is used in a compiler. It is used to describe data flow at the register-transfer level of an architecture. Academic papers and textbooks often use a form of RTL as an architecture-neutral assembly language. RTL is used as the name of a specific intermediate representation in several compilers, including the GNU Compiler Collection (GCC), Zephyr, and the European compiler projects CerCo and CompCert.

Contents

History

The idea behind RTL was first described in The Design and Application of a Retargetable Peephole Optimizer. [1]

GCC

In GCC, RTL is generated from the GIMPLE representation, transformed by various passes in the GCC middle-end, and then converted to assembly language.

GCC's RTL is usually written in a form which looks like a Lisp S-expression:

(set(reg:SI140)(plus:SI(reg:SI138)(reg:SI139)))

This side-effect expression says "sum the contents of register 138 with the contents of register 139 and store the result in register 140". The SI specifies the access mode for each register. In the example it is "SImode", i.e. "access the register as 32-bit integer".

The sequence of RTL generated has some dependency on the characteristics of the processor for which GCC is generating code. However, the meaning of the RTL is more-or-less independent of the target: it would usually be possible to read and understand a piece of RTL without knowing what processor it was generated for. Similarly, the meaning of the RTL doesn't usually depend on the original high-level language of the program.

A register transfer language is a system for expressing in symbolic form the microoperation sequences among the registers of a digital module. It is a convenient tool for describing the internal organization of digital computers in concise and precise manner. It can also be used to facilitate the design process of digital systems. [2]

Related Research Articles

Assembly language Low-level programming language

In computer programming, assembly language, sometimes abbreviated asm, is any low-level programming language in which there is a very strong correspondence between the instructions in the language and the architecture's machine code instructions. Assembly language usually has one statement per machine instruction (1:1), but constants, comments, assembler directives, symbolic labels of, e.g., memory locations, registers, and macros are generally also supported.

In computing, a compiler is a computer program that translates computer code written in one programming language into another language. The name "compiler" is primarily used for programs that translate source code from a high-level programming language to a lower level language to create an executable program.

GNU Compiler Collection Free and open-source compiler for various programming languages

The GNU Compiler Collection (GCC) is an optimizing compiler produced by the GNU Project supporting various programming languages, hardware architectures and operating systems. The Free Software Foundation (FSF) distributes GCC as free software under the GNU General Public License. GCC is a key component of the GNU toolchain and the standard compiler for most projects related to GNU and the Linux kernel. With roughly 15 million lines of code in 2019, GCC is one of the biggest free programs in existence. It has played an important role in the growth of free software, as both a tool and an example.

In computing, object code or object module is the product of a compiler.

In computing, an optimizing compiler is a compiler that tries to minimize or maximize some attributes of an executable computer program. Common requirements are to minimize a program's execution time, memory footprint, storage size, and power consumption.

In computer science, an instruction set architecture (ISA), also called computer architecture, is an abstract model of a computer. A device that executes instructions described by that ISA, such as a central processing unit (CPU), is called an implementation.

Single instruction, multiple data Type of parallel processing

Single instruction, multiple data (SIMD) is a type of parallel processing in Flynn's taxonomy. SIMD can be internal and it can be directly accessible through an instruction set architecture (ISA): it should not be confused with an ISA. SIMD describes computers with multiple processing elements that perform the same operation on multiple data points simultaneously.

Interpreter (computing) Program that executes source code without a separate compilation step

In computer science, an interpreter is a computer program that directly executes instructions written in a programming or scripting language, without requiring them previously to have been compiled into a machine language program. An interpreter generally uses one of the following strategies for program execution:

  1. Parse the source code and perform its behavior directly;
  2. Translate source code into some efficient intermediate representation or object code and immediately execute that;
  3. Explicitly execute stored precompiled bytecode made by a compiler and matched with the interpreter Virtual Machine.

In computing, code generation is the process by which a compiler's code generator converts some intermediate representation of source code into a form that can be readily executed by a machine.

In computer science, a high-level programming language is a programming language with strong abstraction from the details of the computer. In contrast to low-level programming languages, it may use natural language elements, be easier to use, or may automate significant areas of computing systems, making the process of developing a program simpler and more understandable than when using a lower-level language. The amount of abstraction provided defines how "high-level" a programming language is.

In compiler design, static single assignment form is a property of an intermediate representation (IR), which requires that each variable be assigned exactly once, and every variable be defined before it is used. Existing variables in the original IR are split into versions, new variables typically indicated by the original name with a subscript in textbooks, so that every definition gets its own version. In SSA form, use-def chains are explicit and each contains a single element.

In software engineering, the terms frontend and backend refer to the separation of concerns between the presentation layer (frontend), and the data access layer (backend) of a piece of software, or the physical infrastructure or hardware. In the client–server model, the client is usually considered the frontend and the server is usually considered the backend, even when some presentation work is actually done on the server itself.

LLVM is a set of compiler and toolchain technologies, which can be used to develop a front end for any programming language and a back end for any instruction set architecture. LLVM is designed around a language-independent intermediate representation (IR) that serves as a portable, high-level assembly language that can be optimized with a variety of transformations over multiple passes.

In digital circuit design, register-transfer level (RTL) is a design abstraction which models a synchronous digital circuit in terms of the flow of digital signals (data) between hardware registers, and the logical operations performed on those signals.

In computer engineering, logic synthesis is a process by which an abstract specification of desired circuit behavior, typically at register transfer level (RTL), is turned into a design implementation in terms of logic gates, typically by a computer program called a synthesis tool. Common examples of this process include synthesis of designs specified in hardware description languages, including VHDL and Verilog. Some synthesis tools generate bitstreams for programmable logic devices such as PALs or FPGAs, while others target the creation of ASICs. Logic synthesis is one aspect of electronic design automation.

An intermediate representation (IR) is the data structure or code used internally by a compiler or virtual machine to represent source code. An IR is designed to be conducive for further processing, such as optimization and translation. A "good" IR must be accurate – capable of representing the source code without loss of information – and independent of any particular source or target language. An IR may take one of several forms: an in-memory data structure, or a special tuple- or stack-based code readable by the program. In the latter case it is also called an intermediate language.

In software engineering, retargeting is an attribute of software development tools that have been specifically designed to generate code for more than one computing platform.

In computer science, instruction selection is the stage of a compiler backend that transforms its middle-level intermediate representation (IR) into a low-level IR. In a typical compiler, instruction selection precedes both instruction scheduling and register allocation; hence its output IR has an infinite set of pseudo-registers and may still be – and typically is – subject to peephole optimization. Otherwise, it closely resembles the target machine code, bytecode, or assembly language.

Superoptimization is the process of automatically finding the optimal code sequence for one loop-free sequence of instructions. It is performed in and by a type of computer software termed a compiler. Real-world compilers generally cannot produce genuinely optimal code. While most standard compiler optimizations only improve code partly, a superoptimizer's goal is to find the optimal sequence, the canonical form. Superoptimizers can be used to improve conventional optimizers by highlighting missed opportunities so a human can write additional rules.

History of compiler construction Wikimedia history article

In computing, a compiler is a computer program that transforms source code written in a programming language or computer language, into another computer language. The most common reason for transforming source code is to create an executable program.

References

  1. "Davidson and Fraser; The Design and Application of a Retargetable Peephole Optimizer; ToPLaS v2(2) 191-202 (April 1980)" (PDF).
  2. Mano, Morris M. (1992). Computer System Architecture (3rd Edition). Prentice Hall. p. 94. ISBN   0131755633.