Extended basic block

In computing, an extended basic block [1] [2] is a collection of basic blocks in a program's code with properties that make it highly amenable to optimization. Many compiler optimizations operate on extended basic blocks.

Definition

An extended basic block is a maximal collection of basic blocks where:

  - only the first basic block may have multiple predecessor basic blocks;
  - each of the other basic blocks has exactly one predecessor, and that predecessor lies within the collection.

The first block therefore dominates all the others, and the blocks of an extended basic block form a tree in the program's control-flow graph; a sketch of how such groupings can be computed is given below.
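
How blocks group into extended basic blocks can be computed directly from a control-flow graph. The following is a minimal sketch in Python, assuming the graph is given as a dictionary mapping each block label to a list of its successors; the function name extended_basic_blocks and the block labels are illustrative, not taken from the cited texts.

    def extended_basic_blocks(cfg, entry):
        # Count the predecessors of every block.
        preds = {b: 0 for b in cfg}
        for b, succs in cfg.items():
            for s in succs:
                preds[s] += 1

        # A block heads an extended basic block if it is the entry block
        # or has more than one predecessor.
        heads = [b for b in cfg if b == entry or preds[b] > 1]

        ebbs = {}
        for h in heads:
            # Grow the group from its head: successors with exactly one
            # predecessor join it; all other blocks head groups of their own.
            blocks, work, seen = [], [h], {h}
            while work:
                b = work.pop()
                blocks.append(b)
                for s in cfg[b]:
                    if preds[s] == 1 and s not in seen:
                        seen.add(s)
                        work.append(s)
            ebbs[h] = blocks
        return ebbs

    # Diamond-shaped CFG: B0 branches to B1 and B2, which both join at B3.
    cfg = {"B0": ["B1", "B2"], "B1": ["B3"], "B2": ["B3"], "B3": []}
    print(extended_basic_blocks(cfg, "B0"))
    # {'B0': ['B0', 'B2', 'B1'], 'B3': ['B3']}
    # B3 has two predecessors, so it starts an extended basic block of its own.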

Uses

Many local optimizations that operate on basic blocks can be easily extended to operate on extended basic blocks. An example is common subexpression elimination, which removes duplicate expressions; in its simplest form it is a local optimization, operating only on individual basic blocks. [3]
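
As a concrete illustration, the hypothetical before/after pair below (not drawn from the cited sources) shows common subexpression elimination acting across an extended basic block: the then-branch has a single predecessor, so the duplicate expression inside it can reuse the value already computed before the branch.

    def before(a, b, c):
        x = a * b          # entry block computes a * b
        if c > 0:          # the branch target has a single predecessor, so it
            y = a * b      # belongs to the same extended basic block; a * b is recomputed
            return x + y
        return x

    def after(a, b, c):    # after common subexpression elimination over the EBB
        x = a * b
        if c > 0:
            y = x          # the duplicate expression reuses the earlier value
            return x + y
        return x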

Notes

  1. Cooper, Keith D.; Torczon, Linda. Engineering a Compiler. Morgan Kaufmann, 2004. ISBN 1-55860-699-8, p. 405.
  2. Muchnick, Steven S. Advanced Compiler Design and Implementation. Morgan Kaufmann, 1997. ISBN 1-55860-320-4, p. 175.
  3. "Archived copy". Archived from the original on 2014-05-15. Retrieved 2012-07-04.CS1 maint: archived copy as title (link)


Related Research Articles

In computing, a compiler is a computer program that translates computer code written in one programming language into another language. The name "compiler" is primarily used for programs that translate source code from a high-level programming language to a lower-level language to create an executable program.

In computing, a linker or link editor is a computer system program that takes one or more object files and combines them into a single executable file, library file, or another "object" file.

In computing, an optimizing compiler is a compiler that tries to minimize or maximize some attributes of an executable computer program. Common requirements are to minimize a program's execution time, memory footprint, storage size, and power consumption.

Constant folding and constant propagation are related compiler optimizations used by many modern compilers. An advanced form of constant propagation known as sparse conditional constant propagation can more accurately propagate constants and simultaneously remove dead code.
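
The sketch below illustrates constant folding, using Python's ast module purely as a convenient stand-in for a compiler IR; the class name FoldConstants is an arbitrary choice. Binary operations whose operands are both literal constants are evaluated ahead of time and replaced by their result.

    import ast

    class FoldConstants(ast.NodeTransformer):
        def visit_BinOp(self, node):
            self.generic_visit(node)          # fold the operands first
            if isinstance(node.left, ast.Constant) and isinstance(node.right, ast.Constant):
                try:
                    value = eval(compile(ast.Expression(body=node), "<fold>", "eval"))
                except Exception:
                    return node               # leave expressions we cannot evaluate alone
                return ast.copy_location(ast.Constant(value=value), node)
            return node

    tree = ast.fix_missing_locations(FoldConstants().visit(ast.parse("x = 3 * 4 + 2")))
    print(ast.unparse(tree))                  # x = 14  (ast.unparse needs Python 3.9+)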

In computing, code generation is the process by which a compiler's code generator converts some intermediate representation of source code into a form that can be readily executed by a machine.

In compiler construction, a basic block is a straight-line code sequence with no branches in except to the entry and no branches out except at the exit. This restricted form makes a basic block highly amenable to analysis. Compilers usually decompose programs into their basic blocks as a first step in the analysis process. Basic blocks form the vertices or nodes in a control-flow graph.
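
A minimal sketch of how a compiler might carve a linear instruction list into basic blocks, using the classic "leaders" rule: the first instruction, every branch target, and every instruction following a branch each start a new block. The tuple encoding of instructions is an assumption made for illustration.

    def basic_blocks(instructions):
        # instructions: list of (op, arg) tuples; ("branch", i) and ("jump", i)
        # transfer control to index i, everything else is straight-line code.
        leaders = {0}
        for i, (op, arg) in enumerate(instructions):
            if op in ("jump", "branch"):
                leaders.add(arg)                  # a branch target starts a block
                if i + 1 < len(instructions):
                    leaders.add(i + 1)            # so does the fall-through after a branch
        order = sorted(leaders)
        return [instructions[start:end]
                for start, end in zip(order, order[1:] + [len(instructions)])]

    prog = [("load", "x"), ("branch", 3), ("add", "y"), ("store", "z")]
    for block in basic_blocks(prog):
        print(block)
    # [('load', 'x'), ('branch', 3)]
    # [('add', 'y')]
    # [('store', 'z')]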

In compiler design, static single assignment form (SSA form) is a property of an intermediate representation (IR) which requires that each variable be assigned exactly once and defined before it is used. Existing variables in the original IR are split into versions; the new variables are typically indicated in textbooks by the original name with a subscript, so that every definition gets its own version. In SSA form, use-def chains are explicit and each contains a single element.
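
A small sketch of the renaming step for straight-line three-address code, with an invented tuple encoding of (target, operation, arguments); real SSA construction also inserts phi functions at control-flow join points, which straight-line code does not need.

    def to_ssa(instructions):
        version = {}                  # current version number of each variable
        result = []
        for target, op, args in instructions:
            # Rewrite uses with the current version of each argument.
            new_args = [f"{a}{version[a]}" if a in version else a for a in args]
            # Every definition gets a fresh version of its target.
            version[target] = version.get(target, 0) + 1
            result.append((f"{target}{version[target]}", op, new_args))
        return result

    code = [("x", "const", ["1"]),
            ("x", "add", ["x", "2"]),
            ("y", "mul", ["x", "3"])]
    for ins in to_ssa(code):
        print(ins)
    # ('x1', 'const', ['1'])
    # ('x2', 'add', ['x1', '2'])
    # ('y1', 'mul', ['x2', '3'])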

C dynamic memory allocation refers to performing manual memory management for dynamic memory allocation in the C programming language via a group of functions in the C standard library, namely malloc, realloc, calloc and free.

In compiler theory, dead code elimination is a compiler optimization to remove code which does not affect the program results. Removing such code has several benefits: it shrinks program size, an important consideration in some contexts, and it allows the running program to avoid executing irrelevant operations, which reduces its running time. It can also enable further optimizations by simplifying program structure. Dead code includes code that can never be executed, and code that only affects dead variables, that is, irrelevant to the program.
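
A hypothetical before/after pair, written for illustration only: the first function computes a value that is never read and contains a branch that can never execute; dead code elimination leaves only the computation that affects the result.

    def before(a, b):
        unused = a * b            # dead: the result is never used
        if False:                 # unreachable branch
            print("never runs")
        return a + b

    def after(a, b):              # after dead code elimination
        return a + b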

In compiler theory, common subexpression elimination (CSE) is a compiler optimization that searches for instances of identical expressions, and analyzes whether it is worthwhile replacing them with a single variable holding the computed value.

Data-flow analysis is a technique for gathering information about the possible set of values calculated at various points in a computer program. A program's control-flow graph (CFG) is used to determine those parts of a program to which a particular value assigned to a variable might propagate. The information gathered is often used by compilers when optimizing a program. A canonical example of a data-flow analysis is reaching definitions.

An intermediate representation (IR) is the data structure or code used internally by a compiler or virtual machine to represent source code. An IR is designed to be conducive to further processing, such as optimization and translation. A "good" IR must be accurate – capable of representing the source code without loss of information – and independent of any particular source or target language. An IR may take one of several forms: an in-memory data structure, or a special tuple- or stack-based code readable by the program. In the latter case it is also called an intermediate language.
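
The sketch below shows one of the forms mentioned above, a tuple-based IR: a Python expression is lowered into three-address tuples, each naming an operation, its operands, and a fresh temporary. The helper name lower and the tuple layout are assumptions made for the example.

    import ast, itertools

    def lower(expr):
        counter = itertools.count()
        code = []

        def emit(node):
            if isinstance(node, ast.Constant):
                return node.value
            if isinstance(node, ast.Name):
                return node.id
            if isinstance(node, ast.BinOp):
                left, right = emit(node.left), emit(node.right)
                tmp = f"t{next(counter)}"              # fresh temporary for the result
                op = type(node.op).__name__.lower()    # e.g. 'add', 'mult'
                code.append((op, left, right, tmp))
                return tmp
            raise NotImplementedError(type(node).__name__)

        emit(ast.parse(expr, mode="eval").body)
        return code

    for tup in lower("a * b + a * c"):
        print(tup)
    # ('mult', 'a', 'b', 't0')
    # ('mult', 'a', 'c', 't1')
    # ('add', 't0', 't1', 't2')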

In compiler theory, a reaching definition for a given instruction is an earlier instruction whose target variable can reach the given one without an intervening assignment. For example, in the following code:

d1 : y := 3
d2 : x := y

Here d1 is a reaching definition at d2: the value assigned to y at d1 reaches the use of y at d2 with no intervening assignment to y.
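
The fragment below is a minimal sketch of how such an analysis can be computed: an iterative (worklist) formulation of reaching definitions over a control-flow graph, with GEN and KILL sets supplied per block. The graph encoding and function name are illustrative assumptions.

    def reaching_definitions(cfg, gen, kill):
        preds = {b: set() for b in cfg}
        for b, succs in cfg.items():
            for s in succs:
                preds[s].add(b)

        in_sets = {b: set() for b in cfg}
        out_sets = {b: set(gen[b]) for b in cfg}
        work = set(cfg)
        while work:                                    # iterate to a fixed point
            b = work.pop()
            in_sets[b] = set().union(*(out_sets[p] for p in preds[b])) if preds[b] else set()
            new_out = gen[b] | (in_sets[b] - kill[b])
            if new_out != out_sets[b]:
                out_sets[b] = new_out
                work.update(cfg[b])                    # successors must be revisited
        return in_sets, out_sets

    # Two blocks: d1 defines y in B0, d2 defines x in B1.
    cfg  = {"B0": ["B1"], "B1": []}
    gen  = {"B0": {"d1"}, "B1": {"d2"}}
    kill = {"B0": set(), "B1": set()}
    print(reaching_definitions(cfg, gen, kill)[0]["B1"])   # {'d1'}: d1 reaches B1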

In compiler theory, loop interchange is the process of exchanging the order of two iteration variables used by a nested loop. The variable used in the inner loop switches to the outer loop, and vice versa. It is often done to ensure that the elements of a multi-dimensional array are accessed in the order in which they are present in memory, improving locality of reference.
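
A hypothetical illustration: in the first function the inner loop strides down a column of a nested Python list, while after interchange the inner loop walks along a row, matching the order in which the elements are laid out.

    def column_major_sum(a, rows, cols):   # before interchange: poor locality
        total = 0
        for j in range(cols):
            for i in range(rows):
                total += a[i][j]
        return total

    def row_major_sum(a, rows, cols):      # after interchange: inner loop follows a row
        total = 0
        for i in range(rows):
            for j in range(cols):
                total += a[i][j]
        return total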

In compiler theory, dependence analysis produces execution-order constraints between statements or instructions. Broadly speaking, a statement S2 depends on S1 if S1 must be executed before S2. There are two classes of dependencies: control dependencies and data dependencies.
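
A tiny illustrative fragment, not drawn from any particular compiler text, showing both kinds of dependence:

    def dependence_example(a, b):
        x = a + b      # S1
        y = x * 2      # S2: data (flow) dependence, S2 reads the x that S1 writes
        if x > 0:      # S3
            y = 0      # S4: control dependence, S3 decides whether S4 executes
        return y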

A static build is a compiled version of a program which has been statically linked against libraries.

Peephole optimization is an optimization technique performed on a small set of compiler-generated instructions; the small set is known as the peephole or window.
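
A minimal sketch of a two-instruction peephole over an invented tuple encoding of instructions: a store immediately followed by a load of the same location keeps the value available, so the load can be dropped.

    def peephole(instructions):
        out, i = [], 0
        while i < len(instructions):
            cur = instructions[i]
            nxt = instructions[i + 1] if i + 1 < len(instructions) else None
            # Window of two: collapse a redundant "store x; load x" pair.
            if nxt and cur[0] == "store" and nxt[0] == "load" and cur[1] == nxt[1]:
                out.append(cur)
                i += 2
                continue
            out.append(cur)
            i += 1
        return out

    print(peephole([("store", "x"), ("load", "x"), ("add", "y")]))
    # [('store', 'x'), ('add', 'y')]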

In computer science, loop fission is a compiler optimization in which a loop is broken into multiple loops over the same index range with each taking only a part of the original loop's body. The goal is to break down a large loop body into smaller ones to achieve better utilization of locality of reference. This optimization is most efficient in multi-core processors that can split a task into multiple tasks for each processor.
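
A hypothetical before/after pair: one loop that updates two arrays is split into two loops, each of which touches only a single array on every iteration.

    def fused(a, b, n):
        for i in range(n):
            a[i] = a[i] + 1
            b[i] = b[i] * 2

    def fissioned(a, b, n):                # after loop fission
        for i in range(n):
            a[i] = a[i] + 1
        for i in range(n):
            b[i] = b[i] * 2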

Trace scheduling is an optimization technique developed by Josh Fisher used in compilers for computer programs.

Value numbering is a technique for determining when two computations in a program are equivalent and eliminating one of them with a semantics-preserving optimization.
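
A minimal sketch of local value numbering within a single basic block, over an invented tuple encoding of (operation, operand, operand, target): an operation whose operands map to an already-numbered value reuses the earlier result instead of being recomputed.

    import itertools

    def value_number(block):
        table = {}                 # (op, numbered operands) -> (value number, variable holding it)
        numbers = {}               # variable name -> value number of its current value
        fresh = itertools.count()
        out = []
        for op, x, y, target in block:
            key = (op, numbers.get(x, x), numbers.get(y, y))
            if key in table:
                vn, holder = table[key]
                out.append(("copy", holder, None, target))   # reuse the earlier result
                numbers[target] = vn
            else:
                vn = next(fresh)
                table[key] = (vn, target)
                numbers[target] = vn
                out.append((op, x, y, target))
        return out

    print(value_number([("add", "a", "b", "t1"),
                        ("add", "a", "b", "t2")]))
    # [('add', 'a', 'b', 't1'), ('copy', 't1', None, 't2')]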