Upwards exposed uses

Upwards exposed uses, or reachable uses, [1] is a concept in compiler theory that arises during the copy propagation stage of compilation. [2]

Uses

During the copy propagation stage of compilation, occurrences of a target variable are replaced with the value assigned to it. For this substitution to be safe, the compiler must determine which uses of a variable are reached by which assignments, a question closely related to reaching definitions in data-flow analysis. [3] A use of a variable is upwards exposed at a program point if it can read a value assigned before that point, that is, if no intervening assignment to the variable occurs in between. The purpose of the analysis is to simplify code before execution: if an assignment has zero upwards exposed uses, it does not contribute to the result of the program and can be safely removed. [1] The same analysis is also useful for improving code security during compilation. [4]
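As an illustration, the following Python sketch (a minimal model, not drawn from the cited sources; the statement representation is hypothetical) computes the upwards exposed uses of a straight-line block, where each statement is a pair of the variable assigned and the variables it reads:

def upwards_exposed_uses(block):
    """Return the variables read in `block` before any assignment to
    them within the block, i.e. uses whose value must come from a
    definition above the block."""
    defined = set()  # variables assigned so far, scanning top-down
    exposed = set()  # uses not covered by an earlier local definition
    for target, operands in block:
        for var in operands:
            if var not in defined:
                exposed.add(var)  # read before written: upwards exposed
        defined.add(target)
    return exposed

An assignment whose target is not upwards exposed anywhere in the code that follows it is a candidate for removal.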

Example

Consider the following pseudocode:

x = 1
y = z
if False:
    x = 0
else:
    x = y + 2

The assignment x = 0 can never execute, since its guarding condition is always false, and the initial assignment x = 1 has zero upwards exposed uses, since x is reassigned before it is ever read. Removing both statements and propagating the copy y = z simplifies the code to:

y = z
x = z + 2
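Applying the earlier upwards_exposed_uses sketch (with the same hypothetical statement representation) to the simplified block shows that z is its only upwards exposed use, so only z must be defined before the block executes:

block = [("y", ["z"]),  # y = z
         ("x", ["z"])]  # x = z + 2
print(upwards_exposed_uses(block))  # prints {'z'}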

This leads to an end result that is less complex to compile and more efficient to run. [2] The example also illustrates reaching definitions: here, analysis of upwards exposed uses is the technique that demonstrates which definitions reach each use. Additional techniques allow for the analysis of more deeply intertwined control flow, such as code containing various forms of loops. [4]
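Loops create cycles in the control-flow graph, so reaching definitions there must be computed iteratively until a fixed point is reached. The following Python sketch shows this standard iterative data-flow algorithm; the block names, definition labels, and gen/kill sets in the usage example are hypothetical:

def reaching_definitions(blocks, edges, gen, kill):
    """Compute which definitions reach the entry (IN) and exit (OUT)
    of each basic block, iterating to a fixed point so that back
    edges (loops) are handled correctly."""
    preds = {b: [] for b in blocks}
    for src, dst in edges:
        preds[dst].append(src)
    IN = {b: set() for b in blocks}
    OUT = {b: set(gen[b]) for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            IN[b] = set().union(*[OUT[p] for p in preds[b]])
            new_out = gen[b] | (IN[b] - kill[b])
            if new_out != OUT[b]:
                OUT[b] = new_out
                changed = True
    return IN, OUT

# Hypothetical three-block CFG with a self-loop: d1 defines x before
# the loop and d2 redefines x inside it, so both reach the loop head.
blocks = ["entry", "loop", "exit"]
edges = [("entry", "loop"), ("loop", "loop"), ("loop", "exit")]
gen = {"entry": {"d1"}, "loop": {"d2"}, "exit": set()}
kill = {"entry": {"d2"}, "loop": {"d1"}, "exit": set()}
IN, OUT = reaching_definitions(blocks, edges, gen, kill)
print(IN["loop"])  # prints {'d1', 'd2'} (set order may vary)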

See also

Compiler
OCaml
Optimizing compiler
Dead code
Constant folding
Static single assignment form
Dominator (graph theory)
Dead code elimination
Aliasing (computing)
Loop-invariant code motion
Data-flow analysis
Unreachable code
Copy propagation
Reaching definition
final (Java)
Interprocedural optimization
Tacit programming
Pointer analysis
Decompiler
Tracing just-in-time compilation

References

  1. Harrold, Mary Jean (Fall 2009). "Basic Analysis" (PDF). CS 6340: Software Analysis and Testing. Atlanta, Georgia, USA: Georgia Institute of Technology, College of Computing. Archived (PDF) from the original on 2020-06-20. Retrieved 2020-06-12.
  2. Aho, Alfred Vaino; Lam, Monica Sin-Ling; Sethi, Ravi; Ullman, Jeffrey David (2006). Compilers: Principles, Techniques, and Tools (2nd ed.). Boston, Massachusetts, USA: Addison-Wesley. ISBN 0-321-48681-1. OCLC 70775643.
  3. Kore, Aamod (2020). "Upwards Exposed Uses". Toronto, Ontario, Canada: Department of Computer Science, University of Toronto. Archived from the original on 2020-06-20. Retrieved 2020-06-12.
  4. Bergeretti, Jean-Francois; Carré, Bernard A. (1985-01-02). "Information-Flow and Data-Flow Analysis of while-Programs". ACM Transactions on Programming Languages and Systems. 7 (1): 37–61. Association for Computing Machinery. doi:10.1145/2363.2366. S2CID 19682896. Archived from the original on 2020-06-20. Retrieved 2020-06-20.