Upwards exposed uses

Last updated

In compiler theory, upwards exposed uses or reachable uses [1] are all uses of a variable that are reachable from a point in the program. A use of variable is a point or statement where that variable is referenced (read) by not changed. A use of a variable A is reachable from a point p if there exists a control-flow path in the control-flow graph from p to the use with not definition of A on the path.

Contents

A reachable uses analysis is a data-flow analysis [1] to calculate all reachable uses of a program. It is very similar to the liveness analysis. A variable is live in a point in the program if it has one or more reachable uses. Compared to the liveness analysis, reachable uses analysis provides additional information where the variable is used.

Uses

Upwards exposed uses occurs in the copy propagation stage of program compilation [2] . During the copy propagation stage, instances of a target are replaced with assignments to their values. During this process, it is necessary for the compiler to understand which instances of a target are being accessed so that appropriate substitution may occur, related to the concept of reaching definition in reaching analysis. [3] This is done with the purpose of simplifying code before execution: if the number of upwards exposed uses of an assignment is zero, it does not contribute to the end result of the code and can be safely removed. [1] This is also useful for improving code security during the compilation stages. [4]

Example

Consider the following pseudocode:

x=1y=zifFalse:x=0else:x=y+2

It is safe to assume that line 5 will never occur, as demonstrated by the number of upwards exposed uses for this point being zero. This can therefore be simplified:

y=zx=z+2

This leads to a result that is less complex to compile and more efficient to run. [2] This also meets the definition of reaching definition: In this context, upwards flow analysis was the technique used to demonstrate the needs for reaching definition. Additional techniques allow for more complex analysis of more deeply intertwined or complex control flow problems, such as those with various forms of loops. [4]

See also

Related Research Articles

An optimizing compiler is a compiler designed to generate code that is optimized in aspects such as minimizing program execution time, memory use, storage size, and power consumption. Optimization is generally implemented as a sequence of optimizing transformations, algorithms that transform code to produce semantically equivalent code optimized for some aspect. It is typically CPU and memory-intensive. In practice, factors such as available memory and a programmer's willingness to wait for compilation limit the optimizations a compiler might provide. Research indicates that some optimization problems are NP-complete, or even undecidable.

The term dead code has multiple definitions. Some use the term to refer to code which can never be executed at run-time. In some areas of computer programming, dead code is a section in the source code of a program which is executed but whose result is never used in any other computation. The execution of dead code wastes computation time and memory.

Constant folding and constant propagation are related compiler optimizations used by many modern compilers. An advanced form of constant propagation known as sparse conditional constant propagation can more accurately propagate constants and simultaneously remove dead code.

In computer programming, a type system is a logical system comprising a set of rules that assigns a property called a type to every term. Usually the terms are various language constructs of a computer program, such as variables, expressions, functions, or modules. A type system dictates the operations that can be performed on a term. For variables, the type system determines the allowed values of that term.

In compiler design, static single assignment form is a type of intermediate representation (IR) where each variable is assigned exactly once. SSA is used in most high-quality optimizing compilers for imperative languages, including LLVM, the GNU Compiler Collection, and many commercial compilers.

In compiler theory, dead-code elimination is a compiler optimization to remove dead code. Removing such code has several benefits: it shrinks program size, an important consideration in some contexts, it reduces resource usage such as the number of bytes to be transferred and it allows the running program to avoid executing irrelevant operations, which reduces its running time. It can also enable further optimizations by simplifying program structure. Dead code includes code that can never be executed, and code that only affects dead variables, that is, irrelevant to the program.

In computing, aliasing describes a situation in which a data location in memory can be accessed through different symbolic names in the program. Thus, modifying the data through one name implicitly modifies the values associated with all aliased names, which may not be expected by the programmer. As a result, aliasing makes it particularly difficult to understand, analyze and optimize programs. Aliasing analysers intend to make and compute useful information for understanding aliasing in programs.

In computer programming, loop-invariant code consists of statements or expressions that can be moved outside the body of a loop without affecting the semantics of the program. Loop-invariant code motion is a compiler optimization that performs this movement automatically.

Data-flow analysis is a technique for gathering information about the possible set of values calculated at various points in a computer program. A program's control-flow graph (CFG) is used to determine those parts of a program to which a particular value assigned to a variable might propagate. The information gathered is often used by compilers when optimizing a program. A canonical example of a data-flow analysis is reaching definitions.

Within computer science, a use-definition chain is a data structure that consists of a use U, of a variable, and all the definitions D of that variable that can reach that use without any other intervening definitions. A UD Chain generally means the assignment of some value to a variable.

In computer programming, unreachable code is part of the source code of a program which can never be executed because there exists no control flow path to the code from the rest of the program.

In compiler theory, copy propagation is the process of replacing the occurrences of targets of direct assignments with their values. A direct assignment is an instruction of the form x = y, which simply assigns the value of y to x.

In compiler theory, a reaching definition for a given instruction is an earlier instruction whose target variable can reach the given one without an intervening assignment. For example, in the following code:

d1 : y := 3 d2 : x := y

In the Java programming language, the final keyword is used in several contexts to define an entity that can only be assigned once.

<span class="mw-page-title-main">Recursion (computer science)</span> Use of functions that call themselves

In computer science, recursion is a method of solving a computational problem where the solution depends on solutions to smaller instances of the same problem. Recursion solves such recursive problems by using functions that call themselves from within their own code. The approach can be applied to many types of problems, and recursion is one of the central ideas of computer science.

The power of recursion evidently lies in the possibility of defining an infinite set of objects by a finite statement. In the same manner, an infinite number of computations can be described by a finite recursive program, even if this program contains no explicit repetitions.

In compilers, live variable analysis is a classic data-flow analysis to calculate the variables that are live at each point in the program. A variable is live at some point if it holds a value that may be needed in the future, or equivalently if its value may be read before the next time the variable is written to.

Tacit programming, also called point-free style, is a programming paradigm in which function definitions do not identify the arguments on which they operate. Instead the definitions merely compose other functions, among which are combinators that manipulate the arguments. Tacit programming is of theoretical interest, because the strict use of composition results in programs that are well adapted for equational reasoning. It is also the natural style of certain programming languages, including APL and its derivatives, and concatenative languages such as Forth. The lack of argument naming gives point-free style a reputation of being unnecessarily obscure, hence the epithet "pointless style".

In computer science, pointer analysis, or points-to analysis, is a static code analysis technique that establishes which pointers, or heap references, can point to which variables, or storage locations. It is often a component of more complex analyses such as escape analysis. A closely related technique is shape analysis.

In engineering, debugging is the process of finding the root cause of and workarounds and possible fixes for bugs.

Concolic testing is a hybrid software verification technique that performs symbolic execution, a classical technique that treats program variables as symbolic variables, along a concrete execution path. Symbolic execution is used in conjunction with an automated theorem prover or constraint solver based on constraint logic programming to generate new concrete inputs with the aim of maximizing code coverage. Its main focus is finding bugs in real-world software, rather than demonstrating program correctness.

References

  1. 1 2 3 Harrold, Mary Jean (Fall 2009). "Basic Analysis" (PDF). Georgia Tech - College of Computing. CS 6340: Software Analysis and Testing. Atlanta, Georgia, USA: Georgia Institute of Technology. Archived (PDF) from the original on 2020-06-20. Retrieved 2020-06-12.
  2. 1 2 Aho, Alfred Vaino; Lam, Monica Sin-Ling; Sethi, Ravi; Ullman, Jeffrey David (2006). Compilers: Principles, Techniques, and Tools (2 ed.). Boston, Massachusetts, USA: Addison-Wesley. ISBN   0-321-48681-1. OCLC   70775643.
  3. Kore, Aamod (2020). "Upwards Exposed Uses". Toronto, Ontario, Canada: Department of Computer Science, University of Toronto. Archived from the original on 2020-06-20. Retrieved 2020-06-12.
  4. 1 2 Bergeretti, Jean-Francois; Carré, Bernard A. (1985-01-02). "Information-Flow and Data-Flow Analysis of while-Programs". ACM Transactions on Programming Languages and Systems . 7 (1). Association for Computing Machinery: 37–61. doi: 10.1145/2363.2366 . S2CID   19682896.