Loop-invariant code motion

Last updated November 09, 2022

In computer programming, loop-invariant code consists of statements or expressions (in an imperative programming language) that can be moved outside the body of a loop without affecting the semantics of the program. Loop-invariant code motion (also called hoisting or scalar promotion) is a compiler optimization that performs this movement automatically.

Example

In the following code sample, two optimizations can be applied.

inti=0;while(i<n){x=y+z;a[i]=6*i+x*x;++i;}

Although the calculation x = y + z and x * x is loop-invariant, precautions must be taken before moving the code outside the loop. It is possible that the loop condition is false (for example, if n holds a negative value), and in such case, the loop body should not be executed at all. One way of guaranteeing correct behaviour is using a conditional branch outside of the loop. Evaluating the loop condition can have side effects, so an additional evaluation by the if construct should be compensated by replacing the while loop with a do {} while . If the code used do {} while in the first place, the whole guarding process is not needed, as the loop body is guaranteed to execute at least once.

inti=0;if(i<n){x=y+z;intconstt1=x*x;do{a[i]=6*i+t1;++i;}while(i<n);}

This code can be optimized further. For example, strength reduction could remove the two multiplications inside the loop (6*i and a[i]), and induction variable elimination could then elide i completely. Since 6 * i must be in lock step with i itself, there is no need to have both.

Invariant code detection

Usually, a reaching definitions analysis is used to detect whether a statement or expression is loop invariant.

For example, if all reaching definitions for the operands of some simple expression are outside of the loop, the expression can be moved out of the loop.

Recent work using data-flow dependence analysis ^[1] allows to detect not only invariant commands but larger code fragments such as an inner loop. The analysis also detects quasi-invariants of arbitrary degrees, that is commands or code fragments that become invariant after a fixed number of iterations of the loop body.

Benefits

Loop-invariant code which has been hoisted out of a loop is executed less often, providing a speedup. Another effect of this transformation is allowing constants to be stored in registers and not having to calculate the address and access the memory (or cache line) at each iteration.

However, if too many variables are created, there will be high register pressure, especially on processors with few registers, like the 32-bit x86. If the compiler runs out of registers, some variables will be spilled. To counteract this, the inverse optimization can be performed, rematerialization.

Related Research Articles

In programming language theory, lazy evaluation, or call-by-need, is an evaluation strategy which delays the evaluation of an expression until its value is needed and which also avoids repeated evaluations (sharing).

In computing, an optimizing compiler is a compiler that tries to minimize or maximize some attributes of an executable computer program. Common requirements are to minimize a program's execution time, memory footprint, storage size, and power consumption.

Standard ML (SML) is a general-purpose, modular, functional programming language with compile-time type checking and type inference. It is popular among compiler writers and programming language researchers, as well as in the development of theorem provers.

Constant folding and constant propagation are related compiler optimizations used by many modern compilers. An advanced form of constant propagation known as sparse conditional constant propagation can more accurately propagate constants and simultaneously remove dead code.

In computing, inline expansion, or inlining, is a manual or compiler optimization that replaces a function call site with the body of the called function. Inline expansion is similar to macro expansion, but occurs during compilation, without changing the source code, while macro expansion occurs prior to compilation, and results in different text that is then processed by the compiler.

The syntax of the C programming language is the set of rules governing writing of software in the C language. It is designed to allow for programs that are extremely terse, have a close relationship with the resulting object code, and yet provide relatively high-level data abstraction. C was the first widely successful high-level language for portable operating-system development.

<span class="mw-page-title-main">For loop</span> Control flow statement for repeated execution

In computer science a for-loop or for loop is a control flow statement for specifying iteration. Specifically, a for loop functions by running a section of code repeatedly until a certain condition has been satisfied.

In compiler construction, strength reduction is a compiler optimization where expensive operations are replaced with equivalent but less expensive operations. The classic example of strength reduction converts "strong" multiplications inside a loop into "weaker" additions – something that frequently occurs in array addressing.

In computer programming, undefined behavior (UB) is the result of executing a program whose behavior is prescribed to be unpredictable, in the language specification to which the computer code adheres. This is different from unspecified behavior, for which the language specification does not prescribe a result, and implementation-defined behavior that defers to the documentation of another component of the platform.

In computer science, a generator is a routine that can be used to control the iteration behaviour of a loop. All generators are also iterators. A generator is very similar to a function that returns an array, in that a generator has parameters, can be called, and generates a sequence of values. However, instead of building an array containing all the values and returning them all at once, a generator yields the values one at a time, which requires less memory and allows the caller to get started processing the first few values immediately. In short, a generator looks like a function but behaves like an iterator.

In computer science, a loop invariant is a property of a program loop that is true before each iteration. It is a logical assertion, sometimes checked within the code by an assertion call. Knowing its invariant(s) is essential in understanding the effect of a loop.

Loop splitting is a compiler optimization technique. It attempts to simplify a loop or eliminate dependencies by breaking it into multiple loops which have the same bodies but iterate over different contiguous portions of the index range.

Loop unrolling, also known as loop unwinding, is a loop transformation technique that attempts to optimize a program's execution speed at the expense of its binary size, which is an approach known as space–time tradeoff. The transformation can be undertaken manually by the programmer or by an optimizing compiler. On modern processors, loop unrolling is often counterproductive, as the increased code size can cause more cache misses; cf. Duff's device.

In computer programming, a nested function is a function which is defined within another function, the enclosing function. Due to simple recursive scope rules, a nested function is itself invisible outside of its immediately enclosing function, but can see (access) all local objects of its immediately enclosing function as well as of any function(s) which, in turn, encloses that function. The nesting is theoretically possible to unlimited depth, although only a few levels are normally used in practical programs.

In compiler theory, loop optimization is the process of increasing execution speed and reducing the overheads associated with loops. It plays an important role in improving cache performance and making effective use of parallel processing capabilities. Most execution time of a scientific program is spent on loops; as such, many compiler optimization techniques have been developed to make them faster.

Automatic parallelization, also auto parallelization, or autoparallelization refers to converting sequential code into multi-threaded and/or vectorized code in order to use multiple processors simultaneously in a shared-memory multiprocessor (SMP) machine. Fully automatic parallelization of sequential programs is a challenge because it requires complex program analysis and the best approach may depend upon parameter values that are not known at compilation time.

In computer science, an induction variable is a variable that gets increased or decreased by a fixed amount on every iteration of a loop or is a linear function of another induction variable.

In compiler theory, a greatest common divisor test is the test used in study of loop optimization and loop dependence analysis to test the dependency between loop statements.

Nemerle is a general-purpose, high-level, statically typed programming language designed for platforms using the Common Language Infrastructure (.NET/Mono). It offers functional, object-oriented, aspect-oriented, reflective and imperative features. It has a simple C#-like syntax and a powerful metaprogramming system.

Tracing just-in-time compilation is a technique used by virtual machines to optimize the execution of a program at runtime. This is done by recording a linear sequence of frequently executed operations, compiling them to native machine code and executing them. This is opposed to traditional just-in-time (JIT) compilers that work on a per-method basis.

References

↑ Moyen, Jean-Yves; Rubiano, Thomas; Seiller, Thomas (2017). "Loop Quasi-Invariant Chunk Detection". Automated Technology for Verification and Analysis. Lecture Notes in Computer Science. 10482: 91–108. doi:10.1007/978-3-319-68167-2_7. ISBN 978-3-319-68166-5.

External links

Compiler Optimizations — Hoisting

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] Moyen, Jean-Yves; Rubiano, Thomas; Seiller, Thomas (2017). "Loop Quasi-Invariant Chunk Detection". Automated Technology for Verification and Analysis. Lecture Notes in Computer Science. 10482: 91–108. doi:10.1007/978-3-319-68167-2_7. ISBN 978-3-319-68166-5.

[1]

v t e Compiler optimizations
Basic block	Peephole optimization
Loop optimization	Automatic parallelization Induction variable Loop fusion Loop-invariant code motion Loop inversion Loop interchange Loop nest optimization Loop splitting Loop unrolling Loop unswitching Software pipelining Strength reduction
Data-flow analysis	Available expression Common subexpression elimination Constant folding Dead-store elimination Induction variable recognition and elimination Live-variable analysis Use-define chain
SSA-based	Global value numbering Sparse conditional constant propagation
Code generation	Instruction scheduling Instruction selection Register allocation Rematerialization
Functional	Deforestation Tail-call elimination
Global	Interprocedural optimization
Other	Bounds-checking elimination Compile-time function execution Dead-code elimination Expression templates Inline expansion Jump threading Profile-guided optimization
Static analysis	Alias analysis Array-access analysis Control-flow analysis Data-flow analysis Dependence analysis Escape analysis Pointer analysis Shape analysis