Induction variable

In computer science, an induction variable is a variable that is increased or decreased by a fixed amount on every iteration of a loop, or that is a linear function of another induction variable.[1]

For example, in the following loop, i and j are induction variables:

for (i = 0; i < 10; ++i) {
    j = 17 * i;
}

Application to strength reduction

A common compiler optimization is to recognize the existence of induction variables and replace them with simpler computations; for example, the code above could be rewritten by the compiler as follows, on the assumption that the addition of a constant will be cheaper than a multiplication.

j = -17;
for (i = 0; i < 10; ++i) {
    j = j + 17;   /* j starts at -17 so that after this addition j == 17 * i */
}

This optimization is a special case of strength reduction.
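Strength reduction of induction variables appears most often in array addressing, where computing the address of a[i] conceptually involves multiplying i by the element size. The following hand-written sketch (the function names and the zero-fill loop are illustrative, not taken from any particular compiler) shows that multiplication being replaced by a pointer that advances by a constant each iteration:

/* Before: the address of a[i] is conceptually a + i * sizeof(int). */
void zero_indexed(int *a, int n)
{
    for (int i = 0; i < n; ++i)
        a[i] = 0;
}

/* After strength reduction: the scaled index is replaced by the pointer
 * p, an induction variable that advances by sizeof(int) per iteration. */
void zero_reduced(int *a, int n)
{
    int *p = a;
    for (int i = 0; i < n; ++i) {
        *p = 0;
        ++p;
    }
}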

Application to reduce register pressure

In some cases, it is possible to reverse this optimization in order to remove an induction variable from the code entirely. For example:

extern int sum;

int foo(int n)
{
    int j = 5;
    for (int i = 0; i < n; ++i) {
        j += 2;
        sum += j;
    }
    return sum;
}

This function's loop has two induction variables: i and j. Either one can be rewritten as a linear function of the other. Eliminating j removes one value that must be kept live, and therefore in a register, across the loop, so the compiler may optimize this code as if it had been written:

extern int sum;

int foo(int n)
{
    for (int i = 0; i < n; ++i) {
        sum += 5 + 2 * (i + 1);
    }
    return sum;
}

Induction variable substitution

Induction variable substitution is a compiler transformation that recognizes variables which can be expressed as functions of the indices of enclosing loops, and replaces them with expressions involving those loop indices.

This transformation makes the relationship between the variables and loop indices explicit, which helps other compiler analysis, such as dependence analysis.

Example:

Input code:

int c = 10;
for (int i = 0; i < 10; i++) {
    c = c + 5;              // c is incremented by 5 for each loop iteration
}

Output code:

int c = 10;
for (int i = 0; i < 10; i++) {
    c = 10 + 5 * (i + 1);   // c is explicitly expressed as a function of the loop index
}
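Making c an explicit function of i also removes the loop-carried dependence on c, which is what benefits dependence analysis. The following hand-written sketch makes this concrete (the array a and the store into it are illustrative additions, not part of the example above): before substitution each iteration needs the c computed by the previous one, while after substitution every iteration computes its own c, so the iterations are independent and the loop is easier to vectorize or parallelize.

/* Before substitution: iteration i reads the c written by iteration i-1,
 * a loop-carried dependence. */
void fill_before(int *a)
{
    int c = 10;
    for (int i = 0; i < 10; i++) {
        c = c + 5;
        a[i] = c;
    }
}

/* After substitution: each iteration computes c from i alone,
 * so the iterations no longer depend on one another. */
void fill_after(int *a)
{
    for (int i = 0; i < 10; i++) {
        int c = 10 + 5 * (i + 1);
        a[i] = c;
    }
}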

Non-linear induction variables

The same optimizations can be applied to induction variables that are not necessarily linear functions of the loop counter; for example, the loop

j = 1;
for (i = 0; i < 10; ++i) {
    j = j << 1;
}

may be converted to

for (i = 0; i < 10; ++i) {
    j = 1 << (i + 1);
}
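Non-linear induction variables also arise when one induction variable accumulates another. In the hand-written sketch below (not taken from the examples above), j sums the loop counter, so its closed form is quadratic in i; a compiler that recognizes such variables can rewrite the loop in the same way:

/* j accumulates the loop counter: after the update in iteration i,
 * j == 0 + 1 + ... + i == i * (i + 1) / 2. */
j = 0;
for (i = 0; i < 10; ++i) {
    j = j + i;
}

/* Equivalent form, with j expressed directly in terms of i. */
for (i = 0; i < 10; ++i) {
    j = i * (i + 1) / 2;
}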

References

  1. Muchnick, Steven S. (1997). Advanced Compiler Design and Implementation. Morgan Kaufmann. ISBN 978-1-55860-320-2.
