Constant folding

Last updated February 25, 2024

Constant folding and constant propagation are related compiler optimizations used by many modern compilers.^[1] An advanced form of constant propagation known as sparse conditional constant propagation can more accurately propagate constants and simultaneously remove dead code.

Constant folding

Constant folding is the process of recognizing and evaluating constant expressions at compile time rather than computing them at runtime. Terms in constant expressions are typically simple literals, such as the integer literal 2, but they may also be variables whose values are known at compile time. Consider the statement:

i=320*200*32;

Most compilers would not actually generate two multiply instructions and a store for this statement. Instead, they identify constructs such as these and substitute the computed values at compile time (in this case, 2,048,000).

Constant folding can make use of arithmetic identities. If x is numeric, the value of 0 * x is zero even if the compiler does not know the value of x (note that this is not valid for IEEE floats since x could be Infinity or NaN. Still, some environments that favor performance such as GLSL shaders allow this for constants, which can occasionally cause bugs).

Constant folding may apply to more than just numbers. Concatenation of string literals and constant strings can be constant folded. Code such as "abc" + "def" may be replaced with "abcdef".

Constant folding and cross compilation

In implementing a cross compiler, care must be taken to ensure that the behaviour of the arithmetic operations on the host architecture matches that on the target architecture, as otherwise enabling constant folding will change the behaviour of the program. This is of particular importance in the case of floating point operations, whose precise implementation may vary widely.

Constant propagation

Constant propagation is the process of substituting the values of known constants in expressions at compile time. Such constants include those defined above, as well as intrinsic functions applied to constant values. Consider the following pseudocode:

intx=14;inty=7-x/2;returny*(28/x+2);

Propagating x yields:

intx=14;inty=7-14/2;returny*(28/14+2);

Continuing to propagate yields the following (which would likely be further optimized by dead-code elimination of both x and y.)

intx=14;inty=0;return0;

Constant propagation is implemented in compilers using reaching definition analysis results. If all variable's reaching definitions are the same assignment which assigns a same constant to the variable, then the variable has a constant value and can be replaced with the constant.

Constant propagation can also cause conditional branches to simplify to one or more unconditional statements, when the conditional expression can be evaluated to true or false at compile time to determine the only possible outcome.

The optimizations in action

Constant folding and propagation are typically used together to achieve many simplifications and reductions, and their interleaved, iterative application continues until those effects cease.

Consider this unoptimized pseudocode returning a number unknown pending analysis:

inta=30;intb=9-(a/5);intc;c=b*4;if(c>10){c=c-10;}returnc*(60/a);

Applying constant propagation once, followed by constant folding, yields:

inta=30;intb=3;intc;c=b*4;if(c>10){c=c-10;}returnc*2;

Repeating both steps twice produces:

inta=30;intb=3;intc;c=12;if(true){c=2;}returnc*2;

Having replaced all uses of variables a and b with constants, the compiler's dead-code elimination applies to those variables, leaving:

intc;c=12;if(true){c=2;}returnc*2;

(Boolean constructs vary among languages and compilers, but their details—such as the status, origin, and representation of true—do not affect these optimization principles.)

Traditional constant propagation produces no further optimization; it does not restructure programs.

However, a similar optimization, sparse conditional constant propagation, goes further by selecting the appropriate conditional branch^[2], and removing the always-true conditional test. Thus, variable c becomes redundant, and only an operation on a constant remains:

return4;

If that pseudocode constitutes a function body, the compiler knows the function evaluates to integer constant 4, allowing replacement of calls to the function with 4, and further increasing program efficiency.

Related Research Articles

In computing, an optimizing compiler is a compiler that tries to minimize or maximize some attributes of an executable computer program. Common requirements are to minimize a program's execution time, memory footprint, storage size, and power consumption.

Standard ML (SML) is a general-purpose, modular, functional programming language with compile-time type checking and type inference. It is popular for writing compilers, for programming language research, and for developing theorem provers.

The syntax of the C programming language is the set of rules governing writing of software in C. It is designed to allow for programs that are extremely terse, have a close relationship with the resulting object code, and yet provide relatively high-level data abstraction. C was the first widely successful high-level language for portable operating-system development.

<span class="mw-page-title-main">Conditional (computer programming)</span> Control flow statement that executes code according to some condition(s)

In computer science, conditionals are programming language commands for handling decisions. Specifically, conditionals perform different computations or actions depending on whether a programmer-defined Boolean condition evaluates to true or false. In terms of control flow, the decision is always achieved by selectively altering the control flow based on some condition . Although dynamic dispatch is not usually classified as a conditional construct, it is another way to select between alternatives at runtime. Conditional statements are the checkpoints in the programe that determines behaviour according to situation.

In compiler theory, dead-code elimination is a compiler optimization to remove dead code. Removing such code has several benefits: it shrinks program size, an important consideration in some contexts, it reduces resource usage such as the number of bytes to be transferred and it allows the running program to avoid executing irrelevant operations, which reduces its running time. It can also enable further optimizations by simplifying program structure. Dead code includes code that can never be executed, and code that only affects dead variables, that is, irrelevant to the program.

In computer programming, the ternary conditional operator is a ternary operator that is part of the syntax for basic conditional expressions in several programming languages. It is commonly referred to as the conditional operator, ternary if, or inline if. An expression a ? b : c evaluates to b if the value of a is true, and otherwise to c. One can read it aloud as "if a then b otherwise c". The form a ? b : c is by far and large the most common, but alternative syntaxes do exist; for example, Raku uses the syntax a ?? b !! c to avoid confusion with the infix operators ? and !, whereas in Visual Basic .NET, it instead takes the form If(a, b, c).

In computing, aliasing describes a situation in which a data location in memory can be accessed through different symbolic names in the program. Thus, modifying the data through one name implicitly modifies the values associated with all aliased names, which may not be expected by the programmer. As a result, aliasing makes it particularly difficult to understand, analyze and optimize programs. Aliasing analysers intend to make and compute useful information for understanding aliasing in programs.

Short-circuit evaluation, minimal evaluation, or McCarthy evaluation is the semantics of some Boolean operators in some programming languages in which the second argument is executed or evaluated only if the first argument does not suffice to determine the value of the expression: when the first argument of the AND function evaluates to false, the overall value must be false; and when the first argument of the OR function evaluates to true, the overall value must be true.

In computer programming, loop-invariant code consists of statements or expressions that can be moved outside the body of a loop without affecting the semantics of the program. Loop-invariant code motion is a compiler optimization that performs this movement automatically.

In computer programming languages, a switch statement is a type of selection control mechanism used to allow the value of a variable or expression to change the control flow of program execution via search and map.

In computer science, sparse conditional constant propagation (SCCP) is an optimization frequently applied in compilers after conversion to static single assignment form (SSA). It simultaneously removes some kinds of dead code and propagates constants throughout a program. Moreover, it can find more constant values, and thus more opportunities for improvement, than separately applying dead code elimination and constant propagation in any order or any number of repetitions.

In compiler theory, copy propagation is the process of replacing the occurrences of targets of direct assignments with their values. A direct assignment is an instruction of the form x = y, which simply assigns the value of y to x.

The computer programming languages C and Pascal have similar times of origin, influences, and purposes. Both were used to design their own compilers early in their lifetimes. The original Pascal definition appeared in 1969 and a first compiler in 1970. The first version of C appeared in 1972.

In computing, IIf is a function in several editions of the Visual Basic programming language and ColdFusion Markup Language (CFML), and on spreadsheets that returns the second or third parameter based on the evaluation of the first parameter. It is an example of a conditional expression, which is similar to a conditional statement.

C++11 is a version of the ISO/IEC 14882 standard for the C++ programming language. C++11 replaced the prior version of the C++ standard, called C++03, and was later replaced by C++14. The name follows the tradition of naming language versions by the publication year of the specification, though it was formerly named C++0x because it was expected to be published before 2010.

The conditional operator is supported in many programming languages. This term usually refers to ?: as in C, C++, C#, and JavaScript. However, in Java, this term can also refer to && and ||.

In computer programming, an anonymous function is a function definition that is not bound to an identifier. Anonymous functions are often arguments being passed to higher-order functions or used for constructing the result of a higher-order function that needs to return a function. If the function is only used once, or a limited number of times, an anonymous function may be syntactically lighter than using a named function. Anonymous functions are ubiquitous in functional programming languages and other languages with first-class functions, where they fulfil the same role for the function type as literals do for other data types.

In computing, compile-time function execution is the ability of a compiler, that would normally compile a function to machine code and execute it at run time, to execute the function at compile time. This is possible if the arguments to the function are known at compile time, and the function does not make any reference to or attempt to modify any global state.

This article compares a large number of programming languages by tabulating their data types, their expression, statement, and declaration syntax, and some common operating-system interfaces.

Value numbering is a technique of determining when two computations in a program are equivalent and eliminating one of them with a semantics-preserving optimization.

References

↑ Steven Muchnick; Muchnick and Associates (15 August 1997). Advanced Compiler Design Implementation . Morgan Kaufmann. ISBN 978-1-55860-320-2. constant propagation OR constant folding.
↑ Wegman, Mark N; Zadeck, F. Kenneth (April 1991), "Constant Propagation with Conditional Branches", ACM Transactions on Programming Languages and Systems, 13 (2): 181–210, CiteSeerX 10.1.1.130.3532 , doi:10.1145/103135.103136, S2CID 52813828

v t e Compiler optimizations
Basic block	Peephole optimization Local value numbering
Loop	Automatic parallelization Automatic vectorization Induction variable Loop fusion Loop-invariant code motion Loop inversion Loop interchange Loop nest optimization Loop splitting Loop unrolling Loop unswitching Software pipelining Strength reduction
Data-flow analysis	Available expression Common subexpression elimination Constant folding Dead store elimination Induction variable recognition and elimination Live-variable analysis Use-define chain
SSA-based	Global value numbering Sparse conditional constant propagation
Code generation	Instruction scheduling Instruction selection Register allocation Rematerialization
Functional	Deforestation Tail-call elimination
Global	Interprocedural optimization
Other	Bounds-checking elimination Compile-time function execution Dead-code elimination Expression templates Inline expansion Jump threading Partial evaluation Profile-guided optimization
Static analysis	Alias analysis Array-access analysis Control-flow analysis Data-flow analysis Dependence analysis Escape analysis Pointer analysis Shape analysis Value range analysis

Constant folding

Contents

Constant folding

Constant folding and cross compilation

Constant propagation

The optimizations in action

See also

Related Research Articles

References

Further reading