Compile-time function execution

Last updated

In computing, compile-time function execution (or compile time function evaluation, or general constant expressions) is the ability of a compiler, that would normally compile a function to machine code and execute it at run time, to execute the function at compile time. This is possible if the arguments to the function are known at compile time, and the function does not make any reference to or attempt to modify any global state (i.e. it is a pure function).

Contents

If the value of only some of the arguments are known, the compiler may still be able to perform some level of compile-time function execution (partial evaluation), possibly producing more optimized code than if no arguments were known.

Examples

Lisp

The Lisp macro system is an early example of the use of compile-time evaluation of user-defined functions in the same language.

C++

The Metacode extension to C++ (Vandevoorde 2003) [1] was an early experimental system to allow compile-time function evaluation (CTFE) and code injection as an improved syntax for C++ template metaprogramming.

In earlier versions of C++, template metaprogramming is often used to compute values at compile time, such as:

template<intN>structFactorial{enum{value=N*Factorial<N-1>::value};};template<>structFactorial<0>{enum{value=1};};// Factorial<4>::value == 24// Factorial<0>::value == 1voidFoo(){intx=Factorial<0>::value;// == 1inty=Factorial<4>::value;// == 24}

Using compile-time function evaluation, code used to compute the factorial would be similar to what one would write for run-time evaluation e.g. using C++11 constexpr.

#include<cstdio>constexprintFactorial(intn){returnn?(n*Factorial(n-1)):1;}constexprintf10=Factorial(10);intmain(){printf("%d\n",f10);return0;}

In C++11, this technique is known as generalized constant expressions (constexpr). [2] C++14 relaxes the constraints on constexpr – allowing local declarations and use of conditionals and loops (the general restriction that all data required for the execution be available at compile-time remains).

Here's an example of compile time function evaluation in C++14:

// Iterative factorial at compile time.constexprintFactorial(intn){intresult=1;while(n>1){result*=n--;}returnresult;}intmain(){constexprintf4=Factorial(4);// f4 == 24}

Immediate functions (C++)

In C++20, immediate functions were introduced, and compile-time function execution was made more accessible and flexible with relaxed constexpr restrictions.

// Iterative factorial at compile time.constevalintFactorial(intn){intresult=1;while(n>1){result*=n--;}returnresult;}intmain(){intf4=Factorial(4);// f4 == 24}

Since function Factorial is marked consteval, it is guaranteed to invoke at compile-time without being forced in another manifestly constant-evaluated context. Hence, the usage of immediate functions offers wide uses in metaprogramming, and compile-time checking (used in C++20 text formatting library).

Here's an example of using immediate functions in compile-time function execution:

voidyou_see_this_error_because_assertion_fails(){}constevalvoidcassert(boolb){if(!b)you_see_this_error_because_assertion_fails();}constevalvoidtest(){intx=10;cassert(x==10);// okx++;cassert(x==11);// okx--;cassert(x==12);// fails here}intmain(){test();}

In this example, the compilation fails because the immediate function invoked function which is not usable in constant expressions. In other words, the compilation stops after failed assertion.

The typical compilation error message would display:

Infunction'intmain()':in'constexpr'expansionof'test()'in'constexpr'expansionof'cassert(x==12)'error:calltonon-'constexpr'function'you_see_this_error_because_assertion_fails()'you_see_this_error_because_assertion_fails();~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~[...]

Here's another example of using immediate functions as constructors which enables compile-time argument checking:

#include<string_view>#include<iostream>voidyou_see_this_error_because_the_message_ends_with_exclamation_point(){}structchecked_message{std::string_viewmsg;constevalchecked_message(constchar*arg):msg(arg){if(msg.ends_with('!'))you_see_this_error_because_the_message_ends_with_exclamation_point();}};voidsend_calm_message(checked_messagearg){std::cout<<arg.msg<<'\n';}intmain(){send_calm_message("Hello, world");send_calm_message("Hello, world!");}

The compilation fails here with the message:

Infunction'intmain()':in'constexpr'expansionof'checked_message(((constchar*)"Hello, world!"))'error:calltonon-'constexpr'function'voidyou_see_this_error_because_the_message_ends_with_exclamation_point()'you_see_this_error_because_the_message_ends_with_exclamation_point();~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~[...]

D

Here's an example of compile time function evaluation in the D programming language: [3]

intfactorial(intn){if(n==0)return1;returnn*factorial(n-1);}// computed at compile timeenumy=factorial(0);// == 1enumx=factorial(4);// == 24

This example specifies a valid D function called "factorial" which would typically be evaluated at run time. The use of enum tells the compiler that the initializer for the variables must be computed at compile time. Note that the arguments to the function must be able to be resolved at compile time as well. [4]

CTFE can be used to populate data structures at compile-time in a simple way (D version 2):

int[]genFactorials(intn){autoresult=newint[n];result[0]=1;foreach(i;1..n)result[i]=result[i-1]*i;returnresult;}enumfactorials=genFactorials(13);voidmain(){}// 'factorials' contains at compile-time:// [1, 1, 2, 6, 24, 120, 720, 5_040, 40_320, 362_880, 3_628_800,//  39_916_800, 479_001_600]

CTFE can be used to generate strings which are then parsed and compiled as D code in D.

Zig

Here's an example of compile time function evaluation in the Zig programming language: [5]

pubfnfactorial(n:usize)usize{varresult=1;for(1..(n+1))|i|{result*=i;}returnresult;}pubfnmain()void{constx=comptimefactorial(0);// == 0consty=comptimefactorial(4);// == 24}

This example specifies a valid Zig function called "factorial" which would typically be evaluated at run time. The use of comptime tells the compiler that the initializer for the variables must be computed at compile time. Note that the arguments to the function must be able to be resolved at compile time as well.

Zig also support Compile-Time Parameters. [6]

pubfnfactorial(comptimen:usize)usize{varresult:usize=1;for(1..(n+1))|i|{result*=i;}returnresult;}pubfnmain()void{constx=factorial(0);// == 0consty=factorial(4);// == 24}

CTFE can be used to create generic data structures at compile-time:

fnList(comptimeT:type)type{returnstruct{items:[]T,len:usize,};}// The generic List data structure can be instantiated by passing in a type:varbuffer:[10]i32=undefined;varlist=List(i32){.items=&buffer,.len=0,};

Related Research Articles

Templates are a feature of the C++ programming language that allows functions and classes to operate with generic types. This allows a function or class declaration to reference via a generic variable another different class without creating full declaration for each of these different classes.

Standard ML (SML) is a general-purpose, modular, functional programming language with compile-time type checking and type inference. It is popular for writing compilers, for programming language research, and for developing theorem provers.

Generic programming is a style of computer programming in which algorithms are written in terms of data types to-be-specified-later that are then instantiated when needed for specific types provided as parameters. This approach, pioneered by the ML programming language in 1973, permits writing common functions or types that differ only in the set of types on which they operate when used, thus reducing duplicate code.

Template metaprogramming (TMP) is a metaprogramming technique in which templates are used by a compiler to generate temporary source code, which is merged by the compiler with the rest of the source code and then compiled. The output of these templates can include compile-time constants, data structures, and complete functions. The use of templates can be thought of as compile-time polymorphism. The technique is used by a number of languages, the best-known being C++, but also Curl, D, Nim, and XL.

<span class="mw-page-title-main">D (programming language)</span> Multi-paradigm system programming language

D, also known as dlang, is a multi-paradigm system programming language created by Walter Bright at Digital Mars and released in 2001. Andrei Alexandrescu joined the design and development effort in 2007. Though it originated as a re-engineering of C++, D is now a very different language drawing inspiration from other high-level programming languages, notably Java, Python, Ruby, C#, and Eiffel.

In mathematics and computer science, a higher-order function (HOF) is a function that does at least one of the following:

In computer science, a tagged union, also called a variant, variant record, choice type, discriminated union, disjoint union, sum type or coproduct, is a data structure used to hold a value that could take on several different, but fixed, types. Only one of the types can be in use at any one time, and a tag field explicitly indicates which one is in use. It can be thought of as a type that has several "cases", each of which should be handled correctly when that type is manipulated. This is critical in defining recursive datatypes, in which some component of a value may have the same type as that value, for example in defining a type for representing trees, where it is necessary to distinguish multi-node subtrees and leaves. Like ordinary unions, tagged unions can save storage by overlapping storage areas for each type, since only one is in use at a time.

<span class="mw-page-title-main">C syntax</span> Set of rules defining correctly structured programs

The syntax of the C programming language is the set of rules governing writing of software in C. It is designed to allow for programs that are extremely terse, have a close relationship with the resulting object code, and yet provide relatively high-level data abstraction. C was the first widely successful high-level language for portable operating-system development.

The computer programming languages C and Pascal have similar times of origin, influences, and purposes. Both were used to design their own compilers early in their lifetimes. The original Pascal definition appeared in 1969 and a first compiler in 1970. The first version of C appeared in 1972.

typedef is a reserved keyword in the programming languages C, C++, and Objective-C. It is used to create an additional name (alias) for another data type, but does not create a new type, except in the obscure case of a qualified typedef of an array type where the typedef qualifiers are transferred to the array element type. As such, it is often used to simplify the syntax of declaring complex data structures consisting of struct and union types, although it is also commonly used to provide specific descriptive type names for integer data types of varying sizes.

Automata-based programming is a programming paradigm in which the program or part of it is thought of as a model of a finite-state machine (FSM) or any other formal automaton. Sometimes a potentially infinite set of possible states is introduced, and such a set can have a complicated structure, not just an enumeration.

<span class="mw-page-title-main">Recursion (computer science)</span> Use of functions that call themselves

In computer science, recursion is a method of solving a computational problem where the solution depends on solutions to smaller instances of the same problem. Recursion solves such recursive problems by using functions that call themselves from within their own code. The approach can be applied to many types of problems, and recursion is one of the central ideas of computer science.

The power of recursion evidently lies in the possibility of defining an infinite set of objects by a finite statement. In the same manner, an infinite number of computations can be described by a finite recursive program, even if this program contains no explicit repetitions.

C++11 is a version of the ISO/IEC 14882 standard for the C++ programming language. C++11 replaced the prior version of the C++ standard, called C++03, and was later replaced by C++14. The name follows the tradition of naming language versions by the publication year of the specification, though it was formerly named C++0x because it was expected to be published before 2010.

In computer programming, a pure function is a function that has the following properties:

  1. the function return values are identical for identical arguments, and
  2. the function has no side effects.

In computer programming, an anonymous function is a function definition that is not bound to an identifier. Anonymous functions are often arguments being passed to higher-order functions or used for constructing the result of a higher-order function that needs to return a function. If the function is only used once, or a limited number of times, an anonymous function may be syntactically lighter than using a named function. Anonymous functions are ubiquitous in functional programming languages and other languages with first-class functions, where they fulfil the same role for the function type as literals do for other data types.

In computer programming, variadic templates are templates that take a variable number of arguments.

"typename" is a keyword in the C++ programming language used when writing templates. It is used for specifying that a dependent name in a template definition or declaration is a type. In the original C++ compilers before the first ISO standard was completed, the typename keyword was not part of the C++ language and Bjarne Stroustrup used the class keyword for template arguments instead. While typename is now the preferred keyword, older source code may still use the class keyword instead.

This article describes the features in the programming language Haskell.

C++14 is a version of the ISO/IEC 14882 standard for the C++ programming language. It is intended to be a small extension over C++11, featuring mainly bug fixes and small improvements, and was replaced by C++17. Its approval was announced on August 18, 2014. C++14 was published as ISO/IEC 14882:2014 in December 2014.

<span class="mw-page-title-main">Zig (programming language)</span> A general-purpose programming language, toolchain to build Zig/C/C++ code

Zig is an imperative, general-purpose, statically typed, compiled system programming language designed by Andrew Kelley. It is intended to be a successor to the C programming language, with the goals of being even smaller and simpler to program in while also offering modern features, new optimizations and a variety of safety mechanisms while not as demanding of runtime safety as seen in other languages. It is distinct from languages like Go, Rust and Carbon, which have similar goals but also target the C++ space.

References

  1. Daveed Vandevoorde, Edison Design Group (April 18, 2003). "Reflective Metaprogramming in C++" (PDF). Retrieved July 19, 2015.
  2. Gabriel Dos Reis and Bjarne Stroustrup (March 2010). "General Constant Expressions for System Programming Languages. SAC-2010. The 25th ACM Symposium On Applied Computing" (PDF).
  3. D 2.0 language specification: Functions
  4. D 2.0 language specification: Attributes
  5. Zig 0.11.0 Language Reference: Compile-Time Expressions
  6. Zig 0.11.0 Language Reference: Compile-Time Parameters