Compile-time function execution

Last updated

In computing, compile-time function execution (or compile-time function evaluation, or general constant expressions) is the ability of a compiler, that would normally compile a function to machine code and execute it at run time, to execute the function at compile time. This is possible if the arguments to the function are known at compile time, and the function does not make any reference to or attempt to modify any global state (i.e. it is a pure function).

Contents

If the value of only some of the arguments are known, the compiler may still be able to perform some level of compile-time function execution (partial evaluation), possibly producing more optimized code than if no arguments were known.

Examples

Lisp

The Lisp macro system is an early example of the use of compile-time evaluation of user-defined functions in the same language.

C++

The Metacode extension to C++ (Vandevoorde 2003) [1] was an early experimental system to allow compile-time function evaluation (CTFE) and code injection as an improved syntax for C++ template metaprogramming.

In earlier versions of C++, template metaprogramming is often used to compute values at compile time, such as:

template<intN>structFactorial{enum{value=N*Factorial<N-1>::value};};template<>structFactorial<0>{enum{value=1};};// Factorial<4>::value == 24// Factorial<0>::value == 1voidFoo(){intx=Factorial<0>::value;// == 1inty=Factorial<4>::value;// == 24}

Using compile-time function evaluation, code used to compute the factorial would be similar to what one would write for run-time evaluation e.g. using C++11 constexpr.

#include<cstdio>constexprintFactorial(intn){returnn?(n*Factorial(n-1)):1;}constexprintf10=Factorial(10);intmain(){printf("%d\n",f10);return0;}

In C++11, this technique is known as generalized constant expressions (constexpr). [2] C++14 relaxes the constraints on constexpr – allowing local declarations and use of conditionals and loops (the general restriction that all data required for the execution be available at compile-time remains).

Here's an example of compile-time function evaluation in C++14:

// Iterative factorial at compile time.constexprintFactorial(intn){intresult=1;while(n>1){result*=n--;}returnresult;}intmain(){constexprintf4=Factorial(4);// f4 == 24}

Immediate functions (C++)

In C++20, immediate functions were introduced, and compile-time function execution was made more accessible and flexible with relaxed constexpr restrictions.

// Iterative factorial at compile time.constevalintFactorial(intn){intresult=1;while(n>1){result*=n--;}returnresult;}intmain(){intf4=Factorial(4);// f4 == 24}

Since function Factorial is marked consteval, it is guaranteed to invoke at compile-time without being forced in another manifestly constant-evaluated context. Hence, the usage of immediate functions offers wide uses in metaprogramming, and compile-time checking (used in C++20 text formatting library).

Here's an example of using immediate functions in compile-time function execution:

voidyou_see_this_error_because_assertion_fails(){}constevalvoidcassert(boolb){if(!b)you_see_this_error_because_assertion_fails();}constevalvoidtest(){intx=10;cassert(x==10);// okx++;cassert(x==11);// okx--;cassert(x==12);// fails here}intmain(){test();}

In this example, the compilation fails because the immediate function invoked function which is not usable in constant expressions. In other words, the compilation stops after failed assertion.

The typical compilation error message would display:

Infunction'intmain()':in'constexpr'expansionof'test()'in'constexpr'expansionof'cassert(x==12)'error:calltonon-'constexpr'function'you_see_this_error_because_assertion_fails()'you_see_this_error_because_assertion_fails();~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~[...]

Here's another example of using immediate functions as constructors which enables compile-time argument checking:

#include<string_view>#include<iostream>voidyou_see_this_error_because_the_message_ends_with_exclamation_point(){}structchecked_message{std::string_viewmsg;constevalchecked_message(constchar*arg):msg(arg){if(msg.ends_with('!'))you_see_this_error_because_the_message_ends_with_exclamation_point();}};voidsend_calm_message(checked_messagearg){std::cout<<arg.msg<<'\n';}intmain(){send_calm_message("Hello, world");send_calm_message("Hello, world!");}

The compilation fails here with the message:

Infunction'intmain()':in'constexpr'expansionof'checked_message(((constchar*)"Hello, world!"))'error:calltonon-'constexpr'function'voidyou_see_this_error_because_the_message_ends_with_exclamation_point()'you_see_this_error_because_the_message_ends_with_exclamation_point();~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~[...]

D

Here's an example of compile-time function evaluation in the D programming language: [3]

intfactorial(intn){if(n==0)return1;returnn*factorial(n-1);}// computed at compile timeenumy=factorial(0);// == 1enumx=factorial(4);// == 24

This example specifies a valid D function called "factorial" which would typically be evaluated at run time. The use of enum tells the compiler that the initializer for the variables must be computed at compile time. Note that the arguments to the function must be able to be resolved at compile time as well. [4]

CTFE can be used to populate data structures at compile-time in a simple way (D version 2):

int[]genFactorials(intn){autoresult=newint[n];result[0]=1;foreach(i;1..n)result[i]=result[i-1]*i;returnresult;}enumfactorials=genFactorials(13);voidmain(){}// 'factorials' contains at compile-time:// [1, 1, 2, 6, 24, 120, 720, 5_040, 40_320, 362_880, 3_628_800,//  39_916_800, 479_001_600]

CTFE can be used to generate strings which are then parsed and compiled as D code in D.

Zig

Here's an example of compile-time function evaluation in the Zig programming language: [5]

pubfnfactorial(n:usize)usize{varresult=1;for(1..(n+1))|i|{result*=i;}returnresult;}pubfnmain()void{constx=comptimefactorial(0);// == 0consty=comptimefactorial(4);// == 24}

This example specifies a valid Zig function called "factorial" which would typically be evaluated at run time. The use of comptime tells the compiler that the initializer for the variables must be computed at compile time. Note that the arguments to the function must be able to be resolved at compile time as well.

Zig also support Compile-Time Parameters. [6]

pubfnfactorial(comptimen:usize)usize{varresult:usize=1;for(1..(n+1))|i|{result*=i;}returnresult;}pubfnmain()void{constx=factorial(0);// == 0consty=factorial(4);// == 24}

CTFE can be used to create generic data structures at compile-time:

fnList(comptimeT:type)type{returnstruct{items:[]T,len:usize,};}// The generic List data structure can be instantiated by passing in a type:varbuffer:[10]i32=undefined;varlist=List(i32){.items=&buffer,.len=0,};

Related Research Articles

Templates are a feature of the C++ programming language that allows functions and classes to operate with generic types. This allows a function or class declaration to reference via a generic variable another different class without creating full declaration for each of these different classes.

Standard ML (SML) is a general-purpose, high-level, modular, functional programming language with compile-time type checking and type inference. It is popular for writing compilers, for programming language research, and for developing theorem provers.

Generic programming is a style of computer programming in which algorithms are written in terms of data types to-be-specified-later that are then instantiated when needed for specific types provided as parameters. This approach, pioneered in the programming language ML in 1973, permits writing common functions or data types that differ only in the set of types on which they operate when used, thus reducing duplicate code.

Template metaprogramming (TMP) is a metaprogramming technique in which templates are used by a compiler to generate temporary source code, which is merged by the compiler with the rest of the source code and then compiled. The output of these templates can include compile-time constants, data structures, and complete functions. The use of templates can be thought of as compile-time polymorphism. The technique is used by a number of languages, the best-known being C++, but also Curl, D, Nim, and XL.

<span class="mw-page-title-main">D (programming language)</span> Multi-paradigm system programming language

D, also known as dlang, is a multi-paradigm system programming language created by Walter Bright at Digital Mars and released in 2001. Andrei Alexandrescu joined the design and development effort in 2007. Though it originated as a re-engineering of C++, D is now a very different language. As it has developed, it has drawn inspiration from other high-level programming languages. Notably, it has been influenced by Java, Python, Ruby, C#, and Eiffel.

In mathematics and computer science, a higher-order function (HOF) is a function that does at least one of the following:

In computer science, a tagged union, also called a variant, variant record, choice type, discriminated union, disjoint union, sum type, or coproduct, is a data structure used to hold a value that could take on several different, but fixed, types. Only one of the types can be in use at any one time, and a tag field explicitly indicates which type is in use. It can be thought of as a type that has several "cases", each of which should be handled correctly when that type is manipulated. This is critical in defining recursive datatypes, in which some component of a value may have the same type as that value, for example in defining a type for representing trees, where it is necessary to distinguish multi-node subtrees and leaves. Like ordinary unions, tagged unions can save storage by overlapping storage areas for each type, since only one is in use at a time.

<span class="mw-page-title-main">C syntax</span> Set of rules defining correctly structured programs

The syntax of the C programming language is the set of rules governing writing of software in C. It is designed to allow for programs that are extremely terse, have a close relationship with the resulting object code, and yet provide relatively high-level data abstraction. C was the first widely successful high-level language for portable operating-system development.

The computer programming languages C and Pascal have similar times of origin, influences, and purposes. Both were used to design their own compilers early in their lifetimes. The original Pascal definition appeared in 1969 and a first compiler in 1970. The first version of C appeared in 1972.

typedef is a reserved keyword in the programming languages C, C++, and Objective-C. It is used to create an additional name (alias) for another data type, but does not create a new type, except in the obscure case of a qualified typedef of an array type where the typedef qualifiers are transferred to the array element type. As such, it is often used to simplify the syntax of declaring complex data structures consisting of struct and union types, although it is also commonly used to provide specific descriptive type names for integer data types of varying sizes.

<span class="mw-page-title-main">Recursion (computer science)</span> Use of functions that call themselves

In computer science, recursion is a method of solving a computational problem where the solution depends on solutions to smaller instances of the same problem. Recursion solves such recursive problems by using functions that call themselves from within their own code. The approach can be applied to many types of problems, and recursion is one of the central ideas of computer science.

The power of recursion evidently lies in the possibility of defining an infinite set of objects by a finite statement. In the same manner, an infinite number of computations can be described by a finite recursive program, even if this program contains no explicit repetitions.

The C and C++ programming languages are closely related but have many significant differences. C++ began as a fork of an early, pre-standardized C, and was designed to be mostly source-and-link compatible with C compilers of the time. Due to this, development tools for the two languages are often integrated into a single product, with the programmer able to specify C or C++ as their source language.

C++11 is a version of a joint technical standard, ISO/IEC 14882, by the International Organization for Standardization (ISO) and International Electrotechnical Commission (IEC), for the C++ programming language. C++11 replaced the prior version of the C++ standard, named C++03, and was later replaced by C++14. The name follows the tradition of naming language versions by the publication year of the specification, though it was formerly named C++0x because it was expected to be published before 2010.

In computer programming, a pure function is a function that has the following properties:

  1. the function return values are identical for identical arguments, and
  2. the function has no side effects.

In computer programming, variadic templates are templates that take a variable number of arguments.

"typename" is a keyword in the C++ programming language used when writing templates. It is used for specifying that a dependent name in a template definition or declaration is a type. In the original C++ compilers before the first ISO standard was completed, the typename keyword was not part of the C++ language and Bjarne Stroustrup used the class keyword for template arguments instead. While typename is now the preferred keyword, older source code may still use the class keyword instead.

This article describes the features in the programming language Haskell.

C++14 is a version of the ISO/IEC 14882 standard for the C++ programming language. It is intended to be a small extension over C++11, featuring mainly bug fixes and small improvements, and was replaced by C++17. Its approval was announced on August 18, 2014. C++14 was published as ISO/IEC 14882:2014 in December 2014.

Elixir is a functional, concurrent, high-level general-purpose programming language that runs on the BEAM virtual machine, which is also used to implement the Erlang programming language. Elixir builds on top of Erlang and shares the same abstractions for building distributed, fault-tolerant applications. Elixir also provides tooling and an extensible design. The latter is supported by compile-time metaprogramming with macros and polymorphism via protocols.

<span class="mw-page-title-main">Zig (programming language)</span> General-purpose programming language

Zig is an imperative, general-purpose, statically typed, compiled system programming language designed by Andrew Kelley. It is free and open-source software, released under an MIT License.

References

  1. Daveed Vandevoorde, Edison Design Group (April 18, 2003). "Reflective Metaprogramming in C++" (PDF). Retrieved July 19, 2015.
  2. Gabriel Dos Reis and Bjarne Stroustrup (March 2010). "General Constant Expressions for System Programming Languages. SAC-2010. The 25th ACM Symposium On Applied Computing" (PDF).
  3. D 2.0 language specification: Functions
  4. D 2.0 language specification: Attributes
  5. Zig 0.11.0 Language Reference: Compile-Time Expressions
  6. Zig 0.11.0 Language Reference: Compile-Time Parameters