Template metaprogramming

Template metaprogramming (TMP) is a metaprogramming technique in which templates are used by a compiler to generate temporary source code, which is merged by the compiler with the rest of the source code and then compiled. The output of these templates can include compile-time constants, data structures, and complete functions. The use of templates can be thought of as compile-time polymorphism. The technique is used by a number of languages, the best-known being C++, but also Curl, D, Nim, and XL.

Template metaprogramming was, in a sense, discovered accidentally. [1] [2]

Some other languages support similar, if not more powerful, compile-time facilities (such as Lisp macros), but those are outside the scope of this article.

Components of template metaprogramming

The use of templates as a metaprogramming technique requires two distinct operations: a template must be defined, and a defined template must be instantiated. The generic form of the generated source code is described in the template definition, and when the template is instantiated, the generic form in the template is used to generate a specific set of source code.
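The two operations can be sketched in C++ as follows. This is only an illustration: `FixedBuffer` and `twice` are hypothetical names chosen for the example and do not come from any library.

```cpp
// Definition: these templates describe the generic form only.
// No class or function code is generated from them yet.
template <typename T, int N>
struct FixedBuffer {               // hypothetical example type
    T data[N];
    static constexpr int size = N;
};

template <typename T>
constexpr T twice(T x) {           // hypothetical example function
    return x + x;
}

// Implicit instantiation: naming FixedBuffer<int, 4> or calling twice(21)
// makes the compiler generate the specific class and function.
static_assert(FixedBuffer<int, 4>::size == 4, "size comes from the template argument");
static_assert(twice(21) == 42, "twice<int> is generated on first use");
static_assert(sizeof(FixedBuffer<char, 8>) == 8, "each instantiation is a distinct type");

// Explicit instantiation: requests the generated code without otherwise using it.
template struct FixedBuffer<double, 2>;
```

Each distinct set of template arguments yields its own generated class or function, which is why `FixedBuffer<int, 4>` and `FixedBuffer<char, 8>` are unrelated types.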

Template metaprogramming is Turing-complete, meaning that any computation expressible by a computer program can be computed, in some form, by a template metaprogram. [3]

Templates are different from macros. A macro is a piece of code that executes at compile time and either performs textual manipulation of the code to be compiled (e.g. C++ preprocessor macros) or manipulates the abstract syntax tree being produced by the compiler (e.g. Rust or Lisp macros). Textual macros are notably more independent of the syntax of the language being manipulated, as they merely change the in-memory text of the source code right before compilation.

Template metaprograms have no mutable variables; that is, no variable can change value once it has been initialized. Template metaprogramming can therefore be seen as a form of functional programming. In fact, many template implementations implement flow control only through recursion, as seen in the example below.

Using template metaprogramming

Though the syntax of template metaprogramming is usually very different from the programming language it is used with, it has practical uses. Some common reasons to use templates are to implement generic programming (avoiding sections of code which are similar except for some minor variations) or to perform automatic compile-time optimization such as doing something once at compile time rather than every time the program is run — for instance, by having the compiler unroll loops to eliminate jumps and loop count decrements whenever the program is executed.

Compile-time class generation

What exactly "programming at compile-time" means can be illustrated with an example of a factorial function, which in non-template C++ can be written using recursion as follows:

unsigned factorial(unsigned n) {
    return n == 0 ? 1 : n * factorial(n - 1);
}

// Usage examples:
// factorial(0) would yield 1;
// factorial(4) would yield 24.

The code above will execute at run time to determine the factorial value of the literals 0 and 4. By using template metaprogramming and template specialization to provide the ending condition for the recursion, the factorials used in the program—ignoring any factorial not used—can be calculated at compile time by this code:

template <unsigned N>
struct factorial {
    static constexpr unsigned value = N * factorial<N - 1>::value;
};

template <>
struct factorial<0> {
    static constexpr unsigned value = 1;
};

// Usage examples:
// factorial<0>::value would yield 1;
// factorial<4>::value would yield 24.

The code above calculates the factorial value of the literals 0 and 4 at compile time and uses the results as if they were precalculated constants. To be able to use templates in this manner, the compiler must know the value of its parameters at compile time, which has the natural precondition that factorial<X>::value can only be used if X is known at compile time. In other words, X must be a constant literal or a constant expression.

C++11 introduced constexpr and C++20 introduced consteval, both of which let the compiler execute ordinary functions at compile time. With them, the usual recursive factorial definition can be written in non-templated syntax. [4]
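A minimal sketch of that non-templated form is shown here. The name `factorial_ct` is an assumption made for the example, and the consteval variant is guarded by a feature-test macro so that the listing still compiles under pre-C++20 settings.

```cpp
// constexpr (C++11 and later): the same function can be evaluated at compile
// time when its argument is a constant expression, or at run time otherwise.
constexpr unsigned factorial(unsigned n) {
    return n == 0 ? 1 : n * factorial(n - 1);
}

static_assert(factorial(0) == 1, "evaluated by the compiler");
static_assert(factorial(4) == 24, "evaluated by the compiler");

#if defined(__cpp_consteval)
// consteval (C++20): an immediate function; every call must be a constant
// expression, so it can never be evaluated at run time.
consteval unsigned factorial_ct(unsigned n) {   // hypothetical name
    return n == 0 ? 1 : n * factorial_ct(n - 1);
}

static_assert(factorial_ct(5) == 120, "always evaluated by the compiler");
#endif
```

Unlike `factorial<X>::value`, the constexpr function also accepts arguments that are only known at run time; in that case it is compiled and called like any other function.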

Compile-time code optimization

The factorial example above is one example of compile-time code optimization in that all factorials used by the program are pre-compiled and injected as numeric constants at compilation, saving both run-time overhead and memory footprint. It is, however, a relatively minor optimization.

As another, more significant, example of compile-time loop unrolling, template metaprogramming can be used to create length-n vector classes (where n is known at compile time). The benefit over a more traditional length-n vector is that the loops can be unrolled, resulting in very optimized code. As an example, consider the addition operator. A length-n vector addition might be written as

template <int length>
Vector<length>& Vector<length>::operator+=(const Vector<length>& rhs) {
    for (int i = 0; i < length; ++i)
        value[i] += rhs.value[i];
    return *this;
}

When the compiler instantiates the function template defined above with a length of 2, code equivalent to the following may be produced:

template <>
Vector<2>& Vector<2>::operator+=(const Vector<2>& rhs) {
    value[0] += rhs.value[0];
    value[1] += rhs.value[1];
    return *this;
}

The compiler's optimizer should be able to unroll the for loop because the template parameter length is a constant at compile time.

However, this technique should be used with care, because it may cause code bloat: separate unrolled code is generated for each value of length (the vector size) with which the template is instantiated.
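The listings above show only the operator, not the class it belongs to. One way to make them concrete is the following sketch, in which the layout of `Vector` (a plain array member named `value`) and the helper `add_demo` are assumptions made for the example. Whether the loop is actually unrolled remains the optimizer's decision.

```cpp
// Hypothetical minimal definition of the Vector class used above (C++14 or later).
template <int length>
struct Vector {
    double value[length];

    // The trip count is a compile-time constant for every instantiation,
    // which is what gives the optimizer the opportunity to unroll the loop.
    constexpr Vector& operator+=(const Vector& rhs) {
        for (int i = 0; i < length; ++i)
            value[i] += rhs.value[i];
        return *this;
    }
};

// Small helper (hypothetical) that exercises the operator in a constant expression.
constexpr Vector<2> add_demo() {
    Vector<2> a{{1.0, 2.0}};
    Vector<2> b{{3.0, 4.0}};
    a += b;
    return a;
}

static_assert(add_demo().value[0] == 4.0, "1.0 + 3.0");
static_assert(add_demo().value[1] == 6.0, "2.0 + 4.0");
```

Instantiating `Vector<2>`, `Vector<3>` and `Vector<16>` in the same program produces three separate copies of `operator+=`, which is the code-bloat cost mentioned above.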

Static polymorphism

Polymorphism is a common standard programming facility where derived objects can be used as instances of their base object but where the derived objects' methods will be invoked, as in this code

#include <iostream>

class Base {
public:
    virtual void method() { std::cout << "Base"; }
    virtual ~Base() {}
};

class Derived : public Base {
public:
    virtual void method() { std::cout << "Derived"; }
};

int main() {
    Base* pBase = new Derived;
    pBase->method(); // outputs "Derived"
    delete pBase;
    return 0;
}

where all invocations of virtual methods will be those of the most-derived class. This dynamically polymorphic behaviour is (typically) obtained by the creation of virtual look-up tables for classes with virtual methods, tables that are traversed at run time to identify the method to be invoked. Thus, run-time polymorphism necessarily entails execution overhead (though on modern architectures the overhead is small).

However, in many cases the polymorphic behaviour needed is invariant and can be determined at compile time. Then the Curiously Recurring Template Pattern (CRTP) can be used to achieve static polymorphism, which is an imitation of polymorphism in programming code but which is resolved at compile time and thus does away with run-time virtual-table lookups. For example:

template <class Derived>
struct base {
    void interface() {
        // ...
        static_cast<Derived*>(this)->implementation();
        // ...
    }
};

struct derived : base<derived> {
    void implementation() {
        // ...
    }
};

Here the base class template will take advantage of the fact that member function bodies are not instantiated until after their declarations, and it will use members of the derived class within its own member functions, via the use of a static_cast, thus at compilation generating an object composition with polymorphic characteristics. As an example of real-world usage, the CRTP is used in the Boost iterator library. [5]
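A small, complete variant of the pattern may make the mechanism easier to follow. The names `Shape`, `Square`, `Rect`, `area_impl` and `doubled_area` are hypothetical and are used only for this sketch.

```cpp
// CRTP base: the interface forwards to the derived class through a
// static_cast, so the call is resolved at compile time without a vtable.
template <class Derived>
struct Shape {
    constexpr double area() const {
        return static_cast<const Derived*>(this)->area_impl();
    }
};

struct Square : Shape<Square> {
    double side;
    constexpr explicit Square(double s) : side(s) {}
    constexpr double area_impl() const { return side * side; }
};

struct Rect : Shape<Rect> {
    double w, h;
    constexpr Rect(double w_, double h_) : w(w_), h(h_) {}
    constexpr double area_impl() const { return w * h; }
};

// Generic code written against the base template accepts any conforming
// derived type; a separate function is generated for each one.
template <class D>
constexpr double doubled_area(const Shape<D>& s) {
    return 2.0 * s.area();
}

static_assert(doubled_area(Square(3.0)) == 18.0, "2 * (3 * 3)");
static_assert(doubled_area(Rect(2.0, 5.0)) == 20.0, "2 * (2 * 5)");
```

Because `Shape<Square>` and `Shape<Rect>` are unrelated types, objects of the two derived classes cannot be stored behind a common base pointer; this is the trade-off for avoiding the run-time lookup.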

Another similar use is the "Barton–Nackman trick", sometimes referred to as "restricted template expansion", where common functionality can be placed in a base class that is used not as a contract but as a necessary component to enforce conformant behaviour while minimising code redundancy.
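The idiom, as it is commonly used today, can be sketched as follows. This is an illustration rather than the original formulation by Barton and Nackman, and `EqualityComparable` and `Point` are hypothetical names.

```cpp
// Base class template that supplies operator!= for any T providing operator==.
// The friend function is defined inside the class template, so a concrete
// non-template operator!= comes into existence for each T that derives from
// it, and it is found through argument-dependent lookup.
template <class T>
struct EqualityComparable {
    friend constexpr bool operator!=(const T& a, const T& b) {
        return !(a == b);
    }
};

struct Point : EqualityComparable<Point> {
    int x, y;
    constexpr Point(int x_, int y_) : x(x_), y(y_) {}

    // The only comparison the derived class has to write by hand.
    friend constexpr bool operator==(const Point& a, const Point& b) {
        return a.x == b.x && a.y == b.y;
    }
};

static_assert(Point(1, 2) == Point(1, 2), "hand-written operator==");
static_assert(Point(1, 2) != Point(3, 4), "operator!= supplied by the base");
static_assert(!(Point(1, 2) != Point(1, 2)), "consistent with operator==");
```

Here the base class is not an interface that callers program against; it exists only to inject the shared operator into every type that derives from it.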

Static table generation

The benefit of static tables is the replacement of "expensive" calculations with a simple array indexing operation (for examples, see lookup table). In C++, there exists more than one way to generate a static table at compile time. The following listing shows an example of creating a very simple table by using recursive structs and variadic templates. The table has a size of ten. Each value is the square of the index.

#include <iostream>
#include <array>

constexpr int TABLE_SIZE = 10;

/**
 * Variadic template for a recursive helper struct.
 */
template <int INDEX = 0, int... D>
struct Helper : Helper<INDEX + 1, D..., INDEX * INDEX> {};

/**
 * Specialization of the template to end the recursion when the table size reaches TABLE_SIZE.
 */
template <int... D>
struct Helper<TABLE_SIZE, D...> {
    static constexpr std::array<int, TABLE_SIZE> table = {D...};
};

constexpr std::array<int, TABLE_SIZE> table = Helper<>::table;

enum {
    FOUR = table[2] // compile time use
};

int main() {
    for (int i = 0; i < TABLE_SIZE; i++) {
        std::cout << table[i] << std::endl; // run time use
    }
    std::cout << "FOUR: " << FOUR << std::endl;
}

The idea behind this is that the struct Helper recursively inherits from a struct with one more template argument (in this example calculated as INDEX * INDEX) until the specialization of the template ends the recursion at a size of 10 elements. The specialization simply uses the variable argument list as elements for the array. The compiler will produce code similar to the following (taken from clang called with -Xclang -ast-print -fsyntax-only).

template <int INDEX = 0, int ...D> struct Helper : Helper<INDEX + 1, D..., INDEX * INDEX> {};
template<> struct Helper<0, <>> : Helper<0 + 1, 0 * 0> {};
template<> struct Helper<1, <0>> : Helper<1 + 1, 0, 1 * 1> {};
template<> struct Helper<2, <0, 1>> : Helper<2 + 1, 0, 1, 2 * 2> {};
template<> struct Helper<3, <0, 1, 4>> : Helper<3 + 1, 0, 1, 4, 3 * 3> {};
template<> struct Helper<4, <0, 1, 4, 9>> : Helper<4 + 1, 0, 1, 4, 9, 4 * 4> {};
template<> struct Helper<5, <0, 1, 4, 9, 16>> : Helper<5 + 1, 0, 1, 4, 9, 16, 5 * 5> {};
template<> struct Helper<6, <0, 1, 4, 9, 16, 25>> : Helper<6 + 1, 0, 1, 4, 9, 16, 25, 6 * 6> {};
template<> struct Helper<7, <0, 1, 4, 9, 16, 25, 36>> : Helper<7 + 1, 0, 1, 4, 9, 16, 25, 36, 7 * 7> {};
template<> struct Helper<8, <0, 1, 4, 9, 16, 25, 36, 49>> : Helper<8 + 1, 0, 1, 4, 9, 16, 25, 36, 49, 8 * 8> {};
template<> struct Helper<9, <0, 1, 4, 9, 16, 25, 36, 49, 64>> : Helper<9 + 1, 0, 1, 4, 9, 16, 25, 36, 49, 64, 9 * 9> {};
template<> struct Helper<10, <0, 1, 4, 9, 16, 25, 36, 49, 64, 81>> {
    static constexpr std::array<int, TABLE_SIZE> table = {0, 1, 4, 9, 16, 25, 36, 49, 64, 81};
};

Since C++17 this can be more readably written as:

#include <iostream>
#include <array>

constexpr int TABLE_SIZE = 10;

constexpr std::array<int, TABLE_SIZE> table = [] { // OR: constexpr auto table
    std::array<int, TABLE_SIZE> A = {};
    for (unsigned i = 0; i < TABLE_SIZE; i++) {
        A[i] = i * i;
    }
    return A;
}();

enum {
    FOUR = table[2] // compile time use
};

int main() {
    for (int i = 0; i < TABLE_SIZE; i++) {
        std::cout << table[i] << std::endl; // run time use
    }
    std::cout << "FOUR: " << FOUR << std::endl;
}

To show a more sophisticated example, the code in the following listing has been extended with a helper for value calculation (in preparation for more complicated computations), a table-specific offset, and a template argument for the type of the table values (e.g. uint8_t, uint16_t, ...).

#include <iostream>
#include <array>
#include <cstdint> // for uint16_t

constexpr int TABLE_SIZE = 20;
constexpr int OFFSET = 12;

/**
 * Template to calculate a single table entry
 */
template <typename VALUETYPE, VALUETYPE OFFSET, VALUETYPE INDEX>
struct ValueHelper {
    static constexpr VALUETYPE value = OFFSET + INDEX * INDEX;
};

/**
 * Variadic template for a recursive helper struct.
 */
template <typename VALUETYPE, VALUETYPE OFFSET, int N = 0, VALUETYPE... D>
struct Helper : Helper<VALUETYPE, OFFSET, N + 1, D..., ValueHelper<VALUETYPE, OFFSET, N>::value> {};

/**
 * Specialization of the template to end the recursion when the table size reaches TABLE_SIZE.
 */
template <typename VALUETYPE, VALUETYPE OFFSET, VALUETYPE... D>
struct Helper<VALUETYPE, OFFSET, TABLE_SIZE, D...> {
    static constexpr std::array<VALUETYPE, TABLE_SIZE> table = {D...};
};

constexpr std::array<uint16_t, TABLE_SIZE> table = Helper<uint16_t, OFFSET>::table;

int main() {
    for (int i = 0; i < TABLE_SIZE; i++) {
        std::cout << table[i] << std::endl;
    }
}

Which could be written as follows using C++17:

#include <iostream>
#include <array>
#include <cstdint> // for uint16_t

constexpr int TABLE_SIZE = 20;
constexpr int OFFSET = 12;

template <typename VALUETYPE, int OFFSET>
constexpr std::array<VALUETYPE, TABLE_SIZE> table = [] { // OR: constexpr auto table
    std::array<VALUETYPE, TABLE_SIZE> A = {};
    for (unsigned i = 0; i < TABLE_SIZE; i++) {
        A[i] = OFFSET + i * i;
    }
    return A;
}();

int main() {
    for (int i = 0; i < TABLE_SIZE; i++) {
        std::cout << table<uint16_t, OFFSET>[i] << std::endl;
    }
}

Concepts

The C++20 standard brought C++ programmers a new tool for template metaprogramming: concepts. [6]

Concepts allow programmers to specify the requirements that a type must satisfy before a template can be instantiated with it. When more than one constrained template matches, the compiler selects the one whose concept imposes the most specific requirements.

The following example solves the well-known Fizz buzz problem with template metaprogramming.

#include <boost/type_index.hpp> // for pretty printing of types
#include <iostream>
#include <tuple>

/**
 * Type representation of words to print
 */
struct Fizz {};
struct Buzz {};
struct FizzBuzz {};
template <size_t _N>
struct number {
    constexpr static size_t N = _N;
};

/**
 * Concepts used to define condition for specializations
 */
template <typename Any>
concept has_N = requires { requires Any::N - Any::N == 0; };
template <typename A>
concept fizz_c = has_N<A> && requires { requires A::N % 3 == 0; };
template <typename A>
concept buzz_c = has_N<A> && requires { requires A::N % 5 == 0; };
template <typename A>
concept fizzbuzz_c = fizz_c<A> && buzz_c<A>;

/**
 * By specializing `res` structure, with concepts requirements, proper instantiation is performed
 */
template <typename X>
struct res;
template <fizzbuzz_c X>
struct res<X> {
    using result = FizzBuzz;
};
template <fizz_c X>
struct res<X> {
    using result = Fizz;
};
template <buzz_c X>
struct res<X> {
    using result = Buzz;
};
template <has_N X>
struct res<X> {
    using result = X;
};

/**
 * Predeclaration of concatenator
 */
template <size_t cnt, typename... Args>
struct concatenator;

/**
 * Recursive way of concatenating next types
 */
template <size_t cnt, typename... Args>
struct concatenator<cnt, std::tuple<Args...>> {
    using type = typename concatenator<cnt - 1, std::tuple<typename res<number<cnt>>::result, Args...>>::type;
};

/**
 * Base case
 */
template <typename... Args>
struct concatenator<0, std::tuple<Args...>> {
    using type = std::tuple<Args...>;
};

/**
 * Final result getter
 */
template <size_t Amount>
using fizz_buzz_full_template = typename concatenator<Amount - 1, std::tuple<typename res<number<Amount>>::result>>::type;

int main() {
    // printing result with boost, so it's clear
    std::cout << boost::typeindex::type_id<fizz_buzz_full_template<100>>().pretty_name() << std::endl;
    /*
    Result:
        std::tuple<number<1ul>, number<2ul>, Fizz, number<4ul>, Buzz, Fizz, number<7ul>, number<8ul>, Fizz, Buzz, number<11ul>, Fizz, number<13ul>, number<14ul>, FizzBuzz,
        number<16ul>, number<17ul>, Fizz, number<19ul>, Buzz, Fizz, number<22ul>, number<23ul>, Fizz, Buzz, number<26ul>, Fizz, number<28ul>, number<29ul>, FizzBuzz,
        number<31ul>, number<32ul>, Fizz, number<34ul>, Buzz, Fizz, number<37ul>, number<38ul>, Fizz, Buzz, number<41ul>, Fizz, number<43ul>, number<44ul>, FizzBuzz,
        number<46ul>, number<47ul>, Fizz, number<49ul>, Buzz, Fizz, number<52ul>, number<53ul>, Fizz, Buzz, number<56ul>, Fizz, number<58ul>, number<59ul>, FizzBuzz,
        number<61ul>, number<62ul>, Fizz, number<64ul>, Buzz, Fizz, number<67ul>, number<68ul>, Fizz, Buzz, number<71ul>, Fizz, number<73ul>, number<74ul>, FizzBuzz,
        number<76ul>, number<77ul>, Fizz, number<79ul>, Buzz, Fizz, number<82ul>, number<83ul>, Fizz, Buzz, number<86ul>, Fizz, number<88ul>, number<89ul>, FizzBuzz,
        number<91ul>, number<92ul>, Fizz, number<94ul>, Buzz, Fizz, number<97ul>, number<98ul>, Fizz, Buzz>
    */
}

Benefits and drawbacks of template metaprogramming

Compile-time versus execution-time tradeoff
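If a great deal of template metaprogramming is used, compilation can become noticeably slower, in exchange for work that no longer has to be done at run time.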
Generic programming
Template metaprogramming allows the programmer to focus on architecture and delegate to the compiler the generation of any implementation required by client code. Thus, template metaprogramming can accomplish truly generic code, facilitating code minimization and better maintainability.
Readability
With respect to C++ prior to C++11, the syntax and idioms of template metaprogramming were esoteric compared to conventional C++ programming, and template metaprograms could be very difficult to understand. [7] [8] From C++11 onward, however, the syntax for value-computation metaprogramming has become increasingly akin to "normal" C++, with a progressively smaller readability penalty.

References

  1. Scott Meyers (12 May 2005). Effective C++: 55 Specific Ways to Improve Your Programs and Designs. Pearson Education. ISBN 978-0-13-270206-5.
  2. See History of TMP on Wikibooks
  3. Veldhuizen, Todd L. (2003). "C++ Templates are Turing Complete". CiteSeerX 10.1.1.14.3670.
  4. "Constexpr - Generalized Constant Expressions in C++11 - Cprogramming.com". www.cprogramming.com.
  5. "Iterator Facade - 1.79.0".
  6. "Constraints and concepts (since C++20) - cppreference.com". en.cppreference.com.
  7. Czarnecki, K.; O'Donnell, J.; Striegnitz, J.; Taha, Walid Mohamed (2004). "DSL implementation in metaocaml, template haskell, and C++" (PDF). University of Waterloo, University of Glasgow, Research Centre Julich, Rice University. C++ Template Metaprogramming suffers from a number of limitations, including portability problems due to compiler limitations (although this has significantly improved in the last few years), lack of debugging support or IO during template instantiation, long compilation times, long and incomprehensible errors, poor readability of the code, and poor error reporting.
  8. Sheard, Tim; Jones, Simon Peyton (2002). "Template Meta-programming for Haskell" (PDF). ACM 1-58113-415-0/01/0009. Robinson's provocative paper identifies C++ templates as a major, albeit accidental, success of the C++ language design. Despite the extremely baroque nature of template meta-programming, templates are used in fascinating ways that extend beyond the wildest dreams of the language designers. Perhaps surprisingly, in view of the fact that templates are functional programs, functional programmers have been slow to capitalize on C++'s success