Copy elision

Last updated March 13, 2024

In C++ computer programming, copy elision refers to a compiler optimization technique that eliminates unnecessary copying of objects.

The C++ language standard generally allows implementations to perform any optimization, provided the resulting program's observable behavior is the same as if (i.e. pretending) the program were executed exactly as mandated by the standard. Beyond that, the standard also describes a few situations where copying can be eliminated even if this would alter the program's behavior, the most common being the return value optimization (see below). Another widely implemented optimization, described in the C++ standard, is eliding the copy when a temporary object of class type is copied to an object of the same type.^[1]^[2] As a result, copy-initialization is usually equivalent to direct-initialization in terms of performance, but not in semantics; copy-initialization still requires an accessible copy constructor.^[3] The optimization cannot be applied to a temporary object that has been bound to a reference.

Example

    #include <iostream>

    int n = 0;

    struct C {
      explicit C(int) {}
      C(const C&) { ++n; }  // the copy constructor has a visible side effect
    };                      // it modifies an object with static storage duration

    int main() {
      C c1(42);      // direct-initialization, calls C::C(int)
      C c2 = C(42);  // copy-initialization, calls C::C(const C&)

      std::cout << n << std::endl;  // prints 0 if the copy was elided, 1 otherwise
    }

According to the standard, a similar optimization may be applied to objects being thrown and caught,^[4]^[5] but it is unclear whether the optimization applies to both the copy from the thrown object to the exception object and the copy from the exception object to the object declared in the exception-declaration of the catch clause. It is also unclear whether this optimization applies only to temporary objects or to named objects as well.^[6] Given the following source code:

    #include <iostream>

    struct C {
      C() = default;
      C(const C&) { std::cout << "Hello World!\n"; }
    };

    void f() {
      C c;
      throw c;  // copying the named object c into the exception object.
    }           // It is unclear whether this copy may be elided (omitted).

    int main() {
      try {
        f();
      } catch (C c) {  // copying the exception object into the temporary in the
                       // exception declaration.
      }                // It is also unclear whether this copy may be elided (omitted).
    }

A conforming compiler should therefore produce a program which prints "Hello World!" twice. In the C++11 revision of the C++ standard, the issues have been addressed, essentially allowing both the copy from the named object to the exception object, and the copy into the object declared in the exception handler to be elided.^[6]

GCC provides the -fno-elide-constructors option to disable copy elision. This option is useful to observe (or not observe) the effects of return value optimization or other optimizations where copies are elided. It is generally not recommended to disable this important optimization.

C++17 provides for "guaranteed copy elision": a prvalue is not materialized until needed, and then it is constructed directly into the storage of its final destination.^[7]

Return value optimization

In the context of the C++ programming language, return value optimization (RVO) is a compiler optimization that involves eliminating the temporary object created to hold a function's return value.^[8] The C++ standard allows RVO to change the observable behaviour of the resulting program.^[9]

Summary

In general, the C++ standard allows a compiler to perform any optimization, provided the resulting executable exhibits the same observable behaviour as if (i.e. pretending) all the requirements of the standard have been fulfilled. This is commonly referred to as the "as-if rule".^[10]^[2] The term return value optimization refers to a special clause in the C++ standard that goes even further than the "as-if" rule: an implementation may omit a copy operation resulting from a return statement, even if the copy constructor has side effects.^[1]^[2]

The following example demonstrates a scenario where the implementation may eliminate one or both of the copies being made, even if the copy constructor has a visible side effect (printing text).^[1]^[2] The first copy that may be eliminated is the one where a nameless temporary C could be copied into the function f's return value. The second copy that may be eliminated is the copy of the temporary object returned by f to obj.

    #include <iostream>

    struct C {
      C() = default;
      C(const C&) { std::cout << "A copy was made.\n"; }
    };

    C f() {
      return C();
    }

    int main() {
      std::cout << "Hello World!\n";
      C obj = f();
    }

Depending upon the compiler, and that compiler's settings, the resulting program may display any of the following outputs:

    Hello World!
    A copy was made.
    A copy was made.

    Hello World!
    A copy was made.

    Hello World!

Background

Returning an object of built-in type from a function usually carries little to no overhead, since the object typically fits in a CPU register. Returning a larger object of class type may require more expensive copying from one memory location to another. To avoid this, an implementation may create a hidden object in the caller's stack frame, and pass the address of this object to the function. The function's return value is then copied into the hidden object.^[11] Thus, code such as this:

    struct Data {
      char bytes[16];
    };

    Data F() {
      Data result = {};
      // generate result
      return result;
    }

    int main() {
      Data d = F();
    }

may generate code equivalent to this:

    struct Data {
      char bytes[16];
    };

    Data* F(Data* _hiddenAddress) {
      Data result = {};
      // copy result into hidden object
      *_hiddenAddress = result;
      return _hiddenAddress;
    }

    int main() {
      Data _hidden;           // create hidden object
      Data d = *F(&_hidden);  // copy the result into d
    }

which causes the Data object to be copied twice.

In the early stages of the evolution of C++, the language's inability to efficiently return an object of class type from a function was considered a weakness.^[12] Around 1991, Walter Bright implemented a technique to minimize copying, effectively replacing the hidden object and the named object inside the function with the object used for holding the result:^[13]

    struct Data {
      char bytes[16];
    };

    void F(Data* p) {
      // generate result directly in *p
    }

    int main() {
      Data d;
      F(&d);
    }

Bright implemented this optimization in his Zortech C++ compiler.^[12] This particular technique was later named "Named return value optimization" (NRVO), referring to the fact that the copying of a named object is elided.^[13]

Compiler support

Return value optimization is supported by most compilers.^[8]^[14]^[15] There may be, however, circumstances where the compiler is unable to perform the optimization. One common case is when a function may return different named objects depending on the path of execution:^[11]^[14]^[16]

    #include <string>

    std::string F(bool cond = false) {
      std::string first("first");
      std::string second("second");
      // the function may return one of two named objects
      // depending on its argument. RVO might not be applied
      return cond ? first : second;
    }

    int main() {
      std::string result = F();
    }

External links

Copy elision on cppreference.com

References

[1] ISO/IEC (2003). ISO/IEC 14882:2003(E): Programming Languages - C++. §12.8 Copying class objects [class.copy], para. 15.
[2] ISO/IEC (2003). "§ 12.8 Copying class objects [class.copy]". ISO/IEC 14882:2003(E): Programming Languages - C++ (PDF). para. 15. Archived from the original (PDF) on 2023-04-10. Retrieved 2024-02-26.
[3] Sutter, Herb (2001). More Exceptional C++. Addison-Wesley.
[4] ISO/IEC (2003). ISO/IEC 14882:2003(E): Programming Languages - C++. §15.1 Throwing an exception [except.throw], para. 5.
[5] ISO/IEC (2003). ISO/IEC 14882:2003(E): Programming Languages - C++. §15.3 Handling an exception [except.handle], para. 17.
[6] "C++ Standard Core Language Defect Reports". WG21. Retrieved 2009-03-27.
[7] "Copy elision". cppreference.com. https://en.cppreference.com/w/cpp/language/copy_elision
[8] Meyers, Scott (1995). More Effective C++. Addison-Wesley. ISBN 9780201633719.
[9] Alexandrescu, Andrei (2003-02-01). "Move Constructors". Dr. Dobb's Journal. Retrieved 2009-03-25.
[10] ISO/IEC (2003). ISO/IEC 14882:2003(E): Programming Languages - C++. §1.9 Program execution [intro.execution], para. 1.
[11] Bulka, Dov; David Mayhew (2000). Efficient C++. Addison-Wesley. ISBN 0-201-37950-3.
[12] Lippman, Stan (2004-02-03). "The Name Return Value Optimization". Microsoft. Retrieved 2009-03-23.
[13] "Glossary D Programming Language 2.0". Digital Mars. Retrieved 2009-03-23.
[14] Shoukry, Ayman B. (October 2005). "Named Return Value Optimization in Visual C++ 2005". Microsoft. Retrieved 2009-03-20.
[15] "Options Controlling C++ Dialect". GCC. 2001-03-17. Retrieved 2018-01-20.
[16] Hinnant, Howard; et al. (2002-09-10). "N1377: A Proposal to Add Move Semantics Support to the C++ Language". WG21. Retrieved 2009-03-25.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
