Most vexing parse

Last updated

The most vexing parse is a counterintuitive form of syntactic ambiguity resolution in the C++ programming language. In certain situations, the C++ grammar cannot distinguish between the creation of an object parameter and specification of a function's type. In those situations, the compiler is required to interpret the line as a function type specification.

Contents

Occurrence

The term "most vexing parse" was first used by Scott Meyers in his 2001 book Effective STL . [1] While unusual in C, the phenomenon was quite common in C++ until the introduction of uniform initialization in C++11. [2]

Examples

C-style casts

A simple example appears when a functional cast is intended to convert an expression for initializing a variable:

voidf(doublemy_dbl){inti(int(my_dbl));}

Line 2 above is ambiguous. One possible interpretation is to declare a variable i with initial value produced by converting my_dbl to an int. However, C allows superfluous parentheses around function parameter declarations; in this case, the declaration of i is instead a function declaration equivalent to the following:

// A function named i takes an integer and returns an integer.inti(intmy_dbl);

Unnamed temporary

A more elaborate example is:

structTimer{};structTimeKeeper{explicitTimeKeeper(Timert);intget_time();};intmain(){TimeKeepertime_keeper(Timer());returntime_keeper.get_time();}

The line

TimeKeepertime_keeper(Timer());

is ambiguous, since it could be interpreted either as

  1. a variable definition for variable time_keeper of class TimeKeeper, initialized with an anonymous instance of class Timer or
  2. a function declaration for a function time_keeper that returns an object of type TimeKeeper and has a single (unnamed) parameter, whose type is a (pointer to a) function [Note 1] taking no input and returning Timer objects.

The C++ standard requires the second interpretation, which is inconsistent with the subsequent line 10 above. For example, Clang++ warns that the most vexing parse has been applied on line 9 and errors on the subsequent line 10: [3]

$ clang++ time_keeper.cc timekeeper.cc:9:25: warning: parentheses were disambiguated as a function declaration[-Wvexing-parse]   TimeKeeper time_keeper(Timer());                         ^~~~~~~~~timekeeper.cc:9:26: note: add a pair of parentheses to declare a variable   TimeKeeper time_keeper(Timer());                          ^(      )timekeeper.cc:10:21: error: member reference base type 'TimeKeeper (Timer (*)())' is not astructure or union   return time_keeper.get_time();          ~~~~~~~~~~~^~~~~~~~~

Solutions

The required interpretation of these ambiguous declarations is rarely the intended one. [4] [5] Function types in C++ are usually hidden behind typedefs and typically have an explicit reference or pointer qualifier. To force the alternate interpretation, the typical technique is a different object creation or conversion syntax.

In the type conversion example, there are two alternate syntaxes available for casts: the "C-style cast"

// declares a variable of type intinti((int)my_dbl);

or a named cast:

inti(static_cast<int>(my_dbl));

In the variable declaration example, the preferred method (since C++11) is uniform (brace) initialization. [6] This also allows limited omission of the type name entirely:

//Any of the following work:TimeKeepertime_keeper(Timer{});TimeKeepertime_keeper{Timer()};TimeKeepertime_keeper{Timer{}};TimeKeepertime_keeper({});TimeKeepertime_keeper{{}};

Prior to C++11, the common techniques to force the intended interpretation were use of an extra parenthesis or copy-initialization: [5]

TimeKeepertime_keeper(/*Avoid MVP*/(Timer()));TimeKeepertime_keeper=TimeKeeper(Timer());

In the latter syntax, the copy-initialization is likely to be optimized out by the compiler. [7] Since C++17, this optimization is guaranteed. [8]

Notes

  1. According to C++ type decay rules, a function object declared as a parameter is equivalent to a pointer to a function of that type. See Function object#In C and C++.

Related Research Articles

<span class="mw-page-title-main">C++</span> General-purpose programming language

C++ is a high-level, general-purpose programming language created by Danish computer scientist Bjarne Stroustrup. First released in 1985 as an extension of the C programming language, it has since expanded significantly over time; as of 1997, C++ has object-oriented, generic, and functional features, in addition to facilities for low-level memory manipulation for making things like microcomputers or to make operating systems like Linux or Windows. It is almost always implemented as a compiled language, and many vendors provide C++ compilers, including the Free Software Foundation, LLVM, Microsoft, Intel, Embarcadero, Oracle, and IBM.

In computer programming, a parameter or a formal argument is a special kind of variable used in a subroutine to refer to one of the pieces of data provided as input to the subroutine. These pieces of data are the values of the arguments with which the subroutine is going to be called/invoked. An ordered list of parameters is usually included in the definition of a subroutine, so that, each time the subroutine is called, its arguments for that call are evaluated, and the resulting values can be assigned to the corresponding parameters.

<span class="mw-page-title-main">C syntax</span> Set of rules defining correctly structured programs

The syntax of the C programming language is the set of rules governing writing of software in C. It is designed to allow for programs that are extremely terse, have a close relationship with the resulting object code, and yet provide relatively high-level data abstraction. C was the first widely successful high-level language for portable operating-system development.

In class-based, object-oriented programming, a constructor is a special type of function called to create an object. It prepares the new object for use, often accepting arguments that the constructor uses to set required member variables.

In some programming languages, const is a type qualifier, which indicates that the data is read-only. While this can be used to declare constants, const in the C family of languages differs from similar constructs in other languages in that it is part of the type, and thus has complicated behavior when combined with pointers, references, composite data types, and type-checking. In other languages, the data is not in a single memory location, but copied at compile time for each use. Languages which use it include C, C++, D, JavaScript, Julia, and Rust.

The computer programming languages C and Pascal have similar times of origin, influences, and purposes. Both were used to design their own compilers early in their lifetimes. The original Pascal definition appeared in 1969 and a first compiler in 1970. The first version of C appeared in 1972.

In computer programming, the lexer hack is a common solution to the problems in parsing ANSI C, due to the reference grammar being context-sensitive. In C, classifying a sequence of characters as a variable name or a type name requires contextual information of the phrase structure, which prevents one from having a context-free lexer.

A class in C++ is a user-defined type or data structure declared with any of the keywords class, struct or union that has data and functions as its members whose access is governed by the three access specifiers private, protected or public. By default access to members of a C++ class declared with the keyword class is private. The private members are not accessible outside the class; they can be accessed only through member functions of the class. The public members form an interface to the class and are accessible outside the class.

The C and C++ programming languages are closely related but have many significant differences. C++ began as a fork of an early, pre-standardized C, and was designed to be mostly source-and-link compatible with C compilers of the time. Due to this, development tools for the two languages are often integrated into a single product, with the programmer able to specify C or C++ as their source language.

C++11 is a version of the ISO/IEC 14882 standard for the C++ programming language. C++11 replaced the prior version of the C++ standard, called C++03, and was later replaced by C++14. The name follows the tradition of naming language versions by the publication year of the specification, though it was formerly named C++0x because it was expected to be published before 2010.

In computer programming, an anonymous function is a function definition that is not bound to an identifier. Anonymous functions are often arguments being passed to higher-order functions or used for constructing the result of a higher-order function that needs to return a function. If the function is only used once, or a limited number of times, an anonymous function may be syntactically lighter than using a named function. Anonymous functions are ubiquitous in functional programming languages and other languages with first-class functions, where they fulfil the same role for the function type as literals do for other data types.

stdarg.h is a header in the C standard library of the C programming language that allows functions to accept an indefinite number of arguments. It provides facilities for stepping through a list of function arguments of unknown number and type. C++ provides this functionality in the header cstdarg.

This article describes the syntax of the C# programming language. The features described are compatible with .NET Framework and Mono.

This article compares a large number of programming languages by tabulating their data types, their expression, statement, and declaration syntax, and some common operating-system interfaces.

The syntax and semantics of PHP, a programming language, form a set of rules that define how a PHP program can be written and interpreted.

This comparison of programming languages compares how object-oriented programming languages such as C++, Java, Smalltalk, Object Pascal, Perl, Python, and others manipulate data structures.

C++14 is a version of the ISO/IEC 14882 standard for the C++ programming language. It is intended to be a small extension over C++11, featuring mainly bug fixes and small improvements, and was replaced by C++17. Its approval was announced on August 18, 2014. C++14 was published as ISO/IEC 14882:2014 in December 2014.

Objective-C is a high-level general-purpose, object-oriented programming language that adds Smalltalk-style messaging to the C programming language. Originally developed by Brad Cox and Tom Love in the early 1980s, it was selected by NeXT for its NeXTSTEP operating system. Due to Apple macOS’s direct lineage from NeXTSTEP, Objective-C was the standard programming language used, supported, and promoted by Apple for developing macOS and iOS applications until the introduction of the Swift programming language in 2014.

<span class="mw-page-title-main">Nim (programming language)</span> Programming language

Nim is a general-purpose, multi-paradigm, statically typed, compiled high-level systems programming language, designed and developed by a team around Andreas Rumpf. Nim is designed to be "efficient, expressive, and elegant", supporting metaprogramming, functional, message passing, procedural, and object-oriented programming styles by providing several features such as compile time code generation, algebraic data types, a foreign function interface (FFI) with C, C++, Objective-C, and JavaScript, and supporting compiling to those same languages as intermediate representations.

C23 is the informal name for ISO/IEC 9899:2024, the next standard for the C programming language, which will replace C17. It was started in 2016 informally as C2x, and expected to be published in 2024. The most recent publicly available working draft of C23 was released on April 1, 2023. The first WG14 meeting for the C2x draft was held in October 2019, virtual remote meetings were held in 2020 due to the COVID-19 pandemic, then various teleconference meetings continued to occur through 2024.

References

  1. Meyers, Scott (2001). Effective STL: 50 Specific Ways to Improve Your Use of the Standard Template Library. Addison-Wesley. ISBN   0-201-74962-9.
  2. Coffin, Jerry (29 December 2012). "c++ - What is the purpose of the Most Vexing Parse?". Stack Overflow. Archived from the original on 17 January 2021. Retrieved 2021-01-17.
  3. Lattner, Chris (5 April 2010). "Amazing Feats of Clang Error Recovery". LLVM Project Blog. The Most Vexing Parse. Archived from the original on 26 September 2020. Retrieved 2021-01-17.
  4. DrPizza; Prototyped; wb; euzeka; Simpson, Homer J (October 2002). "C++'s "most vexing parse"". ArsTechnica OpenForum. Archived from the original on 20 May 2015. Retrieved 2021-01-17.
  5. 1 2 Boccara, Jonathan (2018-01-30). "The Most Vexing Parse: How to Spot It and Fix It Quickly". Fluent C++. Archived from the original on 2021-11-25. Retrieved 2021-01-17.
  6. Stroustrup, Bjarne (19 August 2016). "C++11 FAQ". www.stroustrup.com. Uniform initialization syntax and semantics. Archived from the original on 2021-08-20. Retrieved 2021-01-17.
  7. "Myths and urban legends about C++". C++ FAQ. What is copy elision? What is RVO?. Archived from the original on 17 January 2021. Retrieved 2021-01-17.
  8. Devlieghere, Jonas (2016-11-21). "Guaranteed Copy Elision". Jonas Devlieghere. Archived from the original on 2021-11-25. Retrieved 2021-01-17. Note, however, the caveats covered in Brand, C++ (2018-12-11). "Guaranteed Copy Elision Does Not Elide Copies". Microsoft C++ Team Blog. Archived from the original on 2021-11-25. Retrieved 2021-01-17.