Sequence point

Last updated

In C and C++, a sequence point defines any point in a computer program's execution at which it is guaranteed that all side effects of previous evaluations will have been performed, and no side effects from subsequent evaluations have yet been performed. They are a core concept for determining the validity of and, if valid, the possible results of expressions. Adding more sequence points is sometimes necessary to make an expression defined and to ensure a single valid order of evaluation.

Contents

With C11 and C++11, usage of the term sequence point has been replaced by sequencing. There are three possibilities: [1] [2] [3]

  1. An expression's evaluation can be sequenced before that of another expression, or equivalently the other expression's evaluation is sequenced after that of the first.
  2. The expressions' evaluation is indeterminately sequenced, meaning one is sequenced before the other, but which is unspecified.
  3. The expressions' evaluation is unsequenced.

The execution of unsequenced evaluations can overlap, leading to potentially catastrophic undefined behavior if they share state. This situation can arise in parallel computations, causing race conditions, but undefined behavior can also result in single-threaded situations. For example, a[i] = i++; (where a is an array and i is an integer) has undefined behavior.

Examples of ambiguity

Consider two functions f() and g(). In C and C++, the + operator is not associated with a sequence point, and therefore in the expression f()+g() it is possible that either f() or g() will be executed first. The comma operator introduces a sequence point, and therefore in the code f(),g() the order of evaluation is defined: first f() is called, and then g() is called.

Sequence points also come into play when the same variable is modified more than once within a single expression. An often-cited example is the C expression i=i++, which apparently both assigns i its previous value and increments i. The final value of i is ambiguous, because, depending on the order of expression evaluation, the increment may occur before, after, or interleaved with the assignment. The definition of a particular language might specify one of the possible behaviors or simply say the behavior is undefined. In C and C++, evaluating such an expression yields undefined behavior. [4] Other languages, such as C#, define the precedence of the assignment and increment operator in such a way that the result of the expression i=i++ is guaranteed.

Behavior

Up to C++03

In C [5] and C++, [6] sequence points occur in the following places. (In C++, overloaded operators act like functions, and thus operators that have been overloaded introduce sequence points in the same way as function calls.)

  1. Between evaluation of the left and right operands of the && (logical AND), || (logical OR) (as part of short-circuit evaluation), and comma operators. For example, in the expression *p++!=0&&*q++!=0, all side effects of the sub-expression *p++!=0 are completed before any attempt to access q.
  2. Between the evaluation of the first operand of the ternary conditional operator and its second or third operand. For example, in the expression a=(*p++)?(*p++):0 there is a sequence point after the first *p++, meaning it has already been incremented by the time the second instance is executed.
  3. At the end of a full expression. This category includes expression statements (such as the assignment a=b;), return statements, the controlling expressions of if, switch, while, or do-while statements, and each of the three expressions in a for statement.
  4. Before a function is entered in a function call. The order in which the arguments are evaluated is not specified, but this sequence point means that all of their side effects are complete before the function is entered. In the expression f(i++)+g(j++)+h(k++), f is called with a parameter of the original value of i, but i is incremented before entering the body of f. Similarly, j and k are updated before entering g and h respectively. However, it is not specified in which order f(), g(), h() are executed, nor in which order i, j, k are incremented. If the body of f accesses the variables j and k, it might find both, neither, or just one of them to have been incremented. (The function call f(a,b,c) is not a use of the comma operator; the order of evaluation for a, b, and c is unspecified.)
  5. At a function return, after the return value is copied into the calling context. (This sequence point is only specified in the C++ standard; it is present only implicitly in C. [7] )
  6. At the end of an initializer; for example, after the evaluation of 5 in the declaration inta=5;.
  7. Between each declarator in each declarator sequence; for example, between the two evaluations of a++ in intx=a++,y=a++. [8] (This is not an example of the comma operator.)
  8. After each conversion associated with an input/output format specifier. For example, in the expression printf("foo %n %d",&a,42), there is a sequence point after the %n is evaluated and before printing 42.

C11 and C++11

Partially because of the introduction of language support for threads, C11 and C++11 introduced new terminology for evaluation order. An operation may be "sequenced before" another, or the two can be "indeterminately" sequenced (one must complete before the other) or "unsequenced" (the operations in each expression may be interleaved).

C++17

C++17 restricted several aspects of evaluation order. The new expression will always perform the memory allocation before evaluating the constructor arguments. The operators <<, >>, ., .*, ->*, and the subscript and function call operator are guaranteed to be evaluated left to right (whether they are overloaded or not). For example, the code

std::cout<<a()<<b()<<c();// parsed as (((std::cout << a()) << b()) << c());

is newly guaranteed to call a, b and c in that order. The right-hand side of any assignment-like operator is evaluated before the left-hand side, so that b() *= a(); is guaranteed to evaluate a first. Finally, although the order in which function parameters are evaluated remains implementation-defined, the compiler is no longer allowed to interleave sub-expressions across multiple parameters. [9]

See also

Related Research Articles

In logic, mathematics, and computer science, arity is the number of arguments or operands taken by a function, operation or relation. In mathematics, arity may also be called rank, but this word can have many other meanings. In logic and philosophy, arity may also be called adicity and degree. In linguistics, it is usually named valency.

<span class="mw-page-title-main">C syntax</span> Set of rules defining correctly structured programs

The syntax of the C programming language is the set of rules governing writing of software in the C language. It is designed to allow for programs that are extremely terse, have a close relationship with the resulting object code, and yet provide relatively high-level data abstraction. C was the first widely successful high-level language for portable operating-system development.

<span class="mw-page-title-main">Race condition</span> When a systems behavior depends on timing of uncontrollable events

A race condition or race hazard is the condition of an electronics, software, or other system where the system's substantive behavior is dependent on the sequence or timing of other uncontrollable events. It becomes a bug when one or more of the possible behaviors is undesirable.

In computer programming, undefined behavior (UB) is the result of executing a program whose behavior is prescribed to be unpredictable, in the language specification to which the computer code adheres. This is different from unspecified behavior, for which the language specification does not prescribe a result, and implementation-defined behavior that defers to the documentation of another component of the platform.

This is a list of operators in the C and C++ programming languages. All the operators listed exist in C++; the column "Included in C", states whether an operator is also present in C. Note that C does not support operator overloading.

In computer programming, the ternary conditional operator is a ternary operator that is part of the syntax for basic conditional expressions in several programming languages. It is commonly referred to as the conditional operator, ternary if, or inline if. An expression a ? b : c evaluates to b if the value of a is true, and otherwise to c. One can read it aloud as "if a then b otherwise c". The form a ? b : c is by far and large the most common, but alternative syntaxes do exist; for example, Raku uses the syntax a ?? b !! c to avoid confusion with the infix operators ? and !, whereas in Visual Basic .NET, it instead takes the form If(a, b, c).

In computer programming, operators are constructs defined within programming languages which behave generally like functions, but which differ syntactically or semantically.

The computer programming languages C and Pascal have similar times of origin, influences, and purposes. Both were used to design their own compilers early in their lifetimes. The original Pascal definition appeared in 1969 and a first compiler in 1970. The first version of C appeared in 1972.

<span class="mw-page-title-main">JavaScript syntax</span> Set of rules defining correctly structured programs

The syntax of JavaScript is the set of rules that define a correctly structured JavaScript program.

A class in C++ is a user-defined type or data structure declared with keyword class that has data and functions as its members whose access is governed by the three access specifiers private, protected or public. By default access to members of a C++ class is private. The private members are not accessible outside the class; they can be accessed only through methods of the class. The public members form an interface to the class and are accessible outside the class.

sizeof is a unary operator in the programming languages C and C++. It generates the storage size of an expression or a data type, measured in the number of char-sized units. Consequently, the construct sizeof (char) is guaranteed to be 1. The actual number of bits of type char is specified by the preprocessor macro CHAR_BIT, defined in the standard include file limits.h. On most modern computing platforms this is eight bits. The result of sizeof has an unsigned integer type that is usually denoted by size_t.

C++11 is a version of the ISO/IEC 14882 standard for the C++ programming language. C++11 replaced the prior version of the C++ standard, called C++03, and was later replaced by C++14. The name follows the tradition of naming language versions by the publication year of the specification, though it was formerly named C++0x because it was expected to be published before 2010.

Memory ordering describes the order of accesses to computer memory by a CPU. The term can refer either to the memory ordering generated by the compiler during compile time, or to the memory ordering generated by a CPU during runtime.

The null coalescing operator is a binary operator that is part of the syntax for a basic conditional expression in several programming languages, including C#, PowerShell as of version 7.0.0, Perl as of version 5.10, Swift, and PHP 7.0.0. While its behavior differs between implementations, the null coalescing operator generally returns the result of its left-most operand if it exists and is not null, and otherwise returns the right-most operand. This behavior allows a default value to be defined for cases where a more specific value is not available.

In the C and C++ programming languages, the comma operator is a binary operator that evaluates its first operand and discards the result, and then evaluates the second operand and returns this value ; there is a sequence point between these evaluations.

In computer programming, variadic templates are templates that take a variable number of arguments.

Unspecified behavior is behavior that may vary on different implementations of a programming language. A program can be said to contain unspecified behavior when its source code may produce an executable that exhibits different behavior when compiled on a different compiler, or on the same compiler with different settings, or indeed in different parts of the same executable. While the respective language standards or specifications may impose a range of possible behaviors, the exact behavior depends on the implementation and may not be completely determined upon examination of the program's source code. Unspecified behavior will often not manifest itself in the resulting program's external behavior, but it may sometimes lead to differing outputs or results, potentially causing portability problems.

Increment and decrement operators are unary operators that increase or decrease their operand by one.

In computer programming, initialization is the assignment of an initial value for a data object or variable. The manner in which initialization is performed depends on the programming language, as well as the type, storage class, etc., of an object to be initialized. Programming constructs which perform initialization are typically called initializers and initializer lists. Initialization is distinct from declaration, although the two can sometimes be conflated in practice. The complement of initialization is finalization, which is primarily used for objects, but not variables.

In certain computer programming languages, the Elvis operator, often written ?:, is a binary operator that returns its first operand if that operand evaluates to a true value, and otherwise evaluates and returns its second operand. This is identical to a short-circuit or with "last value" semantics. The notation of the Elvis operator was inspired by the ternary conditional operator, ? :, since the Elvis operator expression A ?: B is approximately equivalent to the ternary conditional A ? A : B.

References

  1. "ISO/IEC 14882:2011" . Retrieved 2012-07-04.
  2. "A finer-grained alternative to sequence points (revised) (WG21/N2239 J16/07-0099)" . Retrieved 2012-07-05.
  3. "Order of evaluation" . Retrieved 2015-10-14.
  4. Clause 6.5#2 of the C99 specification: "Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored."
  5. Annex C of the C99 specification lists the circumstances under which a sequence point may be assumed.
  6. The 1998 C++ standard lists sequence points for that language in section 1.9, paragraphs 1618.
  7. C++ standard, ISO 14882:2003, section 1.9, footnote 11.
  8. C++ standard, ISO 14882:2003, section 8.3: "Each init-declarator in a declaration is analyzed separately as if it was in a declaration by itself."
  9. Dos Reis, Gabriel; Sutter, Herb; Caves, Jonathan (2016-06-23). "Refining Expression Evaluation Order for Idiomatic C++" (PDF). open-std.org. pp. 1–5. Retrieved 28 April 2023.