Comma operator

In the C and C++ programming languages, the comma operator (represented by the token ,) is a binary operator that evaluates its first operand and discards the result, and then evaluates the second operand and returns this value (and type); there is a sequence point between these evaluations.

The use of the comma token as an operator is distinct from its use in function calls and definitions, variable declarations, enum declarations, and similar constructs, where it acts as a separator.

Syntax

The comma operator separates expressions (which have value) in a way analogous to how the semicolon terminates statements, and sequences of expressions are enclosed in parentheses analogously to how sequences of statements are enclosed in braces: [1] (a, b, c) is a sequence of expressions, separated by commas, which evaluates to the last expression c, while {a; b; c;} is a sequence of statements, and does not evaluate to any value. A comma can only occur between two expressions – commas separate expressions – unlike the semicolon, which occurs at the end of a (non-block) statement – semicolons terminate statements.

The comma operator has the lowest precedence of any C operator, and acts as a sequence point. In a combination of commas and semicolons, semicolons have lower precedence than commas, as semicolons separate statements but commas occur within statements, which accords with their use as ordinary punctuation: a, b; c, d is grouped as (a, b); (c, d) because these are two separate statements.

Use of the comma operator in subscripting expressions has been deprecated as of C++20, [2] to reduce confusion and to open up the possibility of repurposing the syntax for multidimensional array indexing. In C++23, the ability to overload operator[] with multiple arguments was added, making unparenthesized comma expressions unusable in subscripts. [3] The comma operator remains usable, and is not deprecated, in this context if the comma expression is surrounded by parentheses (as in a[(b,c)]).

Examples

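In the examples below, the parenthesized and unparenthesized forms behave differently because the comma operator has lower precedence than assignment, and because a comma in a declaration is a separator rather than an operator. The return examples differ as well: the whole comma expression following return must be evaluated before the function can return, so the value of its last operand is returned.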

    /**
     *  Commas act as separators in this line, not as an operator.
     *  Results: a=1, b=2, c=3, i=0
     */
    int a = 1, b = 2, c = 3, i = 0;

    /**
     *  Assigns value of b into i.
     *  Commas act as separators in the first line and as an operator in the second line.
     *  Results: a=1, b=2, c=3, i=2
     */
    int a = 1, b = 2, c = 3;
    int i = (a, b);

    /**
     *  Assigns value of a into i.
     *  Equivalent to: int i = a; int b;
     *  Commas act as separators in both lines.
     *  The braces on the second line avoid variable redeclaration in the same block,
     *  which would cause a compilation error.
     *  The second b declared is given no initial value.
     *  Results: a=1, b=2, c=3, i=1
     */
    int a = 1, b = 2, c = 3;
    { int i = a, b; }

    /**
     *  Increases value of a by 2, then assigns value of resulting operation a + b into i.
     *  Commas act as separators in the first line and as an operator in the second line.
     *  Results: a=3, b=2, c=3, i=5
     */
    int a = 1, b = 2, c = 3;
    int i = (a += 2, a + b);

    /**
     *  Increases value of a by 2, then stores value of a to i, and discards unused
     *  values of resulting operation a + b.
     *  Equivalent to: (i = (a += 2)), a + b;
     *  Commas act as separators in the first line and as an operator in the third line.
     *  Results: a=3, b=2, c=3, i=3
     */
    int a = 1, b = 2, c = 3;
    int i;
    i = a += 2, a + b;

    /**
     *  Assigns value of a into i.
     *  Commas act as separators in both lines.
     *  The braces on the second line avoid variable redeclaration in the same block,
     *  which would cause a compilation error.
     *  The second b and c declared are given no initial value.
     *  Results: a=1, b=2, c=3, i=1
     */
    int a = 1, b = 2, c = 3;
    { int i = a, b, c; }

    /**
     *  Commas act as separators in the first line and as an operator in the second line.
     *  Assigns value of c into i, discarding the unused a and b values.
     *  Results: a=1, b=2, c=3, i=3
     */
    int a = 1, b = 2, c = 3;
    int i = (a, b, c);

    /**
     *  Returns 6, not 4, since comma operator sequence points following the keyword
     *  return are considered a single expression evaluating to rvalue of final
     *  subexpression c=6.
     *  Commas act as operators in this line.
     */
    return a = 4, b = 5, c = 6;

    /**
     *  Returns 3, not 1, for same reason as previous example.
     *  Commas act as operators in this line.
     */
    return 1, 2, 3;

    /**
     *  Returns 3, not 1, still for same reason as above. This example works as it does
     *  because return is a keyword, not a function call. Even though compilers will
     *  allow for the construct return(value), the parentheses are only relative to "value"
     *  and have no special effect on the return keyword.
     *  Return simply gets an expression and here the expression is "(1), 2, 3".
     *  Commas act as operators in this line.
     */
    return (1), 2, 3;

Uses

The comma operator has relatively limited use cases. Because it discards its first operand, it is generally only useful where the first operand has desirable side effects that must be sequenced before the second operand. Further, because it is rarely used outside of specific idioms, and is easily mistaken for other commas or for the semicolon, it is potentially confusing and error-prone. Nevertheless, there are certain circumstances where it is commonly used, notably in for loops and in SFINAE. [4] For embedded systems, which may have limited debugging capabilities, the comma operator can be used in combination with a macro to seamlessly override a function call, inserting code just before the call.

For loops

The most common use is to allow multiple assignment statements without using a block statement, primarily in the initialization and the increment expressions of a for loop. This is the only idiomatic use in elementary C programming. In the following example, the order of the loop's initializers is significant:

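    #include <stddef.h>
    #include <stdio.h>

    /* Print the len characters of s in reverse order. */
    void rev(char *s, size_t len)
    {
        char *first;
        /* first must be set before s is advanced past the end of the text */
        for (first = s, s += len; s > first; --s) {
            putchar(s[-1]);
        }
    }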

An alternative solution to this problem in other languages is parallel assignment, which allows multiple assignments to occur within a single statement, and also uses a comma, though with different syntax and semantics. This is used in Go in its analogous for loop. [5]

Outside of for loop initializers (which have a special use of semicolons), the comma might be used instead of a semicolon, particularly when the statements in question function similarly to a loop increment (e.g. at the end of a while loop):

    ++p, ++q;

    ++p; ++q;

Macros

The comma can be used in preprocessor macros to perform multiple operations in the space of a single syntactic expression.

One common use is to provide custom error messages in failed assertions. This is done by passing a parenthesized expression list to the assert macro, where the first expression is an error string and the second expression is the condition being asserted. The assert macro outputs its argument verbatim on an assertion failure. The following is an example:

    #include <stdio.h>
    #include <assert.h>

    int main(void)
    {
        int i;
        for (i = 0; i <= 9; i++) {
            assert(("i is too big!", i <= 4));
            printf("i = %i\n", i);
        }
        return 0;
    }

Output:

    i = 0
    i = 1
    i = 2
    i = 3
    i = 4
    assert: assert.c:6: test_assert: Assertion `( "i is too big!", i <= 4 )' failed.
    Aborted

However, the assert macro is usually disabled in production code, so this technique is suitable only for debugging.

Condition

The comma can be used within a condition (of an if, while, do while, or for) to allow auxiliary computations, particularly calling a function and using the result, with block scoping:

    if (y = f(x), y > x) {
        ... // statements involving x and y
    }

A similar idiom exists in Go, where the syntax of the if statement explicitly allows an optional statement. [6]

Complex return

The comma can be used in return statements, to assign to a global variable or out parameter (passed by reference). This idiom suggests that the assignments are part of the return, rather than auxiliary assignments in a block that terminates with the actual return. For example, in setting a global error number:

    if (failure) return (errno = EINVAL, -1);

This can be written more verbosely as:

    if (failure) {
        errno = EINVAL;
        return -1;
    }

Avoid a block

For brevity, the comma can be used to avoid a block and associated braces, as in:

    if (x == 1) y = 2, z = 3;

instead of:

    if (x == 1) { y = 2; z = 3; }

Other languages

In the OCaml and Ruby programming languages, the semicolon (";") is used for this purpose. JavaScript [7] and Perl [8] use the comma operator in the same way as C and C++. In Java, the comma is a separator used to separate elements in a list in various contexts. [9] It is not an operator and does not evaluate to the last element in the list. [10]

References

  1. "Comma Operator". Microsoft dev docs. Archived from the original on 1 August 2019. Retrieved 1 August 2019. Two expressions separated by a comma are evaluated left to right. The left operand is always evaluated, and all side effects are completed before the right operand is evaluated.
  2. "P1161R2: Deprecate uses of the comma operator in subscripting expressions". www.open-std.org. Retrieved 2022-09-05.
  3. Mark Hoemmen; Daisy Hollman; Corentin Jabot; Isabella Muerte; Christian Trott (2021-09-14). "Multidimensional subscript operator" (PDF).
  4. "SFINAE". en.cppreference.com. Retrieved 2022-08-30.
  5. Effective Go: for, "Finally, Go has no comma operator and ++ and -- are statements not expressions. Thus if you want to run multiple variables in a for you should use parallel assignment (although that precludes ++ and --)."
  6. The Go Programming Language Specification: If statements
  7. "Comma operator". MDN Web Docs. 17 January 2020. Archived from the original on 12 July 2014. Retrieved 25 January 2020. You can use the comma operator when you want to include multiple expressions in a location that requires a single expression.
  8. "Perlop - Perl operators and precedence - Perldoc Browser".
  9. "2.4. Grammar Notation". Oracle Corporation. Archived from the original on 22 July 2019. Retrieved 25 July 2019.
  10. "Is comma (,) operator or separator in Java?".

Bibliography

  • Ramajaran, V. (1994), Computer Programming in C, New Delhi: Prentice Hall of India
  • Dixit, J.B (2005), Fundamentals of computers and programming in C, New Delhi: Laxmi Publications
  • Kernighan, Brian W.; Ritchie, Dennis M. (1988), The C Programming Language (2nd ed.), Englewood Cliffs, NJ: Prentice Hall