Operator associativity

Last updated

In programming language theory, the associativity of an operator is a property that determines how operators of the same precedence are grouped in the absence of parentheses. If an operand is both preceded and followed by operators (for example, ^ 3 ^), and those operators have equal precedence, then the operand may be used as input to two different operations (i.e. the two operations indicated by the two operators). The choice of which operations to apply the operand to, is determined by the associativity of the operators. Operators may be associative (meaning the operations can be grouped arbitrarily), left-associative (meaning the operations are grouped from the left), right-associative (meaning the operations are grouped from the right) or non-associative (meaning operations cannot be chained, often because the output type is incompatible with the input types). The associativity and precedence of an operator is a part of the definition of the programming language; different programming languages may have different associativity and precedence for the same type of operator.

Contents

Consider the expression a ~ b ~ c. If the operator ~ has left associativity, this expression would be interpreted as (a ~ b) ~ c. If the operator has right associativity, the expression would be interpreted as a ~ (b ~ c). If the operator is non-associative, the expression might be a syntax error, or it might have some special meaning. Some mathematical operators have inherent associativity. For example, subtraction and division, as used in conventional math notation, are inherently left-associative. Addition and multiplication, by contrast, are both left and right associative. (e.g. (a * b) * c = a * (b * c)).

Many programming language manuals provide a table of operator precedence and associativity; see, for example, the table for C and C++.

The concept of notational associativity described here is related to, but different from, the mathematical associativity. An operation that is mathematically associative, by definition requires no notational associativity. (For example, addition has the associative property, therefore it does not have to be either left associative or right associative.) An operation that is not mathematically associative, however, must be notationally left-, right-, or non-associative. (For example, subtraction does not have the associative property, therefore it must have notational associativity.)

Examples

Associativity is only needed when the operators in an expression have the same precedence. Usually + and - have the same precedence. Consider the expression 7 - 4 + 2. The result could be either (7 - 4) + 2 = 5 or 7 - (4 + 2) = 1. The former result corresponds to the case when + and - are left-associative, the latter to when + and - are right-associative.

In order to reflect normal usage, addition, subtraction, multiplication, and division operators are usually left-associative, [1] [2] [3] while for an exponentiation operator (if present) [4] [ better source needed ] there is no general agreement. Any assignment operators are typically right-associative. To prevent cases where operands would be associated with two operators, or no operator at all, operators with the same precedence must have the same associativity.

A detailed example

Consider the expression 5^4^3^2, in which ^ is taken to be a right-associative exponentiation operator. A parser reading the tokens from left to right would apply the associativity rule to a branch, because of the right-associativity of ^, in the following way:

  1. Term 5 is read.
  2. Nonterminal ^ is read. Node: "5^".
  3. Term 4 is read. Node: "5^4".
  4. Nonterminal ^ is read, triggering the right-associativity rule. Associativity decides node: "5^(4^".
  5. Term 3 is read. Node: "5^(4^3".
  6. Nonterminal ^ is read, triggering the re-application of the right-associativity rule. Node "5^(4^(3^".
  7. Term 2 is read. Node "5^(4^(3^2".
  8. No tokens to read. Apply associativity to produce parse tree "5^(4^(3^2))".

This can then be evaluated depth-first, starting at the top node (the first ^):

  1. The evaluator walks down the tree, from the first, over the second, to the third ^ expression.
  2. It evaluates as: 32 = 9. The result replaces the expression branch as the second operand of the second ^.
  3. Evaluation continues one level up the parse tree as: 49 = 262,144. Again, the result replaces the expression branch as the second operand of the first ^.
  4. Again, the evaluator steps up the tree to the root expression and evaluates as: 52621446.2060699×10183230. The last remaining branch collapses and the result becomes the overall result, therefore completing overall evaluation.

A left-associative evaluation would have resulted in the parse tree ((5^4)^3)^2 and the completely different result (6253)2 = 244,140,62525.9604645×1016.

Right-associativity of assignment operators

In many imperative programming languages, the assignment operator is defined to be right-associative, and assignment is defined to be an expression (which evaluates to a value), not just a statement. This allows chained assignment by using the value of one assignment expression as the right operand of the next assignment expression.

In C, the assignment a = b is an expression that evaluates to the same value as the expression b converted to the type of a, with the side effect of storing the R-value of b into the L-value of a. [lower-alpha 1] Therefore the expression a = (b = c) can be interpreted as b = c; a = b;. The alternative expression (a = b) = c raises an error because a = b is not an L-value expression, i.e. it has an R-value but not an L-value where to store the R-value of c. The right-associativity of the = operator allows expressions such as a = b = c to be interpreted as a = (b = c).

In C++, the assignment a = b is an expression that evaluates to the same value as the expression a, with the side effect of storing the R-value of b into the L-value of a. Therefore the expression a = (b = c) can still be interpreted as b = c; a = b;. And the alternative expression (a = b) = c can be interpreted as a = b; a = c; instead of raising an error. The right-associativity of the = operator allows expressions such as a = b = c to be interpreted as a = (b = c).

Non-associative operators

Non-associative operators are operators that have no defined behavior when used in sequence in an expression. In Prolog the infix operator :- is non-associative because constructs such as "a :- b :- c" constitute syntax errors.

Another possibility is that sequences of certain operators are interpreted in some other way, which cannot be expressed as associativity. This generally means that syntactically, there is a special rule for sequences of these operations, and semantically the behavior is different. A good example is in Python, which has several such constructs. [5] Since assignments are statements, not operations, the assignment operator does not have a value and is not associative. Chained assignment is instead implemented by having a grammar rule for sequences of assignments a = b = c, which are then assigned left-to-right. Further, combinations of assignment and augmented assignment, like a = b += c are not legal in Python, though they are legal in C. Another example are comparison operators, such as >, ==, and <=. A chained comparison like a < b < c is interpreted as (a < b) and (b < c), not equivalent to either (a < b) < c or a < (b < c). [6]

See also

Notes

  1. An expression can be made into a statement by following it with a semicolon; i.e. a = b is an expression but a = b; is a statement.

Related Research Articles

<span class="mw-page-title-main">Associative property</span> Property of a mathematical operation

In mathematics, the associative property is a property of some binary operations, which means that rearranging the parentheses in an expression will not change the result. In propositional logic, associativity is a valid rule of replacement for expressions in logical proofs.

<span class="mw-page-title-main">Logical disjunction</span> Logical connective OR

In logic, disjunction, also known as logical disjunction or logical or or logical addition or inclusive disjunction, is a logical connective typically notated as and read aloud as "or". For instance, the English language sentence "it is sunny or it is warm" can be represented in logic using the disjunctive formula , assuming that abbreviates "it is sunny" and abbreviates "it is warm".

In mathematics, an operand is the object of a mathematical operation, i.e., it is the object or quantity that is operated on.

Polish notation (PN), also known as normal Polish notation (NPN), Łukasiewicz notation, Warsaw notation, Polish prefix notation or simply prefix notation, is a mathematical notation in which operators precede their operands, in contrast to the more common infix notation, in which operators are placed between operands, as well as reverse Polish notation (RPN), in which operators follow their operands. It does not need any parentheses as long as each operator has a fixed number of operands. The description "Polish" refers to the nationality of logician Jan Łukasiewicz, who invented Polish notation in 1924.

Rebol is a cross-platform data exchange language and a multi-paradigm dynamic programming language designed by Carl Sassenrath for network communications and distributed computing. It introduces the concept of dialecting: small, optimized, domain-specific languages for code and data, which is also the most notable property of the language according to its designer Carl Sassenrath:

Although it can be used for programming, writing functions, and performing processes, its greatest strength is the ability to easily create domain-specific languages or dialects

In mathematics and computer programming, the order of operations is a collection of rules that reflect conventions about which operations to perform first in order to evaluate a given mathematical expression.

In computer programming, an assignment statement sets and/or re-sets the value stored in the storage location(s) denoted by a variable name; in other words, it copies a value into the variable. In most imperative programming languages, the assignment statement is a fundamental construct.

In programming languages, scientific calculators and similar common operator notation or operator grammar is a way to define and analyse mathematical and other formal expressions. In this model a linear sequence of tokens are divided into two classes: operators and operands.

This is a list of operators in the C and C++ programming languages. All the operators listed exist in C++; the column "Included in C", states whether an operator is also present in C. Note that C does not support operator overloading.

In computer programming, the ternary conditional operator is a ternary operator that is part of the syntax for basic conditional expressions in several programming languages. It is commonly referred to as the conditional operator, ternary if, or inline if. An expression a ? b : c evaluates to b if the value of a is true, and otherwise to c. One can read it aloud as "if a then b otherwise c". The form a ? b : c is by far and large the most common, but alternative syntaxes do exist; for example, Raku uses the syntax a ?? b !! c to avoid confusion with the infix operators ? and !, whereas in Visual Basic .NET, it instead takes the form If(a, b, c).

In computer programming, operators are constructs defined within programming languages which behave generally like functions, but which differ syntactically or semantically.

In computer science, a parsing expression grammar (PEG) is a type of analytic formal grammar, i.e. it describes a formal language in terms of a set of rules for recognizing strings in the language. The formalism was introduced by Bryan Ford in 2004 and is closely related to the family of top-down parsing languages introduced in the early 1970s. Syntactically, PEGs also look similar to context-free grammars (CFGs), but they have a different interpretation: the choice operator selects the first match in PEG, while it is ambiguous in CFG. This is closer to how string recognition tends to be done in practice, e.g. by a recursive descent parser.

In computer science, a relational operator is a programming language construct or operator that tests or defines some kind of relation between two entities. These include numerical equality and inequalities.

In C and C++, a sequence point defines any point in a computer program's execution at which it is guaranteed that all side effects of previous evaluations will have been performed, and no side effects from subsequent evaluations have yet been performed. They are a core concept for determining the validity of and, if valid, the possible results of expressions. Adding more sequence points is sometimes necessary to make an expression defined and to ensure a single valid order of evaluation.

In computer science, an operator precedence parser is a bottom-up parser that interprets an operator-precedence grammar. For example, most calculators use operator precedence parsers to convert from the human-readable infix notation relying on order of operations to a format that is optimized for evaluation such as Reverse Polish notation (RPN).

In computer science, the shunting yard algorithm is a method for parsing arithmetical or logical expressions, or a combination of both, specified in infix notation. It can produce either a postfix notation string, also known as Reverse Polish notation (RPN), or an abstract syntax tree (AST). The algorithm was invented by Edsger Dijkstra and named the "shunting yard" algorithm because its operation resembles that of a railroad shunting yard. Dijkstra first described the shunting yard algorithm in the Mathematisch Centrum report MR 34/61.

In the C and C++ programming languages, the comma operator is a binary operator that evaluates its first operand and discards the result, and then evaluates the second operand and returns this value ; there is a sequence point between these evaluations.

A binary expression tree is a specific kind of a binary tree used to represent expressions. Two common types of expressions that a binary expression tree can represent are algebraic and boolean. These trees can represent expressions that contain both unary and binary operators.

A shift-reduce parser is a class of efficient, table-driven bottom-up parsing methods for computer languages and other notations formally defined by a grammar. The parsing methods most commonly used for parsing programming languages, LR parsing and its variations, are shift-reduce methods. The precedence parsers used before the invention of LR parsing are also shift-reduce methods. All shift-reduce parsers have similar outward effects, in the incremental order in which they build a parse tree or call specific output actions.

In certain computer programming languages, the Elvis operator, often written ?:, is a binary operator that returns the evaluated first operand if that operand evaluates to a value likened to logically true, and otherwise returns the evaluated second operand. This is identical to a short-circuit or with "last value" semantics. The notation of the Elvis operator was inspired by the ternary conditional operator, ? :, since the Elvis operator expression A ?: B is approximately equivalent to the ternary conditional expression A ? A : B.

References