Operator (computer programming)

In computer programming, operators are constructs defined within programming languages which behave generally like functions, but which differ syntactically or semantically.

Common simple examples include arithmetic (e.g. addition with +), comparison (e.g. "greater than" with >), and logical operations (e.g. AND, also written && in some languages). More involved examples include assignment (usually = or :=), field access in a record or object (usually .), and the scope resolution operator (often :: or .). Languages usually define a set of built-in operators, and in some cases allow users to add new meanings to existing operators or even define completely new operators.

Syntax

Syntactically, operators usually contrast with functions. In most languages, a function may be seen as a special form of prefix operator with a fixed precedence level and associativity, often with compulsory parentheses, e.g. Func(a) (or (Func a) in Lisp). Most languages support programmer-defined functions, but cannot really claim to support programmer-defined operators unless they offer more than prefix notation and more than a single precedence level. Semantically, operators can be seen as a special form of function with a different calling notation and a limited number of parameters (usually 1 or 2).

The position of the operator with respect to its operands may be prefix, infix or postfix, and the syntax of an expression involving an operator depends on its arity (number of operands), precedence, and (if applicable) associativity. Most programming languages support binary operators and a few unary operators, with a few supporting more operands, such as the ?: operator in C, which is ternary. There are prefix unary operators, such as unary minus -x, and postfix unary operators, such as post-increment x++; binary operations are infix, such as x + y or x = y. Infix operations of higher arity require additional symbols, as in the C ternary operator ?:, written a ? b : c; since this is the only common example, it is often referred to simply as the ternary operator. Prefix and postfix operations can support any desired arity, however, such as the postfix expression 1 2 3 4 +, where + takes four operands.
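The operator positions and arities described above can be illustrated with a short C++ sketch. The names Results and operator_forms are hypothetical and exist only for this illustration; the two static_assert lines check how precedence and associativity group operands.

```cpp
#include <cassert>

// Precedence: * binds tighter than +, so 2 + 3 * 4 groups as 2 + (3 * 4).
static_assert(2 + 3 * 4 == 14, "multiplication is applied before addition");
// Associativity: binary - is left-associative, so 10 - 4 - 3 groups as (10 - 4) - 3.
static_assert(10 - 4 - 3 == 3, "subtraction groups from the left");

// Hypothetical record collecting the value of each operator form.
struct Results {
    int negated;
    int old_value;
    int incremented;
    int sum;
    int chosen;
};

Results operator_forms(int x, int y) {
    Results r{};
    r.negated = -x;              // prefix unary operator
    r.old_value = x++;           // postfix unary operator: yields the value before the increment
    r.incremented = x;           // x now holds the incremented value
    r.sum = x + y;               // infix binary operator
    r.chosen = (x > y) ? x : y;  // ternary operator: selects one of two operands
    assert(r.old_value + 1 == r.incremented);
    return r;
}
```

Calling operator_forms(5, 3) exercises one prefix, one postfix, one infix and one ternary operator on the same pair of values.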

Occasionally [1] [2] parts of a language may be described as "matchfix" or "circumfix" [3] [4] operators, to simplify either the language's description or its implementation. A circumfix operator consists of two or more parts which enclose its operands. Circumfix operators have the highest precedence, with their contents being evaluated and the resulting value used in the surrounding expression. The most familiar circumfix operator is the pair of parentheses mentioned above, used to indicate which parts of an expression are to be evaluated before others. Another example, from physics, is the inner product in Dirac's bra–ket notation. Circumfix operators are especially useful to denote operations that involve many or varying numbers of operands.

The specification of a language defines the syntax of the operators it supports, while languages such as Prolog that support programmer-defined operators require that the syntax be defined by the programmer.

Semantics

The semantics of operators particularly depends on value, evaluation strategy, and argument passing mode (such as boolean short-circuiting). Simply, an expression involving an operator is evaluated in some way, and the resulting value may be just a value (an r-value), or may be an object allowing assignment (an l-value).

In simple cases this is identical to usual function calls; for example, addition x + y is generally equivalent to a function call add(x, y) and less-than comparison x < y to lt(x, y), meaning that the arguments are evaluated in their usual way, then some function is evaluated and the result is returned as a value. However, the semantics can be significantly different. For example, in assignment a = b the target a is not evaluated, but instead its location (address) is used to store the value of b – corresponding to call-by-reference semantics. Further, an assignment may be a statement (no value), or may be an expression (value), with the value itself either an r-value (just a value) or an l-value (able to be assigned to). As another example, the scope resolution operator  :: and the element access operator . (as in Foo::Bar or a.b) operate not on values, but on names, essentially call-by-name semantics, and their value is a name.
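As a rough C++ sketch of the two cases above: add and lt are the function names used in the text, implemented here with the standard function objects std::plus and std::less, while assignment_is_lvalue is a hypothetical helper showing that built-in assignment in C++ yields an l-value. The last point is specific to C++; in C the result of an assignment is not an l-value.

```cpp
#include <cassert>
#include <functional>

// x + y expressed as an ordinary function call.
int add(int x, int y) { return std::plus<int>{}(x, y); }

// x < y expressed as an ordinary function call.
bool lt(int x, int y) { return std::less<int>{}(x, y); }

// Hypothetical helper: the target of an assignment is used as a location, not as a value.
int assignment_is_lvalue() {
    int a = 0;
    int& ref = (a = 5);  // in C++, the assignment expression refers to the object a itself
    assert(a == 5);
    ref = 9;             // writing through the reference changes a
    return a;
}
```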

Use of l-values as operator operands is particularly notable in unary increment and decrement operators. In C, for instance, the following statement is legal and well-defined, and depends on the fact that array indexing returns an l-value:

x=++a[i];
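The statement can be placed in a small self-contained function to see its effect. The array contents, the index and the names IncrementResult and increment_element are assumptions chosen for illustration.

```cpp
#include <cassert>

// Hypothetical record holding x and the array element after the statement runs.
struct IncrementResult {
    int x;
    int element;
};

IncrementResult increment_element() {
    int a[3] = {10, 20, 30};
    int i = 1;
    int x = ++a[i];     // a[i] is an l-value: it is incremented in place, then its new value is read
    assert(x == a[i]);  // both now hold the incremented value
    return {x, a[i]};
}
```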

An important use is when a left-associative binary operator modifies its left argument (or produces a side effect) and then evaluates to that argument as an l-value. [note 1] This allows a sequence of operators to act on the original argument in turn, giving a fluent interface similar to method cascading. A common example is the << operator in the C++ iostream library, which allows fluent output, as follows:

cout<<"Hello"<<" "<<"world!"<<endl;
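The same chaining can be sketched with a string stream, so that the result can be inspected, and with a user-defined type. Total and the function names below are hypothetical; the only library facility relied on is that operator<< on a standard stream returns a reference to that stream.

```cpp
#include <cassert>
#include <sstream>
#include <string>

std::string fluent_output() {
    std::ostringstream out;
    // Each << returns a reference to out, so the next << applies to the same stream.
    out << "Hello" << " " << "world!";
    return out.str();
}

// Hypothetical type: << adds a number to a running total and evaluates to the object itself.
struct Total {
    int value = 0;
    Total& operator<<(int n) {
        value += n;
        return *this;
    }
};

int chained_total() {
    Total t;
    t << 1 << 2 << 3;  // left-associative: ((t << 1) << 2) << 3
    assert(t.value == 1 + 2 + 3);
    return t.value;
}
```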

User-defined operators

A language may contain a fixed number of built-in operators (e.g. +, -, *, <, <=, !, =, etc. in C, C++ and PHP), or it may allow the creation of programmer-defined operators (e.g. Prolog, [5] Seed7, [6] F#, OCaml, Haskell). Some programming languages restrict operator symbols to special characters like + or :=, while others also allow names like div (e.g. Pascal).

Most languages have a built-in set of operators, but do not allow user-defined operators, as this significantly complicates parsing. [note 2] Many languages only allow operators to be used for built-in types, but others allow existing operators to be used for user-defined types; this is known as operator overloading. Some languages allow new operators to be defined, however, either at compile time or at run time. This may involve meta-programming (specifying the operators in a separate language) or be done within the language itself. Definition of new operators, particularly runtime definition, often makes correct static analysis of programs impossible, since the syntax of the language may be Turing-complete, so even constructing the syntax tree may require solving the halting problem, which is impossible. This occurs for Perl, for example, and some dialects of Lisp.

Examples

Common examples that differ syntactically are arithmetic and comparison operations, e.g. ">" for "greater than", with names often outside the language's set of identifiers for functions, and called with a syntax different from the language's syntax for calling functions. As a function, "greater than" would generally be named by an identifier, such as gt or greater_than, and called as a function, as gt(x, y). Instead, the operation uses the special character > (which is tokenized separately during lexical analysis) and infix notation, as x > y.

Common examples that differ semantically (by argument passing mode) are boolean operations, which frequently feature short-circuit evaluation: e.g. a short-circuiting conjunction (X AND Y) that only evaluates later arguments if earlier ones are not false, in a language with strict call-by-value functions. This behaves instead similarly to if/then/else.
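The difference from an ordinary call-by-value function can be observed by counting evaluations, as in the following C++ sketch. The counter calls and the helpers check and and_function are hypothetical names introduced for the illustration.

```cpp
#include <cassert>

int calls = 0;  // counts how many times check() has been evaluated

bool check(bool result) {
    ++calls;
    return result;
}

// An ordinary function: both arguments are evaluated before the call takes place.
bool and_function(bool a, bool b) { return a && b; }

int calls_with_operator() {
    calls = 0;
    bool r = check(false) && check(true);  // the right operand is skipped
    assert(!r);
    return calls;
}

int calls_with_function() {
    calls = 0;
    bool r = and_function(check(false), check(true));  // both operands are evaluated
    assert(!r);
    return calls;
}
```

Both forms produce the same truth value; only the number of operand evaluations differs.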

Less common operators include:

Compilation

A compiler can implement operators and functions with subroutine calls or with inline code. Some built-in operators supported by a language have a direct mapping to a small number of instructions commonly found on central processing units, though others (e.g. '+' used to express string concatenation) may have complicated implementations.

Operator overloading

In some programming languages an operator may be ad hoc polymorphic, that is, have definitions for more than one kind of data (as in Java, where the + operator is used both for the addition of numbers and for the concatenation of strings). Such an operator is said to be overloaded. In languages that support operator overloading by the programmer (such as C++) but have a limited set of operators, operator overloading is often used to define customized uses for operators.
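A minimal sketch of programmer-defined overloading in C++ follows, assuming a hypothetical two-component vector type Vec2. The built-in + keeps its meaning for numbers, and a new definition is added for the user-defined type.

```cpp
#include <cassert>

// Hypothetical user-defined type.
struct Vec2 {
    double x;
    double y;
};

// Overload + for Vec2: component-wise addition.
Vec2 operator+(Vec2 a, Vec2 b) { return {a.x + b.x, a.y + b.y}; }

// Overload == for Vec2: component-wise comparison.
bool operator==(Vec2 a, Vec2 b) { return a.x == b.x && a.y == b.y; }

Vec2 sum_example() {
    Vec2 p{1.0, 2.0};
    Vec2 q{3.0, 4.0};
    Vec2 s = p + q;                // resolves to operator+(Vec2, Vec2)
    assert(s == operator+(p, q));  // the infix form and the explicit call are the same function
    return s;
}
```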

In the example IF ORDER_DATE > "12/31/2011" AND ORDER_DATE < "01/01/2013" THEN CONTINUE ELSE STOP, the operators are: > (greater than), AND and < (less than).

Operand coercion

Some languages also allow the operands of an operator to be implicitly converted, or coerced, to suitable data types for the operation to occur. For example, in Perl the coercion rules lead to 12 + "3.14" producing the result 15.14. The text "3.14" is converted to the number 3.14 before addition can take place. Further, 12 is an integer and 3.14 is either a floating-point or fixed-point number (a number that has a decimal place in it), so the integer is then converted to a floating-point or fixed-point number respectively.

JavaScript follows the opposite rule: given the same expression, it converts the integer 12 into the string "12", then concatenates the two operands to form "123.14".

In the presence of coercions in a language, the programmer must be aware of the specific rules regarding operand types and the operation result type to avoid subtle programming mistakes.
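For comparison, C++ coerces between numeric types but not between strings and numbers, so the two outcomes above have to be requested explicitly. The sketch below is only an illustration: numeric_sum and text_concat are hypothetical names, and std::stod and std::to_string are the standard library conversions used in place of implicit coercion.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Implicit numeric coercion: in 12 + 3.14 the int operand is converted to double.
static_assert(std::is_same<decltype(12 + 3.14), double>::value,
              "int + double yields double");

// Explicit string-to-number conversion, giving the numeric result described for Perl.
double numeric_sum() { return 12 + std::stod("3.14"); }

// Explicit number-to-string conversion, giving the concatenated result described for JavaScript.
std::string text_concat() { return std::to_string(12) + "3.14"; }
```

Making the conversion explicit removes the ambiguity about which operand is converted, at the cost of more verbose code.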

Operator features in programming languages

The following table shows the operator features in several programming languages:

| Programming language | Nonalphanumeric operator symbols | Alphanumeric operator symbols | Prefix | Infix | Postfix | Precedence | Associativity | Overloading | Programmer-defined overloading | Programmer-defined operator symbols |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ALGOL 68 | +* ** * / % %* %× - + < <= >= > = /= & -:= +:= *:= /:= %:= %*:= +=: :=: :/=: (All operators have bold alphanumeric equivalents, cf. next column. Some have non-ASCII equivalents, cf. below.) ¬ +× ⊥ ↑ ↓ ⌊ ⌈ × ÷ ÷× ÷* □ ≤ ≥ ≠ ∧ ∨ ×:= ÷:= ÷×:= ÷*:= %×:= :≠: | not abs arg bin entier leng level odd repr round shorten i shl shr up down lwb upb lt le ge gt eq ne and or over mod elem minusab plusab timesab divab overab modab plusto is isnt | Yes | Yes | No | Yes (prefix operators always have priority 10) | Infix operators are left associative, prefix operators are right associative | Yes | Yes | Yes |
| APL | + - × ÷ ⌈ ⌊ * ⍟ \| ! ○ ~ ∨ ∧ ⍱ ⍲ < ≤ = ≥ > ≠ . @ ≡ ≢ ⍴ , ⍪ ⍳ ↑ ↓ ? ⍒ ⍋ ⍉ ⌽ ⊖ ∊ ⊥ ⊤ ⍎ ⍕ ⌹ ⊂ ⊃ ∪ ∩ ⍷ ⌷ ∘ → ← / ⌿ \ ⍀ ¨ ⍣ & ⍨ ⌶ ⊆ ⊣ ⊢ ⍠ ⍤ ⌸ ⌺ ⍸ | Alphanumeric symbols need a ⎕ before the keyword | Yes (first-order functions only) | Yes | Yes (higher-order functions only) | Higher-order functions precede first-order functions | Higher-order functions are left associative, first-order functions are right associative | Yes | Yes | Yes (alphanumeric only) |
| C | () [] -> . ! ~ ++ -- + - * & / % << >> < <= > >= == != ^ \| && \|\| ?: = += -= *= /= %= &= ^= | sizeof | Yes | Yes | Yes | Yes | Yes | Yes | No | No |
| C++ (more) | | sizeof typeid new delete throw decltype static_cast dynamic_cast reinterpret_cast const_cast | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No |
| C# (more) | Same as C/C++, along with ?. ?[] ?? ??= | sizeof nameof new stackalloc await throw checked unchecked is as delegate default true false; LINQ: from select where group...by group...by...into join...in...on...equals join...in...on...equals...into orderby orderby...descending; Roslyn-only: __makeref __refvalue __reftype | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No |
| Java | Same as C/C++ | new throw instanceof | Yes | Yes | Yes | Yes | Yes | Yes | No | No |
| Eiffel | [] + - * / // = /= | not and or implies "and then" "or else" | Yes | Yes | No | Yes | Yes | No | Yes | Yes |
| Haskell | + - * / ^ ^^ ** == /= > < >= <= && \|\| >>= >> $ $! . ++ !! : (many more in common libraries) | The function's name must be put into backticks | Yes | Yes | No | Yes | Yes | Yes, using type classes | | Yes |
| Pascal | * / + - = < > <> <= >= := | not div mod and or in | Yes | Yes | No | Yes | Yes | Yes | No | No |
| Perl | -> ++ -- ** ! ~ \ + - . =~ !~ * / % < > <= >= == != <=> ~~ & \| ^ && \|\| ' | print sort chmod chdir rand and or not xor lt gt le ge eq ne cmp x | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No |
| PHP | [] ** ++ -- ~ @! [7] * / % + - . << >> < <= > >= == != === !== <> <=> & ^ \| && \|\| ?? ?: = += -= *= **= /= .= %= &= \|= ^= <<= >>= | clone new unset print echo isset instanceof and or xor | Yes | Yes | Yes | Yes | Yes | No | No | No |
| PL/I | ( ) -> + - * / ** > ¬> >= = ¬= <= < ¬< ¬ & \| \|\| | | Yes | Yes | No | Yes | Yes | No | No | No |
| Prolog | :- ?- ; , . =.. = \= < =< >= > == \== - + / * | spy nospy not is mod | Yes | Yes | Yes | Yes | Yes | No | No | Yes |
| Raku | ++ -- ** ! ~ ~~ * / + - . < > <= >= == != <=> & \| ^ && \|\| // [8] | print sort chmod chdir rand and or not xor lt gt le ge eq ne leg cmp x xx | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes [9] |
| Seed7 | {} [] -> ** ! + - * / << >> & >< \| = <> > >= < <= <& := +:= -:= *:= /:= <<:= >>:= &:= @:= | conv varConv parse conj div rem mdiv mod times mult in not and or digits lpad rpad lpad0 | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Smalltalk | Yes, up to two characters [10] | Alphanumeric symbols need a colon after the keyword | No | Yes | Yes | No | No | Yes | Yes | Yes |
| Swift | Any Unicode symbol string except ., including ! ~ + - * / % =+ =- =* =/ =% &+ &- &* =&+ =&- =&* && \|\| << >> & \| ^ == != < <= > >= ?? ... ..< in standard library | is as as? | Yes | Yes | Yes | Yes (defined as partial order in precedence groups) | Yes (defined as part of precedence groups) | Yes | Yes | Yes |
| Visual Basic .NET | () . ! ?() ?. ?! + - * / \ & << >> < <= > >= ^ <> = += -= *= /= \= &= ^= <<= >>= | New Await Mod Like Is IsNot Not And AndAlso Or OrElse Xor If(...,...) If(...,...,...) GetXmlNamespace(...) GetType(...) NameOf(...) TypeOf...Is TypeOf...IsNot DirectCast(...,...) TryCast(...,...) CType(...,...) CBool(...) CByte(...) CChar(...) CDate(...) CDec(...) CDbl(...) CInt(...) CLng(...) CObj(...) CSByte(...) CShort(...) CSng(...) CStr(...) CUInt(...) CULng(...) CUShort(...); LINQ: From Aggregate...Into Select Distinct Where <Order By>...[Ascending\|Descending] Take <Take While> Skip <Skip While> Let Group...By...Into Join...On <Group Join...On...Into> | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No |

Notes

  1. Conversely a right-associative operator with its right argument, though this is rarer.
  2. Introducing a new operator changes the lexical specification of the language, which changes the lexical analysis. The arity and precedence of the operator is then part of the phrase syntax of the language, which changes the phrase-level analysis. For example, adding an operator @ requires lexing and tokenizing this character, and the phrase structure (syntax tree) depends on the arity and precedence of this operator.

References

  1. "Operator Input Forms—Wolfram Language Documentation". reference.wolfram.com.
  2. "Maxima 5.42.0 Manual: 7. Operators". maxima.sourceforge.net.
  3. "Prefix, Postfix and Circumfix Operators". mythryl.org.
  4. "Operators". doc.perl6.org.
  5. "SWI-Prolog -- op/3". www.swi-prolog.org.
  6. "Declare an operator". seed7.sourceforge.net.
  7. "PHP: Error Control Operators - Manual". php.net.
  8. "Operators". docs.perl6.org.
  9. "Functions". docs.perl6.org.
  10. Goldberg, Adele. "Smalltalk-80: The Language and its Implementation" (PDF). p. 27. ISBN 0-201-11371-6.