Short-circuit evaluation

Last updated

Short-circuit evaluation, minimal evaluation, or McCarthy evaluation (after John McCarthy) is the semantics of some Boolean operators in some programming languages in which the second argument is executed or evaluated only if the first argument does not suffice to determine the value of the expression: when the first argument of the AND function evaluates to false, the overall value must be false; and when the first argument of the OR function evaluates to true, the overall value must be true.

Contents

In programming languages with lazy evaluation (Lisp, Perl, Haskell), the usual Boolean operators short-circuit. In others (Ada, Java, Delphi), both short-circuit and standard Boolean operators are available. For some Boolean operations, like exclusive or (XOR), it is impossible to short-circuit, because both operands are always needed to determine a result.

Short-circuit operators are, in effect, control structures rather than simple arithmetic operators, as they are not strict. In imperative language terms (notably C and C++), where side effects are important, short-circuit operators introduce a sequence point: they completely evaluate the first argument, including any side effects, before (optionally) processing the second argument. ALGOL 68 used proceduring to achieve user-defined short-circuit operators and procedures.

The use of short-circuit operators has been criticized as problematic:

The conditional connectives — "cand" and "cor" for short — are ... less innocent than they might seem at first sight. For instance, cor does not distribute over cand: compare

(A cand B) cor C with (A cor C) cand (B cor C);

in the case ¬A ∧ C , the second expression requires B to be defined, the first one does not. Because the conditional connectives thus complicate the formal reasoning about programs, they are better avoided.

Definition

In any programming language that implements short-circuit evaluation, the expression x and y is equivalent to the conditional expression if x then y else x, and the expression x or y is equivalent to if x then x else y. In either case, x is only evaluated once.

The generalized definition above accommodates loosely typed languages that have more than the two truth-values True and False, where short-circuit operators may return the last evaluated subexpression. This is called "last value" in the table below. For a strictly-typed language, the expression is simplified to if x then y else false and if x then true else y respectively for the boolean case.

Precedence

Although AND takes precedence over OR in many languages, this is not a universal property of short-circuit evaluation. An example of the two operators taking the same precedence and being left-associative with each other is POSIX shell's command-list syntax. [2] :§2.9.3

The following simple left-to-right evaluator enforces a precedence of AND over OR by a continue:

function short-circuit-eval (operators, values)     letresult := True     for each (op, val) in (operators, values):         ifop = "AND" && result = False             continueelse ifop = "OR" && result = True             returnresultelseresult := valreturnresult

Formalization

Short-circuit logic, with or without side-effects, have been formalized based on Hoare's conditional. A result is that non-short-circuiting operators can be defined out of short-circuit logic to have the same sequence of evaluation. [3]

Support in common programming and scripting languages

As you look at the table below, keep in mind that bitwise operators often do not behave exactly like logical operators, even if both arguments are of 0, 1 or Boolean type.

Examples:

Boolean operators in various languages
Language Eager operatorsShort-circuit operatorsResult type
Advanced Business Application Programming (ABAP)noneand, orBoolean [lower-alpha 1]
Ada and, orand then, or elseBoolean
ALGOL 68 and, &, ∧ ; or, ∨andf , orf (both user defined)Boolean
APL , , (nand), (nor), etc.:AndIf, :OrIfBoolean [lower-alpha 1]
awk none&&, ||Boolean
Bash none&&, ||Boolean
C, Objective-C &, | [lower-alpha 2] &&, ||, ? [5] int (&, |, &&,||), opnd-dependent (?)
C++ [lower-alpha 3] none&&, ||, ? [6] Boolean (&&,||), opnd-dependent (?)
C# &, |&&, ||, ?, ??Boolean (&&,||), opnd-dependent (?, ??)
ColdFusion Markup Language (CFML)noneAND, OR, &&, ||Boolean
D [lower-alpha 4] &, |&&, ||, ?Boolean (&&,||), opnd-dependent (?)
Eiffel and, orand then, or elseBoolean
Erlang and, orandalso, orelseBoolean
Fortran [lower-alpha 5] .and., .or..and., .or.Boolean
Go, Haskell, OCaml none&&, ||Boolean
Java, MATLAB, R, Swift &, |&&, ||Boolean
JavaScript none&&, &&=, ||, ||=Last value
Julia none&&, ||Last value
Lasso noneand, or, &&, ||Last value
Kotlin and, or&&, ||Boolean
Lisp, Lua, Scheme noneand, orLast value
MUMPS (M)&, !noneNumeric
Modula-2 noneAND, ORBoolean
Oberon none&, ORBoolean
OCaml land, lor [7] &&, ||Boolean
Pascal and, or [lower-alpha 6] [lower-alpha 7] and_then, or_else [lower-alpha 7] Boolean
Perl &, |&&, and, ||, orLast value
PHP none&&, and, ||, orBoolean
POSIX shell (command list)none&&, ||Last value (exit)
PowerShell Scripting Languagenone-and, -orBoolean
Python &, |and, orLast value
Ruby &, |&&, and, ||, or [8] Last value
Rust &, |&&, || [9] Boolean
Smalltalk &, |and:, or: [lower-alpha 8] Boolean
Standard ML Un­knownandalso, orelseBoolean
TTCN-3 noneand, or [10] Boolean
Beckhoff TwinCAT® (IEC 61131-3) [lower-alpha 9] AND, ORAND_THEN, [11] OR_ELSE [12] Boolean
Visual Basic .NET And, OrAndAlso, OrElseBoolean
Visual Basic, Visual Basic for Applications (VBA)And, OrSelect Case [lower-alpha 10] Numeric
Wolfram Language And @@ {...}, Or @@ {...}And, Or, &&, ||Boolean
ZTT&, |noneBoolean
  1. 1 2 ABAP and APL have no distinct boolean type.
  2. The bitwise operators behave like boolean operators when both arguments are of type bool or take only the values 0 or 1. [4]
  3. When overloaded, the operators && and || are eager and can return any type.
  4. This only applies to runtime-evaluated expressions, static if and static assert. Expressions in static initializers or manifest constants use eager evaluation.
  5. Fortran operators are neither short-circuit nor eager: the language specification allows the compiler to select the method for optimization.
  6. ISO/IEC 10206:1990 Extended Pascal allows, but does not require, short-circuiting.
  7. 1 2 Delphi and Free Pascal default to short circuit evaluation. This may be changed by compiler options but does not seem to be used widely.
  8. Smalltalk uses short-circuit semantics as long as the argument to and: is a block (e.g., false and: [Transcript show: 'Wont see me']).
  9. The norm IEC 61131-3 doesn't actually define if AND and OR use short-circuit evaluation and it doesn't define the operators AND_THEN and OR_ELSE. The entries in the table show how it works for Beckhoff TwinCAT®.
  10. BASIC languages that supported CASE statements did so by using the conditional evaluation system, rather than as jump tables limited to fixed labels.

Common use

Avoiding undesired side effects of the second argument

Usual example, using a C-based language:

intdenom=0;if(denom!=0&&num/denom){...// ensures that calculating num/denom never results in divide-by-zero error   }

Consider the following example:

inta=0;if(a!=0&&myfunc(b)){do_something();}

In this example, short-circuit evaluation guarantees that myfunc(b) is never called. This is because a != 0 evaluates to false. This feature permits two useful programming constructs.

  1. If the first sub-expression checks whether an expensive computation is needed and the check evaluates to false, one can eliminate expensive computation in the second argument.
  2. It permits a construct where the first expression guarantees a condition without which the second expression may cause a run-time error.

Both are illustrated in the following C snippet where minimal evaluation prevents both null pointer dereference and excess memory fetches:

boolis_first_char_valid_alpha_unsafe(constchar*p){returnisalpha(p[0]);// SEGFAULT highly possible with p == NULL}boolis_first_char_valid_alpha(constchar*p){returnp!=NULL&&isalpha(p[0]);// 1) no unneeded isalpha() execution with p == NULL, 2) no SEGFAULT risk}

Idiomatic conditional construct

Since minimal evaluation is part of an operator's semantic definition and not an optional optimization, a number of coding idioms rely on it as a succinct conditional construct. Examples include:

Perl idioms:

some_conditionordie;# Abort execution if some_condition is falsesome_conditionanddie;# Abort execution if some_condition is true

POSIX shell idioms: [13]

modprobe-qsome_module&&echo"some_module installed"||echo"some_module not installed"

This idiom presumes that echo cannot fail.

Possible problems

Untested second condition leads to unperformed side effect

Despite these benefits, minimal evaluation may cause problems for programmers who do not realize (or forget) it is happening. For example, in the code

if(expressionA&&myfunc(b)){do_something();}

if myfunc(b) is supposed to perform some required operation regardless of whether do_something() is executed, such as allocating system resources, and expressionA evaluates as false, then myfunc(b) will not execute, which could cause problems. Some programming languages, such as Java, have two operators, one that employs minimal evaluation and one that does not, to avoid this problem.

Problems with unperformed side effect statements can be easily solved with proper programming style, i.e., not using side effects in boolean statements, as using values with side effects in evaluations tends to generally make the code opaque and error-prone. [14]

Reduced efficiency due to constraining optimizations

Short-circuiting can lead to errors in branch prediction on modern central processing units (CPUs), and dramatically reduce performance. A notable example is highly optimized ray with axis aligned box intersection code in ray tracing.[ clarification needed ] Some compilers can detect such cases and emit faster code, but programming language semantics may constrain such optimizations.[ citation needed ]

An example of a compiler unable to optimize for such a case is Java's Hotspot virtual machine (VM) as of 2012. [15]

See also

Related Research Articles

<span class="mw-page-title-main">Logical disjunction</span> Logical connective OR

In logic, disjunction, also known as logical disjunction or logical or or logical addition or inclusive disjunction, is a logical connective typically notated as and read aloud as "or". For instance, the English language sentence "it is sunny or it is warm" can be represented in logic using the disjunctive formula , assuming that abbreviates "it is sunny" and abbreviates "it is warm".

In programming language theory, lazy evaluation, or call-by-need, is an evaluation strategy which delays the evaluation of an expression until its value is needed and which also avoids repeated evaluations.

In computer science, primitive data types are a set of basic data types from which all other data types are constructed. Specifically it often refers to the limited set of data representations in use by a particular processor, which all compiled programs must use. Most processors support a similar set of primitive data types, although the specific representations vary. More generally, "primitive data types" may refer to the standard data types built into a programming language. Data types which are not primitive are referred to as derived or composite.

<span class="mw-page-title-main">Conditional (computer programming)</span> Control flow statement that executes code according to some condition(s)

In computer science, conditionals are programming language commands for handling decisions. Specifically, conditionals perform different computations or actions depending on whether a programmer-defined Boolean condition evaluates to true or false. In terms of control flow, the decision is always achieved by selectively altering the control flow based on some condition . Although dynamic dispatch is not usually classified as a conditional construct, it is another way to select between alternatives at runtime. Conditional statements are the checkpoints in the programme that determines behaviour according to situation.

In computer programming, the ternary conditional operator is a ternary operator that is part of the syntax for basic conditional expressions in several programming languages. It is commonly referred to as the conditional operator, ternary if, or inline if. An expression a ? b : c evaluates to b if the value of a is true, and otherwise to c. One can read it aloud as "if a then b otherwise c". The form a ? b : c is by far and large the most common, but alternative syntaxes do exist; for example, Raku uses the syntax a ?? b !! c to avoid confusion with the infix operators ? and !, whereas in Visual Basic .NET, it instead takes the form If(a, b, c).

In computer programming, operators are constructs defined within programming languages which behave generally like functions, but which differ syntactically or semantically.

In computer science, the Boolean is a data type that has one of two possible values which is intended to represent the two truth values of logic and Boolean algebra. It is named after George Boole, who first defined an algebraic system of logic in the mid 19th century. The Boolean data type is primarily associated with conditional statements, which allow different actions by changing control flow depending on whether a programmer-specified Boolean condition evaluates to true or false. It is a special case of a more general logical data type—logic does not always need to be Boolean.

In computer science, a relational operator is a programming language construct or operator that tests or defines some kind of relation between two entities. These include numerical equality and inequalities.

In computer programming languages, a switch statement is a type of selection control mechanism used to allow the value of a variable or expression to change the control flow of program execution via search and map.

The computer programming languages C and Pascal have similar times of origin, influences, and purposes. Both were used to design their own compilers early in their lifetimes. The original Pascal definition appeared in 1969 and a first compiler in 1970. The first version of C appeared in 1972.

<span class="mw-page-title-main">Null (SQL)</span> Marker used in SQL databases to indicate a value does not exist

In SQL, null or NULL is a special marker used to indicate that a data value does not exist in the database. Introduced by the creator of the relational database model, E. F. Codd, SQL null serves to fulfil the requirement that all true relational database management systems (RDBMS) support a representation of "missing information and inapplicable information". Codd also introduced the use of the lowercase Greek omega (ω) symbol to represent null in database theory. In SQL, NULL is a reserved word used to identify this marker.

In computer science, a Boolean expression is an expression used in programming languages that produces a Boolean value when evaluated. A Boolean value is either true or false. A Boolean expression may be composed of a combination of the Boolean constants True/Yes or False/No, Boolean-typed variables, Boolean-valued operators, and Boolean-valued functions.

<span class="mw-page-title-main">JavaScript syntax</span> Set of rules defining correctly structured programs

The syntax of JavaScript is the set of rules that define a correctly structured JavaScript program.

In computing, IIf is a function in several editions of the Visual Basic programming language and ColdFusion Markup Language (CFML), and on spreadsheets that returns the second or third parameter based on the evaluation of the first parameter. It is an example of a conditional expression, which is similar to a conditional statement.

The conditional operator is supported in many programming languages. This term usually refers to ?: as in C, C++, C#, and JavaScript. However, in Java, this term can also refer to && and ||.

In functional programming, filter is a higher-order function that processes a data structure in some order to produce a new data structure containing exactly those elements of the original data structure for which a given predicate returns the boolean value true.

The null coalescing operator is a binary operator that is part of the syntax for a basic conditional expression in several programming languages, such as : C# since version 2.0, Dart since version 1.12.0, PHP since version 7.0.0., Perl since version 5.10 as logical defined-or, PowerShell since 7.0.0, and Swift as nil-coalescing operator.

Expression templates are a C++ template metaprogramming technique that builds structures representing a computation at compile time, where expressions are evaluated only as needed to produce efficient code for the entire computation. Expression templates thus allow programmers to bypass the normal order of evaluation of the C++ language and achieve optimizations such as loop fusion.

The structure of the Perl programming language encompasses both the syntactical rules of the language and the general ways in which programs are organized. Perl's design philosophy is expressed in the commonly cited motto "there's more than one way to do it". As a multi-paradigm, dynamically typed language, Perl allows a great degree of flexibility in program design. Perl also encourages modularization; this has been attributed to the component-based design structure of its Unix roots, and is responsible for the size of the CPAN archive, a community-maintained repository of more than 100,000 modules.

In certain computer programming languages, the Elvis operator, often written ?:, is a binary operator that returns the evaluated first operand if that operand evaluates to a value likened to logically true, and otherwise returns the evaluated second operand. This is identical to a short-circuit or with "last value" semantics. The notation of the Elvis operator was inspired by the ternary conditional operator, ? :, since the Elvis operator expression A ?: B is approximately equivalent to the ternary conditional expression A ? A : B.

References

  1. Edsger W. Dijkstra "On a somewhat disappointing correspondence", EWD1009-0, 25 May 1987 full text
  2. "Shell Command Language". pubs.opengroup.org.
  3. Bergstra, Jan A.; Ponse, A.; Staudt, D.J.C. (2010). "Short-circuit logic". arXiv: 1010.3674 [cs.LO].
  4. ISO/IEC 9899 standard, sections 6.2.5, 6.3.1.2, 6.5 and 7.16.
  5. ISO/IEC 9899 standard, section 6.5.13
  6. ISO/IEC IS 14882 draft.
  7. "OCaml - the OCaml language".
  8. "operators - Documentation for Ruby 3.3". docs.ruby-lang.org. Retrieved 2024-04-02.
  9. "std::ops - Rust". doc.rust-lang.org. Retrieved 2019-02-12.
  10. ETSI ES 201 873-1 V4.10.1, section 7.1.4
  11. "Beckhoff Information System - English". infosys.beckhoff.com. Retrieved 2021-08-16.
  12. "Beckhoff Information System - English". infosys.beckhoff.com. Retrieved 2021-08-16.
  13. "What does || mean in bash?". stackexchange.com. Retrieved 2019-01-09.
  14. "Referential Transparency, Definiteness and Unfoldability" (PDF). Itu.dk. Retrieved 2013-08-24.
  15. Wasserman, Louis (11 July 2012). "Java: What are the cases in which it is better to use unconditional AND (& instead of &&)". Stack Overflow.