Short-circuit evaluation

Last updated

Short-circuit evaluation, minimal evaluation, or McCarthy evaluation (after John McCarthy) is the semantics of some Boolean operators in some programming languages in which the second argument is executed or evaluated only if the first argument does not suffice to determine the value of the expression: when the first argument of the AND function evaluates to false, the overall value must be false; and when the first argument of the OR function evaluates to true, the overall value must be true.

Contents

In programming languages with lazy evaluation (Lisp, Perl, Haskell), the usual Boolean operators short-circuit. In others (Ada, Java, Delphi), both short-circuit and standard Boolean operators are available. For some Boolean operations, like exclusive or (XOR), it is impossible to short-circuit, because both operands are always needed to determine a result.

Short-circuit operators are, in effect, control structures rather than simple arithmetic operators, as they are not strict. In imperative language terms (notably C and C++), where side effects are important, short-circuit operators introduce a sequence point: they completely evaluate the first argument, including any side effects, before (optionally) processing the second argument. ALGOL 68 used proceduring to achieve user-defined short-circuit operators and procedures.

The use of short-circuit operators has been criticized as problematic:

The conditional connectives — "cand" and "cor" for short — are ... less innocent than they might seem at first sight. For instance, cor does not distribute over cand: compare

(A cand B) cor C with (A cor C) cand (B cor C);

in the case ¬A ∧ C , the second expression requires B to be defined, the first one does not. Because the conditional connectives thus complicate the formal reasoning about programs, they are better avoided.

Definition

In any programming language that implements short-circuit evaluation, the expression x and y is equivalent to the conditional expression if x then y else x, and the expression x or y is equivalent to if x then x else y. In either case, x is only evaluated once.

The generalized definition above accommodates loosely typed languages that have more than the two truth-values True and False, where short-circuit operators may return the last evaluated subexpression. This is called "last value" in the table below. For a strictly-typed language, the expression is simplified to if x then y else false and if x then true else y respectively for the boolean case.

Precedence

Although AND takes precedence over OR in many languages, this is not a universal property of short-circuit evaluation. An example of the two operators taking the same precedence and being left-associative with each other is POSIX shell's command-list syntax. [2] :§2.9.3

The following simple left-to-right evaluator enforces a precedence of AND over OR by a continue:

function short-circuit-eval (operators, values)     letresult := True     for each (op, val) in (operators, values):         ifop = "AND" && result = False             continueelse ifop = "OR" && result = True             returnresultelseresult := valreturnresult

Formalization

Short-circuit logic, with or without side-effects, have been formalized based on Hoare's conditional. A result is that non-short-circuiting operators can be defined out of short-circuit logic to have the same sequence of evaluation. [3]

Support in common programming and scripting languages

Boolean operators in various languages
Language Eager operatorsShort-circuit operatorsResult type
Advanced Business Application Programming (ABAP)noneand, orBoolean [lower-alpha 1]
Ada and, orand then, or elseBoolean
ALGOL 68 and, &, ∧ ; or, ∨andf , orf (both user defined)Boolean
APL , , (nand), (nor), etc.:AndIf, :OrIfBoolean [lower-alpha 1]
awk none&&, ||Boolean
Bash none&&, ||Boolean
C, Objective-C &, | [lower-alpha 2] &&, ||, ? [5] int (&, |, &&,||), opnd-dependent (?)
C++ [lower-alpha 3] none&&, ||, ? [6] Boolean (&&,||), opnd-dependent (?)
C# &, |&&, ||, ?, ??Boolean (&&,||), opnd-dependent (?, ??)
ColdFusion Markup Language (CFML)noneAND, OR, &&, ||Boolean
D [lower-alpha 4] &, |&&, ||, ?Boolean (&&,||), opnd-dependent (?)
Eiffel and, orand then, or elseBoolean
Erlang and, orandalso, orelseBoolean
Fortran [lower-alpha 5] .and., .or..and., .or.Boolean
Go, Haskell, OCaml none&&, ||Boolean
Java, MATLAB, R, Swift &, |&&, ||Boolean
JavaScript, Julia &, |&&, ||Last value
Lasso noneand, or, &&, ||Last value
Kotlin and, or&&, ||Boolean
Lisp, Lua, Scheme noneand, orLast value
MUMPS (M)&, !noneNumeric
Modula-2 noneAND, ORBoolean
Oberon none&, ORBoolean
OCaml land, lor [7] &&, ||Boolean
Pascal and, or [lower-alpha 6] [lower-alpha 7] and_then, or_else [lower-alpha 7] Boolean
Perl &, |&&, and, ||, orLast value
PHP &, |&&, and, ||, orBoolean
POSIX shell (command list)none&&, ||Last value (exit)
PowerShell Scripting Languagenone-and, -orBoolean
Python &, |and, orLast value
Ruby &, |&&, and, ||, or [8] Last value
Rust &, |&&, || [9] Boolean
Smalltalk &, |and:, or: [lower-alpha 8] Boolean
Standard ML Un­knownandalso, orelseBoolean
TTCN-3 noneand, or [10] Boolean
Beckhoff TwinCAT® (IEC 61131-3) [lower-alpha 9] AND, ORAND_THEN, [11] OR_ELSE [12] Boolean
Visual Basic .NET And, OrAndAlso, OrElseBoolean
Visual Basic, Visual Basic for Applications (VBA)And, OrSelect Case [lower-alpha 10] Numeric
Wolfram Language And @@ {...}, Or @@ {...}And, Or, &&, ||Boolean
ZTT&, |noneBoolean
  1. 1 2 ABAP and APL have no distinct boolean type.
  2. The bitwise operators behave like boolean operators when both arguments are of type bool or take only the values 0 or 1. [4]
  3. When overloaded, the operators && and || are eager and can return any type.
  4. This only applies to runtime-evaluated expressions, static if and static assert. Expressions in static initializers or manifest constants use eager evaluation.
  5. Fortran operators are neither short-circuit nor eager: the language specification allows the compiler to select the method for optimization.
  6. ISO/IEC 10206:1990 Extended Pascal allows, but does not require, short-circuiting.
  7. 1 2 Delphi and Free Pascal default to short circuit evaluation. This may be changed by compiler options but does not seem to be used widely.
  8. Smalltalk uses short-circuit semantics as long as the argument to and: is a block (e.g., false and: [Transcript show: 'Wont see me']).
  9. The norm IEC 61131-3 doesn't actually define if AND and OR use short-circuit evaluation and it doesn't define the operators AND_THEN and OR_ELSE. The entries in the table show how it works for Beckhoff TwinCAT®.
  10. BASIC languages that supported CASE statements did so by using the conditional evaluation system, rather than as jump tables limited to fixed labels.

Common use

Avoiding undesired side effects of the second argument

Usual example, using a C-based language:

intdenom=0;if(denom!=0&&num/denom){...// ensures that calculating num/denom never results in divide-by-zero error   }

Consider the following example:

inta=0;if(a!=0&&myfunc(b)){do_something();}

In this example, short-circuit evaluation guarantees that myfunc(b) is never called. This is because a != 0 evaluates to false. This feature permits two useful programming constructs.

  1. If the first sub-expression checks whether an expensive computation is needed and the check evaluates to false, one can eliminate expensive computation in the second argument.
  2. It permits a construct where the first expression guarantees a condition without which the second expression may cause a run-time error.

Both are illustrated in the following C snippet where minimal evaluation prevents both null pointer dereference and excess memory fetches:

boolis_first_char_valid_alpha_unsafe(constchar*p){returnisalpha(p[0]);// SEGFAULT highly possible with p == NULL}boolis_first_char_valid_alpha(constchar*p){returnp!=NULL&&isalpha(p[0]);// 1) no unneeded isalpha() execution with p == NULL, 2) no SEGFAULT risk}

Idiomatic conditional construct

Since minimal evaluation is part of an operator's semantic definition and not an optional optimization, a number of coding idioms rely on it as a succinct conditional construct. Examples include:

Perl idioms:

some_conditionordie;# Abort execution if some_condition is falsesome_conditionanddie;# Abort execution if some_condition is true

POSIX shell idioms: [13]

modprobe-qsome_module&&echo"some_module installed"||echo"some_module not installed"

This idiom presumes that echo cannot fail.

Possible problems

Untested second condition leads to unperformed side effect

Despite these benefits, minimal evaluation may cause problems for programmers who do not realize (or forget) it is happening. For example, in the code

if(expressionA&&myfunc(b)){do_something();}

if myfunc(b) is supposed to perform some required operation regardless of whether do_something() is executed, such as allocating system resources, and expressionA evaluates as false, then myfunc(b) will not execute, which could cause problems. Some programming languages, such as Java, have two operators, one that employs minimal evaluation and one that does not, to avoid this problem.

Problems with unperformed side effect statements can be easily solved with proper programming style, i.e., not using side effects in boolean statements, as using values with side effects in evaluations tends to generally make the code opaque and error-prone. [14]

Reduced efficiency due to constraining optimizations

Short-circuiting can lead to errors in branch prediction on modern central processing units (CPUs), and dramatically reduce performance. A notable example is highly optimized ray with axis aligned box intersection code in ray tracing.[ clarification needed ] Some compilers can detect such cases and emit faster code, but programming language semantics may constrain such optimizations.[ citation needed ]

An example of a compiler unable to optimize for such a case is Java's Hotspot virtual machine (VM) as of 2012. [15]

See also

Related Research Articles

<span class="mw-page-title-main">Logical connective</span> Symbol connecting sentential formulas in logic

In logic, a logical connective is a logical constant. Connectives can be used to connect logical formulas. For instance in the syntax of propositional logic, the binary connective can be used to join the two atomic formulas and , rendering the complex formula .

In programming language theory, lazy evaluation, or call-by-need, is an evaluation strategy which delays the evaluation of an expression until its value is needed and which also avoids repeated evaluations.

<span class="mw-page-title-main">Conditional (computer programming)</span> Control flow statement that executes code according to some condition(s)

In computer science, conditionals are programming language commands for handling decisions. Specifically, conditionals perform different computations or actions depending on whether a programmer-defined Boolean condition evaluates to true or false. In terms of control flow, the decision is always achieved by selectively altering the control flow based on some condition . Although dynamic dispatch is not usually classified as a conditional construct, it is another way to select between alternatives at runtime. Conditional statements are the checkpoints in the programme that determines behaviour according to situation.

This is a list of operators in the C and C++ programming languages. All the operators listed exist in C++; the column "Included in C", states whether an operator is also present in C. Note that C does not support operator overloading.

In computer programming, the ternary conditional operator is a ternary operator that is part of the syntax for basic conditional expressions in several programming languages. It is commonly referred to as the conditional operator, ternary if, or inline if. An expression a ? b : c evaluates to b if the value of a is true, and otherwise to c. One can read it aloud as "if a then b otherwise c". The form a ? b : c is by far and large the most common, but alternative syntaxes do exist; for example, Raku uses the syntax a ?? b !! c to avoid confusion with the infix operators ? and !, whereas in Visual Basic .NET, it instead takes the form If(a, b, c).

In computer programming, operators are constructs defined within programming languages which behave generally like functions, but which differ syntactically or semantically.

In computer science, the Boolean is a data type that has one of two possible values which is intended to represent the two truth values of logic and Boolean algebra. It is named after George Boole, who first defined an algebraic system of logic in the mid 19th century. The Boolean data type is primarily associated with conditional statements, which allow different actions by changing control flow depending on whether a programmer-specified Boolean condition evaluates to true or false. It is a special case of a more general logical data type—logic does not always need to be Boolean.

In computer science, a relational operator is a programming language construct or operator that tests or defines some kind of relation between two entities. These include numerical equality and inequalities.

The computer programming languages C and Pascal have similar times of origin, influences, and purposes. Both were used to design their own compilers early in their lifetimes. The original Pascal definition appeared in 1969 and a first compiler in 1970. The first version of C appeared in 1972.

<span class="mw-page-title-main">Null (SQL)</span> Marker used in SQL databases to indicate a value does not exist

In SQL, null or NULL is a special marker used to indicate that a data value does not exist in the database. Introduced by the creator of the relational database model, E. F. Codd, SQL null serves to fulfil the requirement that all true relational database management systems (RDBMS) support a representation of "missing information and inapplicable information". Codd also introduced the use of the lowercase Greek omega (ω) symbol to represent null in database theory. In SQL, NULL is a reserved word used to identify this marker.

In computer science, a Boolean expression is an expression used in programming languages that produces a Boolean value when evaluated. A Boolean value is either true or false. A Boolean expression may be composed of a combination of the Boolean constants true or false, Boolean-typed variables, Boolean-valued operators, and Boolean-valued functions.

<span class="mw-page-title-main">JavaScript syntax</span> Set of rules defining correctly structured programs

The syntax of JavaScript is the set of rules that define a correctly structured JavaScript program.

<span class="mw-page-title-main">Recursion (computer science)</span> Use of functions that call themselves

In computer science, recursion is a method of solving a computational problem where the solution depends on solutions to smaller instances of the same problem. Recursion solves such recursive problems by using functions that call themselves from within their own code. The approach can be applied to many types of problems, and recursion is one of the central ideas of computer science.

The power of recursion evidently lies in the possibility of defining an infinite set of objects by a finite statement. In the same manner, an infinite number of computations can be described by a finite recursive program, even if this program contains no explicit repetitions.

In computing, IIf is a function in several editions of the Visual Basic programming language and ColdFusion Markup Language (CFML), and on spreadsheets that returns the second or third parameter based on the evaluation of the first parameter. It is an example of a conditional expression, which is similar to a conditional statement.

The conditional operator is supported in many programming languages. This term usually refers to ?: as in C, C++, C#, and JavaScript. However, in Java, this term can also refer to && and ||.

In functional programming, filter is a higher-order function that processes a data structure in some order to produce a new data structure containing exactly those elements of the original data structure for which a given predicate returns the boolean value true.

The null coalescing operator is a binary operator that is part of the syntax for a basic conditional expression in several programming languages, such as : C# since version 2.0, Dart since version 1.12.0, PHP since version 7.0.0., Perl since version 5.10 as logical defined-or, PowerShell since 7.0.0, and Swift as nil-coalescing operator.

The structure of the Perl programming language encompasses both the syntactical rules of the language and the general ways in which programs are organized. Perl's design philosophy is expressed in the commonly cited motto "there's more than one way to do it". As a multi-paradigm, dynamically typed language, Perl allows a great degree of flexibility in program design. Perl also encourages modularization; this has been attributed to the component-based design structure of its Unix roots, and is responsible for the size of the CPAN archive, a community-maintained repository of more than 100,000 modules.

In certain computer programming languages, the Elvis operator, often written ?:, is a binary operator that returns the evaluated first operand if that operand evaluates to a value likened to logically true, and otherwise returns the evaluated second operand. This is identical to a short-circuit or with "last value" semantics. The notation of the Elvis operator was inspired by the ternary conditional operator, ? :, since the Elvis operator expression A ?: B is approximately equivalent to the ternary conditional expression A ? A : B.

The syntax of the SQL programming language is defined and maintained by ISO/IEC SC 32 as part of ISO/IEC 9075. This standard is not freely available. Despite the existence of the standard, SQL code is not completely portable among different database systems without adjustments.

References

  1. Edsger W. Dijkstra "On a somewhat disappointing correspondence", EWD1009-0, 25 May 1987 full text
  2. "Shell Command Language". pubs.opengroup.org.
  3. Bergstra, Jan A.; Ponse, A.; Staudt, D.J.C. (2010). "Short-circuit logic". arXiv: 1010.3674 [cs.LO].
  4. ISO/IEC 9899 standard, sections 6.2.5, 6.3.1.2, 6.5 and 7.16.
  5. ISO/IEC 9899 standard, section 6.5.13
  6. ISO/IEC IS 14882 draft.
  7. "OCaml - the OCaml language".
  8. "operators - Documentation for Ruby 3.3". docs.ruby-lang.org. Retrieved 2024-04-02.
  9. "std::ops - Rust". doc.rust-lang.org. Retrieved 2019-02-12.
  10. ETSI ES 201 873-1 V4.10.1, section 7.1.4
  11. "Beckhoff Information System - English". infosys.beckhoff.com. Retrieved 2021-08-16.
  12. "Beckhoff Information System - English". infosys.beckhoff.com. Retrieved 2021-08-16.
  13. "What does || mean in bash?". stackexchange.com. Retrieved 2019-01-09.
  14. "Referential Transparency, Definiteness and Unfoldability" (PDF). Itu.dk. Retrieved 2013-08-24.
  15. Wasserman, Louis (11 July 2012). "Java: What are the cases in which it is better to use unconditional AND (& instead of &&)". Stack Overflow.