Switch statement

Last updated

In computer programming languages, a switch statement is a type of selection control mechanism used to allow the value of a variable or expression to change the control flow of program execution via search and map.

Contents

Switch statements function somewhat similarly to the if statement used in programming languages like C/C++, C#, Visual Basic .NET, Java and exist in most high-level imperative programming languages such as Pascal, Ada, C/C++, C#, [1] :374–375 Visual Basic .NET, Java, [2] :157–167 and in many other types of language, using such keywords as switch, case, select, or inspect.

Switch statements come in two main variants: a structured switch, as in Pascal, which takes exactly one branch, and an unstructured switch, as in C, which functions as a type of goto. The main reasons for using a switch include improving clarity, by reducing otherwise repetitive coding, and (if the heuristics permit) also offering the potential for faster execution through easier compiler optimization in many cases.

An example of a Switch statement in C
switch(age){case1:printf("You're one.");break;case2:printf("You're two.");break;case3:printf("You're three.");case4:printf("You're three or four.");break;default:printf("You're not 1, 2, 3 or 4!");}

History

In his 1952 text Introduction to Metamathematics, Stephen Kleene formally proved that the CASE function (the IF-THEN-ELSE function being its simplest form) is a primitive recursive function, where he defines the notion "definition by cases" in the following manner:

"#F. The function φ defined thus

φ(x1 , ... , xn ) =
  • φ1(x1 , ... , xn ) if Q1(x1 , ... , xn ),
  • . . . . . . . . . . . .
  • φm(x1 , ... , xn ) if Qm(x1 , ... , xn ),
  • φm+1(x1 , ... , xn ) otherwise,

where Q1 , ... , Qm are mutually exclusive predicates (or φ(x1 , ... , xn) shall have the value given by the first clause which applies) is primitive recursive in φ1, ..., φm+1, Q1, ..., Qm+1.

Stephen Kleene, [3]

Kleene provides a proof of this in terms of the Boolean-like recursive functions "sign-of" sg( ) and "not sign of" ~sg( ) (Kleene 1952:222-223); the first returns 1 if its input is positive and −1 if its input is negative.

Boolos-Burgess-Jeffrey make the additional observation that "definition by cases" must be both mutually exclusive and collectively exhaustive. They too offer a proof of the primitive recursiveness of this function (Boolos-Burgess-Jeffrey 2002:74-75).

The IF-THEN-ELSE is the basis of the McCarthy formalism: its usage replaces both primitive recursion and the mu-operator.

The earliest Fortran compilers supported the computed GOTO statement for multi-way branching. Early ALGOL compilers supported a SWITCH data type which contains a list of "designational expressions". A GOTO statement could reference a switch variable and, by providing an index, branch to the desired destination. With experience it was realized that a more formal multi-way construct, with single point of entrance and exit, was needed. Languages such as BCPL, ALGOL-W, and ALGOL-68 introduced forms of this construct which have survived through modern languages.

Typical syntax

In most languages, programmers write a switch statement across many individual lines using one or two keywords. A typical syntax involves:

Each alternative begins with the particular value, or list of values (see below), that the control variable may match and which will cause the control to goto the corresponding sequence of statements. The value (or list/range of values) is usually separated from the corresponding statement sequence by a colon or by an implication arrow. In many languages, every case must also be preceded by a keyword such as case or when.

An optional default case is typically also allowed, specified by a default, otherwise, or else keyword. This executes when none of the other cases match the control expression. In some languages, such as C, if no case matches and the default is omitted the switch statement simply does nothing. In others, like PL/I, an error is raised.

Semantics

Semantically, there are two main forms of switch statements.

The first form are structured switches, as in Pascal, where exactly one branch is taken, and the cases are treated as separate, exclusive blocks. This functions as a generalized if–then–else conditional, here with any number of branches, not just two.

The second form are unstructured switches, as in C, where the cases are treated as labels within a single block, and the switch functions as a generalized goto. This distinction is referred to as the treatment of fallthrough, which is elaborated below.

Fallthrough

In many languages, only the matching block is executed, and then execution continues at the end of the switch statement. These include the Pascal family (Object Pascal, Modula, Oberon, Ada, etc.) as well as PL/I, modern forms of Fortran and BASIC dialects influenced by Pascal, most functional languages, and many others. To allow multiple values to execute the same code (and avoid needing to duplicate code), Pascal-type languages permit any number of values per case, given as a comma-separated list, as a range, or as a combination.

Languages derived from C language, and more generally those influenced by Fortran's computed GOTO, instead feature fallthrough, where control moves to the matching case, and then execution continues ("falls through") to the statements associated with the next case in the source text. This also allows multiple values to match the same point without any special syntax: they are just listed with empty bodies. Values can be special conditioned with code in the case body. In practice, fallthrough is usually prevented with a break keyword at the end of the matching body, which exits execution of the switch block, but this can cause bugs due to unintentional fallthrough if the programmer forgets to insert the break statement. This is thus seen by many [4] as a language wart, and warned against in some lint tools. Syntactically, the cases are interpreted as labels, not blocks, and the switch and break statements explicitly change control flow. Some languages influenced by C, such as JavaScript, retain default fallthrough, while others remove fallthrough, or only allow it in special circumstances. Notable variations on this in the C-family include C#, in which all blocks must be terminated with a break or return unless the block is empty (i.e. fallthrough is used as a way to specify multiple values).

In some cases languages provide optional fallthrough. For example, Perl does not fall through by default, but a case may explicitly do so using a continue keyword. This prevents unintentional fallthrough but allows it when desired. Similarly, Bash defaults to not falling through when terminated with ;;, but allows fallthrough [5] with ;& or ;;& instead.

An example of a switch statement that relies on fallthrough is Duff's device.

Compilation

Optimizing compilers such as GCC or Clang may compile a switch statement into either a branch table or a binary search through the values in the cases. [6] A branch table allows the switch statement to determine with a small, constant number of instructions which branch to execute without having to go through a list of comparisons, while a binary search takes only a logarithmic number of comparisons, measured in the number of cases in the switch statement.

Normally, the only method of finding out if this optimization has occurred is by actually looking at the resultant assembly or machine code output that has been generated by the compiler.

Advantages and disadvantages

In some languages and programming environments, the use of a case or switch statement is considered superior to an equivalent series of if else if statements because it is:

Additionally, an optimized implementation may execute much faster than the alternative, because it is often implemented by using an indexed branch table. [7] For example, deciding program flow based on a single character's value, if correctly implemented, is vastly more efficient than the alternative, reducing instruction path lengths considerably. When implemented as such, a switch statement essentially becomes a perfect hash.

In terms of the control-flow graph, a switch statement consists of two nodes (entrance and exit), plus one edge between them for each option. By contrast, a sequence of "if...else if...else if" statements has an additional node for every case other than the first and last, together with a corresponding edge. The resulting control-flow graph for the sequences of "if"s thus has many more nodes and almost twice as many edges, with these not adding any useful information. However, the simple branches in the if statements are individually conceptually easier than the complex branch of a switch statement. In terms of cyclomatic complexity, both of these options increase it by k−1 if given k cases.

Switch expressions

Switch expressions are introduced in Java SE 12, 19 March 2019, as a preview feature. Here a whole switch expression can be used to return a value. There is also a new form of case label, case L-> where the right-hand-side is a single expression. This also prevents fall through and requires that cases are exhaustive. In Java SE 13 the yield statement is introduced, and in Java SE 14 switch expressions become a standard language feature. [8] [9] [10] For example:

intndays=switch(month){caseJAN,MAR,MAY,JUL,AUG,OCT,DEC->31;caseAPR,JUN,SEP,NOV->30;caseFEB->{if(year%400==0)yield29;elseif(year%100==0)yield28;elseif(year%4==0)yield29;elseyield28;}};

Alternative uses

Many languages evaluate expressions inside switch blocks at runtime, allowing a number of less obvious uses for the construction. This prohibits certain compiler optimizations, so is more common in dynamic and scripting languages where the enhanced flexibility is more important than the performance overhead.

PHP

For example, in PHP, a constant can be used as the "variable" to check against, and the first case statement which evaluates to that constant will be executed:

switch(true){case($x=='hello'):foo();break;case($z=='howdy'):break;}switch(5){case$x:break;case$y:break;}

This feature is also useful for checking multiple variables against one value rather than one variable against many values. COBOL also supports this form (and other forms) in the EVALUATE statement. PL/I has an alternative form of the SELECT statement where the control expression is omitted altogether and the first WHEN that evaluates to true is executed.

Ruby

In Ruby, due to its handling of === equality, the statement can be used to test for variable’s class:

caseinputwhenArraythenputs'input is an Array!'whenHashthenputs'input is a Hash!'end

Ruby also returns a value that can be assigned to a variable, and doesn’t actually require the case to have any parameters (acting a bit like an else if statement):

catfood=casewhencat.age<=1juniorwhencat.age>10seniorelsenormalend

Assembler

A switch statement in assembly language:

switch:cmpah,00hjeacmpah,01hjebjmpswtend; No cases match or "default" code herea:pushahmoval,'a'movah,0Ehmovbh,00hint10hpopahjmpswtend; Equivalent to "break"b:pushahmoval,'b'movah,0Ehmovbh,00hint10hpopahjmpswtend; Equivalent to "break"...swtend:

Python

For Python 3.10.6, PEPs 634-636 were accepted, which added match and case keywords. [11] [12] [13] [14] Unlike other languages, Python notably doesn't exhibit fallthrough behavior.

letter=input("Put in a single letter: ").strip()[0].casefold()# first non-whitespace character of the input, lowercasematchletter:case'a'|'e'|'i'|'o'|'u':# Unlike conditions in if statements, the `or` keyword cannot be used here to differentiate between casesprint(f"Letter {letter} is a vowel!")case'y':print(f"Letter {letter} may be a vowel.")case_:# `case _` is equivalent to `default` from C and othersprint(f"Letter {letter} is not a vowel!")

Exception handling

A number of languages implement a form of switch statement in exception handling, where if an exception is raised in a block, a separate branch is chosen, depending on the exception. In some cases a default branch, if no exception is raised, is also present. An early example is Modula-3, which use the TRY...EXCEPT syntax, where each EXCEPT defines a case. This is also found in Delphi, Scala, and Visual Basic .NET.

Alternatives

Some alternatives to switch statements can be:

(In some languages, only actual data types are allowed as values in the lookup table. In other languages, it is also possible to assign functions as lookup table values, gaining the same flexibility as a real switch statement. See Control table article for more detail on this).
Lua does not support case/switch statements. [15] This lookup technique is one way to implement switch statements in the Lua language, which has no built-in switch. [15]
In some cases, lookup tables are more efficient than non-optimized switch statements since many languages can optimize table lookups, whereas switch statements are not optimized unless the range of values is small with few gaps. A non-optimized, non-binary search lookup, however, will almost certainly be slower than either a non-optimized switch or the equivalent multiple if-else statements.[ citation needed ]

See also

Related Research Articles

In programming language theory, lazy evaluation, or call-by-need, is an evaluation strategy which delays the evaluation of an expression until its value is needed and which also avoids repeated evaluations.

In computer science, control flow is the order in which individual statements, instructions or function calls of an imperative program are executed or evaluated. The emphasis on explicit control flow distinguishes an imperative programming language from a declarative programming language.

In a programming language, a reserved word is a word that cannot be used by a programmer as an identifier, such as the name of a variable, function, or label – it is "reserved from use".

In computer programming, a parameter or a formal argument is a special kind of variable used in a subroutine to refer to one of the pieces of data provided as input to the subroutine. These pieces of data are the values of the arguments with which the subroutine is going to be called/invoked. An ordered list of parameters is usually included in the definition of a subroutine, so that, each time the subroutine is called, its arguments for that call are evaluated, and the resulting values can be assigned to the corresponding parameters.

<span class="mw-page-title-main">C syntax</span> Set of rules defining correctly structured programs

The syntax of the C programming language is the set of rules governing writing of software in C. It is designed to allow for programs that are extremely terse, have a close relationship with the resulting object code, and yet provide relatively high-level data abstraction. C was the first widely successful high-level language for portable operating-system development.

<span class="mw-page-title-main">Conditional (computer programming)</span> Control flow statement that executes code according to some condition(s)

In computer science, conditionals are programming language constructs that perform different computations or actions or return different values depending on the value of a Boolean expression, called a condition.

<span class="mw-page-title-main">For loop</span> Control flow statement for repeated execution

In computer science, a for-loop or for loop is a control flow statement for specifying iteration. Specifically, a for-loop functions by running a section of code repeatedly until a certain condition has been satisfied.

In computer programming, the ternary conditional operator is a ternary operator that is part of the syntax for basic conditional expressions in several programming languages. It is commonly referred to as the conditional operator, conditional expression, ternary if, or inline if. An expression if a then b else c or a ? b : c evaluates to b if the value of a is true, and otherwise to c. One can read it aloud as "if a then b otherwise c". The form a ? b : c is the most common, but alternative syntax do exist; for example, Raku uses the syntax a ?? b !! c to avoid confusion with the infix operators ? and !, whereas in Visual Basic .NET, it instead takes the form If(a, b, c).

In computer programming, a statement is a syntactic unit of an imperative programming language that expresses some action to be carried out. A program written in such a language is formed by a sequence of one or more statements. A statement may have internal components.

<span class="mw-page-title-main">Java syntax</span> Set of rules defining correctly structured program

The syntax of Java is the set of rules defining how a Java program is written and interpreted.

In computer science, a tail call is a subroutine call performed as the final action of a procedure. If the target of a tail is the same subroutine, the subroutine is said to be tail recursive, which is a special case of direct recursion. Tail recursion is particularly useful, and is often easy to optimize in implementations.

In computer programming, a branch table or jump table is a method of transferring program control (branching) to another part of a program using a table of branch or jump instructions. It is a form of multiway branch. The branch table construction is commonly used when programming in assembly language but may also be generated by compilers, especially when implementing optimized switch statements whose values are densely packed together.

In computer science and recursion theory the McCarthy Formalism (1963) of computer scientist John McCarthy clarifies the notion of recursive functions by use of the IF-THEN-ELSE construction common to computer science, together with four of the operators of primitive recursive functions: zero, successor, equality of numbers and composition. The conditional operator replaces both primitive recursion and the mu-operator.

In programming languages, a label is a sequence of characters that identifies a location within source code. In most languages, labels take the form of an identifier, often followed by a punctuation character. In many high-level languages, the purpose of a label is to act as the destination of a GOTO statement. In assembly language, labels can be used anywhere an address can. Also in Pascal and its derived variations. Some languages, such as Fortran and BASIC, support numeric labels. Labels are also used to identify an entry point into a compiled sequence of statements.

<span class="mw-page-title-main">Control table</span> Data structures that control the execution order of computer commands

Control tables are tables that control the control flow or play a major part in program control. There are no rigid rules about the structure or content of a control table—its qualifying attribute is its ability to direct control flow in some way through "execution" by a processor or interpreter. The design of such tables is sometimes referred to as table-driven design. In some cases, control tables can be specific implementations of finite-state-machine-based automata-based programming. If there are several hierarchical levels of control table they may behave in a manner equivalent to UML state machines

<span class="mw-page-title-main">Goto</span> One-way control statement in computer programming

Goto is a statement found in many computer programming languages. It performs a one-way transfer of control to another line of code; in contrast a function call normally returns control. The jumped-to locations are usually identified using labels, though some languages use line numbers. At the machine code level, a goto is a form of branch or jump statement, in some cases combined with a stack adjustment. Many languages support the goto statement, and many do not.

In computer programming, a function is a callable unit of software logic that has a well-defined interface and behavior and can be invoked multiple times.

re2c is a free and open-source lexer generator for C, C++, D, Go, Haskell, Java, JavaScript, OCaml, Python, Rust, V and Zig. It compiles declarative regular expression specifications to deterministic finite automata. Originally written by Peter Bumbulis and described in his paper, re2c was put in public domain and has been since maintained by volunteers. It is the lexer generator adopted by projects such as PHP, SpamAssassin, Ninja build system and others. Together with the Lemon parser generator, re2c is used in BRL-CAD. This combination is also used with STEPcode, an implementation of ISO 10303 standard.

The syntax of the SQL programming language is defined and maintained by ISO/IEC SC 32 as part of ISO/IEC 9075. This standard is not freely available. Despite the existence of the standard, SQL code is not completely portable among different database systems without adjustments.

References

  1. Skeet, Jon (23 March 2019). C# in Depth. Manning. ISBN   978-1617294532.
  2. Bloch, Joshua (2018). "Effective Java: Programming Language Guide" (third ed.). Addison-Wesley. ISBN   978-0134685991.
  3. "Definition by cases", Kleene 1952:229
  4. van der Linden, Peter (1994). Expert C Programming: Deep C Secrets, p. 38. Prentice Hall, Eaglewood Cliffs. ISBN   0131774298.
  5. since version 4.0, released in 2009.
  6. Vlad Lazarenko. From Switch Statement Down to Machine Code
  7. Guntheroth, Kurt (April 27, 2016). Optimized C++. O'Reilly Media. p. 182. ISBN   9781491922033.
  8. "JEP 325: Switch Expressions (Preview)". openjdk.java.net. Retrieved 2021-04-28.
  9. "JEP 354: Switch Expressions (Second Preview)". openjdk.java.net. Retrieved 2021-04-28.
  10. "JEP 361: Switch Expressions". openjdk.java.net. Retrieved 2021-04-28.
  11. Galindo Salgado, Pablo. "What's New In Python 3.10". Python 3.10.6 documentation. Retrieved 2022-08-19.
  12. Bucher, Brandt; van Rossum, Guido (2020-09-12). "PEP 634 – Structural Pattern Matching: Specification". Python Enhancement Proposals. Retrieved 2022-08-19.
  13. Kohn, Tobias; van Rossum, Guido (2020-09-12). "PEP 635 – Structural Pattern Matching: Motivation and Rationale". Python Enhancement Proposals. Retrieved 2022-08-19.
  14. Moisset, Daniel F. "PEP 636 – Structural Pattern Matching: Tutorial". Python Enhancement Proposals. Retrieved 2022-08-19.
  15. 1 2 Switch statement in Lua

Further reading