Boolean data type

In computer science, the Boolean (sometimes shortened to Bool) is a data type that has one of two possible values (usually denoted true and false), intended to represent the two truth values of logic and Boolean algebra. It is named after George Boole, who first defined an algebraic system of logic in the mid-19th century. The Boolean data type is primarily associated with conditional statements, which allow different actions by changing control flow depending on whether a programmer-specified Boolean condition evaluates to true or false. It is a special case of a more general logical data type; logic does not always need to be Boolean (see probabilistic logic).

Generalities

In programming languages with a built-in Boolean data type, such as Pascal and Java, the comparison operators such as > and ≠ are usually defined to return a Boolean value. Conditional and iterative commands may be defined to test Boolean-valued expressions.

Languages with no explicit Boolean data type, like C90 and Lisp, may still represent truth values by some other data type. Common Lisp uses an empty list for false, and any other value for true. The C programming language uses an integer type, where relational expressions like i > j and logical expressions connected by && and || are defined to have value 1 if true and 0 if false, whereas the test parts of if, while, for, etc., treat any non-zero value as true. [1] [2] Indeed, a Boolean variable may be regarded (and implemented) as a numerical variable with one binary digit (bit), or as a bit string of length one, which can store only two values. In practice, Booleans are most often stored as a full word rather than a single bit, usually because of the way computers transfer blocks of information.

Most programming languages, even those with no explicit Boolean type, have support for Boolean algebraic operations such as conjunction (AND, &, *), disjunction (OR, |, +), equivalence (EQV, =, ==), exclusive or/non-equivalence (XOR, NEQV, ^, !=, ¬), and negation (NOT, ~, !, ¬).
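As a minimal sketch, the operations listed above can be tabulated in Python, which is used here purely as an illustration language. The helper name truth_table is hypothetical, and Python's ==, != and not stand in for equivalence, exclusive or and negation.

```python
from itertools import product

def truth_table():
    """Return rows (a, b, AND, OR, EQV, XOR, NOT a) for all Boolean inputs."""
    rows = []
    for a, b in product([False, True], repeat=2):
        rows.append((a, b, a and b, a or b, a == b, a != b, not a))
    return rows

for row in truth_table():
    print(row)
```

Each row lists the two inputs followed by conjunction, disjunction, equivalence, exclusive or, and the negation of the first input.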

In some languages, like Ruby, Smalltalk, and Alice, the true and false values belong to separate classes, e.g., True and False, respectively, so there is no single Boolean type.

In SQL, which uses a three-valued logic for explicit comparisons because of its special treatment of Nulls, the Boolean data type (introduced in SQL:1999) is also defined to include more than two truth values, so that SQL Booleans can store all logical values resulting from the evaluation of predicates in SQL. A column of Boolean type can be restricted to just TRUE and FALSE though.

Language-specific implementations

ALGOL and the built-in BOOLEAN type

One of the earliest programming languages to provide an explicit BOOLEAN data type is ALGOL 60 (1960) with values true and false and logical operators denoted by symbols '∧' (and), '∨' (or), '⊃' (implies), '≡' (equivalence), and '¬' (not). Due to input device and character set limits on many computers of the time, however, most compilers used alternative representations for many of the operators, such as AND or 'AND'.

This approach with BOOLEAN as a built-in (either primitive or otherwise predefined) data type was adopted by many later programming languages, such as Simula 67 (1967), ALGOL 68 (1970), [3] Pascal (1970), Ada (1980), Java (1995), and C# (2000), among others.

Fortran

The first version of FORTRAN (1957) and its successor FORTRAN II (1958) have no logical values or operations; even the conditional IF statement takes an arithmetic expression and branches to one of three locations according to its sign; see arithmetic IF. FORTRAN IV (1962), however, follows the ALGOL 60 example by providing a Boolean data type (LOGICAL), truth literals (.TRUE. and .FALSE.), Boolean-valued numeric comparison operators (.EQ., .GT., etc.), and logical operators (.NOT., .AND., .OR.). In FORMAT statements, a specific format descriptor ('L') is provided for the parsing or formatting of logical values. [4]
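The three-way branch of the arithmetic IF can be sketched in Python as follows. This is only a model of the behaviour described above: the function name arithmetic_if and the use of return values in place of statement labels are assumptions made for illustration.

```python
def arithmetic_if(value, negative, zero, positive):
    """Pick one of three branch targets according to the sign of value."""
    if value < 0:
        return negative
    if value == 0:
        return zero
    return positive

# Roughly analogous to the FORTRAN statement: IF (X - Y) 10, 20, 30
x, y = 3, 5
label = arithmetic_if(x - y, 10, 20, 30)
print(label)  # 10, because x - y is negative
```

No logical value is involved at any point; the decision rests entirely on the sign of an arithmetic expression.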

Lisp and Scheme

The language Lisp (1958) never had a built-in Boolean data type. Instead, conditional constructs like cond assume that the logical value false is represented by the empty list (), which is defined to be the same as the special atom nil or NIL; whereas any other s-expression is interpreted as true. For convenience, most modern dialects of Lisp predefine the atom t to have value t, so that t can be used as a mnemonic notation for true.

This approach (any value can be used as a Boolean value) was retained in most Lisp dialects (Common Lisp, Scheme, Emacs Lisp), and similar models were adopted by many scripting languages, even ones having a distinct Boolean type or Boolean values, although which values are interpreted as false and which as true varies from language to language. In Scheme, for example, the false value is an atom distinct from the empty list, so the latter is interpreted as true. Common Lisp, on the other hand, also provides the dedicated boolean type, derived as a specialization of the symbol. [5]

Pascal, Ada, and Haskell

The language Pascal (1970) popularized the concept of programmer-defined enumerated types, previously available with different nomenclature in COBOL, FACT and JOVIAL. A built-in Boolean data type was then provided as a predefined enumerated type with values FALSE and TRUE. By definition, all comparisons, logical operations, and conditional statements applied to and/or yielded Boolean values. Otherwise, the Boolean type had all the facilities which were available for enumerated types in general, such as ordering and use as indices. In contrast, converting between Booleans and integers (or any other types) still required explicit tests or function calls, as in ALGOL 60. This approach (Boolean is an enumerated type) was adopted by most later languages which had enumerated types, such as Modula, Ada, and Haskell.

C, C++, Objective-C, AWK

Initial implementations of the language C (1972) provided no Boolean type, and to this day Boolean values are commonly represented by integers (ints) in C programs. The comparison operators (>, ==, etc.) are defined to return a signed integer (int) result, either 0 (for false) or 1 (for true). Logical operators (&&, ||, !, etc.) and condition-testing statements (if, while) assume that zero is false and all other values are true.

After enumerated types (enums) were added to the American National Standards Institute version of C, ANSI C (1989), many C programmers got used to defining their own Boolean types as such, for readability reasons. However, enumerated types are equivalent to integers according to the language standards; so the effective identity between Booleans and integers is still valid for C programs.

Standard C (since C99) provides a Boolean type, called _Bool. By including the header stdbool.h, one can use the more intuitive name bool and the constants true and false. The language guarantees that any two true values will compare equal (which was impossible to achieve before the introduction of the type). Boolean values still behave as integers, can be stored in integer variables, and used anywhere integers would be valid, including in indexing, arithmetic, parsing, and formatting. This approach (Boolean values are just integers) has been retained in all later versions of C. Note that this does not mean that any integer value can be stored in a Boolean variable.
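The guarantee that any two true values compare equal follows from how C converts a scalar to _Bool: a value that compares equal to zero becomes 0 and every other value becomes 1. The following is a Python model of that rule, not C code, and the helper name to_c_bool is hypothetical.

```python
def to_c_bool(value):
    """Model C99 conversion to _Bool: zero becomes 0, anything else becomes 1."""
    return 0 if value == 0 else 1

a, b = 2, 3
print(a == b)                          # False: as plain integers they differ
print(to_c_bool(a) == to_c_bool(b))    # True: both normalise to 1
print(to_c_bool(0.5))                  # 1: a non-zero fraction is still true
print(to_c_bool(7) + to_c_bool(9))     # 2: the results still behave as integers
```

Before the type existed, two "true" integers such as 2 and 3 could not be relied on to compare equal, which is the situation the first line of output reproduces.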

C++ has a separate Boolean data type bool, but with automatic conversions from scalar and pointer values that are very similar to those of C. This approach was adopted also by many later languages, especially by some scripting languages such as AWK.

Objective-C also has a separate Boolean data type BOOL, with possible values being YES or NO, equivalents of true and false respectively. [6] Also, in Objective-C compilers that support C99, C's _Bool type can be used, since Objective-C is a superset of C.

Java

In Java, the value of the boolean data type can only be either true or false. [7]

Perl and Lua

Perl has no Boolean data type. Instead, any value can behave as Boolean in Boolean context (condition of if or while statement, argument of && or ||, etc.). The number 0, the strings "0" and "", the empty list (), and the special value undef evaluate to false. [8] All else evaluates to true.

Lua has a Boolean data type, but non-Boolean values can also behave as Booleans. The non-value nil evaluates to false, whereas every other data type value evaluates to true. This includes the empty string "" and the number 0, which are very often considered false in other languages.
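The two rule sets can be compared side by side with a small Python model. The function names perl_truthy and lua_truthy are hypothetical; None stands in for undef and nil, and an empty Python list stands in for Perl's empty list.

```python
def perl_truthy(value):
    """Model of Perl's rule: 0, "0", "", the empty list and undef are false."""
    if value is None:                                   # undef
        return False
    if isinstance(value, (int, float)) and value == 0:  # the number 0
        return False
    if isinstance(value, str) and value in ("", "0"):   # the strings "" and "0"
        return False
    if isinstance(value, list) and len(value) == 0:     # the empty list ()
        return False
    return True

def lua_truthy(value):
    """Model of Lua's rule: only nil and false are false."""
    return value is not None and value is not False

for sample in [0, "0", "", [], None, "0.0", "a"]:
    print(repr(sample), perl_truthy(sample), lua_truthy(sample))
```

Under this model the number 0 and the empty string are false by the Perl rule but true by the Lua rule, while a string such as "0.0" is true under both.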

PL/I

PL/I has no Boolean data type. Instead, comparison operators generate BIT(1) values; '0'B represents false and '1'B represents true. The operands of, e.g., &, |, ¬, are converted to bit strings and the operations are performed on each bit. The element-expression of an IF statement is true if any bit is 1.
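A rough Python model of this bit-string behaviour is given below. It represents bit strings as Python strings of '0' and '1' characters and assumes both operands already have the same length; the function names are hypothetical.

```python
def bit_and(a, b):
    """Bit-by-bit AND of two equal-length bit strings such as '1100'."""
    return "".join("1" if x == "1" and y == "1" else "0" for x, y in zip(a, b))

def bit_or(a, b):
    """Bit-by-bit OR of two equal-length bit strings."""
    return "".join("1" if x == "1" or y == "1" else "0" for x, y in zip(a, b))

def bit_not(a):
    """Invert every bit of a bit string."""
    return "".join("0" if x == "1" else "1" for x in a)

def if_condition(bits):
    """An IF test on a bit string succeeds when any bit is 1."""
    return "1" in bits

print(bit_and("1100", "1010"))   # 1000
print(bit_or("1100", "1010"))    # 1110
print(bit_not("10"))             # 01
print(if_condition("0100"))      # True
print(if_condition("0000"))      # False
```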

Rexx

Rexx has no Boolean data type. Instead, comparison operators generate 0 or 1; 0 represents false and 1 represents true. The operands of, e.g., &, |, ¬, must be 0 or 1.

Tcl

Tcl has no separate Boolean type. Like in C, the integers 0 (false) and 1 (true—in fact any nonzero integer) are used. [9]

Examples of coding:

set v 1
if { $v } { puts "V is 1 or true" }

The above will show V is 1 or true since the expression evaluates to 1.

set v ""
if { $v } ....

The above will render an error, as variable v cannot be evaluated as 0 or 1.

Python, Ruby, and JavaScript

Python, from version 2.3 forward, has a bool type which is a subclass of int, the standard integer type. [10] It has two possible values: True and False, which are special versions of 1 and 0 respectively and behave as such in arithmetic contexts. Also, a numeric value of zero (integer or fractional), the null value (None), the empty string, and empty containers (lists, sets, etc.) are considered Boolean false; all other values are considered Boolean true by default. [11] Classes can define how their instances are treated in a Boolean context through the special method __nonzero__ (Python 2) or __bool__ (Python 3). For containers, __len__ (the special method for determining the length of containers) is used if the explicit Boolean conversion method is not defined.
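The behaviour described above can be checked directly in Python 3. The classes Basket and Switch are hypothetical examples introduced only for this sketch.

```python
class Basket:
    """Hypothetical container: no __bool__, so truthiness falls back to __len__."""
    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)

class Switch:
    """Hypothetical class that defines __bool__ explicitly (Python 3)."""
    def __init__(self, on):
        self.on = on

    def __bool__(self):
        return self.on

print(issubclass(bool, int))                      # True
print(True + True)                                # 2
print(bool(0.0), bool(None), bool(""), bool([]))  # False False False False
print(bool(Basket([])), bool(Basket([1])))        # False True
print(bool(Switch(False)))                        # False
```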

In Ruby, in contrast, only nil (Ruby's null value) and a special false object are false; all else (including the integer 0 and empty arrays) is true.

In JavaScript, the empty string (""), null, undefined, NaN, +0, −0 and false [12] are sometimes called falsy (of which the complement is truthy) to distinguish between strictly type-checked and coerced Booleans. [13] As opposed to Python, empty containers (Arrays, Maps, Sets) are considered truthy. Languages such as PHP also use this approach.
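To make the contrast with Python concrete, the JavaScript rule can be modelled in Python and compared with Python's built-in bool. The function js_truthy is a hypothetical model of the list above, not an implementation of the ECMAScript conversion; None stands in for both null and undefined.

```python
def js_truthy(value):
    """Model of JavaScript truthiness for the falsy values listed above."""
    if value is None or value is False:        # null / undefined, false
        return False
    if isinstance(value, str):
        return value != ""                     # only the empty string is falsy
    if isinstance(value, (int, float)):
        # +0, -0 and NaN are falsy (NaN is the only value unequal to itself)
        return not (value == 0 or value != value)
    return True                                # objects, including empty arrays

for sample in ["", 0, -0.0, float("nan"), [], {}, "0"]:
    print(repr(sample), js_truthy(sample), bool(sample))
```

In this model the empty list and empty dictionary come out truthy under the JavaScript rule and false under Python's own rule, which is the difference the paragraph above points out.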

SQL

Booleans appear in SQL wherever a condition is needed, such as in a WHERE clause, in the form of a predicate produced by operators such as the comparison operators, the IN operator, IS (NOT) NULL, etc. However, apart from TRUE and FALSE, these operators can also yield a third state, called UNKNOWN, when a comparison with NULL is made.

The SQL92 standard introduced the IS (NOT) TRUE, IS (NOT) FALSE, and IS (NOT) UNKNOWN operators, which evaluate a predicate; these predate the introduction of the Boolean type in SQL:1999.

The SQL:1999 standard introduced a BOOLEAN data type as an optional feature (T031). When restricted by a NOT NULL constraint, a SQL BOOLEAN behaves like Booleans in other languages, which can store only TRUE and FALSE values. However, if it is nullable, which is the default as for all other SQL data types, it can also have the special null value. Although the SQL standard defines three literals for the BOOLEAN type (TRUE, FALSE, and UNKNOWN), it also says that the NULL BOOLEAN and UNKNOWN "may be used interchangeably to mean exactly the same thing". [14] [15] This has caused some controversy because the identification subjects UNKNOWN to the equality comparison rules for NULL. More precisely, UNKNOWN = UNKNOWN is not TRUE but UNKNOWN/NULL. [16] As of 2012, few major SQL systems implement the T031 feature. [17] Firebird and PostgreSQL are notable exceptions, although PostgreSQL implements no UNKNOWN literal; NULL can be used instead. [18]
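The three-valued logic described in this section can be sketched in Python, using None to stand in for UNKNOWN (and for a NULL BOOLEAN). The function names are hypothetical, and the sketch models only the truth tables, not any particular database system.

```python
UNKNOWN = None  # stands in for SQL's UNKNOWN / NULL BOOLEAN

def sql_not(a):
    """NOT UNKNOWN is UNKNOWN; otherwise ordinary negation."""
    return UNKNOWN if a is UNKNOWN else (not a)

def sql_and(a, b):
    """FALSE dominates AND; otherwise any UNKNOWN makes the result UNKNOWN."""
    if a is False or b is False:
        return False
    if a is UNKNOWN or b is UNKNOWN:
        return UNKNOWN
    return True

def sql_or(a, b):
    """TRUE dominates OR; otherwise any UNKNOWN makes the result UNKNOWN."""
    if a is True or b is True:
        return True
    if a is UNKNOWN or b is UNKNOWN:
        return UNKNOWN
    return False

def sql_eq(a, b):
    """Equality under NULL rules: a comparison involving NULL is UNKNOWN."""
    if a is UNKNOWN or b is UNKNOWN:
        return UNKNOWN
    return a == b

print(sql_eq(UNKNOWN, UNKNOWN))   # None, i.e. UNKNOWN rather than TRUE
print(sql_and(False, UNKNOWN))    # False
print(sql_or(True, UNKNOWN))      # True
print(sql_and(True, UNKNOWN))     # None
```

The first line of output is the case behind the controversy mentioned above: comparing UNKNOWN with itself does not yield TRUE.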

The treatment of Boolean values differs between SQL systems.

For example, in Microsoft SQL Server, Boolean values are not supported at all, neither as a standalone data type nor as values representable as integers. It shows the error message "An expression of non-Boolean type specified in a context where a condition is expected" if a column is directly used in the WHERE clause, e.g. SELECT a FROM t WHERE a, while a statement such as SELECT column IS NOT NULL FROM t yields a syntax error. The BIT data type, which can only store the integers 0 and 1 apart from NULL, is commonly used to store Boolean values, but workarounds such as UPDATE t SET flag = IIF(col IS NOT NULL, 1, 0) WHERE flag = 0 are needed to convert between integer and Boolean expressions.

Microsoft Access, which uses the Access Database Engine (ACE/JET), [19] also does not have a Boolean data type. Similar to MS SQL Server, it uses a BIT data type. [20] In Access it is known as a Yes/No data type [21] which can have two values: Yes (True) or No (False). The BIT data type in Access can also be represented numerically; True is −1 and False is 0. [22] This differs from MS SQL Server in two ways, even though both are Microsoft products:

  1. Access represents TRUE as −1, while it is 1 in SQL Server
  2. Access does not support the Null tri-state, supported by SQL Server

PostgreSQL has a distinct BOOLEAN type as in the standard, [23] which allows predicates to be stored directly into a BOOLEAN column, and allows using a BOOLEAN column directly as a predicate in a WHERE clause.

In MySQL, BOOLEAN is treated as an alias of TINYINT(1); [24] TRUE is the same as integer 1 and FALSE is the same as integer 0. [25] Any non-zero integer is true in conditions.

Tableau

Tableau Software has a BOOLEAN data type. [26] The literal of a Boolean value is True or False. [27]

The Tableau INT() function converts a Boolean to a number, returning 1 for True and 0 for False. [28]

Forth

Forth has no Boolean type; it uses regular integers: the value 0 (all bits low) represents false, and -1 (all bits high) represents true. This allows the language to define only one set of logical operators, instead of one for mathematical calculations and one for conditions. [29]
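The reason the all-bits-high convention needs only one set of operators can be illustrated in Python, whose integers behave like two's complement values under bitwise operations. The constant names FORTH_TRUE and FORTH_FALSE are hypothetical, and this is a model of the idea rather than Forth code.

```python
FORTH_TRUE = -1   # all bits set in two's complement
FORTH_FALSE = 0   # all bits clear

# Bitwise operators double as logical operators on these flags.
print(FORTH_TRUE & FORTH_FALSE)   # 0  (false)
print(FORTH_TRUE | FORTH_FALSE)   # -1 (true)
print(~FORTH_TRUE)                # 0  (false)
print(~FORTH_FALSE)               # -1 (true)

# With 1 as the true value, bitwise NOT would not yield the false flag.
print(~1)                         # -2, which is still non-zero

# Viewed as a 16-bit cell, -1 has every bit high.
print((FORTH_TRUE & 0xFFFF) == 0xFFFF)   # True
```

With 0 and 1 as the truth values, a separate logical negation would be needed, because inverting the bits of 1 gives a non-zero result.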

References

  1. Kernighan, Brian W; Ritchie, Dennis M (1978). The C Programming Language (1st ed.). Englewood Cliffs, NJ: Prentice Hall. p. 41. ISBN 0-13-110163-3.
  2. Plauger, PJ; Brodie, Jim (1992) [1989]. ANSI and ISO Standard C Programmer's reference. Microsoft Press. pp. 86–93. ISBN 1-55615-359-7.
  3. "Report on the Algorithmic Language ALGOL 68, Section 10.2.2" (PDF). August 1968. Archived (PDF) from the original on 6 April 2008. Retrieved 30 April 2007.
  4. Digital Equipment Corporation, DECSystem10 FORTRAN IV Programmers Reference Manual. Reprinted in Mathematical Languages Handbook. Online version Archived 2011-08-14 at the Wayback Machine accessed 2011-11-16.
  5. "CLHS: Type BOOLEAN".
  6. "Guides and Sample Code". developer.apple.com. Archived from the original on 7 September 2011. Retrieved 1 May 2018.
  7. "Java Booleans". W3Schools Online Web Tutorials. Retrieved 2021-02-17.
  8. "perlsyn - Perl Syntax / Truth and Falsehood". Archived from the original on 26 August 2013. Retrieved 10 September 2013.
  9. "PEP 285 -- Adding a bool type". 4 May 2011. Archived from the original on 28 March 2018. Retrieved 28 March 2018.
  10. van Rossum, Guido (3 April 2002). "PEP 285 -- Adding a bool type". Archived from the original on 1 May 2013. Retrieved 15 May 2013.
  11. "Expressions". Python v3.3.2 documentation. Archived from the original on 22 May 2013. Retrieved 15 May 2013.
  12. "ECMAScript Language Specification" (PDF). p. 43. Archived from the original (PDF) on 2015-04-12. Retrieved 2011-03-12.
  13. "The Elements of JavaScript Style". Douglas Crockford. Archived from the original on 17 March 2011. Retrieved 5 March 2011.
  14. C. Date (2011). SQL and Relational Theory: How to Write Accurate SQL Code. O'Reilly Media, Inc. p. 83. ISBN 978-1-4493-1640-2.
  15. ISO/IEC 9075-2:2011 §4.5
  16. Martyn Prigmore (2007). Introduction to Databases With Web Applications. Pearson Education Canada. p. 197. ISBN 978-0-321-26359-9.
  17. Troels Arvin, Survey of BOOLEAN data type implementation Archived 2005-03-09 at the Wayback Machine
  18. "PostgreSQL: Documentation: 10: 8.6. Boolean Type". www.postgresql.org. Archived from the original on 9 March 2018. Retrieved 1 May 2018.
  19. "Migrate an Access database to SQL Server". support.microsoft.com. Retrieved 2020-10-19.
  20. o365devx. "SQL data types (Access desktop database reference)". docs.microsoft.com. Retrieved 2020-10-19.
  21. "Introduction to data types and field properties". support.microsoft.com. Retrieved 2020-10-19.
  22. "Boolean Data - MS-Access Tutorial". sourcedaddy.com. Retrieved 2020-10-19.
  23. "Boolean Type". 27 October 2016.
  24. "MySQL :: MySQL 8.0 Reference Manual :: 12.1.1 Numeric Type Overview". dev.mysql.com. Archived from the original on 2016-09-22.
  25. "MySQL :: MySQL 8.0 Reference Manual :: 9.1.6 Boolean Literals". dev.mysql.com.
  26. "Data Types". help.tableau.com. Retrieved 2020-10-19.
  27. "Formatting Calculations in Tableau". help.tableau.com. Retrieved 2020-10-19.
  28. "Boolean makes Tableau faster - true or false?". TAR Solutions. 2020-09-11. Retrieved 2020-10-19.
  29. "4. Decisions, Decisions..." Forth Inc. 2022-02-11. Retrieved 2022-02-11.