# Three-way comparison

Last updated

In computer science, a three-way comparison takes two values A and B belonging to a type with a total order and determines whether A < B, A = B, or A > B in a single operation, in accordance with the mathematical law of trichotomy.

## Machine-level computation

Many processors have instruction sets that support such an operation on primitive types. Some machines have signed integers based on a sign-and-magnitude or one's complement representation (see signed number representations), both of which allow a differentiated positive and negative zero. This does not violate trichotomy as long as a consistent total order is adopted: either −0 = +0 or −0 < +0 is valid. Common floating point types, however, have an exception to trichotomy: there is a special value "NaN" (Not a Number) such that x< NaN, x> NaN, and x = NaN are all false for all floating-point values x (including NaN itself).

## High-level languages

### Capabilities

In C, the functions `strcmp` and `memcmp` perform a three-way comparison between strings and memory buffers, respectively. They return a negative number when the first argument is lexicographically smaller than the second, zero when the arguments are equal, and a positive number otherwise. This convention of returning the "sign of the difference" is extended to arbitrary comparison functions by the standard sorting function ` qsort `, which takes a comparison function as an argument and requires it to abide by it.

In C++, the C++20 revision adds the "spaceship operator" `<=>`, which similarly returns the sign of the difference and can also return different types (convertible to signed integers) depending on the strictness of the comparison. 

In Perl (for numeric comparisons only, the `cmp` operator is used for string lexical comparisons), PHP (since version 7), Ruby, and Apache Groovy, the "spaceship operator" `<=>` returns the values −1, 0, or 1 depending on whether A < B, A = B, or A > B, respectively. The Python 2.x `cmp`(removed in 3.x), OCaml `compare`, and Kotlin `compareTo` functions compute the same thing. In the Haskell standard library, the three-way comparison function `compare` is defined for all types in the `Ord` class; it returns type `Ordering`, whose values are `LT` (less than), `EQ` (equal), and `GT` (greater than): 

`dataOrdering=LT|EQ|GT`

Many object-oriented languages have a three-way comparison method, which performs a three-way comparison between the object and another given object. For example, in Java, any class that implements the `Comparable` interface has a `compareTo` method which either returns a negative integer, zero, or a positive integer, or throws a `NullPointerException` (if one or both objects are `null`). Similarly, in the .NET Framework, any class that implements the `IComparable` interface has such a `CompareTo` method.

Since Java version 1.5, the same can be computed using the `Math.signum` static method if the difference can be known without computational problems such as arithmetic overflow mentioned below. Many computer languages allow the definition of functions so a compare(A,B) could be devised appropriately, but the question is whether or not its internal definition can employ some sort of three-way syntax or else must fall back on repeated tests.

When implementing a three-way comparison where a three-way comparison operator or method is not already available, it is common to combine two comparisons, such as A = B and A < B, or A < B and A > B. In principle, a compiler might deduce that these two expressions could be replaced by only one comparison followed by multiple tests of the result, but mention of this optimisation is not to be found in texts on the subject.

In some cases, three-way comparison can be simulated by subtracting A and B and examining the sign of the result, exploiting special instructions for examining the sign of a number. However, this requires the type of A and B to have a well-defined difference. Fixed-width signed integers may overflow when they are subtracted, floating-point numbers have the value NaN with undefined sign, and character strings have no difference function corresponding to their total order. At the machine level, overflow is typically tracked and can be used to determine order after subtraction, but this information is not usually available to higher-level languages.

In one case of a three-way conditional provided by the programming language, Fortran's now-deprecated three-way arithmetic IF statement considers the sign of an arithmetic expression and offers three labels to jump to according to the sign of the result:

`IF(expression)negative,zero,positive`

The common library function strcmp in C and related languages is a three-way lexicographic comparison of strings; however, these languages lack a general three-way comparison of other data types.

### Spaceship operator

The three-way comparison operator for numbers is denoted as `<=>` in Perl, Ruby, Apache Groovy, PHP, Eclipse Ceylon, and C++, and is called the spaceship operator. 

The name's origin is due to it reminding Randal L. Schwartz of the spaceship in an HP BASIC Star Trek game.  Another coder has suggested that it was so named because it looked similar to Darth Vader's TIE fighter in the Star Wars saga. 

Example in PHP:

`echo1<=>1;// 0echo1<=>2;// -1echo2<=>1;// 1`

### Composite data types

Three-way comparisons have the property of being easy to compose and build lexicographic comparisons of non-primitive data types, unlike two-way comparisons.

Here is a composition example in Perl.

`subcompare(\$\$){my(\$a,\$b)=@_;return\$a->{unit}cmp\$b->{unit}||\$a->{rank}<=>\$b->{rank}||\$a->{name}cmp\$b->{name};}`

Note that `cmp`, in Perl, is for strings, since `<=>` is for numbers. Two-way equivalents tend to be less compact but not necessarily less legible. The above takes advantage of short-circuit evaluation of the `||` operator, and the fact that 0 is considered false in Perl. As a result, if the first comparison is equal (thus evaluates to 0), it will "fall through" to the second comparison, and so on, until it finds one that is non-zero, or until it reaches the end.

In some languages, including Python, Ruby, Haskell, etc., comparison of lists is done lexicographically, which means that it is possible to build a chain of comparisons like the above example by putting the values into lists in the order desired; for example, in Ruby:

`[a.unit,a.rank,a.name]<=>[b.unit,b.rank,b.name]`

## Related Research Articles

IEEE 754-1985 was an industry standard for representing floating-point numbers in computers, officially adopted in 1985 and superseded in 2008 by IEEE 754-2008, and then again in 2019 by minor revision IEEE 754-2019. During its 23 years, it was the most widely used format for floating-point computation. It was implemented in software, in the form of floating-point libraries, and in hardware, in the instructions of many CPUs and FPUs. The first integrated circuit to implement the draft of what was to become IEEE 754-1985 was the Intel 8087. In computer programming, a string is traditionally a sequence of characters, either as a literal constant or as some kind of variable. The latter may allow its elements to be mutated and the length changed, or it may be fixed. A string is generally considered as a data type and is often implemented as an array data structure of bytes that stores a sequence of elements, typically characters, using some character encoding. String may also denote more general arrays or other sequence data types and structures.

In computing, NaN, standing for Not a Number, is a member of a numeric data type that can be interpreted as a value that is undefined or unrepresentable, especially in floating-point arithmetic. Systematic use of NaNs was introduced by the IEEE 754 floating-point standard in 1985, along with the representation of other non-finite quantities such as infinities.

In computer programming, a bitwise operation operates on a bit string, a bit array or a binary numeral at the level of its individual bits. It is a fast and simple action, basic to the higher level arithmetic operations and directly supported by the processor. Most bitwise operations are presented as two-operand instructions where the result replaces one of the input operands.

In computer science, primitive data type is either of the following: In computer science, conditional statements, conditional expressions and conditional constructs are features of a programming language, which perform different computations or actions depending on whether a programmer-specified boolean condition evaluates to true or false. Apart from the case of branch predication, this is always achieved by selectively altering the control flow based on some condition.

This is a list of operators in the C and C++ programming languages. All the operators listed exist in C++; the fourth column "Included in C", states whether an operator is also present in C. Note that C does not support operator overloading.

In computer science, arbitrary-precision arithmetic, also called bignum arithmetic, multiple-precision arithmetic, or sometimes infinite-precision arithmetic, indicates that calculations are performed on numbers whose digits of precision are limited only by the available memory of the host system. This contrasts with the faster fixed-precision arithmetic found in most arithmetic logic unit (ALU) hardware, which typically offers between 8 and 64 bits of precision.

In computer programming, `?:` is a ternary operator that is part of the syntax for basic conditional expressions in several programming languages. It is commonly referred to as the conditional operator, inline if (iif), or ternary if. An expression `a ? b : c` evaluates to `b` if the value of `a` is true, and otherwise to `c`. One can read it aloud as "if a then b otherwise c".

In computer programming, operators are constructs defined within programming languages which behave generally like functions, but which differ syntactically or semantically.

XPath 2.0 is a version of the XPath language defined by the World Wide Web Consortium, W3C. It became a recommendation on 23 January 2007. As a W3C Recommendation it was superseded by XPath 3.0 on 10 April 2014.

In computer science, the Boolean data type is a data type that has one of two possible values which is intended to represent the two truth values of logic and Boolean algebra. It is named after George Boole, who first defined an algebraic system of logic in the mid 19th century. The Boolean data type is primarily associated with conditional statements, which allow different actions by changing control flow depending on whether a programmer-specified Boolean condition evaluates to true or false. It is a special case of a more general logical data type —logic doesn't always need to be Boolean.

A bit array is an array data structure that compactly stores bits. It can be used to implement a simple set data structure. A bit array is effective at exploiting bit-level parallelism in hardware to perform operations quickly. A typical bit array stores kw bits, where w is the number of bits in the unit of storage, such as a byte or word, and k is some nonnegative integer. If w does not divide the number of bits to be stored, some space is wasted due to internal fragmentation.

In computer science, a relational operator is a programming language construct or operator that tests or defines some kind of relation between two entities. These include numerical equality and inequalities.

In computer programming, a sigil is a symbol affixed to a variable name, showing the variable's datatype or scope, usually a prefix, as in `\$foo`, where `\$` is the sigil.

The computer programming languages C and Pascal have similar times of origin, influences, and purposes. Both were used to design their own compilers early in their lifetimes. The original Pascal definition appeared in 1969 and a first compiler in 1970. The first version of C appeared in 1972.

Signed zero is zero with an associated sign. In ordinary arithmetic, the number 0 does not have a sign, so that −0, +0 and 0 are identical. However, in computing, some number representations allow for the existence of two zeros, often denoted by −0 and +0, regarded as equal by the numerical comparison operations but with possible different behaviors in particular operations. This occurs in the sign and magnitude and ones' complement signed number representations for integers, and in most floating-point number representations. The number 0 is usually encoded as +0, but can be represented by either +0 or −0.

The syntax of the Python programming language is the set of rules that defines how a Python program will be written and interpreted. The Python language has many similarities to Perl, C, and Java. However, there are some definite differences between the languages.

This article compares a large number of programming languages by tabulating their data types, their expression, statement, and declaration syntax, and some common operating-system interfaces.

Some programming languages provide a built-in (primitive) rational data type to represent rational numbers like 1/3 and -11/17 without rounding, and to do arithmetic on them. Examples are the `ratio` type of Common Lisp, and analogous types provided by most languages for algebraic computation, such as Mathematica and Maple. Many languages that do not have a built-in rational type still provide it as a library-defined type.

1. Herb Sutter proposed adding a three-way comparison operator to the C++ standard with the `<=>` syntax, in a paper entitled "Consistent Comparison". See "Consistent Comparison" It was successfully merged into the C++20 draft in November 2017.
2. "Math::Complex". Perl Programming Documentation. Retrieved 26 September 2014.
3. "Super Spaceship Operator". 2000-12-08. Retrieved 2014-08-06.