In computer programming, the exclusive or swap (sometimes shortened to XOR swap) is an algorithm that uses the exclusive or bitwise operation to swap the values of two variables without using the temporary variable which is normally required.
The algorithm is primarily a novelty and a way of demonstrating properties of the exclusive or operation. It is sometimes discussed as a program optimization, but there are almost no cases where swapping via exclusive or provides benefit over the standard, obvious technique.
Conventional swapping requires the use of a temporary storage variable. Using the XOR swap algorithm, however, no temporary storage is needed. The algorithm is as follows: [1] [2]
X:=YXORX;// XOR the values and store the result in XY:=XXORY;// XOR the values and store the result in YX:=YXORX;// XOR the values and store the result in X
Since XOR is a commutative operation, either X XOR Y or Y XOR X can be used interchangeably in any of the foregoing three lines. Note that on some architectures the first operand of the XOR instruction specifies the target location at which the result of the operation is stored, preventing this interchangeability. The algorithm typically corresponds to three machine-code instructions, represented by corresponding pseudocode and assembly instructions in the three rows of the following table:
Pseudocode | IBM System/370 assembly | x86 assembly | RISC-V assembly |
---|---|---|---|
X:=XXORY | XRR1,R2 | xoreax,ebx | xorx10,x11 |
Y:=YXORX | XRR2,R1 | xorebx,eax | xorx11,x10 |
X:=XXORY | XRR1,R2 | xoreax,ebx | xorx10,x11 |
In the above System/370 assembly code sample, R1 and R2 are distinct registers, and each XR
operation leaves its result in the register named in the first argument. Using x86 assembly, values X and Y are in registers eax and ebx (respectively), and xor
places the result of the operation in the first register. In RISC-V assembly, value X and Y are in registers X10 and X11, and xor
places the result of the operation in the first register (same as X86)
However, in the pseudocode or high-level language version or implementation, the algorithm fails if x and y use the same storage location, since the value stored in that location will be zeroed out by the first XOR instruction, and then remain zero; it will not be "swapped with itself". This is not the same as if x and y have the same values. The trouble only comes when x and y use the same storage location, in which case their values must already be equal. That is, if x and y use the same storage location, then the line:
X:=XXORY
sets x to zero (because x = y so X XOR Y is zero) and sets y to zero (since it uses the same storage location), causing x and y to lose their original values.
The binary operation XOR over bit strings of length exhibits the following properties (where denotes XOR): [lower-alpha 1]
Suppose that we have two distinct registers R1
and R2
as in the table below, with initial values A and B respectively. We perform the operations below in sequence, and reduce our results using the properties listed above.
Step | Operation | Register 1 | Register 2 | Reduction |
---|---|---|---|---|
0 | Initial value | — | ||
1 | R1 := R1 XOR R2 | — | ||
2 | R2 := R1 XOR R2 | L2 L4 L3 | ||
3 | R1 := R1 XOR R2 | L1 L2 L4 L3 |
As XOR can be interpreted as binary addition and a pair of bits can be interpreted as a vector in a two-dimensional vector space over the field with two elements, the steps in the algorithm can be interpreted as multiplication by 2×2 matrices over the field with two elements. For simplicity, assume initially that x and y are each single bits, not bit vectors.
For example, the step:
X:=XXORY
which also has the implicit:
Y:=Y
corresponds to the matrix as
The sequence of operations is then expressed as:
(working with binary values, so ), which expresses the elementary matrix of switching two rows (or columns) in terms of the transvections (shears) of adding one element to the other.
To generalize to where X and Y are not single bits, but instead bit vectors of length n, these 2×2 matrices are replaced by 2n×2n block matrices such as
These matrices are operating on values, not on variables (with storage locations), hence this interpretation abstracts away from issues of storage location and the problem of both variables sharing the same storage location.
A C function that implements the XOR swap algorithm:
voidXorSwap(int*x,int*y){if(x==y)return;*x^=*y;*y^=*x;*x^=*y;}
The code first checks if the addresses are distinct and uses a guard clause to exit the function early if they are equal. Without that check, if they were equal, the algorithm would fold to a triple *x ^= *x
resulting in zero.
The XOR swap algorithm can also be defined with a macro:
#define XORSWAP_UNSAFE(a, b) \ ((a) ^= (b), (b) ^= (a), \ (a) ^= (b)) /* Doesn't work when a and b are the same object - assigns zero \ (0) to the object in that case */#define XORSWAP(a, b) \ ((&(a) == &(b)) ? (a) /* Check for distinct addresses */ \ : XORSWAP_UNSAFE(a, b))
On modern CPU architectures, the XOR technique can be slower than using a temporary variable to do swapping. At least on recent x86 CPUs, both by AMD and Intel, moving between registers regularly incurs zero latency. (This is called MOV-elimination.) Even if there is not any architectural register available to use, the XCHG
instruction will be at least as fast as the three XORs taken together. Another reason is that modern CPUs strive to execute instructions in parallel via instruction pipelines. In the XOR technique, the inputs to each operation depend on the results of the previous operation, so they must be executed in strictly sequential order, negating any benefits of instruction-level parallelism. [3]
The XOR swap is also complicated in practice by aliasing. If an attempt is made to XOR-swap the contents of some location with itself, the result is that the location is zeroed out and its value lost. Therefore, XOR swapping must not be used blindly in a high-level language if aliasing is possible. This issue does not apply if the technique is used in assembly to swap the contents of two registers.
Similar problems occur with call by name, as in Jensen's Device, where swapping i
and A[i]
via a temporary variable yields incorrect results due to the arguments being related: swapping via temp = i; i = A[i]; A[i] = temp
changes the value for i
in the second statement, which then results in the incorrect i
value for A[i]
in the third statement.
The underlying principle of the XOR swap algorithm can be applied to any operation meeting criteria L1 through L4 above. Replacing XOR by addition and subtraction gives various slightly different, but largely equivalent, formulations. For example: [4]
voidAddSwap(unsignedint*x,unsignedint*y){*x=*x+*y;*y=*x-*y;*x=*x-*y;}
Unlike the XOR swap, this variation requires that the underlying processor or programming language uses a method such as modular arithmetic or bignums to guarantee that the computation of X + Y
cannot cause an error due to integer overflow. Therefore, it is seen even more rarely in practice than the XOR swap.
However, the implementation of AddSwap
above in the C programming language always works even in case of integer overflow, since, according to the C standard, addition and subtraction of unsigned integers follow the rules of modular arithmetic, i. e. are done in the cyclic group where is the number of bits of unsigned int
. Indeed, the correctness of the algorithm follows from the fact that the formulas and hold in any abelian group. This generalizes the proof for the XOR swap algorithm: XOR is both the addition and subtraction in the abelian group (which is the direct sum of s copies of ).
This doesn't hold when dealing with the signed int
type (the default for int
). Signed integer overflow is an undefined behavior in C and thus modular arithmetic is not guaranteed by the standard, which may lead to incorrect results.
The sequence of operations in AddSwap
can be expressed via matrix multiplication as:
On architectures lacking a dedicated swap instruction, because it avoids the extra temporary register, the XOR swap algorithm is required for optimal register allocation. This is particularly important for compilers using static single assignment form for register allocation; these compilers occasionally produce programs that need to swap two registers when no registers are free. The XOR swap algorithm avoids the need to reserve an extra register or to spill any registers to main memory. [5] The addition/subtraction variant can also be used for the same purpose. [6]
This method of register allocation is particularly relevant to GPU shader compilers. On modern GPU architectures, spilling variables is expensive due to limited memory bandwidth and high memory latency, while limiting register usage can improve performance due to dynamic partitioning of the register file. The XOR swap algorithm is therefore required by some GPU compilers. [7]
The Mersenne Twister is a general-purpose pseudorandom number generator (PRNG) developed in 1997 by Makoto Matsumoto and Takuji Nishimura. Its name derives from the choice of a Mersenne prime as its period length.
In computing, a linear-feedback shift register (LFSR) is a shift register whose input bit is a linear function of its previous state.
A mathematical symbol is a figure or a combination of figures that is used to represent a mathematical object, an action on mathematical objects, a relation between mathematical objects, or for structuring the other symbols that occur in a formula. As formulas are entirely constituted with symbols of various types, many symbols are needed for expressing all mathematics.
Exclusive or, exclusive disjunction, exclusive alternation, logical non-equivalence, or logical inequality is a logical operator whose negation is the logical biconditional. With two inputs, XOR is true if and only if the inputs differ. With multiple inputs, XOR is true if and only if the number of true inputs is odd.
In group theory, the quaternion group Q8 (sometimes just denoted by Q) is a non-abelian group of order eight, isomorphic to the eight-element subset of the quaternions under multiplication. It is given by the group presentation
In mathematics, a Gaussian function, often simply referred to as a Gaussian, is a function of the base form and with parametric extension for arbitrary real constants a, b and non-zero c. It is named after the mathematician Carl Friedrich Gauss. The graph of a Gaussian is a characteristic symmetric "bell curve" shape. The parameter a is the height of the curve's peak, b is the position of the center of the peak, and c controls the width of the "bell".
In computer programming, a bitwise operation operates on a bit string, a bit array or a binary numeral at the level of its individual bits. It is a fast and simple action, basic to the higher-level arithmetic operations and directly supported by the processor. Most bitwise operations are presented as two-operand instructions where the result replaces one of the input operands.
In mathematics, and in particular linear algebra, the Moore–Penrose inverse of a matrix , often called the pseudoinverse, is the most widely known generalization of the inverse matrix. It was independently described by E. H. Moore in 1920, Arne Bjerhammar in 1951, and Roger Penrose in 1955. Earlier, Erik Ivar Fredholm had introduced the concept of a pseudoinverse of integral operators in 1903. The terms pseudoinverse and generalized inverse are sometimes used as synonyms for the Moore–Penrose inverse of a matrix, but sometimes applied to other elements of algebraic structures which share some but not all properties expected for an inverse element.
In cryptanalysis, the piling-up lemma is a principle used in linear cryptanalysis to construct linear approximations to the action of block ciphers. It was introduced by Mitsuru Matsui (1993) as an analytical tool for linear cryptanalysis. The lemma states that the bias of a linear Boolean function (XOR-clause) of independent binary random variables is related to the product of the input biases:
In mathematics and computer algebra, automatic differentiation, also called algorithmic differentiation, computational differentiation, is a set of techniques to evaluate the partial derivative of a function specified by a computer program.
In quantum computing and specifically the quantum circuit model of computation, a quantum logic gate is a basic quantum circuit operating on a small number of qubits. Quantum logic gates are the building blocks of quantum circuits, like classical logic gates are for conventional digital circuits.
In algebra, a split-complex number is based on a hyperbolic unitj satisfying A split-complex number has two real number components x and y, and is written The conjugate of z is Since the product of a number z with its conjugate is an isotropic quadratic form.
The direct sum is an operation between structures in abstract algebra, a branch of mathematics. It is defined differently, but analogously, for different kinds of structures. As an example, the direct sum of two abelian groups and is another abelian group consisting of the ordered pairs where and . To add ordered pairs, we define the sum to be ; in other words addition is defined coordinate-wise. For example, the direct sum , where is real coordinate space, is the Cartesian plane, . A similar process can be used to form the direct sum of two vector spaces or two modules.
In mathematics and computing, universal hashing refers to selecting a hash function at random from a family of hash functions with a certain mathematical property. This guarantees a low number of collisions in expectation, even if the data is chosen by an adversary. Many universal families are known, and their evaluation is often very efficient. Universal hashing has numerous uses in computer science, for example in implementations of hash tables, randomized algorithms, and cryptography.
In computer programming, the act of swapping two variables refers to mutually exchanging the values of the variables. Usually, this is done with the data in memory. For example, in a program, two variables may be defined thus :
data_item x := 1 data_item y := 0 swap ;
In numerical analysis and linear algebra, lower–upper (LU) decomposition or factorization factors a matrix as the product of a lower triangular matrix and an upper triangular matrix. The product sometimes includes a permutation matrix as well. LU decomposition can be viewed as the matrix form of Gaussian elimination. Computers usually solve square systems of linear equations using LU decomposition, and it is also a key step when inverting a matrix or computing the determinant of a matrix. The LU decomposition was introduced by the Polish astronomer Tadeusz Banachiewicz in 1938. To quote: "It appears that Gauss and Doolittle applied the method [of elimination] only to symmetric equations. More recent authors, for example, Aitken, Banachiewicz, Dwyer, and Crout … have emphasized the use of the method, or variations of it, in connection with non-symmetric problems … Banachiewicz … saw the point … that the basic problem is really one of matrix factorization, or “decomposition” as he called it." It is also sometimes referred to as LR decomposition.
In cryptography, differential equations of addition (DEA) are one of the most basic equations related to differential cryptanalysis that mix additions over two different groups (e.g. addition modulo 232 and addition over GF(2)) and where input and output differences are expressed as XORs.
A negative base may be used to construct a non-standard positional numeral system. Like other place-value systems, each position holds multiples of the appropriate power of the system's base; but that base is negative—that is to say, the base b is equal to −r for some natural number r.
Zhegalkinpolynomials, also known as algebraic normal form, are a representation of functions in Boolean algebra. Introduced by the Russian mathematician Ivan Ivanovich Zhegalkin in 1927, they are the polynomial ring over the integers modulo 2. The resulting degeneracies of modular arithmetic result in Zhegalkin polynomials being simpler than ordinary polynomials, requiring neither coefficients nor exponents. Coefficients are redundant because 1 is the only nonzero coefficient. Exponents are redundant because in arithmetic mod 2, x2 = x. Hence a polynomial such as 3x2y5z is congruent to, and can therefore be rewritten as, xyz.
In mathematics, Maass forms or Maass wave forms are studied in the theory of automorphic forms. Maass forms are complex-valued smooth functions of the upper half plane, which transform in a similar way under the operation of a discrete subgroup of as modular forms. They are eigenforms of the hyperbolic Laplace operator defined on and satisfy certain growth conditions at the cusps of a fundamental domain of . In contrast to modular forms, Maass forms need not be holomorphic. They were studied first by Hans Maass in 1949.