Swap (computer programming)

Last updated December 10, 2022

In computer programming, the act of swapping two variables refers to mutually exchanging the values of the variables. Usually, this is done with the data in memory. For example, in a program, two variables may be defined thus (in pseudocode):

Using a temporary variable

The simplest and probably most widely used method to swap two variables is to use a third temporary variable:

define swap (x, y)     temp := x     x := y     y := temp

While this is conceptually simple and in many cases the only convenient way to swap two variables, it uses extra memory. Although this should not be a problem in most applications, the sizes of the values being swapped may be huge (which means the temporary variable may occupy a lot of memory as well), or the swap operation may need to be performed many times, as in sorting algorithms.

In addition, swapping two variables in object-oriented languages such as C++ may involve one call to the class constructor and destructor for the temporary variable, and three calls to the copy constructor. Some classes may allocate memory in the constructor and deallocate it in the destructor, thus creating expensive calls to the system. Copy constructors for classes containing a lot of data, e.g. in an array, may even need to copy the data manually.

XOR swap

XOR swap uses the XOR operation to swap two numeric variables. It is generally touted to be faster than the naive method mentioned above; however it does have disadvantages. XOR swap is generally used to swap low-level data types, like integers. However, it is, in theory, capable of swapping any two values which can be represented by fixed-length bitstrings.

Swap through addition and subtraction

This method swaps two variables by adding and subtracting their values. This is rarely used in practical applications, mainly because:

It can only swap numeric variables; it may not be possible or logical to add or subtract complex data types, like containers.
When swapping variables of a fixed size, arithmetic overflow becomes an issue.
It does not work generally for floating-point values, because floating-point arithmetic is non-associative.

Swapping containers

Containers which allocate memory from the heap using pointers may be swapped in a single operation, by swapping the pointers alone. This is usually found in programming languages supporting pointers, like C or C++. The Standard Template Library overloads its built-in swap function to exchange the contents of containers efficiently this way.^[1]

As pointer variables are usually of a fixed size (e.g., most desktop computers have pointers 64 bits long), and they are numeric, they can be swapped quickly using XOR swap.

Parallel assignment

Some languages, like Ruby or Python support parallel assignments, which simplifies the notation for swapping two variables:

a, b = b, a

This is shorthand for an operation involving an intermediate data structure: in Python, a tuple; in Ruby, an array.

Javascript 6+ supports destructuring operators which do the same thing:

[a, b] = [b, a];

Function swap

Here, two globally scoped variables are passed by value through a function, eliminating the need for a temporary storage variable.

x=1;y=2;console.log(x+" "+y);// outputs 1 2functionswap(a,b){x=b;y=a;}swap(x,y);console.log(x+" "+y);// outputs 2 1

Facilitation of swapping in modern computers

Dedicated instructions

Because of the many applications of swapping data in computers, most processors now provide the ability to swap variables directly through built-in instructions. x86 processors, for example, include an XCHG instruction to swap two registers directly without requiring that a third temporary register is used. A compare-and-swap instruction is even provided in some processor architectures, which compares and conditionally swaps two registers. This is used to support mutual exclusion techniques.

XCHG may not be as efficient as it seems. For example, in x86 processors, XCHG will implicitly lock access to any operands in memory to ensure the operation is atomic, and so may not be efficient when swapping memory. Such locking is important when it is used to implement thread-safe synchronization, as in mutexes. However, an XCHG is usually the fastest way to swap two machine-size words residing in registers. Register renaming may also be used to swap registers efficiently.

Parallel execution

With the advent of instruction pipelining in modern computers and multi-core processors facilitating parallel computing, two or more operations can be performed at once. This can speed up the swap using temporary variables and give it an edge over other algorithms. For example, the XOR swap algorithm requires sequential execution of three instructions. However, using two temporary registers, two processors executing in parallel can swap two variables in two clock cycles:

Step 1  Processor 1: temp_1 := X  Processor 2: temp_2 := Y  Step 2  Processor 1: X := temp_2  Processor 2: Y := temp_1

More temporary registers are used, and four instructions are needed instead of three. In any case, in practice this could not be implemented in separate processors, as it violates Bernstein's conditions for parallel computing. It would be infeasible to keep the processors sufficiently in sync with one another for this swap to have any significant advantage over traditional versions. However, it can be used to optimize swapping for a single processor with multiple load/store units.

Related Research Articles

In computing, an optimizing compiler is a compiler that tries to minimize or maximize some attributes of an executable computer program. Common requirements are to minimize a program's execution time, memory footprint, storage size, and power consumption.

In computing, a vector processor or array processor is a central processing unit (CPU) that implements an instruction set where its instructions are designed to operate efficiently and effectively on large one-dimensional arrays of data called vectors. This is in contrast to scalar processors, whose instructions operate on single data items only, and in contrast to some of those same scalar processors having additional single instruction, multiple data (SIMD) or SWAR Arithmetic Units. Vector processors can greatly improve performance on certain workloads, notably numerical simulation and similar tasks. Vector processing techniques also operate in video-game console hardware and in graphics accelerators.

In computer science and computer programming, a data type is a set of possible values and a set of allowed operations on it. A data type tells the compiler or interpreter how the programmer intends to use the data. Most programming languages support basic data types of integer numbers, floating-point numbers, characters and Booleans. A data type constrains the possible values that an expression, such as a variable or a function, might take. This data type defines the operations that can be done on the data, the meaning of the data, and the way values of that type can be stored.

In computer programming, the exclusive or swap is an algorithm that uses the exclusive or bitwise operation to swap the values of two variables without using the temporary variable which is normally required.

x86 assembly language is the name for the family of assembly languages which provide some level of backward compatibility with CPUs back to the Intel 8008 microprocessor, which was launched in April 1972. It is used to produce object code for the x86 class of processors.

In object-oriented programming (OOP), the object lifetime of an object is the time between an object's creation and its destruction. Rules for object lifetime vary significantly between languages, in some cases between implementations of a given language, and lifetime of a particular object may vary from one run of the program to another.

In software engineering, a spinlock is a lock that causes a thread trying to acquire it to simply wait in a loop ("spin") while repeatedly checking whether the lock is available. Since the thread remains active but is not performing a useful task, the use of such a lock is a kind of busy waiting. Once acquired, spinlocks will usually be held until they are explicitly released, although in some implementations they may be automatically released if the thread being waited on blocks or "goes to sleep".

In computer programming, a bitwise operation operates on a bit string, a bit array or a binary numeral at the level of its individual bits. It is a fast and simple action, basic to the higher-level arithmetic operations and directly supported by the processor. Most bitwise operations are presented as two-operand instructions where the result replaces one of the input operands.

In computer science, a stack is an abstract data type that serves as a collection of elements, with two main operations:

An XOR linked list is a type of data structure used in computer programming. It takes advantage of the bitwise XOR operation to decrease storage requirements for doubly linked lists.

In computer science, computer engineering and programming language implementations, a stack machine is a computer processor or a virtual machine in which the primary interaction is moving short-lived temporary values to and from a push down stack. In the case of a hardware processor, a hardware stack is used. The use of a stack significantly reduces the required number of processor registers. Stack machines extend push-down automata with additional load/store operations or multiple stacks and hence are Turing-complete.

In computer science, compare-and-swap (CAS) is an atomic instruction used in multithreading to achieve synchronization. It compares the contents of a memory location with a given value and, only if they are the same, modifies the contents of that memory location to a new given value. This is done as a single atomic operation. The atomicity guarantees that the new value is calculated based on up-to-date information; if the value had been updated by another thread in the meantime, the write would fail. The result of the operation must indicate whether it performed the substitution; this can be done either with a simple boolean response, or by returning the value read from the memory location.

In object-oriented programming, a destructor is a method which is invoked mechanically just before the memory of the object is released. It can happen when its lifetime is bound to scope and the execution leaves the scope, when it is embedded in another object whose lifetime ends, or when it was allocated dynamically and is released explicitly. Its main purpose is to free the resources which were acquired by the object during its life and/or deregister from other entities which may keep references to it. Use of destructors is needed for the process of Resource Acquisition Is Initialization (RAII).

<span class="mw-page-title-main">Quicksort</span> Divide and conquer sorting algorithm

Quicksort is an efficient, general-purpose sorting algorithm. Quicksort was developed by British computer scientist Tony Hoare in 1959 and published in 1961, it is still a commonly used algorithm for sorting. Overall, it is slightly faster than merge sort and heapsort for randomized data, particularly on larger distributions.

A class in C++ is a user-defined type or data structure declared with keyword class that has data and functions as its members whose access is governed by the three access specifiers private, protected or public. By default access to members of a C++ class is private. The private members are not accessible outside the class; they can be accessed only through methods of the class. The public members form an interface to the class and are accessible outside the class.

C++11 is a version of the ISO/IEC 14882 standard for the C++ programming language. C++11 replaced the prior version of the C++ standard, called C++03, and was later replaced by C++14. The name follows the tradition of naming language versions by the publication year of the specification, though it was formerly named C++0x because it was expected to be published before 2010.

Memory ordering describes the order of accesses to computer memory by a CPU. The term can refer either to the memory ordering generated by the compiler during compile time, or to the memory ordering generated by a CPU during runtime.

The HP 35s (F2215A) is a Hewlett-Packard non-graphing programmable scientific calculator. Although it is a successor to the HP 33s, it was introduced to commemorate the 35th anniversary of the HP-35, Hewlett-Packard's first pocket calculator. HP also released a limited production anniversary edition with shiny black overlay and engraving "Celebrating 35 years".

Samplesort is a sorting algorithm that is a divide and conquer algorithm often used in parallel processing systems. Conventional divide and conquer sorting algorithms partitions the array into sub-intervals or buckets. The buckets are then sorted individually and then concatenated together. However, if the array is non-uniformly distributed, the performance of these sorting algorithms can be significantly throttled. Samplesort addresses this issue by selecting a sample of size $s$ from the $n$ -element sequence, and determining the range of the buckets by sorting the sample and choosing $p -1 < s$ elements from the result. These elements then divide the array into $p$ approximately equal-sized buckets. Samplesort is described in the 1970 paper, "Samplesort: A Sampling Approach to Minimal Storage Tree Sorting", by W. D. Frazer and A. C. McKellar.

In computer science, persistent memory is any method or apparatus for efficiently storing data structures such that they can continue to be accessed using memory instructions or memory APIs even after the end of the process that created or last modified them.

References

↑ "HPPSocialUserSignonPage - Hewlett Packard Enterprise Community".

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] "HPPSocialUserSignonPage - Hewlett Packard Enterprise Community".

[1]