Carry-skip adder

Last updated August 31, 2024

A carry-skip adder^{[nb 1]} (also known as a carry-bypass adder) is an adder implementation that improves on the delay of a ripple-carry adder with little effort compared to other adders. The improvement of the worst-case delay is achieved by using several carry-skip adders to form a block-carry-skip adder.

Single carry-skip adder

The worst case for a simple one-level ripple-carry adder occurs when the propagate condition^[1] is true for each digit pair $(a_{i},b_{i})$. Then the carry-in ripples through the $n$-bit adder and appears as the carry-out after $\tau _{CRA}(n)\approx n\cdot \tau _{VA}$, where $\tau _{VA}$ denotes the delay of one full-adder stage.

Figure: Full adder with additional generate and propagate signals (FAwithGP.svg).

For each operand input bit pair $(a_{i},b_{i})$ the propagate-conditions $p_{i}=a_{i}\oplus b_{i}$ are determined using an XOR-gate. When all propagate-conditions are true, then the carry-in bit $c_{0}$ determines the carry-out bit.

The $n$-bit carry-skip adder consists of an $n$-bit carry-ripple chain, an $n$-input AND gate and one multiplexer. Each propagate bit $p_{i}$ provided by the carry-ripple chain is connected to the $n$-input AND gate. The resulting bit is used as the select bit of a multiplexer that switches either the last carry bit $c_{n}$ or the carry-in $c_{0}$ to the carry-out signal $c_{out}$.

$s=p_{n-1}\wedge p_{n-2}\wedge \dots \wedge p_{1}\wedge p_{0}=p_{[0:n-1]}$

This greatly reduces the latency of the adder through its critical path, since the carry bit can now "skip" over any block whose group propagate signal is set to logic 1 (as opposed to a long ripple-carry chain, which would require the carry to ripple through each bit in the adder). The number of inputs of the AND gate is equal to the width of the adder. For a large width this becomes impractical and leads to additional delays, because the AND gate has to be built as a tree. A good width is achieved when the sum logic has the same depth as the $n$-input AND gate and the multiplexer together.
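The structure described above (ripple chain, $n$-input AND gate, multiplexer) can be sketched as a small bit-level model. This is only an illustrative sketch: the helper names `full_adder`, `carry_skip_block`, `to_bits` and `from_bits` are hypothetical, and the model checks logical behaviour, not gate delays.

```python
def full_adder(a, b, c):
    """One-bit full adder returning (sum, carry_out, propagate)."""
    p = a ^ b                      # propagate condition p_i = a_i XOR b_i
    s = p ^ c
    c_out = (a & b) | (p & c)
    return s, c_out, p


def carry_skip_block(a_bits, b_bits, c_in):
    """Add two little-endian bit lists; return (sum_bits, c_out, skipped)."""
    carry = c_in
    sum_bits = []
    select = 1                     # output of the n-input AND gate
    for a, b in zip(a_bits, b_bits):
        s, carry, p = full_adder(a, b, carry)
        sum_bits.append(s)
        select &= p
    # Multiplexer: forward c_0 when every p_i is 1, otherwise the rippled c_n.
    c_out = c_in if select else carry
    return sum_bits, c_out, bool(select)


def to_bits(value, width):
    """Little-endian list of the low `width` bits of value."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))
```

For example, with $A=0101_2$, $B=1010_2$ and a carry-in of 1, every propagate condition is true, so `carry_skip_block(to_bits(0b0101, 4), to_bits(0b1010, 4), 1)` takes the skip path and returns a carry-out equal to the carry-in.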

Figure: 4-bit carry-skip adder (CSAdder4Bit.svg).

Performance

The critical path of a carry-skip adder begins at the first full adder, passes through all adders and ends at the sum bit $s_{n-1}$. Carry-skip adders are chained (see block-carry-skip adders) to reduce the overall critical path, since a single $n$-bit carry-skip adder has no real speed benefit compared to an $n$-bit ripple-carry adder.

$\tau _{CSA}(n)=\tau _{CRA}(n)$

The skip logic consists of an $m$-input AND gate and one multiplexer.

$T_{SK}=T_{AND}(m)+T_{MUX}$

As the propagate signals are computed in parallel and are available early, the critical path for the skip logic in a carry-skip adder consists only of the delay imposed by the multiplexer (conditional skip).

$T_{CSK}=T_{MUX}=2D$

Block carry-skip adders

Figure: 16-bit fixed-block-carry-skip adder with a block size of 4 bits (BCSAdder16Bit.svg).

Block-carry-skip adders are composed of a number of carry-skip adders. There are two types of block-carry-skip adders: fixed-size and variable-size. The two operands $A=(a_{n-1},a_{n-2},\dots ,a_{1},a_{0})$ and $B=(b_{n-1},b_{n-2},\dots ,b_{1},b_{0})$ are split into $k$ blocks of $(m_{k},m_{k-1},\dots ,m_{2},m_{1})$ bits.

Why are block-carry-skip-adders used?
Should the block-size be constant or variable?
Fixed block width vs. variable block width

Fixed size block-carry-skip adders

Fixed size block-carry-skip adders split the $n$ input bits into blocks of $m$ bits each, resulting in $k={\frac {n}{m}}$ blocks. The critical path consists of the ripple path and the skip element of the first block, the skip paths that are enclosed between the first and the last block, and finally the ripple path of the last block.

$T_{FCSA}(n)=T_{CRA_{[0:c_{out}]}}(m)+T_{CSK}+(k-2)\cdot T_{CSK}+T_{CRA}(m)$

$=3D+m\cdot 2D+(k-1)\cdot 2D+(m+2)\cdot 2D=(2m+k)\cdot 2D+5D$

The optimal block size for a given adder width $n$ is found by substituting $k={\frac {n}{m}}$ and setting the derivative of the delay with respect to $m$ to zero:

${\frac {dT_{FCSA}(n)}{dm}}=0$

$2D\cdot \left(2-n\cdot {\frac {1}{m^{2}}}\right)=0$

$\Rightarrow m_{1,2}=\pm {\sqrt {\frac {n}{2}}}$

Only positive block sizes are realizable:

$\Rightarrow m={\sqrt {\frac {n}{2}}}$
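The result can be checked numerically by evaluating the delay expression $(2m+k)\cdot 2D+5D$ with $k=n/m$ for every block size that divides $n$. The sketch below is only a check of the formula as given above, in units of $D$; the function names `t_fcsa` and `best_block_size` are hypothetical, and restricting $m$ to divisors of $n$ is an assumption made so that all blocks have equal size.

```python
import math


def t_fcsa(n, m):
    """Model delay in units of D: (2m + k) * 2 + 5, with k = n / m."""
    k = n / m
    return (2 * m + k) * 2 + 5


def best_block_size(n):
    """Divisor m of n that minimises the model delay t_fcsa(n, m)."""
    divisors = [m for m in range(1, n + 1) if n % m == 0]
    return min(divisors, key=lambda m: t_fcsa(n, m))


if __name__ == "__main__":
    for n in (32, 128):
        m = best_block_size(n)
        print(n, m, math.sqrt(n / 2), t_fcsa(n, m))
```

For $n=32$ the closed form gives $m={\sqrt {16}}=4$, which is also the divisor the search selects. When ${\sqrt {n/2}}$ is not an integer (for example $n=16$), neighbouring divisors can tie under this model, so the closed form is best read as a starting point.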

Variable size block-carry-skip adders (VBA, Oklobdzija-Barnes)^[2]

The performance can be improved, i.e. all carries can be propagated more quickly, by varying the block sizes. Accordingly, the initial blocks of the adder are made smaller so as to quickly detect carry generates that must be propagated the furthest, the middle blocks are made larger because they are not the problem case, and the most significant blocks are again made smaller so that the late-arriving carry inputs can be processed quickly.
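A functional model that accepts an arbitrary list of block sizes shows that varying the sizes changes only the timing, not the result. This is a sketch under stated assumptions: the function name `block_carry_skip_add` is hypothetical, integer addition stands in for each block's ripple chain, and the block sizes in the usage note are illustrative, not the optimum from the cited work.

```python
def block_carry_skip_add(a, b, block_sizes, c_in=0):
    """Add a and b over sum(block_sizes) bits.

    block_sizes lists the block widths, least significant block first.
    Returns (sum, carry_out, number_of_blocks_skipped).
    """
    result, carry, shift, skips = 0, c_in, 0, 0
    for m in block_sizes:
        mask = (1 << m) - 1
        x = (a >> shift) & mask
        y = (b >> shift) & mask
        total = x + y + carry               # stands in for the m-bit ripple chain
        block_propagate = (x ^ y) == mask   # AND of all p_i in this block
        rippled = total >> m                # carry produced by the ripple chain
        # Skip multiplexer: forward the incoming carry if the block propagates.
        carry_out = carry if block_propagate else rippled
        skips += int(block_propagate)
        result |= (total & mask) << shift
        carry = carry_out
        shift += m
    return result, carry, skips
```

With the illustrative sizes `[2, 3, 6, 3, 2]` (small, large, small), `block_carry_skip_add(0x5555, 0xAAAA, [2, 3, 6, 3, 2], c_in=1)` propagates in every block, so the carry-in skips all five blocks and reappears as the carry-out.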

Multilevel carry-skip adders

By using additional skip-blocks in an additional layer, the block-propagate signals $p_{[i:i+3]}$ are further summarized and used to perform larger skips:

$p_{[i:i+15]}=p_{[i:i+3]}\wedge p_{[i+4:i+7]}\wedge p_{[i+8:i+11]}\wedge p_{[i+12:i+15]}$

This makes the adder even faster.
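The two-level summary of propagate signals can be illustrated for a 16-bit operand pair. The sketch below is an assumption-laden illustration: the helper names are hypothetical and only the propagate logic is modelled, not the adder itself.

```python
def propagate_bits(a, b, width):
    """Per-bit propagate conditions p_i = a_i XOR b_i, least significant first."""
    return [((a >> i) ^ (b >> i)) & 1 for i in range(width)]


def group_propagate(p, start, length):
    """AND of p[start] .. p[start + length - 1]."""
    return int(all(p[start:start + length]))


def two_level_propagate(a, b):
    """Return (four 4-bit block propagates, 16-bit second-level propagate)."""
    p = propagate_bits(a, b, 16)
    level1 = [group_propagate(p, i, 4) for i in range(0, 16, 4)]
    level2 = int(all(level1))   # p_[0:15] as the AND of the block propagates
    return level1, level2
```

For `a = 0x0FF0` and `b = 0xF00F` every bit position propagates, so all four block signals and the second-level signal are 1, and a carry-in could skip the full 16 bits through a single second-level multiplexer.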

Carry-skip optimization

The problem of determining the block sizes and number of levels required to make the physically fastest carry-skip adder is known as the 'carry-skip adder optimization problem'. This problem is made complex by the fact that carry-skip adders are implemented with physical devices whose size and other parameters also affect addition time.

The carry-skip optimization problem for variable block sizes and multiple levels for an arbitrary device process node was solved by Oklobdzija and Barnes at IBM and published in 1985.

Implementation overview

Breaking this down into more specific terms, in order to build a 4-bit carry-bypass adder, 6 full adders would be needed. The input buses would be a 4-bit A and a 4-bit B, with a carry-in (CIN) signal. The output would be a 4-bit bus X and a carry-out signal (COUT).

The first two full adders would add the first two bits together. The carry-out signal from the second full adder ($C_{1}$) would drive the select signal for three 2-to-1 multiplexers. The second set of two full adders would add the last two bits assuming $C_{1}$ is a logical 0, and the final set of full adders would assume that $C_{1}$ is a logical 1.

The multiplexers then control which output signal is used for COUT, $X_{2}$ and $X_{3}$ .
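The arrangement described in this section can be modelled literally with six full adders and three multiplexers. This is a sketch of the text as written, with hypothetical function names; note that computing the upper half twice and selecting the result is the arrangement usually associated with carry-select adders.

```python
def full_adder(a, b, c):
    """One-bit full adder returning (sum, carry_out)."""
    s = a ^ b ^ c
    c_out = (a & b) | (c & (a ^ b))
    return s, c_out


def mux2(select, in0, in1):
    """2-to-1 multiplexer: in1 when select is 1, otherwise in0."""
    return in1 if select else in0


def adder4_as_described(a, b, cin):
    """4-bit adder built from six full adders and three multiplexers."""
    a_bit = [(a >> i) & 1 for i in range(4)]
    b_bit = [(b >> i) & 1 for i in range(4)]

    # First two full adders: lower two bits, producing C_1.
    x0, c = full_adder(a_bit[0], b_bit[0], cin)
    x1, c1 = full_adder(a_bit[1], b_bit[1], c)

    # Second set: upper two bits assuming C_1 = 0.
    s2_0, c = full_adder(a_bit[2], b_bit[2], 0)
    s3_0, cout_0 = full_adder(a_bit[3], b_bit[3], c)

    # Final set: upper two bits assuming C_1 = 1.
    s2_1, c = full_adder(a_bit[2], b_bit[2], 1)
    s3_1, cout_1 = full_adder(a_bit[3], b_bit[3], c)

    # Three multiplexers, all selected by C_1, choose X_2, X_3 and COUT.
    x2 = mux2(c1, s2_0, s2_1)
    x3 = mux2(c1, s3_0, s3_1)
    cout = mux2(c1, cout_0, cout_1)

    x = x0 | (x1 << 1) | (x2 << 2) | (x3 << 3)
    return x, cout
```

Calling `adder4_as_described(a, b, cin)` for all 512 input combinations reproduces `a + b + cin`, which confirms that the described wiring is functionally a 4-bit adder.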

Notes

[nb 1] Carry-skip adder is often abbreviated as CSA; however, this can be confused with carry-save adder.

References

[1] Parhami, Behrooz (2000). Computer Arithmetic: Algorithms and Hardware Designs. Oxford University Press. p. 108. ISBN 0-19-512583-5.

[2] V. G. Oklobdzija and E. R. Barnes, "Some Optimal Schemes For ALU Implementation In VLSI Technology", Proceedings of the 7th Symposium on Computer Arithmetic ARITH-7, pp. 2-8. Reprinted in Computer Arithmetic, E. E. Swartzlander (editor), Vol. II, pp. 137-142, 1985.

External links

Explanation for critical path of the variable-skip adder

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.