Carry-lookahead adder

Last updated February 01, 2025

A carry-lookahead adder (CLA) or fast adder is a type of electronic adder used in digital logic. A carry-lookahead adder improves speed by reducing the amount of time required to determine carry bits. It can be contrasted with the simpler, but usually slower, ripple-carry adder (RCA), in which the carry bit is calculated alongside the sum bit, and each stage must wait until the previous carry bit has been calculated before it can begin calculating its own sum bit and carry bit. The carry-lookahead adder calculates one or more carry bits before the sum, which reduces the wait time to calculate the result of the higher-order bits of the adder.

Already in the mid-1800s, Charles Babbage recognized the performance penalty imposed by the ripple-carry used in his Difference Engine, and subsequently designed mechanisms for anticipating carriage for his never-built Analytical Engine.^[1]^[2] Konrad Zuse is thought to have implemented the first carry-lookahead adder in his 1930s binary mechanical computer, the Zuse Z1.^[3] Gerald B. Rosenberger of IBM filed for a patent on a modern binary carry-lookahead adder in 1957.^[4]

Two widely used implementations of the concept are the Kogge–Stone adder (KSA) and Brent–Kung adder (BKA).

Theory of operation

Ripple addition

A binary ripple-carry adder works in the same way as most pencil-and-paper methods of addition. Starting at the least significant digit position, the two corresponding digits are added and a result is obtained. A 'carry out' may occur if the result requires a higher digit; for example, "9 + 5 = 4, carry 1". Binary arithmetic works in the same fashion, with fewer digits. In this case, there are only four possible operations, 0+0, 0+1, 1+0 and 1+1; the 1+1 case generates a carry. Accordingly, all digit positions other than the rightmost one need to wait on the possibility of having to add an extra 1 from a carry on the digits one position to the right.

This means that no digit position can have an absolutely final value until it has been established whether or not a carry is coming in from the right. Moreover, if the sum without a carry is the highest value in the base (9 in base-10 pencil-and-paper methods or 1 in binary arithmetic), it is not possible to tell whether or not a given digit position is going to pass on a carry to the position on its left. At worst, when a whole sequence of sums comes to …99999999… (in decimal) or …11111111… (in binary), nothing can be deduced at all until the value of the carry coming in from the right is known; that carry must be propagated to the left, one step at a time, as each digit position evaluates "9 + 1 = 0, carry 1" or "1 + 1 = 0, carry 1". It is the "rippling" of the carry from right to left that gives the ripple-carry adder its name and slowness. When adding 32-bit integers, for instance, allowance has to be made for the possibility that a carry could have to ripple through every one of the 32 one-bit adders.
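The rippling described above can be sketched in a few lines of Python. This is only an illustrative software model, not a hardware description; the function name `ripple_carry_add` and its parameters are chosen here for the example.

```python
def ripple_carry_add(a, b, width=32, carry_in=0):
    """Add two unsigned integers one bit at a time, like a chain of full adders.

    Returns (sum modulo 2**width, carry out of the most significant bit).
    """
    carry = carry_in
    total = 0
    for i in range(width):                  # least significant bit first
        a_i = (a >> i) & 1
        b_i = (b >> i) & 1
        s_i = a_i ^ b_i ^ carry             # sum bit of this stage
        # This stage cannot finish until the previous stage's carry is known.
        carry = (a_i & b_i) | (carry & (a_i ^ b_i))
        total |= s_i << i
    return total, carry


# Worst case: every position holds a 1, so the carry ripples through all 32 stages.
result, carry_out = ripple_carry_add(0xFFFFFFFF, 1)
```

Each loop iteration depends on the `carry` produced by the previous one, which is the software analogue of the stage-by-stage wait in the hardware.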

Lookahead

Carry-lookahead depends on two things:

Calculating for each digit position whether that position is going to propagate a carry if one comes in from the right.
Combining these calculated values to be able to deduce quickly whether, for each group of digits, that group is going to propagate a carry that comes in from the right.
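As a rough illustration of these two steps for binary digits, the following Python sketch tests whether each position, and then each group of four positions, would pass an incoming carry along. The helper names are hypothetical, and the per-digit test assumes the criterion from the ripple discussion (the carry-free sum of the position is 1).

```python
def digit_propagates(a_i, b_i):
    """A binary position passes an incoming carry on when its carry-free sum is 1."""
    return (a_i ^ b_i) == 1


def group_propagates(a, b, start, size=4):
    """True if every position in bits [start, start + size) propagates."""
    return all(
        digit_propagates((a >> i) & 1, (b >> i) & 1)
        for i in range(start, start + size)
    )


a, b = 0b0101_1010, 0b1010_0001
low_group = group_propagates(a, b, 0)    # bit 2 is 0 + 0, so the chain is broken
high_group = group_propagates(a, b, 4)   # every position sums to 1
```

Both functions depend only on the operand bits, never on a carry, so all of them can be evaluated at the same time as the 1-bit additions.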

Suppose that groups of four digits are chosen. The sequence of events would then go like this:

All 1-bit adders calculate their results. Simultaneously, the lookahead units perform their calculations.
Assuming that a carry arises in a particular group, that carry will emerge at the left-hand end of the group within at most five gate delays and start propagating through the group to its left.
If that carry is going to propagate all the way through the next group, the lookahead unit will already have deduced this. Accordingly, before the carry emerges from the next group, the lookahead unit is immediately (within one gate delay) able to tell the next group to the left that it is going to receive a carry – and, at the same time, to tell the next lookahead unit to the left that a carry is on its way.

The net effect is that the carries start by propagating slowly through each 4-bit group, just as in a ripple-carry system, but then move four times as fast, leaping from one lookahead-carry unit to the next. Finally, within each group that receives a carry, the carry propagates slowly within the digits in that group.

The more bits in a group, the more complex the lookahead carry logic becomes, and the more time is spent on the "slow roads" in each group rather than on the "fast road" between the groups (provided by the lookahead carry logic). On the other hand, the fewer bits there are in a group, the more groups have to be traversed to get from one end of a number to the other, and the less acceleration is obtained as a result.

Deciding the group size to be governed by lookahead carry logic requires a detailed analysis of gate and propagation delays for the particular technology being used.

It is possible to have more than one level of lookahead-carry logic, and this is in fact usually done. Each lookahead-carry unit already produces a signal saying "if a carry comes in from the right, I will propagate it to the left", and those signals can be combined so that each group of, say, four lookahead-carry units becomes part of a "supergroup" governing a total of 16 bits of the numbers being added. The "supergroup" lookahead-carry logic will be able to say whether a carry entering the supergroup will be propagated all the way through it, and using this information, it is able to propagate carries from right to left 16 times as fast as a naive ripple carry. With this kind of two-level implementation, a carry may first propagate through the "slow road" of individual adders, then, on reaching the left-hand end of its group, propagate through the "fast road" of 4-bit lookahead-carry logic, then, on reaching the left-hand end of its supergroup, propagate through the "superfast road" of 16-bit lookahead-carry logic.

Again, the group sizes to be chosen depend on the exact details of how fast signals propagate within logic gates and from one logic gate to another.

For very large numbers (hundreds or even thousands of bits), lookahead-carry logic does not become any more complex, because more layers of supergroups and supersupergroups can be added as necessary. The increase in the number of gates is also moderate: if all the group sizes are four, one would end up with one third as many lookahead carry units as there are adders. However, the "slow roads" on the way to the faster levels begin to impose a drag on the whole system (for instance, a 256-bit adder could have up to 24 gate delays in its carry processing), and the mere physical transmission of signals from one end of a long number to the other begins to be a problem. At these sizes, carry-save adders are preferable, since they spend no time on carry propagation at all.
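The "one third" figure can be checked with a geometric series. Assuming, for simplicity, that the adder width $n$ is a power of 4 and that every level uses groups of four, the number of lookahead units summed over all levels is:

```latex
\frac{n}{4}+\frac{n}{16}+\cdots+1
  =\sum_{k=1}^{\log_{4} n}\frac{n}{4^{k}}
  =\frac{n-1}{3}
  \approx\frac{n}{3}
```

For $n=256$ this gives $64+16+4+1=85$ units for 256 one-bit adders.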

Carry lookahead method

Carry-lookahead logic uses the concepts of generating and propagating carries. Although in the context of a carry-lookahead adder it is most natural to think of generating and propagating in terms of binary addition, the concepts can be used more generally. In the descriptions below, the word digit can be replaced by bit when referring to binary (base-2) addition.

The addition of two 1-digit inputs A and B is said to generate if the addition will always carry, regardless of whether there is an input carry (equivalently, regardless of whether any less significant digits in the sum carry). For example, in the decimal addition 52 + 67, the addition of the tens digits 5 and 6 generates because the result carries to the hundreds digit regardless of whether the ones digit carries; in the example, the ones digit does not carry (2 + 7 = 9). Even if the numbers were, say, 54 and 69, the addition of the tens digits 5 and 6 would still generate, because the result once again carries to the hundreds digit whether or not 4 + 9 produces a carry.

In the case of binary addition, $A+B$ generates if and only if both A and B are 1. If we write $G(A,B)$ to represent the binary predicate that is true if and only if $A+B$ generates, we have

$G(A,B)=A\cdot B$

where $A\cdot B$ is an and.

The addition of two 1-digit inputs A and B is said to propagate if the addition will carry whenever there is an input carry (equivalently, when the next less significant digit in the sum carries). For example, in the decimal addition 37 + 62, the addition of the tens digits 3 and 6 propagates because the result would carry to the hundreds digit if the ones digit were to carry (which, in this example, it does not). Note that propagate and generate are defined with respect to a single digit of addition and do not depend on any other digits in the sum.

In the case of binary addition, $A+B$ propagates if and only if at least one of A or B is 1. If $P(A,B)$ is written to represent the binary predicate that is true if and only if $A+B$ propagates, one has

$P(A,B)=A+B$

where $A+B$ on the right-hand side of the equation is an or.

Sometimes a slightly different definition of propagate is used. By this definition A + B is said to propagate if the addition will carry whenever there is an input carry, but will not carry if there is no input carry. Due to the way generate and propagate bits are used by the carry-lookahead logic, it doesn't matter which definition is used. In the case of binary addition, this definition is expressed by

$P'(A,B)=A\oplus B$

where $A\oplus B$ is an xor.

Table showing when carries are propagated or generated.

$A$	$B$	$C_{i}$	$C_{o}$	Type of Carry
0	0	0	0	None
0	0	1	0	None
0	1	0	0	None
0	1	1	1	Propagate
1	0	0	0	None
1	0	1	1	Propagate
1	1	0	1	Generate
1	1	1	1	Generate/Propagate
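The table can be reproduced by enumerating all inputs. The sketch below assumes the OR definition of propagate, $P(A,B)=A+B$; the function names and the classification helper are illustrative only.

```python
from itertools import product


def generate(a, b):
    """G(A,B) = A AND B."""
    return a & b


def propagate(a, b):
    """P(A,B) = A OR B."""
    return a | b


def carry_type(a, b, c_in):
    """Label a row the way the table does."""
    g, p = generate(a, b), propagate(a, b)
    if g and c_in:
        return "Generate/Propagate"
    if g:
        return "Generate"
    if p and c_in:
        return "Propagate"
    return "None"


# One tuple per table row: (A, B, C_i, C_o, type of carry).
rows = []
for a, b, c_in in product((0, 1), repeat=3):
    c_out = generate(a, b) | (propagate(a, b) & c_in)
    rows.append((a, b, c_in, c_out, carry_type(a, b, c_in)))
```

`product((0, 1), repeat=3)` yields the inputs in the same order as the rows of the table.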

For binary arithmetic, OR is faster than XOR and takes fewer transistors to implement. However, for a multiple-level carry-lookahead adder, it is simpler to use $P'(A,B)$.

Given these concepts of generate and propagate, a digit of addition carries precisely when either the addition generates or the next less significant bit carries and the addition propagates. Written in Boolean algebra, with $C_{i}$ the carry bit of digit i, and $P_{i}$ and $G_{i}$ the propagate and generate bits of digit i respectively,

$C_{i+1}=G_{i}+(P_{i}\cdot C_{i})$.
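As a small sketch of this recurrence, the following Python function computes every carry of a binary addition from the per-bit generate and propagate values. The name `carry_bits` and its arguments are chosen for this example; the OR form of propagate is assumed.

```python
def carry_bits(a, b, width, c0=0):
    """Return [C_0, C_1, ..., C_width] using C_{i+1} = G_i + P_i * C_i."""
    c = [c0]
    for i in range(width):
        a_i = (a >> i) & 1
        b_i = (b >> i) & 1
        g_i = a_i & b_i          # generate
        p_i = a_i | b_i          # propagate (OR definition)
        c.append(g_i | (p_i & c[i]))
    return c


# 0111 + 0001: bit 0 generates, bits 1 and 2 propagate, bit 3 does neither.
example = carry_bits(0b0111, 0b0001, 4)
```

Evaluated in this order the recurrence is still a ripple; the lookahead comes from expanding it so that each carry depends only on the $G_{i}$, the $P_{i}$ and $C_{0}$.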

Implementation details

For each bit pair in the binary numbers to be added, the carry-lookahead logic determines whether that bit pair will generate a carry or propagate a carry. This allows the circuit to "pre-process" the two numbers being added to determine the carries ahead of time. Then, when the actual addition is performed, there is no delay from waiting for the ripple-carry effect (that is, the time it takes for the carry from the first full adder to be passed down to the last full adder).

To determine whether a bit pair will generate a carry, the following logic works:

$G_{i}=A_{i}\cdot B_{i}$

To determine whether a bit pair will propagate a carry, either of the following logic statements works:

$P_{i}=A_{i}\oplus B_{i}$

$P_{i}=A_{i}+B_{i}$

Either form works because of how the carry is evaluated, for example $C_{1}=G_{0}+P_{0}\cdot C_{0}$. The only difference in the truth tables between $A\oplus B$ and $A+B$ is when both $A$ and $B$ are 1. However, if both $A$ and $B$ are 1, then the $G_{0}$ term is 1 (since its equation is $A\cdot B$), and the $P_{0}\cdot C_{0}$ term becomes irrelevant. The XOR is normally used within a basic full adder circuit; the OR is an alternative option (for carry-lookahead only), which is far simpler in transistor-count terms.
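The equivalence of the two propagate definitions can also be checked exhaustively. The sketch below compares the carries obtained with XOR and with OR for every pair of 4-bit operands and both carry-in values; the helper name `carries` is illustrative.

```python
from itertools import product


def carries(a_bits, b_bits, c0, use_xor):
    """Carries C_1..C_n via C_{i+1} = G_i + P_i * C_i with the chosen propagate."""
    c, out = c0, []
    for a, b in zip(a_bits, b_bits):
        g = a & b
        p = (a ^ b) if use_xor else (a | b)
        c = g | (p & c)
        out.append(c)
    return out


# Collect every input for which the two definitions disagree.
mismatches = [
    (a_bits, b_bits, c0)
    for a_bits in product((0, 1), repeat=4)
    for b_bits in product((0, 1), repeat=4)
    for c0 in (0, 1)
    if carries(a_bits, b_bits, c0, True) != carries(a_bits, b_bits, c0, False)
]
```

The list stays empty, since the two definitions differ only when $G_{i}=1$, where the propagate term no longer matters.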

For a 4-bit adder, the logic for the carries in terms of the generate ($G$) and propagate ($P$) values is given below. The index identifies the bit position, starting from 0 on the far right to 3 on the far left:

$C_{1}=G_{0}+P_{0}\cdot C_{0}$

$C_{2}=G_{1}+P_{1}\cdot C_{1}$

$C_{3}=G_{2}+P_{2}\cdot C_{2}$

$C_{4}=G_{3}+P_{3}\cdot C_{3}$

Substituting $C_{1}$ into $C_{2}$, then $C_{2}$ into $C_{3}$, then $C_{3}$ into $C_{4}$ yields the following expanded equations:

$C_{1}=G_{0}+P_{0}\cdot C_{0}$

$C_{2}=G_{1}+G_{0}\cdot P_{1}+C_{0}\cdot P_{0}\cdot P_{1}$

$C_{3}=G_{2}+G_{1}\cdot P_{2}+G_{0}\cdot P_{1}\cdot P_{2}+C_{0}\cdot P_{0}\cdot P_{1}\cdot P_{2}$

$C_{4}=G_{3}+G_{2}\cdot P_{3}+G_{1}\cdot P_{2}\cdot P_{3}+G_{0}\cdot P_{1}\cdot P_{2}\cdot P_{3}+C_{0}\cdot P_{0}\cdot P_{1}\cdot P_{2}\cdot P_{3}$
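The expanded equations translate directly into code. The sketch below evaluates them as written and compares the result with the step-by-step recurrence for every combination of $G_{i}$, $P_{i}$ and $C_{0}$; the function names are chosen for this example.

```python
from itertools import product


def expanded_carries(g, p, c0):
    """C_1..C_4 from the expanded equations; index 0 is the least significant bit."""
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (g[0] & p[1]) | (c0 & p[0] & p[1])
    c3 = g[2] | (g[1] & p[2]) | (g[0] & p[1] & p[2]) | (c0 & p[0] & p[1] & p[2])
    c4 = (g[3] | (g[2] & p[3]) | (g[1] & p[2] & p[3])
          | (g[0] & p[1] & p[2] & p[3])
          | (c0 & p[0] & p[1] & p[2] & p[3]))
    return c1, c2, c3, c4


def recurrence_carries(g, p, c0):
    """C_1..C_4 from C_{i+1} = G_i + P_i * C_i, one step at a time."""
    out, c = [], c0
    for g_i, p_i in zip(g, p):
        c = g_i | (p_i & c)
        out.append(c)
    return tuple(out)


same = all(
    expanded_carries(g, p, c0) == recurrence_carries(g, p, c0)
    for g in product((0, 1), repeat=4)
    for p in product((0, 1), repeat=4)
    for c0 in (0, 1)
)
```

In `expanded_carries` no carry depends on another carry, so in hardware all four can be computed in parallel from the $G_{i}$, $P_{i}$ and $C_{0}$.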

The carry-lookahead 4-bit adder can also be used in a higher-level circuit by having each CLA logic circuit produce a propagate and generate signal to a higher-level CLA logic circuit. The group propagate ($PG$) and group generate ($GG$) for a 4-bit CLA are:

$PG=P_{0}\cdot P_{1}\cdot P_{2}\cdot P_{3}$

$GG=G_{3}+G_{2}\cdot P_{3}+G_{1}\cdot P_{3}\cdot P_{2}+G_{0}\cdot P_{3}\cdot P_{2}\cdot P_{1}$

They can then be used to create a carry-out for that particular 4-bit group:

$CG=GG+PG\cdot C_{in}$

It can be seen that this is equivalent to $C_{4}$ in previous equations.
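A brief check of this equivalence, written as a Python sketch with illustrative function names: the group carry-out $CG$ built from $PG$ and $GG$ is compared with $C_{4}$ obtained from the bit-level recurrence for all inputs.

```python
from itertools import product


def group_signals(g, p):
    """Group propagate and group generate of a 4-bit block (index 0 = LSB)."""
    pg = p[0] & p[1] & p[2] & p[3]
    gg = g[3] | (g[2] & p[3]) | (g[1] & p[3] & p[2]) | (g[0] & p[3] & p[2] & p[1])
    return pg, gg


def c4_by_recurrence(g, p, c_in):
    """C_4 from C_{i+1} = G_i + P_i * C_i applied four times."""
    c = c_in
    for g_i, p_i in zip(g, p):
        c = g_i | (p_i & c)
    return c


def group_carry(g, p, c_in):
    """CG = GG + PG * C_in."""
    pg, gg = group_signals(g, p)
    return gg | (pg & c_in)


agree = all(
    group_carry(g, p, c_in) == c4_by_recurrence(g, p, c_in)
    for g in product((0, 1), repeat=4)
    for p in product((0, 1), repeat=4)
    for c_in in (0, 1)
)
```

$PG$ and $GG$ do not depend on $C_{in}$, which is what lets a higher-level unit treat the whole 4-bit block like a single digit.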

Putting four 4-bit CLAs together yields four group propagates and four group generates. A lookahead-carry unit (LCU) takes these 8 values and applies logic identical to that used to calculate $C_{i}$ in the CLAs. The LCU then generates the carry input for each of the 4 CLAs and a fifth carry equal to $C_{16}$.
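The arrangement of four CLAs and one LCU can be modelled in software as follows. This is a behavioural sketch under the assumption that the XOR form of propagate is used, so that each sum bit is $P_{i}\oplus C_{i}$; the function names are hypothetical and no timing is modelled.

```python
def lookahead(g, p, c0):
    """Return [C0, C1, C2, C3, C4] from four generate/propagate pairs.

    The same logic serves a 4-bit CLA (bit signals) and the LCU (group signals).
    """
    c = [c0]
    for i in range(4):
        term, prod = g[i], p[i]
        for j in range(i - 1, -1, -1):   # add G_j * P_{j+1} * ... * P_i
            term |= prod & g[j]
            prod &= p[j]
        c.append(term | (prod & c0))     # add C0 * P_0 * ... * P_i
    return c


def group_signals(g, p):
    """Group propagate PG and group generate GG of one 4-bit CLA."""
    pg = p[0] & p[1] & p[2] & p[3]
    gg = g[3] | (g[2] & p[3]) | (g[1] & p[3] & p[2]) | (g[0] & p[3] & p[2] & p[1])
    return pg, gg


def add16(a, b, c_in=0):
    """16-bit addition with four 4-bit CLAs and one LCU; returns (sum, C16)."""
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(16)]
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(16)]
    groups = [group_signals(g[k:k + 4], p[k:k + 4]) for k in range(0, 16, 4)]
    pgs = [pg for pg, _ in groups]
    ggs = [gg for _, gg in groups]
    group_c = lookahead(ggs, pgs, c_in)  # carry into each CLA, then C16
    total = 0
    for k in range(4):
        c = lookahead(g[4 * k:4 * k + 4], p[4 * k:4 * k + 4], group_c[k])
        for i in range(4):
            total |= (p[4 * k + i] ^ c[i]) << (4 * k + i)
    return total, group_c[4]
```

For example, `add16(0xFFFF, 1)` returns a sum of 0 with $C_{16}=1$, the case in which a ripple-carry adder would have to pass the carry through all 16 stages.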

The calculation of the gate delay of a 16-bit adder (using 4 CLAs and 1 LCU) is not as straightforward as that of the ripple-carry adder.

Starting at time zero:

calculation of $P_{i}$ and $G_{i}$ is done at time 1,
calculation of the $PG$ is done at time 2,
calculation of the $GG$ is done at time 3,
calculation of the inputs for the CLAs from the LCU is done at:
- time 0 for the first CLA,
- time 5 for the second, third and fourth CLA,
calculation of the $S_{i}$ is done at:
- time 4 for the first CLA,
- time 8 for the second, third and fourth CLA,
calculation of the final carry bit ( $C_{16}$ ) is done at time 5.

The maximal time is 8 gate delays (for $S_{[4-15]}$ ).

A standard 16-bit ripple-carry adder would take 16 × 2 − 1 = 31 gate delays.
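The comparison can be written out as a short calculation. The completion times are copied from the list above; the function `ripple_delay` generalises the 16 × 2 − 1 figure to other widths, which is an assumption made for this sketch.

```python
# Completion times in gate delays for the 16-bit adder with 4 CLAs and 1 LCU.
cla_lcu_times = {
    "P_i and G_i": 1,
    "PG": 2,
    "GG": 3,
    "carry inputs of CLAs 2-4": 5,
    "S_i of the first CLA": 4,
    "S_i of CLAs 2-4": 8,
    "C_16": 5,
}


def ripple_delay(n_bits):
    """Gate delays of an n-bit ripple-carry adder, following 16 * 2 - 1 = 31."""
    return 2 * n_bits - 1


critical_path = max(cla_lcu_times.values())   # slowest signal in the CLA design
speedup = ripple_delay(16) / critical_path    # ratio of the two delays
```

Under these figures the 16-bit carry-lookahead design settles in 8 gate delays against 31 for ripple carry, a ratio of a little under four.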

Expansion

This example is a 4-bit carry-lookahead adder with 5 outputs. Below is the expansion:

    S0 = (A0 XOR B0) XOR Cin                                                 '2dt (dt - delay time)

    S1 = (A1 XOR B1)
         XOR ((A0 AND B0)
          OR ((A0 XOR B0) AND Cin))                                          '4dt

    S2 = (A2 XOR B2)
         XOR ((A1 AND B1)
          OR ((A1 XOR B1) AND (A0 AND B0))
          OR ((A1 XOR B1) AND (A0 XOR B0) AND Cin))                          '4dt

    S3 = (A3 XOR B3)
         XOR ((A2 AND B2)
          OR ((A2 XOR B2) AND (A1 AND B1))
          OR ((A2 XOR B2) AND (A1 XOR B1) AND (A0 AND B0))
          OR ((A2 XOR B2) AND (A1 XOR B1) AND (A0 XOR B0) AND Cin))          '4dt

    Cout = (A3 AND B3)
           OR ((A3 XOR B3) AND (A2 AND B2))
           OR ((A3 XOR B3) AND (A2 XOR B2) AND (A1 AND B1))
           OR ((A3 XOR B3) AND (A2 XOR B2) AND (A1 XOR B1) AND (A0 AND B0))
           OR ((A3 XOR B3) AND (A2 XOR B2) AND (A1 XOR B1) AND (A0 XOR B0) AND Cin)  '3dt

A simpler 4-bit carry-lookahead adder:

    'Step 0
    Gin = Cin                                   '0dt
    P00 = A0 XOR B0                             '1dt
    G00 = A0 AND B0                             '1dt
    P10 = A1 XOR B1                             '1dt
    G10 = A1 AND B1                             '1dt
    P20 = A2 XOR B2                             '1dt
    G20 = A2 AND B2                             '1dt
    P30 = A3 XOR B3                             '1dt
    G30 = A3 AND B3                             '1dt

    'Step 1
    G01 = G00 OR_
          P00 AND Gin                           '3dt, C0, valency-2
    G11 = G10 OR_
          P10 AND G00 OR_
          P10 AND P00 AND Gin                   '3dt, C1, valency-3
    G21 = G20 OR_
          P20 AND G10 OR_
          P20 AND P10 AND G00 OR_
          P20 AND P10 AND P00 AND Gin           '3dt, C2, valency-4
    G31 = G30 OR_
          P30 AND G20 OR_
          P30 AND P20 AND G10 OR_
          P30 AND P20 AND P10 AND G00 OR_
          P30 AND P20 AND P10 AND P00 AND Gin   '3dt, C3, valency-5

    'Sum
    S0 = P00 XOR Gin                            '2dt
    S1 = P10 XOR G01                            '4dt
    S2 = P20 XOR G11                            '4dt
    S3 = P30 XOR G21                            '4dt
    S4 =         G31                            '3dt, Cout
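The simpler listing can be transcribed into Python to check it against ordinary integer addition. The transcription below keeps the signal names of the listing in lower case; the wrapper function `cla4` and its integer interface are assumptions made for this sketch.

```python
def cla4(a, b, cin=0):
    """4-bit carry-lookahead addition; returns (4-bit sum, carry out)."""
    a_bits = [(a >> i) & 1 for i in range(4)]
    b_bits = [(b >> i) & 1 for i in range(4)]

    # Step 0: per-bit propagate (XOR) and generate (AND).
    gin = cin
    p00, p10, p20, p30 = (x ^ y for x, y in zip(a_bits, b_bits))
    g00, g10, g20, g30 = (x & y for x, y in zip(a_bits, b_bits))

    # Step 1: every carry is computed directly from Step 0 signals and Gin.
    g01 = g00 | (p00 & gin)
    g11 = g10 | (p10 & g00) | (p10 & p00 & gin)
    g21 = g20 | (p20 & g10) | (p20 & p10 & g00) | (p20 & p10 & p00 & gin)
    g31 = (g30 | (p30 & g20) | (p30 & p20 & g10)
           | (p30 & p20 & p10 & g00) | (p30 & p20 & p10 & p00 & gin))

    # Sum: each sum bit is its propagate XOR the carry entering that bit.
    s0 = p00 ^ gin
    s1 = p10 ^ g01
    s2 = p20 ^ g11
    s3 = p30 ^ g21
    return s0 | (s1 << 1) | (s2 << 2) | (s3 << 3), g31
```

Checking all 512 combinations of two 4-bit operands and a carry-in against `a + b + cin` is a quick way to confirm that the listing was copied correctly.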

Manchester carry chain

The Manchester carry chain is a variation of the carry-lookahead adder^[5] that uses shared logic to lower the transistor count. As can be seen above in the implementation section, the logic for generating each carry contains all of the logic used to generate the previous carries. A Manchester carry chain generates the intermediate carries by tapping off nodes in the gate that calculates the most significant carry value. However, not all logic families have these internal nodes; CMOS is a major example of a family that lacks them. Dynamic logic can support shared logic, as can transmission gate logic. One of the major downsides of the Manchester carry chain is that the capacitive load of all of these outputs, together with the resistance of the transistors, causes the propagation delay to increase much more quickly than in a regular carry-lookahead adder. A Manchester-carry-chain section generally doesn't exceed 4 bits.

References

[1] "Analytical Engine – History of Charles Babbage Analytical Engine". history-computer.com. 4 January 2021. Retrieved 2021-06-19.
[2] Babbage, Charles (1864). Passages from the Life of a Philosopher. London: Longman, Green, Longman, Roberts & Green. pp. 59–63, 114–116.
[3] Rojas, Raul (2014-06-07). "The Z1: Architecture and Algorithms of Konrad Zuse's First Computer". arXiv: 1406.1886 [cs.AR].
[4] Rosenberger, Gerald B. (1960-12-27). "Simultaneous Carry Adder". U.S. Patent 2,966,305.
[5] "Manchester carry-chain adder - WikiChip". wikichip.org. Retrieved 2017-04-24.

External links

Carry Look Ahead Adder JavaScript simulator

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
