EEL4712 Digital Design (Carry Lookahead Adder)

VHDL Design Styles
dataflow: concurrent statements
structural: components and interconnects
behavioral (sequential): sequential statements (registers, state machines, instruction decoders)
testbenches
[Figure: the diagram marks a subset of these styles as most suitable for synthesis]

Register Transfer Level (RTL) Design Description
Today's topic: combinational logic
[Figure: RTL structure, blocks of combinational logic separated by registers]

MLU Example

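MLU Block Diagram
[Figure: MLU block diagram. Inputs A and B pass through optional inverters controlled by NEG_A and NEG_B to give A1 and B1. The gate outputs MUX_0 to MUX_3 feed inputs IN0 to IN3 of MUX_4_1, which is selected by L1 and L0 (SEL1, SEL0) and drives Y1. An optional inverter controlled by NEG_Y produces the output Y.]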

MLU: Entity Declaration
LIBRARY ieee;
USE ieee.std_logic_1164.all;

ENTITY mlu IS
  PORT (
    NEG_A : IN  STD_LOGIC;
    NEG_B : IN  STD_LOGIC;
    NEG_Y : IN  STD_LOGIC;
    A     : IN  STD_LOGIC;
    B     : IN  STD_LOGIC;
    L1    : IN  STD_LOGIC;
    L0    : IN  STD_LOGIC;
    Y     : OUT STD_LOGIC
  );
END mlu;

MLU: Architecture Declarative Section
ARCHITECTURE mlu_dataflow OF mlu IS
  SIGNAL A1    : STD_LOGIC;
  SIGNAL B1    : STD_LOGIC;
  SIGNAL Y1    : STD_LOGIC;
  SIGNAL MUX_0 : STD_LOGIC;
  SIGNAL MUX_1 : STD_LOGIC;
  SIGNAL MUX_2 : STD_LOGIC;
  SIGNAL MUX_3 : STD_LOGIC;
  SIGNAL L     : STD_LOGIC_VECTOR(1 DOWNTO 0);

MLU - Architecture Body
BEGIN
  -- Optional inversion of the inputs and of the output
  A1 <= NOT A  WHEN (NEG_A = '1') ELSE A;
  B1 <= NOT B  WHEN (NEG_B = '1') ELSE B;
  Y  <= NOT Y1 WHEN (NEG_Y = '1') ELSE Y1;

  -- The four candidate logic functions
  MUX_0 <= A1 AND  B1;
  MUX_1 <= A1 OR   B1;
  MUX_2 <= A1 XOR  B1;
  MUX_3 <= A1 XNOR B1;

  -- 4-to-1 multiplexer selected by L1 (MSB) and L0 (LSB)
  L <= L1 & L0;
  WITH L SELECT
    Y1 <= MUX_0 WHEN "00",
          MUX_1 WHEN "01",
          MUX_2 WHEN "10",
          MUX_3 WHEN OTHERS;
END mlu_dataflow;

Buffers

Tri-state Buffer
[Figure: (a) a tri-state buffer with input x, enable e, and output f; (b) equivalent circuit: an open switch when e = 0 and a closed switch (f = x) when e = 1; (c) truth table]

(c) Truth table
e  x | f
0  0 | Z
0  1 | Z
1  0 | 0
1  1 | 1

Four types of Tri-state Buffers

Tri-state Buffer – example (1)
LIBRARY ieee;
USE ieee.std_logic_1164.all;

ENTITY tri_state IS
  PORT (
    ena    : IN  STD_LOGIC;
    input  : IN  STD_LOGIC;
    output : OUT STD_LOGIC
  );
END tri_state;

Tri-state Buffer – example (2)
ARCHITECTURE dataflow OF tri_state IS
BEGIN
  -- Drive the input through when enabled, otherwise high impedance
  output <= input WHEN (ena = '1') ELSE 'Z';
END dataflow;
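Tri-state buffers are typically used so that several sources can share one wire. The following is a minimal sketch of that idea, not part of the original slides: the entity name shared_bus and all of its port and signal names are hypothetical. It relies on STD_LOGIC being a resolved type, which is what allows two concurrent assignments to the same signal.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.all;

-- Hypothetical example: two tri-state drivers sharing one wire.
ENTITY shared_bus IS
  PORT (
    ena_a   : IN  STD_LOGIC;
    ena_b   : IN  STD_LOGIC;
    src_a   : IN  STD_LOGIC;
    src_b   : IN  STD_LOGIC;
    bus_out : OUT STD_LOGIC
  );
END shared_bus;

ARCHITECTURE dataflow OF shared_bus IS
  -- STD_LOGIC is resolved, so more than one driver is legal here
  SIGNAL bus_wire : STD_LOGIC;
BEGIN
  -- Each source drives the wire only while its enable is '1'
  bus_wire <= src_a WHEN (ena_a = '1') ELSE 'Z';
  bus_wire <= src_b WHEN (ena_b = '1') ELSE 'Z';

  bus_out <= bus_wire;
END dataflow;
```

In simulation, enabling both drivers with opposite values resolves to 'X', so the surrounding logic is assumed to keep at most one enable asserted at a time.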

Wire and Buses

Merging wires and buses
[Figure: a (4 bits), b (5 bits), and c (1 bit) merge into d (10 bits)]

SIGNAL a : STD_LOGIC_VECTOR(3 DOWNTO 0);
SIGNAL b : STD_LOGIC_VECTOR(4 DOWNTO 0);
SIGNAL c : STD_LOGIC;
SIGNAL d : STD_LOGIC_VECTOR(9 DOWNTO 0);

d <= a & b & c;

Splitting buses
[Figure: d (10 bits) splits into a (4 bits), b (5 bits), and c (1 bit)]

SIGNAL a : STD_LOGIC_VECTOR(3 DOWNTO 0);
SIGNAL b : STD_LOGIC_VECTOR(4 DOWNTO 0);
SIGNAL c : STD_LOGIC;
SIGNAL d : STD_LOGIC_VECTOR(9 DOWNTO 0);

a <= d(9 DOWNTO 6);
b <= d(5 DOWNTO 1);
c <= d(0);

Carry Lookahead Adders

Introduction
There are many ways to implement a digital function, but each approach may have different tradeoffs.
As a digital designer, you need to consider these tradeoffs when meeting design requirements.
As an example, we'll look into different adder architectures and their tradeoffs.
Read Section 5.4 for more details.

Ripple Carry Adder
Full Adder (FA):
$s = x \oplus y \oplus c_{in}$
$c_{out} = xy + x c_{in} + y c_{in}$
A ripple carry (RC) adder is a series of full adders connected through the carry bit: each stage's $c_{out}$ feeds the next stage's $c_{in}$.
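The two full-adder equations map directly onto a dataflow description. The sketch below is one possible way to write it, assuming a hypothetical entity name ripple_adder and a WIDTH generic that do not appear in the slides. A synthesis tool is free to restructure this logic, so the code describes the function rather than a guaranteed gate-level layout.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.all;

-- Hypothetical generic ripple carry adder built from the FA equations.
ENTITY ripple_adder IS
  GENERIC ( WIDTH : POSITIVE := 4 );
  PORT (
    x, y : IN  STD_LOGIC_VECTOR(WIDTH-1 DOWNTO 0);
    cin  : IN  STD_LOGIC;
    s    : OUT STD_LOGIC_VECTOR(WIDTH-1 DOWNTO 0);
    cout : OUT STD_LOGIC
  );
END ripple_adder;

ARCHITECTURE dataflow OF ripple_adder IS
  -- c(i) is the carry into stage i; c(WIDTH) is the final carry out
  SIGNAL c : STD_LOGIC_VECTOR(WIDTH DOWNTO 0);
BEGIN
  c(0) <= cin;

  stage: FOR i IN 0 TO WIDTH-1 GENERATE
    -- s = x xor y xor c_in
    s(i)   <= x(i) XOR y(i) XOR c(i);
    -- c_out = x*y + x*c_in + y*c_in
    c(i+1) <= (x(i) AND y(i)) OR (x(i) AND c(i)) OR (y(i) AND c(i));
  END GENERATE stage;

  cout <= c(WIDTH);
END dataflow;
```

Each c(i+1) depends on c(i), which is the chain the carry has to ripple through.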

Advantage: Area Impact
The ripple carry adder's area grows linearly with bit width.
Each additional bit adds another full adder and its associated gates (+5 gates per bit in the figure).

Disadvantage: Delay Impact
The ripple carry adder's delay also grows linearly with bit width.
Each full adder must wait for the previous full adder to produce its carry output.
Performance drops as bit width increases!

Delay Problem
Delay is dependent on the carry bit "rippling" through each adder:
$c_{i+1} = x_i y_i + (x_i + y_i) c_i$
Can we quickly determine the previous carry's value and reduce delay?
$c_i = x_{i-1} y_{i-1} + (x_{i-1} + y_{i-1}) c_{i-1}$

Simplifying Carry Equation
Let's first simplify how we look at the carry equation.
Carry equation: $c_{i+1} = x_i y_i + (x_i + y_i) c_i$
Simplified: $c_{i+1} = g_i + p_i c_i$, where $g_i = x_i y_i$ and $p_i = x_i + y_i$
How is the carry determined in a given adder stage?
The adder generates a carry based on $g_i$: when $g_i = 1$, $c_{i+1} = (1) + p_i c_i = 1$.
The adder propagates the previous carry based on $p_i$: when $p_i = 1$, $c_{i+1} = g_i + (1) c_i = g_i + c_i$.

Generate/Propagate Example
Example: 0001 + 1011 = 1100
Generate signals ($g_i = x_i y_i$): FA0 generates a carry ($c_1$).
Propagate signals ($p_i = x_i + y_i$): FA1 asserts its carry ($c_2$) by propagating FA0's carry ($c_1$). FA3 de-asserts its carry ($c_4$) by propagating FA2's carry ($c_3$).

carries ($c_4 c_3 c_2 c_1$):  0011
x:      0001
y:    + 1011
sum:    1100

Carry Substitution
Let's use substitution to see how the carry bit, $c_{i+1}$, is impacted by an earlier adder stage.
Carry equations for adder stages $i$ and $i-1$:
$c_{i+1} = g_i + p_i c_i$
$c_i = g_{i-1} + p_{i-1} c_{i-1}$
With substitution for $c_i$:
$c_{i+1} = g_i + p_i (g_{i-1} + p_{i-1} c_{i-1})$
$c_{i+1} = g_i + p_i g_{i-1} + p_i p_{i-1} c_{i-1}$

Carry Pattern
Carry equation: $c_{i+1} = g_i + p_i c_i$
Substitutions:
$c_1 = g_0 + p_0 c_0$
$c_2 = g_1 + p_1 c_1$
$c_2 = g_1 + p_1 (g_0 + p_0 c_0)$
$c_2 = g_1 + p_1 g_0 + p_1 p_0 c_0$
$c_3 = g_2 + p_2 c_2$
$c_3 = g_2 + p_2 (g_1 + p_1 g_0 + p_1 p_0 c_0)$
$c_3 = g_2 + p_2 g_1 + p_2 p_1 g_0 + p_2 p_1 p_0 c_0$

Removing Dependencies
The carry signal ($c_{i+1}$) is now determined by an earlier adder stage's carry ($c_{i-1}$ or $c_{i-2}$) rather than by $c_i$.
$c_{i+1} = g_i + p_i g_{i-1} + p_i p_{i-1} c_{i-1}$
$c_{i+1} = g_i + p_i g_{i-1} + p_i p_{i-1} g_{i-2} + p_i p_{i-1} p_{i-2} c_{i-2}$
$c_{i+1}$ is dependent on the following:
Case 1: the current stage generating a carry ($g_i$)
Case 2: the current stage propagating a carry generated by a previous stage ($p_i g_{i-1}$, or $p_i g_{i-1} + p_i p_{i-1} g_{i-2}$)
Case 3: the current and previous stages propagating a carry from an earlier stage ($p_i p_{i-1} c_{i-1}$, or $p_i p_{i-1} p_{i-2} c_{i-2}$)
We can perform this substitution for each carry bit.

New Carry Dependence
Through substitution, every carry signal can be a function of solely $c_0$, $x$, and $y$.
The carry can be determined as soon as the inputs are ready.
This avoids waiting for the previous carry ($c_{i-1}$) to ripple.
$c_3 = g_2 + p_2 g_1 + p_2 p_1 g_0 + p_2 p_1 p_0 c_0$
Three logic levels of delay:
1. Produce $g_i$ and $p_i$ terms from $x_i$ and $y_i$
2. Produce carry product terms (e.g. $p_2 g_1$, $p_2 p_1 g_0$, and $p_2 p_1 p_0 c_0$)
3. Produce final carry terms ($c_i$) from those product terms

Carry Lookahead Adder (CLA)
Steps:
1. Produce $g_i$ and $p_i$ terms from $x_i$ and $y_i$
2. Produce carry product terms (e.g. $p_2 g_1$, $p_2 p_1 g_0$, and $p_2 p_1 p_0 c_0$)
3. Produce final carry terms ($c_i$) from those product terms
All carry terms are calculated concurrently and independently of any previous carry calculation.
[Figure: carry-lookahead adder with the propagate/generate terms, carry product terms, and carry terms labeled [1]]
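The three steps can be written out for a 4-bit adder as a dataflow sketch. The entity name cla4 and its port names are hypothetical, chosen here for illustration. Every carry is expressed only in terms of g, p, and c0, following the substituted carry equations.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.all;

-- Hypothetical 4-bit carry lookahead adder.
ENTITY cla4 IS
  PORT (
    x, y : IN  STD_LOGIC_VECTOR(3 DOWNTO 0);
    c0   : IN  STD_LOGIC;
    s    : OUT STD_LOGIC_VECTOR(3 DOWNTO 0);
    c4   : OUT STD_LOGIC
  );
END cla4;

ARCHITECTURE dataflow OF cla4 IS
  SIGNAL g, p : STD_LOGIC_VECTOR(3 DOWNTO 0);
  SIGNAL c    : STD_LOGIC_VECTOR(3 DOWNTO 0);  -- carry into each stage
BEGIN
  -- Step 1: generate and propagate terms
  g <= x AND y;
  p <= x OR y;

  -- Steps 2 and 3: product terms ORed into each carry, all from c0
  c(0) <= c0;
  c(1) <= g(0) OR (p(0) AND c0);
  c(2) <= g(1) OR (p(1) AND g(0)) OR (p(1) AND p(0) AND c0);
  c(3) <= g(2) OR (p(2) AND g(1)) OR (p(2) AND p(1) AND g(0))
          OR (p(2) AND p(1) AND p(0) AND c0);
  c4   <= g(3) OR (p(3) AND g(2)) OR (p(3) AND p(2) AND g(1))
          OR (p(3) AND p(2) AND p(1) AND g(0))
          OR (p(3) AND p(2) AND p(1) AND p(0) AND c0);

  -- Sum bits use the carry into each stage
  s <= x XOR y XOR c;
END dataflow;
```

Unlike a ripple carry description, no carry assignment here reads another carry signal, so none of them has to wait on an earlier stage.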

Constant Delay
Ideally, the propagation delay is now constant, even as the adder width increases!
Each CLA stage calculates its outputs concurrently.
By contrast, the RC adder ripples the carry through each stage, so its delay grows with width.

Area Cost
Area now grows quadratically with width.
Propagate/generate logic grows with each stage:
$c_2 = g_1 + p_1 g_0 + p_1 p_0 c_0$
$c_3 = g_2 + p_2 g_1 + p_2 p_1 g_0 + p_2 p_1 p_0 c_0$
$c_4 = g_3 + p_3 g_2 + p_3 p_2 g_1 + p_3 p_2 p_1 g_0 + p_3 p_2 p_1 p_0 c_0$
Not practical for larger adders.
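One rough way to see the quadratic growth is to count the terms ORed together in each carry equation. This is an illustrative measure chosen here, not the slides' own area model, and it ignores gate fan-in. Following the pattern above, $c_k$ has $k + 1$ terms, so an $n$-bit adder needs:

```latex
\begin{align*}
\text{terms}(c_k) &= k + 1 \\
\sum_{k=1}^{n} (k + 1) &= \frac{n(n+3)}{2} \\
n = 4:&\quad 2 + 3 + 4 + 5 = 14 \\
n = 16:&\quad \frac{16 \cdot 19}{2} = 152
\end{align*}
```

Quadrupling the width from 4 to 16 bits multiplies the term count by roughly 11, and the widest terms also need gates with more inputs.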

Other Tradeoffs?
We know of two adders with different advantages as width increases.
The ripple carry adder's area grows linearly, but it suffers from linear growth in delay.
The carry lookahead adder's delay is constant, but it suffers from quadratic growth in area.
Can we use a hybrid architecture to limit both area and delay growth?

Hybrid Approach
Make a ripple carry architecture out of multi-bit CLA adders, or CLA blocks.
E.g. 4-bit CLA adders connected in a ripple carry fashion.

Area Impact
The hybrid adder's area grows linearly.
Additional bits add another CLA adder and its associated gates.
Slower area growth than pure CLA, but larger area than pure RC.

Delay Impact
The hybrid adder's delay grows linearly.
Each CLA adder must wait for the previous CLA adder to produce its carry output, but no longer on a bit-by-bit basis!
Lower delay growth than pure RC.
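A hybrid adder of this kind could be sketched as below for 16 bits, assuming 4-bit blocks. The entity name hybrid_adder16 and its signal names are hypothetical. Inside a block, each carry is a lookahead expression of the block's carry-in c(k). Between blocks, the carry-out c(k+4) of one block is the carry-in of the next, which is the ripple part. The declaration inside the GENERATE statement assumes VHDL-93 or later.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.all;

-- Hypothetical 16-bit hybrid adder: four 4-bit CLA blocks whose
-- block carries c(4), c(8), c(12) ripple from one block to the next.
ENTITY hybrid_adder16 IS
  PORT (
    x, y : IN  STD_LOGIC_VECTOR(15 DOWNTO 0);
    cin  : IN  STD_LOGIC;
    s    : OUT STD_LOGIC_VECTOR(15 DOWNTO 0);
    cout : OUT STD_LOGIC
  );
END hybrid_adder16;

ARCHITECTURE dataflow OF hybrid_adder16 IS
  SIGNAL g, p : STD_LOGIC_VECTOR(15 DOWNTO 0);
  SIGNAL c    : STD_LOGIC_VECTOR(16 DOWNTO 0);  -- c(i) = carry into bit i
BEGIN
  g    <= x AND y;
  p    <= x OR y;
  c(0) <= cin;

  cla_block: FOR b IN 0 TO 3 GENERATE
    CONSTANT k : NATURAL := 4 * b;  -- index of the block's lowest bit
  BEGIN
    -- Lookahead inside the block: every carry depends only on c(k)
    c(k+1) <= g(k) OR (p(k) AND c(k));
    c(k+2) <= g(k+1) OR (p(k+1) AND g(k))
              OR (p(k+1) AND p(k) AND c(k));
    c(k+3) <= g(k+2) OR (p(k+2) AND g(k+1))
              OR (p(k+2) AND p(k+1) AND g(k))
              OR (p(k+2) AND p(k+1) AND p(k) AND c(k));
    -- Block carry-out: becomes the next block's carry-in (ripple)
    c(k+4) <= g(k+3) OR (p(k+3) AND g(k+2))
              OR (p(k+3) AND p(k+2) AND g(k+1))
              OR (p(k+3) AND p(k+2) AND p(k+1) AND g(k))
              OR (p(k+3) AND p(k+2) AND p(k+1) AND p(k) AND c(k));
  END GENERATE cla_block;

  s    <= x XOR y XOR c(15 DOWNTO 0);
  cout <= c(16);
END dataflow;
```

Widening the adder in this style adds more blocks of fixed size, which matches the linear area growth, while the chain c(0), c(4), c(8), c(12), c(16) is where the linear delay comes from.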

Alternative Hierarchical Approach
In the hybrid approach, we use the RC architecture on multi-bit CLA adders.
Alternatively, we can also use the CLA architecture with multi-bit CLA adders.
Use the CLA architecture to gain constant delay as width increases.
Propagate and generate logic must be determined on a CLA-block basis.

Another Look at Carry Equations
Let's simplify how we look at the carry equation for CLA blocks.
Carry equation for a 4-bit CLA block:
$c_4 = g_3 + p_3 g_2 + p_3 p_2 g_1 + p_3 p_2 p_1 g_0 + (p_3 p_2 p_1 p_0) c_0$
Simplified: $c_4 = G_0 + P_0 c_0$, where
$G_0 = g_3 + p_3 g_2 + p_3 p_2 g_1 + p_3 p_2 p_1 g_0$
$P_0 = p_3 p_2 p_1 p_0$
How is each block's carry determined?
The CLA block generates a carry based on $G_i$: either the last stage in the block generates a carry ($g_3$), or another stage generates a carry that is propagated by all later stages.
The CLA block propagates the previous carry based on $P_i$: every stage in the block propagates the input carry ($p_3 p_2 p_1 p_0$).
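As a worked check, take the operands from the earlier generate/propagate example, x = 0001 and y = 1011, and treat them as a single 4-bit block. Reusing that example here is a choice made for illustration.

```latex
\begin{align*}
g_3 g_2 g_1 g_0 &= 0001, \qquad p_3 p_2 p_1 p_0 = 1011 \\
G_0 &= g_3 + p_3 g_2 + p_3 p_2 g_1 + p_3 p_2 p_1 g_0 = 0 + 0 + 0 + 0 = 0 \\
P_0 &= p_3 p_2 p_1 p_0 = 1 \cdot 0 \cdot 1 \cdot 1 = 0 \\
c_4 &= G_0 + P_0 c_0 = 0
\end{align*}
```

The block neither generates nor propagates a carry because $p_2 = 0$: stage 2 stops the carry generated by stage 0, so $c_4 = 0$ whatever the value of $c_0$.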

Hierarchical Carry Substitution
Let's use substitution to see how the carry bit is impacted by earlier CLA blocks.
Carry equations for the first two CLA blocks:
$c_4 = G_0 + P_0 c_0$
$c_8 = G_1 + P_1 c_4$
With substitution for $c_4$:
$c_8 = G_1 + P_1 (G_0 + P_0 c_0)$
$c_8 = G_1 + P_1 G_0 + P_1 P_0 c_0$
Similar to regular CLA.

Hierarchical Substitutions
With four 4-bit CLA blocks:
$c_4 = G_0 + P_0 c_0$
$c_8 = G_1 + P_1 c_4$
$c_8 = G_1 + P_1 (G_0 + P_0 c_0)$
$c_8 = G_1 + P_1 G_0 + P_1 P_0 c_0$
$c_{12} = G_2 + P_2 c_8$
$c_{12} = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 c_0$
$c_{16} = G_3 + P_3 c_{12}$
$c_{16} = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 c_0$
Similar to regular CLA substitutions.

Removing Dependencies
The carry signal ($c_8$) is now determined by an earlier CLA block's carry ($c_0$).
$c_8 = G_1 + P_1 G_0 + P_1 P_0 c_0$
$c_{12} = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 c_0$
$c_{4i}$ is dependent on the following:
Case 1: the current block generating a carry ($G_1$ or $G_2$)
Case 2: the current block propagating a carry generated by a previous CLA block ($P_1 G_0$, or $P_2 G_1 + P_2 P_1 G_0$)
Case 3: the current and previous blocks propagating a carry from an earlier CLA block ($P_1 P_0 c_0$, or $P_2 P_1 P_0 c_0$)
We can perform this substitution for each CLA carry bit.

Hierarchical CLA
Steps:
1. Produce $g_i$ and $p_i$ terms from $x_i$ and $y_i$
2. Produce block $G$ and $P$ terms from the $g_i$ and $p_i$ terms
3. Produce carry product terms (e.g. $P_2 G_1$, $P_2 P_1 G_0$, and $P_2 P_1 P_0 c_0$)
4. Produce final carry terms ($c_i$) from those product terms
[Figure 2. A hierarchical carry-lookahead adder [1], with the propagate/generate terms (steps 1 & 2) and the carry terms (steps 3 & 4) labeled]
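A 16-bit hierarchical CLA along these lines might look like the sketch below. The entity name hier_cla16 is hypothetical, and the block terms are named bg and bp because VHDL identifiers are case-insensitive, so G and g would name the same signal. The block carries c(4), c(8), c(12), and c(16) come straight from the block terms and c0, and the carries inside each block use the block's own carry-in. The declaration inside the GENERATE statement assumes VHDL-93 or later.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.all;

-- Hypothetical 16-bit hierarchical CLA with four 4-bit blocks.
ENTITY hier_cla16 IS
  PORT (
    x, y : IN  STD_LOGIC_VECTOR(15 DOWNTO 0);
    c0   : IN  STD_LOGIC;
    s    : OUT STD_LOGIC_VECTOR(15 DOWNTO 0);
    c16  : OUT STD_LOGIC
  );
END hier_cla16;

ARCHITECTURE dataflow OF hier_cla16 IS
  SIGNAL g, p   : STD_LOGIC_VECTOR(15 DOWNTO 0);
  SIGNAL bg, bp : STD_LOGIC_VECTOR(3 DOWNTO 0);   -- block G and P terms
  SIGNAL c      : STD_LOGIC_VECTOR(16 DOWNTO 0);  -- c(i) = carry into bit i
BEGIN
  -- Bit-level generate and propagate terms
  g <= x AND y;
  p <= x OR y;

  cla_block: FOR j IN 0 TO 3 GENERATE
    CONSTANT k : NATURAL := 4 * j;  -- index of the block's lowest bit
  BEGIN
    -- Block-level generate and propagate terms
    bg(j) <= g(k+3) OR (p(k+3) AND g(k+2))
             OR (p(k+3) AND p(k+2) AND g(k+1))
             OR (p(k+3) AND p(k+2) AND p(k+1) AND g(k));
    bp(j) <= p(k+3) AND p(k+2) AND p(k+1) AND p(k);
    -- Carries inside the block, from the block carry-in c(k)
    c(k+1) <= g(k) OR (p(k) AND c(k));
    c(k+2) <= g(k+1) OR (p(k+1) AND g(k))
              OR (p(k+1) AND p(k) AND c(k));
    c(k+3) <= g(k+2) OR (p(k+2) AND g(k+1))
              OR (p(k+2) AND p(k+1) AND g(k))
              OR (p(k+2) AND p(k+1) AND p(k) AND c(k));
  END GENERATE cla_block;

  -- Block carries: second-level lookahead from c0
  c(0)  <= c0;
  c(4)  <= bg(0) OR (bp(0) AND c0);
  c(8)  <= bg(1) OR (bp(1) AND bg(0)) OR (bp(1) AND bp(0) AND c0);
  c(12) <= bg(2) OR (bp(2) AND bg(1)) OR (bp(2) AND bp(1) AND bg(0))
           OR (bp(2) AND bp(1) AND bp(0) AND c0);
  c(16) <= bg(3) OR (bp(3) AND bg(2)) OR (bp(3) AND bp(2) AND bg(1))
           OR (bp(3) AND bp(2) AND bp(1) AND bg(0))
           OR (bp(3) AND bp(2) AND bp(1) AND bp(0) AND c0);

  s   <= x XOR y XOR c(15 DOWNTO 0);
  c16 <= c(16);
END dataflow;
```

Each bg(j) and bp(j) is computed once and reused by every later block carry, which is the sharing of common terms that keeps the area below that of a flat 16-bit CLA.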

Hierarchical CLA
Compared to pure CLA:
Smaller quadratic area growth. The block propagate/generate logic extracts common propagate/generate terms, which avoids duplicating gates for those terms in each carry equation.
Larger constant delay, due to the block propagate/generate logic.

Practical Limitations
At larger bit widths, the carry equations require gates with many inputs, e.g. $p_6 p_5 p_4 p_3 p_2 p_1 p_0 c_0$.
Fan-in limits the number of inputs on a gate.
A network of smaller gates can be used to work around the fan-in limit, at the cost of additional delay.
Signals may have large fan-out as well.
Gates may need to be duplicated to limit fan-out, which increases area cost.
[Figure: ideal vs. practical gate implementation]

Multi-level Hierarchical Approaches
As with the past two architectures, we can add further hierarchical levels for different area/delay tradeoffs.
This may help with fan-in/fan-out as width increases.
The ideal number of levels depends on the width.

Conclusions
Two adder architectures with different area/delay tradeoffs:
Ripple carry adder
Carry lookahead adder
Two hierarchical architectures with different area/delay tradeoffs:
Hybrid architecture
Hierarchical CLA
Questions?

References
[1] Stephen Brown and Zvonko Vranesic. Fundamentals of Digital Logic with VHDL Design with CD-ROM (2 ed.). McGraw-Hill, Inc., New York, NY, USA.
