Jackson Adders Prof. David Money Harris Matthew Keeter, Andrew Macrae,

Presentation transcript:

Jackson Adders
Prof. David Money Harris
Matthew Keeter, Andrew Macrae, Tynan McAuley, Becky Glick, Madeleine Ong
21 December 2010

Overview
Definitions
Tree Adders
Ling Adders
Jackson Adders
18-bit Jackson Tree
Evaluation Methodology
Preliminary Results

Addition
Carry-propagate adder
Inputs: $A_{N:0}$, $B_{N:1}$, with $A_0 = C_{in}$
Outputs: $S_{N:1}$
Discard $C_{out}$

Propagate, Generate, Kill, Oh My!
Bitwise signals
  Generate: $G_{i:i} = G_i \equiv A_i B_i$
  Propagate: $P_{i:i} = P_i \equiv A_i + B_i$ (also called $\overline{K_i}$)
  $X_i \equiv A_i \oplus B_i$
Group recursion to form prefixes
  Propagate: $P_{i:j} = P_{i:k} P_{k-1:j}$
  Generate: $G_{i:j} = G_{i:k} + P_{i:k} G_{k-1:j}$
  A group generates if the upper part generates, or if the upper part propagates and the lower part generates.
Bitwise sum
  $S_i = X_i \oplus G_{i-1:0}$
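The definitions above can be exercised with a short Python sketch. It is only an illustration, not the authors' code: it applies the generate recursion one bit at a time (the ripple special case, with k = i), and it assumes bit 0 carries the carry in as G_0 = C_in with P_0 = 0. The name `pg_add` is hypothetical.

```python
def pg_add(a: int, b: int, cin: int, n: int) -> int:
    """Add two n-bit numbers using bitwise generate/propagate signals.

    Operand bits occupy positions 1..n; position 0 holds the carry in.
    The carry out is discarded.
    """
    # Assumption of this sketch: G_0 = Cin and P_0 = 0.
    g = [cin] + [((a >> i) & 1) & ((b >> i) & 1) for i in range(n)]
    p = [0] + [((a >> i) & 1) | ((b >> i) & 1) for i in range(n)]
    x = [0] + [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(n)]

    # group_g[i] holds G_{i:0}, built with G_{i:0} = G_i + P_i G_{i-1:0}.
    group_g = [g[0]]
    for i in range(1, n + 1):
        group_g.append(g[i] | (p[i] & group_g[i - 1]))

    # Bitwise sum: S_i = X_i xor G_{i-1:0} for i = 1..n.
    s = 0
    for i in range(1, n + 1):
        s |= (x[i] ^ group_g[i - 1]) << (i - 1)
    return s
```

A tree adder computes the same G_{i:0} prefixes, but organizes the recursion in logarithmic depth instead of this linear chain.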

Higher Valency Groups
Valency-2
  Propagate: $P_{i:j} = P_{i:k} P_{k-1:j}$
  Generate: $G_{i:j} = G_{i:k} + P_{i:k} G_{k-1:j}$
Valency-3
  Propagate: $P_{i:j} = P_{i:k} P_{k-1:l} P_{l-1:j}$
  Generate: $G_{i:j} = G_{i:k} + P_{i:k} (G_{k-1:l} + P_{k-1:l} G_{l-1:j})$
Valency-4
  Propagate: $P_{i:j} = P_{i:k} P_{k-1:l} P_{l-1:m} P_{m-1:j}$
  Generate: $G_{i:j} = G_{i:k} + P_{i:k} (G_{k-1:l} + P_{k-1:l} (G_{l-1:m} + P_{l-1:m} G_{m-1:j}))$
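A quick way to see that a higher-valency cell is just a flattened chain of valency-2 cells is to compare the two exhaustively. The following Python sketch does this for valency 3; the function names are hypothetical and chosen only for this illustration.

```python
from itertools import product


def valency2(g_hi, p_hi, g_lo, p_lo):
    """Merge an upper and a lower group; returns (G, P) of the merged group."""
    return g_hi | (p_hi & g_lo), p_hi & p_lo


def valency3(g2, p2, g1, p1, g0, p0):
    """Valency-3 cell: G = G2 + P2 (G1 + P1 G0), P = P2 P1 P0."""
    return g2 | (p2 & (g1 | (p1 & g0))), p2 & p1 & p0


def valency3_matches_nested_valency2() -> bool:
    """Check every input combination against two cascaded valency-2 cells."""
    for g2, p2, g1, p1, g0, p0 in product((0, 1), repeat=6):
        lower = valency2(g1, p1, g0, p0)
        if valency3(g2, p2, g1, p1, g0, p0) != valency2(g2, p2, *lower):
            return False
    return True
```

The single valency-3 cell replaces two logic levels with one more complex gate, which is the trade-off that higher-valency trees explore.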

Tree Adders
How should the recursion be organized?

Black and Gray Cells
Black cell: computes group G and P
Gray cell: computes group G only
Inverting vs. noninverting
Higher valency

Tree Adders

Higher Valency Trees

Sparse Trees
Sklansky tree with sparseness 4
Only compute prefixes for every 4th column
Precompute 4-bit results for each possible carry in
Select the result based on the carry (group generate)
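The precompute-and-select idea can be sketched in Python as follows. This is a behavioral illustration under stated assumptions, not a model of the Sklansky network: the carries into each 4-bit group are rippled from group to group here, whereas the sparse tree would compute them in parallel. The name `sparse_add` and its parameters are hypothetical.

```python
def sparse_add(a: int, b: int, n: int = 16, sparseness: int = 4) -> int:
    """Add two n-bit numbers, resolving carries only at group boundaries."""
    mask = (1 << sparseness) - 1
    carry = 0  # carry into the current group; the adder's carry in is 0
    result = 0
    for base in range(0, n, sparseness):
        a_grp = (a >> base) & mask
        b_grp = (b >> base) & mask
        # Precompute both candidate results; neither waits for the carry.
        sum_if_0 = a_grp + b_grp
        sum_if_1 = a_grp + b_grp + 1
        # The late-arriving carry only steers the final selection.
        chosen = sum_if_1 if carry else sum_if_0
        result |= (chosen & mask) << base
        # Carry out of the group for each possible carry in.
        carry_if_0 = sum_if_0 >> sparseness  # group generates
        carry_if_1 = sum_if_1 >> sparseness  # group generates or propagates
        carry = carry_if_0 | (carry_if_1 & carry)
    return result & ((1 << n) - 1)
```

Only one carry per group has to come from the prefix network, so the network needs a quarter of the columns at sparseness 4.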

Carry Selection

Ling Adders
Factor some complexity out of the first term
Insert it back into sum selection
Removes 1 transistor from the critical path
Exploits the fact that $G_i P_i = (A_i B_i)(A_i + B_i) = G_i$

Ling Equations
Define pseudogenerate: $H_{i:j} \equiv G_i + G_{i-1:j}$
  Simpler than $G_{i:j} = G_i + P_i G_{i-1:j}$
  Recreate $G_{i:j} = P_i H_{i:j} = P_i (G_i + G_{i-1:j}) = G_i + P_i G_{i-1:j}$
Define pseudopropagate: $I_{i:j} \equiv P_{i-1:j-1}$
  Shifted version of group propagate
Valency-2 recursion is the same as for PG
  $H_{i:j} = H_{i:k} + I_{i:k} H_{k-1:j}$
  $I_{i:j} = I_{i:k} I_{k-1:j}$
Sum: $S_i = X_i \oplus G_{i-1:0} = X_i \oplus (P_{i-1} H_{i-1:0})$
  Selection mux: $S_i = H_{i-1:0} \;?\; [X_i \oplus P_{i-1}] : X_i$
  The sum selection mux chooses $S_i$ based on the late-arriving $H_{i-1:0}$
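The Ling identity and the selection mux can be checked with a small Python sketch. It is an illustration only: bits are indexed 0..n-1 with no carry in (an assumption that differs from the slides' convention of placing C_in at bit 0), H is derived directly from its definition rather than through the H/I prefix recursion, and the names `_bit` and `ling_add` are hypothetical.

```python
def _bit(value: int, i: int) -> int:
    """Return bit i of value."""
    return (value >> i) & 1


def ling_add(a: int, b: int, n: int) -> int:
    """Add two n-bit numbers (carry in 0) using Ling pseudogenerate signals."""
    g = [_bit(a, i) & _bit(b, i) for i in range(n)]
    p = [_bit(a, i) | _bit(b, i) for i in range(n)]
    x = [_bit(a, i) ^ _bit(b, i) for i in range(n)]

    # group_g[i] = G_{i:0}; h[i] = H_{i:0} = G_i + G_{i-1:0}.
    group_g, h = [], []
    for i in range(n):
        below = group_g[i - 1] if i > 0 else 0
        h.append(g[i] | below)
        group_g.append(g[i] | (p[i] & below))
        # Ling's identity: G_{i:0} = P_i H_{i:0}.
        assert group_g[i] == p[i] & h[i]

    s = x[0]  # bit 0 has no incoming carry in this sketch
    for i in range(1, n):
        # Selection mux: S_i = H_{i-1:0} ? (X_i xor P_{i-1}) : X_i
        s_i = (x[i] ^ p[i - 1]) if h[i - 1] else x[i]
        s |= s_i << i
    return s
```

Both mux data inputs depend only on bitwise signals, so they are ready long before the select input H arrives from the prefix tree.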

Ling Circuits
Simplifies first stage
Compute $H_{i+1:i}$ in one swell foop
Figure labels: Easy; Too hard

Jackson Adders
Generalized Ling technique
  Simplify logic in the prefix tree as well
  Use sum selection to reinsert missing terms
  Balance logic so that the data and select inputs to the sum mux are comparable in criticality
Developed by Jackson and Talwar in 2004
  Used in the Arithmetica synthesis tool
  Parameterized by architecture, valency, and sparseness
  Reportedly produced superior energy-delay tradeoffs
[Burgess09] indicates benefits over standard designs
No comprehensible complete published designs

Jackson Logic
Define new terms
  D: a group generates or propagates a carry
  B (special case): a group generates a carry in at least one bit
Rewrite group generate
  A group generates if the upper part generates or propagates, and either at least one bit of the upper part generates or the lower part generates.
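The slide's equations did not survive in the transcript, so the sketch below works from the prose alone. It assumes D = G + P for the upper group, B = OR of the upper group's bitwise generates, and the rewritten group generate G = D (B + G_low). These formulas are one reading of the text, not a quotation of the slide, and the function names are hypothetical. The Python check confirms exhaustively that this reading reproduces the ordinary group generate.

```python
from itertools import product


def group_signals(a_bits, b_bits):
    """Return (G, P, B) for a group; index 0 of each list is the most significant bit."""
    g = [x & y for x, y in zip(a_bits, b_bits)]
    p = [x | y for x, y in zip(a_bits, b_bits)]
    group_g = 0
    for gi, pi in zip(reversed(g), reversed(p)):  # ripple from LSB to MSB
        group_g = gi | (pi & group_g)
    return group_g, int(all(p)), int(any(g))


def jackson_identity_holds(upper: int = 3, lower: int = 2) -> bool:
    """Check G == D_upper (B_upper + G_lower) for every operand pair."""
    width = upper + lower
    for bits in product((0, 1), repeat=2 * width):
        a, b = bits[:width], bits[width:]
        g_all, _, _ = group_signals(a, b)
        g_up, p_up, b_up = group_signals(a[:upper], b[:upper])
        g_lo, _, _ = group_signals(a[upper:], b[upper:])
        d_up = g_up | p_up  # D: upper group generates or propagates
        if g_all != (d_up & (b_up | g_lo)):
            return False
    return True
```

The check relies on P being the OR of the operand bits: when every upper bit propagates and at least one generates, the upper group as a whole generates.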

Reduced Generate
Rename the bracketed term of the rewritten group generate as the reduced generate R
$R^p$ is like G with the top p propagate signals stripped out
  $R^0_{i:j} = G_{i:j}$
  $R^1_{i:j} = H_{i:j}$
  Jackson considers $p \ge 2$
Group generate can be rewritten in terms of R
Computing R prefixes can be easier than computing G
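Since the slide's equations are missing from the transcript, the following LaTeX block spells out one form of $R^p$ that is consistent with the text: stripping the top p propagate signals from the expanded group generate. The closed form in terms of B and D is an assumption reconstructed from the prose, checked only against the two special cases the slide states.

```latex
% Expanded group generate (from the PG recursion):
G_{i:j} = G_i + P_i G_{i-1} + P_i P_{i-1} G_{i-2} + \cdots

% Assumed reduced generate: drop P_i, \ldots, P_{i-p+1} from every term.
R^p_{i:j} = \underbrace{G_i + G_{i-1} + \cdots + G_{i-p+1}}_{B_{i:i-p+1}} + G_{i-p:j}

% Special cases stated on the slide:
R^0_{i:j} = G_{i:j}, \qquad
R^1_{i:j} = G_i + G_{i-1:j} = H_{i:j}

% Example with p = 2:
R^2_{i:j} = G_i + G_{i-1} + G_{i-2:j}

% Group generate recovered from R (for p \ge 1), with D = G + P of the top p bits:
G_{i:j} = D_{i:i-p+1} \, R^p_{i:j}
```

For p = 1 the last line reduces to Ling's $G_{i:j} = P_i H_{i:j}$, because $D_{i:i} = G_i + P_i = P_i$.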

Hyperpropagate
Another term will be useful for the recursion: hyperpropagate (Q)
Define: [equation not in transcript]
Special case for 2-bit groups: [equation not in transcript]

Jackson Recursions
Valency-2 is no simpler
Valency-3 simplifies R at the expense of Q
Compare with: [equations not in transcript]

Valency-3 Circuits
Compound gate implementation
Simpler gate implementation

Logical Effort of Valency-3
Columns: PG, RQ Compound, RQ Simpler
Ggenerate: 4, 2.67, 2.22
Gpropagate: 1.67, 3.33, 2.77
Pgenerate: 5, 4.33
Ppropagate: 4.66

Sum Selection
Select the sum based on $R^p_{i-1:0}$
Requires a p-bit D signal for the sum-selection data input
  This is the complexity that is factored out of R
D recursion

Prior Work
[Jackson04]
  + Introduced R and Q
  + Showed how to compute a single sum output
  - Does not show how to build an entire adder
  - Does not include recursions for D or for valency-2 R/Q
[Burgess09]
  + Comments on critical path
  + Comparisons suggest benefits of Jackson adder
  - Hard-to-decipher diagram of 24-bit adder

Example 18-bit Jackson Adder
Sklansky tree with sparseness 2
Valency-2 initial stage (like Ling)
Valency-3 second and third stages
Only 4 levels of noninverting logic

Initial Stage
Reduced generate
Hyperpropagate
Also need $g_i$ for even bits, $p_i$ for odd bits, and $x_i$ for all bits, for the sum selection logic

Second Stage
Compute 3-bit and 6-bit group signals
Note the potential for sharing common terms

Third Stage
Reduced generate signals for all groups

D Logic
Medium-length groups of D are required for sum selection
Note that $D_{17:9}$ depends on $R^3_{17:12}$
Hence, it arrives at the same time as $R^9_{17:0}$

Sum Selection
Sparseness of 2 requires a 1-bit ripple from even to odd bits

Prefix Network

Comparison Methodology
Goal: energy-delay curves for Jackson adders compared to conventional adders
How can we objectively compare against the best conventional design?
  Technology mapping challenges
  Sizing
    Gatesizer limitations
    SCOT is better, but we only have 130 nm models
  Inadequate design effort on conventional cases
Plan: synthesize with Design Compiler
  Compare against assign y = a + b;

Cell Library
IBM 45 nm partially depleted SOI 12S
ARM library: sc12_base_v31_rvt_soi12s0_ss_nominal_max_0p90v_125c_mxs.lib
  A12TR library with regular-Vt (RVT) transistors
  12-track cell height (1.68 μm)
  Typical operating point: 1.0 V, 25 °C
  We use the worst-case slow-slow library: 0.9 V, 125 °C
  Use the Maxsol (mxs) version for worst-case history effect
1X inverter INV_X1B_A12TR
  Width = 0.38 μm
  $C_{in}$ = 1.6 fF
  FO4 delay = 15 ps
  Switching energy: 0.00078 μW/MHz ≈ 0.8 fJ, equal to $0.5\,C_{in} V_{DD}^2$
  Leakage power: 0.1 μW (very high!)
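The inverter figures can be sanity-checked with a few lines of Python. This is only arithmetic on the numbers quoted on the slide (an input capacitance of 1.6 fF, supplies of 1.0 V and 0.9 V, an FO4 delay of 15 ps); the function names are hypothetical.

```python
def switching_energy_fj(c_in_ff: float, vdd: float) -> float:
    """Return 0.5 * C * V^2 in femtojoules, for C in femtofarads and V in volts."""
    return 0.5 * c_in_ff * vdd ** 2


def delay_in_fo4(delay_ps: float, fo4_ps: float = 15.0) -> float:
    """Express a delay in units of the library's FO4 inverter delay."""
    return delay_ps / fo4_ps


if __name__ == "__main__":
    print(switching_energy_fj(1.6, 1.0))  # energy at the typical 1.0 V supply
    print(switching_energy_fj(1.6, 0.9))  # energy at the worst-case 0.9 V supply
    print(delay_in_fo4(105.0))            # a 105 ps delay expressed in FO4
```

At 1.0 V the formula gives 0.8 fJ, matching the quoted switching energy; at the 0.9 V corner it gives about 0.65 fJ.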

Preliminary Results
The truncated 18-bit Jackson adder slightly outperforms y = a + b at high energy
The Ling adder is also slightly beneficial
The fastest designs are 105 ps (7 FO4)
Jackson takes more energy except at very long delay

References
[Burgess09] N. Burgess, "Implementation of recursive Ling adders in CMOS VLSI," Proc. Asilomar Conf. Signals, Systems and Computers, 2009, pp. 1777-1781.
[Jackson04] R. Jackson and S. Talwar, "High speed binary addition," Proc. Asilomar Conf. Signals, Systems and Computers, 2004, pp. 1350-1353.
[Jackson08] R. Jackson, "Data detection algorithms for perpendicular magnetic recording in the presence of strong media noise," Ph.D. thesis, Department of Mathematics, University of Warwick, 2008.
[Ling81] H. Ling, "High-speed binary adder," IBM J. Research and Development, vol. 25, no. 3, May 1981, pp. 156-166.
[Patil07] D. Patil, O. Azizi, M. Horowitz, R. Ho, and R. Ananthraman, "Robust energy-efficient adder topologies," Proc. Computer Arithmetic Symp., Jun. 2007, pp. 16-28.
[Weste10] N. Weste and D. Money Harris, CMOS VLSI Design, 4th Ed., Boston: Addison-Wesley, 2010.
[Zlatanovici09] R. Zlatanovici, S. Kao, and B. Nikolic, "Energy-delay optimization of 64-bit carry-lookahead adders with a 240 ps 90 nm CMOS design example," IEEE J. Solid-State Circuits, vol. 44, no. 2, Feb. 2009, pp. 569-583.