Published by Arthur Hand. Modified over 10 years ago.
1
Using Carry-Save Adders
For Radix-4, a CSA Can Be Used to Generate 3a, with No Booth’s Recoding
Slight Delay Penalty from CSA – 3 Gates
2
Upper Half P in Stored Carry
For Radix-2, a Better Use is Keeping the Cumulative Product in Redundant Form for the First k-1 Cycles
Then Use a CPA in the Last Cycle
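The idea on this slide can be sketched in a few lines of Python. This is only a behavioral model under stated assumptions: operands are unsigned, the multiplicand is shifted left instead of the partial product being shifted right, and the names csa and multiply_carry_save are hypothetical, not taken from the slides. For simplicity every cycle uses the CSA and one carry-propagate addition follows at the end.

```python
def csa(x, y, z):
    """3:2 carry-save adder on non-negative integers.

    Returns (sum_word, carry_word) with x + y + z == sum_word + carry_word.
    """
    sum_word = x ^ y ^ z
    carry_word = ((x & y) | (x & z) | (y & z)) << 1
    return sum_word, carry_word


def multiply_carry_save(a, x, k):
    """Multiply unsigned a by the k-bit unsigned multiplier x.

    The cumulative partial product stays in redundant (sum, carry) form
    through all k CSA steps; a single carry-propagate add resolves it.
    """
    s, c = 0, 0
    for i in range(k):
        # Radix-2: the multiple is either 0 or a, weighted by 2**i.
        multiple = (a << i) if (x >> i) & 1 else 0
        s, c = csa(s, c, multiple)
    return s + c  # the only carry-propagate addition


print(multiply_carry_save(13, 11, 4))
```

No carry ripples inside the loop, which is why each cycle can be shorter than with a carry-propagate adder.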
3
CSA With Booth Recoding
Better Usage when Combined with Booth’s Recoding
–Reduces Cycles by 50%
Each Cycle Faster Due to CSA
Sign of a, 2a Incorporated Directly in Recoder/Selector Instead of Add/Subtract Signal Generation
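To illustrate why recoding halves the cycle count, here is a minimal sketch of radix-4 Booth recoding. It assumes a k-bit two's-complement multiplier with k even; the function names are hypothetical. Each pair of multiplier bits becomes one digit in {-2, -1, 0, 1, 2}, so only k/2 multiples are accumulated.

```python
def booth_radix4_digits(x, k):
    """Recode a k-bit two's-complement multiplier x (k even) into k/2
    radix-4 digits in {-2, -1, 0, 1, 2}, least significant digit first."""
    assert k % 2 == 0
    bits = [(x >> i) & 1 for i in range(k)]  # also valid for negative x
    digits = []
    prev = 0  # x_{-1} is taken as 0
    for j in range(0, k, 2):
        # digit = x_{j-1} + x_j - 2 * x_{j+1}
        digits.append(prev + bits[j] - 2 * bits[j + 1])
        prev = bits[j + 1]
    return digits


def booth_multiply(a, x, k):
    """Accumulate digit * a * 4**j over the k/2 recoded digits."""
    return sum(d * (a << (2 * j))
               for j, d in enumerate(booth_radix4_digits(x, k)))


print(booth_radix4_digits(6, 4))  # [-2, 2], i.e. 6 = -2 + 2 * 4
print(booth_multiply(7, -5, 6))
```

In hardware the selection among 0, a, 2a and their negatives is done by the recoder/selector; the integer multiplication by d above only stands in for that selection.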
4
CSA Combined with Booth Recoding
5
Booth Recoder/Selector
Circuitry Shown on Following Slide
Negative Multiples -a, -2a in 2’s Complement
a, 2a Aligned at Right with Position i Must be Padded with i Zeros to Right
Bitwise Complement (when -a, -2a Needed) Converts Zeros to Ones
Followed by LSb Add of 1 Converts Back to Zeros
Causes a Carry-in of 1 into Position i
Can Ignore Positions 0 through i-1 (in Negative Multiples)
Insert Carry-in Directly (Dot)
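The shortcut for negative multiples can be checked numerically. The sketch below assumes a fixed word width and uses hypothetical function names: the first function complements the whole padded word and adds 1 at the LSb, the second leaves positions 0 through i-1 at zero and inserts the carry-in directly at position i.

```python
def negative_multiple_full(m, i, width):
    """Negate the multiple m aligned at position i the long way:
    complement every bit of the padded word, then add 1 at the LSb."""
    mask = (1 << width) - 1
    padded = (m << i) & mask           # i zeros padded to the right
    return ((~padded & mask) + 1) & mask


def negative_multiple_shortcut(m, i, width):
    """Same result, ignoring positions 0 .. i-1: complement only the bits
    from position i upward and insert a carry-in of 1 at position i."""
    mask = (1 << width) - 1
    upper = (~m << i) & mask           # low i bits remain zero
    carry_in = 1 << i
    return (upper + carry_in) & mask


# Example: -(5 << 2) in a 12-bit word, computed both ways.
print(bin(negative_multiple_full(5, 2, 12)))
print(bin(negative_multiple_shortcut(5, 2, 12)))
```

Both functions return the same word because the added 1 ripples through the i complemented low-order ones, restoring them to zeros and arriving at position i as a carry.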
6
Booth Recoder – Selector Circuit
7
Radix-4 with CSA – No Booth
8
Radices > 4
Radix-8 (3 Bits at a Time, k/3 Multiples) Requires 3-Level CSA Tree
–Might as Well Use Radix-16 (4 Bits at a Time)
–Still 3-Level Tree with One More CSA
MUXes Can Be Replaced with Booth Recoder/Selector Circuits in Higher Radix Multipliers
Can Continue to Increase Radix (Radix-256, 8 Bits at a Time) Leading to Wider Trees
Tradeoff is Speed Versus Area
9
Radix-16 Multiplication
10
Classification of Multipliers
11
Twin-Beat Mult. with Radix-8 Booth Recoding
12
Full Tree Multipliers
All k PPs Produced Simultaneously
Input to k-input Multioperand Tree
Multiples of a (Binary, High-Radix or Recoded) Formed at Top of Tree
Multiple-Forming Circuits
–AND Gates (binary multiplier)
–radix-4 Booth (recoded multiplier)
Tree Results in Product in Redundant Form (2 Values – Carry-Save for Example)
Final Product Formed With Converter (Fast CPA for Example)
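A behavioral sketch of this structure, assuming unsigned operands and AND-gate multiple forming: all k partial products are formed at once, levels of 3:2 CSAs reduce them to two values, and a final addition plays the role of the fast CPA. The function names and the simple left-to-right grouping into triples are illustrative choices, not the slide's specific tree.

```python
def csa(x, y, z):
    """3:2 carry-save adder on non-negative integers: x + y + z == s + c."""
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c


def tree_multiply(a, x, k):
    """Return (product, csa_levels) for unsigned a and k-bit unsigned x."""
    # Top of the tree: all k multiples formed simultaneously (AND gating).
    values = [(a << i) if (x >> i) & 1 else 0 for i in range(k)]
    levels = 0
    while len(values) > 2:
        reduced = []
        # Each full group of three operands feeds one CSA in this level.
        for j in range(0, len(values) - 2, 3):
            reduced.extend(csa(values[j], values[j + 1], values[j + 2]))
        # Leftover operands (0, 1 or 2 of them) pass through unchanged.
        reduced.extend(values[len(values) - len(values) % 3:])
        values = reduced
        levels += 1
    # Redundant-to-binary conversion: one carry-propagate addition.
    return sum(values), levels


print(tree_multiply(13, 11, 4))     # (143, 2)
print(tree_multiply(200, 123, 8))   # (24600, 4)
```

The level count grows logarithmically with k (2 levels for 4 operands, 4 levels for 8), in contrast to the k steps of a sequential design.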
13
General Parallel Multiplier
14
Tree Type Multiplier Classification
Distinguished by Design of:
1. Partial Product Forming Circuits (e.g., Booth, Hi-Rad, etc.)
2. Reduction Tree Type
3. Redundant-to-Binary Converter
If Redundant Result in Carry-Save Form, Converter is Just a CPA
Could Use Other Redundant Adders Such as Signed Binary (4:2 Compressors)
High Radix Multipliers Lead to Fewer Values to Accumulate
–Sequential Design – Fewer Cycles
–Parallel Design – Smaller Tree
–Tradeoff – Tree Complexity Versus Multiple Forming Circuit
15
Wallace and Dadda Tree Multipliers
Wallace – Combine Partial Products as Soon as Possible
Dadda – Maintain Critical Path Length (Tree Depth) but Combine as Late as Possible
Wallace – Fastest Possible Design Since Typically Smaller CPA at End
Dadda – Simpler Tree but Wider CPA at End
16
4 × 4 Example
16 AND Gates Used to Form x_i a_j Terms (dots)
Dots per Column: 1 2 3 4 3 2 1
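The dot counts on this slide can be reproduced with a short sketch. It assumes unsigned k-bit operands and places each AND term x_i a_j in column i + j; the function name is hypothetical.

```python
def partial_product_columns(a, x, k):
    """Group the k*k AND terms x_i & a_j by column weight i + j.

    columns[c] holds the bits (dots) of weight 2**c, LSB column first.
    """
    columns = [[] for _ in range(2 * k - 1)]
    for i in range(k):
        for j in range(k):
            columns[i + j].append(((x >> i) & 1) & ((a >> j) & 1))
    return columns


cols = partial_product_columns(13, 11, 4)
heights = [len(col) for col in cols]
print(heights[::-1])   # MSB column first: [1, 2, 3, 4, 3, 2, 1]
print(sum(heights))    # 16 AND gates
```

Summing every dot at its column weight gives back the product, which is what the reduction tree and final CPA compute.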
17
Wallace Example
Dots per Column: 1 2 3 4 3 2 1
5 FAs, 3 HAs, 4-bit CPA
18
Dadda Examples
Dots per Column: 1 2 3 4 3 2 1
3 FAs, 3 HAs, 6-bit CPA
Dots per Column: 1 2 3 4 3 2 1
4 FAs, 2 HAs, 6-bit CPA
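The first Dadda count above (3 FAs, 3 HAs, 6-bit CPA) can be reproduced by a small counting sketch. It assumes the usual Dadda stage limits 2, 3, 4, 6, 9, ... and a column-by-column rule that uses a half adder only when a column exceeds the limit by exactly one; the function names are hypothetical, and other valid Dadda-style reductions (such as the second example above) distribute the counters differently.

```python
def dadda_targets(max_height):
    """Stage height limits 2, 3, 4, 6, 9, ... below max_height, largest first."""
    targets = [2]
    while targets[-1] * 3 // 2 < max_height:
        targets.append(targets[-1] * 3 // 2)
    return targets[::-1]


def dadda_reduce(heights):
    """Count counters for a dot diagram given as column heights, LSB first.

    Returns (full_adders, half_adders, cpa_width).
    """
    heights = list(heights)
    fa = ha = 0
    for limit in dadda_targets(max(heights)):
        carry_in = 0
        for col in range(len(heights)):
            h = heights[col] + carry_in   # carries from this stage count too
            carry_out = 0
            while h > limit:
                if h == limit + 1:
                    ha += 1               # half adder: 2 dots in, 1 stays
                    h -= 1
                else:
                    fa += 1               # full adder: 3 dots in, 1 stays
                    h -= 2
                carry_out += 1
            heights[col] = h
            carry_in = carry_out
        if carry_in:
            heights.append(carry_in)
    # The final CPA spans the columns that still hold two dots.
    cpa_width = sum(1 for h in heights if h == 2)
    return fa, ha, cpa_width


print(dadda_reduce([1, 2, 3, 4, 3, 2, 1]))  # (3, 3, 6)
```

Combining as late as possible keeps the counter count low, at the cost of the wider CPA noted on the earlier slide.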
19
Trees in Numeric Representation
Many Times Hybrid Approach Used to Find Smallest Width CPA
MS Thesis Topic – Optimize Tree With Different Counter Types
20
Implementation Issues
Logarithmic Depth Tree – Irregular Structure
Design/Layout Difficult
Various Length Signal Propagation Paths
Hazards and Signal Skew
Need Iterated Recursive Structures
Automatic Synthesis and Layout
Motivates Search for Alternative Reduction Tree Structures
21
Other Tree Architectures
Can Compose from Larger Counters, e.g. (7:2)
–Use “0” Inputs for Some
–Or Prune the Tree for Some
Use “slices” – Example is (11:2) – Next Slide
–Can be Laid Out to Occupy Narrow Vertical Slice and Replicated
–All Carries Produced in Level i Enter Level i+1
–Balanced Delay Tree Results
3 Columns – 1, 3, 5 FAs
Can Expand from 11 to 18 – Append Col. of 7
22
(11:2) Tree Slice
23
Other Tree Blocks
Converter Stage is Fast CPA
Can Also Use SBD
With SBD the Converter Stage is a Fast Subtractor
24
Array Multipliers
Can Eliminate Top CSA With 0 Input
Can Replace 0 With y to Compute ax+y
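A behavioral sketch of the ax+y variant, assuming unsigned operands and k of at least 2: the top CSA row takes y in place of the constant 0, each later row absorbs one more partial product, and a final addition stands in for the CPA. The names and the returned row count are illustrative assumptions, not read off the slide's figure.

```python
def csa(x, y, z):
    """3:2 carry-save adder on non-negative integers: x + y + z == s + c."""
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c


def array_multiply_add(a, x, y, k):
    """Return (a * x + y, csa_rows) using a one-sided chain of CSA rows."""
    assert k >= 2
    pps = [(a << i) if (x >> i) & 1 else 0 for i in range(k)]
    # Top row: the otherwise unused third input carries y instead of 0.
    s, c = csa(y, pps[0], pps[1])
    rows = 1
    # Each further row adds one partial product to the running (s, c) pair.
    for pp in pps[2:]:
        s, c = csa(s, c, pp)
        rows += 1
    return s + c, rows   # final carry-propagate addition


print(array_multiply_add(13, 11, 5, 4))  # (148, 3)
```

With y set to 0 the top row adds only two nonzero operands, which is why that row can be eliminated in a plain multiplier.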
25
Array Multipliers
Tree is One-Sided
Longest Delay is 4 CSA Plus k-bit CPA
Slower than Wallace/Dadda Tree
Regular Structure
–short wires in horiz., vert., diag. positions
–simple, efficient layout
–easily pipelined (latches after each CSA row)
26
Methods for Reducing Array Size
27
Reducing Array Size (cont.)
28
5 by 5 Array Multiplier (unsigned)
29
Signed Array Multiplier
Array with 2’s Complement
Alternative is Pezaris Array with Different Cell Types
Need Array of AND Gates for Multiple Generation
Critical Path is Main Diagonal then Ripple Thru CPA
Can Skip “h” Cells Along Main Diag
–lower right cell now has 4 inputs
–move to “extra” input in second cell in diag.
–less regular layout now but faster
30
5 by 5 Array Multiplier (signed)
31
5 by 5 Array Multiplier
AND Gates Embedded inside FA Blocks
32
Pipelined Partial Tree Multiplier
33
Pipelined Array Multiplier