Download presentation
Presentation is loading. Please wait.
Published byCamilla Tyler Modified over 9 years ago
1
Caltech CS184 Winter2003 -- DeHon 1 CS184a: Computer Architecture (Structure and Organization) Day 10: January 31, 2003 Compute 2:
2
Caltech CS184 Winter2003 -- DeHon 2 Last Time LUTs –area –structure –big LUTs vs. small LUTs with interconnect –design space –optimization
3
Caltech CS184 Winter2003 -- DeHon 3 Today Cascades ALUs PLAs
4
Caltech CS184 Winter2003 -- DeHon 4 Last Time Larger LUTs –Less interconnect delay General: Larger compute blocks –Minimize interconnect Large LUTs –Not efficient for typical logic structure
5
Caltech CS184 Winter2003 -- DeHon 5 Different Structure How can we have “larger” compute nodes (less general interconnect) without paying huge area penalty of large LUTs?
6
Caltech CS184 Winter2003 -- DeHon 6 Structure in subgraphs Small LUTs capture structure What structure does a small-LUT-mapped netlist have?
7
Caltech CS184 Winter2003 -- DeHon 7 Structure LUT sequences ubiquitous
8
Caltech CS184 Winter2003 -- DeHon 8 Hardwired Logic Blocks Single Output
9
Caltech CS184 Winter2003 -- DeHon 9 Hardwired Logic Blocks Two outputs
10
Caltech CS184 Winter2003 -- DeHon 10 Delay Model Tcascade =T(3LUT) + T(mux)
11
Caltech CS184 Winter2003 -- DeHon 11 Options
12
Caltech CS184 Winter2003 -- DeHon 12 Chung & Rose Study [Chung & Rose, DAC ’92]
13
Caltech CS184 Winter2003 -- DeHon 13 Cascade LUT Mappings [Chung & Rose, DAC ’92]
14
Caltech CS184 Winter2003 -- DeHon 14 ALU vs. Cascaded LUT?
15
Caltech CS184 Winter2003 -- DeHon 15 4-LUT Cascade ALU
16
Caltech CS184 Winter2003 -- DeHon 16 ALU vs. LUT ? Compare/contrast ALU –Only subset of ops available –Denser coding for those ops –Smaller –…but interconnect dominates
17
Caltech CS184 Winter2003 -- DeHon 17 Parallel Prefix LUT Cascade? Can compute LUT cascade in O(log(N)) time? Can compute mux cascade using parallel prefix? Can make mux cascade associative?
18
Caltech CS184 Winter2003 -- DeHon 18 Parallel Prefix Mux cascade How can mux transform S mux-out? –A=0, B=0 mux-out=0 –A=1, B=1 mux-out=1 –A=0, B=1 mux-out=S –A=1, B=0 mux-out=/S
19
Caltech CS184 Winter2003 -- DeHon 19 Parallel Prefix Mux cascade How can mux transform S mux-out? –A=0, B=0 mux-out=0 Stop= S –A=1, B=1 mux-out=1 Generate= G –A=0, B=1 mux-out=S Buffer = B –A=1, B=0 mux-out=/S Invert = I
20
Caltech CS184 Winter2003 -- DeHon 20 Parallel Prefix Mux cascade How can 2 muxes transform input? Can I compute transform from 1 mux transforms?
21
Caltech CS184 Winter2003 -- DeHon 21 Two-mux transforms SS S SG G SB S SI G GS S GG G GB G GI S BS S BG G BB B BI I IS S IG G IB I II B
22
Caltech CS184 Winter2003 -- DeHon 22 Generalizing mux-cascade How can N muxes transform the input? Is mux transform composition associative?
23
Caltech CS184 Winter2003 -- DeHon 23 Parallel Prefix Mux-cascade
24
Caltech CS184 Winter2003 -- DeHon 24 Commercial Devices
25
Caltech CS184 Winter2003 -- DeHon 25 Xilinx XC4000 CLB
26
Caltech CS184 Winter2003 -- DeHon 26 Xilinx Virtex-II
27
Caltech CS184 Winter2003 -- DeHon 27
28
Caltech CS184 Winter2003 -- DeHon 28
29
Caltech CS184 Winter2003 -- DeHon 29
30
Caltech CS184 Winter2003 -- DeHon 30 Altera Stratix
31
Caltech CS184 Winter2003 -- DeHon 31
32
Caltech CS184 Winter2003 -- DeHon 32
33
Caltech CS184 Winter2003 -- DeHon 33 Programmable Array Logic (PLAs)
34
Caltech CS184 Winter2003 -- DeHon 34 PLA Directly implement flat (two-level) logic –O=a*b*c*d + !a*b*!d + b*!c*d Exploit substrate properties allow wired-OR
35
Caltech CS184 Winter2003 -- DeHon 35 Wired-or Connect series of inputs to wire Any of the inputs can drive the wire high
36
Caltech CS184 Winter2003 -- DeHon 36 Wired-or Implementation with Transistors
37
Caltech CS184 Winter2003 -- DeHon 37 Programmable Wired-or Use some memory function to programmable connect (disconnect) wires to OR Fuse:
38
Caltech CS184 Winter2003 -- DeHon 38 Programmable Wired-or Gate-memory model
39
Caltech CS184 Winter2003 -- DeHon 39 Diagram Wired-or
40
Caltech CS184 Winter2003 -- DeHon 40 Wired-or array Build into array –Compute many different or functions from set of inputs
41
Caltech CS184 Winter2003 -- DeHon 41 Combined or-arrays to PLA Combine two or (nor) arrays to produce PLA (and-or array) Programmable Logic Array
42
Caltech CS184 Winter2003 -- DeHon 42 PLA Can implement each and on single line in first array Can implement each or on single line in second array
43
Caltech CS184 Winter2003 -- DeHon 43 PLA Efficiency questions: –Each and/or is linear in total number of potential inputs (not actual) –How many product terms between arrays?
44
Caltech CS184 Winter2003 -- DeHon 44 PLAs Fast Implementations for large ANDs or Ors Number of P-terms can be exponential in number of input bits –most complicated functions –not exponential for many functions Can use arrays of small PLAs –to exploit structure –like we saw arrays of small memories last time
45
Caltech CS184 Winter2003 -- DeHon 45 PLA Product Terms Can be exponential in number of inputs E.g. n-input xor (parity function) –When flatten to two-level logic, requires exponential product terms –a*!b+!a*b –a*!b*!c+!a*b*!c+!a*!b*c+a*b*c …and shows up in important functions –Like addition…
46
Caltech CS184 Winter2003 -- DeHon 46 PLAs vs. LUTs? Look at Inputs, Outputs, P-Terms –minimum area (one study, see paper) –K=10, N=12, M=3 A(PLA 10,12,3) comparable to 4-LUT? –80-130%? –300% on ECC (structure LUT can exploit) Delay? –Claim 40% fewer logic levels (4-LUT) (general interconnect crossings) [Kouloheris & El Gamal/CICC’92]
47
Caltech CS184 Winter2003 -- DeHon 47 PLA
48
Caltech CS184 Winter2003 -- DeHon 48 PLA and Memory
49
Caltech CS184 Winter2003 -- DeHon 49 PLA and PAL PAL = Programmable Array Logic
50
Caltech CS184 Winter2003 -- DeHon 50 Conventional/Commercial FPGA Altera 9K (from databook)
51
Caltech CS184 Winter2003 -- DeHon 51 Conventional/Commercial FPGA Altera 9K (from databook)
52
Caltech CS184 Winter2003 -- DeHon 52 Big Ideas [MSB Ideas] Programmable Interconnect allows us to exploit that structure –want to match to application structure –Prog. interconnect delay expensive Hardwired Cascades –key technique to reducing delay in programmables PLAs –canonical two level structure –hardwire portions to get Memories, PALs
53
Caltech CS184 Winter2003 -- DeHon 53 Big Ideas [MSB-1 Ideas] Better structure match with hardwired LUT cascades
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.