CBSSS 2002: DeHon Costs André DeHon Wednesday, June 19, 2002.


CBSSS 2002: DeHon Key Points Every feature in our computing devices has a cost –Is something physical –Takes up space, has delay, consumes energy Cost structure varies with technology Optimal allocation/organization varies with cost structure

CBSSS 2002: DeHon Costs

CBSSS 2002: DeHon Physical Entities Idea: Computations take up space –Bigger/smaller computations –How fit into limited space? –Size → resources → cost –Size → distance → delay

CBSSS 2002: DeHon Comment Experience from VLSI –Primarily 2D substrate Will want to generalize as appropriate for other substrate –Use concretes from VLSI

CBSSS 2002: DeHon Area Components Gates -- compute Memory Cells -- state Wires -- interconnect

CBSSS 2002: DeHon Typical VLSI Wires: the normalizer, pitch = 1 unit. 2-input gate: maybe 4 x 5 units. Memory cell: maybe 4 x 3 units.

CBSSS 2002: DeHon Structure Area Example: nor2-crossbar architecture –Crosspoint: about 2x memory cell 5x5 units

CBSSS 2002: DeHon nor2-crossbar N tall –Two crosspoints per NOR gate –Height/gate ~ 10 N wide –Width/xpoint ~ 5 Area = 50 x N^2

CBSSS 2002: DeHon Structure Area Example 2: nor2-processors

CBSSS 2002: DeHon Components Gate: 1 Data Memory: –2N memory cells –(underestimate) Instruction Memory: –3 log2(N) x N memory cells Counter: –log2(N) bits x 5 gates/bit

CBSSS 2002: DeHon nor2-processors Area: = 12 (2N + 3 log2(N) N) + 20 (5 log2(N)) = 100 log2(N) + 24 N + 36 N log2(N)

CBSSS 2002: DeHon Area Compare
N          crossbar    processor
10         5,000       [missing]
100        500,000     30,000
1,000      50M         380,000
10,000     5G          15M
(the processor does N x fewer calculations at a time)
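
The two area formulas from the preceding slides can be evaluated directly. The following is a minimal sketch, not the author's own calculation: the function names are made up for illustration, and the unit areas (20 for a gate, 12 for a memory cell) are taken from the "maybe 4 x 5" and "maybe 4 x 3" estimates on the Typical VLSI slide.

```python
import math

GATE_AREA = 4 * 5      # 2-input gate: "maybe 4 x 5 units"
MEMCELL_AREA = 4 * 3   # memory cell: "maybe 4 x 3 units"


def crossbar_area(n):
    """nor2-crossbar: about 10 units tall per gate, 5 units wide per crosspoint."""
    return (10 * n) * (5 * n)


def processor_area(n):
    """nor2-processor: data memory, instruction memory and counter.

    The single NOR gate itself is left out, as in the slide's formula.
    """
    bits = math.log2(n)
    memory_cells = 2 * n + 3 * bits * n   # data + instruction memory
    counter_gates = 5 * bits              # 5 gates per counter bit
    return MEMCELL_AREA * memory_cells + GATE_AREA * counter_gates


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(f"N={n:>6}  crossbar={crossbar_area(n):>13,}  "
              f"processor={processor_area(n):>13,.0f}")
```

With these formulas the processor comes out at roughly 27,000 units for N = 100 and roughly 384,000 for N = 1,000, in line with the rounded slide values. For N = 10,000 the formula as transcribed gives about 5.0M rather than the 15M on the slide, so the slide figure may include costs that the transcribed formula does not capture.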

CBSSS 2002: DeHon Area Comments When need to fit in limited area –Processor (temporal) version beneficial –Why processors preferred in early VLSI (pre-VLSI) Physical space limited Problems large In VLSI –State/description smaller than active Largely because of compact memory

CBSSS 2002: DeHon Area Comments Can do better than crossbar for interconnect –…next time

CBSSS 2002: DeHon Key Costs In VLSI: –Area, delay, energy Often, not simultaneously optimized –Give rise to tradeoffs Previous is crude example of area-delay

CBSSS 2002: DeHon Costs Vary

CBSSS 2002: DeHon VLSI World Technology largely defined by precision in fabrication –Minimum feature size –A physical limit On our ability to build and transfer patterning Do so precisely

CBSSS 2002: DeHon Feature Size λ is half the minimum feature size in a VLSI process [minimum feature usually channel width]

CBSSS 2002: DeHon Predictable Variation Feature Sizes have been shrinking –As we get control over physical dimensions Feature Size shrink –Changes size limits –Shifts costs

CBSSS 2002: DeHon Scaling Channel Length (L) λ; Channel Width (W) λ; Oxide Thickness (T_ox) λ; Doping (N_a) 1/λ; Voltage (V) λ

CBSSS 2002: DeHon Area Perspective [2000 tech.] 18 mm x 18 mm, 0.18 µm, 60G λ^2

CBSSS 2002: DeHon Capacity Growth Things which were not feasible 5 to 10 years ago –Very feasible now Designs which had to be done one way (e.g. temporal)… –now have many new options

CBSSS 2002: DeHon Effects of Ideal Scaling? Area 1/κ^2; Capacitance 1/κ; Resistance κ; Threshold (V_th) 1/κ; Current (I_d) 1/κ; Gate Delay (τ_gd) 1/κ; Wire Delay (τ_wire) 1; Power 1/κ^2 → 1/κ^3. Delay shifts from gates to wires –Distance becomes a bigger factor in delay than gates
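
The entries on this slide can be re-derived from first-order device relations. The sketch below assumes that every length and the supply voltage shrink by a common factor, called kappa here (an assumed name), and uses textbook approximations: parallel-plate capacitance, wire resistance proportional to length over cross-section, and drain current proportional to (W/L)(1/T_ox)V^2. Reading the two power entries as "clock sped up with the gates" versus "clock left unchanged" is also an assumption.

```python
def ideal_scaling(kappa):
    """Relative change of each quantity when lengths and voltage shrink by 1/kappa."""
    length = 1.0 / kappa           # L, W, T_ox and wire dimensions
    voltage = 1.0 / kappa          # supply and threshold voltage
    area = length * length
    capacitance = area / length    # C ~ plate area / oxide thickness
    resistance = length / area     # wire R ~ length / cross-section
    # I_d ~ (W/L) * (1/T_ox) * V^2, and W/L does not change
    current = (1.0 / length) * voltage ** 2
    gate_delay = capacitance * voltage / current
    wire_delay = resistance * capacitance
    return {
        "area": area,
        "capacitance": capacitance,
        "resistance": resistance,
        "threshold": voltage,
        "current": current,
        "gate_delay": gate_delay,
        "wire_delay": wire_delay,
        # P ~ C * V^2 * f, with f rising as 1/gate_delay or held fixed
        "power_scaled_clock": capacitance * voltage ** 2 / gate_delay,
        "power_fixed_clock": capacitance * voltage ** 2,
    }


if __name__ == "__main__":
    for name, factor in ideal_scaling(2.0).items():
        print(f"{name:>20}: x{factor:g}")
```

The wire delay factor stays at 1 for any kappa while the gate delay factor falls, which is the shift of delay from gates to wires that the slide points out.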

CBSSS 2002: DeHon VLSI Scaling Forward Can’t scale forward forever Depend on bulk effects, large numbers of atoms –…but approaching atomic scale Conventional VLSI feeling this pain Andrew Kahng will share the industry roadmap with us tonight

CBSSS 2002: DeHon Beyond VLSI Even w/in VLSI Scaling –Changing costs affect our designs Effect more pronounced moving between substrates –Memory not compact? –Memory and switches in 1x1 wire pitches? –Unit resistance wires? –Three dimensional wiring? –Three dimensional active device layout?

CBSSS 2002: DeHon Beyond Silicon Don’t know what the key costs and limits are –Unique/identifiable proteins or match addresses? –Length of binding domains? –Number of qubits? But, understanding them –Will be key to understanding how to engineer efficient structures

CBSSS 2002: DeHon Cost Optimization Example LUT Size

CBSSS 2002: DeHon From Last Time Could build a large Lookup-Table –But grows exponentially in inputs Could interconnect a collection of programmable gates –How much does interconnect cost? How complex (big) should the gates be?

CBSSS 2002: DeHon LUTs with Interconnect Alternative to one big LUT

CBSSS 2002: DeHon Question Restated How large a LUT should we use as the basic building block in a set of programmably interconnected gates?

CBSSS 2002: DeHon Qualitative Effects Larger LUTs –Reduce the number needed –Capture local interconnect, maybe cheaper than paying interconnect between them –Are less and less efficient for certain functions E.g. xor and addition mentioned last time

CBSSS 2002: DeHon Qualitative Effects Smaller LUTs: –Pay large interconnect overhead –Overhead per gate less than exponential –Some functions take small numbers of gates –…but other functions still require exponential gates (net loss)

CBSSS 2002: DeHon Memories and 4-LUTs For the most complex functions an M-LUT has ~2^(M-4) 4-LUTs. SRAM 32Kx8, λ = 0.6 µm –170M λ^2 (21 ns latency) –8 x 2^11 = 16K 4-LUTs. XC3042, λ = 0.6 µm –180M λ^2 (13 ns delay per CLB) –288 4-LUTs. Memory is 50+x denser than FPGA –…and faster

CBSSS 2002: DeHon Memory and 4-LUTs For “regular” functions? 15-bit parity –entire 32Kx8 SRAM –5 4-LUTs (2% of XC3042 ~ 3.2M λ^2 ~ 1/50th Memory) 7b Add –entire 32Kx8 SRAM –14 4-LUTs (5% of XC3042, 8.8M λ^2 ~ 1/20th Memory)
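
The LUT counts on these two slides can be reproduced with a few lines of arithmetic. This is an illustrative sketch with made-up function names. The XOR-tree construction for parity and the "one sum LUT plus one carry LUT per bit" adder are assumptions about how the counts arise; they happen to match the 5 and 14 quoted above, but the slides do not spell out the mappings.

```python
import math

XC3042_4LUTS = 288  # 4-LUT count quoted for the XC3042


def worst_case_4luts(m):
    """About 2**(m-4) 4-LUTs for the most complex m-input function."""
    return 2 ** (m - 4)


def parity_luts(n_inputs, k=4):
    """K-LUTs in an XOR tree: each LUT replaces k signals with one."""
    return math.ceil((n_inputs - 1) / (k - 1))


def ripple_add_4luts(bits):
    """Assumed mapping: one sum LUT and one carry LUT per bit position."""
    return 2 * bits


if __name__ == "__main__":
    sram_equiv = 8 * worst_case_4luts(15)   # 32Kx8: eight 15-input tables
    print("32Kx8 SRAM as 4-LUTs:", sram_equiv)
    print("density ratio vs XC3042:", sram_equiv / XC3042_4LUTS)
    print("15-bit parity:", parity_luts(15), "4-LUTs")
    print("7-bit add:", ripple_add_4luts(7), "4-LUTs")
```

The contrast is the point of the pair of slides: a memory wins when the function is arbitrary, while a handful of interconnected small LUTs wins when the function has structure.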

CBSSS 2002: DeHon Empirical Approach Look at trends across benchmark set of “typical” designs –Partially a question about typical regularity –Much of computer “architecture” is about understanding the structure of problems Use algorithm for covering with small LUTs How many need? How much area do they take up with interconnect?

CBSSS 2002: DeHon Toronto Experiments Pick benchmark set Map to K-LUTs –Vary K Route the K-LUTs Develop area/cost model Compute net area –Minimum? [Rose et al. JSSC v25n5p1217]

CBSSS 2002: DeHon LUT Count vs. base LUT size

CBSSS 2002: DeHon LUT vs. K DES MCNC Benchmark –moderately irregular

CBSSS 2002: DeHon Toronto FPGA Model Connect FPGAs In Mesh (hopefully, less than crossbar)

CBSSS 2002: DeHon Toronto LUT Size Map to K-LUT –use Chortle Route to determine wiring tracks –global route –different channel width W for each benchmark Area Model for K and W

CBSSS 2002: DeHon LUT Area K-LUT: c + memcell x 2^K Switches: linear in W –E.g. Area = 12 x W x switches –How does W grow with N? (for next time) Interconnect in fixed layers: –W^2 x pitch^2 –(but assume switches dominate)

CBSSS 2002: DeHon LUT Area vs. K Routing Area roughly linear in K

CBSSS 2002: DeHon Mapped LUT Area Compose Mapped LUTs and Area Model

CBSSS 2002: DeHon Mapped Area vs. LUT K N.B. unusual case minimum area at K=3

CBSSS 2002: DeHon Toronto Result Minimum LUT Area –at K=4 –Important to note minimum on previous slides based on particular cost model –robust for range of switch sizes

CBSSS 2002: DeHon Implications For this cost model, –Efficient to interconnect small LUTs –Even though it may mean most of the area in wiring Need wiring to exploit structure of problems

CBSSS 2002: DeHon General Result This kind of result typical –Understand competing factors Cost (area per K-LUT) Utility (unit reduction w/ K-LUT) –Understand variations –Find minimum for cost and variation model
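
The procedure on this slide (cost per K-LUT, times the number of K-LUTs needed, minimized over K) can be sketched as follows. Everything numeric in the example is hypothetical: the LUT counts are invented placeholders, not the Toronto measurements, and the routing term is simply modeled as linear in K, following the "routing area roughly linear in K" observation. Only the memory-cell area of 12 units comes from the earlier slides.

```python
def lut_area(k, c, memcell):
    """Area of one K-LUT: fixed overhead plus 2**K memory cells."""
    return c + memcell * 2 ** k


def routing_area(k, base, per_input):
    """Assumed per-LUT routing area, linear in K."""
    return base + per_input * k


def total_area(k, lut_count, *, c, memcell, base, per_input):
    """Mapped area: number of K-LUTs times (LUT + routing) area each."""
    per_lut = lut_area(k, c, memcell) + routing_area(k, base, per_input)
    return lut_count * per_lut


def best_k(lut_counts, **model):
    """Return the K with the smallest total mapped area."""
    return min(lut_counts,
               key=lambda k: total_area(k, lut_counts[k], **model))


if __name__ == "__main__":
    # HYPOTHETICAL counts of K-LUTs needed for one design (not measured data).
    counts = {2: 1000, 3: 640, 4: 480, 5: 410, 6: 370, 7: 345}
    # HYPOTHETICAL model constants, apart from memcell = 4 x 3 units.
    model = dict(c=50, memcell=12, base=500, per_input=300)
    for k in sorted(counts):
        print(k, total_area(k, counts[k], **model))
    print("minimum-area K under this toy model:", best_k(counts, **model))
```

Changing the constants moves the minimum, which is why the slide asks for an understanding of the variations as well as the competing factors.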

CBSSS 2002: DeHon Wrapup

CBSSS 2002: DeHon Key Points Every feature in our computing devices has a cost –Is something physical –Takes up space, has delay, consumes energy Cost structure varies with technology Optimal allocation/organization varies with cost structure

CBSSS 2002: DeHon Coming Attractions Change and limits in VLSI –Andrew Kahng, this afternoon (4:30pm) Interconnect requirements and optimization –Tomorrow No 10:30am lecture today