Penn ESE534 Spring2012 -- DeHon ESE534: Computer Organization Day 14: March 12, 2012 Empirical Comparisons.

Slides:



Advertisements
Similar presentations
Commercial FPGAs: Altera Stratix Family Dr. Philip Brisk Department of Computer Science and Engineering University of California, Riverside CS 223.
Advertisements

Penn ESE534 Spring DeHon 1 ESE534: Computer Organization Day 14: March 19, 2014 Compute 2: Cascades, ALUs, PLAs.
Lecture 9: Coarse Grained FPGA Architecture October 6, 2004 ECE 697F Reconfigurable Computing Lecture 9 Coarse Grained FPGA Architecture.
Programmable Logic Devices
EECE579: Digital Design Flows
CS294-6 Reconfigurable Computing Day 5 September 8, 1998 Comparing Computing Devices.
Penn ESE Spring DeHon 1 ESE (ESE534): Computer Organization Day 21: April 2, 2007 Time Multiplexing.
CS294-6 Reconfigurable Computing Day 6 September 10, 1998 Comparing Computing Devices.
Penn ESE Spring DeHon 1 ESE (ESE534): Computer Organization Day 15: March 12, 2007 Interconnect 3: Richness.
Caltech CS184a Fall DeHon1 CS184a: Computer Architecture (Structures and Organization) Day10: October 25, 2000 Computing Elements 2: Cascades, ALUs,
Caltech CS184a Fall DeHon1 CS184a: Computer Architecture (Structures and Organization) Day8: October 18, 2000 Computing Elements 1: LUTs.
Penn ESE Spring DeHon 1 ESE (ESE534): Computer Organization Day 9: February 7, 2007 Instruction Space Modeling.
Penn ESE Spring DeHon 1 ESE (ESE534): Computer Organization Day 13: February 26, 2007 Interconnect 1: Requirements.
Penn ESE Spring DeHon 1 ESE (ESE534): Computer Organization Day 26: April 18, 2007 Et Cetera…
Penn ESE Spring DeHon 1 ESE (ESE534): Computer Organization Day 11: February 14, 2007 Compute 1: LUTs.
CS294-6 Reconfigurable Computing Day 2 August 27, 1998 FPGA Introduction.
Penn ESE Spring DeHon 1 ESE (ESE534): Computer Organization Day 10: February 12, 2007 Empirical Comparisons.
CS294-6 Reconfigurable Computing Day 14 October 7/8, 1998 Computing with Lookup Tables.
Penn ESE Spring DeHon 1 FUTURE Timing seemed good However, only student to give feedback marked confusing (2 of 5 on clarity) and too fast.
Penn ESE Spring DeHon 1 ESE (ESE534): Computer Organization Day 5: January 24, 2007 ALUs, Virtualization…
1. 2 FPGAs Historically, FPGA architectures and companies began around the same time as CPLDs FPGAs are closer to “programmable ASICs” -- large emphasis.
Penn ESE Spring DeHon 1 ESE (ESE534): Computer Organization Day 12: February 21, 2007 Compute 2: Cascades, ALUs, PLAs.
Field Programmable Gate Array (FPGA) Layout An FPGA consists of a large array of Configurable Logic Blocks (CLBs) - typically 1,000 to 8,000 CLBs per chip.
1 DIGITAL DESIGN I DR. M. MAROUF FPGAs AUTHOR J. WAKERLY.
EKT303/4 PRINCIPLES OF PRINCIPLES OF COMPUTER ARCHITECTURE (PoCA)
ESE Spring DeHon 1 ESE534: Computer Organization Day 19: April 7, 2014 Interconnect 5: Meshes.
Penn ESE534 Spring DeHon 1 ESE534: Computer Organization Day 11: March 3, 2014 Instruction Space Modeling 1.
Open Discussion of Design Flow Today’s task: Design an ASIC that will drive a TV cell phone Exercise objective: Importance of codesign.
Penn ESE534 Spring DeHon 1 ESE534: Computer Organization Day 9: February 24, 2014 Operator Sharing, Virtualization, Programmable Architectures.
CBSSS 2002: DeHon Costs André DeHon Wednesday, June 19, 2002.
J. Christiansen, CERN - EP/MIC
Penn ESE370 Fall Townley & DeHon ESE370: Circuit-Level Modeling, Design, and Optimization for Digital Systems Day 14: October 1, 2014 Layout and.
FPGA-Based System Design: Chapter 3 Copyright  2004 Prentice Hall PTR FPGA Fabric n Elements of an FPGA fabric –Logic element –Placement –Wiring –I/O.
FPGA-Based System Design: Chapter 3 Copyright  2004 Prentice Hall PTR Topics n FPGA fabric architecture concepts.
Penn ESE534 Spring DeHon 1 ESE534: Computer Organization Day 7: February 6, 2012 Memories.
Penn ESE370 Fall Townley & DeHon ESE370: Circuit-Level Modeling, Design, and Optimization for Digital Systems Day 13: October 5, 2011 Layout and.
Caltech CS184 Winter DeHon 1 CS184a: Computer Architecture (Structure and Organization) Day 7: January 24, 2003 Instruction Space.
EE3A1 Computer Hardware and Digital Design
Penn ESE534 Spring DeHon ESE534: Computer Organization Day 15: March 24, 2014 Empirical Comparisons.
Penn ESE534 Spring DeHon 1 ESE534: Computer Organization Day 5: February 1, 2010 Memories.
Caltech CS184a Fall DeHon1 CS184a: Computer Architecture (Structures and Organization) Day7: October 16, 2000 Instruction Space (computing landscape)
ESS | FPGA for Dummies | | Maurizio Donna FPGA for Dummies Basic FPGA architecture.
Caltech CS184 Winter DeHon 1 CS184a: Computer Architecture (Structure and Organization) Day 8: January 27, 2003 Empirical Cost Comparisons.
Caltech CS184 Winter DeHon 1 CS184a: Computer Architecture (Structure and Organization) Day 10: January 31, 2003 Compute 2:
Penn ESE534 Spring DeHon 1 ESE534 Computer Organization Day 19: March 28, 2012 Minimizing Energy.
Penn ESE534 Spring DeHon 1 ESE534 Computer Organization Day 9: February 13, 2012 Interconnect Introduction.
Caltech CS184 Winter DeHon 1 CS184a: Computer Architecture (Structure and Organization) Day 11: January 31, 2005 Compute 1: LUTs.
Penn ESE534 Spring DeHon 1 ESE534: Computer Organization Day 11: February 20, 2012 Instruction Space Modeling.
ESE Spring DeHon 1 ESE534: Computer Organization Day 20: April 9, 2014 Interconnect 6: Direct Drive, MoT.
FPGA-Based System Design: Chapter 3 Copyright  2004 Prentice Hall PTR Topics n FPGA fabric architecture concepts.
Penn ESE534 Spring DeHon 1 ESE534: Computer Organization Day 8: February 19, 2014 Memories.
Penn ESE534 Spring DeHon 1 ESE534: Computer Organization Day 22: April 16, 2014 Time Multiplexing.
Caltech CS184 Winter DeHon 1 CS184a: Computer Architecture (Structure and Organization) Day 10: January 28, 2005 Empirical Comparisons.
ESE532: System-on-a-Chip Architecture
ESE534: Computer Organization
ESE534: Computer Organization
سبکهاي طراحي (Design Styles)
ESE534: Computer Organization
ESE534: Computer Organization
ESE532: System-on-a-Chip Architecture
CS184a: Computer Architecture (Structure and Organization)
Electronics for Physicists
ESE534: Computer Organization
ESE534: Computer Organization
ESE534: Computer Organization
CS184a: Computer Architecture (Structure and Organization)
ESE534: Computer Organization
Electronics for Physicists
ESE534: Computer Organization
Reconfigurable Computing (EN2911X, Fall07)
Presentation transcript:

Penn ESE534 Spring DeHon ESE534: Computer Organization Day 14: March 12, 2012 Empirical Comparisons

Penn ESE534 Spring DeHon Previously Instruction Space Modeling

Before Break Programmable compute blocks –LUTs, ALUs, PLAs –…still working on PLA assignment Penn ESE534 Spring DeHon

Today What if we just built a completely custom circuit? What cost are we paying for programmability? Can we afford to build completely custom circuits? –Can we afford not to? Penn ESE534 Spring DeHon

Today Empirical Data –Custom Gate Array Std. Cell (ASIC) Full –FPGAs –Processors –Tasks

Preclass 1 How big? –2-LUT? –2-LUT w/ Flip-flop? –2-LUT w/ 4 input sources? –2-LUT w/ 1024 input sources? Penn ESE534 Spring DeHon

Empirical Comparisons

Penn ESE534 Spring DeHon Empirical Ground modeling in some concretes Start sorting out –custom vs. configurable –spatial configurable vs. temporal

Penn ESE534 Spring DeHon Full Custom Get to define all layers Use any geometry you like Only rules are process design rules ESE570

Penn ESE534 Spring DeHon Standard Cell Area invnand3AOI4invnor3Inv All cells uniform height Width of channel determined by routing Cell area

Penn ESE534 Spring DeHon Standard Cell Area invnand3AOI4invnor3Inv All cells uniform height Width of channel determined by routing Cell area Identify the full custom and standard cell regions on 386DX die

Penn ESE534 Spring DeHon Standard Cell Area invnand3AOI4invnor3Inv All cells uniform height Width of channel determined by routing Cell area What freedom have we removed? Impact?

Penn ESE534 Spring DeHon MPGA Metal Programmable Gate Array –Resurrected as “Structured ASICs” Gates pre-placed (poly, diffusion) Only get to define metal connections –Cheap (low NRE) only have to pay for metal mask(s) [Wu&Tsai/ISPD2004p103]

Structured ASIC: eASIC nm 1.4M LUTs 500MHz? Penn ESE534 Spring DeHon

Structured ASIC Maybe think about it as an FPGA with vias instead of configurable switches? Ratio of SRAM to via design? Penn ESE534 Spring DeHon

What do we expect? Comparing density/delay/energy –Full custom –Standard Cell (ASIC) –MPGA / Structured ASIC –FPGA –Processor Penn ESE534 Spring DeHon

Why it isn’t trivial? Different logic forms Interconnect Logic Balance Mix of Requirements Penn ESE534 Spring DeHon

MPGA/SA vs. Custom? AMI CICC’83 –MPGA 1.0 –Std-Cell 0.7 –Custom 0.5 AMI CICC’04 –Custom 0.6 (DSP) –Custom 0.8 (DPath) Toshiba DSP –Custom 0.3 Mosaid RAM –Custom 0.2 GE CICC’86 –MPGA 1.0 –Std-Cell FF/counter 0.7 FullAdder 0.4 RAM 0.2

Penn ESE534 Spring DeHon Metal Programmable Gate Arrays

Penn ESE534 Spring DeHon MPGAs Modern -- “Sea of Gates” yield % maybe 5k  /gate = 1.25F 2 /gate ? –(quite a bit of variance)

Penn ESE534 Spring DeHon Conventional FPGA Tile K-LUT (typical k=4) w/ optional output Flip-Flop

Penn ESE534 Spring DeHon Toronto FPGA Model

Penn ESE534 Spring DeHon FPGA Table

Penn ESE534 Spring DeHon (semi) Modern FPGAs APEX 20K1500E  52K LEs  0.18  m  24mm  22mm  1.25M 2 /LE XC2V1000  10.44mm x 9.90mm [source: Chipworks]  0.15  m  11,520 4-LUTs  1. 5M 2 /4-LUT  (~375KF 2 /4-LUT) [Both also have RAM in cited area]

Penn ESE534 Spring DeHon How many gates? (Prelcass 2)

Penn ESE534 Spring DeHon “gates” in 2-LUT

Penn ESE534 Spring DeHon Now how many?

Penn ESE534 Spring DeHon Which gives: More usable gates? More gates/unit area?

Penn ESE534 Spring DeHon Gates Required? Depth=3, Depth=2048?

Penn ESE534 Spring DeHon Gate metric for FPGAs? Day11: several components for computations –compute element –interconnect: space time –instructions Not all applications need in same balance Assigning a single “capacity” number to device is an oversimplification

Penn ESE534 Spring DeHon MPGA vs. FPGA MPGA (SOG GA) –5K 2 /gate –35-70% usable (50%) –7-17K 2 /gate net Ratio: (5) Xilinx XC4K –1.25M 2 /CLB – gates (26?) –26-73K 2 /gate net Adding ~2x Custom/MPGA, Custom/FPGA ~10x

Penn ESE534 Spring DeHon

FPGA vs. Structure ASIC Virtex 6 40nm 470K 6-LUTs Largest device eASIC 45nm 580K eCells Probably smaller die Penn ESE534 Spring DeHon

Penn ESE534 Spring DeHon FPGA vs. Std Cell 90nm FPGA: Stratix II STMicro CMOS090 –Standard Cell Full custom layout …but by tool [Kuon/Rose TRCADv26n2p ]

Penn ESE534 Spring DeHon MPGA vs. FPGA (Delay) MPGA (SOG GA)   gd ~1ns Ratio: 1--7 (2.5) –Altera claiming 2× For their Structured ASIC [2007] –LSI claiming 3× 2005 Xilinx XC4K   1-7 gates in 7ns 2-3 gates typical

Penn ESE534 Spring DeHon 90nm FPGA: Stratix II STMicro CMOS090 [Kuon/Rose TRCADv26n2p ] FPGA vs. Std. Cell Delay

Penn ESE534 Spring DeHon 90nm FPGA: Stratix II STMicro CMOS090 eASIC (MPGA) claim –20% of FPGA power –(best case) [Kuon/Rose TRCADv26n2p ] FPGA vs. Std Cell Energy

Penn ESE534 Spring DeHon Processors vs. FPGAs

Penn ESE534 Spring DeHon Processors and FPGAs

Penn ESE534 Spring DeHon Component Example XC4085XL-09 3,136 CLBs 4.6ns 682 Bit Ops/ns Alpha  64b ALUs 2.3ns 55.7 Bit Ops/ns Single die in 0.35  m [1 “bit op” = 2 gate evaluations]

Penn ESE534 Spring DeHon Processors and FPGAs

Penn ESE534 Spring DeHon Raw Density Summary Area –MPGA 2-3x Custom –FPGA 5x MPGA FPGA:std-cell custom ~ 15-30x Area-Time –Gate Array 6-10x Custom –FPGA 15-20x Gate Array FPGA:std-cell custom ~ 100x –Processor 10x FPGA

Penn ESE534 Spring DeHon Raw Density Caveats Processor/FPGA may solve more specialized problem Problems have different resource balance requirements –…can lead to low yield of raw density

Penn ESE534 Spring DeHon Task Comparisons

Penn ESE534 Spring DeHon Broadening Picture Compare larger computations For comparison –throughput density metric: results/area-time normalize out area-time point selection high throughput density  most in fixed area  least area to satisfy fixed throughput target

Penn ESE534 Spring DeHon Multiply

Preclass 4 Efficiency of 8×8 multiply on 16×16 multiplier? Penn ESE534 Spring DeHon

Multiply

Penn ESE534 Spring DeHon Example: FIR Filtering Application metric: TAPs = filter taps multiply accumulate Y i =w 1 x i +w 2 x i

Mixed Designs Modern FPGAs include hardwired multipliers (Virtex 25x18) Penn ESE534 Spring DeHon

Penn ESE534 Spring DeHon FPGA vs. Std Cell (revisit) 90nm FPGA: Stratix II STMicro CMOS090 [Kuon/Rose TRCADv26n2p ]

Penn ESE534 Spring DeHon Energy [Abnous et al, The Application of Programmable DSPs in Mobile Communications, Wiley, 2002, pp ] Pleiades includes hardwire multiply accumulator

Penn ESE534 Spring DeHon 90nm FPGA: Stratix II STMicro CMOS090 [Kuon/Rose TRCADv26n2p ] FPGA vs. Std Cell Energy (revisit)

Penn ESE534 Spring DeHon IIR/Biquad (Infinite Impulse Response) Simplest IIR: Y i =A  X i +B  Y i-1

Penn ESE534 Spring DeHon DES Keysearch

Penn ESE534 Spring DeHon DNA Sequence Match Problem: “cost” of transform S 1  S 2 Given: cost of insertion, deletion, substitution Relevance: similarity of DNA sequences –evolutionary similarity –structure predict function Typically: new sequence compared to large databse

Penn ESE534 Spring DeHon DNA Sequence Match

Penn ESE534 Spring DeHon Degrade from Peak

How do various architecture degrade from peak? FPGA? Processor? Custom? Penn ESE534 Spring DeHon

Degrade from Peak: FPGAs Long path length  not run at cycle Limited throughput requirement –bottlenecks elsewhere limit throughput req. Insufficient interconnect Insufficient retiming resources (bandwidth)

Penn ESE534 Spring DeHon Degrade from Peak: Processors Ops w/ no gate evaluations (interconnect) Ops use limited word width Stalls waiting for retimed data

Penn ESE534 Spring DeHon Degrade from Peak: Custom/MPGA Solve more general problem than required –(more gates than really need) Long path length Limited throughput requirement Not needed or applicable to a problem

Penn ESE534 Spring DeHon Degrade Notes We’ll cover these issues in more detail as we get into them later in the course

Penn ESE534 Spring DeHon Admin Grading through done during break –Check feedback HW6.3-4 due Wednesday HW7 out Wednesday Reading on web –Classic Paper

Penn ESE534 Spring DeHon Big Ideas [MSB Ideas] Raw densities: custom:ga:fpga:processor –1:5:100:1000 –close gap with specialization