Making FPGAs a Cost-Effective Computing Architecture. Tom VanCourt, Yongfeng Gu, Martin Herbordt. Boston University.

1 Making FPGAs a Cost-Effective Computing Architecture. Tom VanCourt, Yongfeng Gu, Martin Herbordt. Boston University.

2 FPGAs as Compute Engines (21 Jan 2005, FPGAs for Computing)
Proven successful: 1000x speedup in bioinformatics, computational chemistry, etc. Accelerator board fits standard PC backplane. Off-the-shelf availability, modest HW cost.
What makes them work so well: >400 memory busses, 3 Tbit/sec* total bandwidth. Massive parallelism, fast on-chip communication.
So why doesn't everybody use them?
*Xilinx XC2VP100

3 Field Programmable Gate Arrays
What is an FPGA? A bag of uncommitted computer parts. No defined function: it can be whatever you want.
Why is programming a problem? The CPU runs the application; the FPGA is the application. All of what's hard in programming, only more so.
How are we changing the model? Separating circuit design from application logic. Addressing specific application areas.

4 CPU vs. FPGA Hardware
Arithmetic & logic. CPU has: 1-10 pipelines, fixed data width. FPGA has: >400 HW multipliers, ~100K function cells.
Registers & memory. CPU has: fixed reg. array, 0-4 caches. FPGA has: ~200K reg. bits, >400 cache RAMs.
Connectivity & communication. CPU has: fixed datapath, 1 ext. data bus, 1-32 dedicated I/O. FPGA has: arbitrary data path, >1000 data I/O pins, 20 links, 3-10 Gbit.
Process technology. CPU: incremental growth, process limited. FPGA: exponential growth, process driver.

5 Programming Skills vs. FPGAs
CPU model:
Single-threading: no synchronization; for/if/switch control.
Incremental execution: one instruction at a time; results are immediate.
Common parallelization: large units of work; costly communication.
FPGA model:
Massive parallelism: visible timing relations; state machine/hardwired control.
Pipelined execution: all operations active; visible dependencies.
Parallelism model: fine grain (one ALU op); cheap on-chip communication.

6 Attempts to Date
Hardware description languages: unfamiliar control & resource models.
Graphical design entry: tedious. Example: X = 3*Y + 5*Z.
Standard programming languages: good SW structure ≠ good HW structure.
Semantic gap works both ways: good HW designers aren't application experts.

7 Requirements for a Solution
Acknowledge SW and HW skills separately:
HW expertise: system interface, memory structure, synchronization, computation arrays.
SW expertise: problem origination, data manipulation, algorithm variations, exploration.
Allow 'normal' representations to both.
Eliminate dependencies between HW and SW.
Balance generality vs. domain specifics: create multiple levels of generality.

8 Behavior as a Parameter
Reusable structure:
Standard HW reuse: prefab leaf components + custom connectivity.
Required HW reuse: prefab connectivity + custom leaves.
Go beyond VHDL parameterization:
Not just data values as parameters.
Behavior as parameter.
Like C library's qsort(data[], compare()).
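The qsort analogy on this slide can be sketched in software. This is only an illustration of "behavior as parameter" in Python, not the authors' hardware mechanism; the names sort_with and compare_by_length are hypothetical.

```python
from functools import cmp_to_key


def compare_by_length(a: str, b: str) -> int:
    """Hypothetical comparison: negative, zero, or positive, in the style of qsort."""
    return len(a) - len(b)


def sort_with(data, compare):
    # The sorting structure is fixed ("prefab connectivity");
    # only the comparison behavior is supplied by the caller ("custom leaf").
    return sorted(data, key=cmp_to_key(compare))


words = ["multiplier", "bus", "cache", "pin"]

# Same structure, two different behaviors passed in as parameters.
by_length = sort_with(words, compare_by_length)
by_reverse_alpha = sort_with(words, lambda a, b: (a < b) - (a > b))

print(by_length)
print(by_reverse_alpha)
```

The caller never touches the sorting logic itself, which is the kind of separation the slide asks for between the prefab structure and the custom leaves.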

9 Reusing Control, not Function
Example: Iterative Optimization
Logic designer provides: parameterized logic model.
End user provides:
X_0: initial candidate.
F_j(X): score next candidate solution j.
Best(S_0, S_1, …): select solution[s] based on score[s].
Fill-ins define the search algorithm: hill climbing, Gibbs sampling, simulated annealing, …
[Figure: X_0 initializes X_i; F_0(X), F_1(X), … score candidates; X_{i+1} = Best(S_0, …).]
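A minimal software sketch of this division of labor, assuming each F_j returns a (candidate, score) pair and that Best also sees the current candidate; the slide does not specify either detail. The names iterate, step, score, and hill_climb_best are hypothetical, and the loop stands in for the designer's parameterized logic model.

```python
from typing import Callable, List, Tuple

Scored = Tuple[int, float]  # (candidate, score); assumed representation


def iterate(x0: int,
            scorers: List[Callable[[int], Scored]],
            best: Callable[[int, List[Scored]], int],
            steps: int) -> int:
    """Fixed control structure: X_{i+1} = Best(F_0(X_i), F_1(X_i), ...)."""
    x = x0
    for _ in range(steps):
        scored = [f(x) for f in scorers]  # in hardware these could run in parallel
        x = best(x, scored)
    return x


# User fill-ins for hill climbing on a toy objective with its peak at x = 7.
def score(x: int) -> float:
    return -float((x - 7) ** 2)


def step(delta: int) -> Callable[[int], Scored]:
    def f(x: int) -> Scored:
        candidate = x + delta
        return candidate, score(candidate)
    return f


def hill_climb_best(current: int, scored: List[Scored]) -> int:
    candidate, s = max(scored, key=lambda pair: pair[1])
    # Move only if the best neighbor strictly improves on the current point.
    return candidate if s > score(current) else current


result = iterate(0, [step(-1), step(+1)], hill_climb_best, steps=20)
print(result)  # 7
```

Swapping hill_climb_best for a probabilistic selection rule would turn the same loop into simulated annealing, which is the sense in which the fill-ins define the search algorithm.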

10 Familiar SW Development Style
Standard design pattern*: Template Method or Strategy.
Event driven ⇒ 'inverted' flow of control:
Sequence and synchronization outside of app.
System calls app-specific logic when needed.
Widely used for GUI, web applications.
Good match to object oriented design style:
System refers to abstract application interface.
Application provides concrete logic.
*Gamma et al., 'Design Patterns'
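The inverted flow of control can be sketched as follows. This is a generic Strategy-style illustration rather than the authors' framework; the Application interface, the Doubler class, and run_system are all hypothetical names.

```python
from abc import ABC, abstractmethod
from typing import Iterable, List


class Application(ABC):
    """Abstract application interface that the system refers to (hypothetical)."""

    @abstractmethod
    def on_item(self, item: int) -> int:
        """App-specific logic, called by the system when needed."""


class Doubler(Application):
    """Concrete logic provided by the application."""

    def on_item(self, item: int) -> int:
        return 2 * item


def run_system(app: Application, items: Iterable[int]) -> List[int]:
    # Sequencing and synchronization live here, outside the application.
    results = []
    for item in items:
        results.append(app.on_item(item))
    return results


outputs = run_system(Doubler(), [1, 2, 3])
print(outputs)  # [2, 4, 6]
```

The application never decides when it runs; it only fills in the behavior behind the abstract interface.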

11 Preliminary Results
Computational chemistry: 3D correlation. Systolic array for direct correlation. Speedup: 400x-1000x relative to PC (using FFT).
Microarray analysis: regression analysis of disease vs. healthy state. Speedup: ~1000x relative to PC.
Approximate string matching: dynamic programming (Smith-Waterman). 2.23-9.68 × 10^9 character comparisons/sec.
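For readers unfamiliar with the Smith-Waterman recurrence named above, here is a plain software sketch of the local-alignment score. It is not the FPGA design from the talk; the scoring values (match=2, mismatch=-1, linear gap=-1) are assumptions chosen for illustration.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 2,
                         mismatch: int = -1,
                         gap: int = -1) -> int:
    """Best local alignment score of a and b with a linear gap penalty."""
    prev = [0] * (len(b) + 1)  # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # Each cell is one character comparison; scores never drop below 0.
            curr[j] = max(0,
                          prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                          prev[j] + gap,       # gap in b
                          curr[j - 1] + gap)   # gap in a
            best = max(best, curr[j])
        prev = curr
    return best


print(smith_waterman_score("XXACGTXX", "ACGT"))  # 8
```

Each inner-loop iteration corresponds to one character comparison, the unit in which the slide reports throughput.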

12 Work in Progress
XML representation for models:
Define abstract application interface.
Define HW in terms of abstract interface.
Define abstract FPGA resources & model constraints.
Concrete representation of application logic:
Create concrete application logic.
Create concrete FPGA resource description.
Bind concretions to abstract model.
Create synthesizable output: repeatable elements scaled to actual FPGA resource limits.

