Published by Martina Marsh. Modified over 9 years ago.
1
An Introduction to Reconfigurable Computing
Mitch Sukalski and Craig Ulmer
Dean R&D Seminar
11 December 2003
2
Reconfigurable Computing… is computation on a platform with reconfigurable (i.e., modifiable at run-time) hardware capable of implementing application-specific algorithms and functionality on demand.
3
Computing Spectrum
Software: General-Purpose CPU
  –Easily reprogrammed
  –Low cost
  –Fundamental bottlenecks
Soft-Hardware: Field Programmable Gate Arrays (FPGAs)
  –Reconfigurable hardware
  –Medium cost
  –Speedup potential
Hardware: Application-Specific Integrated Circuit (ASIC)
  –Not modifiable
  –High cost
  –Extremely fast
[Figures: CPU pipeline (Fetch, Decode, Registers, Execute, Memory, Writeback); data-flow circuit diagrams for the FPGA and ASIC cases]
4
History
1945: Eckert, Mauchly, von Neumann: ENIAC
1945: “von Neumann architecture”
1960: Estrin: Fixed+Variable Structure Computer
1970’s: Simple PLDs
1985: Xilinx introduces first FPGA
1990’s: Custom Computing Machines (CCMs)
1999: FPGAs exceed million logic gates
2002: FPGAs include complex cores
[Figure captions]
  –ENIAC: Connecting computational blocks for an algorithm
  –Fixed+Variable CPU: Users can attach new computational circuits to a fixed ALU
  –The Teramac CCM: Multi-Chip Module of FPGAs
  –Xilinx Virtex FPGA
  –Xilinx Virtex II Pro (image courtesy of rapidio.org)
5
Reconfigurable Computing in Modern HPC
Stand-alone platforms
  –OctigaBay 12K
  –SRC-6
  –Starbridge Hypercomputer
Accelerator cards
  –Timelogic’s DeCypher
  –Nallatech’s BenNUEY
  –Annapolis Micro Systems WILDSTAR II
6
Example: Computational Fluid Dynamics
William Smith & Austars Schnore at GE Global Research
From: “Towards an RCC-based Accelerator for Computational Fluid Dynamics,” ERSA 2003
7
And now for some details…
Field Programmable Gate Arrays (FPGAs)
Common RC design techniques
Reported examples
8
Field-Programmable Gate Arrays (FPGAs)
FPGAs emulate digital logic circuitry
  –Large array of configurable logic blocks
  –Internal routing through programmable interconnection network
FPGAs hold hardware configuration in SRAM
  –Change the digital circuitry by loading new configuration
Design approach:
  –User designs in hardware description language
  –Synthesis tools translate to logic gates
  –Mapping tools target specific FPGA
9
Simplified Logic Block
Emulates logic function
  –Thousands per chip
Lookup Table (LUT)
  –Holds truth table
  –Inputs produce outputs
1-bit registers
  –Hold data between cycles
Note: Greatly simplified
[Figure: logic block containing a LUT and a register]
10
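LUT Example: 1-bit Adder
Truth Table
A  B  C_in | C_out  Sum
0  0  0    | 0      0
0  0  1    | 0      1
0  1  0    | 0      1
0  1  1    | 1      0
1  0  0    | 0      1
1  0  1    | 1      0
1  1  0    | 1      0
1  1  1    | 1      1
[Figure: two logic blocks (LUT plus register), each with inputs labeled A, B, C, 0; one outputs C_out, the other Sum]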
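To make the lookup-table idea concrete, here is a minimal software sketch of the 1-bit adder above. It assumes two 3-input LUTs, one per output. The helper names (`make_lut`, `lut_read`, `full_adder`) are invented for illustration and do not correspond to any vendor tool or FPGA primitive.

```python
from itertools import product

def make_lut(func, n_inputs):
    """Precompute a truth table: one output bit per input combination."""
    return [func(*bits) & 1 for bits in product((0, 1), repeat=n_inputs)]

def lut_read(table, *bits):
    """Use the input bits as an address into the stored truth table."""
    addr = 0
    for b in bits:              # first input is the most significant address bit
        addr = (addr << 1) | b
    return table[addr]

# One LUT per output of the 1-bit adder (hypothetical names).
SUM_LUT = make_lut(lambda a, b, cin: a ^ b ^ cin, 3)
COUT_LUT = make_lut(lambda a, b, cin: (a & b) | (a & cin) | (b & cin), 3)

def full_adder(a, b, cin):
    """Return (C_out, Sum) purely by table lookup, with no logic at read time."""
    return lut_read(COUT_LUT, a, b, cin), lut_read(SUM_LUT, a, b, cin)

if __name__ == "__main__":
    # Prints the same rows as the truth table on the slide.
    for a, b, cin in product((0, 1), repeat=3):
        cout, s = full_adder(a, b, cin)
        print(a, b, cin, cout, s)
```

Loading a different list into `SUM_LUT` changes the function the block computes, which is the software analogue of loading a new configuration into the FPGA.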
11
Routing Data between Logic Blocks
Need to connect logic blocks
Wires and Switchboxes
  –LBs connect to local wires
  –Switchboxes route long connections
Routing set at compile time
  –Performed by tools
12
Reconfiguration
Modern FPGAs SRAM based
  –Can be loaded with new circuitry
Full reconfiguration
  –Few megabytes of configuration
  –Milliseconds
Partial reconfiguration
  –Reprogram only a portion of chip
  –Reduces configuration time
  –Non-trivial, poorly supported
[Figure: FPGA loaded from a full configuration image versus a partial configuration image]
13
Design Techniques
Digital logic design techniques for exploiting FPGAs
14
FPGAs as Computational Accelerators
Use FPGAs as soft-hardware
  –Port algorithm to hardware
  –Run inside FPGA
  –Reuse hardware
Techniques
  –Concurrency, memory, partial evaluation
15
1. Concurrency
Load FPGA with multiple computational circuits
  –Hardware state machines are like threads, but…
  –All tasks are always running
Raw parallelism
  –Units run in parallel
  –Example: Key breaking
Pipelining
  –Chain units together in series
  –Example: Streaming computations, data-flow
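The pipelining bullet can be illustrated with a small clock-by-clock simulation. This is a sketch under simplifying assumptions: every stage takes exactly one cycle, there is a register after each stage, and `None` marks an empty slot, so `None` cannot be used as a data value. The function name `run_pipeline` and the three example stages are made up for illustration.

```python
def run_pipeline(stages, inputs):
    """Simulate a linear pipeline, one clock cycle per loop iteration.

    stages: non-empty list of one-argument functions, one per pipeline stage.
    inputs: values fed into the first stage, one per cycle.
    Returns (results, cycles).
    """
    regs = [None] * len(stages)      # register after each stage; None = empty
    pending = list(inputs)
    results, cycles = [], 0
    # Keep clocking while inputs remain or values are still in flight.
    while pending or any(r is not None for r in regs[:-1]):
        new_regs = []
        for i, stage in enumerate(stages):
            if i == 0:
                src = pending.pop(0) if pending else None
            else:
                src = regs[i - 1]    # value latched on the previous clock edge
            new_regs.append(None if src is None else stage(src))
        regs = new_regs              # all stages update at the same time
        cycles += 1
        if regs[-1] is not None:
            results.append(regs[-1])
    return results, cycles

if __name__ == "__main__":
    stages = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
    print(run_pipeline(stages, [1, 2, 3, 4]))
```

With three stages and four inputs the simulation finishes in 3 + 4 - 1 = 6 cycles, because every stage is busy on a different input at once. Running each input through all three stages before starting the next would take 12.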
16
2. Custom Memory Interactions
Most FPGA cards have multiple memory banks
  –Fetch/store multiple data values at same time
  –Predictable performance (as opposed to caches)
  –Hide address generation
[Figure: FPGA containing units marked X, connected to SRAM Banks 0 through 4]
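A rough way to see the benefit of multiple banks is to count memory cycles for a dot product. The sketch below assumes each bank is single-ported (one read per cycle) and that separate banks can be read in the same cycle. The `Bank` class and `dot_product_cycles` function are hypothetical and model no particular FPGA card.

```python
class Bank:
    """A single-ported memory bank: at most one read per clock cycle."""

    def __init__(self, data):
        self.data = list(data)
        self.reads = 0

    def read(self, addr):
        self.reads += 1
        return self.data[addr]

def dot_product_cycles(x, y, separate_banks):
    """Return (dot product, memory cycles needed) for the chosen layout."""
    if separate_banks:
        bank_x, bank_y = Bank(x), Bank(y)          # x and y in different banks
        offset = 0
    else:
        bank_x = bank_y = Bank(list(x) + list(y))  # both vectors share one bank
        offset = len(x)
    total = 0
    for i in range(len(x)):
        total += bank_x.read(i) * bank_y.read(offset + i)
    # Banks work in parallel, so the busiest bank sets the cycle count.
    cycles = max(bank_x.reads, bank_y.reads)
    return total, cycles

if __name__ == "__main__":
    x, y = [1, 2, 3, 4], [5, 6, 7, 8]
    print(dot_product_cycles(x, y, separate_banks=False))
    print(dot_product_cycles(x, y, separate_banks=True))
```

In this model the shared-bank layout needs two reads per element, while the two-bank layout fetches both operands in the same cycle and halves the memory cycle count.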
17
3. Partial Evaluation
Know data constants at design time
  –Apply to circuits and reduce hardware
  –Synthesis tools perform automatically
Note: FPGAs are unique because we can easily generate new, optimized hardware configurations for each set of constants.
Example: 4-bit Ripple-Carry Adder
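The 4-bit ripple-carry adder example can be sketched in software as a toy netlist builder that folds constants away instead of emitting gates. This is only an illustration of the idea, not how any real synthesis tool works. The gate model (2-input AND/OR/XOR plus NOT), the `Netlist` class, and the constant `K = 5` are all assumptions made for the example.

```python
class Netlist:
    """Builds gates; inputs are wire names (str) or constants (0/1)."""

    def __init__(self):
        self.gates = []                      # (op, in_a, in_b, out_wire)

    def _emit(self, op, a, b=None):
        out = f"n{len(self.gates)}"
        self.gates.append((op, a, b, out))
        return out

    def and_(self, a, b):
        if a == 0 or b == 0: return 0        # x AND 0 = 0: no gate needed
        if a == 1: return b
        if b == 1: return a
        return self._emit("AND", a, b)

    def or_(self, a, b):
        if a == 1 or b == 1: return 1
        if a == 0: return b
        if b == 0: return a
        return self._emit("OR", a, b)

    def xor_(self, a, b):
        if a == 0: return b
        if b == 0: return a
        if a == 1 and b == 1: return 0
        if a == 1: return self._emit("NOT", b)   # x XOR 1 = NOT x
        if b == 1: return self._emit("NOT", a)
        return self._emit("XOR", a, b)

def ripple_carry_adder(nl, a_bits, b_bits):
    """Bits are LSB first. Returns the sum bits followed by the carry-out."""
    carry, sums = 0, []
    for a, b in zip(a_bits, b_bits):
        axb = nl.xor_(a, b)
        sums.append(nl.xor_(axb, carry))
        carry = nl.or_(nl.and_(a, b), nl.and_(axb, carry))
    return sums + [carry]

def evaluate(nl, outputs, inputs):
    """Simulate the netlist and return the outputs packed into an integer."""
    values = dict(inputs)
    val = lambda x: x if isinstance(x, int) else values[x]
    for op, a, b, out in nl.gates:           # gates are in topological order
        if op == "NOT":
            values[out] = 1 - val(a)
        elif op == "AND":
            values[out] = val(a) & val(b)
        elif op == "OR":
            values[out] = val(a) | val(b)
        else:
            values[out] = val(a) ^ val(b)
    return sum(val(o) << i for i, o in enumerate(outputs))

A = [f"a{i}" for i in range(4)]
B = [f"b{i}" for i in range(4)]

general = Netlist()                          # both operands unknown
general_out = ripple_carry_adder(general, A, B)

K = 5                                        # operand known at design time
specialized = Netlist()
special_out = ripple_carry_adder(specialized, A, [(K >> i) & 1 for i in range(4)])

if __name__ == "__main__":
    print(len(general.gates), len(specialized.gates))
```

Under this gate model the general adder needs 17 gates, while the version specialized for the constant 5 needs 9 and still computes `a + 5` for every 4-bit `a`. A different constant would produce a different, possibly smaller, circuit.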
18
RC Performance Examples
CFD: 23 GFLOPS sustained
  –“Towards an RCC-based Accelerator for Computational Fluid Dynamics,” Smith & Schnore, 2003
Adaptive beamforming: 20 GFLOPS
  –Parallel systolic array architecture
  –“20 GFLOPS QR processor on a Xilinx Virtex-E FPGA,” Walke et al., 2000
Real-time holographic video display at 30 fps
  –“Using field programmable gate arrays to scale up the speed of holographic video computation,” Nwodoh
19
In Summary
Reconfigurable computing uses FPGAs to emulate application-specific hardware
  –Achieve performance gains with dedicated hardware
It is possible to implement just about any kind of digital hardware in the FPGA
  –Limited by capacity and effort
  –Resurrect application-specific hardware architectures
  –SIMD, MIMD, Systolic Processor Arrays, Data-Flow…