DSD2001 Reconfigurable Computing: the Roadmap to a New Business Model – and its Impact on SoC Design. TS4: Tuesday, 14.00 hrs. Reiner Hartenstein, University of Kaiserslautern.


1 DSD2001 Reconfigurable Computing: the Roadmap to a New Business Model – and its Impact on SoC Design TS4: Tuesday, 14.00 hrs Reiner Hartenstein University of Kaiserslautern Pirenópolis, GO, Brazil, Sept. 10-15, 2001

2 © 2001, reiner@hartenstein.de http://www.fpl.uni-kl.de University of Kaiserslautern
Conferences on Reconfigurable Logic
Topic adoption by congresses: ASP-DAC, DAC, DATE, ISCAS, SPIE, ...
Conferences: FCCM, FPGA (founded 1992), and FPL (founded 1991 at Oxford, UK).
FPL 2002, the International Conference on Field-programmable Logic and Applications: La Grande Motte (Montpellier, France), Sept. 2-4, http://www.lirmm.fr/fpl2002/
Laboratoire d'Informatique, de Robotique et de Microélectronique de Montpellier.
Paper submission deadline: 15th March 2002. Notification of acceptance: 20th May 2002.

3 >> Introduction
Outline: Introduction; FPGA boom; Coarse Grain Architectures; Fascinating Paradigm Shift; Programming Coarse Grain rDPAs; Principles of Soft Computing Machines; Future developments expected; Conclusions.
Side labels: fine grain; coarse grain; fundamental issues. http://www.uni-kl.de

4 Logic Gate Price Trend (source: Altera)
Chart: price per logic element, normalized to Q1/1993, from Q1 '93 to Q1 '00.
Price per logic element: 40% lower per year.
Data labels: 0.261, 0.086, 0.042, 0.029.
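The "40% lower per year" figure can be sanity-checked by compounding. The following is a minimal sketch, assuming a constant yearly decline that starts at 1.0 in Q1 '93; the function name and the constant-rate model are illustrative and not taken from the Altera chart.

```python
def normalized_price(years_since_q1_1993: int, annual_drop: float = 0.40) -> float:
    """Price relative to Q1/1993 after compounding a fixed yearly decline."""
    return (1.0 - annual_drop) ** years_since_q1_1993


# Print the modelled price for each first quarter from '93 to '00.
for year in range(8):
    label = (93 + year) % 100
    print(f"Q1 '{label:02d}: {normalized_price(year):.3f}")
```

Seven years of compounding give roughly 0.028, which is close to the smallest data label on the slide (0.029), if that label belongs to Q1 '00.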

5 The Impact of Reconfigurable Logic
Reconfigurable platforms bring a new dimension to digital system development and have a strong impact on SoC design.
A large and rapidly growing user base of HDL-savvy designers with FPGA experience.
Flexibility supports turn-around times of minutes instead of months for real-time in-system debugging, profiling, verification, tuning, field maintenance, and field upgrades.
However, completely ignored by CS and CSE curricula.

6 The History of Paradigm Shifts: Makimoto's Wave (published in 1989)
"Mainstream silicon application is switching every 10 years."
Timeline 1957, 1967, 1977, 1987, 1997, 2007, swinging between standard and custom. Standard: TTL, then µproc. and memory. Custom: LSI / MSI, then ASICs and accel's. Then: reconfigurable. Marked on the timeline: 1st design crisis, 2nd design crisis.
"The programmable system-on-a-chip is the next wave."
What's coming next?

7 How's next Wave?
Same timeline (1957 to 2007, custom vs. standard), annotated with Tredennick's paradigm shifts:
hardwired: algorithm fixed, resources fixed
procedural programming: algorithm variable, resources fixed
structural programming: algorithm variable, resources variable
FPGAs; coarse grain RAs.
2007: 4th wave? Hartenstein's Curve: no further wave!

8 The Impact of Makimoto's Paradigm Shifts
Timeline 1957 to 2007, custom vs. standard: TTL; LSI, MSI; µproc., memory; ASICs, accel's; reconfigurable.
Personalization (CAD) before fabrication.
Procedural personalization via RAM-based machine paradigm: the software industry's secret of success.
Structural personalization: RAM-based, before run time. Repeat the success story with a new machine paradigm!
Dr. Makimoto: FPL 2000 keynote.

9 Terminology

10 Reconfigurable Logic going Mainstream
Fine grain: FPGAs killing the ASIC market.
Coarse grain: several startups.
Fastest growing segment of the semiconductor market.
Substantially improved design flow and libraries.
Comprehensive methodology.
One of the goals of this talk: to motivate you by key issues and visionary highlights.
Please lobby for new curricula.

11 >> FPGA boom

12 What is an FPGA?
Figure (Xilinx XC4000E): array of logic blocks (L = Logic Block) and switch boxes (S = Switch Box), connected by single-length lines, double-length lines, and longlines.

13 Top 4 FPGA / PLD Manufacturers 2000
Xilinx 42%, Altera 37%, Lattice 15%, Actel 6%.
$3.7 billion.

14 FPGA market 1998 / 1999 (source: IC Insights Inc.)
Global sales in million $:
1999 rank | Vendor     | 1998 | 1999
1         | Xilinx     | 629  | 899
2         | Altera     | 654  | 837
3         | Lattice    | 206  | 410
4         | Actel      | 154  | 172
5         | Lucent     | 100  | 120
6         | Cypress    | 41   | 43
7         | Quicklogic | 30   | 40
8         | Atmel      | 32   | 38
Meanwhile, Xilinx acquired Philips' MOS PLD business, and Lattice purchased Vantis.

15 FPGAs going Mainstream [Dataquest]
PLD market > $7 billion by 2003.
FPGAs are going into every type of application.
FPGAs will soon reach 50 million system gates.
From an IP standpoint, the FPGA is starting to look like an ASIC.
IP reuse and "pre-fabricated" components for the efficiency of design and use for PLDs.
PLD vendors provide libraries to support their products.
Today Altera and Xilinx own > 65% of the PLD business.

16 EDA trends: away from complex design flow [S. Guccione]
Traditional flow: Schematics / HDL -> Netlister -> Netlist -> Place and Route -> Bitstream.
Trend: HLL -> Compiler.

17 EDA trends: drop traditional separate design flow [S. Guccione]
Software flow: User Code -> Compiler -> Executable.
Hardware flow: Schematics / HDL -> Netlister -> Netlist -> Place and Route -> Bitstream.
Trend: HLL -> Compiler.

18 Embedded hardw. CPU & memory cores [S. Guccione]
CPU core, FPGA core, memory core: embedded CPU and memory available.
HLL -> Compiler.

19 EDA trends: CPU for configuration management [S. Guccione]
On-board microprocessor: the CPU is available anyhow, even along with a little RTOS.
HLL -> Compiler.

20 Configuration Architectures
Straightforward: host (compiler, mapper, RTOS etc.), RAM, soft data path.
Multi-context: host (compiler, mapper, RTOS etc.), multi-context RAM, soft data path.
Configuration caching*: host (compiler, mapper, RTOS etc.), config. cache, RAM, soft data path.
Configuration loading resources: separate configuration fabrics (e.g. FPGA); wormhole routing (KressArray, Colt, PipeRench); one RA part computes code for another RA part (self-reconfiguration).
Dynamic (RTR) vs. static configuration.
*) not a cache in the usual sense!

21 Converging factors for RTR [S. Guccione]
Million-gate FPGAs and co-processing with a standard microprocessor are commonplace.
Direct implementation of complex algorithms.
New tools like the Xilinx JBits tool suite directly support coprocessing and Run Time Reconfiguration (RTR).
Figure: User Java Code, Java Compiler, JBits API, Executable; CPU core, FPGA core, memory core.

22 (5) Static vs. dynamic reconfiguration, 15 min
Supports ASAT, adaptable devices.
Supports in-field debugging and upgrading (new business model).
Supported by on-board / on-chip CPU core.
Requires disciplined implementation to avoid a testing nightmare.
Figure [Kean], page 109: revenue per month over time in months (1, 10, 20, 30), comparing an ASIC product with a reconfigurable product with download (Product, Update 1, Update 2).

23 Configware as the Key Enabler
The configware market is taking off for mainstream.
FPGA-based designs become more complex, even SoC.
No design productivity and quality without good configware libraries (soft IP cores) from various application areas.
Growing number of independent configware houses (soft IP core vendors) and design services.
Xilinx AllianceCORE & Reference Design Alliance et al.
Currently the top FPGA vendors are the key innovators and meet most configware demand.

24 "Driver" & "OS" for FPGAs
Separate EDA software market, comparable to the compiler / OS market in computers.
Cadence, Mentor, and Synopsys just jumped in.
Xilinx and Altera are fabless FPGA vendors.
< 5% of Xilinx / Altera income comes from EDA software.
> 50% of Xilinx people work on support, EDA & configware.

25 >> Coarse Grain Architectures
For a detailed overview see the proceedings.

26 Why coarse grain?
Figure (sources: Proc. ISSCC, ICSPAT, DAC, DSPWorld): log-scale trends from 1960 to 2010 for algorithmic complexity (Shannon's Law), transistors/chip, memory, normalized processor speed (microprocessor / DSP), and battery performance, with wireless generations 1G, 2G, 3G, 4G marked. A second axis shows computational efficiency in mA/MIP (StrongARM, SH7752).

27 Fine-grained vs. coarse-grained reconfiguration
Fine grain is general purpose: slow and area-inefficient, but high parallelism.
Coarse grain is application domain-specific, highly area-efficient, and offers extremely high performance.

28 Reconfigurability Overhead
Figure: FPGA array of logic blocks (L) and switch boxes (S), distinguishing the area used by the application from the resources needed for reconfigurability, partly for configuration code storage ("hidden RAM", not shown).

29 Why Coarse Grain instead of FPGA?
Figure (sources: Proc. ISSCC, ICSPAT, DAC, DSPWorld): transistors / chip on a log scale, 1980 to 2010, with curves labeled memory, microprocessor, FPGA physical, FPGA routed, FPGA logical, and supersystolic physical / logical. Gap annotations: ~10 and ~10 000.
Reconfigurability overhead reduced by up to ~1000.
Drastically smaller configuration memory.
Much faster loading.
A lot more benefits.

30 Commercial RAs
XPU family (IP cores), XPU128: PACT corp., Munich
flexible array: MorphICs
CALISTO: Silicon Spice*
CS2000 family: Chameleon Systems
MECA family: Malleable*
FIPSOC: SIDSA
ACM: Quicksilver Tech
CHESS array: Elixent
*) bought

31 Universal RAs are not feasible
The general-purpose (coarse grain) reconfigurable array appears to be an illusion.
Often the functional resources are not the throughput bottleneck.
Some application areas, e.g. wireless communication, need extremely rich communication resources.
Use domain-specific platform generators!

32 KressArray Family: generic fabrics, a few examples (http://kressarray.de)
Select the function repertory of the rDPU.
Select the nearest neighbour (NN) interconnect: mode, number, and width of NNports. More NNports: rich routing resources. Figure labels: 16, 32, 8, 24, 4, 2.
rDPU usage: route-through only, or route-through and function.
Examples of 2nd-level interconnect: laid out over the rDPU cell, no separate routing areas!

33 KressArray Mapping Example: SNN filter (http://kressarray.de)
Array size: 10 x 16 = 160 rDPUs.
Legend: route-through only; not used; backbus connect.

34 Xplorer Plot: SNN Filter Example [13] (http://kressarray.de)
rDPU with 3 vertical NNports and 2 horizontal NNports, 32 bit.
Legend: operator; operand; result; route-through; route-through-only rDPU; backbus connect.

35 >> Fascinating Paradigm Shift

36 Paradigm Shift: Mainstream Tornado
Development of hypergrowth markets (Harper Business, 1995).

37 The next EDA Industry Revolution
EDA industry paradigm switching every 7 years [Keutzer / Newton]:
1978 Transistor entry: Applicon, Calma, CV...
1985 Schematics entry: Daisy, Mentor, Valid...
1992 Synthesis: Cadence, Synopsys...
1999 (Co-)Compilation: stream-based DPU arrays [Hartenstein]
2006
Makimoto's 3rd wave.

38 It's a General Paradigm Shift!
Using FPGAs (fine grain reconfigurable): just logic synthesis on a strange platform.
Coarse grain reconfigurable arrays (Reconfigurable Computing): a fundamental paradigm shift, ignored by curricula and most R&D scenes.
Replacing concurrent processes by much more efficient parallelism: stream-based computing arrays.
Examples: systolic array* [1980], KressArray** [1995], chip-on-a-day* [2000].
*) hardwired **) reconfigurable

39 Stream-based Computing (2)
Terms: DPU: datapath unit; DPA: datapath array; rDPU: reconfigurable DPU; rDPA: reconfigurable DPA.
Stream-based computing: using a complex pipe network (super-systolic: Kress et al.).

40 Converging Design Flows
The same synthesis method may be used for mapping an algorithm onto both an rDPA [Kress, 1995] and a DPA [Brodersen, 2000].
This synthesis method is a generalization of systolic array synthesis: super-systolic synthesis.
Terms: DPU: datapath unit; DPA: datapath array; rDPU: reconfigurable DPU; rDPA: reconfigurable DPA.

41 Concurrent Computing
Figure: several CPUs, each a DPU with its own instruction sequencer, connected by bus(es) or a switch box.
Extremely inefficient.

42 Stream-based Computing
DPU driven by a data stream from / to memory, or from / to a peripheral interface.
Transport-triggered execution.
No instruction sequencer inside!

43 Stream-based Computing: (r)DPU array
DPU array driven by data streams, for both reconfigurable and hardwired.

44 >>> extremely high efficiency
Avoiding address computation overhead.
Avoiding instruction fetch and interpretation overhead.
High parallelism, massively multiple deep pipelines.
Much less configuration memory.
No routing areas to configure functions from CLBs.

45 >> Programming Coarse Grain RAs

46 Systolic Stream-based Computing System
Systolic array [H. T. Kung, 1980]: an array of DPUs (data path units).
DPU architecture: y + a * x.
The mathematician's synthesis method: from equations, by linear projection or algebraic mapping, to a placement fed by data streams.
Linear pipelines and uniform arrays only. No routing!

47 Computing in space and time
Figure: matrix-vector example with coefficients a11 to a33, input streams x1, x2, x3 and y1^0, y2^0, y3^0, and output stream y1, y2, y3, shown as computing in time and as computing in space (placement).
Migration between the two by re-timing and other transformations (systolic arrays etc.).
This dichotomy is completely ignored by our CS curricula.
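The space/time dichotomy can be made concrete with the y + a * x datapath unit from the systolic slide. The sketch below is an illustration only: it assumes the figure computes y_i = y_i^0 + sum over j of a_ij * x_j, and the function names and the simple one-row-per-tick schedule are hypothetical, not taken from the talk.

```python
def dpu(y_in, a, x):
    """One data path unit as on the systolic slide: y_in + a * x."""
    return y_in + a * x


def compute_in_time(A, x, y0):
    """A single DPU reused step by step: len(A) * len(x) sequential steps."""
    y = list(y0)
    for i, row in enumerate(A):
        for j, a in enumerate(row):
            y[i] = dpu(y[i], a, x[j])
    return y


def compute_in_space(A, x, y0):
    """A linear pipeline of len(x) DPUs; DPU j holds x[j], partial sums stream through.

    Returns the results and the number of clock ticks used.
    """
    m, n = len(A), len(x)
    stage_out = [None] * n      # value latched at the output of each DPU
    y = [None] * m
    ticks = m + n - 1
    for t in range(ticks):
        # Update the last stage first so that each stage still sees its
        # neighbour's output from the previous tick.
        for j in reversed(range(n)):
            i = t - j           # row whose partial sum reaches DPU j at tick t
            if 0 <= i < m:
                y_in = y0[i] if j == 0 else stage_out[j - 1]
                stage_out[j] = dpu(y_in, A[i][j], x[j])
                if j == n - 1:
                    y[i] = stage_out[j]
    return y, ticks
```

For a 3 x 3 matrix the sequential version needs 9 DPU steps, while the pipelined version finishes in 5 ticks with 3 DPUs working concurrently. Both return the same results.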

48 General Stream-based Computing System
Heterogeneous array of DPUs (data path units).
Input: expression tree (e.g. y + a * x) and DPU architectures.
Mapper: simultaneous placement & routing by simulated annealing (Kress DPSS [1995]), yielding a free-form pipe network.
Scheduler: data streams.
The same mapper for both reconfigurable and hardwired.
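To give a feel for mapping by simulated annealing, here is a deliberately reduced sketch. It is not the DPSS algorithm: it does placement only (no routing), uses total Manhattan wire length as the sole cost, and all names, the cooling schedule, and the parameters are assumptions chosen for illustration.

```python
import math
import random


def wire_length(placement, edges):
    """Total Manhattan distance over all connections of the expression graph."""
    return sum(
        abs(placement[a][0] - placement[b][0]) + abs(placement[a][1] - placement[b][1])
        for a, b in edges
    )


def anneal_placement(nodes, edges, rows, cols, rng, steps=4000, t_start=4.0, t_end=0.05):
    """Place each node on its own cell of a rows x cols grid by simulated annealing."""
    if len(nodes) > rows * cols:
        raise ValueError("grid too small")
    # One slot per grid cell; None marks an unused DPU.
    slots = list(nodes) + [None] * (rows * cols - len(nodes))
    rng.shuffle(slots)

    def to_placement(current):
        return {node: divmod(k, cols) for k, node in enumerate(current) if node is not None}

    cost = wire_length(to_placement(slots), edges)
    best, best_cost = list(slots), cost
    for step in range(steps):
        # Geometric cooling from t_start down towards t_end.
        temperature = t_start * (t_end / t_start) ** (step / steps)
        p, q = rng.sample(range(len(slots)), 2)
        slots[p], slots[q] = slots[q], slots[p]
        new_cost = wire_length(to_placement(slots), edges)
        delta = new_cost - cost
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = list(slots), cost
        else:
            slots[p], slots[q] = slots[q], slots[p]  # reject: undo the swap
    return to_placement(best), best_cost
```

A call such as `anneal_placement(["m1", "m2", "a1", "a2", "out"], [("m1", "a1"), ("m2", "a1"), ("a1", "a2"), ("a2", "out")], 3, 3, random.Random(1))` returns a dictionary from operator name to (row, column) plus its wire length. A real mapper would also account for routing resources such as the NNports and backbuses mentioned on the KressArray slides.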

49 Super Pipe Networks
The key is mapping, rather than architecture*.
*) KressArray [1995]

50 Processor Memory Performance Gap

51 Synthesizable Memory Communication (http://kressarray.de)
Efficient memory communication should be directly supported by the mapper tools.
Optimized parallel memory controller: an example by Nageldinger's KressArray Xplorer.
Legend: sequencers; memory ports; application; not used.

52 Memory Communication Architecture
Hot research topic in embedded systems, for low power and for high performance.
Storage context transformations [Herz, others].
Startups provide memory IP or generators.

53 Stream-based Soft Machine
Block diagram: Compiler; Scheduler; Sequencers (data stream generators) with "instructions"; Memory (data memory, memory banks); rDPA.

54 Hot Research Topic: Memory Architectures
High Performance Embedded Memory Architectures
High Performance Memory Communication Architectures [Herz]
Custom Memory Management Methodology [Catthoor]
Data Reuse Transformations [Kougia et al.]
Data Reuse Exploration [Soudris, Wuytack]

55 >> Principles of Soft Computing Machines

56 KressArray DPSS and KressArray Xplorer (Platform Design Space Explorer)
DPSS published at ASP-DAC 1995.
Components shown in the block diagram: ALE-X Compiler, Mapper, Scheduler (data stream schedule), Datapath Generator (design rules, Kress rDPU layout), HDL Generator (VHDL, Verilog), Simulator, Architecture Editor, Mapping Editor, Analyzer (statistical data), Delay Estimator, Power Estimator (power data), Architecture Estimator, Improvement Proposal Generator, Xplorer Inference Engine (FOX), Suggestion Selection, User Interface.
Data: Application Set; user ALE-X code; expression tree; intermediate forms 2 and 3; suggestions; KressArray family parameters.

57 KressArray (Design Space) Platform Space Explorer: Xplorer (http://kressarray.de)
Components: Architecture & Mapping Editor; Statistics; KressArray DPSS; Datastream Generator; HDL Generator; Simulator; Datapath Generator; Delay & Power Estimator; Improvement Proposal Generator.
Inputs: Application Set; user DPSS source input.

58 Design Flow of Domain-specific Architecture Optimization
Nageldinger's KressArray Design Space Xplorer, including a fuzzy logic improvement proposal generator.
Accessible via the internet: http://kressarray.de (runs best with Netscape 4.6.1).

59 Computer vs. Xputer
Computer ("von Neumann"), the wrong machine paradigm: Compiler; Memory holding instructions; Sequencer with program counter as state register; hardwired Datapath. Tightly coupled by compact instruction code. Does not support soft data paths.
Xputer (University of Kaiserslautern), the soft machine paradigm: Compiler and Scheduler; Memory; multiple sequencers with data counters, driven by "instructions"; reconfigurable Datapath Array (also for hardwired). Loosely coupled by decision data bits only.

60 Machine Paradigms

61 Fundamental Ideas available
Data sequencer methodology.
Data-procedural languages (duality with von Neumann), supporting memory bandwidth optimization.
Soft data path synthesis algorithms.
Parallelizing loop transformation methods.
Compilers supporting soft machines.
SW / CW partitioning co-compilers.

62 JPEG zigzag scan pattern
Figure: data counter moving over the pixel map (axes x and y), scan segments numbered 1 to 4.
Declarations:
  EastScan is
    step by [1,0]
  end EastScan;
  SouthScan is
    step by [0,1]
  end SouthScan;
  NorthEastScan is
    loop 8 times until [*,1]
      step by [1,-1]
    endloop
  end NorthEastScan;
  SouthWestScan is
    loop 8 times until [1,*]
      step by [-1,1]
    endloop
  end SouthWestScan;
  HalfZigZag is
    EastScan
    loop 3 times
      SouthWestScan
      SouthScan
      NorthEastScan
      EastScan
    endloop
  end HalfZigZag;
Scan:
  goto PixMap[1,1]
  HalfZigZag;
  SouthWestScan
  uturn (HalfZigZag)
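For readers more used to procedural code, the same address sequence can be produced by a small generator. This is a sketch in Python rather than a translation of the data-procedural declarations on the slide: it uses 0-based coordinates instead of the slide's 1-based PixMap[1,1], and the function name is made up for illustration.

```python
def zigzag(n=8):
    """Yield (x, y) addresses of an n x n block in JPEG zigzag order (0-based)."""
    x = y = 0
    going_up = True  # True: moving north-east, False: moving south-west
    for _ in range(n * n):
        yield x, y
        if going_up:
            if x == n - 1:          # right edge: step south, then turn
                y += 1
                going_up = False
            elif y == 0:            # top edge: step east, then turn
                x += 1
                going_up = False
            else:                   # step by [1, -1]
                x += 1
                y -= 1
        else:
            if y == n - 1:          # bottom edge: step east, then turn
                x += 1
                going_up = True
            elif x == 0:            # left edge: step south, then turn
                y += 1
                going_up = True
            else:                   # step by [-1, 1]
                x -= 1
                y += 1
```

Note how the procedural version needs explicit edge tests and a direction flag, whereas the slide expresses the scan purely as a composition of step vectors and loops with escape conditions.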

63 Similar Programming Language Paradigms
Very easy to learn.

64 Generic Address Generator (GAG) Scheme
GAG = Generic Address Generator, consisting of Limit Stepper, Base Stepper, and Address Stepper.
Figure labels: B0, L0, A, L, B, limit.

65 GAG: Address Stepper
GAG = Generic Address Generator.
Block diagram labels: + / -, Escape Clause, End Detect, Step Counter, init, tag, A, Address, endExec, maxStepCount, B, L, Limit, Base, stepVector.
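The block diagram did not survive extraction, so the following is only a behavioural sketch of what an address stepper with the labels above might do: start at a base, add a step, stop at a limit or after a maximum step count. The function signature and the exact stopping rules are assumptions, not the GAG hardware specification.

```python
def address_stepper(base, limit, step, max_steps=None):
    """Yield base, base + step, ... while within limit and max_steps (if given)."""
    if step == 0:
        raise ValueError("step must be non-zero")
    address = base
    count = 0
    while address <= limit if step > 0 else address >= limit:
        if max_steps is not None and count >= max_steps:
            break               # escape via step counter
        yield address
        address += step
        count += 1


def block_scan(first_base, last_base, base_step, width):
    """Nest two steppers: an outer base stepper and an inner address stepper."""
    for base in address_stepper(first_base, last_base, base_step):
        yield from address_stepper(base, base + width - 1, 1)
```

For example, `list(block_scan(0, 16, 8, 4))` walks the first four addresses of three rows of an 8-word-wide memory: 0 to 3, 8 to 11, and 16 to 19. The point of generating addresses in dedicated hardware is that no instruction stream is spent on address computation.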

66 Generic Sequence Examples
GAG: Limit Slider, Base Slider, Address Stepper.
Figure labels: B0, L0, A.

67 Slider Operation Demo Example
Figure labels: floor F, address, B0.

68 Changing Models of Computation
Contemporary: host programmed via compiler (software); hardwired accelerator(s) (ASICs) designed via CAD. EDA tools needed*; done at vendor site.
Reconfigurable computing: host and reconfigurable accelerator(s) with RAM, both programmed via a co-compiler (software and configware, machine paradigm). Both done at customer site; no hardware experts needed.
*) even 80% of hardware people hate their tools

69 Co-Compilation
We introduce co-compilation: a partitioning compiler takes a high-level programming language source and produces software running on the processor (Computer machine paradigm) and configware running on the reconfigurable accelerators (Xputer "soft" machine paradigm), connected by an interface.
Reconfigurable architecture (RA) instead of hardwired: no CAD! Compilation instead!
Hardware / software co-design turns into configware / software co-design.

70 Jürgen Becker's Co-DE-X Co-Compiler
Input: X-C (X-C is the C language extended by MoPL).
Stages: Analyzer / Profiler; Partitioner with loop transformations, driven by resource parameters.
Host side: GNU C compiler (Computer machine paradigm).
Accelerator side: X-C compiler and DPSS for the KressArray (Xputer machine paradigm).
Resource parameters: supporting different platforms, supporting platform-based design.

71 Loop Transformation Examples (resource parameter driven co-compilation)
Sequential process: loop 1-16 body endloop.
Strip mining: loop 1-8 body endloop and loop 9-16 body endloop, between fork and join.
Loop unrolling: host loops (loop 1-8 trigger endloop, loop 1-4 trigger endloop, loop 1-2 trigger endloop) triggering the reconfigurable array.
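The two transformations named on this slide can be shown on a toy loop. This is a plain sequential sketch: the body function is a placeholder, the strips are executed one after the other rather than forked onto parallel resources, and the unroll factor of 2 is arbitrary.

```python
def body(i):
    """Placeholder loop body."""
    return i * i


def original(n=16):
    """loop 1-n body endloop"""
    return [body(i) for i in range(1, n + 1)]


def strip_mined(n=16, strip=8):
    """Split the iteration space into strips; each strip could run on its own resource."""
    results = []
    for start in range(1, n + 1, strip):
        end = min(start + strip, n + 1)
        results.extend(body(i) for i in range(start, end))
    return results


def unrolled_by_two(n=16):
    """Two copies of the body per iteration; n is assumed to be even."""
    if n % 2:
        raise ValueError("n must be even for this sketch")
    results = []
    for i in range(1, n + 1, 2):
        results.append(body(i))
        results.append(body(i + 1))
    return results
```

All three variants compute the same values. What changes is how many iterations remain on the sequential side and how many body instances could be laid out side by side, which is the knob a resource-parameter-driven co-compiler would turn.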

72 History of Loop Transformations
70ies - 80ies, at process level: sequential to parallel processes, incl. vectorization. David Loveman, 1977; Allen and Kennedy, et al.: loop unrolling, loop fusion, strip mining, ...
1995/97 [Karin Schmidt / Jürgen Becker]: down to datapath level. (Parameter-driven) time to time/space partitioning, e.g. transformation from sequential process to super-systolic.
2000 [Michael Herz]: optimized RA-to-memory communication bandwidth. Multi-dimensional loop unrolling / storage scheme optimization, supporting burst mode and parallel memory banks.

73 >> Future developments expected

74 EH conferences
"Evolvable Hardware" (EH), "Evolutionary Methods" (EM), "Darwinistic Methods", and biologically inspired electronics: a new FPGA application [genetic FPGA], the "DNA" metaphor.
Conferences: EH (NASA/DoD Workshop on Evolvable Hardware), ICES (Evolvable Systems), EuroGP and GP (Genetic Programming), CEC (Congress on Evolutionary Computation), GECCO (Genetic and Evolutionary Computation), EvoWorkshops 2002 (Evolutionary Computing Workshops), MAPLD (Military and Aerospace Applications of Programmable Logic Devices and Technologies), ICGA (Genetic Algorithms).

75 Evolvable Hardware and Computing: what is it?
What is the relation between Reconfigurable Computing and Evolvable Computing / Hardware?
Currently: research on darwinistic methods to generate or optimize IT systems by electronic sex*.
"Chromosome": a synonym for "configuration code".
YAFA: yet another FPGA application.
*) by crossing chromosomes
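"Crossing chromosomes" here means recombining two configuration codes. A minimal sketch of single-point crossover on bit strings follows; the choice of single-point crossover and the string representation are assumptions for illustration, since the slide does not say which genetic operators are used.

```python
import random


def crossover(parent_a, parent_b, rng):
    """Single-point crossover of two equally long configuration bit strings."""
    if len(parent_a) != len(parent_b) or len(parent_a) < 2:
        raise ValueError("parents must have equal length of at least 2")
    cut = rng.randrange(1, len(parent_a))  # cut point strictly inside the string
    child_1 = parent_a[:cut] + parent_b[cut:]
    child_2 = parent_b[:cut] + parent_a[cut:]
    return child_1, child_2
```

Calling `crossover("00000000", "11111111", random.Random(0))` yields two children that each start with one parent's bits and end with the other's. In an evolvable-hardware setting each child would then be loaded as a configuration and scored by a fitness measure.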

76 How important is evolvable computing?
New conferences in their visionary phase.
Some NASA / DoD expectations look unrealistic.
Partly a revival of cybernetics, bionics, etc.
Reminds me of the past AI daze.
Genetic algorithms people (who do not talk to EDA people) dominate the scene. GA suck.
Coming shake-out: the future is hard to guess.

77 Embedded Soft IP Cores
Figure: soft CPU and memory core on the FPGA; HLL -> Compiler.

78 Some soft CPU core examples
core | architecture | platform
MicroBlaze (125 MHz, 70 D-MIPS) | 32 bit standard RISC, 32 reg. by 32 LUT RAM-based reg. | Xilinx (up to 100 on one FPGA)
Nios | 16-bit instr. set | Altera Mercury
Nios (50 MHz, 22 D-MIPS) | 32-bit instr. set | Altera
Nios | 8 bit | Altera Mercury
gr1040 | 16-bit |
gr1050 | 32-bit |
My80 | i8080A | FLEX10K30 or EPF6016
DSPuva16 | 16 bit DSP | Spartan-II
Leon (25 MHz) | SPARC |
ARM7 clone | ARM |
uP1232 | 8-bit CISC, 32 reg. | 200 XC4000E CLBs
REGIS | 8 bits instr. + ext. ROM | 2 XILINX 3020 LCA
Reliance-1 | 12 bit DSP | Lattice: 4 isp30256, 4 isp1016
Popcorn-1 | 8 bit CISC | Altera, Lattice, Xilinx
Acorn-1 | | 1 Flex 10K20
YARD-1A | 16-bit RISC, 2 opd. instr. | old Xilinx FPGA board
xr16 | RISC, integer C | SpartanXL

79 FPGA CPUs in teaching and academic research
UCSC (1990!); Mälardalen University, Eskilstuna, Sweden; Chalmers University, Göteborg, Sweden; Cornell University; Gray Research; Georgia Tech; Hiroshima City University, Japan; Michigan State; Universidad de Valladolid, Spain; Virginia Tech; Washington University, St. Louis; New Mexico Tech; UC Riverside; Tokai University, Japan.

80 Soft rDPA Hardware Design
Figure: memory, soft CPU, miscellaneous, and two soft DPU arrays; HLL -> Compiler.

81 Area efficiency: still relevant today
Rapid technology progress: 50 million system gates soon.
Slower clock: compensated by more parallelism.
Even large rDPAs as soft IP become feasible.
FPGAs for relocatable configware code?
Compatibility at configuration code level?
By > 2005: don't care about area efficiency?

82 >> Conclusions

83 Main problems to be solved (1)
Main EDA tools required:
De facto standard soft IP core libraries.
Tools for much better designer productivity.
Configuration code compatibility by a de facto standard RC platform family.
Compilers accepting a high-level programming language.
Scalable FPGA architectures supporting relocatable configuration code.

84 Main problems to be solved (2)
Compare the most successful microprocessor:
most software written for it, in many application areas;
accepted OS, compilers, and development tools available;
object code compatibility for new µP products.
Needed to become the dominant FPGA vendor:
most configware (soft IP cores) written for it;
widely accepted "OS", compilers, and development tools;
object code compatibility for new FPGA products.

85 Main problems to be solved (3)
Needing HDL-savvy users is a severe limitation.
Easy-to-use C or Java based compilers are needed.
Computing in space vs. computing in time (migration by re-timing and other transformations, systolic arrays etc.): this dichotomy is completely ignored by our CS curricula.
Each programmer and each MBA should have qualified awareness of this dichotomy and of FPGAs.
Curricular innovations are urgently needed.
Lobbying urgently needed.

86 However, current CS Education is based on the Submarine Model
Layers: Algorithm; procedural high-level programming language; assembly language (Software); Hardware.
Hardware invisible: under the surface.
Brain usage: procedural only.
Software faculty colleagues shy away from the paradigm shift: their brain hurts? Can't be: this half has been amputated.
This model disables...

87 Hardware and Software as Alternatives
Algorithm, then partitioning into: software only; software & hardw/configw; hardw/configw only.
Software: procedural. Hardware, configware: structural.
Brain usage: both hemispheres.

88 The Dominance of the Submarine Model
... indicates that our CS education system produces zillions of mentally disabled persons: (procedural) structurally disabled, completely disabled to cope with solutions other than software only.

89 It's time to crush the Submarine Model
Computing in space: von Neumann book already in the 50ies.
Now fundamentals and technology are available.
It's time to innovate CS&E curricula, toward a dichotomy of computing science:
procedural programming, "von Neumann" paradigm, Computer machine: computing in time;
structural programming, Xputer machine paradigm, co-compilation: computing in space.
Migration between the two by re-timing and other transformations (systolic arrays etc.).

90 Thank you for listening

91 END

