Published by Chloe McGee. Modified over 9 years ago.
An Automated Development Framework for a RISC Processor with Reconfigurable Instruction Set Extensions

Nikolaos Vassiliadis, George Theodoridis and Spiridon Nikolaidis
Section of Electronics and Computers, Department of Physics, Aristotle University of Thessaloniki (AUTH), Greece
E-mail: nivas@physics.auth.gr

Introduction

Characteristics of modern applications:
- Broad diversity of algorithms
- Rapid evolution of standards
- High-performance demands

Requirements to amortize cost over high production volumes:
- High levels of flexibility => fast time-to-market
- Adaptability => increased reusability

An appealing option: Reconfigurable Instruction Set Processor (RISP)
- Couple a standard processor with Reconfigurable Hardware (RH)
- Processor = bulk of the flexibility
- RH = adaptation to the targeted application through dynamic Instruction Set Extensions (ISEs)

Motivation

New features are needed in order to program such hybrid architectures:
- Partition the application between the RH and the processor
- Configure the RH and feed the compiler with RH-related information (e.g. execution and reconfiguration latencies)
- Introduce extra code to pass data and control directives to and from the RH

Transparent incorporation of these new features is a must, in order to keep the time-to-market close to that of the traditional software design flow and to continue to target software-oriented groups of users with little or no experience in hardware design.

Objective

Design a development framework for a dynamic Reconfigurable Instruction Set Processor (RISP):
- Fully automated: hides all RH-related issues, requiring limited interaction with the user
- Appropriate for exploration and fine-tuning of the architecture towards a target application: allows different values for various architectural parameters
- Retargetable to different instances of the architecture

Target Architecture

Core processor
- Single issue, 5 pipeline stages, 32-bit RISC
- Can issue one reconfigurable instruction per execution cycle
- Reconfigurable instructions explicitly encoded in the instruction word

Interface
- Provides control and data communication

Reconfigurable Functional Unit (RFU)
- Tightly coupled to the core's pipeline
- Provides reconfigurable ISEs
- ISE = MISOs of the core's primitive instructions (MISO: Multiple-Input-Single-Output)
- Multi-context configuration memory: can provide a different configuration bitstream per execution cycle
- 1-D array of coarse-grain processing elements (PEs)
- "Floating" between two concurrent pipeline stages
- Exploits spatial/temporal computation

Conclusions

- An automated development framework for a target RISP has been introduced
- The framework can be used both to fine-tune an instance of the RISP at design time and to program it after fabrication
- The framework hides all issues related to the reconfigurable hardware from the user
- The target RISP architecture can be used by any software-oriented user with no knowledge of hardware design
- Support of a new architecture instance by the framework is possible without any modification

Acknowledgement

This work was supported by the General Secretariat of Research and Technology of Greece and the European Union.

Framework usage demonstration - Experimental Results

- A set of benchmarks derived from different benchmarking suites, such as MiBench and Powerstone, was considered
- The possibilities of using the framework to fine-tune critical architectural parameters and/or to program/evaluate a specific instance of the architecture are demonstrated
- Comparisons were performed considering the RISC processor of the architecture with and without support of the RFU

Exploration for fine-tuning of critical architecture parameters
[Figure panels: Speedup vs. number of PEs; Speedup vs. PEs/MISO inputs; Speedup vs. number of reconfigurable instructions; Code size and instruction fetches reduction]

Configuration of an evaluation instance

Configuration          Value
Granularity            32 bits
Number of PEs          8
PE functionality       ALU, Shift, Multiply
Config. memory size    16 words of 134 bits

Evaluation of the derived instance
- x2.91 avg. speedup
- 38% avg. code size reduction => lower memory requirements
- 62% avg. reduction in instruction memory accesses => significant power savings

Development Framework Flow

Profiling
- Instrument the CDFG with profiling annotations
- Convert the CDFG to equivalent C code
- Compile and execute

Pattern generation
- Identify MISO clusters of operations

Mapping
- Assign operations to PEs / route the 1-D array
- Analyze data paths to minimize delay
- Report candidate instruction semantics

Architectural parameters
- Maximum number of inputs in the pattern (i.e. # of register file read ports)
- Permitted types of operations in the pattern (i.e. PE functionality)
- Maximum number of operations in the pattern (i.e. # of PEs)
- Number of different ISEs

Selection
- Graph isomorphism to discover identical instructions
- Rank instructions based on the estimated offered speedup
- Select the best ISEs
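
The last steps of the flow (filter candidate patterns against the architectural parameters, rank them by estimated speedup, select the best ISEs) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `Candidate`, `is_feasible`, `estimated_gain` and `select_ises` are hypothetical, and so is the gain model (execution count times cycles saved). The limit of 4 pattern inputs is an assumption, since the slides do not state the number of register file read ports. The graph isomorphism step is left out: the sketch assumes identical candidates have already been merged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate MISO pattern (all fields are illustrative assumptions)."""
    name: str
    ops: tuple        # operation kinds in the pattern, e.g. ("alu", "shift")
    num_inputs: int   # register inputs read by the pattern
    exec_count: int   # executions observed during profiling
    sw_cycles: int    # cycles when run as primitive instructions
    rfu_cycles: int   # assumed cycles as one reconfigurable instruction


def is_feasible(c, max_inputs, max_ops, allowed_ops):
    """Check a pattern against the architectural parameters."""
    return (c.num_inputs <= max_inputs
            and len(c.ops) <= max_ops
            and all(op in allowed_ops for op in c.ops))


def estimated_gain(c):
    """Assumed gain model: cycles saved per execution times execution count."""
    return c.exec_count * (c.sw_cycles - c.rfu_cycles)


def select_ises(candidates, max_inputs, max_ops, allowed_ops, max_ises):
    """Keep feasible, profitable patterns; return the max_ises best ones."""
    feasible = [c for c in candidates
                if is_feasible(c, max_inputs, max_ops, allowed_ops)
                and estimated_gain(c) > 0]
    ranked = sorted(feasible, key=estimated_gain, reverse=True)
    return ranked[:max_ises]


# Made-up candidates; 8 PEs and ALU/Shift/Multiply follow the evaluation
# instance in the slides, while max_inputs=4 and max_ises=2 are assumptions.
candidates = [
    Candidate("A", ("alu", "shift"), 2, 1000, 2, 1),
    Candidate("B", ("alu", "multiply", "alu"), 3, 400, 4, 1),
    Candidate("C", ("alu", "divide"), 2, 5000, 10, 1),  # op type not permitted
    Candidate("D", ("alu",) * 9, 2, 900, 9, 1),         # needs more than 8 PEs
    Candidate("E", ("shift", "alu"), 5, 800, 2, 1),     # too many inputs
    Candidate("F", ("alu", "alu"), 2, 100, 2, 1),
]

chosen = select_ises(candidates, max_inputs=4, max_ops=8,
                     allowed_ops={"alu", "shift", "multiply"}, max_ises=2)
print([c.name for c in chosen])  # prints ['B', 'A']
```

Changing `max_ops`, `max_inputs` or `max_ises` and rerunning the selection mirrors, in a very reduced form, the kind of parameter exploration the framework is described as supporting.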