Platforms, ASIPs and LISATek Federico Angiolini DEIS Università di Bologna.

Slides:



Advertisements
Similar presentations
Xtensa C and C++ Compiler Ding-Kai Chen
Advertisements

CS1104: Computer Organisation School of Computing National University of Singapore.
TIE Extensions for Cryptographic Acceleration Charles-Henri Gros Alan Keefer Ankur Singla.
Intro to Computer Org. Pipelining, Part 2 – Data hazards + Stalls.
A reconfigurable system featuring dynamically extensible embedded microprocessor, FPGA, and customizable I/O Borgatti, M. Lertora, F. Foret, B. Cali, L.
LOGO HW/SW Co-Verification -- Mentor Graphics® Seamless CVE By: Getao Liang March, 2006.
Extensible Processors. 2 ASIP Gain performance by:  Specialized hardware for the whole application (ASIC). −  Almost no flexibility. −High cost.  Use.
RTL Processor Synthesis for Architecture Exploration and Implementation Schliebusch, O. Chattopadhyay, A. Leupers, R. Ascheid, G. Meyr, H. Steinert, M.
Behavioral Design Outline –Design Specification –Behavioral Design –Behavioral Specification –Hardware Description Languages –Behavioral Simulation –Behavioral.
Configurable System-on-Chip: Xilinx EDK
Processor Architectures and Program Mapping 5kk10 TU/e 2006 Henk Corporaal Jef van Meerbergen Bart Mesman.
February 21, 2008 Center for Hybrid and Embedded Software Systems Mapping A Timed Functional Specification to a Precision.
The Effect of Data-Reuse Transformations on Multimedia Applications for Different Processing Platforms N. Vassiliadis, A. Chormoviti, N. Kavvadias, S.
Climate Machine Update David Donofrio RAMP Retreat 8/20/2008.
UCB November 8, 2001 Krishna V Palem Proceler Inc. Customization Using Variable Instruction Sets Krishna V Palem CTO Proceler Inc.
Trend towards Embedded Multiprocessors Popular Examples –Network processors (Intel, Motorola, etc.) –Graphics (NVIDIA) –Gaming (IBM, Sony, and Toshiba)
1 Chapter 4. 2 Measure, Report, and Summarize Make intelligent choices See through the marketing hype Key to understanding underlying organizational motivation.
(1) Introduction © Sudhakar Yalamanchili, Georgia Institute of Technology, 2006.
1 Presenter: Ming-Shiun Yang Sah, A., Balakrishnan, M., Panda, P.R. Design, Automation & Test in Europe Conference & Exhibition, DATE ‘09. A Generic.
Principles of Programming Chapter 1: Introduction  In this chapter you will learn about:  Overview of Computer Component  Overview of Programming 
Development in hardware – Why? Option: array of custom processing nodes Step 1: analyze the application and extract the component tasks Step 2: design.
Basics and Architectures
1 3-General Purpose Processors: Altera Nios II 2 Altera Nios II processor A 32-bit soft core processor from Altera Comes in three cores: Fast, Standard,
COMPUTER SCIENCE &ENGINEERING Compiled code acceleration on FPGAs W. Najjar, B.Buyukkurt, Z.Guo, J. Villareal, J. Cortes, A. Mitra Computer Science & Engineering.
CASTNESS11, Rome Italy © 2011 Target Compiler Technologies L 1 Ideas for the design of an ASIP for LQCD Target Compiler Technologies CASTNESS’11, Rome,
Automated Design of Custom Architecture Tulika Mitra
Designing the WRAMP Dean Armstrong The University of Waikato.
Advanced Computer Architecture, CSE 520 Generating FPGA-Accelerated DFT Libraries Chi-Li Yu Nov. 13, 2007.
ASIP Architecture for Future Wireless Systems: Flexibility and Customization Joseph Cavallaro and Predrag Radosavljevic Rice University Center for Multimedia.
High Performance Embedded Computing © 2007 Elsevier Lecture 3: Design Methodologies Embedded Computing Systems Mikko Lipasti, adapted from M. Schulte Based.
System Design with CoWare N2C - Overview. 2 Agenda q Overview –CoWare background and focus –Understanding current design flows –CoWare technology overview.
High Performance Embedded Computing © 2007 Elsevier Chapter 1, part 2: Embedded Computing High Performance Embedded Computing Wayne Wolf.
SPREE RTL Generator RTL Simulator RTL CAD Flow 3. Area 4. Frequency 5. Power Correctness1. 2. Cycle count SPREE Benchmarks Verilog Results 3. Architecture.
IT253: Computer Organization
Configurable, reconfigurable, and run-time reconfigurable computing.
1 Towards Optimal Custom Instruction Processors Wayne Luk Kubilay Atasu, Rob Dimond and Oskar Mencer Department of Computing Imperial College London HOT.
Developing software and hardware in parallel Vladimir Rubanov ISP RAS.
Embedded Systems Design: A Unified Hardware/Software Introduction 1 Chapter 3 General-Purpose Processors: Software.
EE3A1 Computer Hardware and Digital Design
MILAN: Technical Overview October 2, 2002 Akos Ledeczi MILAN Workshop Institute for Software Integrated.
Dual-Pipeline Heterogeneous ASIP Design Swarnalatha Radhakrishnan, Hui Guo, Sri Parameswaran School of Computer Science & Engineering University of New.
1 Computer Architecture Part II-B: CPU Instruction Set.
6. A PPLICATION MAPPING 6.3 HW/SW partitioning 6.4 Mapping to heterogeneous multi-processors 1 6. Application mapping (part 2)
Ted Pedersen – CS 3011 – Chapter 10 1 A brief history of computer architectures CISC – complex instruction set computing –Intel x86, VAX –Evolved from.
A few issues on the design of future multicores André Seznec IRISA/INRIA.
A Hybrid Design Space Exploration Approach for a Coarse-Grained Reconfigurable Accelerator Farhad Mehdipour, Hamid Noori, Hiroaki Honda, Koji Inoue, Kazuaki.
System-level power analysis and estimation September 20, 2006 Chong-Min Kyung.
Architecture Selection of a Flexible DSP Core Using Re- configurable System Software July 18, 1998 Jong-Yeol Lee Department of Electrical Engineering,
Teaching The Principles Of System Design, Platform Development and Hardware Acceleration Tim Kranich
Survey of multicore architectures Marko Bertogna Scuola Superiore S.Anna, ReTiS Lab, Pisa, Italy.
Application-Specific Customization of Soft Processor Microarchitecture Peter Yiannacouras J. Gregory Steffan Jonathan Rose University of Toronto Electrical.
April 15, 2013 Atul Kwatra Principal Engineer Intel Corporation Hardware/Software Co-design using SystemC/TLM – Challenges & Opportunities ISCUG ’13.
1 Processor design Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section 11.3.
Addressing modes, memory architecture, interrupt and exception handling, and external I/O. An ISA includes a specification of the set of opcodes (machine.
Compilers: History and Context COMP Outline Compilers and languages Compilers and architectures – parallelism – memory hierarchies Other uses.
System-on-Chip Design
Programmable Hardware: Hardware or Software?
Andreas Hoffmann Andreas Ropers Tim Kogel Stefan Pees Prof
ECE354 Embedded Systems Introduction C Andras Moritz.
SOFTWARE DESIGN AND ARCHITECTURE
课程名 编译原理 Compiling Techniques
CSCI/CMPE 3334 Systems Programming
Matlab as a Development Environment for FPGA Design
The performance requirements for DSP applications continue to grow and the traditional solutions do not adequately address this new challenge Paradigm.
A High Performance SoC: PkunityTM
To DSP or Not to DSP? Chad Erven.
1.1 The Characteristics of Contemporary Processors, Input, Output and Storage Devices Types of Processors.
HIGH LEVEL SYNTHESIS.
Processor design Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section 11.3.
Processor design Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section 11.3.
Presentation transcript:

Platforms, ASIPs and LISATek Federico Angiolini DEIS Università di Bologna

Platform-Based Design

Competing Imperatives Technology push: high-volume products feasible design Marketing push: fast turnaround differentiated products IBM PowerPC 750 Nokia 9210

Two Possible Alternatives... Full custom design Do-it-yourself: nobody knows goals better than yourself, you can do it best Maximize overall design efficiency Platforms Observation: any given space has a limited number of good solutions to challenges Let’s capture them in a reusable, configurable package

What Is a Platform? A partial design: for a particular type of system/application includes embedded processor(s) may include embedded software customizable to customer requirements: software some HW components IBM CoreConnect

Example: STMicro Nomadik Main Core Memory System HW Accelerators I/Os

Platforms and Embedded Computing Platforms rely on embedded processors: can be customized through software considerable design effort can be put into CPU Many platforms are complex heterogeneous multiprocessors Agere StarPro

Platform Use Methodology Start with reference design Evaluate differences required for your features Evaluate hardware & software changes Implement hardware & software changes (in parallel) Integrate and test

Why ASIPs

The Big Chasm: Platforms vs. Full-Custom Platform has few degrees of freedom: harder to differentiate limited flexibility could simply be insufficient... Full-custom instead: much bigger design effort extremely long design cycles limited reusability Isn’t there anything in between? :-(

Example: the Big Chasm We have been successfully using some pre-existing architecture (platform or full-custom) A brand new shiny software routine now requires adding an FPU or it will run 500% too slow Platform: ouch – it is not available, and we can’t really work under the hood Full-custom: ouch – it will take months to design one, then who tells our custom compiler how to use it? Platform is inflexible, full-custom is a mess What do we do??

Enter ASIPs Application-Specific IP cores Let’s take a pretty generic IP core (for example, a general-purpose processor) Let’s see what it is missing: Too slow for some multimedia operation (e.g. FFT)? Too limited functionality for memory management?... Let’s extend it by adding instructions to its Instruction Set, and the corresponding hardware! Flexible (add extensions as needed by apps) Reusable (same basis, many flavours)

But Wait a Second... Something is Missing? Normally it takes a lot of effort to tweak the RTL description of a processor core! How do we simulate and validate the extensions? Do we also need to write a simulator? If I just add stuff to my core, gcc won’t know, right? So do I have to write assembly? Wait a second – the assembler doesn’t know either, yet?

Processor Description Languages and ASIP Toolchains Idea: let’s describe processors in a simple, ad-hoc language (i.e. neither C nor VHDL) This language easily allows for changing Instruction Set, register file properties, pipeline operation... This language also allow for automatic generation of supporting toolchains

The LISATek Flow Target Specification Evaluate: Area, Frequency, Power Evaluate: Profiling, Execution Speed LISA 2.0 Description LISATek Generators Architectural Exploration Hardware Implementation C Compiler Assembler Linker Sim Library Simulation Statistics HDL Description Synthesis Tools Gate-Level Model

Summary: the LISATek Concept Available IP cores might be suboptimal for specific task at hand (e.g. multimedia acceleration) Customized Application-Specific IP cores (ASIPs)! Modeling custom IP cores is time consuming IP core description language! Custom IP cores require new compilation toolchain Automatic generation! Silicon prototypes require effort and validation Automatic HDL translation! Flow also applies to “standard” IP cores (DSP, RISC, SIMD, VLIW…)

Example: Infineon DSP Application LISA vs. traditional model description Development Time Model Lines of Code Relative Area Relative Clock Period Traditional VHDL 5 months LISA + SystemC 2 months %+22% Schliebusch 2004 Note: in fact, the datapath is automatically generated, but manual re-coding is advised

LISATek Demo Courtesy: Sergio Foresta

Compiler Generation Courtesy: Davide Rossi

Compiler Generation Flow

Compiler Designer: Register

Compiler Designer: Datalayout

Compiler Designer: Conventions

Compiler Designer: Latency Table

Compiler Designer: Reservation Table

Compiler Designer: Matcher (1) CMIRASM MATCHER

Compiler Designer: Matcher (2)