Hy-C A Compiler Retargetable for 2014 and beyond Philip Sweany 4/29/2014.

Slides:



Advertisements
Similar presentations
Reconfigurable Computing After a Decade: A New Perspective and Challenges For Hardware-Software Co-Design and Development Tirumale K Ramesh, Ph.D. Boeing.
Advertisements

An Overview Of Virtual Machine Architectures Ross Rosemark.
Xtensa C and C++ Compiler Ding-Kai Chen
CPU Review and Programming Models CT101 – Computing Systems.
8. Code Generation. Generate executable code for a target machine that is a faithful representation of the semantics of the source code Depends not only.
GPGPU Introduction Alan Gray EPCC The University of Edinburgh.
Extensible Processors. 2 ASIP Gain performance by:  Specialized hardware for the whole application (ASIC). −  Almost no flexibility. −High cost.  Use.
CSCE 212 Chapter 4: Assessing and Understanding Performance Instructor: Jason D. Bakos.
Behavioral Design Outline –Design Specification –Behavioral Design –Behavioral Specification –Hardware Description Languages –Behavioral Simulation –Behavioral.
Platforms, ASIPs and LISATek Federico Angiolini DEIS Università di Bologna.
Define Embedded Systems Small (?) Application Specific Computer Systems.
Configurable System-on-Chip: Xilinx EDK
Processor Architectures and Program Mapping 5kk10 TU/e 2006 Henk Corporaal Jef van Meerbergen Bart Mesman.
1 Breakout thoughts (compiled with N. Carter): Where will RAMP be in 3-5 Years (What is RAMP, where is it going?) Is it still RAMP if it is mapping onto.
Dynamically Reconfigurable Architectures: An Overview Juanjo Noguera Dept. Computer Architecture (DAC-UPC)
Embedded Systems in Silicon TD5102 Henk Corporaal Technical University Eindhoven DTI / NUS Singapore.
The Effect of Data-Reuse Transformations on Multimedia Applications for Different Processing Platforms N. Vassiliadis, A. Chormoviti, N. Kavvadias, S.
Reconfigurable Computing in the Undergraduate Curriculum Jason D. Bakos Dept. of Computer Science and Engineering University of South Carolina.
UCB November 8, 2001 Krishna V Palem Proceler Inc. Customization Using Variable Instruction Sets Krishna V Palem CTO Proceler Inc.
Fall 2001CS 4471 Chapter 2: Performance CS 447 Jason Bakos.
Lecture 3: Computer Performance
SSS 4/9/99CMU Reconfigurable Computing1 The CMU Reconfigurable Computing Project April 9, 1999 Mihai Budiu
(1) Introduction © Sudhakar Yalamanchili, Georgia Institute of Technology, 2006.
EKT303/4 PRINCIPLES OF PRINCIPLES OF COMPUTER ARCHITECTURE (PoCA)
(1.1) COEN 171 Programming Languages Winter 2000 Ron Danielson.
Trigger design engineering tools. Data flow analysis Data flow analysis through the entire Trigger Processor allow us to refine the optimal architecture.
ICOM 5995: Performance Instrumentation and Visualization for High Performance Computer Systems Lecture 7 October 16, 2002 Nayda G. Santiago.
Chapter 1. Introduction.
Advanced Computer Architecture, CSE 520 Generating FPGA-Accelerated DFT Libraries Chi-Li Yu Nov. 13, 2007.
ASIP Architecture for Future Wireless Systems: Flexibility and Customization Joseph Cavallaro and Predrag Radosavljevic Rice University Center for Multimedia.
Configurable, reconfigurable, and run-time reconfigurable computing.
1 Towards Optimal Custom Instruction Processors Wayne Luk Kubilay Atasu, Rob Dimond and Oskar Mencer Department of Computing Imperial College London HOT.
CAPS project-team Compilation et Architectures pour Processeurs Superscalaires et Spécialisés.
R2D2 team R2D2 team Reconfigurable and Retargetable Digital Devices  Application domains Mobile telecommunications  WCDMA/UMTS (Wideband Code Division.
HPC User Forum Back End Compiler Panel SiCortex Perspective Kevin Harris Compiler Manager April 2009.
Lecture 4: MIPS Instruction Set Reminders: –Homework #1 posted: due next Wed. –Midterm #1 scheduled Friday September 26 th, 2014 Location: TODD 430 –Midterm.
Computer Organization and Architecture Tutorial 1 Kenneth Lee.
From lecture slides for Computer Organization and Architecture: Designing for Performance, Eighth Edition, Prentice Hall, 2010 CS 211: Computer Architecture.
Compilers for Embedded Systems Ram, Vasanth, and VJ Instructor : Dr. Edwin Sha Synthesis and Optimization of High-Performance Systems.
MILAN: Technical Overview October 2, 2002 Akos Ledeczi MILAN Workshop Institute for Software Integrated.
Hy-C A Compiler Retargetable for Single-Chip Heterogeneous Multiprocessors Philip Sweany 8/30/2013.
Levels of Abstraction Computer Organization. Level of Abstraction u Provides users with concepts/tools to solve problem at that level u Implementation.
6. A PPLICATION MAPPING 6.3 HW/SW partitioning 6.4 Mapping to heterogeneous multi-processors 1 6. Application mapping (part 2)
DSP Architectures Additional Slides Professor S. Srinivasan Electrical Engineering Department I.I.T.-Madras, Chennai –
Analysis of Programming Languages (2). 2 LANGUAGE DESIGN CONSTRAINTS  Computer architecture  Technical setting  Standards  Legacy systems.
Hybrid Multi-Core Architecture for Boosting Single-Threaded Performance Presented by: Peyman Nov 2007.
Hy-C A Compiler Retargetable for Single-Chip Heterogeneous Multiprocessors Philip Sweany 8/27/2010.
Content Project Goals. Workflow Background. System configuration. Working environment. System simulation. System synthesis. Benchmark. Multicore.
A Design Flow for Optimal Circuit Design Using Resource and Timing Estimation Farnaz Gharibian and Kenneth B. Kent {f.gharibian, unb.ca Faculty.
Memory-Aware Compilation Philip Sweany 10/20/2011.
Compacting ARM binaries with the Diablo framework – Dominique Chanet & Ludo Van Put Compacting ARM binaries with the Diablo framework Dominique Chanet.
Performance Computer Organization II 1 Computer Science Dept Va Tech January 2009 © McQuain & Ribbens Defining Performance Which airplane has.
EU-Russia Call Dr. Panagiotis Tsarchopoulos Computing Systems ICT Programme European Commission.
ASIC/FPGA design flow. Design Flow Detailed Design Detailed Design Ideas Design Ideas Device Programming Device Programming Timing Simulation Timing Simulation.
Real-Time System-On-A-Chip Emulation.  Introduction  Describing SOC Designs  System-Level Design Flow  SOC Implemantation Paths-Emulation and.
ECE 587 Hardware/Software Co- Design Lecture 23 LLVM and xPilot Professor Jia Wang Department of Electrical and Computer Engineering Illinois Institute.
Compiler Research How I spent my last 22 summer vacations Philip Sweany.
K-Nearest Neighbor Digit Recognition ApplicationDomainConstraintsKernels/Algorithms Voice Removal and Pitch ShiftingAudio ProcessingLatency (Real-Time)FFT,
Heterogeneous Processing KYLE ADAMSKI. Overview What is heterogeneous processing? Why it is necessary Issues with heterogeneity CPU’s vs. GPU’s Heterogeneous.
SUBJECT : DIGITAL ELECTRONICS CLASS : SEM 3(B) TOPIC : INTRODUCTION OF VHDL.
Computer Organization and Architecture Lecture 1 : Introduction
Andreas Hoffmann Andreas Ropers Tim Kogel Stefan Pees Prof
Defining Performance Which airplane has the best performance?
Why to use the assembly and why we need this course at all?
Want to Write a Compiler?
CSCE 212 Chapter 4: Assessing and Understanding Performance
Dynamically Reconfigurable Architectures: An Overview
NetPerL Seminar Hardware/Software Co-Design
Research: Past, Present and Future
Presentation transcript:

Hy-C A Compiler Retargetable for 2014 and beyond Philip Sweany 4/29/2014

In the Beginning We compilers specific to: – One language (C, Fortran, Lisp, Java, Ruby, …) – One computer architecture (x86, Mips, …) First retargetable compiler concept was to mix M front-ends (language-specific) and N back- ends (architecture-specific) pcc (portable C compiler)

Then … Two approaches – Generic --- build compiler for an instruction set (ISA) that generates “ok” code for lots of architectures e.g. gcc, llvm – Build a very detailed architecture language AND include ISA description as well e.g. LisaTek

Problem Supporting retargetable ISA (gcc) alone is daunting. Adding hardware description on to it makes retargeting almost impossible. Until now, that is.

Hybrid Computing Heterogeneous processors on single chip – “CPU” – FPGA – ASIC – N “CPU”s, M FPGAs, K ASICs Tradeoffs of performance, power, flexibility Wave of the future (?) (or maybe the present?)

CPU 1 CPU 2 CPU m Multi-CPU FPGA 1 FPGA 2 FPGA n Multi-FPGA Shared Memory Generic Hybrid Architecture

System Specification Partitioning CPU Compiler FPGA Synthesis CPU Power-Performance Model FPGA Power-Performance Model Source Code Generic Hy-C Tools Optimization Control Objectives/Constraints

Veyron Tesla Ducati Multi-CPU Shared Memory OMAP Resources (old)

OMAP Processor Resources Chiron – 2 x 600 MHz (2 symmetric processors each at 600 MHz with shared L2) – Power 600uW / MHz Tesla – DSP Sub-System (C64x derivative); 400 MHz, 8-wide ILP – Power 200uW / MHz Ducati – 200 MHz (targeted for control, low latency code) – Power 100uW / MHz

StrongArm C64x FPGA Shared Memory “Canonical” Resources WimpyArm

“Canonical” Processor Resources StrongArm – 2 x 600 MHz (2 symmetric processors each at 600 MHz with shared L2) – Power 600uW / MHz C64x – DSP Sub-System (C64x derivative); 400 MHz, 8-wide ILP – Power 200uW / MHz WimpyArm – 200 MHz (targeted for control, low latency code) – Power 100uW / MHz FPGA fabric

System Specification Partitioning Strong Wimpy Source Code Hy-C for Canonical Chip Optimization Control Objectives/Constraints C64x FPGA

Open Issue(s) How should we describe the architecture? How will we build “programs” for the FPGA? How should we describe the optimization constraints? How/when shall we implement this beast? How will we evaluate the “performance” of the generated code?