Fast Compilation for Reconfigurable Hardware

Presentation transcript:

Slide 1: Fast Compilation for Reconfigurable Hardware
Mihai Budiu and Seth Copen Goldstein
Carnegie Mellon University, Computer Science Department
Joint work with Srihari Cadambi, Herman Schmit, Matt Moe, Robert Taylor, Ronald Laufer

FPGA, Feb (c) 1998 by Mihai Budiu

Slide 2: Goal
To program reconfigurable devices using the standard software development processes:
  – Compile C or Java
  – Do it quickly
[Figure: tool flow with the labels Java, Partitioner, CPU, DIL (Data-flow Intermediate Language), Configuration, and Reconfigurable HW; part of the flow is marked "This talk".]

Slide 3: Compiler Performance on 1D DCT (8 inputs, 8 bits each)
Compilation: ~700x faster

Slide 4: The Place and Route Problem
[Figure: a graph of operators (+, ., <<, >>, &, ~, [1,2]) and their interconnections, shown next to an array of processing elements and its interconnection network.]

Slide 5: Our Target
Medium grain processing elements (4 bits)
Pipelined architecture
Virtualized hardware
Local interconnection network
Wide pipelined bus

Slide 6: The Place and Route Problem
[Figure: the same operator graph (+, ., <<, >>, &, ~, [1,2]) and array of processing elements with its interconnection network, now with one row of the array labeled "Stripe".]

Slide 7: Why Place and Route Is Hard
Hard constraints:
  – Stripe width
  – Pipelined bus width
Word-based circuit:
  – Interconnection network switches words
  – Fixed PE size
Scarce input ports for the interconnection network

Slide 8: How We Simplify Place and Route
Computation-oriented programs (restricted language, with unidirectional data flow)
Hardware resources are virtualized
Relatively rich interconnection network
High granularity placement (i.e. one 32-bit adder instead of 100 gates)
A wide pipelined bus is available
Timing is very predictable

Slide 9: The Key Idea
Global analysis and transformations guarantee placeability using lazy noops (conservatively)
Deterministic, greedy place & route (no backtracking)
All passes are linear time in the size of the circuit

Slide 10: Guaranteeing Placement
[Figure: the operator graph (+, ., <<, >>, &, ~, [1,2]) shown twice, without and with an inserted noop; labels: one "Complex permutation", several "Simple permutation".]
The inserted noops are sufficient but not necessary
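The slides do not spell out where the analysis pass inserts noops, so the following is only a minimal sketch under an assumed, maximally conservative policy: one lazy noop on every edge of the data-flow graph. The `Node` class, the `insert_lazy_noops` function, and the example graph are hypothetical names introduced for illustration; they are not taken from the DIL compiler.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    op: str
    inputs: list = field(default_factory=list)  # names of producer nodes
    lazy: bool = False                          # True only for lazy noops

def insert_lazy_noops(nodes):
    """Put one lazy noop on every edge (assumed policy, not the real one).

    `nodes` must be in topological order; so is the result.
    The pass visits each node and each edge once, so it is linear time.
    """
    out = []
    for n in nodes:
        new_inputs = []
        for i, src in enumerate(n.inputs):
            noop = Node(f"noop_{src}_{n.name}_{i}", "noop", [src], lazy=True)
            out.append(noop)              # the noop precedes its consumer
            new_inputs.append(noop.name)  # the consumer now reads the noop
        out.append(Node(n.name, n.op, new_inputs, n.lazy))
    return out

# Hypothetical graph in the spirit of the figures: add = a + ~(a & b)
graph = [
    Node("a", "input"), Node("b", "input"),
    Node("and", "&", ["a", "b"]),
    Node("not", "~", ["and"]),
    Node("add", "+", ["a", "not"]),
]
with_noops = insert_lazy_noops(graph)
lazy_count = sum(n.lazy for n in with_noops)
```

Because a lazy noop costs nothing unless the placer later instantiates it, over-inserting in this way would be consistent with the slide's remark that the inserted noops are sufficient but not necessary.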

Slide 11: Placement of a Non-lazy Noop
[Figure: a graph with nodes &, ~, noop, + and its placement, in which the noop occupies a position.]

Slide 12: Lazy Noops Are Not Placed
[Figure: the same graph with nodes &, ~, noop, + and its placement, in which the lazy noop occupies no position.]

Slide 13: Place and Route Overview
Analysis:
  – Noops have been inserted to guarantee that the graph is routable.
Place & Route:
  – Will determine which lazy noops are instantiated
Next: actual Place and Route

Slide 14: Step 1: Analyze Routability
[Figure: nodes & and ~ marked "Already placed"; the + node and a noop are still to be placed.]
Q: can we place the + given the placement of its ancestors?

Slide 15: Step 2: If a Node Is Unroutable
Solution: promote a lazy noop
[Figure: the graph with nodes +, &, ~ and a noop, before and after the promotion.]

Slide 16: Step 3: Choosing a Noop
Choose the closest noop which is routable.
[Figure: the graph with nodes +, &, ~ and a noop, before and after the choice.]
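Steps 1 to 3 can be illustrated with a toy placer. Everything about the hardware model below is an assumption made for the sake of a runnable example: stripes are numbered from 0, a node may read only values produced at most `max_hop` stripes above it, and a promoted noop simply forwards a value down by `max_hop` stripes. The real routability test depends on PipeRench's interconnect and is not described in the slides. The function `greedy_place` and the example graph are hypothetical.

```python
def greedy_place(nodes, max_hop=1):
    """Greedy placement with on-demand noop promotion (toy model).

    nodes: list of (name, inputs) pairs in topological order.
    Returns (stripe, promoted): the stripe index of every placed node,
    and the noops that had to be instantiated, in promotion order.
    Nodes are visited once and never moved, so there is no backtracking.
    """
    stripe, promoted = {}, []
    for name, inputs in nodes:
        if not inputs:                      # primary input: top stripe
            stripe[name] = 0
            continue
        s = 1 + max(stripe[p] for p in inputs)
        for p in inputs:
            prev = p
            # Step 1: is the value within reach of stripe s?
            while s - stripe[prev] > max_hop:
                # Steps 2-3: promote the nearest noop that prev can reach.
                at = stripe[prev] + max_hop
                noop = f"noop({p}->{name})@{at}"
                stripe[noop] = at
                promoted.append(noop)
                prev = noop
        stripe[name] = s
    return stripe, promoted

# Hypothetical graph: add = a + ~(a & b)
example = [("a", []), ("b", []), ("and", ["a", "b"]),
           ("not", ["and"]), ("add", ["a", "not"])]
stripe, promoted = greedy_place(example)
```

In this example only the edge from `a` to `add` spans more than one stripe, so only noops on that edge are promoted; noops on every other edge stay lazy and occupy no hardware.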

Slide 17: Other Details
Operators are decomposed into pieces for:
  – timing constraints
  – size constraints
When placing, optimize for:
  – register pressure when accessing the bus
  – constraints placed on future nodes
Long critical paths are sliced with pipeline registers
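As an illustration of decomposing an operator for size constraints, the sketch below splits a wide addition into slices that each fit one processing element and chains the carry between them. The 4-bit width comes from the "Our Target" slide; the ripple-carry scheme and the function names `decompose_add` and `sliced_add` are assumptions, since the slides do not say how the compiler actually splits operators.

```python
PE_BITS = 4  # processing-element width, from the "Our Target" slide

def decompose_add(width, pe_bits=PE_BITS):
    """Split a width-bit addition into (lo_bit, hi_bit) slices, one per PE."""
    return [(lo, min(lo + pe_bits, width)) for lo in range(0, width, pe_bits)]

def sliced_add(a, b, width, pe_bits=PE_BITS):
    """Compute (a + b) mod 2**width slice by slice, rippling the carry."""
    result, carry = 0, 0
    for lo, hi in decompose_add(width, pe_bits):
        mask = (1 << (hi - lo)) - 1
        s = ((a >> lo) & mask) + ((b >> lo) & mask) + carry
        result |= (s & mask) << lo   # keep the slice's own bits
        carry = s >> (hi - lo)       # pass the overflow to the next slice
    return result

slices = decompose_add(32)           # a 32-bit adder needs 8 four-bit slices
total = sliced_add(123456789, 987654321, 32)
```

The carry chain between slices lengthens the critical path, which is one reason decomposition interacts with the timing constraints and pipeline registers mentioned on the slide.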

Slide 18: Compilation Times (Seconds on PII/400)

Slide 19: Compilation Speed (PII/400)

Slide 20: Compilation Times Breakdown
[Chart label: Place and route]

Slide 21: Placed Circuit Utilization

Slide 22: Simulated Speed-up vs. 300 MHz

Slide 23: Conclusions
Fast compilation from HLL is achievable (seconds, not tens of minutes)
High-quality output is achievable (60% density)
Linear-time Place and Route is feasible using the technique of lazy noops

Slide 24: Future Work
Time-multiplexing the bus
Porting to commercial FPGAs
Front-end from C/Java to DIL

Slide 25: How We Simplify Place and Route (repeats slide 8)

Slide 26: Our Target Applications
Pipelineable applications:
  – Stream processing (e.g. DSP, encryption)
  – Multimedia processing
  – Vector processing
  – Limited data dependencies
[Figure: input data items v1 to v9 streaming through the HW to produce output data.]
Computational power stems from massive parallelism

Slide 27: Mapping Circuits to PipeRench
[Figure: a small circuit with operators - and + over inputs a, b, c, repeated four times.]

Slide 28: Timing and Size Guarantees

Slide 29: Optimize for Register Pressure
[Figure: a graph with nodes &, ~, + and a noop; labels: "Cost", "Best position".]

Slide 30: Kernels