DAC 2001: Paper 18.2 Center for Embedded Computer Systems, UC Irvine Center for Embedded Computer Systems University of California, Irvine

Presentation transcript:

DAC 2001: Paper 18.2
Speculation Techniques for High Level Synthesis of Control Intensive Designs
Sumit Gupta, Nikil Dutt, Nick Savoiu, Rajesh Gupta, Sunwoo Kim, Alex Nicolau
Center for Embedded Computer Systems, University of California, Irvine
SPARK High Level Synthesis System
Supported by Semiconductor Research Corporation

2 Speculative Code Motions for Improved Synthesis Results
- The Problem:
  - Quality of results of HLS strongly affected by the input behavioral specification
- The Need:
  - High level and compiler transformations that optimize the quality of synthesis results
- Our Focus:
  - Code motions beyond conditionals and loops
- Approach:
  - Use speculation to increase scope of code motions

3 Scheduling with Given Resource Allocation
[Figure: scheduling example under resource constraints; available resources shown as "+" and "<"]
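
Scheduling under a fixed resource allocation can be illustrated with a generic list scheduler. The sketch below is not SPARK's scheduler: the function name `list_schedule`, the single-cycle operations, the use of dictionary order as priority, and the example operations are all assumptions made for illustration. Only the resource types "+" and "<" come from the slide.

```python
def list_schedule(ops, deps, resources):
    """Assign each operation to a control step under resource limits.

    ops       -- dict: operation name -> resource type, e.g. "+" or "<";
                 dictionary order is used as the priority order (assumption)
    deps      -- dict: operation name -> set of predecessor names
    resources -- dict: resource type -> number of units available per step
    All operations are assumed to take exactly one control step.
    """
    for name, kind in ops.items():
        if resources.get(kind, 0) < 1:
            raise ValueError(f"no resource of type {kind!r} for {name}")
    schedule = {}
    step = 0
    while len(schedule) < len(ops):
        step += 1
        free = dict(resources)      # every unit is free at the start of a step
        progressed = False
        for name, kind in ops.items():
            if name in schedule:
                continue
            # Ready only if every predecessor finished in an earlier step.
            ready = all(schedule.get(p, step) < step for p in deps.get(name, ()))
            if ready and free[kind] > 0:
                schedule[name] = step
                free[kind] -= 1
                progressed = True
        if not progressed:
            raise ValueError("dependency cycle or unknown predecessor")
    return schedule


# Hypothetical data-flow graph: three additions and one comparison.
example_ops = {"a1": "+", "a2": "+", "c1": "<", "a3": "+"}
example_deps = {"c1": {"a1", "a2"}, "a3": {"a1"}}

print(list_schedule(example_ops, example_deps, {"+": 1, "<": 1}))
print(list_schedule(example_ops, example_deps, {"+": 2, "<": 1}))
```

With one adder the two independent additions are serialized and the schedule takes three steps; with two adders it takes two. The resource allocation therefore bounds how much of the available parallelism the scheduler can use.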

4 Extracting Parallelism with Speculation
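
The figure for this slide did not survive extraction. As a minimal sketch of the idea, assuming a hypothetical behavioral description (the function and variable names below are not from the slides), speculation hoists a side-effect-free operation out of a conditional so that it no longer has to wait for the condition to be evaluated.

```python
def without_speculation(a, b, c, d):
    cond = a < b          # comparator
    if cond:
        x = c + d         # adder: must wait until cond is known
    else:
        x = c
    return x


def with_speculation(a, b, c, d):
    cond = a < b          # comparator
    t = c + d             # adder: independent of cond, can share its control step
    x = t if cond else c  # commit the speculated result or discard it
    return x


print(without_speculation(1, 2, 3, 4), with_speculation(1, 2, 3, 4))
print(without_speculation(2, 1, 3, 4), with_speculation(2, 1, 3, 4))
```

Both versions compute the same result. In the second, the addition and the comparison have no dependence on each other, so a scheduler with an idle adder may place them in the same control step, at the cost of computing a value that is thrown away when the condition is false.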

5 Reverse Speculation
- Moves operations into conditionals
- Only moves to branches which require result
- Moves operations with lower priority
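
A minimal sketch of reverse speculation, using a hypothetical example that is not taken from the slides: an addition that sits above a conditional, but whose result is read on only one branch, is moved down into that branch. Each function also returns how many additions it executed, so the effect is visible.

```python
def before_reverse_speculation(a, b, c, d):
    adds = 0
    t = c + d             # computed on every path
    adds += 1
    if a < b:
        y = t + 1         # the only reader of t
        adds += 1
    else:
        y = a
    return y, adds


def after_reverse_speculation(a, b, c, d):
    adds = 0
    if a < b:
        t = c + d         # moved into the branch that requires it
        adds += 1
        y = t + 1
        adds += 1
    else:
        y = a             # no addition is executed on this path
    return y, adds


print(before_reverse_speculation(2, 1, 3, 4), after_reverse_speculation(2, 1, 3, 4))
```

The results are identical, but after the move the else path no longer occupies an adder. This is presumably why the slide restricts the move to lower-priority operations: the resource freed ahead of the conditional becomes available to more critical operations.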

6 Early Condition Execution
- Evaluates conditions ASAP
- Moves all unscheduled operations into conditionals
- Uses reverse speculation to achieve this
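
One plausible reading of these bullets can be sketched as a pass over the operations that are still unscheduled above a conditional. This is an illustration under stated assumptions, not the SPARK implementation: the function name, the argument layout, and the rule that an operation with no recorded branch user stays in place are all hypothetical, and `uses` is assumed to already include transitive users.

```python
def early_condition_execution(pending, cond_inputs, uses):
    """Split unscheduled operations around a conditional.

    pending     -- operation names in program order, all above the conditional
    cond_inputs -- operations the condition depends on (they must stay above)
    uses        -- dict: operation name -> set of branch labels ("then",
                   "else") that read its result, directly or transitively
    Returns (operations kept above, dict: branch label -> moved operations).
    """
    above = []
    branches = {"then": [], "else": []}
    for op in pending:
        users = uses.get(op, set())
        if op in cond_inputs or not users:
            above.append(op)           # needed by the condition, or no known user
            continue
        for label in sorted(users):    # reverse speculation into each user branch
            branches[label].append(op)
    return above, branches


# Hypothetical block: t0 feeds the condition, t3 is read on both branches.
above, branches = early_condition_execution(
    pending=["t0", "t1", "t2", "t3"],
    cond_inputs={"t0"},
    uses={"t1": {"then"}, "t2": {"else"}, "t3": {"then", "else"}},
)
print(above)
print(branches)
```

Only the operation feeding the condition remains ahead of it, so the condition can be evaluated as soon as possible. An operation needed on both branches is duplicated, which is one way such a transformation can trade area for a shorter schedule.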

7 SPARK Synthesis Framework
- Experiments performed using two benchmarks:
- ADPCM Encode and MPEG-1 Prediction Block

8 Effects of Speculative Code Motions on Example Designs

9 Synthesis Results using Synopsys Design Compiler
- Considerable reduction in total delay of circuit
- Critical path length remains fairly constant
  - Increasing steering logic
  - Decreasing size of controller
- Area increases due to increased steering logic and registers
  - Can be reduced by resource binding [ISSS 01]

10 Conclusions
- Comparative study of various code motions
  - Evaluation based on
    - Performance and size of controller
    - Synthesis results
  - Reduced schedule lengths consistently obtained by
    - Across hierarchical conditional block code motions
    - Speculation and Early Condition Execution
- Average of
  - 35% improvement in size of controller and performance of design
  - % reduction in total delay of circuit

11 Recent Related Work
- Code motions in the presence of conditionals
  - Condition Vector List Scheduling [Wakabayashi 89]
    - Condition vectors to improve resource sharing among mutually exclusive operations
  - Symbolic Scheduling [Radivojevic 96]
    - Exact symbolic formulation which explores all possible solutions
  - WaveSched Scheduler [Lakshminarayana 98]
    - Minimizes expected number of cycles by speculation
  - Basic Block Control Graph Scheduling [Santos 99]
    - Supports generalized code motions