Architecture and Synthesis for Multi-Cycle Communication


SOC Group, VLSI CAD Lab, led by Jason Cong
Yiping Fan, Guoling Han, Xun Yang, Zhiru Zhang

Motivation
- What is happening now: interconnect delays dominate timing in deep-submicron (DSM) technologies.
- What is about to happen: single-cycle full-chip synchronization is no longer possible.

[Figure: chip with distance marks at 7.52, 15.04, 22.56 and 24.9 (mm); regions labelled 1 clock through 7 clock.]

Our Approach
- Regular Distributed Register (RDR) micro-architecture
  - Highly regular
  - Direct support for multi-cycle on-chip communication
- MCAS: Architectural Synthesis for Multi-cycle Communication
  - Integrates architectural synthesis (e.g. binding, scheduling) with physical planning
  - Targets the RDR architecture

MCAS vs. Conventional Flow
On average, MCAS achieves a 31% clock period reduction and a 24% total latency reduction, with 18% resource overhead and an 11% increase in clock cycles.

MCAS vs. Synopsys Behavioral Compiler
On average, MCAS achieves a 21% clock period reduction and a 29% total latency reduction, without area overhead.

[Figure: MCAS (Multi-Cycle Architectural Synthesis) flow. Labels: C program; CDFG generation; CDFG; Resource allocation & Functional unit binding; ICG; Scheduling-driven placement; Locations; Placement-driven rescheduling & rebinding; Register and port binding; Datapath & FSM generation; Floorplan constraints; Multi-cycle path constraints; RTL VHDL.]

[Figure: RDR architecture. Labels: Global Interconnect; Island; Local Computational Cluster (LCC); Cluster with area constraint; Register file; FSM; ALU; MUL; MUX; Wi; Hi; 1 cycle, 2 cycles, K cycles.]

[Figure: Control Data Flow Graph (CDFG) with operations 1 to 12 of types -, + and *; Interconnected Component Graph (ICG) with nodes Mul1, Mul2, Alu1, Alu2.]

[Figure: RDR Placement. Alu1: 1, 5, 10; Alu2: 2, 6, 9; Mul1: 4, 8, 12; Mul2: 3, 7, 11; Reg. file.]

MCAS System

Scheduling-driven placement
- Integrates list-scheduling with an SA-based global placement to minimize the total latency.
- Employs a net weighting technique to shorten the critical global connections.

Placement-driven rescheduling & rebinding
- Integrates force-directed list-scheduling with simultaneous rescheduling & rebinding to further minimize the latency.

RDR Architecture
- Distribute registers to each “island”.
- Choose the island size such that local computation and communication in each island can be done in a single cycle:

  $D_{intra-island} = D_{logic} + D_{opt-int} \le D_{logic} + 2 D_{opt-int}(W_i + H_i) \le T$
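
The island-size condition can be illustrated with a small calculation. The sketch below reads the condition as D_logic + 2 * D_opt-int(Wi + Hi) <= T and assumes, purely for illustration, that the delay of an optimally buffered wire grows linearly with its length. The function names, the `delay_per_mm` parameter, and all numeric values are hypothetical and do not come from the poster.

```python
def opt_int_delay(length_mm, delay_per_mm):
    """Delay (ns) of an optimally buffered wire of the given length.

    Assumption: a linear delay model with a hypothetical per-mm coefficient.
    """
    return delay_per_mm * length_mm


def island_fits_single_cycle(w_mm, h_mm, d_logic_ns, delay_per_mm, clock_ns):
    """Check D_logic + 2 * D_opt-int(W_i + H_i) <= T for one island."""
    d_wire = 2 * opt_int_delay(w_mm + h_mm, delay_per_mm)
    return d_logic_ns + d_wire <= clock_ns


def max_half_perimeter(d_logic_ns, delay_per_mm, clock_ns):
    """Largest W_i + H_i (mm) that still meets the single-cycle condition
    under the assumed linear wire-delay model."""
    slack = clock_ns - d_logic_ns
    if slack <= 0:
        return 0.0  # logic alone already uses the whole clock period
    return slack / (2 * delay_per_mm)


if __name__ == "__main__":
    # Hypothetical numbers: 1.0 ns of logic, 0.125 ns/mm wire, 2.0 ns clock.
    print(max_half_perimeter(1.0, 0.125, 2.0))
    print(island_fits_single_cycle(1.5, 2.5, 1.0, 0.125, 2.0))
    print(island_fits_single_cycle(2.0, 3.0, 1.0, 0.125, 2.0))
```

Under these assumed numbers, an island whose width plus height stays within the computed bound can complete local computation and communication in one cycle, while a larger island would need either a longer clock period or multi-cycle communication.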