SEMI-SYNTHETIC CIRCUIT GENERATION FOR TESTING INCREMENTAL PLACE AND ROUTE TOOLS
David Grant, Guy Lemieux
University of British Columbia, Vancouver, BC

Presentation transcript:


Overview
-> Introduction
- Circuit Generation Method 1 (FPL): doesn’t work well
- Circuit Generation Method 2 (FPT): simple, works very well
- Circuit Scaling Extension
- Conclusions

Introduction
- Problem
  - (Free) Large digital circuits are rare
  - FPGA CAD tools need large circuits
- Solution
  - Generate random circuits
  - But is random realistic?
- Goal
  - Generate realistic random circuits (synthetic)
  - Useful for testing place and route tools

Introduction
- Past approaches
  - Generate a complete synthetic circuit
  - Tools: ccirc+cgen, gnl
- New problem
  - Real-world development is iterative, incremental
  - FPGA tools commonly used in incremental mode
  - Need circuits with incremental changes

Introduction
- Our Approach
  - Start with a real circuit
  - Generate a small synthetic change
  - Result is a semi-synthetic circuit
- Four steps to generate a semi-synthetic circuit
  1. Identify
  2. Remove
  3. Generate
  4. Stitch (difficult?)

Introduction
- Problem: Blindly stitching a circuit can create combinational loops
- Cannot arbitrarily insert new latches to break loops
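To make the loop problem concrete, here is a minimal sketch of a combinational-loop check. It assumes the circuit is held as a dictionary from each node to its fanout list, plus a set of latch nodes; both the representation and the name `has_combinational_loop` are illustrative and do not come from the presentation.

```python
def has_combinational_loop(fanout, latches):
    """Return True if some cycle in the circuit passes through no latch.

    fanout  : dict mapping node -> list of nodes it drives (assumed format)
    latches : set of nodes that are registers; they break combinational paths
    """
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited / on current path / finished
    colour = {n: WHITE for n in fanout}

    def visit(node):
        colour[node] = GREY
        for succ in fanout.get(node, []):
            if succ in latches:
                continue                   # a latch ends the combinational path
            state = colour.get(succ, WHITE)
            if state == GREY:
                return True                # reached a node already on the path
            if state == WHITE and visit(succ):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and n not in latches and visit(n)
               for n in fanout)


# Three gates wired in a ring form a loop unless one of them is a latch.
ring = {"a": ["b"], "b": ["c"], "c": ["a"]}
print(has_combinational_loop(ring, set()))
print(has_combinational_loop(ring, {"c"}))
```

A stitching step that reconnects sub-circuit outputs to host inputs without such a check could easily produce the first case.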

Overview
- Introduction
-> Circuit Generation Method 1 (FPL)
- Circuit Generation Method 2 (FPT)
- Circuit Scaling
- Conclusions

Method 1: Introduction
- Four steps to generate a semi-synthetic circuit
  1. Identify sub-circuit (T-VPack)
  2. Remove sub-circuit
  3. Generate clone (ccirc+gnl, ccirc+cgen)
  4. Stitch
- Steps 1, 2, 3 are easy… Stitching is difficult!

Method 1: Loop Graph, L
- Ideal Stitching
  - Determine which outputs connect back to inputs through combinational logic
  - If synthetic circuit generators could use L, we could stop here
[Figure: Graph L]

Method 1: Stitching
- Real stitching takes 4 sub-steps…
  a) Generate a dependence graph, D
     - Calculated precisely from synthetic clone (subcircuit)
  b) Generate a permissible graph, P
     - Calculated imprecisely from loop graph of host circuit
  c) Solve the monomorphism problem
     - Find a monomorphism mapping D into P
  d) Stitch

Method 1: a) Dependence Graph, D
- D is computed from synthetic clone
- Captures all combinational paths through clone
[Figure: Graph D]
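One way to read "captures all combinational paths through clone" is as a set of (input, output) pairs. The sketch below computes such pairs by a depth-first search that refuses to pass through latches. The edge interpretation, the fanout-dictionary format, and the function name are assumptions made for illustration.

```python
def dependence_graph(fanout, inputs, outputs, latches=frozenset()):
    """Return {(input, output)} pairs joined by a purely combinational path.

    fanout  : dict node -> list of driven nodes inside the clone (assumed format)
    inputs  : iterable of clone input nodes
    outputs : set of clone output nodes
    latches : nodes that end a combinational path
    """
    deps = set()
    for src in inputs:
        stack, seen = [src], {src}
        while stack:
            node = stack.pop()
            if node in outputs and node != src:
                deps.add((src, node))
            for succ in fanout.get(node, []):
                if succ not in seen and succ not in latches:
                    seen.add(succ)
                    stack.append(succ)
    return deps


# i2 also feeds a flip-flop "ff", so o2 does not depend combinationally on i2.
clone = {"i1": ["g1"], "i2": ["g1", "ff"], "g1": ["o1"], "ff": ["o2"]}
print(sorted(dependence_graph(clone, ["i1", "i2"], {"o1", "o2"}, {"ff"})))
```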

Method 1: b) Permissible Graph, P
- P computed from L using heuristic
- Problem is NP-complete in general
  - Start with loop graph L, consisting of only back-edges
  - Add all forward edges to L, creating cycles; call this P
  - Find forward edge in cycle, remove it, repeat until P is cycle-free
  - Remove all back-edges from P
[Figures: Graph P, Graph L]
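The four heuristic steps can be sketched as below. The sketch assumes back-edges run from sub-circuit outputs to sub-circuit inputs and forward edges from inputs to outputs, and it removes the first forward edge found on each cycle. The slide does not say which edge the authors remove, so that choice and all function names are assumptions.

```python
def find_cycle(edges):
    """Return the edge list of one directed cycle, or None if acyclic."""
    succs = {}
    for u, v in sorted(edges):               # sorted only for repeatable output
        succs.setdefault(u, []).append(v)
    state, path = {}, []                     # state: "open" = on current DFS path

    def visit(u):
        state[u] = "open"
        path.append(u)
        for v in succs.get(u, []):
            if state.get(v) == "open":
                cyc = path[path.index(v):] + [v]
                return list(zip(cyc, cyc[1:]))
            if v not in state:
                found = visit(v)
                if found:
                    return found
        path.pop()
        state[u] = "done"
        return None

    for start in list(succs):
        if start not in state:
            found = visit(start)
            if found:
                return found
    return None


def permissible_graph(back_edges, inputs, outputs):
    """Heuristic P: forward edges that close no cycle with the back-edges."""
    back = set(back_edges)                                # loop graph L
    forward = {(i, o) for i in inputs for o in outputs}   # add all forward edges
    while True:
        cycle = find_cycle(back | forward)
        if cycle is None:
            return forward                                # back-edges dropped
        # Back-edges alone cannot form a cycle here, so a forward edge exists.
        forward.remove(next(e for e in cycle if e in forward))


# Output o1 feeds back to input i1, so the forward edge i1 -> o1 is forbidden.
print(sorted(permissible_graph([("o1", "i1")], ["i1", "i2"], ["o1", "o2"])))
```

Because the result depends on which forward edge is removed from each cycle, different removal orders can give different (and differently sized) permissible graphs, which is consistent with the slide describing P as calculated imprecisely.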

Method 1: c) Monomorphism
- Monomorphism is like isomorphism
  - Except that number of edges need not be identical
- Find 1:1 vertex mapping of D to P
  - D: forward combinational paths in synthetic clone
  - P: permissible forward combinational paths
[Figures: Graph D, Graph P]
Mapping: (1,2), (2,1), (3,3), (4,4), (5,5)
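As an illustration of the mapping problem, the sketch below searches for an injective vertex mapping under which every edge of D lands on an edge of P. It uses plain exhaustive backtracking, which is not the solver used in the presentation (that one is described as non-deterministic); the function name and argument layout are assumptions.

```python
def find_monomorphism(d_nodes, d_edges, p_nodes, p_edges):
    """Return dict D-node -> P-node with every D edge mapped onto a P edge,
    or None if no injective mapping exists. P may have extra edges."""
    p_edges = set(p_edges)
    d_nodes = list(d_nodes)
    mapping, used = {}, set()

    def consistent(u, cand):
        # Check every D edge touching u whose other endpoint is already mapped.
        for a, b in d_edges:
            if a == u and b == u and (cand, cand) not in p_edges:
                return False
            if a == u and b in mapping and (cand, mapping[b]) not in p_edges:
                return False
            if b == u and a in mapping and (mapping[a], cand) not in p_edges:
                return False
        return True

    def extend(k):
        if k == len(d_nodes):
            return True
        u = d_nodes[k]
        for cand in p_nodes:
            if cand in used or not consistent(u, cand):
                continue
            mapping[u] = cand
            used.add(cand)
            if extend(k + 1):
                return True
            del mapping[u]                 # backtrack
            used.discard(cand)
        return False

    return dict(mapping) if extend(0) else None


d_edges = [(1, 3), (2, 3)]
p_edges = [("a", "c"), ("b", "c"), ("a", "b")]
print(find_monomorphism([1, 2, 3], d_edges, ["a", "b", "c"], p_edges))
```

Exhaustive search grows factorially with the number of vertices in the worst case, which fits the long run times reported for larger sub-circuits.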

Method 1: d) Stitch
- Take the mapping solution and merge the clone into the hole left in the original circuit
- Mapping: (2,1), (1,2), (3,3), (4,4), (5,5)

Method 1: Discussion
- Good results for small cutout regions
- Unacceptably long run times for subcircuits > 50 I/Os
- Monomorphism solver is non-deterministic
- Preserved most key circuit characteristics
  - Logic depth increases by factor 2-3x …because method is unconstrained!
- Conclusion
  - Stitching is not a trivial problem to solve (in a cycle-free way)

Overview
- Introduction
- Circuit Generation Method 1 (FPL)
-> Circuit Generation Method 2 (FPT)
- Circuit Scaling
- Conclusions

Method 2: Introduction
- Four steps to generate a semi-synthetic circuit
  1. Identify sub-circuit (T-VPack)
  2. Remove sub-circuit
  3. Generate clone (perturber)
  4. Stitch
- The perturber does not “profile and generate”
  - Instead, it “perturbs” the existing circuit
- Stitching is trivialized using this method

Method 2: Perturbing a Circuit
- Perturber Algorithm
  a) Levelize the complete circuit
  b) Select a random edge in the sub-circuit (1 source and 1 sink)
  c) Select a second edge in sub-circuit under certain constraints *
  d) Swap the sinks
  e) Repeat

Method 2: Perturbing a Circuit
- Perturber Algorithm (cont'd)
  * Constraints guiding second edge selection
  1. Source and sink levels must match
  2. Source node cannot be sub-circuit input
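The perturber steps and the two constraints can be sketched as follows. This is a minimal reading of the slides, not the authors' code. It assumes the circuit is an edge list, that level-0 nodes are primary inputs and flop outputs, that constraint 2 applies to the sources of both edges, and (an extra guard not stated on the slides) that a swap is skipped if it would duplicate an existing edge. All names are hypothetical.

```python
import random


def levelize(nodes, edges, level0):
    """Level 0 for nodes in level0 (inputs, flops); else 1 + max fanin level."""
    fanin = {n: [] for n in nodes}
    for s, t in edges:
        fanin[t].append(s)
    level = {}

    def lv(n):
        if n not in level:
            if n in level0 or not fanin[n]:
                level[n] = 0
            else:
                level[n] = 1 + max(lv(s) for s in fanin[n])
        return level[n]

    for n in nodes:
        lv(n)
    return level


def perturb(edges, level, sub_nodes, sub_inputs, swaps, rng):
    """Swap the sinks of level-matched edge pairs inside the sub-circuit."""
    edges = list(edges)
    present = set(edges)
    # Eligible edges lie inside the sub-circuit and do not start at its inputs.
    idx = [k for k, (s, t) in enumerate(edges)
           if s in sub_nodes and t in sub_nodes and s not in sub_inputs]
    for _ in range(swaps):
        if len(idx) < 2:
            break
        a = rng.choice(idx)
        s1, t1 = edges[a]
        cands = [b for b in idx if b != a
                 and level[edges[b][0]] == level[s1]       # source levels match
                 and level[edges[b][1]] == level[t1]       # sink levels match
                 and (s1, edges[b][1]) not in present      # assumed guard:
                 and (edges[b][0], t1) not in present]     # no duplicate edges
        if not cands:
            continue
        b = rng.choice(cands)
        s2, t2 = edges[b]
        present -= {(s1, t1), (s2, t2)}
        edges[a], edges[b] = (s1, t2), (s2, t1)            # swap the sinks
        present |= {(s1, t2), (s2, t1)}
    return edges


nodes = ["a", "b", "g1", "g2", "h1", "h2", "k1", "k2"]
edges = [("a", "g1"), ("b", "g1"), ("a", "g2"), ("b", "g2"),
         ("g1", "h1"), ("g2", "h2"), ("h1", "k1"), ("h2", "k2")]
level = levelize(nodes, edges, {"a", "b"})
print(perturb(edges, level, set(nodes), {"a", "b"}, 10, random.Random(0)))
```

Each sink trades one fanin for another at the same level, so every node keeps its level, and per-node fanin and fanout counts are unchanged.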

Method 2: Advantages of Perturber Algorithm
- Key circuit characteristics are exactly preserved
  - Number of nodes and edges
  - Fanout distribution
  - Depth profile
- Strengths
  - No combinational loops are created
    - Because the levelization is preserved
  - Very fast execution time
  - Very simple approach

Method 2: Initial Results
- Good results for small cutouts
- Less-than-impressive results for large cutouts
- Cause
  - Locality is lost during perturbation
    - Independent buses are cross-connected
    - Regular features of the circuit are lost
    - Connections swapped across large regions of chip
  - Need locality control!

Method 2: Ancestor Control
- Method to preserve locality
- Add additional edge selection constraint
  1. Source and sink levels must match
  2. Source node may not be sub-circuit input
  3. Both edges must share a common ancestor through combinational logic within a certain ancestor depth
     - Stop at sub-circuit inputs and flops
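Constraint 3 could be checked along the lines below. The slide does not say whether ancestors are taken from the edge sources or the sinks; this sketch assumes the sources, counts a node as its own ancestor at depth 0, and uses hypothetical names throughout.

```python
def ancestors_within(node, fanin, depth, stops):
    """Nodes reachable from `node` in at most `depth` fanin steps.

    The walk does not continue past nodes in `stops`
    (sub-circuit inputs and flops). The start node is included.
    """
    seen = {node}
    frontier = [node]
    for _ in range(depth):
        nxt = []
        for n in frontier:
            if n in stops:
                continue                   # do not look behind inputs or flops
            for p in fanin.get(n, []):
                if p not in seen:
                    seen.add(p)
                    nxt.append(p)
        frontier = nxt
    return seen


def share_ancestor(edge_a, edge_b, fanin, depth, stops):
    """True if the sources of both edges have a common ancestor within depth."""
    return bool(ancestors_within(edge_a[0], fanin, depth, stops)
                & ancestors_within(edge_b[0], fanin, depth, stops))


fanin = {"g1": ["a"], "g2": ["a"], "g3": ["b"], "a": ["x"], "b": ["x"]}
stops = {"a", "b"}                          # treated as sub-circuit inputs
print(share_ancestor(("g1", "h1"), ("g2", "h2"), fanin, 1, stops))
print(share_ancestor(("g1", "h1"), ("g3", "h3"), fanin, 2, stops))
```

A smaller ancestor depth keeps swaps between closely related logic, while a larger one approaches the unconstrained perturber.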

Method 2: Testing
- Know all the key circuit characteristics are preserved
  - Focus comparisons on post-routing results
- Test 20 largest MCNC circuits
  - Metrics: channel width, delay, and wirelength
  - Goodness of result == closeness to original MCNC result
- Test 1: Sanity test, full-circuit compare w/ ccirc+cgen
- Test 2: Incremental semi-synthetic circuits

Method 2: Sanity Check Results
- Channel width of complete synthetic circuit
- Overall
  - cgen: 20% error
  - perturber: 14% error

Method 2: Sanity Check Results
- Delay of complete synthetic circuit
- Overall
  - cgen: 7% error
  - perturber: 9% error

Method 2: Sanity Check Results
- Wirelength for a complete synthetic circuit
- Overall
  - cgen: 24% error
  - perturber: 17% error

Method 2: Testing
- Test 2: Incremental Circuit Results
  - Generate semi-synthetic incremental circuit
  - Change only 5%, 10%, 20% of real circuit
  - No previous work to compare against

Method 2: Incremental Results
- Channel width for various cutouts
- Overall
  - 5%: 2.6% error
  - 10%: 3.2% error
  - 20%: 6.7% error

Method 2: Incremental Results
- Delay for various cutouts
- Overall
  - 5%: 5.5% error
  - 10%: 6.4% error
  - 20%: 8.7% error

Method 2: Incremental Results
- Wirelength for various cutouts
- Overall
  - 5%: 1.4% error
  - 10%: 2.2% error
  - 20%: 3.9% error

Method 2: Discussion
- Sanity Check
  - Complete-circuit error < ccirc+cgen error for channel width and wirelength (error reduced by ~1/3)
  - Delay error is slightly higher (9% vs. 7%)
- Incremental Circuits
  - Close to original (1%-6%) using 5%, 10% cutouts
  - More deviation (4%-9%) at 20% cutout
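The "~1/3" figure can be checked against the overall percentages quoted on the sanity-check slides. The short calculation below only restates those numbers; the dictionary layout and the helper name `relative_change` are illustrative.

```python
# Overall error percentages from the three sanity-check slides.
cgen = {"channel width": 20, "delay": 7, "wirelength": 24}
perturber = {"channel width": 14, "delay": 9, "wirelength": 17}


def relative_change(old, new):
    """Fractional change from old to new; negative means the error shrank."""
    return (new - old) / old


changes = {m: relative_change(cgen[m], perturber[m]) for m in cgen}
for metric, change in changes.items():
    print(f"{metric}: {change:+.0%}")
```

On these figures the perturber's error is roughly 30% lower for channel width and wirelength, while the delay error goes from 7% to 9%.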

Overview
- Introduction
- Circuit Generation Method 1 (FPL)
- Circuit Generation Method 2 (FPT)
-> Circuit Scaling
- Conclusions

Circuit Scaling
- Scaling
  - Reduce the size of the cutout region so the tools have to fill in holes
  - Increase the size of the cutout region so the tools have to make room for larger circuit
- Approach
  - Scale a circuit (mutator), then perturb it

Circuit Scaling
- Reduction
  - Shotgun approach: delete nodes at random
    - Sometimes need to delete chains of logic
- Enlargement
  - Duplicate circuit in parallel
  - Multiplex inputs and outputs with LUTs
  - Doubles (…, triples, etc.) circuit
- Enlargement+reduction to achieve arbitrary scaling
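The reduction step might look like the sketch below. It assumes that "chains of logic" means nodes left with no fanin or no fanout after a random deletion, and that cutout I/O nodes are protected from removal; neither detail is stated on the slide, and the function name is hypothetical.

```python
import random


def shotgun_reduce(nodes, edges, protected, n_delete, rng):
    """Delete n_delete random nodes, then prune logic left dangling.

    protected : cutout I/O nodes that must survive (assumed requirement)
    Returns the surviving (nodes, edges) as sets.
    """
    nodes, edges, protected = set(nodes), set(edges), set(protected)
    pool = sorted(nodes - protected)
    victims = set(rng.sample(pool, min(n_delete, len(pool))))
    nodes -= victims
    edges = {(s, t) for s, t in edges if s in nodes and t in nodes}

    # Assumed meaning of "delete chains of logic": repeatedly remove internal
    # nodes that no longer drive anything or are no longer driven.
    while True:
        drives = {s for s, _ in edges}
        driven = {t for _, t in edges}
        dead = {n for n in nodes - protected
                if n not in drives or n not in driven}
        if not dead:
            return nodes, edges
        nodes -= dead
        edges = {(s, t) for s, t in edges if s in nodes and t in nodes}


# In a single chain, deleting any one gate strands the rest of the chain.
chain_nodes = {"in", "a", "b", "c", "out"}
chain_edges = {("in", "a"), ("a", "b"), ("b", "c"), ("c", "out")}
print(shotgun_reduce(chain_nodes, chain_edges, {"in", "out"}, 1, random.Random(1)))
```

Because pruning can remove more nodes than requested, hitting an exact target size would need a retry or top-up step, which may be one reason the slides combine enlargement with reduction.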

Circuit Scaling: Results
- Test 5%, 10%, 20% cutout
- Scale cutout size to 50%, 75%, 200%, 400% original
- No unexpected changes in post-routing characteristics from MCNC original
- Some expected changes
  - E.g., wirelength increased as cutout size enlarged

Conclusions
- Have shown how to create a semi-synthetic circuit
- Method 2 is a superior method over past approaches
  - Runtime, simplicity
  - Preservation of key circuit properties
    - Preserve them first, find ways to alter them later…
  - Quality of result
- Scaling is able to change the circuit size without changing the post-routing characteristics in unexpected ways

Future Work
- Critical Path
  - Lengthen the critical path to force the tools to shuffle nodes along the critical path
- Mutate other circuit properties?
  - Node depth profile
  - Fanout distribution
  - Fanin distribution
  - Wire lengths

It's Over
- Questions? Comments? Concerns?