Incremental Placement Algorithm for Field Programmable Gate Arrays David Leong Advisor: Guy Lemieux University of British Columbia Department of Electrical.

Slides:



Advertisements
Similar presentations
FPGA-Based System Design: Chapter 4 Copyright  2004 Prentice Hall PTR Topics n Logic synthesis. n Placement and routing.
Advertisements

BSPlace: A BLE Swapping technique for placement Minsik Hong George Hwang Hemayamini Kurra Minjun Seo 1.
Floating-Point FPGA (FPFPGA) Architecture and Modeling (A paper review) Jason Luu ECE University of Toronto Oct 27, 2009.
1 Optimization of Routing Algorithms Summer Science Research By: Kumar Chheda Research Mentor: Dr. Sean McCulloch.
Congestion Driven Placement for VLSI Standard Cell Design Shawki Areibi and Zhen Yang School of Engineering, University of Guelph, Ontario, Canada December.
EECE579: Digital Design Flows
Clustering of Large Designs for Channel-Width Constrained FPGAs Marvin TomGuy Lemieux University of British Columbia Department of Electrical and Computer.
Zheming CSCE715.  A wireless sensor network (WSN) ◦ Spatially distributed sensors to monitor physical or environmental conditions, and to cooperatively.
Placement 1 Outline Goal What is Placement? Why Placement?
SCOTT MILLER, AMBROSE CHU, MIHAI SIMA, MICHAEL MCGUIRE ReCoEng Lab DEPARTMENT OF ELECTRICAL AND COMPUTER ENGINEERING UNIVERSITY OF.
Evolution of implementation technologies
Fast Force-Directed/Simulated Evolution Hybrid for Multiobjective VLSI Cell Placement Junaid Asim Khan Dept. of Elect. & Comp. Engineering, The University.
EDA (CS286.5b) Day 7 Placement (Simulated Annealing) Assignment #1 due Friday.
Architecture and Routing for NoC-based FPGA Israel Cidon* *joint work with Roman Gindin and Idit Keidar.
Merging Synthesis With Layout For Soc Design -- Research Status Jinian Bian and Hongxi Xue Dept. Of Computer Science and Technology, Tsinghua University,
The Memory/Logic Interface in FPGA’s with Large Embedded Memory Arrays The Memory/Logic Interface in FPGA’s with Large Embedded Memory Arrays Steven J.
Implementation of DSP Algorithm on SoC. Mid-Semester Presentation Student : Einat Tevel Supervisor : Isaschar Walter Accompaning engineer : Emilia Burlak.
Introduction to FPGA’s FPGA (Field Programmable Gate Array) –ASIC chips provide the highest performance, but can only perform the function they were designed.
ECE 506 Reconfigurable Computing Lecture 7 FPGA Placement.
ECE 506 Reconfigurable Computing Lecture 8 FPGA Placement.
Dr. Konstantinos Tatas ACOE201 – Computer Architecture I – Laboratory Exercises Background and Introduction.
Register-Transfer (RT) Synthesis Greg Stitt ECE Department University of Florida.
Yehdhih Ould Mohammed Moctar1 Nithin George2 Hadi Parandeh-Afshar2
High-Quality, Deterministic Parallel Placement for FPGAs on Commodity Hardware Adrian Ludwin, Vaughn Betz & Ketan Padalia FPGA Seminar Presentation Nov.
Placement by Simulated Annealing. Simulated Annealing  Simulates annealing process for placement  Initial placement −Random positions  Perturb by block.
Power Reduction for FPGA using Multiple Vdd/Vth
Titan: Large and Complex Benchmarks in Academic CAD
Scalable and Deterministic Timing-Driven Parallel Placement for FPGAs Supervisor: Dr. Guy Lemieux October 20, 2011 Chris Wang.
LOPASS: A Low Power Architectural Synthesis for FPGAs with Interconnect Estimation and Optimization Harikrishnan K.C. University of Massachusetts Amherst.
Un/DoPack: Re-Clustering of Large System-on-Chip Designs with Interconnect Variation for Low-Cost FPGAs Marvin Tom* Xilinx Inc.
Channel Width Reduction Techniques for System-on-Chip Circuits in Field-Programmable Gate Arrays Marvin Tom University of British Columbia Department of.
FPGA Run-time Reconfigurable Placement Presentation by Brian Leonard Clemson University 2003 SURE REU Program Advisor: Ron Sass.
Enhancing FPGA Performance for Arithmetic Circuits Philip Brisk 1 Ajay K. Verma 1 Paolo Ienne 1 Hadi Parandeh-Afshar 1,2 1 2 University of Tehran Department.
1 Rapid Estimation of Power Consumption for Hybrid FPGAs Chun Hok Ho 1, Philip Leong 2, Wayne Luk 1, Steve Wilton 3 1 Department of Computing, Imperial.
1 Parallelizing FPGA Placement with TMSteffan Parallelizing FPGA Placement with Transactional Memory Steven Birk*, Greg Steffan**, and Jason Anderson**
VLSI Physical Design: From Graph Partitioning to Timing Closure Chapter 5: Global Routing © KLMH Lienig 1 EECS 527 Paper Presentation High-Performance.
University of British Columbia Dept. of Electrical and Computer Engineering November 30, 2007 A Combined Clustering and Placement Algorithm for FPGAs Mark.
Regularity-Constrained Floorplanning for Multi-Core Processors Xi Chen and Jiang Hu (Department of ECE Texas A&M University), Ning Xu (College of CST Wuhan.
Design Space Exploration for Application Specific FPGAs in System-on-a-Chip Designs Mark Hammerquist, Roman Lysecky Department of Electrical and Computer.
Impact of Interconnect Architecture on VPSAs (Via-Programmed Structured ASICs) Usman Ahmed Guy Lemieux Steve Wilton System-on-Chip Lab University of British.
Han Liu Supervisor: Seok-Bum Ko Electrical & Computer Engineering Department 2010-Feb-2.
Congestion Estimation and Localization in FPGAs: A Visual Tool for Interconnect Prediction David Yeager Darius Chiu Guy Lemieux The University of British.
Lecture 13: Logic Emulation October 25, 2004 ECE 697F Reconfigurable Computing Lecture 13 Logic Emulation.
1 A Min-Cost Flow Based Detailed Router for FPGAs Seokjin Lee *, Yongseok Cheon *, D. F. Wong + * The University of Texas at Austin + University of Illinois.
Timing-Driven Routing for FPGAs Based on Lagrangian Relaxation
Parallel Routing for FPGAs based on the operator formulation
FPGA CAD 10-MAR-2003.
Optimality FPGA Technology Mapping: A Study of Optimality Andrew C. Ling M.A.Sc. Candidate University of Toronto Deshanand P. Singh Ph.D. Altera Corporation.
In-Place Decomposition for Robustness in FPGA Ju-Yueh Lee, Zhe Feng, and Lei He Electrical Engineering Dept., UCLA Presented by Ju-Yueh Lee Address comments.
ECE 565 VLSI Chip Design Styles Shantanu Dutt ECE Dept. UIC.
1 Field-programmable Gate Array Architectures and Algorithms Optimized for Implementing Datapath Circuits Andy Gean Ye University of Toronto.
FPGA Logic Cluster Design Dr. Philip Brisk Department of Computer Science and Engineering University of California, Riverside CS 223.
© PSU Variation Aware Placement in FPGAs Suresh Srinivasan and Vijaykrishnan Narayanan Pennsylvania State University, University Park.
Interconnect Driver Design for Long Wires in FPGAs Edmund Lee University of British Columbia Electrical & Computer Engineering MASc Thesis Presentation.
SEMI-SYNTHETIC CIRCUIT GENERATION FOR TESTING INCREMENTAL PLACE AND ROUTE TOOLS David GrantGuy Lemieux University of British Columbia Vancouver, BC.
Register-Transfer (RT) Synthesis Greg Stitt ECE Department University of Florida.
Congestion-Driven Re-Clustering for Low-cost FPGAs MASc Examination Darius Chiu Supervisor: Dr. Guy Lemieux University of British Columbia Department of.
ECE 506 Reconfigurable Computing Lecture 5 Logic Block Architecture Ali Akoglu.
EEL 5722 FPGA Design Fall 2003 Digit-Serial DSP Functions Part I.
Interconnect Driver Design for Long Wires in FPGAs Edmund Lee, Guy Lemieux & Shahriar Mirabbasi University of British Columbia, Canada Electrical & Computer.
Placement and Routing Algorithms. 2 FPGA Placement & Routing.
1 Architecture of Datapath- oriented Coarse-grain Logic and Routing for FPGAs Andy Ye, Jonathan Rose, David Lewis Department of Electrical and Computer.
Runtime-Quality Tradeoff in Partitioning Based Multithreaded Packing
Partial Reconfigurable Designs
Placement study at ESA Filomena Decuzzi David Merodio Codinachs
Floating-Point FPGA (FPFPGA)
HeAP: Heterogeneous Analytical Placement for FPGAs
Incremental Placement Algorithm for Field Programmable Gate Arrays
Topics Logic synthesis. Placement and routing..
EDA Lab., Tsinghua University
Presentation transcript:

Incremental Placement Algorithm for Field Programmable Gate Arrays David Leong Advisor: Guy Lemieux University of British Columbia Department of Electrical and Computer Engineering Vancouver, BC, Canada © Copyright, David Leong, 2006

2 Contributions Incremental Placement Algorithm –iPlace –Based on Placement Locality and Floor-planning –Multi-Region Algorithm Incremental Benchmark Circuits –Synthetic Benchmark Set –Physical Re-Synthesis Benchmark Set Findings –50-260x speed up for Single-Region Benchmarks –50-70x speed up for Multi-Region Benchmarks Publications: –“Un/DoPack: Re-clustering of SOC Circuits …”, ICCAD 2006 –“iPlace, Incremental Placement for FPGAs”, Submitted to FPGA 2007

3 Motivation Runtime for placement is increasing with increasing FPGA sizes –6 hours for 50,000 LUT circuit –Time == Engineering Cost System-on-Chip circuits are made of many subcomponents What if one part is modified? –Component reuse and hierarchy –Multiple regions of the circuit need incremental placement in order to support reuse –Floor-planning for each circuit Physical Re-Synthesis Flows

4 Incremental Placement Challenges Example: –C4,C5,D4,D5 is a modified sub-circuit –Can only fit <=4 CLBs in the previous location –Free space is far away! –How do we fit >4 CLBs quickly?

5 iPlace Formulation Placement Locality –Modified Sub-Circuits should be “close” to previous location –Floorplans Expanded “Virtual” Placement Grid –Literally thinking outside of the box! Efficient Shifting Algorithm –No CPU Intensive LE/CLB swapping and cost evaluation

6 iPlace Algorithm Four Steps to Incremental Placement 1.Previous Placement and Floor-planning 2.Expanded “SuperGrid” Placement 3.Compaction Re-legalization 4.Simulated Annealing Refinement

7 Previous Placement and Floor-planning

8 Expansion Phase (SuperGrid)

9 Compaction Phase

10 Compaction Phase

11 Intermediate Solution

12 Simulated Annealing Refinement Retuned VPR Simulated Annealing Algorithm –Lower Initial Temperature (44% temp) –Smaller Range Window –Lower Temperature degradation factor (Alpha) –Variable number of swaps per temperature range, 1-3x number of CLBs

13 Benchmarking Three Benchmark Sets Single Region: –Synthetic Flow Simulates a design change –Physical Re-Synthesis flow Circuit unchanged, clustering or tech-mapping changed for optimizations Multi Region: –Physical Re-Synthesis flow Scaled to select multiple regions

14 Synthetic Benchmark Set (Syn) Developed by Dave Grant Select a rectangular region on the placement grid –Replace it with a synthetic clone –Clone can be same size, smaller, or larger (double size) Selection region of 2.5%, 5%, 10% of array size Selected Area

15 Syn: Run-Time Results -MISEX3 runtimes were too fast to measure.

16 Syn: Channel Width Results

17 Syn: Critical Path Results

18 Physical Re-Synthesis Benchmark Set (PR) Un/DoPack developed by Marvin, Guy and myself Iterative Congestion Reduction Algorithm –Select congested region –Spread out LUTs by adding whitespace to each CLB Single Congestion “Region” re-synthesized by depopulation Region selected to be 2.5%, 5%, 10%, 15% of array size

19 PR: Run-Time Results

20 PR: Channel Width Results

21 PR: Critical Path Results

22 Multi-Region Physical Resynthesis Benchmark (MR) Un/DoPack flow Multiple Regions Select all regions needed to reduce CW by 10%, 20%, 30%, 40%, 50% 1/3 to 2/3 of the entire FPGA re-synthesized

23 MR: Run-Time Results

24 MR: Channel Width Results

25 MR: Critical Path Results

26 Contributions Incremental Placement Algorithm –iPlace –Based on Placement Locality and Floor-planning –Multi-Region Algorithm Incremental Benchmark Circuits –Synthetic Benchmark Set –Physical Re-Synthesis Benchmark Set Findings –50-260x speed up for Single-Region Benchmarks –50-70x speed up for Multi-Region Benchmarks Publications: –“Un/DoPack: Re-clustering of SOC Circuits …”, ICCAD 2006 –“iPlace, Incremental Placement for FPGAs”, Submitted to FPGA 2007

27 Future Works Support for Macro Blocks Support for Carry Chains More Intelligent Shifting –Still keep it simple Integration with Commercial flows –EG: QUIP

End of Presentation, Thank you!!

Questions?