ECE 506 Reconfigurable Computing Lecture 8 FPGA Placement.



Timing-driven Placement
Why should placement take timing into account?
- Placement sets the constraints for the router
- A timing-driven router's performance is limited by the quality of the placement
- For more speed, placement should be timing-driven
Operation principle
- Map blocks that are on the critical path onto physical locations that are close together
  - Minimize the amount of interconnect that critical signals must traverse

Timing-Driven Placement Expectation
° High quality placement
° Reasonable execution time
° Little sacrifice in routability

Timing-driven Placement
Timing-driven only placement
- Increases demand on routing resources
Wireability-driven only placement
- Slower circuit
Take both wire length and critical path into account
- Problem: modeling delay
  - Critical path changes as we move blocks
  - Most accurate delay model
    » Route each placement
    » Extract the delay of each connection
    » Execution time is a major problem

Timing-Driven Placement – Delay Modeling
° Delay profile
Homogeneous FPGA: exploit uniformity
- Compute delay as a function of distance (∆x, ∆y)
- Use the VPR router to determine the delay between blocks
- Compute a delay lookup matrix for every possible (∆x, ∆y)
Router is timing-driven
- Takes advantage of the architecture features
  - Segment length
  - Uses long wires for blocks on far ends of the FPGA
Assumption: the router will probably find the minimum-delay path (a leap of faith!)
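The lookup idea can be sketched as follows. This is a minimal illustration, not VPR's implementation: `toy_route_delay` is a hypothetical stand-in for the delay the router would actually measure, and the 4 x 4 array size and delay constants are made up.

```python
from typing import Callable, List, Tuple

def build_delay_matrix(width: int, height: int,
                       route_delay: Callable[[int, int], float]) -> List[List[float]]:
    """Precompute one delay per (dx, dy) offset on a width x height array."""
    return [[route_delay(dx, dy) for dy in range(height)] for dx in range(width)]

def connection_delay(matrix: List[List[float]],
                     src: Tuple[int, int], dst: Tuple[int, int]) -> float:
    """Look up the delay of a connection from the block offsets alone."""
    dx = abs(src[0] - dst[0])
    dy = abs(src[1] - dst[1])
    return matrix[dx][dy]

def toy_route_delay(dx: int, dy: int) -> float:
    # Hypothetical delay model (assumption): fixed 0.25 plus 0.5 per unit of distance.
    # In the flow described on the slide, this value would come from the router.
    return 0.25 + 0.5 * (dx + dy)

DELAY = build_delay_matrix(4, 4, toy_route_delay)
```

Because the array is assumed uniform, the matrix is built once before placement and every later delay query is a constant-time table lookup.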

Determining Criticality
Same basic approach as used for clustering criticality
For each connection (i, j) from source i to sink j
- Determine arrival times (pre-order BFS)
- Determine required arrival times (post-order BFS)
- Determine slack = required_arrival_time - arrival_time
- Criticality(i, j) = 1 - slack(i, j) / (Max slack)
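The steps above can be sketched on a small timing graph. This is an illustrative sketch under stated assumptions: the graph, node names, and delays are invented, the per-connection slack is taken as required(j) - arrival(i) - delay(i, j), and criticality is read as 1 - slack / max slack.

```python
from collections import defaultdict, deque

def connection_criticality(edges):
    """edges maps (source, sink) -> connection delay. Returns (slack, criticality)."""
    succ = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for (i, j) in edges:
        succ[i].append(j)
        indegree[j] += 1
        nodes.update((i, j))

    # Topological order so every node is visited after all of its fan-in.
    remaining = {n: indegree[n] for n in nodes}
    queue = deque(sorted(n for n in nodes if remaining[n] == 0))
    order = []
    while queue:
        n = queue.popleft()
        order.append(n)
        for m in succ[n]:
            remaining[m] -= 1
            if remaining[m] == 0:
                queue.append(m)

    # Forward pass: arrival times.
    arrival = {n: 0.0 for n in nodes}
    for n in order:
        for m in succ[n]:
            arrival[m] = max(arrival[m], arrival[n] + edges[(n, m)])

    # Backward pass: required arrival times, anchored at the critical path delay.
    d_max = max(arrival.values())
    required = {n: d_max for n in nodes}
    for n in reversed(order):
        for m in succ[n]:
            required[n] = min(required[n], required[m] - edges[(n, m)])

    slack = {(i, j): required[j] - arrival[i] - d for (i, j), d in edges.items()}
    max_slack = max(slack.values())
    if max_slack == 0:  # every connection is critical
        return slack, {c: 1.0 for c in slack}
    return slack, {c: 1.0 - s / max_slack for c, s in slack.items()}

# Hypothetical example graph: a -> b -> d is the critical path.
EDGES = {("a", "b"): 2, ("b", "d"): 3, ("a", "c"): 1, ("c", "d"): 1, ("a", "d"): 4}
SLACK, CRIT = connection_criticality(EDGES)
```

Connections on the critical path get criticality 1, and the connection with the largest slack gets criticality 0.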

TVPLACE

Cost Function
What is the purpose of the criticality exponent?
° Heavily weight connections that are critical, while giving less weight to connections that are non-critical
Equation annotations: the delay term comes from the lookup table matrix; criticality lies in the range [0, 1]
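The equation on this slide did not survive in the transcript. The sketch below assumes the commonly published form, Timing_Cost(i, j) = Delay(i, j) * Criticality(i, j) ^ Criticality_Exponent, summed over all connections. The connection names and values are hypothetical.

```python
def timing_cost(delays, criticalities, criticality_exponent):
    """Sum of delay * criticality**exponent over all connections (assumed form)."""
    return sum(delays[c] * criticalities[c] ** criticality_exponent for c in delays)

# Two hypothetical connections with equal delay but different criticality.
DELAYS = {"critical": 2.0, "relaxed": 2.0}
CRITICALITIES = {"critical": 1.0, "relaxed": 0.5}

COST_EXP_1 = timing_cost(DELAYS, CRITICALITIES, 1)  # relaxed connection still counts for 1.0
COST_EXP_8 = timing_cost(DELAYS, CRITICALITIES, 8)  # relaxed connection shrinks to 2 * 0.5**8
```

With an exponent of 8, the non-critical connection contributes under 1% of what the critical one does, which is the weighting effect the slide describes.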

Balancing Wiring and Timing Cost
Need to determine relative changes in timing and wiring cost for each move
Idea: use changes relative to the previously calculated costs
- Both values are less than 1
- Helps balance the two effects through a scaling parameter
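One way to read this slide is as a weighted sum of two normalized changes. The function below is a sketch under that assumption; the name `scaling` stands for the slide's scaling parameter, and all numeric values are invented.

```python
def normalized_delta_cost(delta_timing, delta_wiring,
                          prev_timing, prev_wiring, scaling):
    """Blend relative timing and wiring changes; scaling=1 means timing only."""
    relative_timing = delta_timing / prev_timing
    relative_wiring = delta_wiring / prev_wiring
    return scaling * relative_timing + (1.0 - scaling) * relative_wiring

# Hypothetical move: timing improves by 10%, wiring gets 10% worse.
BALANCED = normalized_delta_cost(-1.0, 10.0, 10.0, 100.0, scaling=0.5)
TIMING_ONLY = normalized_delta_cost(-1.0, 10.0, 10.0, 100.0, scaling=1.0)
```

Dividing by the previous costs makes the two terms unitless, so a timing cost in nanoseconds and a wiring cost in grid units can be traded off on the same scale.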

Path vs Connection Based Timing Analysis
° Path based: run timing analysis to compute path delays at every stage of the placement and use the delays in the cost function
Computationally expensive
- Moving any connection triggers a new timing analysis
° Connection based: perform timing analysis before placement
- Assign slacks to each connection
- Pay attention to connections with low slack
Delay values are always up to date (∆x, ∆y)
Criticality becomes outdated after the moves
° Approach: hybrid
Allow a certain number of moves between each timing analysis

VPR, Placement
° VPlace is a simulated-annealing-based algorithm
Minimizes the amount of interconnect: circuit blocks that are on the same net are placed close together
Uses a bounding-box-based cost function
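A bounding-box cost can be sketched as the half-perimeter of each net's bounding box, summed over all nets. This is a simplified illustration: VPR's actual cost function also applies a net-size correction and accounts for channel capacity, both omitted here, and the block names and coordinates are invented.

```python
def bounding_box_cost(nets, positions):
    """Sum over nets of (x span + y span) of the box enclosing the net's blocks."""
    total = 0
    for net in nets:
        xs = [positions[block][0] for block in net]
        ys = [positions[block][1] for block in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

# Hypothetical placement with one two-terminal net and one three-terminal net.
POSITIONS = {"a": (0, 0), "b": (3, 1), "c": (1, 4)}
NETS = [["a", "b"], ["a", "b", "c"]]
WIRING_COST = bounding_box_cost(NETS, POSITIONS)
```

Moving blocks of the same net closer together shrinks the box, so minimizing this sum pulls connected blocks toward each other.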

Updated Annealing Algorithm

How often to recalculate delay?
Recalculating delay once per temperature is good.
Also simplifies programming somewhat
[Plot axis: # of temperature changes between each timing analysis]
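The once-per-temperature schedule can be sketched with a toy annealer. Everything here is an assumption made for illustration: Manhattan distance stands in for the delay lookup, `toy_timing_analysis` stands in for a real timing analysis, and the schedule constants, block names, and connections are invented.

```python
import math
import random

def manhattan(positions, conn):
    (x1, y1), (x2, y2) = positions[conn[0]], positions[conn[1]]
    return abs(x1 - x2) + abs(y1 - y2)

def toy_timing_analysis(positions, connections):
    # Stand-in (assumption): criticality = delay relative to the longest connection.
    longest = max(manhattan(positions, c) for c in connections)
    return {c: manhattan(positions, c) / longest for c in connections}

def costs(positions, connections, crit, exponent):
    # Delays are always current; criticalities may be stale within a temperature.
    timing = sum(manhattan(positions, c) * crit[c] ** exponent for c in connections)
    wiring = sum(manhattan(positions, c) for c in connections)
    return timing, wiring

def anneal(positions, connections, scaling=0.5, exponent=8.0,
           t_start=1.0, t_end=0.01, alpha=0.5, moves_per_temp=20, seed=0):
    rng = random.Random(seed)
    positions = dict(positions)
    blocks = sorted(positions)
    analyses = 0
    temp = t_start
    while temp > t_end:
        # Timing analysis runs once per temperature, not once per move.
        crit = toy_timing_analysis(positions, connections)
        analyses += 1
        prev_timing, prev_wiring = costs(positions, connections, crit, exponent)
        cur_timing, cur_wiring = prev_timing, prev_wiring
        for _ in range(moves_per_temp):
            a, b = rng.sample(blocks, 2)
            positions[a], positions[b] = positions[b], positions[a]
            new_timing, new_wiring = costs(positions, connections, crit, exponent)
            delta = (scaling * (new_timing - cur_timing) / prev_timing
                     + (1.0 - scaling) * (new_wiring - cur_wiring) / prev_wiring)
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                cur_timing, cur_wiring = new_timing, new_wiring
            else:
                positions[a], positions[b] = positions[b], positions[a]  # undo swap
        temp *= alpha
    return positions, analyses

START = {"a": (0, 0), "b": (1, 0), "c": (0, 1), "d": (1, 1)}
CONNECTIONS = [("a", "d"), ("b", "c"), ("a", "b")]
FINAL, ANALYSES = anneal(START, CONNECTIONS)
```

Within the inner loop only the delays change as blocks move. The criticalities stay fixed until the next temperature, which is the hybrid between path-based and connection-based analysis.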

Criticality Exponent
° Large exponent
Fewer connections will have a large “Timing_Cost”
- For these few connections “Timing_Cost” is effective
For non-critical connections “Wiring_Cost” is effective
Therefore, placement focuses on minimizing wiring as “Criticality_Exponent” increases

Criticality Exponent
° When the criticality exponent is 1
Critical path is worse
Wiring cost is much worse

Oscillation Effect
° When the scaling parameter is 1
Only the delay component is used
Attempts to minimize the critical path at the cost of extending other non-critical paths
Timing analysis runs once per temperature update
- Several moves between temperature updates
Able to reduce the critical path during one iteration of the outer loop
Makes other paths very critical
Oscillation effect makes it hard for placement to converge to the best solution
° When the scaling parameter is 0.5
Wirelength reduces the oscillation effect
- Penalizes moves that increase wirelength

Effect of the scaling parameter

How important is timing-driven placement? Run time Penalty – 2.5X

Conclusion
° The greatest challenge facing FPGA placement is the need to produce high-quality placements for ever-larger circuits. FPGA capacity doubles every two to three years, doubling the size of the placement problem.
° In order to maintain the fast time to market and ease of use historically provided by FPGAs, placement algorithms cannot be allowed to take ever more CPU time.
° There is thus a compelling need for algorithms that are very scalable and parallel yet still produce high-quality results.