Cross-layer Floorplan Optimization for Silicon Photonic NoCs in Many-core Systems. Ayse K. Coskun (3), Anjun Gu (1), Warren Jin (4), Ajay Joshi (3), Andrew B. Kahng (1,2), Jonathan Klamkin (4), Yenai Ma (3), John Recchio (1), Vaishnav Srinivas (1) and Tiansheng Zhang (3). Affiliations: (1) UCSD ECE Dept.; (2) UCSD CSE Dept.; (3) Boston University ECE Dept.; (4) UCSB ECE Dept. This research has been partially funded by the NSF grants CNS-1149703 and CCF-1149549. Work at UCSD has been supported by NSF, Samsung and the IMPACT+ Center.

Manycore Systems: Integrate many simple cores for massive throughput (thread-level parallelism); examples: Tilera Gx, Kalray MPPA-256. NoC requirements of manycore systems: long global links, large available bandwidth, low energy consumption. These requirements motivate a silicon photonic network.

Silicon Photonic Network: Silicon-photonic link (figure labels: ring modulator, ring filter, micro-heater, wavelengths λ and λ1). Ring resonators are highly thermally sensitive and on-chip temperature variations are large, so localized thermal tuning is used, which adds tuning power overhead. Laser source power consumption is large because of high optical loss (propagation, crossing, etc.) and low laser source efficiency. Approach: place and route (P & R) silicon photonic links to reduce optical loss.
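The link between optical loss, laser source efficiency and laser power can be illustrated with a minimal sketch. It assumes the usual relation in which the electrical laser power equals the optical power needed at the receiver, scaled up by the path loss in dB and divided by the wall-plug efficiency. The function names and every numeric value (per-cm loss, per-crossing loss, per-bend loss, receiver sensitivity) are hypothetical placeholders, not figures from the slides.

```python
def path_loss_db(length_cm, crossings, bends,
                 prop_db_per_cm=1.0, crossing_db=0.05, bend_db=0.01):
    """Total optical loss of one waveguide path in dB.

    All three loss coefficients are hypothetical defaults chosen only
    to make the example concrete.
    """
    return length_cm * prop_db_per_cm + crossings * crossing_db + bends * bend_db


def laser_electrical_power_mw(receiver_power_mw, loss_db, wall_plug_efficiency):
    """Electrical power drawn by the laser source for one wavelength.

    receiver_power_mw: optical power that must arrive at the detector.
    loss_db: total path loss between laser and detector.
    wall_plug_efficiency: optical output power / electrical input power.
    """
    if not 0.0 < wall_plug_efficiency <= 1.0:
        raise ValueError("wall_plug_efficiency must be in (0, 1]")
    optical_at_source_mw = receiver_power_mw * 10 ** (loss_db / 10.0)
    return optical_at_source_mw / wall_plug_efficiency


if __name__ == "__main__":
    loss = path_loss_db(length_cm=2.0, crossings=10, bends=4)
    for wpe in (0.05, 0.15):
        power = laser_electrical_power_mw(0.1, loss, wpe)
        print(f"loss={loss:.2f} dB, WPE={wpe:.0%}: {power:.2f} mW")
```

Because the loss enters exponentially, every dB saved by shorter waveguides or fewer crossings reduces laser power multiplicatively, while a higher wall-plug efficiency reduces it proportionally.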

Contributions: (1) We formulate a mixed integer linear programming (MILP) based optimizer that finds P & R solutions for Clos PNoC that minimize a combination of power consumption and area. (2) We develop a PNoC floorplan optimization flow that is aware of on-chip thermal variations based on various power profiles. (3) We propose the notion of power weight to model the thermal impact of cores on router elements, enabling optimization with heterogeneous cores or power profiles.

Outline: Previous Work; Cross-layer Floorplan Optimization for PNoCs; Experimental Results; Conclusions.

Related Work: Studies on placement of optical devices: PNoC placement's influence on signal-to-noise ratio [Li et al., 2015]; laser source placement's impact on PNoC power [Chen et al., 2014]. P & R solutions for PNoC: PROTON, an automatic tool for PNoC P & R [Boos et al., 2013]; GLOW, an ILP-based global router for PNoC [Ding et al., 2012]. Optical waveguide routing algorithms: reduce optical loss under a fixed netlist [Condrat et al., 2008] [Ding et al., 2009] [Ramini et al., 2013].

Floorplan Optimization Flow: Input: design options and constraints (# of cores, aspect ratios, etc.). An MILP-based optimizer, supported by a compact thermal model, produces the output: a floorplan with minimized PNoC power and area cost. Optimization goal: PNoC power (P & R's impact on waveguide length, crossings and bends; laser source efficiency; PNoC placement's impact on thermal tuning power) and PNoC area (area cost of router groups and waveguides).

Floorplan Optimization Flow: inputs include the number of cores, core parameters, and aspect ratio (AR).

Floorplan Optimization Notations: The system is formed by tiles (figure labels: L2, C+L1, processor tile with 4 cores). The PNoC is represented by clusters of tiles (horizontal or vertical), the locations of router groups (set C; labeled C=0, C=1 and C=7 in the figure), and the waveguides (set N; net n=0 is labeled in the figure). Vertex set V: potential places for router groups. Edge set A: potential places for waveguides.
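The vertex set V and edge set A can be pictured as a grid graph over the tile array. The sketch below is only an illustration under assumed geometry: it places one candidate vertex per tile position and one candidate edge between each pair of horizontally or vertically adjacent vertices. The slides do not state the exact granularity of V and A, and the function name is hypothetical.

```python
def build_routing_graph(rows, cols):
    """Build (V, A) for a rows x cols tile array.

    V: candidate router-group locations, one per (row, col) position
       (an assumption made for illustration).
    A: candidate waveguide segments between 4-neighbour vertices,
       stored once per undirected pair.
    """
    vertices = [(r, c) for r in range(rows) for c in range(cols)]
    edges = []
    for r, c in vertices:
        if c + 1 < cols:  # segment to the right-hand neighbour
            edges.append(((r, c), (r, c + 1)))
        if r + 1 < rows:  # segment to the neighbour below
            edges.append(((r, c), (r + 1, c)))
    return vertices, edges


if __name__ == "__main__":
    V, A = build_routing_graph(4, 4)
    print(len(V), "vertices,", len(A), "edges")
```

An optimizer would then attach decision variables to these candidates: one per (router group, vertex) pair for placement and one per (net, edge) pair for routing.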

Floorplan Optimization Flow: Input: design options and constraints (# of cores, aspect ratios, etc.). MILP-based optimizer with compact thermal model. Output: floorplan with minimized PNoC power and area cost. Compact thermal model: power profile; accumulated thermal weight profiles; resonant frequency difference among router groups; thermal tuning power. Matrix sizes shown for the compact thermal model: 1×N, N×M, 1×M.
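One plausible reading of the sizes 1×N, N×M and 1×M is a vector-matrix product: a 1×N power profile multiplied by an N×M compact thermal model gives a 1×M accumulated thermal weight profile. The sketch below follows that reading, which is an interpretation and not something the slides spell out. Using the spread between the largest and smallest accumulated weight as a rough proxy for tuning effort is likewise an assumption, and all numbers are made up.

```python
def accumulated_thermal_weight(power_profile, thermal_model):
    """Multiply a 1xN power profile by an NxM thermal weight matrix.

    power_profile: list of N per-source power values (W).
    thermal_model: N rows, each with M weights; entry [i][j] is the
                   assumed thermal influence of source i on location j.
    Returns a list of M accumulated thermal weights.
    """
    n = len(power_profile)
    if len(thermal_model) != n:
        raise ValueError("thermal_model must have one row per power source")
    m = len(thermal_model[0])
    if any(len(row) != m for row in thermal_model):
        raise ValueError("thermal_model rows must have equal length")
    return [
        sum(power_profile[i] * thermal_model[i][j] for i in range(n))
        for j in range(m)
    ]


def weight_spread(weights):
    """Largest minus smallest accumulated weight (hypothetical proxy)."""
    return max(weights) - min(weights)


if __name__ == "__main__":
    model = [[0.5, 0.1], [0.2, 0.2], [0.1, 0.5]]  # N=3 sources, M=2 locations
    for profile in ([1.0, 2.0, 1.0], [2.0, 1.0, 1.0]):
        w = accumulated_thermal_weight(profile, model)
        print(profile, "->", w, "spread", round(weight_spread(w), 3))
```

In this toy case the symmetric profile yields equal weights at both locations, while the skewed profile produces a non-zero spread, which mirrors the slides' point that asymmetric thermal weight profiles matter for router-group placement.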

Floorplan Optimization Formulation: The constraints fall into three groups: router group related constraints, tile and cluster related constraints, and path related constraints. The individual constraints are annotated as follows. Only one vertex and one orientation is chosen for each router group. Row/column index of router group c. Orientation of the cluster of ring group c. Which tiles are occupied by which cluster. No tile can belong to more than one cluster. A path of routing graph edges for each net n from its source s_n to its sink t_n.
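The constraints described above can be made concrete with a small feasibility checker. This is not the MILP from the talk, whose equations are not reproduced in the transcript. It is a sketch that tests a candidate solution against three of the stated conditions, using hypothetical names and an assumed cluster geometry (a straight run of `length` tiles starting at the chosen vertex).

```python
from collections import deque


def one_choice_per_group(x, groups):
    """x maps (group, vertex, orientation) -> 0/1.

    True if every router group has exactly one selected
    (vertex, orientation) pair.
    """
    return all(
        sum(val for (c, _, _), val in x.items() if c == g) == 1
        for g in groups
    )


def cluster_tiles(vertex, orientation, length):
    """Tiles covered by a cluster (assumed straight-line geometry)."""
    r, c = vertex
    if orientation == "H":
        return {(r, c + k) for k in range(length)}
    return {(r + k, c) for k in range(length)}


def tiles_disjoint(x, length):
    """True if no tile belongs to more than one selected cluster.

    Chip boundaries are not checked in this sketch.
    """
    seen = set()
    for (_, vertex, orientation), val in x.items():
        if val != 1:
            continue
        tiles = cluster_tiles(vertex, orientation, length)
        if seen & tiles:
            return False
        seen |= tiles
    return True


def connects(edges, source, sink):
    """True if the selected undirected edges join source to sink.

    This checks connectivity only; it does not verify that the edges
    form a single simple path.
    """
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    frontier, visited = deque([source]), {source}
    while frontier:
        u = frontier.popleft()
        if u == sink:
            return True
        for w in adj.get(u, []):
            if w not in visited:
                visited.add(w)
                frontier.append(w)
    return False


if __name__ == "__main__":
    x = {(0, (0, 0), "H"): 1, (0, (0, 0), "V"): 0,
         (1, (1, 0), "H"): 1, (1, (0, 1), "V"): 0}
    print(one_choice_per_group(x, [0, 1]), tiles_disjoint(x, 2))
    print(connects([((0, 0), (0, 1)), ((0, 1), (1, 1))], (0, 0), (1, 1)))
```

In an MILP, the first check corresponds to an equality constraint over binary selection variables, the second to a per-tile "at most one" inequality, and the third is typically expressed through flow conservation on the routing graph.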

Design of Experiments: Experimental methodology: software: ILOG CPLEX v12.5.1; platform: 2.8GHz Xeon server. Configuration parameters: # of cores (64, 128, and 256); network configuration (8-ary Clos); cluster aspect ratio (1:2, 1:4, and 1:8); chip aspect ratio (1:1, 1:2, and 1:4); optical data rate (2Gbps, 4Gbps and 8Gbps); # of waveguides (32, 64 and 128). Power profiles (W): uniform, chessboard, centric, cornered, clustered, Pringle.
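The configuration parameters listed above can be written down as a sweep table. The sketch below enumerates every combination, which assumes a full-factorial sweep; the slides do not say whether all combinations were actually run. The dictionary keys and the function name are hypothetical.

```python
from itertools import product

# Parameter values taken from the slide; key names are hypothetical.
SWEEP = {
    "cores": [64, 128, 256],
    "cluster_aspect_ratio": ["1:2", "1:4", "1:8"],
    "chip_aspect_ratio": ["1:1", "1:2", "1:4"],
    "data_rate_gbps": [2, 4, 8],
    "waveguides": [32, 64, 128],
}


def enumerate_configs(sweep):
    """Return one dict per combination of parameter values."""
    keys = list(sweep)
    return [
        dict(zip(keys, values))
        for values in product(*(sweep[k] for k in keys))
    ]


if __name__ == "__main__":
    configs = enumerate_configs(SWEEP)
    print(len(configs), "configurations; first:", configs[0])
```

Each configuration would then be paired with one of the power profiles and handed to the optimizer as a separate run.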

Results: For different core counts and for different chip aspect ratios, the figures show accumulated thermal weight profiles, PNoC power (W), and optimized PNoC layouts.

Results: For various laser source wall-plug efficiencies (WPE: 5% and 15%) and power profiles (chessboard, clustered, Pringle), the figures show power profiles, accumulated thermal weight profiles, and optimized PNoC layouts. Annotation: 15% power saving compared to the vertical U-shape layout.

Results: For various optical data rates, the figures show accumulated thermal weight profiles, PNoC power (W), and optimized PNoC layouts. For different cluster power weights, the figures compare a high-power cluster and a low-power cluster.

Take-away Points: Thermal tuning power and laser power play important roles in PNoC P & R. Larger chips present an economy of scale for PNoC power due to their more symmetric thermal weight profiles. Skewed chip aspect ratios create asymmetry in the thermal weight profiles. The maximum achievable optical data rate is always preferred. It is important to consider different power profiles at design time.

Conclusions: We proposed a cross-layer, thermally-aware optimizer for floorplanning of PNoCs. The optimizer minimizes PNoC power using an MILP formulation by placing and routing on-chip photonic devices. Compared to thermally-agnostic solutions, the proposed optimizer saves up to 15% PNoC power.