
Presentation transcript:

Networks On Chip
[Figure: single-core system (IP, I/O, memory, hard disk); bus-based multi-core system; network of routers (R)]
• Increasing application complexity
• Parallel processing
• Bus-based architecture does not scale
• High latency, low bandwidth, low predictability
• Networks-on-chip (NoCs) enable multi-core systems
• Better bandwidth, scalability, and reliability

• Key challenge: communication
  • Scalability
  • Performance
  • Power
• NoC helps! However:
  • High latency
  • High power dissipation
    • ~40% of overall power in MIT RAW
    • ~30% of overall power in the Intel 80-core teraflop chip
  • Temperature, chip reliability, etc.

Contribution
Propose a novel hybrid nanophotonic-electric architecture called PHOTON
• Photonic ring interfaced with a 2D electrical mesh
• Key enabler: CMOS ICs with 3D integration
  • Separate photonic and logic layers
• Low latency, high bandwidth, low power

Components: Photonic Interconnect
• Laser light source: multi-wavelength, mode-locked
• Modulator: microring-resonator structure
• Detector: SiGe photodetector with microring-resonator filters
• Waveguide: high-refractive-index silicon-on-insulator (SOI)
• WDM: wavelength-division multiplexing
  • n interfacing cores, each with exclusive access to λ/n wavelengths
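
The WDM partitioning above can be illustrated with a short sketch. It assumes the total wavelength count divides evenly among the interfacing cores and that each one receives a contiguous block of wavelength indices. The function name `allocate_wavelengths` and the contiguous assignment are hypothetical and do not come from the slides.

```python
def allocate_wavelengths(total_wavelengths: int, num_interfaces: int) -> dict[int, list[int]]:
    """Give each interface an exclusive, equally sized block of wavelength indices.

    Assumption: contiguous blocks; the slides only state that each of the
    n interfacing cores has exclusive access to lambda/n wavelengths.
    """
    if num_interfaces <= 0:
        raise ValueError("num_interfaces must be positive")
    if total_wavelengths % num_interfaces != 0:
        raise ValueError("wavelengths must divide evenly among interfaces")
    per_interface = total_wavelengths // num_interfaces
    return {
        i: list(range(i * per_interface, (i + 1) * per_interface))
        for i in range(num_interfaces)
    }
```

With 16 wavelengths and 4 interfaces, each interface would own 4 wavelengths and no wavelength would be shared.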

Components of Photonic Ring
• Microring resonators as couplers
  • Destructive overlap with older messages in the ring
• Attenuators before each modulator
  • Sink for the corresponding wavelength if the signal goes full circle

• Number of cores around a gateway that use the photonic path

6-tuple parameterization
• k: number of photonic rings
• b: bitwidth of the waveguides
• n: number of gateway interfaces
• r: PRI size
• w: number of WDM channels
• c: number of cores in the CMP
Configurations shown:
• k=4, b=256, n=16, r=2, w=16, c=36
• k=5, b=256, n=16, r=2, w=16, c=36
• k=3, b=256, n=12, r=2, w=16, c=36
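
One way to carry the 6-tuple around in a simulator is a small immutable record. The sketch below is illustrative only: the class name `PhotonConfig`, the list `SLIDE_CONFIGS`, and the validity checks (all values positive, no more gateway interfaces than cores) are assumptions, while the three parameter sets are the ones listed on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhotonConfig:
    """The (k, b, n, r, w, c) parameter tuple described on the slide."""

    k: int  # number of photonic rings
    b: int  # bitwidth of the waveguides
    n: int  # number of gateway interfaces
    r: int  # PRI size
    w: int  # number of WDM channels
    c: int  # number of cores in the CMP

    def __post_init__(self) -> None:
        for name in ("k", "b", "n", "r", "w", "c"):
            if getattr(self, name) <= 0:
                raise ValueError(f"{name} must be positive")
        # Assumption (not from the slides): gateways cannot outnumber cores.
        if self.n > self.c:
            raise ValueError("n must not exceed c")


# The three configurations shown on the slide.
SLIDE_CONFIGS = [
    PhotonConfig(k=4, b=256, n=16, r=2, w=16, c=36),
    PhotonConfig(k=5, b=256, n=16, r=2, w=16, c=36),
    PhotonConfig(k=3, b=256, n=12, r=2, w=16, c=36),
]
```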

System Level Architecture
• Electrical mesh
  • Wormhole switching
  • Flit width of 256 bits
  • Regular 2D electrical mesh topology
  • Input-queued crossbar with a 4-flit buffer at each port
  • Enhanced XY dimension-order routing
• Photonic ring
  • Parallel waveguides = flit width = 256
  • Gateway interface routers enable inter-layer transfers
    • Reduces router overhead
  • ACK/NACK flow control
  • If multiple requests contend for access to the photonic waveguide at a gateway interface, the request with the furthest distance is given priority
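
The furthest-distance priority rule at a gateway interface can be sketched as below. The slides do not say how distance is measured or how ties are resolved, so the `distance` field (taken here as a precomputed hop count) and the earliest-request-wins tie-break are assumptions, as are the names `Request` and `arbitrate`.

```python
from typing import NamedTuple, Optional, Sequence


class Request(NamedTuple):
    source: int       # id of the requesting core
    destination: int  # id of the destination core
    distance: int     # assumed metric: hops from source to destination


def arbitrate(requests: Sequence[Request]) -> Optional[Request]:
    """Grant the photonic waveguide to the request with the furthest distance.

    Ties go to the request that appears first in the sequence, because
    max() returns the first maximal element it encounters.
    """
    if not requests:
        return None
    return max(requests, key=lambda req: req.distance)
```

Requests that lose arbitration would be retried or sent over the electrical mesh. That policy is outside this sketch.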

PRI Aware X-Y Router
[Figure: router block diagram with input and output ports (N, W, E, S, Local), 6x6 crossbar switch, routing and switch allocation, arbitration, flow control, region validation, timeout monitor, and optical WDM control connected to the photonic layer]
• n-k regular routers with region validation and timeout monitor
• Enhanced gateway interface
• Adds < 1% area overhead (minimal)

PRI Aware X-Y Routing

• PRI:
  • A small PRI promotes transfers over the electrical NoC
  • A large PRI promotes transfers over the photonic rings
• WDM:
  • Power is dissipated in the modulators and receivers
  • Reducing the number of WDM channels can save power
• DVS/DFS:
  • Dynamic supply voltage and clock frequency scaling is one of the most widely used runtime optimizations
  • Scaling to match performance requirements can lead to an almost quadratic reduction in power
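
The near-quadratic saving from voltage scaling follows from the standard CMOS dynamic power model, P = α·C·V²·f. That model is an assumption brought in here for illustration and is not stated on the slides. The hypothetical helper below returns power relative to the unscaled operating point.

```python
def relative_dynamic_power(voltage_scale: float, frequency_scale: float = 1.0) -> float:
    """Dynamic power relative to nominal under P = alpha * C * V^2 * f.

    voltage_scale and frequency_scale are ratios to the nominal supply
    voltage and clock frequency (1.0 means unchanged).
    """
    if voltage_scale <= 0 or frequency_scale <= 0:
        raise ValueError("scale factors must be positive")
    return voltage_scale ** 2 * frequency_scale
```

Under this model, halving the supply voltage alone cuts dynamic power to a quarter, and scaling both voltage and frequency to 80% leaves roughly half of the original dynamic power.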

• Goal:
  • Analyze power, latency, and performance tradeoffs compared with:
    • Traditional NoC architectures
    • Non-reconfigurable hybrid photonic NoC
    • Other hybrid photonic NoCs proposed in recent literature
• Simulation parameters:
  • CMP/NoC sizes: 6x6, 10x10
  • Benchmarks: SPLASH-2
  • Runtime dynamic configuration
• Simulation methodology:
  • SystemC: allows modeling of hardware and software components
  • Cycle-accurate model

Loss
• Coupler/splitter optical loss: 1.2 dB
• Non-linearity optical loss: 1 dB at 30 mW
• Waveguide crossing loss: 0.05 dB
• Ring modulator loss: 1 dB
• Receiver filter loss: 1.5 dB
• Photodetector loss: 0.1 dB
• SOI waveguide loss: 3 dB/cm

Delay
• Electrical delay: 42 ps/mm
• Modulator driver delay: 9.5 ps
• Modulator delay: 3.1 ps
• Waveguide delay: 15.4 ps/mm
• Photodetector delay: 0.22 ps
• Receiver delay: 24.0 ps

Power
• Data-traffic-dependent energy (modulator and receiver): 20 fJ/bit
• Static energy (clock, leakage): 5 fJ/bit
• Thermal tuning energy (20 K temperature range, 1 heater per microring resonator): 16 fJ/bit/heater
• Bitwidth of the waveguides: 256
• Electrical laser power: 3.3 W with 30% η
• CMOS: 32 nm

Based on real-world data and ITRS projections
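
The figures above can be combined into a rough loss and delay estimate for one photonic transfer. This is a sketch under stated assumptions: a single path is taken to pass through one coupler, one ring modulator, one receiver filter, and one photodetector, the non-linearity loss is left out, and the electrical figure covers wire delay only (no router pipeline). The function names are hypothetical, and the numeric constants are the values from the slide.

```python
# Constants taken from the slide's parameter list.
COUPLER_LOSS_DB = 1.2
MODULATOR_LOSS_DB = 1.0
RECEIVER_FILTER_LOSS_DB = 1.5
PHOTODETECTOR_LOSS_DB = 0.1
CROSSING_LOSS_DB = 0.05
WAVEGUIDE_LOSS_DB_PER_CM = 3.0

MODULATOR_DRIVER_DELAY_PS = 9.5
MODULATOR_DELAY_PS = 3.1
PHOTODETECTOR_DELAY_PS = 0.22
RECEIVER_DELAY_PS = 24.0
WAVEGUIDE_DELAY_PS_PER_MM = 15.4
ELECTRICAL_DELAY_PS_PER_MM = 42.0


def path_loss_db(length_cm: float, crossings: int = 0) -> float:
    """Optical loss of one path (assumed component set, see lead-in)."""
    fixed = (COUPLER_LOSS_DB + MODULATOR_LOSS_DB
             + RECEIVER_FILTER_LOSS_DB + PHOTODETECTOR_LOSS_DB)
    return fixed + WAVEGUIDE_LOSS_DB_PER_CM * length_cm + CROSSING_LOSS_DB * crossings


def photonic_delay_ps(length_mm: float) -> float:
    """End-to-end delay: fixed conversion overheads plus waveguide flight time."""
    fixed = (MODULATOR_DRIVER_DELAY_PS + MODULATOR_DELAY_PS
             + PHOTODETECTOR_DELAY_PS + RECEIVER_DELAY_PS)
    return fixed + WAVEGUIDE_DELAY_PS_PER_MM * length_mm


def electrical_delay_ps(length_mm: float) -> float:
    """Wire delay only; router and buffer delays are not modeled."""
    return ELECTRICAL_DELAY_PS_PER_MM * length_mm
```

Because the photonic path pays a fixed conversion overhead of about 37 ps, this model only favors it over a plain electrical wire once the distance exceeds roughly 1.4 mm.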

Improvement compared with the non-dynamic configuration
• Greater number of photonic rings: more opportunities for fine-tuning traffic distribution

Power Improvement
• Significant improvement for relatively smaller complexity

PHOTON energy-delay improvements relative to the electrical mesh

• PHOTON has a significant advantage over the more complex hybrid photonic torus architecture
  • Fewer power-hungry photonic components
  • Aggressive power savings with runtime reconfiguration

• Hybrid photonic torus has 10-15× more photonic layer area
  • About 1.5-2× electrical layer area overhead
• Electrical layer overhead for PHOTON is minimal
[Figures: optical layer area improvement; silicon layer overhead]

• Future CMPs with hundreds of cores
  • Require a scalable communication fabric
  • Reducing power consumption is essential
  • High performance per watt
• 2D electrical NoCs are unable to meet these requirements
• The proposed novel PHOTON architecture shows significant promise
  • Simpler and scalable architecture
  • Lower area overhead
  • Significant power and performance gains
