Stabilizing Path Modification of Power-Aware On/Off Interconnection Networks

Jose Miguel Montanana (NII, Japan), Michihiro Koibuchi (NII, Japan), Hiroki Matsutani (U of Tokyo, Japan), Hideharu Amano (Keio U / NII, Japan)

Outline
- HPC networks (InfiniBand, GbE)
- On/off link activation method
  - Reducing power consumption of HPC networks
  - Paths are updated to avoid deactivated links
- Applying network reconfiguration to switches
- Evaluations
  - Cycle-accurate network simulator
  - Behavior of network during the path change

Network of high-performance computing
[Figure: number of supercomputers on the Top500 list and their percentage of the list; axis from 0% to 60%]

Examples (2008)
- Virginia Tech's X: 2,200 cores, 280th on Top500
- ABE (NCSA): 9,600 cores, 23rd on Top500
- ASCI-Q (LANL): 8,192 cores
- BLUEGENE/L (LLNL): 212,992 processors, 2nd on Top500
- RoadRunner (LANL): 122,400 cores, 1st on Top500
- TACC (Univ Texas): 251,904 cores, 5th on Top500
Interconnect labels on the slide: IBA, Proprietary, Quadrics

HPC Networks
- Small switches (24/48-port) provide the lowest cost per port
- When 100,000 cores are connected, a large number of small switches is needed
  - This drastically increases the number of links
  - Unused and rarely used links should be deactivated for power-aware HPCs
[Figure: switches and hosts connected by link aggregation using 3 links; 4 paths, TREE 1 to TREE 4]

Power consumption of HPC switches
- Power consumption is almost constant regardless of traffic load
- The number of activated ports dominates the power consumption of switches
  - The power consumption of a port is reduced to zero by the port-shutdown operation
[Table: power of GbE and IB switch products, unit W; columns: Product, Port, Other (Xbar), Total (ratio of ports). Surviving port ratios: 65%, 53% and 63% for three "PC" products, 41% for an "SF" product, 34% for SFS7000D-SK]

Overview of the on/off link method
- Network load is not always high (e.g. during computation time)
- Switch ports consume 40-60% of the total power of a switch
- When the traffic load becomes low, a part of the links is turned off
[Figure: switch and host topology with TREE 1 to TREE 4, before and after a part of the links is turned off]

A runtime on/off link method
1. Traffic monitoring (e.g. port monitor, IPTraf, pilot execution)
2. Do low- or high-load links appear? If no, keep monitoring; if yes, continue
3. Selection of on/off links and paths
4. Update of link status and paths
- Very crucial factor: how is the network stabilized during the path update?
[Figure: low traffic load is detected; paths before and after the "before" path is deactivated]
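The selection step of this loop can be sketched as follows. This is a minimal illustration only: the `Link` record, the `low` and `high` thresholds and the `min_active` guard are assumptions made for the example and do not come from the slides, which do not state how links are chosen.

```python
from dataclasses import dataclass

@dataclass
class Link:
    name: str
    active: bool
    utilization: float  # measured load as a fraction of capacity (hypothetical metric)

def select_link_changes(links, low=0.10, high=0.80, min_active=1):
    """Return (to_deactivate, to_activate) lists of link names.

    Thresholds are illustrative assumptions, not values from the slides.
    """
    # High load on any active link: bring every deactivated link back.
    if any(l.active and l.utilization >= high for l in links):
        return [], [l.name for l in links if not l.active]

    # Low load: propose the least-used active links, but never drop
    # below min_active active links so the endpoints stay connected.
    active = [l for l in links if l.active]
    idle = sorted((l for l in active if l.utilization <= low),
                  key=lambda l: l.utilization)
    can_drop = max(0, len(active) - min_active)
    return [l.name for l in idle[:can_drop]], []
```

The returned lists would feed the "update of link status and paths" step, which is where the reconfiguration techniques discussed next come in.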

Stabilizing network during the path update
- Network reconfiguration (deadlock avoidance)
  - Rold = routing table before the update
  - Rnew = routing table after the update
  - Rold is deadlock free
  - Rnew is deadlock free
  - Rold + Rnew may deadlock
[Figure: switches and links under Rold and Rnew during NW reconfiguration]

Network Reconfiguration
- Rold is deadlock free
- Rnew is deadlock free
- Rold + Rnew may cause deadlock
[Figure: reconfiguration from Rold to Rnew; deadlock with "old behind new" and "new behind old"]
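One common way to reason about this is a channel dependency graph: if the graph induced by a routing function has no cycle, that routing cannot deadlock. The slides do not say which check the authors use, so the sketch below is an assumption, and the channel names `c0` to `c3` are made up. It shows two dependency sets that are each acyclic while their union contains a cycle.

```python
def has_cycle(deps):
    """Detect a cycle in a directed graph given as (src, dst) channel pairs."""
    graph = {}
    for src, dst in deps:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on DFS stack, finished
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GREY
        for nxt in graph[node]:
            if color[nxt] == GREY:        # back edge closes a cycle
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)

# Hypothetical channel dependencies for the old and new routing tables.
r_old = {("c0", "c1"), ("c1", "c2")}
r_new = {("c2", "c3"), ("c3", "c0")}

print(has_cycle(r_old), has_cycle(r_new), has_cycle(r_old | r_new))
```

A cycle in the union only means that deadlock is possible while packets routed by both tables share the network; it does not mean a deadlock must occur.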

Existing NW reconfiguration techniques on fault-tolerant networks
- Static reconfiguration
  - Traffic is stopped
  - New routing is applied
  - Traffic is resumed
  - High latencies
  - Technique: Static Reconfiguration (ST)
- Dynamic reconfiguration
  - Traffic is not stopped
  - Old and new routing coexist
  - Difficult to avoid deadlock
  - Techniques: Double Scheme, Simple Reconfiguration

Current NW Reconfigurations
- SR-PDA: Simple Reconfiguration, Packet Dropping Aware [Lysne08, IEEE TC]
  - Tokens are sent before the update of routing
  - Packets are sent after updating routing tables
- SR-LA: Simple Reconfiguration, Latency Aware [Lysne08, IEEE TC]
  - All new tables are distributed before using the new one
  - Latency due to the tokens is reduced
- DS: Double Scheme [Pinkston03, TPDS]
  - Requires 2 virtual channels
  - One channel has to be drained
- ST: Static Reconfiguration
  - Traffic injection is completely stopped

Simulation Environment
- Switch model (InfiniBand)
  - Buffered input (1KB per VL) and output (1KB per VL) ports
  - Non-multiplexed crossbar with separate ports per VL
  - FIFO-based crossbar arbiter per output crossbar port
  - Round-robin arbiter per output port
  - 100 ns routing time
- Link model
  - Link speed = 2.5 Gbps (1X links)
- Topologies
  - 2D mesh networks
- Traffic model
  - Packet lengths are 58 bytes
  - Uniform
  - Full range of traffic, from low load to saturation
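As a rough sanity check on these parameters, the arithmetic below derives the time a 58-byte packet occupies a 2.5 Gbps link and how many such packets fit in a 1KB buffer. The helper names are hypothetical, the link rate is taken as stated on the slide with any line-encoding overhead ignored, and "1KB" is assumed to mean 1024 bytes.

```python
# Values from the slide; names and interpretation are assumptions.
PACKET_BYTES = 58
LINK_GBPS = 2.5        # 1 Gbps equals 1 bit per nanosecond
BUFFER_BYTES = 1024    # assumes "1KB" means 1024 bytes
ROUTING_NS = 100

def serialization_ns(packet_bytes=PACKET_BYTES, link_gbps=LINK_GBPS):
    """Time to put one packet on the link, in nanoseconds."""
    return packet_bytes * 8 / link_gbps

def packets_per_buffer(buffer_bytes=BUFFER_BYTES, packet_bytes=PACKET_BYTES):
    """Whole packets that fit into one per-VL buffer."""
    return buffer_bytes // packet_bytes

print(serialization_ns(), packets_per_buffer())
```

Under these assumptions a packet takes 185.6 ns to serialize, which is about twice the 100 ns routing time, and a per-VL buffer holds 17 packets.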

Evaluation Results
- We apply the NW reconfiguration process twice in each execution:
  - Deactivating links, after the traffic injection decreases
  - Re-activating links, after the traffic injection increases
- We evaluated the full range of initial traffic injection, from low traffic to near congestion

Static Reconfiguration (ST)
[Figure: latency for (a) low traffic load and (b) high traffic load; the traffic load decreases and a link is deactivated, then the traffic load increases and a link is reactivated; latency is high]
- At each on/off link operation, traffic is not stabilized in ST!!

SR-LA (dynamic reconfiguration)
[Figure: latency for (a) low traffic load and (b) high traffic load]
- Also, at each on/off link operation, traffic is not stabilized in SR-LA!!

SR-PDA (dynamic reconfiguration)
[Figure: latency for (a) low traffic load and (b) high traffic load]
- Also, at each on/off link operation, traffic is not stabilized in SR-PDA!!

Double Scheme (dynamic reconfiguration)
[Figure: latency for (a) low traffic load and (b) high traffic load; latency is constant both when the traffic load decreases and when it increases]
- The path update is stabilized only in Double Scheme!!

Larger Network (8x8 Mesh)
[Figure: results for DS, ST and SRL]
- Similar behavior!!
- Only Double Scheme stabilizes networks during the path update!!

Conclusions
- We apply network reconfiguration techniques to power-aware on/off networks for HPC
  - Links consume ~63% of switch power
    - On/off link activation reduces power
    - It must accept the topology change
  - Network reconfiguration smoothly supports the path update
    - Stabilizing the update of new/old paths
    - Avoiding deadlocks of new/old paths
- Cycle-accurate simulation
  - Shows its impact on the power-aware on/off networks
  - Double Scheme (dynamic NW reconfiguration) maintains performance while stabilizing the network and avoiding deadlock
- Network reconfiguration is essential for realizing power-aware on/off networks for HPC systems

Acknowledgment
This work was partially supported by JST CREST (ULP-HPC: Ultra Low-Power, High-Performance Computing via Modelling and Optimization of Next Generation HPC Technologies).
