Dragonfly+: Low Cost Topology for scaling Datacenters

Presentation transcript:

Dragonfly+: Low Cost Topology for scaling Datacenters Authors: Alexander Shpiner, Zachy Haramaty, Saar Eliad, Vladimir Zdornov, Barak Gafni and Eitan Zahavi

Outline: Topology; Fully Progressive Adaptive Routing; Analytical and Simulative Analysis; Conclusion

Topology: Dragonfly and Fat Trees (figures: Dragonfly topology; Fat Tree topology)

Topology Dragonfly+

Topology
To keep full bisection bandwidth inside the group:
(1) p = l = s = h
(2) p = h = k/2
(3) Ngroup = p*l = (k*k)/4
where l: leaf routers; p: hosts per leaf router; s, h: spine routers; k: router radix; Ngroup: number of hosts in the group.
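The three rules above reduce to simple arithmetic on the router radix k. The sketch below is illustrative only: the function names are hypothetical, and the maximal-network formula assumes that h counts the global links of each spine router and that every pair of groups is joined by exactly one global link. That assumption is not stated on the slide, although it reproduces the 272-host maximal network quoted for k=8 in the simulation slide.

```python
def dragonfly_plus_group(k: int) -> dict:
    """Group parameters implied by rules (1)-(3) for radix-k routers."""
    if k <= 0 or k % 2:
        raise ValueError("router radix k must be a positive even integer")
    p = k // 2           # rule (2): hosts per leaf router
    l = s = h = p        # rule (1): p = l = s = h
    n_group = p * l      # rule (3): equals k*k/4
    return {"p": p, "l": l, "s": s, "h": h, "n_group": n_group}


def max_hosts(k: int) -> int:
    """Largest network size, under the assumptions named in the lead-in."""
    g = dragonfly_plus_group(k)
    # ASSUMPTION: each of the s spines has h global links, and one global
    # link per pair of groups is enough, so a group can reach s*h others.
    max_groups = g["s"] * g["h"] + 1
    return max_groups * g["n_group"]


if __name__ == "__main__":
    print(dragonfly_plus_group(36))
    print(max_hosts(8))
```

For k=36 the group holds 324 hosts, which matches the 1296-host, 4-group network used in the simulations.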

Routing

Deadlock Avoidance
(1) A packet that traverses the minimal route does not change its VL (virtual lane).
(2) A packet that traverses an intermediate spine route changes its VL in the intermediate spine router.
(3) A packet that traverses an intermediate leaf route changes its VL in the intermediate leaf router.
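The three deadlock-avoidance rules can be written as a small per-router function. This is a minimal sketch, not the authors' implementation: the names `Route` and `vl_after_router` are hypothetical, and "changes its VL" is modeled as incrementing the VL index by one, which the slide does not specify.

```python
from enum import Enum


class Route(Enum):
    MINIMAL = "minimal"
    INTERMEDIATE_SPINE = "intermediate_spine"
    INTERMEDIATE_LEAF = "intermediate_leaf"


def vl_after_router(route: Route, vl: int, router_kind: str,
                    is_intermediate: bool) -> int:
    """Return the packet's VL after it passes one router.

    router_kind is "leaf" or "spine"; is_intermediate marks the router
    chosen as the intermediate hop of a non-minimal route.
    ASSUMPTION: a VL change means moving to the next VL index.
    """
    if route is Route.MINIMAL or not is_intermediate:
        return vl                      # rule (1), or not the intermediate hop
    if route is Route.INTERMEDIATE_SPINE and router_kind == "spine":
        return vl + 1                  # rule (2)
    if route is Route.INTERMEDIATE_LEAF and router_kind == "leaf":
        return vl + 1                  # rule (3)
    return vl
```

Under this model a packet changes its VL at most once along its route, at the intermediate router.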

Routing
Is min-routing optimal?
A non-minimal route is chosen if all egress queues on the minimal routes are longer than T and there is an egress queue on a non-minimal route that is shorter than T, where T is the queue length threshold.
Routing decisions are evaluated in every router on the packet's path.
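The threshold rule can be sketched as a per-router decision over egress queue lengths. The function name `choose_route` and its return shape are hypothetical. Picking the shortest queue among the eligible ports is an added assumption, since the slide only says which class of route is chosen.

```python
def choose_route(min_queues, non_min_queues, threshold):
    """Decide between minimal and non-minimal egress ports.

    min_queues / non_min_queues: egress queue lengths of the candidate
    ports on minimal and non-minimal routes. Returns (kind, index).
    """
    if not min_queues:
        raise ValueError("at least one minimal-route port is required")
    all_min_congested = all(q > threshold for q in min_queues)
    free_non_min = [i for i, q in enumerate(non_min_queues) if q < threshold]
    if all_min_congested and free_non_min:
        # ASSUMPTION: among eligible non-minimal ports, take the shortest queue.
        best = min(free_non_min, key=lambda i: non_min_queues[i])
        return ("non-min", best)
    # Otherwise stay minimal (ASSUMPTION: shortest minimal queue wins).
    best = min(range(len(min_queues)), key=lambda i: min_queues[i])
    return ("min", best)
```

Because the decision is re-evaluated in every router on the path, a routine like this would run once per hop with that router's local queue lengths.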

Routing Fully Progressive Adaptive Routing (FPAR-Rules)

Routing
How does it handle remote congestion? ARN (Adaptive Routing Notification).
ARN messages are sent among the routers to notify them of distant congestion that can be resolved by a previous router on the route.
An ARN message is characterized by a destination address A and the incoming port P.
The receiving router excludes port P from the list of possible ports for packets destined to A, for a predefined time.
If P is the only port, packets are queued and an ARN message is sent to the previous router.
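The ARN handling described above can be modeled with a small table of temporarily excluded ports. This is a sketch under assumptions: the class `ArnRouter`, its method names, and the use of a plain numeric clock are all hypothetical, and "the only port" is interpreted as "the only port that is currently usable for destination A".

```python
class ArnRouter:
    """Toy model of ARN handling in one router."""

    def __init__(self, ports_for_dest, exclude_time):
        self.ports_for_dest = ports_for_dest   # dest -> list of egress ports
        self.exclude_time = exclude_time       # the predefined exclusion time
        self.excluded_until = {}               # (dest, port) -> expiry time

    def usable_ports(self, dest, now):
        """Ports for dest whose exclusion, if any, has expired."""
        return [port for port in self.ports_for_dest.get(dest, [])
                if self.excluded_until.get((dest, port), 0) <= now]

    def on_arn(self, dest, port, now):
        """Handle an ARN for dest received on port.

        Returns True when the ARN must be passed to the previous router
        (no alternative port is left, so packets are queued here).
        """
        alternatives = [q for q in self.usable_ports(dest, now) if q != port]
        if not alternatives:
            return True
        self.excluded_until[(dest, port)] = now + self.exclude_time
        return False
```

A router with two ports toward A absorbs the first ARN by steering traffic to the other port; a second ARN on that remaining port leaves no alternative and is propagated upstream.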

Analytical and Simulative Analysis
Analytical analysis of Dragonfly+, Dragonfly, 3-level Fat Tree with 2:1 blocking ratio, non-blocking 3-level Fat Tree, and Slim Fly, assuming a router radix of 36. Metrics:
Scalability: maximal number of hosts
Cost: number of hosts per router
Locality: full-bisection group size (number of hosts inside the group)
Network throughput
Number of VLs
Diameter and maximal assured route length

Analytical Analysis: Scalability
Fig A. Maximal network size in number of hosts vs. router radix (k).
Fig B. Group size in number of hosts vs. router radix (k). The Dragonfly+ and non-blocking Fat Tree curves coincide.

Analytical Analysis

Simulative Analysis (over an OMNeT++-based infrastructure)
Uniform random traffic (hosts inject packets to random destinations): DF+ network of 1296 hosts with k=36 and 4 groups. Each spine router of a group is connected by six parallel links to a spine router in each other group.
Permutation traffic (100 randomized permutations were simulated and the single permutation with the worst performance was selected): maximal Dragonfly+ network of k=8 radix routers, 272 hosts.
Speedup analysis: static, random, and adaptive routing schemes with permutation traffic of 8KB, 256KB, and 1MB message sizes.
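The figures quoted for the uniform-random setup can be cross-checked with a few lines of arithmetic. The helper name `df_plus_sim_config` is hypothetical, and the calculation assumes that each spine router devotes k/2 ports to global links and spreads them evenly over all other groups; the slide states only the resulting numbers.

```python
def df_plus_sim_config(k: int, groups: int) -> dict:
    """Host count and parallel global links for a small Dragonfly+ network."""
    if k <= 0 or k % 2:
        raise ValueError("router radix k must be a positive even integer")
    p = k // 2                       # hosts per leaf router
    global_ports_per_spine = k // 2  # ASSUMPTION: half of the spine's ports
    if not 2 <= groups <= global_ports_per_spine + 1:
        raise ValueError("each spine must be able to reach every other group")
    hosts = groups * p * p           # groups * k*k/4 hosts per group
    parallel, unused = divmod(global_ports_per_spine, groups - 1)
    return {"hosts": hosts,
            "parallel_links": parallel,
            "unused_global_ports": unused}


if __name__ == "__main__":
    print(df_plus_sim_config(36, 4))
```

With k=36 and 4 groups this gives 1296 hosts and six parallel links from each spine to each other group, in line with the configuration described above.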

Simulative Analysis: Uniform Random Traffic
Fig: End-to-end network latency vs. load with uniform random traffic.

Simulative Analysis: Permutation Traffic
Fig A: Speedup of the permutation pattern with various routing schemes and message sizes.
Fig B: Mean end-to-end network latency vs. load with permutation traffic with a message size of 1MB.

Conclusions
Presented a novel Fully Progressive Adaptive Routing technique.
Dragonfly+ is 4 times more scalable than Dragonfly at the same cost.
It provides the same or better throughput than equivalent Dragonfly and Fat Tree networks under various traffic patterns.