Generic and Automatic Address Configuration for Data Center Networks

Generic and Automatic Address Configuration for Data Center Networks
Kai Chen (1), Chuanxiong Guo (2), Haitao Wu (2), Jing Yuan (3), Zhenqian Feng (4), Yan Chen (1), Songwu Lu (5), Wenfei Wu (6)
(1) Northwestern University, (2) Microsoft Research Asia, (3) Tsinghua, (4) NUDT, (5) UCLA, (6) BUAA
SIGCOMM 2010, New Delhi, India

Motivation
- Address autoconfiguration is desirable in networked systems
  - Manual configuration is error-prone
  - 50%-80% of network outages are due to manual configuration
  - DHCP for layer-2 Ethernet autoconfiguration
- Address autoconfiguration in data centers (DC) has become a problem
  - Applications need locality information for computation
  - New DC designs encode topology information for routing
  - DHCP is not enough: it carries no such locality/topology information

Research Problem
- Given a new/generic DC, how to autoconfigure the addresses for all the devices in the network?
- DAC: data center address autoconfiguration

Outline
- Motivation
- Research Problem
- DAC
- Implementation and Experiments
- Simulations
- Conclusion

DAC Input
- Blueprint Graph (Gb)
  - A DC graph with logical IDs
  - Logical ID can be any format (example: 10.0.0.3)
  - Available earlier and can be automatically generated
- Physical Topology Graph (Gp)
  - A DC graph with device IDs
  - Device ID can be a MAC address (example: 00:19:B9:FA:88:E2)
  - Not available until the DC is built and the topology is collected

DAC System Framework
- Malfunction Detection
- Physical Topology Collection
- Device-to-logical ID Mapping
- Logical ID Dissemination

Two Main Challenges
- Challenge 1: Device-to-logical ID Mapping
  - Assign a logical ID to a device, preserving the topological relationship between devices
- Challenge 2: Malfunction Detection
  - Detect the malfunctioning devices if the physical topology is not the same as the blueprint (NP-complete and even APX-hard)

Roadmap
- Malfunction Detection
- Physical Topology Collection
- Device-to-logical ID Mapping
- Logical ID Dissemination

Device-to-logical ID Mapping
- How to preserve the topological relationship?
  - Abstract DAC mapping into the Graph Isomorphism (GI) problem
  - The GI problem is hard: its complexity (P or NPC) is unknown
- Introduce O2: a one-to-one mapping for DAC
  - O2 Base Algorithm and O2 Optimization Algorithm
  - Adopt and improve techniques from graph theory
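
To make the graph-isomorphism framing concrete, the following is a minimal Python sketch of what "preserving the topological relationship" means for a candidate mapping. The adjacency-dict representation, the function name `preserves_topology`, and the toy 3-node graphs are illustrative assumptions, not taken from the talk or from the DAC implementation.

```python
def preserves_topology(gp, gb, mapping):
    """Return True if `mapping` (device ID -> logical ID) is a one-to-one
    mapping from Gp onto Gb that sends neighbours to neighbours."""
    # The mapping must cover every device and every logical ID exactly once.
    if set(mapping) != set(gp) or set(mapping.values()) != set(gb):
        return False
    if len(set(mapping.values())) != len(mapping):
        return False
    # Each device's neighbourhood must map exactly onto the neighbourhood
    # of the logical ID it is assigned to.
    for device, neighbours in gp.items():
        if {mapping[n] for n in neighbours} != set(gb[mapping[device]]):
            return False
    return True


# Hypothetical toy graphs: a 3-node path in both blueprint and physical topology.
gb = {"l1": ["l2"], "l2": ["l1", "l3"], "l3": ["l2"]}
gp = {"d1": ["d2"], "d2": ["d1", "d3"], "d3": ["d2"]}

good = {"d1": "l1", "d2": "l2", "d3": "l3"}  # endpoints to endpoints
bad = {"d1": "l2", "d2": "l1", "d3": "l3"}   # puts an endpoint in the middle

print(preserves_topology(gp, gb, good))
print(preserves_topology(gp, gb, bad))
```

Checking a given mapping is cheap; the hard part, which O2 addresses, is finding such a mapping among the many possible assignments.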

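O2 Base Algorithm (first attempt: map l1 to d1)
- Initial partitions
  - Gb: {l1 l2 l3 l4 l5 l6 l7 l8}
  - Gp: {d1 d2 d3 d4 d5 d6 d7 d8}
- Decomposition
  - Gb: {l1} {l2 l3 l4 l5 l6 l7 l8}
  - Gp: {d1} {d2 d3 d4 d5 d6 d7 d8}
- Refinement
  - Gb: {l1} {l5} {l2 l3 l4 l6 l7 l8}
  - Gp: {d1} {d2 d3 d5 d7} {d4 d6 d8}
- The refined cells have different sizes in Gb and Gp, so l1 cannot be mapped to d1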

O2 Base Algorithm (next attempt: map l5 to d1)
- Initial partitions
  - Gb: {l1 l2 l3 l4 l5 l6 l7 l8}
  - Gp: {d1 d2 d3 d4 d5 d6 d7 d8}
- Decomposition
  - Gb: {l5} {l1 l2 l3 l4 l6 l7 l8}
  - Gp: {d1} {d2 d3 d4 d5 d6 d7 d8}
- Refinement
  - Gb: {l5} {l1 l2 l7 l8} {l3 l4 l6}
  - Gp: {d1} {d2 d3 d5 d7} {d4 d6 d8}
- Refinement
  - Gb: {l5} {l1 l2 l7 l8} {l6} {l3 l4}
  - Gp: {d1} {d2 d3 d5 d7} {d6} {d4 d8}

O2 Base Algorithm (continued)
- Refinement
  - Gb: {l5} {l6} {l1 l2} {l7 l8} {l3 l4}
  - Gp: {d1} {d6} {d2 d7} {d3 d5} {d4 d8}
- Decomposition
  - Gb: {l5} {l6} {l1} {l2} {l7 l8} {l3 l4}
  - Gp: {d1} {d6} {d2} {d7} {d3 d5} {d4 d8}
- Decomposition & Refinement
  - Gb: {l5} {l6} {l1} {l2} {l7} {l8} {l3} {l4}
  - Gp: {d1} {d6} {d2} {d7} {d3} {d5} {d4} {d8}
- Every cell is now a single node, so the mapping is read off cell by cell (l5 to d1, l6 to d6, and so on)
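
The Refinement step in the walk-through can be sketched as generic partition refinement: a cell is split whenever its nodes differ in how many neighbours they have inside some other cell. The Python below is a simplified illustration under that assumption, not the authors' O2 code; the function name `refine` and the 4-node path graph are hypothetical.

```python
def refine(adj, partition):
    """Repeatedly split cells until all nodes in a cell have the same number
    of neighbours in every cell of the partition."""
    partition = [list(cell) for cell in partition]
    changed = True
    while changed:
        changed = False
        for splitter in list(partition):
            splitter_set = set(splitter)
            new_partition = []
            for cell in partition:
                # Group the cell's nodes by their neighbour count in the splitter.
                groups = {}
                for v in cell:
                    count = sum(1 for w in adj[v] if w in splitter_set)
                    groups.setdefault(count, []).append(v)
                if len(groups) > 1:
                    changed = True
                # Fixed ordering so that two isomorphic graphs refine in step.
                for count in sorted(groups, reverse=True):
                    new_partition.append(groups[count])
            partition = new_partition
            if changed:
                break  # restart with the finer partition
    return partition


# Hypothetical example: a path a - b - c - d.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}

# "Decomposition" singles out node a; refinement then separates the rest.
print(refine(path, [["a"], ["b", "c", "d"]]))
```

In this toy case a single decomposition is enough: b is split off as the only neighbour of a, and b then separates c from d. On symmetric topologies refinement stalls with multi-node cells, which is when another decomposition is needed.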

O2 Base Algorithm
- The O2 base algorithm is very slow because of three problems:
  - P1: Iterative splitting in Refinement: it tries to use each cell to split every other cell iteratively (Gp: π1 π2 π3 ... πn-1 πn)
  - P2: Iterative mapping in Decomposition: when the current mapping fails, it iteratively selects the next node as a candidate for mapping
  - P3: Random selection of mapping candidate: no explicit hint for how to select a candidate for mapping

O2 Optimization Algorithm
- Heuristics based on DC topology features
  - Sparse => Selective Splitting (for Problem 1)
    - R1: A cell cannot split another cell that is disjoint from it.
  - Symmetric => Candidate Filtering via Orbit (for Problem 2)
    - R2: If u in Gb cannot be mapped to v in Gp, then no node in the same orbit as u can be mapped to v either.
  - Asymmetric => Candidate Selection via SPLD (Shortest Path Length Distribution) (for Problem 3)
    - R3: Two nodes u in Gb and v in Gp cannot be mapped to each other if they have different SPLDs.
- We propose the last one and adopt the first two from graph theory
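
Rule R3 can be illustrated with a short Python sketch: compute each node's shortest path length distribution with a breadth-first search and keep only candidates whose distribution matches. Representing the SPLD as a tuple of node counts per hop distance, and the helper names `spld` and `candidates`, are assumptions made for this example.

```python
from collections import Counter, deque


def spld(adj, src):
    """Shortest path length distribution of `src`: element i of the tuple is
    the number of nodes at hop distance i + 1 from `src`."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    counts = Counter(d for d in dist.values() if d > 0)
    return tuple(counts[k] for k in sorted(counts))


def candidates(gb, gp, u):
    """Devices in Gp that share u's SPLD, i.e. the only ones R3 allows."""
    target = spld(gb, u)
    return [v for v in gp if spld(gp, v) == target]


# Hypothetical toy graphs: a 3-node path in both blueprint and physical topology.
gb = {"l1": ["l2"], "l2": ["l1", "l3"], "l3": ["l2"]}
gp = {"d1": ["d2"], "d2": ["d1", "d3"], "d3": ["d2"]}

print(spld(gb, "l1"))
print(candidates(gb, gp, "l2"))
```

The middle node l2 has two nodes at distance 1 and none further away, so only d2 survives the filter; the endpoints keep two candidates each, which later decomposition and refinement must resolve.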

Speed of O2 Mapping
[Figure: O2 mapping time; labeled values: 12.4 hours and 8.9 seconds]

Malfunction Detection
- Types of Malfunctions
  - Node failure, link failure, miswiring
- Effects of Malfunctions
  - O2 cannot find a device-to-logical ID mapping
- Our Goal
  - Detect malfunctioning devices
- Problem Complexity
  - An ideal solution
    - Find the Maximum Common Subgraph (MCS) between Gb and Gp, say Gmcs
    - Remove Gmcs from Gp => the rest are malfunctions
  - MCS is NP-complete and even APX-hard

Practical Solution
- Observations
  - Most node/link failures and miswirings cause a node degree change
  - Special, rare miswirings happen without a degree change
- Our Idea
  - Degree change case: exploit the degree regularity in DC
    - Devices in a DC have regular degrees (common sense)
  - No degree change case: probe sub-graphs derived from anchor points, and correlate the miswired devices using majority voting
    - Select anchor point pairs from the two graphs
    - Probe sub-graphs iteratively; stop when the k-hop subgraphs are isomorphic but the (k+1)-hop subgraphs are not, then increase the counters for the k- and (k+1)-hop nodes
    - Output the node counter list: high counter => high probability of being miswired
- [Figure: sub-graphs around an anchor point pair are isomorphic at hops 1, 2, ..., k and non-isomorphic at hop k+1]
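
For the degree change case, one plausible reading is that any device whose degree never occurs in the blueprint is suspect. The Python sketch below illustrates only that reading; the function name `degree_suspects`, the rule itself, and the 4-node ring example are assumptions, and the talk does not spell out the exact check that DAC performs.

```python
def degree_suspects(gb, gp):
    """Flag devices in Gp whose degree does not occur anywhere in Gb.
    Assumption: the blueprint's degrees are the only legal ones."""
    expected_degrees = {len(neighbours) for neighbours in gb.values()}
    return sorted(
        device
        for device, neighbours in gp.items()
        if len(neighbours) not in expected_degrees
    )


# Hypothetical blueprint: a 4-node ring, so every node has degree 2.
gb = {
    "l1": ["l2", "l4"],
    "l2": ["l1", "l3"],
    "l3": ["l2", "l4"],
    "l4": ["l3", "l1"],
}

# Physical topology with the d1 - d2 link missing.
gp = {
    "d1": ["d4"],
    "d2": ["d3"],
    "d3": ["d2", "d4"],
    "d4": ["d3", "d1"],
}

print(degree_suspects(gb, gp))
```

A miswiring that swaps two links while leaving every degree unchanged would pass this check, which is why the slide falls back on anchor point probing for that case.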

Simulations on Miswiring Detection
- Over data centers with tens of thousands of devices
- With 1.5% of nodes as anchor points, all hardest-to-detect miswirings are identified

Basic DAC Protocols
- CBP: Communication Channel Building Protocol
  - Top-down, from root to leaves
- PCP: Physical Topology Collection Protocol
  - Bottom-up, from leaves to root
- LDP: Logical ID Dissemination Protocol
- DAC manager:
  - Handles all the intelligence
  - Can be any server in the network
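
The top-down and bottom-up directions of CBP and PCP can be pictured with an in-memory Python sketch: a breadth-first search from the DAC manager builds a tree, and link reports are then merged from the leaves back to the root. This models only the direction of information flow; the function names, the data layout, and the 4-node topology are hypothetical, and no real protocol messages are represented.

```python
from collections import deque


def build_tree(adj, root):
    """Top-down (CBP-like): BFS from the root, recording each node's parent."""
    parent = {root: None}
    order = [root]
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                order.append(w)
                queue.append(w)
    return parent, order


def collect_topology(adj, parent, order):
    """Bottom-up (PCP-like): each node passes its own links plus everything
    reported by its children up to its parent."""
    reports = {v: {frozenset((v, w)) for w in adj[v]} for v in order}
    for v in reversed(order):  # children always come after parents in BFS order
        if parent[v] is not None:
            reports[parent[v]] |= reports[v]
    return reports[order[0]]


# Hypothetical topology; "m" plays the role of the DAC manager.
adj = {
    "m": ["a", "b"],
    "a": ["m", "b"],
    "b": ["m", "a", "c"],
    "c": ["b"],
}

parent, order = build_tree(adj, "m")
edges = collect_topology(adj, parent, order)
print(parent)
print(len(edges))
```

The edge set gathered at the root corresponds to the physical topology graph Gp that the manager would then feed into the mapping step.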

Implementation and Experiments
- Over a BCube(8,1) network with 64 servers
- Measured phases: Communication Channel Building (CCB), transition time, Physical Topology Collection (TC), Device-to-logical ID Mapping, Logical ID Dissemination (LD)
- The total time used: 275 milliseconds

Simulations
- Over large-scale data centers (times in milliseconds)
- 46 seconds for the DCell(6, 3) with 3.8+ million devices

Summary
- DAC: address autoconfiguration for generic data center networks, especially when the address is topology-aware
- Graph isomorphism for address configuration
  - 275 ms for a 64-server BCube, and 46 s for a DCell with 3.8+ million devices
- Anchor point probing for malfunction detection
  - With 1.5% of nodes as anchor points, identifies all hardest-to-detect miswirings
- DAC is a small step towards the more ambitious goal of automanagement of whole data centers

Q & A? Thanks!