
Meg Walraed-Sullivan, University of California, San Diego
With: Radhika Niranjan Mysore, Malveeka Tewari, Ying Zhang (Ericsson Research), Keith Marzullo, Amin Vahdat

• Group of entities that want to communicate
  ◦ Need a way to refer to one another
• Historically, a common problem
  ◦ Phone system
  ◦ Snail mail
  ◦ Internet
  ◦ Wireless networks
  ◦ E.g. a laptop has two labels (MAC address, IP address)
• Labeling in data center networks is unique

• Interconnect of switches connecting hosts
• Massive in scale: 10k switches, 100k hosts, millions of VMs

• Designed with regular, symmetric structure
  ◦ Often multi-rooted trees (e.g. fat tree)
• Reality doesn’t always match the blueprint
  ◦ Components and partitions are added/removed
  ◦ Links/switches/hosts fail and recover
  ◦ Cables are connected incorrectly

• What gets labeled in a data center network?
  ◦ Switch ports
  ◦ Host NICs
  ◦ Virtual machines at hosts
  ◦ Etc.

• Flat addressing
  ◦ E.g. MAC addresses (Layer 2)
  ✓ Unique
  ✓ Automatic
  ✗ Scalability:
    ▪ Switches have limited forwarding entries (say, 10k)
    ▪ # labels in forwarding tables = # nodes

• Hierarchical addressing
  ◦ E.g. IP addresses (Layer 3) with DHCP
  ✓ Scalable forwarding state
    ▪ # labels in forwarding tables < # nodes
  ✗ Relies on manual configuration:
    ▪ Unrealistic at scale

• PortLand’s LDP: Location Discovery Protocol
• DAC: Data center Address Configuration
• Manual configuration via blueprints
• Rely on centralized control
  ◦ Cannot directly connect controller to all nodes
  ◦ Requires separate out-of-band control network or flooding techniques

References:
PortLand: A Scalable Fault-Tolerant Layer 2 Data Center Network Fabric. Niranjan Mysore et al. SIGCOMM 2009
Generic and Automatic Address Configuration for Data Center Networks. Chen et al. SIGCOMM 2010

[Figure: label assignment plotted by network size and management overhead, marking Ethernet, IP, and the target location. Annotations: hardware limit (need # labels < # nodes); flat labels vs. structured labels; automation.]

• Less management means more automation
• Structured labels encode topology
  ∴ Labels change with topology dynamics
[Figure: network size vs. management overhead, marking Ethernet, IP, and the target.]

• ALIAS: topology discovery and label assignment in hierarchical networks
• Approach: automatic, decentralized assignment of hierarchical labels
• Benefits:
  ◦ Scalability (structured labels, shared label prefixes)
  ◦ Low management overhead (automation)
  ◦ No out-of-band control network (decentralized)

ALIAS: topology discovery and label assignment in hierarchical networks
• Systems (implementation/evaluation)
  ◦ ALIAS: Scalable, Decentralized Label Assignment for Data Centers. M. Walraed-Sullivan, R. Niranjan Mysore, M. Tewari, Y. Zhang, K. Marzullo, A. Vahdat. SOCC 2011
• Theory (proof/protocol derivation)
  ◦ Brief Announcement: A Randomized Algorithm for Label Assignment in Dynamic Networks. M. Walraed-Sullivan, R. Niranjan Mysore, K. Marzullo, A. Vahdat. DISC 2011

• Multi-rooted trees
  ◦ Multi-stage switch fabric connecting hosts
  ◦ Indirect hierarchy
  ◦ May allow peer links
• Labels ultimately used for communication
  ◦ Multiple paths between nodes

• Switches and hosts have labels
  ◦ Labels encode (shortest physical) paths from the root of the hierarchy to a switch/host
  ◦ Each switch/host may have multiple labels
  ◦ Labels encode location and expose path multiplicity
[Figure: example multi-rooted tree with nodes a through h. h’s labels: a.d.g.h, b.e.g.h, b.f.g.h, c.f.g.h. g’s labels: a.d.g, b.e.g, b.f.g, c.f.g.]
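
The idea that a label is a root-to-node path can be sketched in a few lines. This is only an illustration: the `CHILDREN` map is a topology inferred from the labels listed for g and h, and the dotted label format and the name `labels_for` are assumptions, not part of ALIAS.

```python
# Hypothetical topology inferred from the labels shown for g and h.
# Keys are nodes; values are their children one level down.
CHILDREN = {
    "a": ["d"],
    "b": ["e", "f"],
    "c": ["f"],
    "d": ["g"],
    "e": ["g"],
    "f": ["g"],
    "g": ["h"],
}
ROOTS = ["a", "b", "c"]


def labels_for(target):
    """Return every root-to-target path, written as a dotted label."""
    found = []

    def walk(node, path):
        path = path + [node]
        if node == target:
            found.append(".".join(path))
            return
        for child in CHILDREN.get(node, []):
            walk(child, path)

    for root in ROOTS:
        walk(root, [])
    return found


print(labels_for("h"))
print(labels_for("g"))
```

Each distinct path yields one label, which is why a node reachable from several roots ends up with several labels.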

• Hierarchical routing leverages this info
  ◦ Push packets upward, downward path is explicit
[Figure: example multi-rooted tree with nodes a through h. h’s labels: a.d.g.h, b.e.g.h, b.f.g.h, c.f.g.h. g’s labels: a.d.g, b.e.g, b.f.g, c.f.g.]

• Continuously:
  1. Overlay appropriate hierarchy on network fabric
  2. Group sets of related switches into hypernodes
  3. Assign coordinates to switches
  4. Combine coordinates to form labels
• Periodic state exchange between immediate neighbors

• Switches are at levels 1 through n
• Hosts are at level 0
• Only requires 1 host to begin
[Figure: hosts at Level 0; switches at Levels 1 to 3.]
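
A rough way to picture the level overlay is a breadth-first search outward from the hosts. The sketch below is a centralized simplification, not the distributed protocol: it assumes every Level 1 switch has at least one attached host, so it does not model bootstrapping from a single host, and the function name `assign_levels` is hypothetical.

```python
from collections import deque


def assign_levels(links, hosts):
    """Assign each node its hop distance to the nearest host.

    links: iterable of (u, v) pairs, treated as undirected.
    hosts: iterable of host names; hosts get level 0.
    """
    neighbors = {}
    for u, v in links:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    level = {h: 0 for h in hosts}
    queue = deque(level)
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt not in level:
                level[nxt] = level[node] + 1
                queue.append(nxt)
    return level


example_links = [("h1", "e1"), ("h2", "e2"), ("e1", "a1"), ("e2", "a1"), ("a1", "c1")]
print(assign_levels(example_links, ["h1", "h2"]))
```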

• Continuously:
  1. Overlay appropriate hierarchy on network fabric
  2. Group sets of related switches into hypernodes
  3. Assign coordinates to switches
  4. Combine coordinates to form labels

• Labels encode paths from a root to a host
  ◦ Multiple paths lead to multiple labels per host
• Aggregate for label compaction
  ◦ Locate switches that reach same hosts
[Figure: switches at Levels 1 to 4 (hosts omitted for space).]

• Hypernode (HN): maximal set of switches that connect to same HNs below (via any member)
• Hypernode members are indistinguishable on downward path from root
• Base case:
  ◦ Each Level 1 switch is in its own hypernode
[Figure: switches at Levels 1 to 4 grouped into hypernodes.]
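
The hypernode definition can be read as a bottom-up grouping. The following is a minimal centralized sketch under stated assumptions: levels are already known, links only go between adjacent levels, and the names `hypernodes_by_level`, `example_levels`, and `example_below` are hypothetical. The real protocol computes this in a distributed way.

```python
def hypernodes_by_level(levels, below):
    """Group switches into hypernodes, bottom-up.

    levels: list of switch-name lists; levels[0] holds the Level 1 switches.
    below:  dict mapping a switch to the switches it links to one level down.
    Returns one list of hypernodes (sorted member lists) per level.
    """
    hn_of = {}  # switch -> frozenset of the members of its hypernode
    result = []

    # Base case: each Level 1 switch is in its own hypernode.
    for sw in levels[0]:
        hn_of[sw] = frozenset([sw])
    result.append(sorted([sw] for sw in levels[0]))

    for level in levels[1:]:
        groups = {}
        for sw in level:
            # The set of hypernodes this switch reaches, via any member.
            reached = frozenset(hn_of[child] for child in below.get(sw, ()))
            groups.setdefault(reached, []).append(sw)
        for members in groups.values():
            for sw in members:
                hn_of[sw] = frozenset(members)
        result.append(sorted(sorted(m) for m in groups.values()))
    return result


example_levels = [["e1", "e2", "e3", "e4"], ["a1", "a2", "a3", "a4"], ["c1", "c2"]]
example_below = {
    "a1": ["e1", "e2"], "a2": ["e1", "e2"],
    "a3": ["e3", "e4"], "a4": ["e3", "e4"],
    "c1": ["a1", "a3"], "c2": ["a2", "a4"],
}
print(hypernodes_by_level(example_levels, example_below))
```

In the example, c1 and c2 link to different Level 2 switches but reach the same two hypernodes, so they fall into one hypernode; this is the "via any member" clause at work.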

• Continuously:
  1. Overlay appropriate hierarchy on network fabric
  2. Group sets of related switches into hypernodes
  3. Assign coordinates to switches
  4. Combine coordinates to form labels

• Coordinates combine to make up labels
• Labels used to route downwards
• Switches in a HN share a coordinate
• HNs with a parent in common need distinct coordinates

• Can we make this problem simpler?
• Switches in a HN share a coordinate
• HNs with a parent in common need distinct coordinates
[Figure: hypernodes marked as choosers and their parent switches marked as deciders.]

• To assign coordinates to hypernodes:
  a. Define abstraction (choosers/deciders)
  b. Design solution for abstraction
  c. Apply solution throughout multi-rooted tree
[Figure: choosers and deciders.]

• Label Selection Problem (LSP)
  ◦ Chooser processes connected to decider processes
  ◦ In a bipartite graph
[Figure: bipartite graph of choosers c1 to c6 (hypernodes) and deciders d1 to d4 (parent switches).]

• Label Selection Problem goals:
  ◦ All choosers eventually select coordinates
  ◦ Choosers sharing a decider have distinct coordinates
[Figure: the bipartite graph of choosers c1 to c6 and deciders d1 to d4 with example coordinates (x, y, z, q). Annotations: multiple instances of LSP; per-instance coordinates.]
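
The two goals translate directly into a check on a finished assignment. This is a minimal sketch for a single LSP instance; the function name `lsp_satisfied` and the data layout are assumptions, and the "eventually" part of the first goal is a liveness property that a one-shot check cannot capture.

```python
def lsp_satisfied(edges, coords):
    """Check one LSP instance.

    edges:  iterable of (chooser, decider) pairs in the bipartite graph.
    coords: dict mapping a chooser to its selected coordinate (or None).
    Returns True if every connected chooser has a coordinate and no two
    choosers sharing a decider hold the same coordinate.
    """
    seen = {}  # decider -> {coordinate: chooser}
    for chooser, decider in edges:
        coord = coords.get(chooser)
        if coord is None:
            return False  # a chooser has not selected yet
        holders = seen.setdefault(decider, {})
        if holders.get(coord, chooser) != chooser:
            return False  # two choosers collide at this decider
        holders[coord] = chooser
    return True


example_edges = [("c1", "d1"), ("c2", "d1"), ("c2", "d2"), ("c3", "d2")]
print(lsp_satisfied(example_edges, {"c1": "x", "c2": "y", "c3": "x"}))
print(lsp_satisfied(example_edges, {"c1": "x", "c2": "x", "c3": "y"}))
```

Note that c1 and c3 may both hold x in the first call because they share no decider.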

• Label Selection Problem (LSP)
  ◦ Difficulty: connections can change over time
[Figure: the bipartite graph of choosers c1 to c6 and deciders d1 to d4 after a connectivity change, with updated example coordinates.]

• Decider/Chooser Protocol (DCP)
  ◦ Distributed algorithm that implements LSP
  ◦ Las Vegas-style randomized algorithm
    ▪ Probabilistically fast, guaranteed to be correct
  ◦ Practical: low message overhead, quick convergence
  ◦ Reacts quickly and locally to topology dynamics
    ▪ Transient startup conditions
    ▪ Miswirings
    ▪ Failure/recovery, connectivity changes

• Algorithm:
  ◦ Choosers select coordinates randomly and send to deciders
  ◦ Deciders reply with [yes] or [no+hints]
  ◦ One no → reselect; all yeses → finished
[Figure: choosers c1 and c2 propose coordinates to deciders d1 and d2 (messages "c1: x?" and "c2: y?"). Each decider records c1: x and c2: y and replies yes. Result: c1 has coordinate x, c2 has coordinate y.]
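
The select, reply, and reselect loop can be simulated in a few lines. This is a simplified, sequential sketch and not the actual DCP: real choosers and deciders exchange messages asynchronously, whereas here each chooser proposes in turn and deciders answer immediately. The name `run_dcp`, the hint format (the set of coordinates already granted at a refusing decider), and the `max_rounds` cutoff are all assumptions made for illustration.

```python
import random


def run_dcp(edges, domain, seed=0, max_rounds=1000):
    """Simulate random coordinate selection with yes / no+hints replies.

    edges:  list of (chooser, decider) pairs.
    domain: sequence of candidate coordinates.
    Returns a dict mapping each chooser to its final coordinate.
    """
    rng = random.Random(seed)
    choosers = sorted({c for c, _ in edges})
    deciders_of = {c: sorted({d for cc, d in edges if cc == c}) for c in choosers}
    granted = {d: {} for _, d in edges}   # decider -> {coordinate: chooser}
    hints = {c: set() for c in choosers}  # coordinates a chooser was told to avoid
    final = {}

    for _ in range(max_rounds):
        for c in choosers:
            if c in final:
                continue
            # Select randomly, steering away from hinted coordinates.
            options = [x for x in domain if x not in hints[c]] or list(domain)
            pick = rng.choice(options)
            all_yes = True
            for d in deciders_of[c]:
                if pick in granted[d]:
                    # Reply is "no", plus hints about what is taken here.
                    all_yes = False
                    hints[c] |= set(granted[d])
            if all_yes:
                # All yeses: the chooser is finished.
                for d in deciders_of[c]:
                    granted[d][pick] = c
                final[c] = pick
        if len(final) == len(choosers):
            return final
    raise RuntimeError("no assignment found within max_rounds")


example_edges = [("c1", "d1"), ("c1", "d2"), ("c2", "d1"), ("c2", "d2")]
print(run_dcp(example_edges, ["x", "y", "z"], seed=1))
```

The specific coordinates depend on the seed; what holds for any successful run is that choosers sharing a decider end up with distinct coordinates.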

• Hypernodes are choosers for their coordinates
• Switches are deciders for neighbors below
[Figure: example LSP instances in the tree: 2 choosers with 3 deciders, 2 choosers with 1 decider, and 3 choosers with 3 deciders.]

• DCP assigns level 1 coordinates
  ◦ 3 choosers
  ◦ 3 deciders

• DCP for upper levels:
  ◦ HN switches cooperate (per-parent restrictions)
  ◦ Not directly connected
  ◦ Communicate via shared L1 switch
  ◦ “Distributed-Chooser DCP”
[Figure: 2 choosers and 3 deciders.]

• Continuously:
  1. Overlay appropriate hierarchy on network fabric
  2. Group related switches into hypernodes
  3. Assign per-hypernode coordinates
  4. Combine coordinates to form labels

• Concatenate coordinates from root downward
  (For clarity, assume labels are the same across instances of LSP)
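
Concatenation, and the compaction that shared hypernode coordinates give, can be sketched as follows. The assumptions here are loud ones: every node on a path is given a coordinate, labels are dot-separated, and the names `labels_from_paths`, `example_paths`, and `example_coords` are hypothetical. The actual ALIAS label layout may differ.

```python
def labels_from_paths(paths, coord_of):
    """Build labels by concatenating coordinates from the root downward.

    paths:    list of node-name lists, each ordered from the top down.
    coord_of: dict mapping a node to its coordinate.
    Paths that differ only in which hypernode member they pass through
    collapse to the same label, because members share a coordinate.
    """
    return sorted({".".join(coord_of[node] for node in path) for path in paths})


# a1 and a2 are assumed to be in one hypernode, so they share coordinate "7".
example_paths = [["a1", "e1", "h"], ["a2", "e1", "h"]]
example_coords = {"a1": "7", "a2": "7", "e1": "3", "h": "0"}
print(labels_from_paths(example_paths, example_coords))
```

Two physical paths produce a single label here, which is the compaction effect of grouping switches into hypernodes.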

• Hypernodes create clusters of hosts that share label prefixes

• Topology changes may cause paths to change
• Which causes labels to change
• Evaluation:
  ◦ Quick convergence
  ◦ Localized effects

• Many overlying communication protocols
  ◦ Hierarchical-style forwarding makes most sense
• E.g. MAC address rewriting
  ◦ At sender’s ingress switch: dest. MAC → ALIAS label
  ◦ At recipient’s egress switch: ALIAS label → dest. MAC
  ◦ Up*/down* forwarding (AutoNet, SOSP91)
  ◦ Proxy ARP for resolution
• E.g. encapsulation, tunneling
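
The MAC rewriting example amounts to two table lookups, one at each edge. The sketch below models frames as plain dictionaries and uses made-up addresses and labels; the function names and the frame layout are hypothetical, and how the mapping tables get populated (the slide mentions proxy ARP) is left out.

```python
def rewrite_at_ingress(frame, mac_to_label):
    """At the sender's ingress switch: replace the destination MAC with a label."""
    out = dict(frame)
    out["dst"] = mac_to_label[frame["dst"]]
    return out


def rewrite_at_egress(frame, label_to_mac):
    """At the recipient's egress switch: restore the destination MAC."""
    out = dict(frame)
    out["dst"] = label_to_mac[frame["dst"]]
    return out


# Made-up example values for illustration only.
mac_to_label = {"00:11:22:33:44:55": "7.3.0"}
label_to_mac = {label: mac for mac, label in mac_to_label.items()}

sent = {"dst": "00:11:22:33:44:55", "payload": "hello"}
in_fabric = rewrite_at_ingress(sent, mac_to_label)
delivered = rewrite_at_egress(in_fabric, label_to_mac)
print(in_fabric["dst"], delivered["dst"])
```

Inside the fabric, switches forward on the label only, so the host's real MAC never needs an entry in core forwarding tables.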

• “Standard” systems approach
  ◦ Implementation, experimentation, deployment
• Theoretical approach
  ◦ Proof, formalization, verification via model checking
• Goal:
  ◦ Verify correctness, feasibility
  ◦ Assess scalability

• Does ALIAS assign labels correctly?
• Do labels enable scalable communication?
  ✓ Implemented in Mace
  ✓ Used Mace model checker to verify
    ▪ Label assignment: levels, hypernodes, coordinates
    ▪ Sample overlying communication: pairs of nodes can communicate when physically connected
  ✓ Ported to small testbed with existing communication protocol for realistic evaluation

• Does DCP solve the Label Selection Problem?
  ✓ Proof that DCP implements LSP
  ✓ Implemented in Mace and model checked all versions of DCP
• Is LSP a reasonable abstraction?
  ✓ Formal protocol derivation from basic DCP → ALIAS

• Is overhead (storage, control) acceptable?
  ✓ Resource requirements of algorithm
    ▪ Memory: ~KBs for 10k host network
    ▪ Control overhead: agility/overhead tradeoff
  ✓ Memory usage on testbed deployment (<150B)
[Table: control overhead by ports per switch, hosts, and cycle time (ms), reported in Mbps and as % of a 10G link. Most cells were lost in extraction; surviving values: 64 ports/switch with 65k hosts, and overheads of 0.3%, 0.06%, 0.25%, and 0.12%.]

• Is the protocol practical in convergence time?
  ✓ DCP: used Mace simulator to verify that “probabilistically fast” is quite fast in practice
  ✓ Measured convergence on testbed deployment
    ▪ On startup
    ▪ After failure (speed and locality)
  ✓ Used Mace model checker to verify locality of failure reactions for larger networks

• Does ALIAS scale to data center sizes?
  ✓ Used Mace model checker to verify labels and communication for larger networks than testbed
  ✓ Wrote simulation code to analyze network behavior for enormous networks

[Table: forwarding table entries by topology (levels, ports, % fully provisioned, servers), comparing ALIAS with other schemes (e.g. MAC; e.g. IP, LDP/DAC). Numeric values were lost in extraction.]

• Scale and complexity of data center networks make the labeling problem unique
• ALIAS enables scalable data center communication by:
  ◦ Using a distributed approach
  ◦ Leveraging hierarchy to form topologically significant labels
  ◦ Eliminating manual configuration
