Revisiting Ethernet: Plug-and-play made scalable and efficient Changhoon Kim, and Jennifer Rexford Princeton University.

Slides:

Advertisements

Similar presentations

Shortest Path Bridging IEEE 802

Advertisements

Communication Networks Recitation 3 Bridges & Spanning trees.

Logically Centralized Control Class 2. Types of Networks ISP Networks – Entity only owns the switches – Throughput: 100GB-10TB – Heterogeneous devices:

IP Mobility Support Basic idea of IP mobility management

Mobility Jennifer Rexford COS 461: Computer Networks Lectures: MW 10-10:50am in Architecture N101

Multi-Layer Switching Layers 1, 2, and 3. Cisco Hierarchical Model Access Layer –Workgroup –Access layer aggregation and L3/L4 services Distribution Layer.

Scalable Flow-Based Networking with DIFANE 1 Minlan Yu Princeton University Joint work with Mike Freedman, Jennifer Rexford and Jia Wang.

Floodless in SEATTLE: A Scalable Ethernet Architecture for Large Enterprises Chang Kim, and Jennifer Rexford Princeton.

Revisiting Ethernet: Plug-and-play made scalable and efficient Changhoon Kim and Jennifer Rexford Princeton University.

Projects Related to Coronet Jennifer Rexford Princeton University

5/31/05CS118/Spring051 twisted pair hub 10BaseT, 100BaseT, hub r T= Twisted pair (copper wire) r Nodes connected to a hub, 100m max distance r Hub: physical.

Oct 21, 2004CS573: Network Protocols and Standards1 IP: Addressing, ARP, Routing Network Protocols and Standards Autumn

1 Relates to Lab 4. This module covers link state routing and the Open Shortest Path First (OSPF) routing protocol. Dynamic Routing Protocols II OSPF.

1 Network Layer: Host-to-Host Communication. 2 Network Layer: Motivation Can we built a global network such as Internet by extending LAN segments using.

Tesseract A 4D Network Control Plane

COS 461: Computer Networks

A Scalable, Commodity Data Center Network Architecture.

Jennifer Rexford Princeton University MW 11:00am-12:20pm SDN Software Stack COS 597E: Software Defined Networking.

1 LAN switching and Bridges Relates to Lab 6. Covers interconnection devices (at different layers) and the difference between LAN switching (bridging)

BUFFALO: Bloom Filter Forwarding Architecture for Large Organizations Minlan Yu Princeton University Joint work with Alex Fabrikant,

1 Relates to Lab 4. This module covers link state routing and the Open Shortest Path First (OSPF) routing protocol. Dynamic Routing Protocols II OSPF.

CN2668 Routers and Switches Kemtis Kunanuraksapong MSIS with Distinction MCTS, MCDST, MCP, A+

Semester 1 Module 8 Ethernet Switching Andres, Wen-Yuan Liao Department of Computer Science and Engineering De Lin Institute of Technology

Network Redundancy Multiple paths may exist between systems. Redundancy is not a requirement of a packet switching network. Redundancy was part of the.

Chapter 4: Managing LAN Traffic

Itrat Rasool Quadri ST ID COE-543 Wireless and Mobile Networks

© Janice Regan, CMPT 128, CMPT 371 Data Communications and Networking Multicast routing.

Homework Assignment #1 1. Homework Assignment Part 1: LAN setup –All nodes are hosts (including middle nodes) –Each link is its own LAN, with its own.

TRansparent Interconnection of Lots of Links (TRILL) March 11 th 2010 David Bond University of New Hampshire: InterOperability.

1 Computer Communication & Networks Lecture 22 Network Layer: Delivery, Forwarding, Routing (contd.)

University of the Western Cape Chapter 11: Routing Aleksandar Radovanovic.

1 CS 4396 Computer Networks Lab LAN Switching and Bridges.

1 Introducing Routing 1. Dynamic routing - information is learned from other routers, and routing protocols adjust routes automatically. 2. Static routing.

End-to-end resource management in DiffServ Networks –DiffServ focuses on singal domain –Users want end-to-end services –No consensus at this time –Two.

Cisco – Chapter 11 Routers All You Ever Wanted To Know But Were Afraid to Ask.

Address Resolution Protocol(ARP) By:Protogenius. Overview Introduction When ARP is used? Types of ARP message ARP Message Format Example use of ARP ARP.

Objectives: Chapter 5: Network/Internet Layer  How Networks are connected Network/Internet Layer Routed Protocols Routing Protocols Autonomous Systems.

10/8/2015CST Computer Networks1 IP Routing CST 415.

NUS.SOC.CS2105 Ooi Wei Tsang Application Transport Network Link Physical you are here.

Broadcast and Multicast. Overview Last time: routing protocols for the Internet  Hierarchical routing  RIP, OSPF, BGP This time: broadcast and multicast.

Network Layer4-1 Chapter 4: Network Layer r 4. 1 Introduction r 4.2 Virtual circuit and datagram networks r 4.3 What’s inside a router r 4.4 IP: Internet.

Chi-Cheng Lin, Winona State University CS 313 Introduction to Computer Networking & Telecommunication Chapter 5 Network Layer.

Floodless in SEATTLE : A Scalable Ethernet ArchiTecTure for Large Enterprises. Changhoon Kim, Matthew Caesar and Jenifer Rexford. Princeton University.

Central Control over Distributed Routing fibbing.net SIGCOMM Stefano Vissicchio 18th August 2015 UCLouvain Joint work with O. Tilmans (UCLouvain), L. Vanbever.

© J. Liebeherr, All rights reserved 1 Multicast Routing.

OSI Model. Switches point to point bridges two types store & forward = entire frame received the decision made, and can handle frames with errors cut-through.

#1 EETS 8316/NTU CC725-N/TC/ Routing - Circuit Switching  Telephone switching was hierarchical with only one route possible —Added redundant routes.

“Hashing Out” the Future of Enterprise and Data-Center Networks Jennifer Rexford Princeton University Joint with Changhoon.

Cisco Systems Networking Academy S2 C 11 Routing Basics.

ICS 156: Networking Lab Magda El Zarki Professor, ICS UC, Irvine.

Internet Protocols. ICMP ICMP – Internet Control Message Protocol Each ICMP message is encapsulated in an IP packet – Treated like any other datagram,

Cisco Confidential © 2013 Cisco and/or its affiliates. All rights reserved. 1 Cisco Networking Training (CCENT/CCT/CCNA R&S) Rick Rowe Ron Giannetti.

5: DataLink Layer 5a-1 Bridges and spanning tree protocol Reference: Mainly Peterson-Davie.

Mobile IP 순천향대학교 전산학과 문종식

1 LAN switching and Bridges Relates to Lab Outline Interconnection devices Bridges/LAN switches vs. Routers Bridges Learning Bridges Transparent.

Routing Semester 2, Chapter 11. Routing Routing Basics Distance Vector Routing Link-State Routing Comparisons of Routing Protocols.

Mobile IP THE 12 TH MEETING. Mobile IP  Incorporation of mobile users in the network.  Cellular system (e.g., GSM) started with mobility in mind. 

BUFFALO: Bloom Filter Forwarding Architecture for Large Organizations Minlan Yu Princeton University Joint work with Alex Fabrikant,

Assignment 1  Chapter 1:  Question 11  Question 13  Question 14  Question 33  Question 34  Chapter 2:  Question 6  Question 39  Chapter 3: 

Routing Jennifer Rexford.

ETHANE: TAKING CONTROL OF THE ENTERPRISE

Scaling the Network: The Internet Protocol

CS4470 Computer Networking Protocols

Revisiting Ethernet: Plug-and-play made scalable and efficient

Intra-Domain Routing Jacob Strauss September 14, 2006.

VL2: A Scalable and Flexible Data Center Network

Scaling the Network: The Internet Protocol

Revisiting Ethernet: Plug-and-play made scalable and efficient

Other Routing Protocols

Reconciling Zero-conf with Efficiency in Enterprises

Presentation transcript:

Revisiting Ethernet: Plug-and-play made scalable and efficient Changhoon Kim, and Jennifer Rexford Princeton University

2 An “All Ethernet” Enterprise Network?  “All Ethernet” makes network management easier Zero-configuration of end-hosts and network due to  Flat addressing  Self-learning Location independent and permanent addresses also simplify  Host mobility  Network troubleshooting  Access-control policies

3 But, Ethernet bridging does not scale  Flooding-based delivery Frames to unknown destinations are flooded  Broadcasting for basic service Bootstrapping relies on broadcasting Vulnerable to resource exhaustion attacks  Inefficient forwarding paths Loops are fatal due to broadcast storms; use the STP Forwarding along a single tree leads to inefficiency

4 State of the Practice: A Hybrid Architecture Enterprise networks comprised of Ethernet-based IP subnets interconnected by routers R R R R Ethernet Bridging - Flat addressing - Self-learning - Flooding - Forwarding along a tree IP Routing - Hierarchical addressing - Subnet configuration - Host configuration - Forwarding along shortest paths R

5 Motivation Neither bridging nor routing is satisfactory. Can’t we take only the best of each? Architectures Features Ethernet Bridging IP Routing Ease of configuration  Optimality in addressing  Mobility support  Path efficiency  Load distribution  Convergence speed  Tolerance to loop  SEIZE (Scalable and Efficient Zero-config Enterprise) SEIZE       

6 Overview  Objectives  SEIZE architecture  Evaluation  Conclusions

7 Overview: Objectives  Objectives Avoiding flooding Restraining broadcasting Keeping forwarding tables small Ensuring path efficiency  SEIZE architecture  Evaluation  Conclusions

8 Avoiding Flooding  Bridging uses flooding as a routing scheme Unicast frames to unknown destinations are flooded Does not scale to a large network  Objective #1: Unicast unicast traffic Need a control-plane mechanism to discover and disseminate hosts’ location information “Send it everywhere! At least, they’ll learn where the source is.” “Don’t know where destination is.”

9 Restraining Broadcasting  Liberal use of broadcasting for bootstrapping (DHCP and ARP) Broadcasting is a vestige of shared-medium Ethernet Very serious overhead in switched networks  Objective #2: Support unicast-based bootstrapping Need a directory service  Sub-objective #2.1: Support general broadcast However, handling broadcast should be more scalable

10 Keeping Forwarding Tables Small  Flooding and self-learning lead to unnecessarily large forwarding tables Large tables are not only inefficient, but also dangerous  Objective #3: Install hosts’ location information only when and where it is needed Need a reactive resolution scheme Enterprise traffic patterns are better-suited to reactive resolution

11 Ensuring Optimal Forwarding Paths  Spanning tree avoids broadcast storms. But, forwarding along a single tree is inefficient. Poor load balancing and longer paths Multiple spanning trees are insufficient and expensive  Objective #4: Utilize shortest paths Need a routing protocol  Sub-objective #4.1: Prevent broadcast storms Need an alternative measure to prevent broadcast storms

12 Backwards Compatibility  Objective #5: Do not modify end-hosts From end-hosts’ view, network must work the same way End hosts should  Use the same protocol stacks and applications  Not be forced to run an additional protocol

13 Overview: Architecture  Objectives  SEIZE architecture Hash-based location management Shortest-path forwarding Responding to network dynamics  Evaluation  Conclusions

14 SEIZE in a Slide  Flat addressing of end-hosts Switches use hosts ’ MAC addresses for routing Ensures zero-configuration and backwards-compatibility (Obj # 5)  Automated host discovery at the edge Switches detect the arrival/departure of hosts Obviates flooding and ensures scalability (Obj #1, 5)  Hash-based on-demand resolution Hash deterministically maps a host to a switch Switches resolve end-hosts ’ location and address via hashing Ensures scalability (Obj #1, 2, 3)  Shortest-path forwarding between switches Switches run link-state routing with only their own connectivity info Ensures data-plane efficiency (Obj #4)

15 How does it work? Host discovery or registration B D x y Hash ( F ( x ) = B ) Store at B Traffic to x Hash ( F ( x ) = B ) Tunnel to egress node, A Deliver to x Switches End-hosts Control flow Data flow Notifying to D Entire enterprise (A large single IP subnet) LS core E Optimized forwarding directly from D to A C A Tunnel to relay switch, B

16 Terminology Ingress Relay (for x ) Egress x y B A Dst Src D Ingress applies a cache eviction policy to this entry cut-through forwarding

17 Responding to Topology Changes  Consistent Hash [Karger et al.,STOC’97] minimizes re-registration A B C D E F h h h h h h h h h h

18 Single Hop Look-up A B C D F(x) x y y sends traffic to x E Every switch on a ring is logically one hop away

19 Responding to Host Mobility Relay (for x ) x y B A Src D when cut-through forwarding is used G Old Dst New Dst

20 Unicast-based Bootstrapping  ARP Ethernet: Broadcast requests SEIZE: Hash-based on-demand address resolution  Exactly the same mechanism as location resolution  Proxy resolution by ingress switches via unicasting  DHCP Ethernet: Broadcast requests and replies SEIZE: Utilize DHCP relay agent (RFC 2131)  Proxy resolution by ingress switches via unicasting

21 Overview: Evaluation  Objectives  SEIZE architecture  Evaluation Scalability and efficiency Simple and flexible network management  Conclusions

22 Control-Plane Scalability When Using Relays  Minimal overhead for disseminating host-location information Each host’s location is advertised to only two switches  Small forwarding tables The number of host information entries over all switches leads to O(H), not O(SH)  Simple and robust mobility support When a host moves, updating only its relay suffices No forwarding loop created since update is atomic

23 Data-Plane Efficiency w/o Compromise  Price for path optimization Additional control messages for on-demand resolution Larger forwarding tables Control overhead for updating stale info of mobile hosts  The gain is much bigger than the cost Because most hosts maintain a small, static communities of interest (COIs) [Aiello et al., PAM’05] Classical analogy: COI ↔ Working Set (WS); Caching is effective when a WS is small and static

24 Evaluation: Prototype Implementation  Link-state routing: eXtensible Open Router Platform [Handley et al., NSDI’05]  Host information management and traffic forwarding: The Click modular router [Kohler et al., TOCS’00] Host info. registration and notification messages Click XORP OSPF Daemon Ring Manager Host Info Manager SeizeSwitch Link-state advertisements from other switches Data Frames Routing Table Network Map Click Interface

25 Evaluation: Set-up and Models  Emulation on Emulab  Test Network Configuration  Test Traffic LBNL internal packet traces [Pang et al., IMC ’ 05]  17.8M packets from 5,128 hosts across 22 subnets Real-time replay  Models tested Ethernet w/ STP, SEIZE w/o path opt., and SEIZE w/ path opt. Inactive timeout-based eviction: 5 min ltout, 1 ~ 60 sec rtout SW2 SW1 SW0 SW3 N0N2N3N1

26 Overall Comparison Data-plane Efficiency Control-plane Scalability Low Cost

27 Sensitivity to Cache Eviction Policy

28 Some Unique Benefits  Optimal load balancing via relayed delivery Flows sharing the same ingress and egress switches are spread over multiple indirect paths For any valid traffic matrix, this practice guarantees 100% throughput with minimal link usage [Zhang-Shen et al., HotNets’04/IWQoS’05]  Simple and robust access control Enforcing access-control policies at relays makes policy management simple and robust Why? Because routing changes and host mobility do not change policy enforcement points

29 Conclusions  SEIZE is a plug-and-playable enterprise architecture ensuring both scalability and efficiency  Enabling design choices Hash-based location management Reactive location resolution and caching Shortest-path forwarding  Lessons Trading a little data-plane efficiency for huge control- plane scalability makes a qualitatively different system Traffic patterns (small static COIs, and short flow interarrival times) are our friends

30 Future Work  Enriching evaluation Various topologies Dynamic set-ups (topology changes, and host mobility)  Applying reactive location resolution to other networks There are some routing systems that need to be slimmer  Generalization How aggressively can we optimize control-plane without losing data-plane efficiency?

31 Thank you. Full paper is available at

Backup Slides

33 Group-based Broadcasting  SEIZE uses per-group multicast tree

34 Group-based Access Control  Relay switches enforce inter-group access policies  The idea Allow resolution only when the access policy between a resolving host’s group and a resolved host’s group permits access

35 Simple and Flexible Management  Using only a number of powerful switches as relays? Yes, a pre-hash can generate a set of identifiers for a switch  Applying cut-through forwarding selectively? Yes, ingress switches can adaptively decide which policy to use (E.g., no cut-through forwarding for DNS look-ups)  Controlling (or predicting) a switch’s table size? Yes, pre-hashing can determine the number of hosts for which a switch provides relay service The number of directly connected hosts to a switch is also usually known ahead of time  Traffic engineering? Yes, adjusting link weights works effectively

36 Control Overhead

37 Host Information Replication Factor Max = NH min = H RF 2.23 RF H2H RF 1.83

38 Path Efficiency + 2% + 29% + 27% Optimum

39 Understanding Traffic Patterns

40 Understanding Traffic Patterns - cont’d

41 Evaluation: Prototype Implementation Click XORP OSPF Daemon Ring Mgr HostInfo Mgr SeizeSwitch Link State Advertisements from other switches Host info. registration and optimization msgs Data Frames IP Forwarding RIBD Click FEA

42 Prototype: Inside a Click Process Strip(14) FromDevice(em0)FromDevice(em1) ToDevice(em0)ToDevice(em1) CheckIPHeader(…) ProcessIPMisc(…) ARPQuerier(…) LookupIPRoute(…) to ARPResponder or ARPQuerier Classifier(…) ARPIP SeizeSwitch(…) FromDevice(eth0)FromDevice(eth1) ToDevice(eth0)ToDevice(eth1) IP Classifier(…) ARP Classifier(…) ARPIP to ARPResponder or ARPQuerier to upper layer IP Strip(20) Others Classifier(…) IP Proto SEIZE Others Strip(14)

43 store or update in host-table c-hash srcmac, get a relay node rn notify to rn to  Source Learning Inside a SeizeSwitch Element IP Forwarding look up routing table send down to L2 send up to L4 strip IP header c-hash dstmac, get a relay node rn encapsuate with, set proto to SEIZE encapsuate with, set proto to SEIZE send out to interface to  to    inform ingress of Layer 2Layer 3 strip Ethernet header is dstmac on host-table? yesno yes is dstmac local? no proto == SEIZE ? yes no is dstIP me? yes no get egress-IP of dstmac, from host table yes no control message? is dstmac me or broadcast? yes no apply to host table L2 Data Forwarding L2 Control to  EthFrame departs EthFrame arrives

44 Control Plane: Single Hop DHT I H G D C A B K E F L J 6 E, K, L A, J, I B, G C, H D, F 1’s LOCAL 1’s REMOTE_AUTH 3 Registers L 1 Forgets L2 Registers F

45 Temporal Traffic Locality

46 Spatial Traffic Locality

47 Failover Performance Time/Sequence Graph Sequence Num. [KB] Time (s) SW down SW up New ST built 100,000 50,000 Time/Sequence Graph Sequence Num. [KB] 100,000 50, Time (s) Relay down Relay up OSPF cnvg & host registration OSPF cnvg & host registration