CMPE 252A : Computer Networks


1 CMPE 252A : Computer Networks
An Introduction to Software Defined Networking (SDN) & OpenFlow Huazhe Wang Some slides are borrowed from

2 The Internet: A Remarkable Story
Tremendous success From research experiment to global infrastructure Brilliance of under-specifying Network: best-effort packet delivery Hosts: arbitrary applications Enables innovation in applications Web, P2P, VoIP, social networks, virtual worlds But, change is easy only at the edge… 

3 Inside the ‘Net: A Different Story…
Closed equipment: software bundled with hardware, vendor-specific interfaces. Over-specified: slow protocol standardization. Few people can innovate: equipment vendors write the code, so there are long delays to introduce new features. Impacts performance, security, reliability, cost…

4 Networks are Hard to Manage
Operating a network is expensive More than half the cost of a network Yet, operator error causes most outages Buggy software in the equipment Routers with 20+ million lines of code Cascading failures, vulnerabilities, etc. The network is “in the way” Especially a problem in data centers … and home networks

5 Traffic engineering: difficult for traditional routing
[Figure: weighted network graph with nodes u, v, w, x, y, z] Q: what if the network operator wants u-to-z traffic to flow along uvwz, and x-to-z traffic to flow along xwyz? A: need to define link weights so the traffic routing algorithm computes routes accordingly (or need a new routing algorithm)! Note: implicit assumption here: destination-based forwarding

6 Traffic engineering: difficult
[Figure: weighted network graph with nodes u, v, w, x, y, z] Q: what if the network operator wants to split u-to-z traffic along uvwz and uxyz (load balancing)? A: can't do it (or need a new routing algorithm). Note: implicit assumption here: destination-based forwarding

7 Traffic engineering: difficult
[Figure: weighted network graph with nodes u, v, w, x, y, z; blue and red traffic arriving at w] Q: what if w wants to route blue and red traffic differently? A: can't do it (with destination-based forwarding, and LS, DV routing). Note: implicit assumption here: destination-based forwarding

8 Rethinking the “Division of Labor”

9 Network-layer functions
Recall two network-layer functions: forwarding: move packets from a router's input to the appropriate router output (data plane); routing: determine the route taken by packets from source to destination (control plane). Two approaches to structuring the network control plane: per-router control (traditional), and logically centralized control (software defined networking).

10 Traditional Computer Networks
Data plane: packet streaming. Forward, filter, buffer, mark, rate-limit, and measure packets.

11 Traditional Computer Networks
Per-router control plane: distributed algorithms. Individual routing algorithm components in each and every router interact with each other in the control plane to compute forwarding tables. Track topology changes, compute routes, install forwarding rules.

12 Traditional Computer Networks
Management plane: human time scale. Collect measurements and configure the equipment.

13 Software Defined Networking (SDN)
Logically-centralized control: a smart, slow controller and dumb, fast switches, connected by an API to the data plane (e.g., OpenFlow). A distinct (typically remote) controller interacts with local control agents (CAs) in the routers to compute forwarding tables: the control part of the routers is extracted to form a central controller.

14 Software defined networking (SDN)
Four key characteristics: 1. generalized "flow-based" forwarding (e.g., OpenFlow); 2. control and data plane separation; 3. control-plane functions external to the data-plane switches (a remote controller communicating with control agents, CAs, in the switches); 4. programmable control applications (routing, access control, load balancing).

15 Software defined networking (SDN)
Why a logically centralized control plane? Easier network management: avoid router misconfigurations, greater flexibility of traffic flows. Table-based forwarding allows "programming" routers: centralized "programming" is easier (compute tables centrally and distribute them), while distributed "programming" is more difficult (compute tables as the result of a distributed algorithm/protocol implemented in each and every router). Enables an open (non-proprietary) implementation of the control plane.

16 Analogy: mainframe to PC evolution*
Mainframe era: vertically integrated (specialized applications on a specialized operating system on specialized hardware), closed and proprietary, slow innovation, small industry. PC era: horizontal, with open interfaces (applications on Linux, Mac OS, or Windows, running on a commodity microprocessor), rapid innovation, huge industry. * Slide courtesy: N. McKeown

17 SDN perspective: data plane switches
Data-plane switches: fast, simple, commodity switches implementing generalized data-plane forwarding in hardware; the switch flow table is computed and installed by the controller; an API for table-based switch control (e.g., OpenFlow) defines what is controllable and what is not; a protocol (e.g., OpenFlow) for communicating with the controller over the southbound API.

18 SDN perspective: SDN controller
SDN controller (network OS): maintains network state information; interacts with network-control applications "above" via the northbound API; interacts with network switches "below" via the southbound API; implemented as a distributed system for performance, scalability, fault-tolerance, and robustness.

19 SDN perspective: control applications
Network-control apps: the "brains" of control; they implement control functions (routing, access control, load balancing) using the lower-level services and API provided by the SDN controller. Unbundled: they can be provided by a 3rd party, distinct from the routing vendor or the SDN controller.

20 SDN: control/data plane interaction example
The SDN controller is organized in layers: a communication layer (communicates between the SDN controller and the controlled switches, e.g. via OpenFlow or SNMP); a network-wide state management layer (state of network links, switches, and services, kept in a distributed database: statistics, flow tables, link-state info, host info, switch info); and an interface layer to network-control apps (abstractions API, e.g. network graph, RESTful API, intent). Example with Dijkstra's link-state routing: 1. S1, experiencing a link failure, uses an OpenFlow port-status message to notify the controller. 2. The SDN controller receives the OpenFlow message and updates its link status info. 3. The Dijkstra's-routing application has previously registered to be called whenever link status changes, and is now called. 4. The application accesses the network graph and link-state info in the controller and computes new routes.

21 SDN: control/data plane interaction example
Example, continued: 5. The link-state routing app interacts with the flow-table-computation component in the SDN controller, which computes the new flow tables needed. 6. The controller uses OpenFlow to install the new tables in the switches that need updating.

22 OpenFlow (OF) defines the communication protocol that enables the SDN Controller to directly interact with the forwarding plane of network devices.

23 OpenFlow data plane abstraction
flow: defined by header fields. Generalized forwarding: simple packet-handling rules. Pattern: match values in packet header fields. Actions: drop, forward, or modify the matched packet, or send the matched packet to the controller. Priority: disambiguate overlapping patterns. Counters: #bytes and #packets. Example rules (* is a wildcard): 1. src=1.2.*.*, dest=3.4.5.* -> drop; 2. src=*.*.*.*, dest=3.4.*.* -> forward(2); 3. src= , dest=*.*.*.* -> send to controller.
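To make the pattern/priority/counters/action structure concrete, here is a minimal Python sketch of such a flow table. It is illustrative only: the field names and wildcard handling are simplified, not the OpenFlow wire format or spec field names.

```python
# Minimal sketch of a match+action flow table (illustrative, not OpenFlow itself).
from __future__ import annotations
from dataclasses import dataclass

@dataclass
class Rule:
    pattern: dict              # e.g. {"ip_src": "1.2.*.*", "ip_dst": "3.4.5.*"}
    action: str                # "drop", "forward(2)", "to_controller"
    priority: int = 0          # disambiguates overlapping patterns
    packets: int = 0           # counters
    bytes: int = 0

def field_matches(pattern_value: str, packet_value: str) -> bool:
    """Octet-by-octet wildcard match ('*' matches anything)."""
    p, v = pattern_value.split("."), packet_value.split(".")
    return all(pp == "*" or pp == vv for pp, vv in zip(p, v))

def lookup(table: list[Rule], packet: dict) -> Rule | None:
    """Return the highest-priority rule whose pattern matches the packet."""
    candidates = [r for r in table
                  if all(field_matches(r.pattern.get(k, "*.*.*.*"), packet[k])
                         for k in packet)]
    return max(candidates, key=lambda r: r.priority, default=None)

table = [
    Rule({"ip_src": "1.2.*.*", "ip_dst": "3.4.5.*"}, "drop", priority=2),
    Rule({"ip_dst": "3.4.*.*"}, "forward(2)", priority=1),
]
rule = lookup(table, {"ip_src": "1.2.9.9", "ip_dst": "3.4.5.7"})
print(rule.action if rule else "to_controller")   # -> drop (higher priority wins)
```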

24 OpenFlow data plane abstraction
match+action unifies different kinds of devices. Router: match on longest destination IP prefix, action: forward out a link. Switch: match on destination MAC address, action: forward or flood. Firewall: match on IP addresses and TCP/UDP port numbers, action: permit or deny. NAT: match on IP address and port, action: rewrite address and port.

25 Examples Destination-based forwarding: Firewall:
Destination-based forwarding: a flow-table entry with all fields wildcarded (Switch Port, MAC src/dst, Eth type, VLAN ID, IP Src, IP Prot, TCP sport/dport = *) except IP Dst, with Action = port 6: IP datagrams destined to that IP address are forwarded to router output port 6. Firewall: an entry matching only TCP dport = 22 with Action = drop: do not forward (block) all datagrams destined to TCP port 22. An entry matching only the IP Src of a given host with Action = drop: do not forward (block) all datagrams sent by that host.

26 OpenFlow protocol operates between controller, switch
TCP is used to exchange OpenFlow messages.

27 OpenFlow: controller-to-switch messages
features: controller queries switch features, switch replies. configure: controller queries/sets switch configuration parameters. modify-state: add, delete, or modify flow entries in the OpenFlow tables. packet-out: controller can send a packet out of a specific switch port.

28 OpenFlow: switch-to-controller messages
packet-in: transfer packet (and its control) to controller. See packet-out message from controller flow-removed: flow table entry deleted at switch port status: inform controller of a change on a port. Fortunately, network operators don’t “program” switches by creating/sending OpenFlow messages directly. Instead use higher-level abstraction at controller
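As a feel for what that higher-level abstraction looks like, below is a rough sketch of a minimal controller application in the style of the Ryu framework (an assumption: any OpenFlow controller framework would do, and details vary by Ryu version). It reacts to packet-in messages by flooding the packet back out with a packet-out; installing flow entries instead would use the modify-state (OFPFlowMod) message.

```python
# Hub-style Ryu app (sketch): flood every packet-in back out of the switch.
# Run with: ryu-manager hub_sketch.py   (assuming Ryu is installed)
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import MAIN_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3

class HubSketch(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPPacketIn, MAIN_DISPATCHER)
    def packet_in_handler(self, ev):
        msg = ev.msg
        datapath = msg.datapath
        ofproto = datapath.ofproto
        parser = datapath.ofproto_parser
        in_port = msg.match['in_port']
        # Action: flood the packet out every port except the one it came in on.
        actions = [parser.OFPActionOutput(ofproto.OFPP_FLOOD)]
        # Only resend the payload if the switch did not buffer the packet.
        data = msg.data if msg.buffer_id == ofproto.OFP_NO_BUFFER else None
        out = parser.OFPPacketOut(datapath=datapath, buffer_id=msg.buffer_id,
                                  in_port=in_port, actions=actions, data=data)
        datapath.send_msg(out)   # packet-out message back to the switch
```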

29 OpenFlow in the Wild Open Networking Foundation
Google, Facebook, Microsoft, Yahoo, Verizon, Deutsche Telekom, and many other companies Commercial OpenFlow switches HP, NEC, Quanta, Dell, IBM, Juniper, … Network operating systems NOX, Beacon, Floodlight, OpenDaylight, POX, Pyretic Network deployments Eight campuses, and two research backbone networks Commercial deployments (e.g., Google backbone)

30 Challenges

31 Heterogeneous Switches
Switches differ in: number of packet-handling rules; range of matches and actions; multi-stage pipeline of packet processing (e.g., access control, MAC look-up, IP look-up); offload of some control-plane functionality (?); support for new protocols or features.

32 Controller Delay and Overhead
The controller is much slower than the switch; processing packets at the controller leads to delay and overhead. Need to keep most packets in the "fast path". Rules can be installed proactively or reactively.

33 Distributed Controller
For scalability and reliability, run multiple controllers (each a network OS with its own applications), and partition and replicate state across them.

34 Testing and Debugging OpenFlow makes programming possible
Network-wide view at controller Direct control over data plane Plenty of room for bugs Still a complex, distributed system Need for testing techniques Controller applications Controller and switches Rules installed in the switches

35 Evolution of OpenFlow

36 Evolution of OpenFlow P4 OpenFlow 2.0
OpenFlow is a balancing act: its forwarding abstraction must balance general match/action against fixed-function switch ASICs. Instead of repeatedly extending the OpenFlow standard, P4 ("OpenFlow 2.0") implements flexible mechanisms for parsing packets and matching (arbitrary) header fields through a common interface.

37 Run OpenFlow Experiments
Mininet Open vSwitch (OVS)
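A minimal sketch of how such an experiment might be set up with the Mininet Python API, assuming Mininet and Open vSwitch are installed and an OpenFlow controller (e.g., the Ryu sketch above) is already listening locally on port 6633; the hostnames, port, and topology here are illustrative.

```python
#!/usr/bin/env python
# Sketch: two hosts, one OVS switch, and a remote SDN controller.
from mininet.net import Mininet
from mininet.node import RemoteController, OVSSwitch
from mininet.cli import CLI

def run():
    net = Mininet(switch=OVSSwitch, controller=None)
    # Point the switch at an external controller (assumed to be running already).
    net.addController('c0', controller=RemoteController,
                      ip='127.0.0.1', port=6633)
    s1 = net.addSwitch('s1')
    h1, h2 = net.addHost('h1'), net.addHost('h2')
    net.addLink(h1, s1)
    net.addLink(h2, s1)
    net.start()
    net.pingAll()        # traffic triggers packet-in events at the controller
    CLI(net)             # drop into the Mininet CLI for manual experiments
    net.stop()

if __name__ == '__main__':
    run()
```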

38 Hedera: Dynamic Flow Scheduling for Data Center Networks
Mohammad Al-Fares Sivasankar Radhakrishnan Barath Raghavan* Nelson Huang Amin Vahdat UC San Diego *Williams College Slides modified from

39 Problem Key insight/idea Challenges
MapReduce/Hadoop-style workloads have substantial bandwidth requirements. ECMP-based multi-path load balancing often leads to oversubscription, so jobs become bottlenecked by the network. Key insight/idea: identify large flows and periodically rearrange them to balance the load across core switches. Challenges: estimate the bandwidth demand of flows; find an optimal allocation of network paths for flows.

40 ECMP Paths Many equal cost paths going up to the core switches
Only one path down from each core switch. ECMP randomly allocates paths to flows using a hash, agnostic to available resources, which leads to long-lasting collisions between long (elephant) flows.

41 Collisions of elephant flows
Collisions are possible in two different ways: on the upward path or on the downward path. [Figure: flows from sources S1-S4 to destinations D1-D4 colliding in the fat-tree]

42 Hedera Scheduler Detect Large Flows
The Hedera scheduler loop: 1. Detect large flows: flows that need bandwidth but are network-limited. 2. Estimate flow demands: use max-min fairness to allocate bandwidth between src-dst pairs. 3. Place flows: use the estimated demands to heuristically find a better placement of large flows on the ECMP paths.

43 Elephant Detection The scheduler continually polls edge switches for flow byte-counts. Flows exceeding a B/s threshold are "large": > 10% of the host's link capacity (i.e. > 100 Mbps). What if there are only mice on a host? Default ECMP load balancing is efficient for small flows.
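A hedged sketch of this polling loop follows; the statistics-collection call and the 1 Gbps link capacity are hypothetical placeholders, not part of Hedera's actual implementation or any real API.

```python
# Sketch of Hedera-style elephant detection: poll per-flow byte counters at
# edge switches and flag flows whose rate exceeds 10% of host link capacity.
import time

LINK_CAPACITY_BPS = 1_000_000_000          # assumed 1 Gbps host links
THRESHOLD_BPS = 0.10 * LINK_CAPACITY_BPS   # "large" if > 100 Mbps

def detect_elephants(poll_flow_byte_counts, interval=1.0):
    """poll_flow_byte_counts() is a hypothetical stand-in for an OpenFlow
    statistics request returning {flow_id: cumulative_bytes}."""
    prev = poll_flow_byte_counts()
    while True:
        time.sleep(interval)
        curr = poll_flow_byte_counts()
        # Rate over the last interval, converted from bytes to bits per second.
        elephants = [f for f in curr
                     if (curr[f] - prev.get(f, 0)) * 8 / interval > THRESHOLD_BPS]
        yield elephants                     # hand off to demand estimation
        prev = curr
```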

44 Demand Estimation Measured flow rate is misleading
Need to find a flow’s “natural” bandwidth requirement when not limited by the network find max-min fair bandwidth allocation

45 Demand Estimation Given traffic matrix of large flows, modify each flow’s size at it source and destination iteratively… Sender equally distributes bandwidth among outgoing flows that are not receiver-limited Network-limited receivers decrease exceeded capacity equally between incoming flows Repeat until all flows converge Guaranteed to converge in O(|F|) time

46-49 Demand Estimation (worked example shown in figures only)

50 Flow Placement Heuristics
Two approaches Global First Fit: Greedily choose path that has sufficient unreserved b/w Simulated Annealing: Iteratively find a globally better mapping of paths to flows

51 Global First-Fit Scheduler ? ? ? Flow A Flow B 1 2 3 Flow C New flow detected, linearly search all possible paths from SD Place flow on first path whose component links can fit that flow

52 Global First-Fit Once flow ends, entries + reservations time out
Once a flow ends, its entries and reservations time out.
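A hedged sketch of Global First-Fit placement; the path enumeration function and the link-capacity bookkeeping are illustrative placeholders, not Hedera's actual data structures.

```python
# Sketch of Global First-Fit: reserve the first equal-cost path with room.
def global_first_fit(flow_demand, src, dst, paths_between, reserved, capacity=1.0):
    """paths_between(src, dst) is assumed to yield candidate paths, each a
    list of links; 'reserved' maps link -> bandwidth already reserved."""
    for path in paths_between(src, dst):
        # First path on which every component link can still fit this flow.
        if all(reserved.get(link, 0.0) + flow_demand <= capacity for link in path):
            for link in path:                        # reserve capacity on every hop
                reserved[link] = reserved.get(link, 0.0) + flow_demand
            return path                              # first fit wins
    return None                                      # fall back to default ECMP hashing
```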

53 Simulated Annealing Simulated Annealing:
a numerical optimization technique for searching for a solution in a space otherwise too large for ordinary search methods to yield results Probabilistic search for good flow-to-core mappings

54 Simulated Annealing State: A set of mappings from destination hosts to core switches. Neighbor State: Swap core switches between 2 hosts Within same pod, Within same edge, etc

55 Simulated Annealing Function/Energy: Total exceeded b/w capacity
Computed using the estimated demand of flows; the objective is to minimize the exceeded capacity. Temperature: iterations left, with a fixed number of iterations (1000s). Achieves a good core-to-flow mapping, sometimes very close to the global optimum.
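A hedged sketch of this annealing loop; the energy function is a placeholder for the exceeded-capacity computation over estimated demands, and the acceptance rule is the standard simulated-annealing one rather than Hedera's exact formulation.

```python
# Sketch of simulated annealing over host-to-core mappings.
import math, random

def anneal(state, exceeded_capacity, iterations=1000):
    """state: dict mapping destination host -> core switch.
    exceeded_capacity(state): placeholder energy function (total exceeded bw)."""
    best = dict(state)
    best_energy = energy = exceeded_capacity(state)
    for t in range(iterations, 0, -1):               # temperature = iterations left
        a, b = random.sample(list(state), 2)         # neighbor: swap two hosts' cores
        state[a], state[b] = state[b], state[a]
        new_energy = exceeded_capacity(state)
        if new_energy <= energy or random.random() < math.exp((energy - new_energy) / t):
            energy = new_energy                       # accept the move
            if energy < best_energy:
                best, best_energy = dict(state), energy
        else:
            state[a], state[b] = state[b], state[a]   # reject: undo the swap
    return best                                       # best mapping found
```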

56 Simulated Annealing Example run: 3 flows, 3 iterations Scheduler Core
[Figure: example run with 3 flows over 3 iterations, showing candidate core assignments for Flows A, B, C]

57 Simulated Annealing Scheduler Core ? ? ? ? Flow A 2 Flow B 1 2 3 Flow C 3 Final state is published to the switches and used as the initial state for next round

58 Simulated Annealing Optimizations
Assign a single core switch to each destination host. Incrementally calculate the exceeded capacity. Use the previous iteration's best result as the initial state. The controller installs the optimized rules after each scheduling round.

59 Evaluation

60 Evaluation: Data Shuffle
16 hosts: 120 GB all-to-all in-memory shuffle
                              ECMP     GFF      SA       Control
Total Shuffle Time (s)        438.4    335.5    336.0    306.4
Avg. Completion Time (s)      358.1    258.7    262.0    226.6
Avg. Bisection BW (Gbps)      2.81     3.89     3.84     4.44
Avg. host goodput (MB/s)      20.9     29.0     28.6     33.1
Hedera achieves 39% better bisection BW over ECMP, 88% of the ideal non-blocking switch

61 Reactiveness Demand Estimation:
27K hosts, 250K flows, converges < 200ms Simulated Annealing: Asymptotically dependent on # of flows + # iterations 50K flows and 10K iter: 11ms Most of final bisection BW: first few hundred iterations Scheduler control loop: Polling + Estimation + SA = 145ms for 27K hosts

62 Limitations Dynamic workloads, large flow turnover faster than control loop Scheduler will be continually chasing the traffic matrix Need to include penalty term for unnecessary SA flow re-assignments ECMP Hedera Stable Matrix Stability Unstable Flow Size

63 Conclusion Rethinking networking Significant momentum
Open interfaces to the data plane Separation of control and data Leveraging techniques from distributed systems Significant momentum In both research and industry

