1
CMPE 252A : Computer Networks
An Introduction to Software Defined Networking (SDN) & OpenFlow
Huazhe Wang
Some slides are borrowed from other sources.
2
The Internet: A Remarkable Story
Tremendous success: from research experiment to global infrastructure
Brilliance of under-specifying
Network: best-effort packet delivery
Hosts: arbitrary applications
Enables innovation in applications: Web, P2P, VoIP, social networks, virtual worlds
But, change is easy only at the edge…
3
Inside the ‘Net: A Different Story…
Closed equipment: software bundled with hardware, vendor-specific interfaces
Over-specified: slow protocol standardization
Few people can innovate: equipment vendors write the code, long delays to introduce new features
Impacts performance, security, reliability, cost…
4
Networks are Hard to Manage
Operating a network is expensive: more than half the cost of a network
Yet, operator error causes most outages
Buggy software in the equipment: routers with 20+ million lines of code, cascading failures, vulnerabilities, etc.
The network is “in the way”: especially a problem in data centers… and home networks
5
Traffic engineering: difficult for traditional routing
[figure: network graph with nodes u, v, w, x, y, z and link weights]
Q: what if the network operator wants u-to-z traffic to flow along uvwz, and x-to-z traffic to flow along xwyz?
A: need to define link weights so the traffic routing algorithm computes routes accordingly (or need a new routing algorithm)!
Note: implicit assumption here: destination-based forwarding
6
Traffic engineering: difficult
[figure: network graph with nodes u, v, w, x, y, z and link weights]
Q: what if the network operator wants to split u-to-z traffic along uvwz and uxyz (load balancing)?
A: can’t do it (or need a new routing algorithm)
Note: implicit assumption here: destination-based forwarding
7
Traffic engineering: difficult
[figure: network graph with nodes u, v, w, x, y, z; blue and red traffic arriving at w]
Q: what if w wants to route blue and red traffic differently?
A: can’t do it (with destination-based forwarding, and LS, DV routing)
Note: implicit assumption here: destination-based forwarding
8
Rethinking the “Division of Labor”
9
Network-layer functions
Recall: two network-layer functions:
forwarding: move packets from router’s input to appropriate router output (data plane)
routing: determine route taken by packets from source to destination (control plane)
Two approaches to structuring the network control plane:
per-router control (traditional)
logically centralized control (software defined networking)
10
Traditional Computer Networks
Data plane: packet streaming. Forward, filter, buffer, mark, rate-limit, and measure packets.
11
Traditional Computer Networks
Per-router control plane: distributed algorithms. Individual routing algorithm components in each and every router interact with each other in the control plane to compute forwarding tables. Track topology changes, compute routes, install forwarding rules.
12
Traditional Computer Networks
Management plane: human time scale. Collect measurements and configure the equipment.
13
Software Defined Networking (SDN)
Logically-centralized control: a smart, slow controller; dumb, fast switches; an API to the data plane in between (e.g., OpenFlow).
A distinct (typically remote) controller interacts with local control agents (CAs) in routers to compute forwarding tables.
Extract the control part of routers to form a central controller.
14
Software defined networking (SDN)
1: generalized “flow-based” forwarding (e.g., OpenFlow)
2: control, data plane separation
3: control plane functions external to data-plane switches
4: programmable control applications (e.g., routing, access control, load balance) running on a remote controller that talks to control agents (CAs) in the switches
15
Software defined networking (SDN)
Why a logically centralized control plane?
easier network management: avoid router misconfigurations, greater flexibility of traffic flows
table-based forwarding allows “programming” routers
centralized “programming” is easier: compute tables centrally and distribute them
distributed “programming” is more difficult: compute tables as the result of a distributed algorithm (protocol) implemented in each and every router
open (non-proprietary) implementation of the control plane
16
Analogy: mainframe to PC evolution*
Mainframe era: specialized applications on a specialized operating system on specialized hardware. Vertically integrated, closed, proprietary; slow innovation, small industry.
PC era: apps on Linux, Mac OS, or Windows (OS), over an open interface to the microprocessor. Horizontal, open interfaces; rapid innovation, huge industry.
* Slide courtesy: N. McKeown
17
SDN perspective: data plane switches
fast, simple, commodity switches implementing generalized data-plane forwarding in hardware
switch flow table computed, installed by controller
API for table-based switch control (e.g., OpenFlow): defines what is controllable and what is not
protocol for communicating with the controller (e.g., OpenFlow)
18
SDN perspective: SDN controller
SDN controller (network OS):
maintains network state information
interacts with network-control applications “above” via the northbound API
interacts with network switches “below” via the southbound API
implemented as a distributed system for performance, scalability, fault-tolerance, robustness
19
SDN perspective: control applications
network-control apps: the “brains” of control: implement control functions using the lower-level services and API provided by the SDN controller
unbundled: can be provided by a 3rd party, distinct from the routing vendor or the SDN controller
20
SDN: control/data plane interaction example
Controller structure: an interface layer to network-control apps (abstractions, API, e.g., network graph, RESTful API, intent); a network-wide state management layer (state of network links, switches, services: a distributed database of statistics, flow tables, link-state, host and switch info); a communication layer to the controlled switches (e.g., OpenFlow, SNMP).
Example (link failure):
1. S1, experiencing a link failure, uses an OpenFlow port status message to notify the controller
2. SDN controller receives the OpenFlow message, updates link status info
3. Dijkstra’s link-state routing application has previously registered to be called whenever link status changes; it is called
4. Dijkstra’s routing algorithm accesses the network graph and link-state info in the controller, computes new routes
21
SDN: control/data plane interaction example
Example (continued):
5. The link-state routing app interacts with the flow-table-computation component in the SDN controller, which computes the new flow tables needed
6. The controller uses OpenFlow to install new tables in the switches that need updating
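Step 4 above is just a shortest-path computation over the controller's network-graph view. A minimal, self-contained Python sketch of Dijkstra's algorithm is below; the topology and link weights are illustrative, not taken from the figure.

```python
import heapq

def dijkstra(graph, source):
    """Compute shortest-path distances and predecessors from `source`.
    `graph` maps node -> {neighbor: link_weight}, i.e. the controller's
    network-graph / link-state view."""
    dist = {node: float("inf") for node in graph}
    prev = {node: None for node in graph}
    dist[source] = 0
    pq = [(0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue                      # stale queue entry
        for v, w in graph[u].items():
            if d + w < dist[v]:
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(pq, (d + w, v))
    return dist, prev

# Illustrative topology: after s1's link to s2 fails, the controller
# recomputes routes over the remaining links (weights are made up).
graph = {
    "s1": {"s4": 1},
    "s2": {"s3": 1, "s4": 1},
    "s3": {"s2": 1, "s4": 1},
    "s4": {"s1": 1, "s2": 1, "s3": 1},
}
dist, prev = dijkstra(graph, "s1")
print(dist)   # e.g. {'s1': 0, 's4': 1, 's2': 2, 's3': 2}
```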
22
OpenFlow (OF) defines the communication protocol that enables the SDN Controller to directly interact with the forwarding plane of network devices.
23
OpenFlow data plane abstraction
flow: defined by header fields
generalized forwarding: simple packet-handling rules
Pattern: match values in packet header fields
Actions: drop, forward, or modify matched packet, or send matched packet to controller
Priority: disambiguate overlapping patterns
Counters: #bytes and #packets
Example rules (* : wildcard):
1. src=1.2.*.*, dest=3.4.5.* → drop
2. src=*.*.*.*, dest=3.4.*.* → forward(2)
3. src= , dest=*.*.*.* → send to controller
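To make the pattern/action/priority/counter abstraction concrete, here is a minimal Python sketch of a generalized flow table (not OpenFlow itself); the field names and rules are illustrative.

```python
# Each rule has a pattern over header fields ('*' = wildcard), a priority,
# an action, and counters. Highest-priority matching rule wins.

def matches(pattern, packet):
    """True if every non-wildcard field in the pattern equals the packet's field."""
    return all(v == "*" or packet.get(f) == v for f, v in pattern.items())

flow_table = [
    {"priority": 10, "pattern": {"tcp_dport": 22},       "action": "drop",
     "packets": 0, "bytes": 0},                           # block SSH (firewall-style rule)
    {"priority": 5,  "pattern": {"ip_dst": "3.4.5.6"},   "action": "forward(6)",
     "packets": 0, "bytes": 0},                           # destination-based forwarding
    {"priority": 0,  "pattern": {},                       "action": "send_to_controller",
     "packets": 0, "bytes": 0},                           # table-miss rule
]

def lookup(packet):
    for rule in sorted(flow_table, key=lambda r: -r["priority"]):
        if matches(rule["pattern"], packet):
            rule["packets"] += 1
            rule["bytes"] += packet.get("len", 0)
            return rule["action"]
    return "send_to_controller"

print(lookup({"ip_dst": "3.4.5.6", "tcp_dport": 80, "len": 1500}))   # -> forward(6)
```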
24
OpenFlow data plane abstraction
match+action: unifies different kinds of devices
Router: match on longest destination IP prefix; action: forward out a link
Switch: match on destination MAC address; action: forward or flood
Firewall: match on IP addresses and TCP/UDP port numbers; action: permit or deny
NAT: match on IP address and port; action: rewrite address and port
25
Examples
Flow table fields: Switch Port, MAC src, MAC dst, Eth type, VLAN ID, IP Src, IP Dst, IP Prot, TCP sport, TCP dport; plus an Action.
Destination-based forwarding: all fields wildcarded (*) except IP Dst; Action = forward to port6. IP datagrams destined to that IP address should be forwarded to router output port 6.
Firewall (1): all fields wildcarded except TCP dport = 22; Action = drop. Do not forward (block) all datagrams destined to TCP port 22.
Firewall (2): all fields wildcarded except IP Src; Action = drop. Do not forward (block) all datagrams sent by that host.
26
OpenFlow protocol operates between controller, switch
OpenFlow messages are exchanged between the controller and the switch over TCP.
27
OpenFlow: controller-to-switch messages
features: controller queries switch features, switch replies
configure: controller queries/sets switch configuration parameters
modify-state: add, delete, modify flow entries in the OpenFlow tables
packet-out: controller can send this packet out of a specific switch port
28
OpenFlow: switch-to-controller messages
packet-in: transfer a packet (and its control) to the controller; see the packet-out message from the controller
flow-removed: flow table entry deleted at the switch
port status: inform the controller of a change on a port
Fortunately, network operators don’t “program” switches by creating and sending OpenFlow messages directly; instead they use a higher-level abstraction at the controller.
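As one example of such a higher-level abstraction, a controller framework can expose packet-in events and wrap flow-mod and packet-out messages. The sketch below follows the pattern of Ryu's standard simple_switch_13 example and assumes the Ryu framework and OpenFlow 1.3; treat it as an illustration rather than a complete application.

```python
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import MAIN_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3


class HubLikeApp(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPPacketIn, MAIN_DISPATCHER)
    def packet_in_handler(self, ev):
        msg = ev.msg                      # the packet-in message from the switch
        dp = msg.datapath
        ofp, parser = dp.ofproto, dp.ofproto_parser
        in_port = msg.match['in_port']

        # Reactive rule install (modify-state / flow-mod): flood traffic arriving
        # on this port from now on. A real app would learn MAC addresses and pick
        # a specific output port instead of flooding.
        match = parser.OFPMatch(in_port=in_port)
        actions = [parser.OFPActionOutput(ofp.OFPP_FLOOD)]
        inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS, actions)]
        dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=1,
                                      match=match, instructions=inst))

        # packet-out: also forward the packet that triggered this packet-in.
        data = msg.data if msg.buffer_id == ofp.OFP_NO_BUFFER else None
        out = parser.OFPPacketOut(datapath=dp, buffer_id=msg.buffer_id,
                                  in_port=in_port, actions=actions, data=data)
        dp.send_msg(out)
```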
29
OpenFlow in the Wild
Open Networking Foundation: Google, Facebook, Microsoft, Yahoo, Verizon, Deutsche Telekom, and many other companies
Commercial OpenFlow switches: HP, NEC, Quanta, Dell, IBM, Juniper, …
Network operating systems: NOX, Beacon, Floodlight, OpenDaylight, POX, Pyretic
Network deployments: eight campuses and two research backbone networks; commercial deployments (e.g., Google backbone)
30
Challenges
31
Heterogeneous Switches
Switches differ in:
Number of packet-handling rules
Range of matches and actions
Multi-stage pipeline of packet processing (e.g., access control, MAC look-up, IP look-up)
Offloading some control-plane functionality (?)
Support for new protocols or features
32
Controller Delay and Overhead
Controller is much slower than the switch
Processing packets at the controller leads to delay and overhead
Need to keep most packets in the “fast path”
Proactive or reactive rule installation
33
Distributed Controller
For scalability and reliability, run multiple controllers (each with its own network OS and applications): partition and replicate state across them.
34
Testing and Debugging
OpenFlow makes programming possible: network-wide view at the controller, direct control over the data plane
Plenty of room for bugs: still a complex, distributed system
Need for testing techniques: controller applications, controller and switches, rules installed in the switches
35
Evolution of OpenFlow
36
Evolution of OpenFlow: P4 (“OpenFlow 2.0”)
OpenFlow is a balancing act: a forwarding abstraction balancing general match/action against fixed-function switch ASICs. Instead of repeatedly extending the OF standard, implement flexible mechanisms for parsing packets and matching (arbitrary) header fields through a common interface.
37
Run OpenFlow Experiments
Mininet Open vSwitch (OVS)
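A minimal Mininet sketch that builds a small topology of Open vSwitch switches and points them at an external OpenFlow controller; the controller is assumed to be listening on 127.0.0.1:6653, and the topology is illustrative.

```python
from mininet.net import Mininet
from mininet.node import RemoteController, OVSSwitch
from mininet.cli import CLI

def run():
    net = Mininet(controller=RemoteController, switch=OVSSwitch)
    net.addController('c0', ip='127.0.0.1', port=6653)   # external SDN controller
    s1 = net.addSwitch('s1')
    h1, h2 = net.addHost('h1'), net.addHost('h2')
    net.addLink(h1, s1)
    net.addLink(h2, s1)
    net.start()
    net.pingAll()       # quick connectivity check through the OpenFlow switch
    CLI(net)            # drop into the Mininet CLI for interactive experiments
    net.stop()

if __name__ == '__main__':
    run()
```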
38
Hedera: Dynamic Flow Scheduling for Data Center Networks
Mohammad Al-Fares, Sivasankar Radhakrishnan, Barath Raghavan*, Nelson Huang, Amin Vahdat (UC San Diego; *Williams College). Slides modified from the original presentation.
39
Problem Key insight/idea Challenges
Problem: MapReduce/Hadoop-style workloads have substantial bandwidth requirements; ECMP-based multi-path load balancing often leads to oversubscription --> jobs bottlenecked by the network
Key insight/idea: identify large flows and periodically rearrange them to balance the load across core switches
Challenges: estimate the bandwidth demand of flows; find an optimal allocation of network paths for flows
40
ECMP Paths
Many equal-cost paths going up to the core switches
Only one path down from each core switch
Randomly allocate paths to flows using a hash of the packet headers
Agnostic to available resources
Long-lasting collisions between long (elephant) flows
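A small Python sketch of hash-based ECMP path selection, showing why it is agnostic to load: the hash of the 5-tuple alone decides the upward path, so two elephant flows can land on the same core link. Addresses and path names are illustrative.

```python
import hashlib

def ecmp_next_hop(five_tuple, equal_cost_paths):
    """Pick one of the equal-cost paths based only on a hash of the flow's 5-tuple."""
    key = ":".join(str(x) for x in five_tuple).encode()
    h = int.from_bytes(hashlib.md5(key).digest()[:4], "big")
    return equal_cost_paths[h % len(equal_cost_paths)]

paths = ["core1", "core2", "core3", "core4"]
flow_a = ("10.0.0.1", "10.0.1.1", 6, 40000, 5001)   # (src, dst, proto, sport, dport)
flow_b = ("10.0.0.2", "10.0.1.2", 6, 40001, 5001)
print(ecmp_next_hop(flow_a, paths), ecmp_next_hop(flow_b, paths))
# If both flows map to the same core switch, they share (and may saturate) that path,
# no matter how lightly loaded the other cores are.
```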
41
Collisions of elephant flows
Collisions are possible in two different ways: on the upward path and on the downward path.
42
Hedera Scheduler
1. Detect Large Flows: flows that need bandwidth but are network-limited
2. Estimate Flow Demands: use max-min fairness to allocate bandwidth between src-dst pairs
3. Place (Schedule) Flows: use estimated demands to heuristically find a better placement of large flows on the ECMP paths
43
Elephant Detection
Scheduler continually polls edge switches for flow byte-counts
Flows exceeding a B/s threshold are “large”: > 10% of the host’s link capacity (i.e., > 100 Mbps)
What if there are only mice on a host? Default ECMP load-balancing is efficient for small flows
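A hedged sketch of that polling loop, assuming 1 Gbps host links and a hypothetical poll_flow_bytes() helper standing in for the scheduler's OpenFlow counter query to the edge switches.

```python
LINK_CAPACITY_BPS = 1_000_000_000          # 1 Gbps host links (assumption)
THRESHOLD_BPS = 0.10 * LINK_CAPACITY_BPS   # flows above 10% of link capacity are "large"
POLL_INTERVAL = 5.0                        # seconds between polls (illustrative)

last_bytes = {}                            # flow id -> byte count at previous poll

def detect_elephants(poll_flow_bytes):
    """poll_flow_bytes() -> {flow_id: cumulative_bytes} read from edge switches."""
    counts = poll_flow_bytes()
    elephants = []
    for flow, total in counts.items():
        rate_bps = 8 * (total - last_bytes.get(flow, 0)) / POLL_INTERVAL
        if rate_bps > THRESHOLD_BPS:
            elephants.append(flow)         # hand these to demand estimation / placement
        last_bytes[flow] = total
    return elephants
```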
44
Demand Estimation
Measured flow rate is misleading
Need to find a flow’s “natural” bandwidth requirement when not limited by the network: find the max-min fair bandwidth allocation
45
Demand Estimation
Given the traffic matrix of large flows, modify each flow’s size at its source and destination iteratively…
Senders equally distribute bandwidth among outgoing flows that are not receiver-limited
Network-limited receivers decrease the exceeded capacity equally among incoming flows
Repeat until all flows converge
Guaranteed to converge in O(|F|) time
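A simplified Python sketch of this iteration, with host capacities normalized to 1.0; the paper's receiver step redistributes unused slack more carefully, which is omitted here for brevity.

```python
def estimate_demands(flows, num_iters=100):
    """flows: list of (flow_id, src, dst). Host capacities normalized to 1.0."""
    demand = {f: 0.0 for f in flows}
    limited = {f: False for f in flows}     # receiver-limited flags

    for _ in range(num_iters):
        # Senders: split remaining capacity equally among not-receiver-limited flows
        for src in {f[1] for f in flows}:
            out = [f for f in flows if f[1] == src]
            fixed = sum(demand[f] for f in out if limited[f])
            free = [f for f in out if not limited[f]]
            if free:
                share = max(0.0, 1.0 - fixed) / len(free)
                for f in free:
                    demand[f] = share

        # Receivers: if incoming demand exceeds capacity, cap offending flows equally
        for dst in {f[2] for f in flows}:
            incoming = [f for f in flows if f[2] == dst]
            if sum(demand[f] for f in incoming) > 1.0:
                share = 1.0 / len(incoming)
                for f in incoming:
                    if demand[f] > share:
                        demand[f] = share
                        limited[f] = True
    return demand

# Illustrative example: h1 sends two large flows and h2 one, all to h3
flows = [("f1", "h1", "h3"), ("f2", "h1", "h3"), ("f3", "h2", "h3")]
print(estimate_demands(flows))   # max-min fair result: each flow gets about 1/3
```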
47-50
Demand Estimation (worked example over four figure-only slides)
50
Flow Placement Heuristics
Two approaches:
Global First Fit: greedily choose a path that has sufficient unreserved bandwidth
Simulated Annealing: iteratively find a globally better mapping of paths to flows
51
Global First-Fit
New flow detected: linearly search all possible paths from S to D
Place the flow on the first path whose component links can fit that flow
52
Global First-Fit
Once a flow ends, its entries and reservations time out
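A small Python sketch of Global First-Fit under illustrative link names and capacities (not Hedera's actual data structures): reserve the first candidate path where every link still has enough unreserved capacity for the flow's estimated demand.

```python
def global_first_fit(flow_demand, candidate_paths, reserved, link_capacity):
    """candidate_paths: list of paths, each a list of link ids.
    reserved: dict link_id -> bandwidth already reserved on that link."""
    for path in candidate_paths:
        if all(reserved.get(l, 0.0) + flow_demand <= link_capacity[l] for l in path):
            for l in path:                      # greedily reserve this path
                reserved[l] = reserved.get(l, 0.0) + flow_demand
            return path
    return None    # no path fits; fall back to default ECMP placement

link_capacity = {"e1-a1": 1.0, "a1-c1": 1.0, "a1-c2": 1.0, "c1-a2": 1.0,
                 "c2-a2": 1.0, "a2-e2": 1.0}
reserved = {"a1-c1": 0.8}                       # uplink to core 1 already mostly used
paths = [["e1-a1", "a1-c1", "c1-a2", "a2-e2"],
         ["e1-a1", "a1-c2", "c2-a2", "a2-e2"]]
print(global_first_fit(0.5, paths, reserved, link_capacity))   # picks the core-2 path
```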
53
Simulated Annealing
A numerical optimization technique for searching for a solution in a space otherwise too large for ordinary search methods to yield results
Probabilistic search for good flow-to-core mappings
54
Simulated Annealing
State: a set of mappings from destination hosts to core switches
Neighbor state: swap core switches between 2 hosts (within the same pod, within the same edge switch, etc.)
55
Simulated Annealing
Energy function: total exceeded bandwidth capacity (using the estimated demand of flows); minimize the exceeded capacity
Temperature: iterations left; fixed number of iterations (1000s)
Achieves a good core-to-flow mapping, sometimes very close to the global optimum
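A hedged Python sketch of the annealing loop: the state maps destination hosts to core switches, a neighbor state swaps two hosts' cores, the energy is the total exceeded capacity, and the temperature is the number of iterations left. The exceeded_capacity() function is left abstract here; Hedera computes it from the estimated flow demands and the fat-tree topology.

```python
import math
import random

def simulated_annealing(initial_state, exceeded_capacity, iterations=1000):
    """initial_state: dict host -> core switch; exceeded_capacity: state -> float."""
    state = dict(initial_state)
    energy = exceeded_capacity(state)
    best_state, best_energy = dict(state), energy

    for t in range(iterations, 0, -1):          # temperature = iterations left
        neighbor = dict(state)
        h1, h2 = random.sample(list(neighbor), 2)
        neighbor[h1], neighbor[h2] = neighbor[h2], neighbor[h1]   # swap core switches

        e = exceeded_capacity(neighbor)
        # Always accept improvements; accept worse states with a temperature-
        # dependent probability so the search can escape local minima.
        if e < energy or random.random() < math.exp((energy - e) / t):
            state, energy = neighbor, e
            if energy < best_energy:
                best_state, best_energy = dict(state), energy
    return best_state, best_energy
```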
56
Simulated Annealing
Example run: 3 flows, 3 iterations (figure)
57
Simulated Annealing
The final state is published to the switches and used as the initial state for the next round
58
Simulated Annealing Optimizations
Assign a single core switch to each destination host
Incremental calculation of exceeded capacity
Use the previous iteration’s best result as the initial state
The controller inserts the optimized rules after each scheduling round
59
Evaluation
60
Evaluation: Data Shuffle
Metric | ECMP | GFF | SA | Control
Total Shuffle Time (s) | 438.4 | 335.5 | 336.0 | 306.4
Avg. Completion Time (s) | 358.1 | 258.7 | 262.0 | 226.6
Avg. Bisection BW (Gbps) | 2.81 | 3.89 | 3.84 | 4.44
Avg. host goodput (MB/s) | 20.9 | 29.0 | 28.6 | 33.1
16 hosts: 120 GB all-to-all in-memory shuffle
Hedera achieves 39% better bisection BW than ECMP, 88% of an ideal non-blocking switch
61
Reactiveness
Demand Estimation: 27K hosts, 250K flows, converges in < 200 ms
Simulated Annealing: asymptotically dependent on # of flows + # of iterations; 50K flows and 10K iterations: 11 ms; most of the final bisection BW is reached in the first few hundred iterations
Scheduler control loop: Polling + Estimation + SA = 145 ms for 27K hosts
62
Limitations
Dynamic workloads: if large-flow turnover is faster than the control loop, the scheduler will be continually chasing the traffic matrix
Need to include a penalty term for unnecessary SA flow re-assignments
[figure: ECMP vs. Hedera regimes; axes: flow size vs. matrix stability (stable/unstable)]
63
Conclusion
Rethinking networking: open interfaces to the data plane, separation of control and data, leveraging techniques from distributed systems
Significant momentum in both research and industry