TOWARDS AN ELASTIC DISTRIBUTED SDN CONTROLLER Advait Dixit, Fang Hao, Sarit Mukherjee, T.V. Lakshman, Ramana Kompella
Physical Network Infrastructure SDN Control Plane Distributed Control Plane Single point of failure Performance bottleneck
Spatial Partitioning Overload
Growing the Control Plane
Shrinking the Control Plane
Goals: Build a distributed control plane that load balances, grows, and shrinks. This requires load estimation at controllers and a switch migration protocol.
Naïve Switch Migration. Diagram labels: SLAVE, MASTER, SLAVE; Role Change to Master.
Problem With Naïve Switch Migration Packet-In MASTER SLAVE Packet-Out Role Change to Master Packet-Out from Slave is dropped
Migration Protocol Requirements. Safety: exactly 1 controller processes every message from the switch. Liveness: for each switch, at least 1 controller is active at all times. OpenFlow compliant.
4-Phase Switch Migration Protocol. Phase 1: Change from Slave to Equal. Phase 2: Insert and remove dummy flow. Phase 3: Flush in-flight messages. Phase 4: Change from Equal to Master. Diagram labels: MASTER, SLAVE, EQUAL; Role Change to Equal; Flow-Add; Flow-Delete; Flow-Removed (x2); Barrier Request; Barrier Reply; Role Change to Master.
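The four phases can be walked through with a toy model. This is a minimal sketch, not the authors' implementation: `ToySwitch`, `migrate`, and the controller names "A" and "B" are hypothetical, and which controller sends each message is an assumption read off the slide's labels. It relies on two OpenFlow behaviours: master and equal controllers both receive asynchronous messages such as Flow-Removed, and a switch demotes the current master to slave when another controller becomes master.

```python
from dataclasses import dataclass, field

ASYNC_ROLES = ("MASTER", "EQUAL")  # roles that receive Packet-In / Flow-Removed


@dataclass
class ToySwitch:
    """Hypothetical stand-in for a switch: tracks each controller's role."""
    roles: dict                                  # controller name -> role
    trace: list = field(default_factory=list)    # (message, controller) pairs

    def role_request(self, controller, role):
        # A switch keeps at most one master: promoting one demotes the other.
        if role == "MASTER":
            for other, current in list(self.roles.items()):
                if current == "MASTER" and other != controller:
                    self.roles[other] = "SLAVE"
        self.roles[controller] = role
        self.trace.append((f"Role Change to {role.title()}", controller))

    def active(self):
        """Controllers that currently receive asynchronous messages."""
        return sorted(c for c, r in self.roles.items() if r in ASYNC_ROLES)


def migrate(switch, old, new):
    """Move mastership of `switch` from `old` to `new`; return the new owner."""
    # Phase 1: the target controller changes from slave to equal.
    switch.role_request(new, "EQUAL")
    # Phase 2: insert and remove a dummy flow (assumed here to be sent by `old`).
    switch.trace.append(("Flow-Add", old))
    switch.trace.append(("Flow-Delete", old))
    for controller in switch.active():           # both controllers see it
        switch.trace.append(("Flow-Removed", controller))
    owner = new                                  # hand-over point for new messages
    # Phase 3: a barrier flushes messages still in flight at `old`.
    switch.trace.append(("Barrier Request", old))
    switch.trace.append(("Barrier Reply", old))
    # Phase 4: the target controller changes from equal to master.
    switch.role_request(new, "MASTER")
    return owner
```

Calling `migrate(ToySwitch({"A": "MASTER", "B": "SLAVE"}), "A", "B")` leaves "B" as master and "A" as slave. At every step at least one controller holds a master or equal role, which is the liveness requirement.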
A Mininet Testbed. Problem: Cannot generate sufficient traffic for a large distributed control plane. Diagram labels: veth Pair, Open vSwitch, veth Pair, Open vSwitch, veth Pair, Emulation Host.
A Multi-Host Mininet Testbed. Diagram labels: Open vSwitch, Emulation Host, Open vSwitch, GRE Tunnel, Emulation Host.
Evaluation
Next Step: ElastiCon. Diagram labels: Physical Network Infrastructure; Distributed SDN Control Plane with Node 1 and Node 2, each running a Core Controller Module with Application 1 and Application 2; Distributed Data Store (e.g., Hazelcast); Load Measurements; Load Adaptation Decisions: Load Balance, Scale Up, Scale Down; Actions: Migrate switch, Remove controller, Add controller.
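The path from load measurements to actions might be sketched as a simple threshold rule. This is an illustration only: the function name `decide`, the utilization scale, and the `high` and `low` thresholds are all assumptions, since the slides do not specify a decision algorithm.

```python
def decide(loads, high=0.8, low=0.2):
    """Pick one action from per-controller load measurements.

    loads: dict mapping controller name -> utilization in [0, 1].
    high, low: hypothetical thresholds, not taken from the slides.
    """
    average = sum(loads.values()) / len(loads)
    if average > high:
        # Scale up: the whole control plane is overloaded.
        return ("add_controller",)
    if average < low and len(loads) > 1:
        # Scale down: retire the least loaded controller.
        return ("remove_controller", min(loads, key=loads.get))
    busiest = max(loads, key=loads.get)
    idlest = min(loads, key=loads.get)
    if loads[busiest] > high and busiest != idlest:
        # Load balance: move a switch away from the overloaded controller.
        return ("migrate_switch", busiest, idlest)
    return ("no_op",)
```

For example, `decide({"c1": 0.9, "c2": 0.3})` suggests migrating a switch from "c1" to "c2". A real control loop would also need to choose which switch to move and to damp oscillation between scaling up and down.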
THANK YOU
Distributed Control Plane in a Datacenter. Median flow arrival rate requires 1-5 controllers, peak requires 150*. Distributed control plane should grow and shrink. Flow arrival rates vary across switches and time. Need a switch migration (handover) protocol. * Calculations based on Benson et al., IMC 2010
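The sizing arithmetic behind such figures is a ceiling division of flow arrival rate by per-controller capacity. The sketch below uses a hypothetical helper, `controllers_needed`. The per-controller capacity of 2M flows/sec is an assumption, chosen only because it reproduces the 150-controller figure from the 300M flows/sec peak quoted elsewhere in the deck. The slides do not state the capacity actually used.

```python
import math


def controllers_needed(flow_rate, per_controller_capacity):
    """Smallest number of controllers able to absorb `flow_rate` flows/sec."""
    if per_controller_capacity <= 0:
        raise ValueError("capacity must be positive")
    # At least one controller is always required (liveness).
    return max(1, math.ceil(flow_rate / per_controller_capacity))


# Assumed capacity in flows/sec per controller; not stated in the slides.
ASSUMED_CAPACITY = 2_000_000

peak_controllers = controllers_needed(300_000_000, ASSUMED_CAPACITY)  # 150
```

The gap between the controller count needed at peak and at the median is what motivates growing and shrinking the control plane rather than provisioning for peak.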
Distributed SDN Control Plane. Diagram labels: Physical Network Infrastructure; Controller Node 1 and Controller Node 2, each with a Core Controller Module, Application 1, and Application 2; Distributed Data Store (e.g., Hazelcast); Hazelcast Client Stub.
Next Steps: Build a control loop and algorithms for dynamically changing the switch-controller mapping based on load, and for growing and shrinking the distributed controller. Thanks! Questions?
For a data center with 100K hosts: peak flow arrival rate = 300M* flows/sec; median flow arrival rate = M* flows/sec. Impossible to predict flow arrival rates at a switch. Implications: distributed controller needs to grow and shrink; need a switch migration protocol. * From Benson et al., IMC 2010
New Problems in Distributed SDN Controllers How to manage distributed state? Where to place controllers? How to write distributed controller applications? How many controllers? Which switch connects to which controller?
Distributed SDN Controller Distributed Global Network State Physical Network Infrastructure Application 1 Application 2 Application 3 Application 4
Problem Statement: How many servers? How to determine switch-controller mapping?
Naïve Switch Migration. Diagram labels: SLAVE, MASTER, SLAVE; Role Change to Master; SLAVE.