DOT – Distributed OpenFlow Testbed
Motivation. Mininet is currently the de facto tool for emulating an OpenFlow-enabled network. However, the size of the network and the amount of traffic are limited by the hardware resources of a single machine. Our recent experiments with Mininet show that it can cause flow serialization of otherwise parallel flows: many flows co-exist and compete for switch resources, since transmission rates are limited by the CPU. Moreover, the process of running parallel iperf servers and clients is not trivial.
Objective. Run large-scale emulations of OpenFlow-enabled networks, avoid or reduce the flow serialization and contention introduced by the emulation environment, and enable emulation of large amounts of traffic.
DOT Emulation. An embedding algorithm partitions the logical network across multiple physical hosts. An intra-host virtual link is embedded inside a single host, while a cross-host link connects switches located at different hosts. A Gateway Switch (GS) is added to each active physical host to emulate the link delay of cross-host links. The augmented network with GSs is called the physical network; the SDN controller operates on the logical network.
Embedding of Logical Network. The embedding algorithm partitions the emulated network across several physical hosts. Our heuristic minimizes the number of physical hosts and cross-host links while respecting resource constraints, and the resulting embedding guarantees resource requirements such as CPU, memory, and link bandwidth. (Figure: an emulated network embedded onto two physical machines, with cross-host links spanning Physical Host 1 and Physical Host 2.)
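The heuristic itself is not detailed here; the sketch below is only a rough illustration of a greedy, first-fit style partitioning in Python that prefers hosts already holding a switch's neighbours (to reduce cross-host links) and opens a new host only when no existing host has capacity (to reduce the host count). All names and the scoring rule are illustrative assumptions, not DOT's actual algorithm.

```python
# Illustrative sketch only: DOT's real embedding heuristic is not shown here.
# All names below are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Host:
    cpu: float
    mem: float
    placed: set = field(default_factory=set)   # logical switches on this host

def embed(switches, demands, adjacency, host_cpu, host_mem):
    """Greedy first-fit partitioning.

    switches:  iterable of logical switch ids
    demands:   switch id -> {"cpu": ..., "mem": ...}
    adjacency: switch id -> set of neighbouring switch ids
    """
    hosts = []
    # Place heavier switches first so large demands are not stranded.
    for sw in sorted(switches, key=lambda s: demands[s]["cpu"], reverse=True):
        need = demands[sw]
        candidates = [h for h in hosts
                      if h.cpu >= need["cpu"] and h.mem >= need["mem"]]
        if candidates:
            # Most co-located neighbours first, i.e. fewest new cross-host links.
            best = max(candidates, key=lambda h: len(adjacency[sw] & h.placed))
        else:
            best = Host(cpu=host_cpu, mem=host_mem)   # open a new physical host
            hosts.append(best)
        best.cpu -= need["cpu"]
        best.mem -= need["mem"]
        best.placed.add(sw)
    return hosts
```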
Embedding Cross-host Links. Each active physical host contains a Gateway Switch (GS). A cross-host link is divided into two segments; for example, cross-host link a is divided into segments a' and a''. Each segment is connected to the GS of its physical host; for example, a' is connected to GS1. (Figure: cross-host links a and b between Virtual Switches (VS), and their physical embedding with gateway switches, where a is split into a' and a'' and b into b' and b''.)
SDN Controller's View. (Figure: the SDN controller and its view of the emulated network.)
Software Stack of a DOT Node. VMs are used for generating traffic, and the hypervisor layer is responsible for provisioning them. VSs and GSs are instances of an OpenFlow-enabled virtual switch (e.g., OpenVSwitch). (Figure: virtual interfaces and virtual links inside the node, connected through the physical link to an OpenFlow switch.)
Gateway Switch. A DOT component; there is one gateway switch per active physical host, attached to the physical NIC of the machine. It facilitates packet transfer between physical hosts, enables emulation of the delays of cross-host links, and is oblivious to the forwarding protocol used in the emulated network.
Simulating Delay of Cross-host Links. Only one of the two segments of a cross-host link simulates the link delay. (Figure: the emulated network with only the cross-host links shown, and its physical embedding.)
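DOT simulates these delays with Linux tc (see the technology slide at the end); below is a minimal sketch of attaching a netem delay to the interface backing the segment chosen to carry the delay. The interface name and delay value are placeholders.

```python
# Sketch: emulate the delay of a cross-host link on exactly one of its two
# segments using tc/netem. Interface name and delay value are assumptions.
import subprocess

def add_link_delay(interface: str, delay_ms: int) -> None:
    """Attach a netem qdisc so every packet leaving `interface` is delayed."""
    subprocess.run(
        ["tc", "qdisc", "add", "dev", interface, "root",
         "netem", "delay", f"{delay_ms}ms"],
        check=True,
    )

# Example: the segment of link A-F terminating at GS1 carries the full delay.
# add_link_delay("veth-a-gs1", 10)
```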
Simulating Delay. In this scenario, three packets are sent over three different cross-host links: A-F, B-E, and D-E.
Simulating Delay. When a packet is received at a Gateway Switch through its physical interface, the GS must identify the remote segment through which the packet was previously forwarded. For instance, GS2 has to forward each packet through the particular link it belongs to even when the next hop is the same (e.g., B->E and D->E).
Traffic Forwarding at the Gateway Switch: candidate solutions are MAC rewriting, tagging, and tunneling with a tag.
Approach 1: MAC Rewriting. Each GS maintains the IP-to-MAC address mapping of all VMs. When a packet arrives at a GS through logical links, the GS replaces the source MAC with the MAC of its receiving port (enabling the remote GS to identify the segment through which the packet was forwarded) and the destination MAC with the MAC of the destination physical host's NIC (enabling unicast of the packet through the physical switching fabric). When a GS receives a packet from the physical interface, it checks the source MAC to identify the segment through which it should forward the packet and, before forwarding, restores the source and destination MACs by inspecting the packet's IP addresses.
Approach 1: MAC Rewriting. (Figures: step-by-step walk-through of a packet sent from VM2 to VM1, shown alongside the SDN controller's view; the GS segment ports are PB, PC, PD, PE and the physical NICs are PM1, PM2.)
Approach 1: MAC Rewriting. Gateway Switch rules (segment ports PB, PC on GS1 and PD, PE on GS2; physical NICs PM1 and PM2):
GS1, outward traffic: if the receiving port is PB, set srcMAC←PB and dstMAC←PM2; if the receiving port is PC, set srcMAC←PC and dstMAC←PM2; output: PM1.
GS1, inward traffic: if srcMAC = PD, output: PB; if srcMAC = PE, output: PC; restore the MACs by inspecting the IP addresses.
GS2, outward traffic: if the receiving port is PD, set srcMAC←PD and dstMAC←PM1; if the receiving port is PE, set srcMAC←PE and dstMAC←PM1; output: PM2.
GS2, inward traffic: if srcMAC = PB, output: PD; if srcMAC = PC, output: PE; restore the MACs by inspecting the IP addresses.
(Figure: in the controller's view, the packet from VM2 to VM1 leaves GS2 with srcMAC = PD and dstMAC = PM1, and the original MACs are restored at GS1.)
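As an illustration, the GS1 rules above could be expressed as OpenFlow rules on an OpenVSwitch gateway switch via ovs-ofctl. This is only a sketch: the bridge name, OpenFlow port numbers, and MAC/IP addresses below are hypothetical placeholders, not the rules DOT actually installs.

```python
# Sketch of GS1's MAC-rewriting rules, installed with ovs-ofctl.
# Bridge name, port numbers, and addresses are hypothetical placeholders.
import subprocess

BRIDGE = "br-gs1"
PB, PC, PM1 = 2, 3, 1                      # OpenFlow port numbers on GS1 (assumed)
MAC_PB, MAC_PC = "02:00:00:00:00:0b", "02:00:00:00:00:0c"
MAC_PD = "02:00:00:00:00:0d"               # segment port on the remote GS2
MAC_PM2 = "02:00:00:00:00:f2"              # physical NIC of physical host 2
MAC_VM1, IP_VM1 = "02:00:00:00:01:01", "10.0.0.1"
MAC_VM2, IP_VM2 = "02:00:00:00:01:02", "10.0.0.2"

def add_flow(flow: str) -> None:
    subprocess.run(["ovs-ofctl", "add-flow", BRIDGE, flow], check=True)

# Outward traffic: stamp the packet with the receiving segment's MAC, address
# it to the remote host's physical NIC, and send it out on PM1.
add_flow(f"in_port={PB},actions=mod_dl_src:{MAC_PB},mod_dl_dst:{MAC_PM2},output:{PM1}")
add_flow(f"in_port={PC},actions=mod_dl_src:{MAC_PC},mod_dl_dst:{MAC_PM2},output:{PM1}")

# Inward traffic: the source MAC identifies the remote segment (here PD -> PB);
# the original VM MACs are restored by matching on the IP addresses.
add_flow(f"in_port={PM1},dl_src={MAC_PD},ip,nw_src={IP_VM2},nw_dst={IP_VM1},"
         f"actions=mod_dl_src:{MAC_VM2},mod_dl_dst:{MAC_VM1},output:{PB}")
```

Note that restoring the original MACs on inward traffic requires one rule per source/destination IP pair, which is the scalability limitation noted below.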
Approach 1: MAC Rewriting. Advantages: the packet size remains the same, and no change is required in the physical switching fabric. Limitations: every GS needs to maintain the IP-to-MAC address mapping of all VMs, so the approach is not scalable.
Approach 2: Tunnel with Tag. A unique ID is assigned to each cross-host link. When a packet arrives at a GS through internal logical links, the GS encapsulates it with a tunneling protocol (e.g., GRE); the destination address of the outer header is the IP address of the remote physical host, and a tag equal to the ID of the cross-host link is attached to the packet (using the tunnel ID field of GRE). When a GS receives a packet from the physical interface, it checks the tag (tunnel ID) to identify the outgoing segment and forwards the packet after decapsulating the tunnel header.
Approach 2: Tunnel with Tag. (Figure: the two cross-host links are assigned IDs #1 and #2; the controller's view, the GS segment ports PB, PC, PD, PE, and the physical NICs PM1, PM2 are as before.)
Approach 2: Tunnel with Tag. Gateway Switch rules:
GS1, outward traffic: if the receiving port is PB, set tunnel ID←1; if the receiving port is PC, set tunnel ID←2; use the tunnel to Machine 2.
GS1, inward traffic: if tunnel ID = 1, output: PB; if tunnel ID = 2, output: PC.
GS2, outward traffic: if the receiving port is PD, set tunnel ID←1; if the receiving port is PE, set tunnel ID←2; use the tunnel to Machine 1.
GS2, inward traffic: if tunnel ID = 1, output: PD; if tunnel ID = 2, output: PE.
Approach 2: Tunnel with Tag. (Figure: the original packet, still carrying the VM addresses, is encapsulated with a tunnel header whose outer MAC/IP addresses are those of the physical machines (PM1, PM2) and whose tunnel ID (TID) is #1.)
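As an illustration, the GS1 rules above could be realized on OpenVSwitch with a GRE port in key=flow mode, setting the tunnel ID on outward traffic and matching on it for inward traffic. This is only a sketch: the bridge and port names, port numbers, and the remote host's IP address are hypothetical placeholders, not DOT's actual configuration.

```python
# Sketch of GS1's tunnel-with-tag rules on OpenVSwitch.
# Names, port numbers, and addresses are hypothetical placeholders.
import subprocess

def sh(*cmd: str) -> None:
    subprocess.run(cmd, check=True)

BRIDGE = "br-gs1"
PB, PC = 2, 3                    # segment ports on GS1 (assumed)
GRE_PORT = 10                    # OpenFlow port number of the GRE interface (assumed)
REMOTE_HOST_IP = "192.168.1.2"   # physical host 2

# One GRE port towards the remote physical host; key=flow lets the flow table
# choose the tunnel ID per packet.
sh("ovs-vsctl", "add-port", BRIDGE, "gre-to-pm2", "--",
   "set", "interface", "gre-to-pm2", "type=gre",
   f"options:remote_ip={REMOTE_HOST_IP}", "options:key=flow")

def add_flow(flow: str) -> None:
    sh("ovs-ofctl", "add-flow", BRIDGE, flow)

# Outward traffic: tag with the cross-host link ID and send through the tunnel.
add_flow(f"in_port={PB},actions=set_tunnel:1,output:{GRE_PORT}")
add_flow(f"in_port={PC},actions=set_tunnel:2,output:{GRE_PORT}")

# Inward traffic: the tunnel ID identifies the outgoing segment; the tunnel
# header has already been stripped when the packet arrives on the GRE port.
add_flow(f"in_port={GRE_PORT},tun_id=1,actions=output:{PB}")
add_flow(f"in_port={GRE_PORT},tun_id=2,actions=output:{PC}")
```

With this scheme the number of rules per GS grows with the number of cross-host links rather than with the number of VM pairs, which is the scalability advantage listed on the next slide.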
Approach 2: Tunnel with Tag. Advantages: no change is required in the physical switching fabric, no GS needs to know the IP-to-MAC address mapping, the rule set in a GS is on the order of the number of cross-host links, and the solution is scalable. Limitation: it lowers the effective MTU. Because of the scalability issue of MAC rewriting, we choose this solution.
Emulating Bandwidth. Bandwidth is configured for each logical link using the Linux tc command. The maximum bandwidth of a cross-host link is bounded by the physical switching capacity, while the maximum bandwidth of an intra-host link is capped by the processing capability of the physical host.
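A minimal sketch of capping a logical link's bandwidth with tc is shown below; the interface name and the token-bucket parameters are assumptions, not DOT's actual settings.

```python
# Sketch: cap the bandwidth of the virtual interface backing a logical link
# with a token-bucket filter (tc tbf). Interface name and parameters are assumptions.
import subprocess

def set_link_bandwidth(interface: str, rate_mbit: int) -> None:
    """Attach a tbf qdisc so traffic leaving `interface` is shaped to rate_mbit."""
    subprocess.run(
        ["tc", "qdisc", "add", "dev", interface, "root",
         "tbf", "rate", f"{rate_mbit}mbit", "burst", "32kbit", "latency", "400ms"],
        check=True,
    )

# Example: shape a 100 Mbit/s logical link.
# set_link_bandwidth("veth-s1-s2", 100)
```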
DOT: Summary. DOT can emulate an OpenFlow network with traffic forwarding, specific link delays, and bandwidth. Traffic forwarding: the general virtual switches are OpenVSwitch instances that forward traffic as instructed by the Floodlight controller, while the Gateway Switches are OpenVSwitch instances that forward traffic based on pre-configured flow rules.
Technology Used So Far. OpenVSwitch version 1.8, with a rate limit configured on each port. Floodlight Controller version 0.9, with custom modules added (Static Network Loader, ARP Resolver). Hypervisor: Qemu-KVM. Link delays are simulated using tc (Linux traffic control).
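The per-port rate limit mentioned above can be configured through OpenVSwitch ingress policing; the sketch below uses a placeholder interface name and rates, and is only one way such a limit might be set.

```python
# Sketch: per-port rate limiting on OpenVSwitch via ingress policing.
# Interface name and rate values (kbps / kbits) are assumptions.
import subprocess

def set_port_rate_limit(interface: str, rate_kbps: int, burst_kbits: int) -> None:
    """Police traffic received on an OVS port to roughly rate_kbps."""
    subprocess.run(
        ["ovs-vsctl", "set", "interface", interface,
         f"ingress_policing_rate={rate_kbps}",
         f"ingress_policing_burst={burst_kbits}"],
        check=True,
    )

# Example: limit a VM-facing port to about 100 Mbit/s.
# set_port_rate_limit("vnet0", 100000, 10000)
```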