Scaling The Edge Bridge Address Table In Datacenter Networks
June 2012
Agenda
  Motivation
  Protocol properties, concepts and operation
  Protocol details
Motivation
Problem Statement
  Address learning methods
    Control-plane learning
    Data-plane learning
  Data-plane learning is simpler than control-plane learning, yet it makes forwarding tables scale poorly
  Question: can we have both the simplicity of data-plane learning and scalable forwarding tables?
Dataplane Learning On Edge Bridges (EB)
  [Figure: VM1 sends a broadcast (e.g. ARP Request) that is flooded over the overlay network. EBs A, B and C each hold a VM to EB/Port table and learn VM1 from the flooded frame (VM1 at port A.1 on EB A).]
  Dataplane learning EB table size = # of VMs in the VLAN/Tenant Domain
  Severe FDB Scaling Problem in EB
Protocol Concepts and Operation
Properties of The Proposed Solution
  Bridge address table scaling for data-center networks with support for hot VM migration
    FDB size = # of EBs in the network + # of locally attached VMs
  Layer-2 only
    No higher layers awareness
  End point (Hypervisor) is blind to overlay network protocol
    Can work with any overlay protocol
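The FDB size claim can be checked with simple arithmetic. The sketch below compares the two table-size formulas from the slides; the deployment numbers (500 EBs, 40 VMs per EB) and the function names are assumptions chosen only for illustration, not figures from the presentation.

```python
def fdb_size_dataplane_learning(total_vms_in_domain: int) -> int:
    """Plain data-plane learning: EB table size = # of VMs in the VLAN/tenant domain."""
    return total_vms_in_domain


def fdb_size_proposed(num_ebs: int, local_vms: int) -> int:
    """Proposed scheme: # of EBs in the network + # of locally attached VMs."""
    return num_ebs + local_vms


# Hypothetical deployment, for illustration only.
num_ebs = 500
vms_per_eb = 40
total_vms = num_ebs * vms_per_eb

print(fdb_size_dataplane_learning(total_vms))  # 20000
print(fdb_size_proposed(num_ebs, vms_per_eb))  # 540
```

Under these assumed numbers the table shrinks from one entry per VM in the domain to one entry per EB plus the local VMs.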
Protocol Concepts
  The protocol defines
    Data-plane format between the hypervisor and the Edge-Bridge
      – Modify 802.1BR or extend 802.1Qbg
    Control-plane negotiates the protocol capabilities between the EB and the hypervisor
      – Extend DCBX 802.1Qaz
  Protocol concepts
    A handshake between the EB and the hypervisor
      – Capabilities exchange using control-plane
      – Dynamic operation uses the data-plane
    EB
      – Learns addresses of local VMs & remote EBs
      – Uses data-plane signaling to inform the hypervisor of the path in the overlay network
      – Uses the path signaled by the hypervisor to forward traffic to remote VMs over the overlay network
    Hypervisor
      – Sends data traffic to EB with path indication
      – Updates its path database (Path$) using the indications received from the EB
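Protocol Databases and Signaling
  [Figure: two servers attached to edge bridges across an overlay network. Databases shown: Hypervisor Path$ (VM to Path), EB Local FDB (VM to Port), EB Overlay FDB (entries 1 A, 2 B, 3 C). Frame formats shown: generated by VM (D, S); server to EB (D, S, T.Path); EB to server (D, S, S.Path); Rx by VM (D, S).]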
Protocol Operation #1: VM1 → VM2, flooded unicast forwarding
  [Figure: VM1 (behind EB A, port A.1) sends a unicast frame to VM2 (behind EB B). The frame is flooded over the overlay to EBs B and C. EB A learns VM1 at A.1. Annotation: "Learn only in B.1".]
  Dataplane learning EB table size = # of local VMs + # of EBs in the network
Protocol Operation #2: VM2 → VM1 reply
  [Figure: VM2 replies to VM1. The frame leaves the hypervisor at B.1 with a T.Path, crosses the overlay from EB B to EB A, and is delivered to VM1 at A.1 with an S.Path.]
  Dataplane learning EB table size = # of local VMs + # of EBs in the network
Properties Of Hypervisor Path$
  Acts like ARP$ - holds active sessions only
    Inactive entries are aged out
    Not contaminated by ARP-BC received from the network
  Path$ entry insert/update
    ETH DA is UC/MC and conforms to a VM hosted by this hypervisor, OR
    ETH DA is BC and the Layer-3 DA conforms to a VM hosted by this hypervisor
  Path$ entry delete/refresh
    Using an activity timer
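The Path$ rules above can be sketched as a small cache. This is a minimal illustration, not the presenters' implementation: the class and method names, the 300-second idle timeout, the use of string path ids, and the choice to refresh an entry on lookup are all assumptions. Time is passed in explicitly so the aging behavior is easy to follow.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple

BROADCAST = "ff:ff:ff:ff:ff:ff"


@dataclass
class PathCache:
    """Sketch of a hypervisor Path$: remote VM address -> path id."""
    local_macs: Set[str]      # UC/MC ETH addresses accepted by VMs hosted here
    local_ips: Set[str]       # Layer-3 addresses of VMs hosted here
    max_idle: float = 300.0   # activity timeout in seconds (assumed value)
    entries: Dict[str, Tuple[str, float]] = field(default_factory=dict)

    def accepts(self, eth_da: str, l3_da: Optional[str]) -> bool:
        """Insert/update rule: DA must belong to a locally hosted VM."""
        if eth_da == BROADCAST:
            return l3_da in self.local_ips
        return eth_da in self.local_macs

    def on_receive(self, eth_sa: str, eth_da: str, l3_da: Optional[str],
                   s_path: str, now: float) -> bool:
        """Record (sender, S.Path) only for frames meant for a local VM."""
        if not self.accepts(eth_da, l3_da):
            return False  # e.g. ARP broadcast for someone else: no entry
        self.entries[eth_sa] = (s_path, now)
        return True

    def lookup(self, eth_da: str, now: float) -> Optional[str]:
        """Return the T.Path for eth_da, or None if unknown or aged out."""
        entry = self.entries.get(eth_da)
        if entry is None:
            return None
        path, last_seen = entry
        if now - last_seen > self.max_idle:
            del self.entries[eth_da]  # inactive entry is aged out
            return None
        self.entries[eth_da] = (path, now)  # assumption: use counts as activity
        return path


cache = PathCache(local_macs={"vm2"}, local_ips={"10.0.0.2"}, max_idle=10.0)
# ARP broadcast for another host's address does not create an entry.
cache.on_receive("vm1", BROADCAST, "10.0.0.9", "A", now=0.0)
# ARP broadcast for a local VM's address does.
cache.on_receive("vm1", BROADCAST, "10.0.0.2", "A", now=0.0)
```

Because entries are created only for traffic addressed to a local VM, the cache size tracks active sessions rather than the number of VMs in the domain.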
Protocol Details
Protocol Details
  Control protocol
    Capabilities negotiation between the Hypervisor and the Edge Bridge
    Modify 802.1Qaz (DCBx)
  Data-plane protocol (2 options)
    Add Path-ID Tag (P-Tag)
      – S-channel/E-Tag is outer
      – P-Tag is inner:
        – 16b source/target-path-id
        – Source/target depends on direction
    Modify BPE E-Tag
      – Hypervisor → EB
        – I-ECID – identical use to BPE
        – E-CID – target-path-id
      – EB → Hypervisor
        – I-ECID
          – I-ECID < 4K: local virtual port (identical to BPE)
          – I-ECID >= 4K: source-path-id
        – E-CID – identical use to BPE
  Frame layout (P-Tag option)
    DA (6B) | SA (6B) | S-Channel/E-Tag (4B) | P-Tag (4B) | VLAN (4B) | Payload + FCS
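The two data-plane options can be illustrated with a short encoding sketch. The slides state only that the P-Tag is 4 bytes and carries a 16-bit path id; the split into a 16-bit TPID followed by the path id, the TPID value 0x88B5 (an IEEE local-experimental EtherType used here as a placeholder), and all function names are assumptions. For the E-Tag option, the sketch only classifies an I-ECID received from the EB; how a source-path-id is derived from values at or above 4K is not specified in the slides and is left untouched.

```python
import struct
from typing import Tuple

PTAG_TPID = 0x88B5        # assumption: placeholder EtherType, not from the slides
LOCAL_PORT_LIMIT = 4096   # "4K" boundary from the slide


def pack_ptag(path_id: int) -> bytes:
    """Build a 4-byte P-Tag: assumed 16-bit TPID + 16-bit path id."""
    if not 0 <= path_id <= 0xFFFF:
        raise ValueError("path id must fit in 16 bits")
    return struct.pack("!HH", PTAG_TPID, path_id)


def unpack_ptag(tag: bytes) -> int:
    """Return the 16-bit path id carried in a P-Tag."""
    tpid, path_id = struct.unpack("!HH", tag)
    if tpid != PTAG_TPID:
        raise ValueError("not a P-Tag")
    return path_id


def classify_i_ecid(i_ecid: int) -> Tuple[str, int]:
    """EB to hypervisor direction of the modified E-Tag option."""
    if i_ecid < LOCAL_PORT_LIMIT:
        return ("local_virtual_port", i_ecid)   # identical to BPE
    return ("source_path_id", i_ecid)           # raw value; mapping unspecified


tag = pack_ptag(0x0102)
print(tag.hex())                 # 88b50102
print(unpack_ptag(tag))          # 258
print(classify_i_ecid(4095)[0])  # local_virtual_port
print(classify_i_ecid(4096)[0])  # source_path_id
```

Whether the path id in a given tag is a source or a target path depends on the direction of the frame, as the slide notes.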
Summary of Protocol Properties
  Bridge address table scaling for data-center networks with support for hot VM migration
    FDB size = # of EBs in the network + # of locally attached VMs
  Layer-2 only
    No higher layers awareness
  Hypervisor is blind to overlay network protocol
    Can work with any overlay protocol
  Easy to implement
    Local scope: hypervisor to edge-bridge protocol
    Simple control-plane – only need to negotiate capabilities
      – Extend DCBX 802.1Qaz
    Simple extension of existing data-plane protocols
      – Modifies 802.1BR E-Tag or extends 802.1BR/802.1Qbg with a P-Tag
  Easy to deploy
    Co-exists with 802.1Qbg/802.1BR protocols
    Support for incremental upgrade in per EB granularity
Detailed Packet Walkthrough Identical To The Animation
Walkthrough in a Nutshell (VM1 → VM2) #1
  VM1 → VM2 (VM2 ETH address is known to VM1) and back
  Initial state: all FDBs are empty
  Hypervisor hosting VM1
    Receive packet from VM1
    If VM2 is registered in Path$, forward with the registered T.Path
    Else forward with T.Path=BC
  EB-A
    Learn on FDB-A (VM1,A.1)
    T.Path=BC: flood to overlay and to local ports
  EB-B
    Replace tunnel-header with S.Path=A
    Forward to VM2 if VM2 is registered in FDB-B
    Else flood to local ports
  Hypervisor hosting VM2
    Receive the packet and update Path$ (VM1,Path=A) if:
      – ETH DA conforms to a VM hosted by this hypervisor, OR
      – ETH DA is BC and the Layer-3 DA conforms to a VM hosted by this hypervisor
    Pass packet to VM2 if either of the above conditions is true
Walkthrough in a Nutshell (VM2 → VM1) #2
  Hypervisor hosting VM2
    Receive packet from VM2
    VM1 is registered in Path$: send with T.Path=A
  EB-B
    Learn on FDB-B (VM2,B.1)
    Send over Path A to EB-A
  EB-A
    Replace tunnel-header with S.Path=B
    VM1 is registered in FDB-A (thanks to the VM1 → VM2 path)
    Forward to VM1
  Hypervisor hosting VM1
    Receive the packet and update Path$ (VM2,Path=B) if:
      – ETH DA conforms to a VM hosted by this hypervisor, OR
      – The Layer-3 DA conforms to a VM hosted by this hypervisor
    Pass packet to VM1 if either of the above conditions is true
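The round trip described in the two walkthrough slides can be traced with a toy simulation. This is a sketch under stated assumptions, not the presenters' code: all class and function names are hypothetical, a third EB (C, hosting VM3) is added to show that bystanders learn nothing, each EB serves a single hypervisor, and flooding to local ports and the EB overlay FDB are not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

BC = "BC"  # T.Path value meaning "path unknown, flood over the overlay"


@dataclass
class Hypervisor:
    vms: Set[str]
    path_cache: Dict[str, str] = field(default_factory=dict)  # remote VM -> path

    def t_path_for(self, dst: str) -> str:
        """Use the registered T.Path if dst is in Path$, else T.Path=BC."""
        return self.path_cache.get(dst, BC)

    def receive(self, src: str, dst: str, s_path: str) -> bool:
        """Update Path$ and deliver only if dst is hosted here."""
        if dst not in self.vms:
            return False
        self.path_cache[src] = s_path
        return True


@dataclass
class EdgeBridge:
    name: str
    fdb: Dict[str, str] = field(default_factory=dict)  # local VM -> local port


def deliver(src: str, dst: str, port: str, src_eb: EdgeBridge,
            ebs: Dict[str, EdgeBridge], hvs: Dict[str, Hypervisor]) -> List[str]:
    """Carry one frame; return the EBs that received it over the overlay."""
    t_path = hvs[src_eb.name].t_path_for(dst)
    src_eb.fdb[src] = port  # the EB learns only its locally attached VM
    if t_path == BC:
        targets = [name for name in ebs if name != src_eb.name]
    else:
        targets = [t_path]
    for name in targets:
        # The receiving EB replaces the tunnel header with S.Path = sending EB.
        hvs[name].receive(src, dst, s_path=src_eb.name)
    return targets


ebs = {n: EdgeBridge(n) for n in "ABC"}
hvs = {"A": Hypervisor({"VM1"}), "B": Hypervisor({"VM2"}), "C": Hypervisor({"VM3"})}

first = deliver("VM1", "VM2", "A.1", ebs["A"], ebs, hvs)  # flooded: Path$ empty
reply = deliver("VM2", "VM1", "B.1", ebs["B"], ebs, hvs)  # unicast over path A

print(first, reply)
print({n: eb.fdb for n, eb in ebs.items()})
print({n: hv.path_cache for n, hv in hvs.items()})
```

After both frames, each EB's local FDB holds only its own VM, the two communicating hypervisors hold one Path$ entry each, and the hypervisor behind EB C holds none even though it received the flooded frame.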
Thank you