BUFFALO: Bloom Filter Forwarding Architecture for Large Organizations Minlan Yu Princeton University Joint work with Alex Fabrikant, Jennifer Rexford 1
Challenges in Edge Networks Edge networks – Enterprise and data center networks Large number of hosts – Tens or even hundreds of thousands of hosts Dynamic hosts – Host mobility, virtual machine migration Cost conscious – Equipment costs, network-management costs 2 Need an easy-to-manage and scalable network
Flat Addresses Self-configuration – Hosts: MAC addresses – Switches: self-learning Simple host mobility – Flat addresses are location-independent But, poor scalability and performance – Flooding frames to unknown destinations – Inefficient delivery of frames over spanning tree – Large forwarding tables (one entry per address) 3
Improving Performance and Scalability Recent proposals on improving control plane – TRILL (IETF), SEATTLE (Sigcomm’08) Scaling information dissemination – Distribute topology and host information intelligently rather than flooding Improving routing performance – Use shortest path routing rather than spanning tree Scaling with large forwarding tables – This talk: BUFFALO 4
Large-scale SPAF Networks SPAF:Shortest Path on Addresses that are Flat – Shortest paths: scalability and performance – Flat addresses: self-configuration and mobility 5 S H S S S S S S S S S S S S H H H H H H H H Data plane scalability challenge – Forwarding table growth (in # of hosts and switches) – Increasing link speed
State of the Art Hash table in SRAM to store forwarding table – Map MAC addresses to next hop – Hash collisions: extra delay Overprovision to avoid running out of memory – Perform poorly when out of memory – Difficult and expensive to upgrade memory 6 00:11:22:33:44:55 00:11:22:33:44:66 aa:11:22:33:44:77 …
Bloom Filters Bloom filters in fast memory (SRAM) – A compact data structure for a set of elements – Calculate s hash functions to store element x – Easy to check membership – Reduce memory at the expense of false positives h 1 (x)h 2 (x)h s (x) x V0V0 V m-1 h 3 (x)
One Bloom filter (BF) per next hop – Store all addresses forwarded to that next hop 8 Nexthop 1 Nexthop 2 Nexthop T …… Packet destination query Bloom Filters hit BUFFALO: Bloom Filter Forwarding
BUFFALO Challenges How to optimize memory usage? Minimize the false-positive rate How to handle false positives quickly? No memory/payload overhead How to handle routing dynamics? Make it easy and fast to adapt Bloom filters 9
BUFFALO Challenges How to optimize memory usage? Minimize the false-positive rate How to handle false positives quickly? No memory/payload overhead How to handle routing dynamics? Make it easy and fast to adapt Bloom filters 10
Optimize Memory Usage Consider fixed forwarding table Goal: Minimize overall false-positive rate – Probability one or more BFs have a false positive Input: – Fast memory size M – Number of destinations per next hop – The maximum number of hash functions Output: the size of each Bloom filter – Larger BF for next-hops with more destinations 11
Constraints and Solution Constraints – Memory constraint Sum of all BF sizes fast memory size M – Bound on number of hash functions To bound CPU calculation time Bloom filters share the same hash functions Proved to be a convex optimization problem – An optimal solution exists – Solved by IPOPT (Interior Point OPTimizer) 12
– Forwarding table with 200K entries, 10 next hop – 8 hash functions – Optimization runs in about 50 msec Minimize False Positives 13
Comparing with Hash Table 14 65% Save 65% memory with 0.1% false positives More benefits over hash table – Performance degrades gracefully as tables grow – Handle worst-case workloads well
BUFFALO Challenges How to optimize memory usage? Minimize the false-positive rate How to handle false positives quickly? No memory/payload overhead How to handle routing dynamics? Make it easy and fast to adapt Bloom filters 15
False Positive Detection Multiple matches in the Bloom filters – One of the matches is correct – The others are caused by false positives 16 Nexthop 1 Nexthop 2 Nexthop T …… Packet destination query Bloom Filters Multiple hits
Handle False Positives Design goals – Should not modify the packet – Never go to slow memory – Ensure timely packet delivery BUFFALO solution – Exclude incoming interface Avoid loops in “one false positive” case – Random selection from matching next hops Guarantee reachability with multiple false positives 17
One False Positive Most common case: one false positive – When there are multiple matching next hops – Avoid sending to incoming interface Provably at most a two-hop loop – Stretch <= Latency(AB)+Latency(BA) 18 False positive A A Shortest path B B dst
Multiple False Positives Handle multiple false positives – Random selection from matching next hops – Random walk on shortest path tree plus a few false positive links – To eventually find out a way to the destination 19 dst Shortest path tree for dst False positive link
Stretch Bound Provable expected stretch bound – With k false positives, proved to be at most – Proved by random walk theories However, stretch bound is actually not bad – False positives are independent – Probability of k false positives drops exponentially Tighter bounds in special topologies – For tree, expected stretch is polynomial in k 20
Stretch in Campus Network 21 When fp=0.001% 99.9% of the packets have no stretch When fp=0.001% 99.9% of the packets have no stretch % packets have a stretch of shortest path length When fp=0.5%, % packets have a stretch 6 times of shortest path length
BUFFALO Challenges How to optimize memory usage? Minimize the false-positive rate How to handle false positives quickly? No memory/payload overhead How to handle routing dynamics? Make it easy and fast to adapt Bloom filters 22
Problem of Bloom Filters Routing changes – Add/delete entries in BFs Problem of Bloom Filters (BF) – Do not allow deleting an element Counting Bloom Filters (CBF) – Use a counter instead of a bit in the array – CBFs can handle adding/deleting elements – But, require more memory than BFs 23
Update on Routing Change Use CBF in slow memory – Assist BF to handle forwarding-table updates – Easy to add/delete a forwarding-table entry 24 CBF in slow memory BF in fast memory Delete a route
Occasionally Resize BF Under significant routing changes – # of addresses in BFs changes significantly – Re-optimize BF sizes Use CBF to assist resizing BF – Large CBF and small BF – Easy to expand BF size by contracting CBF Hard to expand to size 4 CBF BF Easy to contract CBF to size 4
BUFFALO Switch Architecture 26 – Prototype implemented in kernel-level Click
Prototype Evaluation Environment – 3.0 GHz 64-bit Intel Xeon – 2 MB L2 data cache, used as fast memory size M Forwarding table – 10 next hops – 200K entries 27
Prototype Evaluation Peak forwarding rate – 365 Kpps, 1.9 μs per packet – 10% faster than hash-based EtherSwitch Performance with FIB updates – 10.7 μs to update a route – 0.47 s to reconstruct BFs based on CBFs On another core without disrupting packet lookup Swapping in new BFs is fast 28
Conclusion BUFFALO – Improves data plane scalability in SPAF networks – To appear in CoNEXT 2009 Three properties: – Small, bounded memory requirement – Gracefully increase stretch with the growth of forwarding table – Fast reaction to routing updates 29
Ongoing work More theoretical analysis of stretch – Stretch in weighted graph – Stretch in ECMP (Equal Cost Multi Path) – Analysis on average stretch Efficient access control in SPAF network – More complicated than forwarding table – Scaling access control using OpenFlow – Leverage properties in SPAF network 30
Thanks! Questions? 31