Othello Hashing and Its Applications for Network Processing


1 Othello Hashing and Its Applications for Network Processing
Chen Qian, Department of Computer Science and Engineering, UCSC. Publications in ICNP’17, SIGMETRICS’17, MECOMM’17, IEEE/ACM Transactions on Networking, Genome Biology, and Bioinformatics.

2 Network-layer functions
Forwarding: move packets from the router’s input to the appropriate router output.
Routing: determine the route taken by packets from source to destination (routing algorithms).
Analogy: routing is the process of planning a trip from source to destination; forwarding is the process of getting through a single interchange.

3 Othello Hashing Essentially a key-value lookup structure
Keys can be any names, addresses, identifiers, etc. Values should not be too long: at most 64 bits.
For example:
Layer-3 routing: key is a network address; value is the link on which to forward a packet.
Layer-4 load balancer: key is a virtual IP; value is a direct IP.

4 Why Othello is special
Minimal query time: only two memory read operations (cache lines) per query.
Minimal memory cost: 10%-30% of existing hash tables (e.g., Cuckoo).
Supports dynamic updates: can be updated over a million times per second.
Theoretical basis: Minimal Perfect Hashing.

5 Idea of dynamic Othello lookups
[Diagram labels: Controller Program; Construct; Update (via existing API of programmable networks); K-V Lookup; Optimize memory and query cost.]

6 How Othello works
Basic version: classifies keys into two sets $X$ and $Y$. Equivalent to key lookups for a 1-bit value. Query result: $\tau(k) = 0 \Leftrightarrow k \in X$; $\tau(k) = 1 \Leftrightarrow k \in Y$.
Advanced version: classifies keys into $2^l$ sets. Equivalent to key lookups for an $l$-bit value.

7 Othello Query Structure
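$n$ is the number of keys. Two bitmaps $a$, $b$ with size $m$ ($m$ in $(1.33n, 2n)$). A query for key $k$ reads one position in each bitmap, $a[h_a(k)]$ and $b[h_b(k)]$, and returns $\tau(k) = a[h_a(k)] \oplus b[h_b(k)]$. Example: for the key ■, $\tau(■) = 0 \oplus 1 = 1$, so ■ is in set $Y$. Query is easy. Then how to construct it?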
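The query rule above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `othello_query` and the toy hash tables are hypothetical, and the bitmap contents are hand-picked so that the example key reproduces the slide's $0 \oplus 1 = 1$.

```python
from typing import Callable, Hashable, Sequence


def othello_query(key: Hashable,
                  a: Sequence[int], b: Sequence[int],
                  h_a: Callable[[Hashable], int],
                  h_b: Callable[[Hashable], int]) -> int:
    """Return tau(key) = a[h_a(key)] XOR b[h_b(key)]: two reads, one XOR."""
    return a[h_a(key)] ^ b[h_b(key)]


# Toy setup (assumption): 8-slot bitmaps and table-driven "hash functions".
a = [0] * 8
b = [0] * 8
b[5] = 1                      # hand-picked so the example key lands in set Y
toy_h_a = {"example-key": 6}  # hypothetical h_a value for the key
toy_h_b = {"example-key": 5}  # hypothetical h_b value for the key

tau = othello_query("example-key", a, b,
                    toy_h_a.__getitem__, toy_h_b.__getitem__)
print("set Y" if tau == 1 else "set X")
```

The query never touches the key set itself, which is why the structure can stay small: only the two bitmaps and the two hash functions are needed at lookup time.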

8 Othello Control Structure: Construct
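$G$: acyclic bipartite graph. [Figure: nodes $u_0$ to $u_7$ correspond to positions of bitmap $a$ (indexed by $h_a$) and nodes $v_0$ to $v_7$ to positions of bitmap $b$ (indexed by $h_b$). Each key $k$ is an edge between $u_{h_a(k)}$ and $v_{h_b(k)}$; the key table shows an entry with $h_a(k) = 6$, $h_b(k) = 5$.]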

9 Othello Construct
If a cycle is found, use another pair $\langle h_a, h_b \rangle$ until an acyclic graph is built. For $n$ names, the time to find $G$ is proved to be $O(n)$. [Figure: key table with columns $k$, $h_a(k)$, $h_b(k)$ and the corresponding edges between nodes $u_0$ to $u_7$ and $v_0$ to $v_7$.]
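The retry loop described on this slide can be sketched as follows. This is an illustration under stated assumptions: `seeded_hash` (SHA-256 with a seed prefix) stands in for whatever hash family the authors use, cycle detection is done with a union-find structure, and the names `DisjointSet` and `find_acyclic_pair` are hypothetical.

```python
import hashlib


def seeded_hash(key: str, seed: int, m: int) -> int:
    """Illustrative seeded hash into [0, m); an assumption, not the authors' choice."""
    data = f"{seed}:{key}".encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big") % m


class DisjointSet:
    """Union-find over node ids; union() returns False if it would close a cycle."""

    def __init__(self, size: int) -> None:
        self.parent = list(range(size))

    def find(self, x: int) -> int:
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, x: int, y: int) -> bool:
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        self.parent[rx] = ry
        return True


def find_acyclic_pair(keys, m_a: int, m_b: int, max_tries: int = 1000):
    """Try seed pairs until every key's edge (u, v) can be added without a cycle."""
    for attempt in range(max_tries):
        seed_a, seed_b = 2 * attempt, 2 * attempt + 1
        forest = DisjointSet(m_a + m_b)  # u nodes: 0..m_a-1, v nodes: m_a..m_a+m_b-1
        if all(forest.union(seeded_hash(k, seed_a, m_a),
                            m_a + seeded_hash(k, seed_b, m_b)) for k in keys):
            return seed_a, seed_b
    raise RuntimeError("no acyclic graph found; try larger bitmaps")


names = [f"name{i}" for i in range(20)]
seed_a, seed_b = find_acyclic_pair(names, m_a=32, m_b=32)
```

Each attempt is a single pass over the keys, so the cost per attempt is close to linear in $n$; the slide's $O(n)$ claim concerns the total construction time.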

10 Compute Bitmap
[Figure: the key table gains a "set" column ($k$, $h_a(k)$, $h_b(k)$, set), with the first key marked as belonging to set $Y$; bitmaps $a$ and $b$ are drawn over nodes $u_0$ to $u_7$ and $v_0$ to $v_7$ and are being filled in.]

11 Compute Bitmap
If $G$ is acyclic, it is easy to find a coloring plan. [Figure: the key table with columns $k$, $h_a(k)$, $h_b(k)$, set, with keys marked $Y$ or $X$; bitmaps $a$ and $b$ are filled in over nodes $u_0$ to $u_7$ and $v_0$ to $v_7$.]
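One way to find such a coloring plan is a breadth-first traversal: fix one node of each tree to 0, then set every neighbor so that the XOR along the connecting edge equals the key's value. The sketch below assumes the graph is already acyclic; the function name `compute_bitmaps` and the example edges are hypothetical, not taken from the slide's table.

```python
from collections import defaultdict, deque


def compute_bitmaps(edges, m_a: int, m_b: int):
    """edges: (i, j, t) triples meaning a[i] XOR b[j] must equal t.

    Assumes the bipartite graph formed by the edges is acyclic, so every
    node is assigned exactly once and no constraint can conflict.
    """
    adjacency = defaultdict(list)
    for i, j, t in edges:
        adjacency[("a", i)].append((("b", j), t))
        adjacency[("b", j)].append((("a", i), t))

    value = {}
    for start in list(adjacency):
        if start in value:
            continue
        value[start] = 0                      # free choice for each tree root
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor, t in adjacency[node]:
                if neighbor not in value:
                    value[neighbor] = value[node] ^ t
                    queue.append(neighbor)

    a = [value.get(("a", i), 0) for i in range(m_a)]
    b = [value.get(("b", j), 0) for j in range(m_b)]
    return a, b


# Hypothetical keys: (h_a(k), h_b(k), set) with Y encoded as 1 and X as 0.
example_edges = [(6, 5, 1), (1, 5, 0), (1, 2, 1), (3, 4, 0)]
a, b = compute_bitmaps(example_edges, m_a=8, m_b=8)
```

Because only XOR is used, the same routine works unchanged when `t` is an $l$-bit value rather than a single bit.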

12 L-Othello functionality
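Classifies names into $2^l$ sets: $Z_0, Z_1, \cdots, Z_{2^l - 1}$. $l$ Othellos can classify names into $2^l$ sets. $l < 8$ for network devices. [Figure: two Othellos, one separating $X_1$ from $Y_1$ and one separating $X_2$ from $Y_2$, together partition the names into $Z_0$, $Z_1$, $Z_2$, $Z_3$.]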

13 Example
Classify keys into more than 2 sets $Z_0, Z_1, \cdots, Z_7$? Use an orthogonal separation of sets:
$X_3 = Z_0 \cup Z_1 \cup Z_2 \cup Z_3$; $Y_3 = Z_4 \cup Z_5 \cup Z_6 \cup Z_7$.
$X_2 = Z_0 \cup Z_1 \cup Z_4 \cup Z_5$; $Y_2 = Z_2 \cup Z_3 \cup Z_6 \cup Z_7$.
$X_1 = Z_0 \cup Z_2 \cup Z_4 \cup Z_6$; $Y_1 = Z_1 \cup Z_3 \cup Z_5 \cup Z_7$.
Since $6 = (110)_2$: $k \in Y_3 \cap Y_2 \cap X_1 \Rightarrow k \in Z_6$.
$l$ Othellos classify keys into $2^l$ sets.
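The separation above is the binary expansion of the set index: Othello $i$ answers bit $i-1$ of the index, with $X$ read as 0 and $Y$ as 1. A small sketch (function names are hypothetical) that reproduces the $Z_6$ example:

```python
def memberships(z_index: int, l: int) -> dict:
    """For a key in Z_{z_index}, say which side (X_i or Y_i) each Othello i reports."""
    return {i: ("Y" if (z_index >> (i - 1)) & 1 else "X") for i in range(1, l + 1)}


def z_from_bits(bits) -> int:
    """bits[i-1] is the 1-bit answer of Othello i (X = 0, Y = 1)."""
    return sum(bit << position for position, bit in enumerate(bits))


sides = memberships(6, l=3)        # 6 = (110)_2
index = z_from_bits([0, 1, 1])     # answers of Othello 1, 2, 3
print(sides, index)
```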

14 Different coloring plan and bitmaps
Othello 1 and Othello 2 have the same $G$, $h_a$, $h_b$ and the same $X \cup Y$, but different coloring plans and bitmaps ($a_1$, $b_1$ for Othello 1; $a_2$, $b_2$ for Othello 2). Do we need $2l$ memory reads to query $l$ Othellos? [Figure: the same graph over nodes $u_0$ to $u_7$ and $v_0$ to $v_7$ drawn twice, once per Othello.]

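15 CPUs can read l bits at one time
The bitmaps $a_1, a_2, \ldots$ of the $l$ Othellos are stored together as one array $A$ of $l$-bit entries ($A[0]$, $A[1]$, ...), indexed by $h_a$; the bitmaps $b_1, b_2, \ldots$ are stored as one array $B$, indexed by $h_b$. CPUs can read $l$ bits at one time, so a query still needs only two memory reads. Example with Othello 1 and Othello 2: $\tau(k) = 01 \oplus 10 = (11)_2$, so $k$ is in set $Z_3$.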
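A short sketch of the packing idea, assuming bit $i-1$ of each word holds the bit of Othello $i$ (the helper `pack` and the 2-slot bitmaps are hypothetical). It checks that one XOR of two $l$-bit words gives the same answer as querying each 1-bit Othello separately, using the slide's $01 \oplus 10 = 11$ example:

```python
def pack(bitmaps):
    """Pack per-Othello bitmaps into one array of l-bit words.

    bitmaps[i] is the 1-bit bitmap of Othello i+1; bit i of each word holds its bit.
    """
    length = len(bitmaps[0])
    return [sum(bitmaps[i][pos] << i for i in range(len(bitmaps)))
            for pos in range(length)]


# Hypothetical tiny bitmaps; position 0 is where the example key hashes.
a1, a2 = [1, 0], [0, 1]   # Othello 1 and Othello 2 bitmaps on the a side
b1, b2 = [0, 1], [1, 0]   # Othello 1 and Othello 2 bitmaps on the b side
A, B = pack([a1, a2]), pack([b1, b2])

i = j = 0                                     # assumed h_a(k) and h_b(k)
word = A[i] ^ B[j]                            # two reads, one XOR
per_othello = [a1[i] ^ b1[j], a2[i] ^ b2[j]]  # 2l reads, l XORs
print(f"A[i]={A[i]:02b} B[j]={B[j]:02b} tau={word:02b}")
```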

16 Drawbacks - Alien keys
What is $\tau(k) = a[h_a(k)] \oplus b[h_b(k)]$ when $k$ is not in $S$? An arbitrary value: with $i = h_a(k)$ and $j = h_b(k)$, $\tau(k)$ returns 1 when $a[i] = 1$ and $b[j] = 0$, or when $a[i] = 0$ and $b[j] = 1$.

17 Drawbacks - Low load factor
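The expected ratio of empty slots in $A$ and $B$:
$$\epsilon_a = \left(\frac{m_a - 1}{m_a}\right)^n \approx e^{-n/m_a} = e^{-1} \approx 0.368$$
$$\epsilon_b = \left(\frac{m_b - 1}{m_b}\right)^n \approx e^{-n/m_b} = e^{-3/4} \approx 0.472$$
The overall number of empty slots and the empty ratio:
$$n_e = \epsilon_a m_a + \epsilon_b m_b \approx 0.368n + 0.472 \times 1.33n \approx n, \qquad r_e \approx 1$$
For $n$ keys, $n$ slots are totally empty, and $0.33n$ occupied slots are redundant.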
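The arithmetic on this slide can be checked numerically. The sketch below assumes $m_a = n$ and $m_b = 1.33n$, which is what the exponents $e^{-1}$ and $e^{-3/4}$ imply; the value of $n$ is an arbitrary choice for illustration.

```python
import math


def empty_ratio(m: int, n: int) -> float:
    """Probability that a given slot of an m-slot array is hit by none of n keys."""
    return ((m - 1) / m) ** n


n = 1_000_000
m_a, m_b = n, int(1.33 * n)          # sizes implied by the slide's exponents

eps_a = empty_ratio(m_a, n)          # close to e^{-1}
eps_b = empty_ratio(m_b, n)          # close to e^{-1/1.33}, roughly e^{-3/4}
n_e = eps_a * m_a + eps_b * m_b      # expected number of empty slots

print(f"eps_a={eps_a:.3f} eps_b={eps_b:.3f} n_e/n={n_e / n:.3f}")
```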

18 Applications of Othello
1. Forwarding Information Base (FIB)
2. Software load balancer
3. Data placement and lookup
4. Private queries
5. Genomic sequencing search
And more…

19 A Concise FIB
Resolving FIB explosion is crucial: for layer-two interconnected data centers, and for OpenFlow-like fine-grained flow control.
Concise, using l-Othello, is a portable solution: in hardware devices or in software switches.
Publications: "A Fast, Small, and Dynamic Forwarding Information Base", in ACM SIGMETRICS 2017. "A Concise Forwarding Information Base for Scalable and Fast Name Switching", in IEEE ICNP 2017.

20 Network-wide updating
If all devices share the same set of network names/addresses, as in layer-two Ethernet-based data centers, all Othellos share the same $G$. Hence network-wide updating is very efficient. Update consistency is also provided.

21 Implementation of three prototypes
1. Memory mode: query and control structures running on different threads.
2. Click modular router
3. Intel Data Plane Development Kit (DPDK)

22 Comparison
Buffalo: Yu, Fabrikant, Rexford, in CoNEXT’09.
Cuckoo hashing: Zhou, Fan, Lim, Kaminsky, Andersen, in CoNEXT’13 and SIGCOMM’15.

23 Comparison: Memory size
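FIB examples and memory size. Original table columns: Name type, # Names, # Actions, then memory size for Concise, Cuckoo, Buffalo. Rows follow the transcript order; cells missing from the transcript are left out, and values are left unlabeled where their column cannot be determined.
MAC (48 bits):
  $7 \times 10^5$ names, 16 actions: Concise 1M, Cuckoo 5.62M, Buffalo 2.64M
  $5 \times 10^6$ names, 256 actions: Concise 16M, Cuckoo 40.15M, Buffalo 27.70M
  $3 \times 10^7$ names: Concise 128M, Cuckoo 321.23M, Buffalo 166.23M
IPv4 (32 bits):
  $1 \times 10^6$ names: Concise 2M, Cuckoo 4.27M, Buffalo 3.77M
IPv6 (128 bits):
  $2 \times 10^6$ names: Concise 8M, Cuckoo 34.13M, Buffalo 11.08M
OpenFlow (356 bits):
  $3 \times 10^5$ names: 14.46M, 1.67M
  $1.4 \times 10^6$ names, 65536 actions: 67.46M, 18.21M
File name (varied):
  359194 names: Concise 512K, Cuckoo 19.32M, Buffalo 1.35M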

24 Query speed
2x to 4x speed advantage

25 Update speed

26 For unknown network names
1. For data centers with mostly internal traffic: such a situation is rare.
2. For networks with much incoming traffic: a filter can be installed at a firewall.
3. Concise may include an r-bit checksum. A lookup still requires 2 memory accesses in total, as long as l + r <= 64.
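Item 3 can be illustrated by packing the $l$-bit value and an $r$-bit checksum into one stored word. This is only a sketch of the idea: the checksum function (truncated SHA-256), the bit layout, and the names `encode` and `decode` are assumptions, not the layout used by Concise.

```python
import hashlib
from typing import Optional


def checksum(key: str, r: int) -> int:
    """Illustrative r-bit checksum of the key (assumption: truncated SHA-256)."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") & ((1 << r) - 1)


def encode(value: int, key: str, l: int, r: int) -> int:
    """Word stored for a known key: l-bit value in the high bits, checksum below."""
    assert l + r <= 64, "value and checksum must fit in one 64-bit read"
    assert 0 <= value < (1 << l)
    return (value << r) | checksum(key, r)


def decode(word: int, key: str, r: int) -> Optional[int]:
    """Return the value if the checksum matches the queried key, else None."""
    if word & ((1 << r) - 1) != checksum(key, r):
        return None          # most unknown names are rejected here
    return word >> r


mac = "00:1b:44:11:3a:b7"
stored = encode(5, mac, l=4, r=8)
known = decode(stored, mac, r=8)          # checksum matches
rejected = decode(stored ^ 1, mac, r=8)   # word with a non-matching checksum
```

An unknown name can still pass by chance with probability about $2^{-r}$, so $r$ trades memory for a lower false-acceptance rate.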

27 Concury: Consistent, Fast, and Scalable Layer-4 Load Balancing
Shouqian Shi (University of California Santa Cruz), Ye Yu (Google), Xin Li (University of California Santa Cruz), Ying Zhang (Facebook), Xiaozhou Li (Barefoot Networks), Chen Qian (University of California Santa Cruz)

28 Data center
Web service providers like Facebook use far more than one server to respond to user requests directed to a single public IP.

29 Layer-4 load balancing
Layer-4 load balancing is a critical function: it handles both inbound and inter-service traffic. >40%* of cloud traffic needs load balancing (Ananta [SIGCOMM’13]).
[Figure labels: facebook.com with public IP VIP1 (virtual IP); static.fsnc1-1.fna.fbcdn.net; VIP2; an L4 Load Balancer in front of servers DIP1, DIP2, DIP3, DIP4, DIP5 (direct IPs); a mapping from VIP1 to DIP4.]

30 Key issues
Huge throughput requirement
Fast distribution algorithm
PCC (per-connection consistency) under DIP (direct IP) changes
Fast reaction to server failure and overload
Huge number of concurrent connections
Memory efficiency to remember all live connections
Fast query algorithm

31 Design space of data center load balancers
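[Figure: design space plotted by capacity (small to large) against flexibility (bad to good). Labeled regions: hardware ASIC, software hash table, hybrid solutions, and an open region marked "?". Slide note: "Maybe L4LB is not suitable to be shown here."]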

32 Why static hashing won’t work under DIP change
L4LB algorithm: static hashing $H()$. Connection C1 has $H(C1) = 0.3$.
With four servers, the ranges are DIP1: [0, 1/4), DIP2: [1/4, 1/2), DIP3: [1/2, 3/4), DIP4: [3/4, 1), so C1 goes to DIP2.
When the DIP4 server goes down, the ranges become DIP1: [0, 1/3), DIP2: [1/3, 2/3), DIP3: [2/3, 1), so the following packets of C1 go to DIP1.
Large number of PCC violations!
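The example on this slide can be reproduced with a few lines. The function name `static_hash_pick` is hypothetical; it splits $[0, 1)$ into equal ranges, one per live DIP, which is the static scheme the slide describes.

```python
def static_hash_pick(h_value: float, dips: list) -> str:
    """Split [0, 1) into len(dips) equal ranges and return the DIP owning h_value."""
    return dips[int(h_value * len(dips))]


h_c1 = 0.3  # H(C1) from the slide

before = static_hash_pick(h_c1, ["DIP1", "DIP2", "DIP3", "DIP4"])
after = static_hash_pick(h_c1, ["DIP1", "DIP2", "DIP3"])  # DIP4 is down

print(before, after, "PCC violated" if before != after else "PCC kept")
```

C1 was never served by the failed server, yet its packets move anyway, which is why a pool change under static hashing disturbs many unrelated connections.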

33 Workflow of hash table based software solution
Look up the 5-tuple of the packet in the Connection Table (a hash table for the connection-to-DIP mapping).
Hit: found a consistent DIP.
Miss: consult the VIP-to-DIP Table, using consistent hashing to find a DIP from the VIP’s DIP pool (found a DIP to handle this packet), then install an entry in the Connection Table.
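A minimal sketch of this hit/miss workflow follows. The class and method names are hypothetical, a Python dict stands in for the connection table, and rendezvous (highest-random-weight) hashing is swapped in as a simple form of consistent hashing; the slide does not say which consistent hashing scheme is used.

```python
import hashlib


def score(connection, dip: str) -> int:
    """Rendezvous-hashing weight of a (connection, DIP) pair (illustrative)."""
    data = f"{connection}|{dip}".encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


class SoftwareLoadBalancer:
    def __init__(self, vip_to_dips: dict) -> None:
        self.vip_to_dips = vip_to_dips   # VIP-to-DIP table
        self.connection_table = {}       # connection-to-DIP mapping

    def forward(self, five_tuple, vip: str) -> str:
        dip = self.connection_table.get(five_tuple)
        if dip is None:                                      # miss
            dip = max(self.vip_to_dips[vip],
                      key=lambda d: score(five_tuple, d))    # pick from DIP pool
            self.connection_table[five_tuple] = dip          # install an entry
        return dip                                           # hit: consistent DIP


lb = SoftwareLoadBalancer({"VIP1": ["DIP1", "DIP2", "DIP3", "DIP4"]})
conn = ("10.0.0.1", 12345, "VIP1", 80, "TCP")
first = lb.forward(conn, "VIP1")
lb.vip_to_dips["VIP1"].append("DIP5")    # DIP pool changes
second = lb.forward(conn, "VIP1")        # still served from the connection table
```

The connection table is what preserves per-connection consistency across pool changes, at the cost of one stored entry per live connection.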

34 Why current hash table solutions suffer from small capacity
Look up the 5-tuple of the packet in the Connection Table (a hash table for the connection-to-DIP mapping), at up to XX packets / min: both high IO and high CPU overhead.
The Connection Table holds up to 11M live connections: high memory cost and a huge number of memory loads.
Hit: use the stored DIP. Miss: consult the VIP-to-DIP Table, using consistent hashing to find a DIP from the VIP’s DIP pool, then install an entry.
The query structure is updated on a per-connection basis (up to 60M/min): a large amount of synchronization.

35 Concury: Layer 4 Load Balancer
Existing solutions face a dilemma: storing all connections incurs high memory cost, while using digests causes false hits due to digest collisions. Concury represents all concurrent connections with only two arrays and introduces neither false hits nor consistency violations. Othello Hashing serves as the key component that tracks the connections and supports fast queries. Concury can be implemented on either commodity servers or programmable ASICs.

36 Othello alien query randomizer
For an alien key, for which no value is given: ■ is randomly but consistently assigned a value. Example (hexadecimal): the entries of $a$ are ??, 1C, DA, 8E, 5E and the entries of $b$ are 05, B9, 17, 21; $\tau(■) = 5E \oplus 17 = 49$.
Speaker notes: To compute the $\tau(k)$ value, the Othello data structure maintains two bitmaps, $a$ and $b$; the total size of the two bitmaps is smaller than $4n$, where $n$ is the number of keys. Othello also maintains two hash functions. For a query, it computes two hash values, $h_a(k)$ and $h_b(k)$, and uses them as indexes into the two bitmaps. It reads the two values stored at the corresponding positions and computes their exclusive-or as the $\tau$ value. For example, here it gets a name denoted by the purple square and computes the $\tau$ value to be 1, so we can determine that the purple square is in set $Y$. So a query on Othello is easy; then how can we construct it?

37

38 Concury-DP Workflow
Input: the VIP index $i$ of the packet and the 5-tuple of the packet.
VIP Array: stores the mapping from VIP index to Othello address (the memory address of the Othello of VIP $v_i$). Updates come from the control plane and are infrequent.
$n$ Othellos (Othello-1, Othello-2, ..., Othello-$i$, ...), indexed by the VIP Array: the selected Othello takes the 5-tuple of the packet and returns an $l$-bit Dcode.
DIP Array: a 2D array DA with DA[i][Dcode] = DIP.
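The data-plane steps above can be sketched as three array reads. Everything named here is hypothetical: `ToyOthello` fills its two arrays pseudo-randomly, which only shows that any 5-tuple gets an arbitrary but consistent Dcode; in Concury the control plane would maintain these arrays, and that part is not shown.

```python
import hashlib
import random


def seeded_hash(key, seed: int, m: int) -> int:
    """Illustrative seeded hash into [0, m) (assumption, not the authors' hash)."""
    data = f"{seed}:{key}".encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big") % m


class ToyOthello:
    """Stand-in for one per-VIP Othello: two arrays of l-bit words."""

    def __init__(self, m: int, l: int, seed: int) -> None:
        rng = random.Random(seed)
        self.m, self.seed = m, seed
        self.A = [rng.getrandbits(l) for _ in range(m)]
        self.B = [rng.getrandbits(l) for _ in range(m)]

    def query(self, key) -> int:
        i = seeded_hash(key, 2 * self.seed, self.m)
        j = seeded_hash(key, 2 * self.seed + 1, self.m)
        return self.A[i] ^ self.B[j]          # l-bit Dcode


L_BITS = 2
vip_array = [ToyOthello(m=16, l=L_BITS, seed=s) for s in range(2)]  # one per VIP
DA = [["DIP1", "DIP2", "DIP3", "DIP1"],       # DA[0][Dcode] for VIP index 0
      ["DIP4", "DIP5", "DIP4", "DIP5"]]       # DA[1][Dcode] for VIP index 1


def lookup(vip_index: int, five_tuple) -> str:
    othello = vip_array[vip_index]            # VIP Array: index -> Othello
    dcode = othello.query(five_tuple)         # Othello: 5-tuple -> Dcode
    return DA[vip_index][dcode]               # DIP Array: DA[i][Dcode] = DIP


packet = ("10.0.0.1", 12345, "198.51.100.7", 443, "TCP")
dip = lookup(0, packet)
```

Repeating a DIP in a row of DA, as in the example rows, is one way such a table could express unequal weights; that detail is an illustration, not something stated on the slide.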

39 Experiment Configurations
Pure C++ algorithm implementations.
Compare Concury with: hash table with digest (Maglev); multi hash table with digest (SilkRoad).
Metrics: throughput under different numbers of connections; control plane update speed; data plane memory footprint; data plane update response time; average load balance measure under DIP change.

40 Main advantages of Concury-DP
Compared to:
Maglev [Google]: ok speed, may include false hits.
SilkRoad [Barefoot]: ok speed, may include false hits.
Concury: high speed, no false hits, additional VIP isolation.

41 Throughput

42

43 Memory footprint

44 Control plane connection tracking

45 Dynamic weighted load balancing

46 Design space of data center load balancers
[Figure: design space plotted by capacity (small to large) against flexibility (bad to good). Labeled regions: hardware ASIC, software hash table, hybrid solutions, and Concury. Slide note: "Maybe L4LB is not suitable to be shown here."]

