Beyond Bloom Filters: From Approximate Membership Checks to Approximate State Machines By F. Bonomi et al. Presented by Kenny Cheng, Tonny Mak Yui Kuen.


1 Beyond Bloom Filters: From Approximate Membership Checks to Approximate State Machines By F. Bonomi et al. Presented by Kenny Cheng, Tonny Mak Yui Kuen

2 Introduction: A) Motivation B) Objectives C) Problem statements

3 A) Motivation: There is an increasing trend to keep flow state in routers. Storing a large number of flow states needs a large memory space (~100 bits per flow). If the memory per flow can be reduced, using fast on-chip memory becomes feasible, which improves performance.

4 B) Objectives: Introduce the idea of an Approximate Concurrent State Machine (ACSM), which sacrifices some accuracy for a smaller memory size. Introduce and compare several solutions to the ACSM problem. Find the approach with the best accuracy-to-memory ratio.

5 C) Problem statements: Describe 3 techniques based on Bloom filters and hashing, and evaluate them using both theoretical analysis and simulation.

6 Bloom Filter: A data structure proposed by Bloom in 1970. Designed for membership tests, i.e. to test whether an element exists in a set. Fast and compact. Chance of a false positive, i.e. an element not in the set may be wrongly reported as present. No false negatives, i.e. an element in the set is always reported as present.

7 How a Bloom Filter Works: A bit array, initially all zeros, and k hash functions. [Figure: an all-zero bit array with hash functions 1 to k]

8 How a Bloom Filter Works (Insertion): Hash the element x using the k hash functions to get k indices in the bit array. Set the bits at those indices to 1. [Figure: inserting x sets its k bits in the array]

9 How a Bloom Filter Works (Lookup): Hash the element x using the k hash functions. If all corresponding bits are 1, x is reported as being in the set. [Figure: looking up x checks its k bits in the array]
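The insertion and lookup steps above can be sketched in a few lines of Python. This is a minimal illustration, not the construction from the paper: the class name `BloomFilter`, the default sizes `m` and `k`, and the use of salted SHA-256 as the k hash functions are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter with m bits and k hash functions (illustrative sizes)."""

    def __init__(self, m=1024, k=3):
        self.m = m
        self.k = k
        self.bits = [0] * m  # a bit array with all zeros initially

    def _indices(self, item):
        # Assumption: emulate k hash functions by salting SHA-256 with i.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def insert(self, item):
        # Mark the k bits selected by the hash functions.
        for idx in self._indices(item):
            self.bits[idx] = 1

    def lookup(self, item):
        # True if all k bits are set; this may be a false positive.
        return all(self.bits[idx] for idx in self._indices(item))


bf = BloomFilter()
bf.insert("x")
print(bf.lookup("x"))
```

An inserted element is always found, since its bits are never cleared; an element that was never inserted is usually, but not always, rejected.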

10 How a Bloom Filter Works (Deletion): Sorry, no deletion. You cannot simply clear the bits of x, because you do not know whether other elements also use them. [Figure: deleting x leaves its k bits undetermined]

11 Counting Bloom Filter: Replace each bit with a counter. For insertion, increment the counters. For deletion, decrement the counters. Problems: more space, and counters can overflow. [Figure: inserting x increments its k counters in the array]
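A counting Bloom filter along these lines might look as follows. It is a sketch under stated assumptions: the name `CountingBloomFilter`, the salted SHA-256 hashing, and the `max_count` cap (which saturates a counter instead of letting it overflow) are illustrative choices, not taken from the slides.

```python
import hashlib


class CountingBloomFilter:
    """Bloom filter whose cells are small counters, so deletion is possible."""

    def __init__(self, m=1024, k=3, max_count=15):
        self.m = m
        self.k = k
        self.max_count = max_count  # assumption: 4-bit counters
        self.counters = [0] * m

    def _indices(self, item):
        # Assumption: emulate k hash functions by salting SHA-256 with i.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def insert(self, item):
        for idx in self._indices(item):
            if self.counters[idx] < self.max_count:
                self.counters[idx] += 1  # saturate rather than overflow

    def delete(self, item):
        # Only meaningful for items that were actually inserted.
        for idx in self._indices(item):
            if 0 < self.counters[idx] < self.max_count:
                self.counters[idx] -= 1  # saturated counters stay stuck

    def lookup(self, item):
        return all(self.counters[idx] > 0 for idx in self._indices(item))
```

Leaving saturated counters untouched on deletion avoids false negatives, at the cost of cells that can never be cleared again.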

12 Three Approaches to ACSM. Approaches: 1. Direct Bloom Filter 2. Stateful Bloom Filter 3. Fingerprint-compressed Filter. Operations to implement: 1. Insert(flow, state) 2. Lookup(flow), which returns the state 3. Delete(flow) 4. Update(flow, new_state)

13 Direct Bloom Filter Approach: Use a counting Bloom filter. 4 operations: Insert: insert the (flow_id, state) pair. Lookup: if the state is not provided, look up every possible state, and return "don't know" if more than one state is found. Delete: lookup, then decrement the counters. Update: delete the old pair, then insert the new one. Improvement: use timing-based deletion to handle non-terminated flows.
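The four operations of the direct approach can be sketched on top of a counter array. The class name `DirectBloomACSM`, the `DK` marker string, the sizes, and the salted SHA-256 hashing are assumptions for illustration; refusing to delete when the lookup is ambiguous is also a choice made here, not something the slide specifies.

```python
import hashlib

DK = "DK"  # "don't know"


class DirectBloomACSM:
    """ACSM sketch: a counting Bloom filter over (flow_id, state) pairs."""

    def __init__(self, states, m=4096, k=3):
        self.states = list(states)  # every state a flow can be in
        self.m = m
        self.k = k
        self.counters = [0] * m

    def _indices(self, flow_id, state):
        for i in range(self.k):
            key = f"{i}:{flow_id}:{state}".encode()
            digest = hashlib.sha256(key).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def _present(self, flow_id, state):
        return all(self.counters[j] > 0 for j in self._indices(flow_id, state))

    def insert(self, flow_id, state):
        for j in self._indices(flow_id, state):
            self.counters[j] += 1

    def lookup(self, flow_id):
        # The state is not stored, so every possible state is tried.
        found = [s for s in self.states if self._present(flow_id, s)]
        if not found:
            return None
        return found[0] if len(found) == 1 else DK

    def delete(self, flow_id):
        state = self.lookup(flow_id)
        if state is None or state == DK:
            return False  # assumption: do not delete on an ambiguous lookup
        for j in self._indices(flow_id, state):
            self.counters[j] -= 1
        return True

    def update(self, flow_id, new_state):
        # Delete the old pair, then insert the new one.
        if not self.delete(flow_id):
            return False
        self.insert(flow_id, new_state)
        return True
```

The lookup cost grows with the number of states, which is the weakness the stateful variant addresses.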

14 Timing-based Deletion: Add a timing bit to each cell. Set the bit when the cell is touched. Periodically clear the cells that were not touched, then reset all timing bits. Alternative to DBF: use a standard Bloom filter instead of a counting one, and delete elements only by timing-based deletion. [Figure: a cell array with a parallel row of timing bits]
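The alternative mentioned above, a standard Bloom filter cleaned only by timing-based deletion, could be sketched like this. The names `TimedBloomFilter` and `sweep`, and the decision that a successful lookup also counts as touching a cell, are assumptions for the example.

```python
import hashlib


class TimedBloomFilter:
    """Bloom filter whose cells expire unless touched between two sweeps."""

    def __init__(self, m=1024, k=3):
        self.m = m
        self.k = k
        self.bits = [0] * m
        self.touched = [0] * m  # one timing bit per cell

    def _indices(self, item):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def insert(self, item):
        for j in self._indices(item):
            self.bits[j] = 1
            self.touched[j] = 1

    def lookup(self, item):
        idx = list(self._indices(item))
        hit = all(self.bits[j] for j in idx)
        if hit:
            for j in idx:
                self.touched[j] = 1  # assumption: a hit keeps the flow alive
        return hit

    def sweep(self):
        # Run periodically: clear untouched cells, then reset timing bits.
        for j in range(self.m):
            if not self.touched[j]:
                self.bits[j] = 0
            self.touched[j] = 0
```

An element therefore survives one sweep after its last touch and disappears on the next one, so the sweep period bounds how long a non-terminated flow lingers.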

15 Stateful Bloom Filter Approach: The Direct Bloom Filter does not store the state of a flow, so a lookup has to try every state. Improvement: add a state value to each cell for faster lookup. Hash the flow_id only, instead of the (flow_id, state) pair. Introduce a "don't know" (DK) state for cells where a collision occurs. Keep timing-based deletion.

16 Stateful Bloom Filter Approach: Insert, modify, delete: similar to the Direct Bloom Filter, but set the cell value to DK on a collision (counter > 1). Lookup: If all cells are DK, return DK. If all cells are either state i or DK, return state i. If the cells hold more than one state other than DK, return "not found".
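The lookup rules above translate directly into code. The following sketch covers insert and lookup only; the class name `StatefulBloomFilter`, the `DK` marker, the use of `None` for an empty cell, and treating an empty cell as "not found" are assumptions made for the example.

```python
import hashlib

DK = "DK"  # "don't know"


class StatefulBloomFilter:
    """Each cell holds a counter and a state value; collisions become DK."""

    def __init__(self, m=4096, k=3):
        self.m = m
        self.k = k
        self.state = [None] * m  # None marks an empty cell (assumption)
        self.count = [0] * m

    def _indices(self, flow_id):
        # Hash the flow_id only; a set keeps a flow from colliding with itself.
        return {
            int.from_bytes(
                hashlib.sha256(f"{i}:{flow_id}".encode()).digest()[:8], "big"
            ) % self.m
            for i in range(self.k)
        }

    def insert(self, flow_id, state):
        for j in self._indices(flow_id):
            self.count[j] += 1
            # A cell shared by more than one flow (counter > 1) becomes DK.
            self.state[j] = state if self.count[j] == 1 else DK

    def lookup(self, flow_id):
        values = [self.state[j] for j in self._indices(flow_id)]
        if any(v is None for v in values):
            return None  # an empty cell: the flow was never inserted
        known = {v for v in values if v != DK}
        if not known:
            return DK  # all cells are DK
        if len(known) == 1:
            return next(iter(known))  # all cells are state i or DK
        return None  # more than one state other than DK: not found
```

With `m=1, k=1` every flow lands in the same cell, which makes the DK behaviour easy to observe: after two inserts, either flow looks up as DK.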

17 Fingerprint-compressed Filter Approach: Store a fingerprint of the flow, together with its state, in a d-left hash table. [Figure: element x hashed into sub-tables 1, 2, ..., d; each entry has a Fingerprint column and a State column, e.g. fingerprint 1110001000 with state 1]

18 Fingerprint-compressed Filter Approach: Insert: hash the element to find the corresponding bucket in each hash table, then insert the fingerprint + state into the bucket with the fewest elements (choose the left-most one to break ties). Lookup: retrieve the state stored with the fingerprint. Delete: remove the fingerprint. Update: update in place, or remove the old entry and add the new one. Use DK when a fingerprint is found in multiple buckets. Timing-based deletion can still be applied.
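A simplified version of the d-left scheme described above is sketched below. It is not the paper's construction: the name `FingerprintFilter`, the parameters `d`, `buckets` and `fp_bits`, the salted SHA-256 hashing, unbounded buckets stored as Python dicts, and using the same fingerprint in every sub-table are all simplifying assumptions.

```python
import hashlib

DK = "DK"  # "don't know"


class FingerprintFilter:
    """d sub-tables of buckets; each bucket maps fingerprint -> state."""

    def __init__(self, d=4, buckets=256, fp_bits=16):
        self.d = d
        self.buckets = buckets
        self.fp_bits = fp_bits
        self.tables = [[{} for _ in range(buckets)] for _ in range(d)]

    def _hash(self, tag, flow_id):
        digest = hashlib.sha256(f"{tag}:{flow_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    def _slots(self, flow_id):
        # The fingerprint plus one candidate bucket per sub-table, left to right.
        fp = self._hash("fp", flow_id) % (1 << self.fp_bits)
        slots = [
            self.tables[t][self._hash(t, flow_id) % self.buckets]
            for t in range(self.d)
        ]
        return fp, slots

    def insert(self, flow_id, state):
        fp, slots = self._slots(flow_id)
        # min() returns the first minimum, so ties go to the left-most bucket.
        min(slots, key=len)[fp] = state

    def lookup(self, flow_id):
        fp, slots = self._slots(flow_id)
        hits = [bucket[fp] for bucket in slots if fp in bucket]
        if not hits:
            return None
        return hits[0] if len(hits) == 1 else DK

    def update(self, flow_id, new_state):
        fp, slots = self._slots(flow_id)
        for bucket in slots:
            if fp in bucket:
                bucket[fp] = new_state  # direct update in place

    def delete(self, flow_id):
        # Simplification: drops the fingerprint from every candidate bucket.
        fp, slots = self._slots(flow_id)
        for bucket in slots:
            bucket.pop(fp, None)
```

Two different flows that share a fingerprint and a bucket are indistinguishable here, which is the source of false positives in this kind of structure.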

19 Simulation: Investigates the size/accuracy trade-off of the 3 approaches. State machine: 10 states. Legal state changes: 1 → 2 → 3 → … → 10. Run for 1 million flows. About 60,000 simultaneous flows. 100 ± 40 packets per flow. Some packets trigger a state change.
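The simulated state machine is simple enough to write down. The sketch below assumes a flow starts in state 1 and "completes" when it reaches state 10 through legal changes only; the function names `is_legal` and `completes` are hypothetical.

```python
FINAL_STATE = 10  # the state machine has 10 states


def is_legal(current, nxt):
    # The only legal change from state i is to state i + 1.
    return 1 <= current < FINAL_STATE and nxt == current + 1


def completes(transitions, start=1):
    """Return True if the sequence of new states walks 1 -> 2 -> ... -> 10."""
    state = start
    for nxt in transitions:
        if not is_legal(state, nxt):
            return False
        state = nxt
    return state == FINAL_STATE


print(completes(range(2, 11)))  # every legal step, in order
print(completes([2, 4]))        # skips state 3
```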

20 Simulation: 3 kinds of simulated flows. Interesting flows (30%): flows with legal state changes only, always complete. Noise flows (30%): flows with random (legal or illegal) state changes, never complete. Random flows (40%): flows without state changes.

21 Simulation: False positive rate: % of completed flows that are not interesting. False negative rate: % of interesting flows that do not complete.
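The two rates defined above can be computed from per-flow outcomes as follows. The input format, a list of `(is_interesting, completed)` pairs, and the function name `error_rates` are assumptions for the example; the rates are returned as fractions rather than percentages.

```python
def error_rates(flows):
    """flows: iterable of (is_interesting, completed) boolean pairs."""
    flows = list(flows)
    completed = [interesting for interesting, done in flows if done]
    interesting = [done for is_int, done in flows if is_int]

    # False positive rate: share of completed flows that are not interesting.
    fp = completed.count(False) / len(completed) if completed else 0.0
    # False negative rate: share of interesting flows that did not complete.
    fn = interesting.count(False) / len(interesting) if interesting else 0.0
    return fp, fn


sample = [(True, True), (True, True), (True, False), (False, True)]
print(error_rates(sample))
```

In the sample, one of the three completed flows is not interesting and one of the three interesting flows does not complete, so both rates are one third.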

22 Applications: Application-level QoS: Video congestion control. Peer-to-Peer (P2P) traffic identification.

23 Video congestion control: Applies to MPEG video streaming. 3 kinds of frames in MPEG video: I frame: scene information. P frame: differential information. B frame: least important information. Up to 30% of B frames can be dropped with acceptable quality. Need to keep track of the current frame.

24 Video congestion control: Use the FCF ACSM to keep track of state. Experimentally, the highest acceptable false positive rate is 0.37%. This requires a memory size of 27 bits per flow (about ¼ of the original 100 bits).

25 P2P Traffic Identification: Limit P2P flows to improve quality for other applications. One possible way to identify a P2P flow: concurrent TCP and UDP flows. Use ACSM for real-time P2P identification.

26 Conclusion: ACSM is feasible. The FCF approach is the best of the three approaches. Two potential applications of ACSM are introduced. ACSM may benefit QoS applications, which are fault-tolerant.

27 Comments: The authors focus on accuracy and memory size, but not on real performance. The FCF approach may not perform well in hardware.

28 - End - Question & Answer

