By: Ran Ben Basat, Technion, Israel

1 By: Ran Ben Basat, Technion, Israel
Pay for a Sliding Bloom Filter and Get Counting, Distinct Elements, and Entropy for Free. By: Ran Ben Basat, Technion, Israel. Joint work with Eran Assaf, Gil Einziger, and Roy Friedman. IEEE INFOCOM 2018. 4/15/2019

2 Computing network statistics. Monitoring a large number of flows.
Motivation. Computing network statistics: load balancing, fairness, anomaly detection. Monitoring a large number of flows. Allowing real-time queries.

3 Did x appear in the window?
Sliding Bloom Filter. Did $x$ appear in the window $W$? Recent data is often the most important! No false negatives: $\Pr[\text{yes} \mid x \in W] = 1$. Few false positives: $\Pr[\text{yes} \mid x \notin W] \le \epsilon$. Traditionally, the filter must fit in SRAM.
SRAM capacity by year (SilkRoad, SIGCOMM 2017): 2012: 10-20 MB; 2014: 30-60 MB; 2016: 50-100 MB.

4 Lower Bounds for Sliding Bloom Filters
Any sliding Bloom filter must use at least $\mathfrak{B} = W \log(W/\epsilon)$ bits (Naor and Yogev, ISAAC 2013). For convenience, we assume that $\epsilon = W^{-o(1)}$; equivalently, $\log(W/\epsilon) = \log W \,(1+o(1))$. An algorithm is called succinct if it uses $\mathfrak{B}(1+o(1))$ bits of space.

5 Sliding Window Bloom Filter (Liu et al., INFOCOM 2013)
Use a Cuckoo Hash Table whose entries hold a fingerprint (FP) and a timestamp, split across Table 1 and Table 2. Thm: if the load factor is $\le 0.5$, then with high probability all operations take constant time. Space: $2W \log W \,(1+o(1)) = \mathfrak{B}(2+o(1))$ bits. Example query: has the item appeared in the last 3 packets?

6 Per-flow frequency estimation
How many times does $x$ appear in the window? A generalization of a Sliding Bloom Filter. $W\epsilon$-additive approximation using $O(\epsilon^{-1} \log W)$ bits and constant-time operations (Ben Basat et al., INFOCOM 2016).

7 Sliding Window Approximate Measurement Protocol (SWAMP)
Current Item Pointer (curr) Cyclic Fingerprint Buffer (CFB)

8 Multiset representations
Consider representing a multiset of $m$ items from an $n$-sized universe, supporting the operations replace(old, new) and multiplicity(item). There exist succinct representations (using $\mathfrak{B}(m,n)(1+o(1))$ bits) with $O(1)$-time operations (Einziger and Friedman, ICDCN 2016), (Pandey et al., SIGMOD 2017).

9 Sliding Window Approximate Measurement Protocol (SWAMP)
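Current Item Pointer (curr); Cyclic Fingerprint Buffer (CFB); Aggregates Table (Fingerprint, Frequency). On each arrival, replace(old, new) overwrites the oldest fingerprint in the CFB, and the Aggregates Table decrements the evicted fingerprint's frequency (-1) and increments the new fingerprint's frequency (+1).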
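
The update path on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes each arrival overwrites the CFB slot at curr, uses a plain `Counter` in place of the succinct multiset representation from the previous slide, and derives fingerprints by truncating SHA-256. The names `Swamp`, `update`, `frequency`, `contains`, and `fp_bits` are hypothetical.

```python
import hashlib
from collections import Counter


class Swamp:
    """Illustrative SWAMP-style structure: cyclic fingerprint buffer + aggregates table."""

    def __init__(self, window: int, fp_bits: int):
        self.window = window
        self.fp_bits = fp_bits
        self.cfb = [None] * window   # Cyclic Fingerprint Buffer (CFB)
        self.curr = 0                # Current Item Pointer (curr)
        self.table = Counter()       # Aggregates Table: fingerprint -> frequency

    def _fingerprint(self, item: str) -> int:
        # Assumption: truncate SHA-256 to fp_bits bits.
        digest = hashlib.sha256(item.encode()).digest()
        return int.from_bytes(digest[:8], "big") % (1 << self.fp_bits)

    def update(self, item: str) -> None:
        new_fp = self._fingerprint(item)
        old_fp = self.cfb[self.curr]
        if old_fp is not None:
            # (-1) for the fingerprint leaving the window
            self.table[old_fp] -= 1
            if self.table[old_fp] == 0:
                del self.table[old_fp]
        # (+1) for the fingerprint entering the window
        self.cfb[self.curr] = new_fp
        self.table[new_fp] += 1
        self.curr = (self.curr + 1) % self.window

    def frequency(self, item: str) -> int:
        # Counter returns 0 for a missing key without inserting it.
        return self.table[self._fingerprint(item)]

    def contains(self, item: str) -> bool:
        return self.frequency(item) > 0

    def distinct_fingerprints(self) -> int:
        return len(self.table)
```

In this sketch `frequency` can only overestimate, because two different items may share a fingerprint, and `contains` therefore never returns a false negative for an item that is still in the window.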

10 The results
Algorithm | Space | Update Time | Counts
TBF | $O(W \log W \log \epsilon^{-1})$ | $O(\log \epsilon^{-1})$ | No
SWBF | $(2+o(1))\, W \log_2 W$ | $O(1)$ | No
SWAMP | $(1+o(1))\, W \log_2 W$ | $O(1)$ | Yes

11 Is SWAMP a good counting algorithm?
We compared SWAMP to the state-of-the-art WCSS algorithm (Ben Basat et al., INFOCOM 2016).

12 Counting distinct elements over sliding windows
How many distinct flows appear in the window? $(1+\epsilon)$-multiplicative approximation using $O(\epsilon^{-2} \log W \log\log W)$ bits and constant update time (Fusy and Giroire, ANALCO 2007), (Chabchoub and Hebrail, ICDM 2010).

13 Counting distinct elements with SWAMP
Current Item Pointer (curr); Cyclic Fingerprint Buffer (CFB); Aggregates Table (Fingerprint, Frequency). Distinct fingerprints: $Z = 6$ in the example, decremented (-1) when a fingerprint's frequency drops to zero. Maintaining $Z$ requires just $\log W$ bits!

14 Counting distinct elements with SWAMP
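Guarantees: $\Pr[D \ge Z] = 1$ and $\Pr[D - Z \ge \epsilon D \log \delta^{-1}] \le \delta$ (never overestimate, likely to not underestimate by much). (Approximate) Maximum Likelihood Estimate: return $\frac{\ln(1 - Z/2^{L})}{\ln(1 - 2^{-L})}$, where $D$ is the true number of distinct elements, $Z$ is the number of distinct fingerprints, and $L$ is the fingerprint length in bits.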
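
A small numeric sketch of the estimator may help. It reads the formula as ln(1 - Z/2^L) / ln(1 - 2^-L), with L taken to be the fingerprint length in bits; that reading is an assumption, since the slide does not define L. The function name `estimate_distinct` is hypothetical.

```python
import math


def estimate_distinct(z: int, fp_bits: int) -> float:
    """Approximate maximum-likelihood estimate of the number of distinct items.

    z        -- number of distinct fingerprints observed (Z on the slide)
    fp_bits  -- fingerprint length in bits (assumed meaning of L)
    """
    space = 2 ** fp_bits  # number of possible fingerprints
    if not 0 <= z < space:
        raise ValueError("z must satisfy 0 <= z < 2**fp_bits")
    # log1p(-x) computes ln(1 - x) accurately when x is tiny.
    return math.log1p(-z / space) / math.log1p(-1 / space)
```

The estimate is always at least Z and corrects upward for fingerprint collisions; the correction is negligible while Z is much smaller than 2^L and grows as the fingerprint space fills up.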

15 Counting distinct elements with SWAMP
Instead of paying $\Omega(\epsilon^{-2} \log W)$ bits as the existing algorithms do, SWAMP requires $O(W \log(W/\epsilon))$ bits, which is more efficient when $\epsilon$ is small.

16 Takeaways
A succinct sliding Bloom filter that can also count. Beats the state of the art for: sliding Bloom filters; per-flow frequency estimation; counting distinct elements; computing entropy (in the paper).

17 Any Questions?

18 Distribution Entropy over Sliding Windows
What is the distribution entropy of the window? $(1+\epsilon)$-multiplicative approximation using $O(\epsilon^{-2} \log W)$ bits and $O(\epsilon^{-2})$ update time (Braverman et al., PODS 2009).

19 Computing Entropy with SWAMP
We can track $\hat{H}$, the entropy of the fingerprint distribution. Guarantees: $\Pr[H \ge \hat{H}] = 1$ and $\Pr[H - \hat{H} \ge \epsilon] \le \delta$.
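
One way to track the fingerprint entropy in constant time per packet is sketched below. It is an illustration under stated assumptions, not necessarily the method in the paper: it uses the identity H = log2(n) - (1/n) * sum(f * log2 f) over fingerprint frequencies f, and updates the running sum whenever one frequency changes by one. The class name `WindowEntropy` and its methods are hypothetical, and the inputs are assumed to be fingerprints already.

```python
import math
from collections import Counter


def _xlogx(f: int) -> float:
    # Convention: 0 * log2(0) = 0.
    return f * math.log2(f) if f > 0 else 0.0


class WindowEntropy:
    """Tracks the entropy of the fingerprints in a sliding window of fixed size."""

    def __init__(self, window: int):
        self.window = window
        self.buffer = [None] * window  # cyclic buffer of fingerprints
        self.curr = 0
        self.filled = 0                # number of occupied slots (<= window)
        self.counts = Counter()        # fingerprint -> frequency
        self.s = 0.0                   # running sum of f * log2(f)

    def _bump(self, fp: int, delta: int) -> None:
        f = self.counts[fp]
        self.s += _xlogx(f + delta) - _xlogx(f)
        if f + delta == 0:
            del self.counts[fp]
        else:
            self.counts[fp] = f + delta

    def update(self, fp: int) -> None:
        old = self.buffer[self.curr]
        if old is not None:
            self._bump(old, -1)        # fingerprint leaving the window
        else:
            self.filled += 1
        self._bump(fp, +1)             # fingerprint entering the window
        self.buffer[self.curr] = fp
        self.curr = (self.curr + 1) % self.window

    def entropy(self) -> float:
        n = self.filled
        if n == 0:
            return 0.0
        return math.log2(n) - self.s / n
```

Because the running sum is kept in floating point, a long-running deployment of this sketch would accumulate rounding error and might need to recompute the sum periodically.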

20 Computing Entropy with SWAMP
Instead of paying $\Omega(\epsilon^{-2} \log W)$ bits as the existing algorithms do, SWAMP requires $O(W \log(W/\epsilon))$ bits, which is more efficient when $\epsilon$ is small.

21 Set Membership (Bloom Filter)
Did $x$ appear in the stream? How about another item? Can't allocate a bit for each potential flow! Traditionally, the filter must fit in SRAM.
SRAM capacity by year (SilkRoad, SIGCOMM 2017): 2012: 10-20 MB; 2014: 30-60 MB; 2016: 50-100 MB.

22 The Bloom Filter (Bloom, 1970)
Use a bit-array of size $m$ and $k$ hash functions $h_i : U \to \{1,\dots,m\}$. No false negatives! Few false positives. Example query: has the item appeared?
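
A minimal Python sketch of the classic Bloom filter described here. The hash functions are an assumption: the k indices are derived by salting SHA-256 with the function number, and positions run from 0 to m-1 rather than 1 to m. The class name `BloomFilter` is hypothetical.

```python
import hashlib


class BloomFilter:
    """Bit array of size m probed by k hash functions."""

    def __init__(self, m: int, k: int):
        self.m = m
        self.k = k
        self.bits = bytearray(m)  # one byte per bit, for readability

    def _positions(self, item: str):
        # Assumption: h_i(item) = SHA-256("i:item") mod m.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item: str) -> bool:
        # "Yes" only if every probed bit is set.
        return all(self.bits[pos] for pos in self._positions(item))
```

An inserted item always sets all of its bits, so a later query cannot miss it; a false positive occurs only when other items happen to have set all k bits of an item that was never inserted.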

23 The Timing Bloom Filter (Zhang and Guan, ICDCS 2008)
Use a timestamp-array of size $m$ and $k$ hash functions $h_i : U \to \{1,\dots,m\}$. Space: $O(W \log W \log \epsilon^{-1})$. Update/Query: $O(\log \epsilon^{-1})$. Example query: has the item appeared in the last 3 packets?
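
The timestamp-array idea can be sketched as follows. This is a simplified illustration rather than the published algorithm: timestamps are unbounded Python integers instead of wrapping counters of about log W bits, and the hash functions are SHA-256 salted with the function number. The names `TimingBloomFilter`, `add`, and `window` are hypothetical.

```python
import hashlib


class TimingBloomFilter:
    """Array of last-seen timestamps probed by k hash functions."""

    def __init__(self, m: int, k: int, window: int):
        self.m = m
        self.k = k
        self.window = window
        self.stamps = [None] * m  # None means "never written"
        self.now = 0              # number of packets seen so far

    def _positions(self, item: str):
        # Assumption: h_i(item) = SHA-256("i:item") mod m.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        self.now += 1
        for pos in self._positions(item):
            self.stamps[pos] = self.now

    def __contains__(self, item: str) -> bool:
        # The window covers timestamps now - window + 1 .. now.
        oldest = self.now - self.window + 1
        return all(
            self.stamps[pos] is not None and self.stamps[pos] >= oldest
            for pos in self._positions(item)
        )
```

Later insertions can only raise the timestamps at an item's positions, so an item inserted within the last `window` packets is always reported; an expired item is reported only if every one of its positions was refreshed by newer items.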

26 Sliding Window Approximate Measurement Protocol (SWAMP)
Current Item Pointer (curr); Cyclic Fingerprint Buffer (CFB); $h(x_n) =$

27 [Figure: plots with axes Precision, Recall, Mean Square Error, and Number of Packets [x100K]]

28 [Figure: three Mean Square Error plots]

29 [Figure: Recall plots against Number of Packets [x100K] and Number of Videos [x100K]]

30 [Figure: three Precision vs. Recall plots]

31 [Figure: three Recall plots]

