Maintaining Stream Statistics over Sliding Windows


1 Maintaining Stream Statistics over Sliding Windows
Ariel Rosenfeld

2 Streams Here, There, Everywhere!
Network traffic engineering. Call record analysis. Sensor data analysis. Medical and financial monitoring. Etc., etc., etc.

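3 Sliding Window Model
[Figure: a data stream along a time axis, with time increasing toward the current time; the window covers the most recent N elements (Window Size = N).]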

4 The Problem – Basic Counting
Count the number of 1s among the last N elements of a bit stream. Exact solution: Θ(N) bits of memory. Approximate solution: can we get a good approximation with o(N) memory?
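As a baseline for the Θ(N) exact solution, here is a minimal sketch that stores the whole window explicitly. The class name `ExactWindowCounter` and its methods are hypothetical names chosen for illustration; they do not come from the slides.

```python
from collections import deque


class ExactWindowCounter:
    """Exact count of 1s among the last n bits. Stores all n bits, so Theta(n) memory."""

    def __init__(self, n: int) -> None:
        self.window = deque(maxlen=n)  # the last n bits, oldest at the left
        self.ones = 0                  # number of 1s currently in the window

    def add(self, bit: int) -> None:
        # When the window is full, the oldest bit expires as the new one arrives.
        if len(self.window) == self.window.maxlen:
            self.ones -= self.window[0]
        self.window.append(bit)        # deque(maxlen=n) drops the oldest bit itself
        self.ones += bit

    def count(self) -> int:
        return self.ones
```

The counter can only discount an expiring bit because it still has that bit in memory; that is exactly what an o(N) structure cannot afford.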

5 Sliding Window Computation
Main difficulty: discounting expiring data. As each element arrives, one element expires, and the value of the expiring element cannot be known exactly unless the whole window is stored. How do we update our structure? One solution: use histograms. Example: bucket sums = (3,2,1,2).

6 Results
Exponential Histogram (EH): (1 + ε)-approximation, with k = 1/ε. Space: O((1/ε) log² N) bits. Time: O(log N) worst case, O(1) amortized.

7 Histograms (reminder)

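8 Example
k/2 = 1. Bucket sizes = 4,2,2,1. After the element that arrived this step: bucket sizes = 4,2,2,1,1.
[Figure: the stream with the bucket boundaries marked, the element that arrived this step, and the future to its right.]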

9 Observations
The error comes only from the last (leftmost) bucket. Bucket sizes (left to right): C_m, C_{m-1}, …, C_2, C_1. Absolute error ≤ C_m/2. Answer ≥ C_{m-1} + … + C_2 + C_1 + 1. Relative error ≤ C_m / (2(C_{m-1} + … + C_2 + C_1 + 1)). Maintain: C_m / (2(C_{m-1} + … + C_2 + C_1 + 1)) ≤ 1/k.
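The bound can be written out step by step. The symbols c (true count) and ĉ (estimate) are notation introduced here, and the form of the estimate, counting half of the last bucket, is an assumption inferred from the C_m/2 absolute-error bound. The last line plugs in the bucket sizes 4,2,2,1 from the example slide, where k/2 = 1.

```latex
\[
\hat{c} = \Bigl(\sum_{i=1}^{m} C_i\Bigr) - \frac{C_m}{2},
\qquad
|\hat{c} - c| \le \frac{C_m}{2},
\qquad
c \ge 1 + \sum_{i=1}^{m-1} C_i
\]
\[
\frac{|\hat{c} - c|}{c}
\le \frac{C_m}{2\,\bigl(1 + \sum_{i=1}^{m-1} C_i\bigr)}
\]
\[
\text{Buckets } 4,2,2,1 \text{ with } k = 2:\qquad
\frac{4}{2\,(1 + 2 + 2 + 1)} = \frac{4}{12} = \frac{1}{3} \le \frac{1}{2} = \frac{1}{k}
\]
```

The lower bound on c holds because the most recent 1 of the last bucket is still inside the window, and so is every 1 in the newer buckets.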

10 Observations
Every bucket will become the last bucket at some point in the future, since the new elements may all be zeros. Bucket sizes (left to right): C_m, C_{m-1}, …, C_2, C_1. So for every bucket i, require C_i / (2(C_{i-1} + … + C_2 + C_1 + 1)) ≤ 1/k.

11 Invariant
Maintain C_i / (2(C_{i-1} + … + C_2 + C_1 + 1)) ≤ 1/k. Bucket sizes increase exponentially from right to left. At least k/2 and at most k/2 + 1 buckets of each size (1, 2, 4, 8, …, 2^i, …).
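The invariant can be maintained by merging the two oldest buckets of a size whenever that size has one bucket too many. Below is a minimal sketch, not a definitive implementation. The class and method names are hypothetical. Two details are assumptions added here rather than taken from the slides: up to k + 1 buckets of size 1 are allowed so that the 1/k bound also holds for k > 2, and a last bucket of size 1 is counted in full, since its single 1 is known to be inside the window.

```python
import math


class ExponentialHistogram:
    """Approximate count of 1s in the last `window` bits (sketch)."""

    def __init__(self, window: int, k: int) -> None:
        self.window = window
        self.max_ones = k + 1                    # assumption: extra size-1 buckets
        self.max_other = math.ceil(k / 2) + 1    # k/2 + 1 buckets of every other size
        self.buckets = []   # (timestamp of newest 1, size), oldest bucket first
        self.time = 0
        self.total = 0      # sum of all bucket sizes

    def add(self, bit: int) -> None:
        self.time += 1
        # Drop buckets whose most recent 1 has left the window.
        while self.buckets and self.buckets[0][0] <= self.time - self.window:
            self.total -= self.buckets.pop(0)[1]
        if bit != 1:
            return
        self.buckets.append((self.time, 1))
        self.total += 1
        self._merge()

    def _merge(self) -> None:
        i = len(self.buckets) - 1                # start at the newest bucket
        while i >= 0:
            size = self.buckets[i][1]
            j = i
            while j >= 0 and self.buckets[j][1] == size:
                j -= 1                           # buckets j+1..i all have this size
            limit = self.max_ones if size == 1 else self.max_other
            if i - j <= limit:
                break                            # invariant holds; nothing to cascade
            # Merge the two oldest buckets of this size; keep the newer timestamp.
            newer_ts = self.buckets[j + 2][0]
            self.buckets[j + 1:j + 3] = [(newer_ts, 2 * size)]
            i = j + 1                            # the merged bucket may overflow its size

    def estimate(self) -> int:
        if not self.buckets:
            return 0
        # Count half of the oldest bucket; a size-1 oldest bucket counts in full.
        return self.total - self.buckets[0][1] // 2
```

For example, `ExponentialHistogram(window=50, k=4)` followed by repeated `add(bit)` calls should keep `estimate()` within a relative error of 1/4 of the true window count, while storing only the bucket list rather than the 50 bits.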

12 Guarantees
Error guarantee: relative error ≤ C_m / (2(C_{m-1} + … + C_2 + C_1 + 1)) ≤ 1/k. Number of buckets: O(k log N). Each bucket requires O(log N) bits. Total memory: O(k log² N) bits.
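The bucket count follows from the invariant with a short calculation. The symbol j (the exponent of the largest bucket size) is notation introduced here. The argument assumes, as the invariant states, at least k/2 buckets of every size smaller than the largest, and it uses the fact that all buckets other than the last lie entirely inside the window, so together they hold at most N ones.

```latex
\[
\frac{k}{2}\,\bigl(1 + 2 + \dots + 2^{\,j-1}\bigr)
= \frac{k}{2}\,\bigl(2^{\,j} - 1\bigr)
\le \sum_{i=1}^{m-1} C_i \le N
\quad\Longrightarrow\quad
j \le \log_2\!\Bigl(\frac{2N}{k} + 1\Bigr)
\]
\[
\#\text{buckets}
\le \Bigl(\frac{k}{2} + 1\Bigr)(j + 1)
\le \Bigl(\frac{k}{2} + 1\Bigr)\Bigl(\log_2\!\Bigl(\frac{2N}{k} + 1\Bigr) + 1\Bigr)
= O(k \log N)
\]
\[
\text{total memory} = O(k \log N)\ \text{buckets} \times O(\log N)\ \text{bits per bucket}
= O(k \log^2 N)\ \text{bits}
\]
```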

13 Random Counter
Use when the exact size of a bucket is not a must. Number of buckets: O(k log N). Each bucket requires O(log log N) bits. Total memory: O(k log N log log N) bits.
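One standard way to count up to N in O(log log N) bits is a Morris-style probabilistic counter, which stores only an exponent. Whether the slides mean exactly this construction is an assumption; the sketch below only illustrates the idea of trading exactness for space, and the class name `MorrisCounter` is chosen for illustration.

```python
import random


class MorrisCounter:
    """Probabilistic counter that stores only an exponent (sketch)."""

    def __init__(self, rng=None) -> None:
        self.exponent = 0   # a count near n needs about log2(log2(n)) bits here
        self.rng = rng if rng is not None else random.Random()

    def increment(self) -> None:
        # Bump the exponent with probability 2^(-exponent).
        if self.rng.random() < 2.0 ** -self.exponent:
            self.exponent += 1

    def estimate(self) -> int:
        # 2^exponent - 1 equals the true count in expectation.
        return 2 ** self.exponent - 1
```

A single counter has high variance, so its estimate is only rough; averaging several independent counters tightens it at the cost of more bits.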

