Ariel Rosenfeld.  Input: a stream of m integers i1, i2,..., im. (over 1,…,n)  Output: the number of distinct elements in the stream.  Example – count.

1 Ariel Rosenfeld

2  Input: a stream of m integers i1, i2,..., im (each in {1,…,n}).  Output: the number of distinct elements in the stream.  Example: count the number of distinct IP addresses you encounter.

3  Bit vector of size n (mark 1 when encountered).  Keep all m integers and answer naively. ◦ Sort and count: O(min{n, m log m})
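
The two exact baselines on this slide can be sketched as follows. This is a minimal illustration, not code from the presentation; the function names are hypothetical, and the bit vector is stored one byte per entry for simplicity.

```python
def count_distinct_bitvector(stream, n):
    """Exact count with an n-entry mark vector; values are assumed to lie in 1..n."""
    seen = bytearray(n + 1)  # one byte per value here; a true bit vector would pack 8 marks per byte
    count = 0
    for i in stream:
        if not seen[i]:
            seen[i] = 1      # mark 1 when first encountered
            count += 1
    return count


def count_distinct_sorted(stream):
    """Exact count by keeping all m items, sorting them, and counting value changes."""
    count = 0
    previous = None
    for x in sorted(stream):
        if x != previous:
            count += 1
            previous = x
    return count
```

Both functions return 3 on the stream 3, 1, 3, 2, 1. The first uses memory proportional to n regardless of the stream length; the second uses memory proportional to m.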

4  A deterministic exact algorithm is impossible using o(n) bits.  A deterministic approximation algorithm for this problem providing a (1 ± 1/1000)-approximation using o(n) bits is impossible.

5  Pick a random hash function h : [n] → [0, 1]  Calculate z = min{h(i) : i ∈ stream}  Output 1/z − 1

6  Equal integers get the same hash value.  We will show that the output is a good approximation.

7  This is idealized for 2 reasons: 1. We don’t have perfect precision. 2. We need at least n bits to remember the randomness associated with every i. Let’s ignore this for now…
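
A minimal sketch of the idealized algorithm, assuming the random hash function is simulated by drawing a uniform value the first time each integer appears and remembering it. The name `idealized_fm` and the `rng` parameter are hypothetical. The lookup table is exactly the stored randomness mentioned in reason 2, so this sketch does not save space; it only illustrates the estimator.

```python
import random


def idealized_fm(stream, rng=None):
    """Return 1/z - 1, where z is the minimum hash value seen in the stream."""
    rng = rng or random.Random()
    h = {}    # lazily sampled random function h: [n] -> [0, 1)
    z = 1.0   # running minimum; stays 1.0 on an empty stream
    for i in stream:
        if i not in h:
            h[i] = rng.random()  # equal integers reuse the same hash value
        z = min(z, h[i])
    if z == 0.0:
        return float("inf")      # guard against the (very unlikely) draw of exactly 0.0
    return 1.0 / z - 1.0
```

Repeating elements in the stream does not change the output, because the minimum depends only on the set of distinct elements.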

8  S = {j1,…, jt} (the distinct elements in the stream)  h(j1),..., h(jt) = X1,..., Xt are independent random variables from Unif[0, 1]  Z = min{X1,..., Xt}

9 [Figure: plots over the interval [0, 1], labeled P = 1 and F(x)]

10

11 1.. 2.. (HW) We get a bounded variance.
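
The formulas on this slide did not survive extraction. For reference, here is the standard computation for Z, the minimum of t independent Unif[0, 1] variables as defined on slide 8. That these are the two items the slide listed is an assumption.

```latex
\begin{align*}
\Pr[Z > x] &= (1 - x)^t \quad \text{for } x \in [0, 1], \\
\mathbb{E}[Z] &= \int_0^1 \Pr[Z > x]\,dx = \int_0^1 (1 - x)^t\,dx = \frac{1}{t + 1}, \\
\mathbb{E}[Z^2] &= \int_0^1 \Pr[Z > \sqrt{x}]\,dx = \int_0^1 (1 - \sqrt{x})^t\,dx
               = \int_0^1 2u\,(1 - u)^t\,du = \frac{2}{(t + 1)(t + 2)}, \\
\operatorname{Var}[Z] &= \frac{2}{(t + 1)(t + 2)} - \frac{1}{(t + 1)^2}
               = \frac{t}{(t + 1)^2 (t + 2)} < \frac{1}{(t + 1)^2}.
\end{align*}
```

Since E[Z] = 1/(t + 1), the output 1/z − 1 recovers t when z equals its expectation, and the variance bound is what makes a Chebyshev argument possible.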

12

13  As q increases, the approximation gets better (by Chebyshev's inequality).
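
The definition of q is not preserved in this transcript. The sketch below assumes that q is the number of independent copies of the basic estimator whose minima are averaged before inverting; under that assumption the variance of the average is Var[Z]/q, so the Chebyshev bound on the deviation shrinks as q grows. The name `averaged_fm` is hypothetical, and the hash functions are simulated with stored random draws as in the idealized setting.

```python
import random


def averaged_fm(stream, q, rng=None):
    """Average the minima of q independent idealized hash functions, then output 1/avg - 1."""
    rng = rng or random.Random()
    hashes = [dict() for _ in range(q)]  # q independent lazily sampled hash functions
    mins = [1.0] * q                     # running minimum for each copy
    for i in stream:
        for j in range(q):
            h = hashes[j]
            if i not in h:
                h[i] = rng.random()
            if h[i] < mins[j]:
                mins[j] = h[i]
    z_bar = sum(mins) / q
    return 1.0 / z_bar - 1.0 if z_bar > 0 else float("inf")
```

With a few hundred copies and a stream of 200 distinct values, the estimate typically lands within a few percent of 200, whereas a single copy can be off by a large factor.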

14  We want a hash function that doesn't need n or more bits to represent.  So we will use a k-wise independent family of hash functions H; each function in H can be represented using a small number of bits (log|H|). ◦ In lecture.

15  An example: let q > k be a prime power, and define H_poly,k to be the set of all polynomials of degree ≤ (k − 1) in Fq[x].  H_poly,k is a k-wise independent family.  Size: q^k  Needs: k log q bits.
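
A minimal sketch of sampling one function from H_poly,k, restricted for simplicity to a prime modulus p instead of a general prime power q (arithmetic in a prime-power field needs more machinery). The name `sample_poly_hash` and the `rng` parameter are hypothetical.

```python
import random


def sample_poly_hash(k, p, rng=None):
    """Sample a uniformly random polynomial of degree <= k - 1 over F_p (p assumed prime)."""
    rng = rng or random.Random()
    # coeffs[j] is the coefficient of x^j; storing k values below p takes about k * log2(p) bits
    coeffs = [rng.randrange(p) for _ in range(k)]

    def h(x):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule, highest-degree coefficient first
            acc = (acc * x + c) % p
        return acc

    return h
```

For example, `h = sample_poly_hash(4, 2**31 - 1)` draws from a 4-wise independent family using four coefficients below the Mersenne prime 2^31 − 1, and `h(x)` maps each x in 0..p−1 to a value in the same range.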

