Published by Tyler Ray. Modified over 8 years ago.
1
Sampling in Space Restricted Settings
Anup Bhattacharya, IIT Delhi
Joint work with Davis Issac (MPI), Ragesh Jaiswal (IITD) and Amit Kumar (IITD)
2
Introduction: Sampling
Select a subset of data
Computations on “representative” subset would approximate computations on whole data
Sampling variants:
– Uniform sampling
– Weighted sampling
Study sampling algorithms with limited space
3
Outline
4
Sampling in Streaming Settings
5
Streaming Settings: The Model
– Items/objects arrive in an online fashion
– Total number of items is not known in advance
– Typically poly(log(n)) space is allowed
– One/multi-pass, space usage, time per item, overall time complexity, randomness, accuracy of output
6
Sampling in Streaming Settings
7
Reservoir Sampling
[Figure: items of the stream arrive one by one; each item is either stored or thrown away.]
8
Reservoir Sampling
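The slide text for reservoir sampling did not survive, so here is a minimal sketch of the standard textbook version for a single sample, not necessarily the exact variant presented in the talk. The function name `reservoir_sample` and its parameters are hypothetical, chosen for illustration. The i-th arriving item replaces the stored one with probability 1/i, so the stream length need not be known in advance.

```python
import random
from typing import Iterable, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T],
                     rng: Optional[random.Random] = None) -> Optional[T]:
    """Return one item chosen uniformly at random from a stream of unknown length.

    Hypothetical helper for illustration: it stores a single item and a counter.
    """
    rng = rng or random.Random()
    chosen: Optional[T] = None
    for count, item in enumerate(stream, start=1):
        # randrange(count) equals 0 with probability 1/count:
        # "store" the new item in that case, otherwise "throw it away".
        if rng.randrange(count) == 0:
            chosen = item
    return chosen


if __name__ == "__main__":
    print(reservoir_sample(range(1, 11)))
```

Item i ends up as the final sample with probability (1/i) · (i/(i+1)) · … · ((n−1)/n) = 1/n, which is why the output is uniform over the n items seen. Note that this sketch draws fresh randomness at every step and keeps an exact counter.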
9
Uniform Sampling with ϵ-error
10
Lower Bound on Sampling with ϵ-error
11
Outline
12
Algorithm for Uniform Sampling with ϵ-error
13
Doubling-Chopping Algorithm
14
Doubling-Chopping algorithm, ϵ=1/16
15
0 1
16
00 10 01 11
17
Doubling-Chopping algorithm, ϵ=1/16 000 010 100 110 001 011 101 111
18
Doubling-Chopping algorithm, ϵ=1/16 0000 0010 0100 0110 1000 1010 1100 1110 0001 0011 0101 0111 1001 1011 1101 1111
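The string listings on the preceding slides grow from 2 to 4 to 8 to 16 binary strings. A minimal sketch that reproduces those listings is given below, under the assumption that a "doubling" step appends one bit to every stored string. The function name `double` is hypothetical, and the sketch does not model the Chop() step or the block structure, which the surviving slide text does not specify.

```python
from typing import List


def double(strings: List[str]) -> List[str]:
    """Append one bit to every string (assumed meaning of a doubling step).

    Strings ending in 0 are listed first, then strings ending in 1,
    matching the order in which the slides print them.
    """
    ending_in_0 = sorted(s + "0" for s in strings)
    ending_in_1 = sorted(s + "1" for s in strings)
    return ending_in_0 + ending_in_1


# Start from the two 1-bit strings and double three times.
level = ["0", "1"]
levels = [level]
for _ in range(3):
    level = double(level)
    levels.append(level)

for lvl in levels:
    print(" ".join(lvl))
```

Each call doubles the number of strings, so three doublings from the 1-bit strings give the 16 four-bit strings shown for ϵ=1/16.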
20
Doubling-Chopping algorithm, ϵ=1/16 Chop(): Move strings from blocks to new block 0110 1000 1010 1100 1110 0111 1001 1011 1101 1111 0101 0011 0001 0100 0010 0000
22
Doubling-Chopping algorithm, ϵ=1/16 Chop(): Move strings from blocks to new block 0110 1000 1010 1100 1110 0111 1001 1011 1101 1111 0011 0001 0100 0010 0000 0101
23
Doubling-Chopping algorithm, ϵ=1/16 0110 1000 1010 1100 1110 0111 1001 1011 1101 1111 0011 0001 0100 0010 0000 0101
24
Doubling-Chopping algorithm, ϵ=1/16 1000 1010 1100 1110 1001 1011 1101 1111 0001 0100 0010 0000 0101 0011 0111 0110
26
Algorithm Analysis
27
Algorithm Analysis (contd.)
29
Sampling in Query Model
30
Space Restricted Setting: Query Model
31
Sampling in Query Model
32
Thank You
Questions?