Wavelet Synopses with Predefined Error Bounds: Windfalls of Duality Panagiotis Karras DB seminar, 23 March, 2006.

1 Wavelet Synopses with Predefined Error Bounds: Windfalls of Duality Panagiotis Karras DB seminar, 23 March, 2006

2 Algorithms for Maximum-Error Wavelet Synopses
Restricted: Space-Bounded, Direct [GK04,05, Guha05]; Error-Bounded, Indirect [Muthukrishnan05]
Unrestricted: Space-Bounded, Direct [GH05,06]; Error-Bounded, Indirect ?

3 Compact Data Synopses
Useful in:
- Approximate Query Processing (exact answers not always required)
- Learning, Classification, Event Detection
- Data Mining, Selectivity Estimation
- Situations where massive data arrives in a stream

4 Haar Wavelets
- Wavelet transform: orthogonal transform for the hierarchical representation of functions and signals
- Haar wavelets: simplest wavelet system, easy to understand and implement
- Haar tree: structure for the visualization of decomposition and value reconstructions
- Synopsis: wavelet representation with few non-zero terms
[Figure: Haar tree over the data values 34, 16, 2, 20, 20, 0, 36, 16, with pairwise averages 25, 11, 10, 26 and coefficients 18, 0, 7, -8, 9, -9, 10, 10]
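The decomposition behind the Haar tree can be sketched as follows. This is a minimal illustration, assuming unnormalized coefficients (pairwise average and half-difference) and an input length that is a power of two; the function names `haar_decompose` and `haar_reconstruct` are hypothetical and not taken from the talk.

```python
def haar_decompose(data):
    """Return [overall average, coarsest detail, ..., finest details].

    Assumes len(data) is a power of two. Coefficients are unnormalized:
    average = (a + b) / 2, detail = (a - b) / 2 for each adjacent pair.
    """
    values = list(data)
    coeffs = []
    while len(values) > 1:
        pairs = list(zip(values[0::2], values[1::2]))
        details = [(a - b) / 2 for a, b in pairs]
        coeffs = details + coeffs          # coarser levels go in front
        values = [(a + b) / 2 for a, b in pairs]
    return values + coeffs


def haar_reconstruct(coeffs):
    """Invert haar_decompose: left child gets +detail, right child gets -detail."""
    values = [coeffs[0]]
    size = 1
    while size < len(coeffs):
        details = coeffs[size:2 * size]
        values = [x for v, d in zip(values, details) for x in (v + d, v - d)]
        size *= 2
    return values


data = [34, 16, 2, 20, 20, 0, 36, 16]   # data values from the slide
coeffs = haar_decompose(data)
print(coeffs)
print(haar_reconstruct(coeffs))
```

A synopsis in this setting keeps only a few of these coefficients as non-zero and reconstructs approximate values from them.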

5 Maximum-Error Metrics
Error metrics providing tight error guarantees for all reconstructed values:
- Maximum Absolute Error
- Maximum Relative Error with Sanity Bound (to avoid domination by small data values)
Aim: minimization of these metrics
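The two metrics can be written down directly. The sketch below assumes the common formulation in which the relative error of each value is divided by max(|d|, s), with s the sanity bound; the function names and the parameter name `sanity` are illustrative assumptions.

```python
def max_abs_error(data, approx):
    """Largest absolute difference between original and reconstructed values."""
    return max(abs(d - a) for d, a in zip(data, approx))


def max_rel_error(data, approx, sanity):
    """Largest relative error; the sanity bound keeps small |d| from dominating."""
    return max(abs(d - a) / max(abs(d), sanity) for d, a in zip(data, approx))


data = [34, 16, 2, 20]
approx = [18, 18, 18, 18]        # every value replaced by the overall average
print(max_abs_error(data, approx))
print(max_rel_error(data, approx, sanity=10))
```

Without the sanity bound, the value 2 would contribute a relative error of 8 and dominate the metric.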

6 Restricted Synopses
- Compute the Haar wavelet decomposition of D
- Preserve the best coefficient subset that satisfies the bound
Space-Bounded Problem [GK04,05, Guha05]: bound B on the number of non-zero coefficients
Error-Bounded Problem [Muthukrishnan05]: bound ε on the maximum error
- Faster indirect solution to the Space-Bounded Problem
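To make the restricted space-bounded problem concrete, here is an exhaustive-search sketch for tiny inputs. It is not one of the cited dynamic-programming algorithms: it simply tries every subset of at most B Haar coefficients and keeps the one with the smallest maximum absolute error. All function names are hypothetical, and the input length is assumed to be a power of two.

```python
from itertools import combinations


def haar_decompose(data):
    # Unnormalized Haar coefficients: [average, coarsest detail, ..., finest details].
    values, coeffs = list(data), []
    while len(values) > 1:
        pairs = list(zip(values[0::2], values[1::2]))
        coeffs = [(a - b) / 2 for a, b in pairs] + coeffs
        values = [(a + b) / 2 for a, b in pairs]
    return values + coeffs


def haar_reconstruct(coeffs):
    values, size = [coeffs[0]], 1
    while size < len(coeffs):
        details = coeffs[size:2 * size]
        values = [x for v, d in zip(values, details) for x in (v + d, v - d)]
        size *= 2
    return values


def best_restricted_synopsis(data, budget):
    """Brute force: keep at most `budget` original coefficients, zero the rest.

    Returns (max_abs_error, kept_indices). Exponential in len(data), so this
    is only meant for illustrating the problem statement on small inputs.
    """
    coeffs = haar_decompose(data)
    best = (float("inf"), ())
    for k in range(budget + 1):
        for kept in combinations(range(len(coeffs)), k):
            synopsis = [c if i in kept else 0 for i, c in enumerate(coeffs)]
            approx = haar_reconstruct(synopsis)
            err = max(abs(d - a) for d, a in zip(data, approx))
            if err < best[0]:
                best = (err, kept)
    return best


data = [34, 16, 2, 20, 20, 0, 36, 16]
print(best_restricted_synopsis(data, 1))
```

With a budget of one coefficient on this data, the search retains only the overall average.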

7 How does it work? Space-Bounded Problem
GK04,05: Global Tabulation
Guha05: Local Tabulation
- Tabulate four one-dimensional arrays:
- Extract from these four, delete them
- At most arrays concurrently stored
- Derive solution at the top, solve the problem again below
- time, space
[Figure: Haar tree with root, node i and its children i_L (+) and i_R (-); S = subset of selected ancestors]

8 How does it work? Error-Bounded Problem
Muthukrishnan05:
- At levels from bottom stop recursion, enter local search
- time, space
- No need to tabulate
- The solution to this problem is more economical
- Dual Space-Bounded problem solved indirectly via binary search
[Figure: Haar tree with root, node i and its children i_L (+) and i_R (-); S = subset of selected ancestors]
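The indirect route to the space-bounded problem can be sketched as a binary search over candidate error bounds. The sketch assumes an oracle `min_space(eps)` that returns the smallest number of non-zero coefficients achieving maximum error at most `eps`, and that this number never increases as `eps` grows. The oracle, the grid of candidate errors, and all names are assumptions made for illustration.

```python
def smallest_error_for_budget(min_space, budget, max_steps, step=1):
    """Smallest eps = j * step (0 <= j <= max_steps) with min_space(eps) <= budget.

    Relies on min_space being non-increasing in eps. Returns None when even
    the largest candidate error does not fit within the budget.
    """
    lo, hi = 0, max_steps
    if min_space(hi * step) > budget:
        return None
    while lo < hi:
        mid = (lo + hi) // 2
        if min_space(mid * step) <= budget:
            hi = mid          # feasible: try a tighter error bound
        else:
            lo = mid + 1      # infeasible: the error bound must grow
    return lo * step


# Toy stand-in for an error-bounded solver (hypothetical numbers).
def toy_min_space(eps):
    if eps < 5:
        return 7
    if eps < 18:
        return 3
    if eps < 36:
        return 1
    return 0


print(smallest_error_for_budget(toy_min_space, budget=3, max_steps=40))
```

Each probe calls the error-bounded solver once, so the total cost is the solver's cost times a logarithmic number of probes.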

9 Unrestricted Synopses [GH05,06]
- Forget about actual coefficient values
- Choose a best set of non-zero wavelet terms of any values
- In practice: examined values are multiples of a resolution step δ

10 Unrestricted Synopses [GH05,06]
- Approximation quality better than restricted
- Time asymptotically linear in n
But:
- Examined values bounded by M [GH05]
- Multiple guesses of the error result [GH06]
- Space-Bounded Problem: two-dimensional tabulation E(b,v) on each tree node
→ High running time and space demands

11 Our Approach: Wavelet Synopses with Predefined Error Bounds
Error-Bounded Problem, DP algorithm:
- Demarcates examined values using error bound ε
- Tabulates only S(v), one dimension per node
Space-Bounded Problem, Enhanced Solution:
- Calculate an upper bound for the error, use it to bound values
Space-Bounded Problem, Indirect Solution:
- Use binary search on the Error-Bounded problem

12 How does it work?
- One-dimensional tabulation on values only
- Examined incoming values v bounded by error bound
- Examined assigned values z also bounded
- Strong version of problem: minimize error within space
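A one-dimensional tabulation over incoming values can be sketched as below. This is an illustrative reading of the slide, not the talk's actual algorithm. It assumes the maximum absolute error metric, unnormalized Haar coefficients, and values restricted to multiples of a step `delta`. It also assumes that the incoming value v of a node lies within ε of the average of the data under that node, which holds because detail coefficients inside a subtree do not change the subtree's mean. The function name `min_coefficients` is hypothetical.

```python
import math
from functools import lru_cache


def min_coefficients(data, eps, delta=1):
    """Fewest non-zero Haar terms (values on the delta grid) with max abs error <= eps.

    data: list or tuple whose length is a power of two. Returns math.inf if the
    grid is too coarse. Float eps/delta may suffer rounding at range borders.
    """
    n = len(data)
    prefix = [0]
    for d in data:
        prefix.append(prefix[-1] + d)

    def grid_range(lo, hi):
        # Grid indices k with |k * delta - average(data[lo:hi])| <= eps.
        avg = (prefix[hi] - prefix[lo]) / (hi - lo)
        return math.ceil((avg - eps) / delta), math.floor((avg + eps) / delta)

    @lru_cache(maxsize=None)
    def solve(lo, hi, k):
        # S(v) for the node covering data[lo:hi], incoming value v = k * delta.
        if hi - lo == 1:
            return 0 if abs(k * delta - data[lo]) <= eps else math.inf
        mid = (lo + hi) // 2
        l_min, l_max = grid_range(lo, mid)
        r_min, r_max = grid_range(mid, hi)
        best = math.inf
        # Assigned value z must keep both children's incoming values in range.
        for z in range(max(l_min - k, k - r_max), min(l_max - k, k - r_min) + 1):
            cost = (z != 0) + solve(lo, mid, k + z) + solve(mid, hi, k - z)
            best = min(best, cost)
        return best

    k_min, k_max = grid_range(0, n)
    # The top (average) coefficient k costs one term unless it is zero.
    return min(((k != 0) + solve(0, n, k) for k in range(k_min, k_max + 1)),
               default=math.inf)


data = [34, 16, 2, 20, 20, 0, 36, 16]
for eps in (0, 18, 36):
    print(eps, min_coefficients(data, eps))
```

Because each node tabulates only over v, and both v and z range over intervals whose width is proportional to ε/δ, the table per node stays one-dimensional, in contrast to a two-dimensional E(b,v) table.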

13 Complexity Error-Bounded Problem: Time or Space or Space-Bounded Problem: Time vs. Space vs.

14 Experiments: Error-Bounded Problem

16 Experiments: Space-Bounded Problem

19 Related Work
- M. Garofalakis and A. Kumar. Deterministic wavelet thresholding for maximum-error metrics. PODS 2004
- S. Guha. Space efficiency in synopsis construction algorithms. VLDB 2005
- S. Guha and B. Harb. Wavelet Synopses for Data Streams: Minimizing Non-Euclidean Error. KDD 2005
- S. Muthukrishnan. Subquadratic algorithms for workload-aware Haar wavelet synopses. FSTTCS 2005
- S. Guha and B. Harb. Approximation algorithms for wavelet transform coding of data streams. SODA 2006

20 Thank you! Questions?

