Wavelet Synopses with Predefined Error Bounds: Windfalls of Duality Panagiotis Karras DB seminar, 23 March, 2006.

Presentation transcript:

Wavelet Synopses with Predefined Error Bounds: Windfalls of Duality Panagiotis Karras DB seminar, 23 March, 2006

Algorithms for Maximum-Error Wavelet Synopses
- Restricted, Space-Bounded, Direct: [GK04,05], [Guha05]
- Unrestricted, Space-Bounded, Direct: [GH05,06]
- Restricted, Error-Bounded: [Muthukrishnan05]
- Indirect: ?

Compact Data Synopses
Useful in:
- Approximate Query Processing (exact answers not always required)
- Learning, Classification, Event Detection
- Data Mining, Selectivity Estimation
- Situations where massive data arrives in a stream

Haar Wavelets
- Wavelet transform: orthogonal transform for the hierarchical representation of functions and signals
- Haar wavelets: simplest wavelet system, easy to understand and implement
- Haar tree: structure for visualizing the decomposition and the reconstruction of values
- Synopsis: wavelet representation with few non-zero terms
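The decomposition and reconstruction described on this slide can be sketched as follows. This is a minimal illustration, assuming the unnormalized averaging-and-differencing form of the Haar transform that is common in the synopsis literature; the function names `haar_decompose` and `haar_reconstruct` and the heap-style coefficient layout are choices made for this example, not taken from the slides.

```python
def haar_decompose(data):
    """Return Haar coefficients in heap order (assumed layout):
    index 0 holds the overall average, index 1 the top detail coefficient,
    and node i has children 2*i and 2*i + 1."""
    n = len(data)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    coeffs = [0.0] * n
    avgs = [float(x) for x in data]
    while len(avgs) > 1:
        half = len(avgs) // 2
        for k in range(half):
            left, right = avgs[2 * k], avgs[2 * k + 1]
            # Detail coefficient: half the difference of the two averages.
            coeffs[half + k] = (left - right) / 2
        avgs = [(avgs[2 * k] + avgs[2 * k + 1]) / 2 for k in range(half)]
    coeffs[0] = avgs[0]
    return coeffs


def haar_reconstruct(coeffs):
    """Walk the Haar tree top-down: a left child adds the coefficient,
    a right child subtracts it."""
    n = len(coeffs)
    values = [coeffs[0]]
    width = 1
    while width < n:
        nxt = []
        for k, v in enumerate(values):
            c = coeffs[width + k]
            nxt.extend([v + c, v - c])
        values = nxt
        width *= 2
    return values


example = [2, 2, 0, 2, 3, 5, 4, 4]
example_coeffs = haar_decompose(example)
```

A synopsis in this sense keeps only a few of the entries of `example_coeffs` and treats the rest as zero before calling `haar_reconstruct`.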

Maximum-Error Metrics
Error metrics providing tight error guarantees for all reconstructed values:
- Maximum Absolute Error
- Maximum Relative Error with Sanity Bound (to avoid domination by small data values)
Aim: minimization of these metrics
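The two metrics can be written down directly. The sketch below assumes the usual definition of the sanity bound, where the denominator of the relative error is never allowed to fall below a positive constant; the function names and the parameter name `sanity` are hypothetical.

```python
def max_abs_error(data, approx):
    """Largest absolute deviation over all reconstructed values."""
    return max(abs(d - a) for d, a in zip(data, approx))


def max_rel_error(data, approx, sanity):
    """Largest relative deviation; `sanity` (> 0) caps the influence
    of data values that are close to zero."""
    return max(abs(d - a) / max(abs(d), sanity) for d, a in zip(data, approx))


data = [100.0, 0.5]
approx = [90.0, 1.5]
abs_err = max_abs_error(data, approx)              # driven by the large value
rel_err = max_rel_error(data, approx, sanity=1.0)  # small value is damped
```

With a smaller sanity bound such as 0.25, the second data value would dominate the relative error, which is the effect the bound is meant to limit.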

Restricted Synopses
- Compute the Haar wavelet decomposition of D
- Preserve the best coefficient subset that satisfies the bound
Space-Bounded Problem [GK04,05, Guha05]: bound B on the number of non-zero coefficients
Error-Bounded Problem [Muthukrishnan05]: bound ε on the maximum error; a faster, indirect solution to the Space-Bounded Problem
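To make the two problem statements concrete, here is a brute-force sketch for tiny inputs. It enumerates subsets of the original Haar coefficients and is meant only to pin down the definitions; it is not the dynamic programming of the cited papers. The helper names and the heap-style coefficient layout are assumptions of this example.

```python
from itertools import combinations


def haar_decompose(data):
    """Unnormalized Haar coefficients in heap order (index 0 = average)."""
    n = len(data)
    coeffs, avgs = [0.0] * n, [float(x) for x in data]
    while len(avgs) > 1:
        half = len(avgs) // 2
        for k in range(half):
            coeffs[half + k] = (avgs[2 * k] - avgs[2 * k + 1]) / 2
        avgs = [(avgs[2 * k] + avgs[2 * k + 1]) / 2 for k in range(half)]
    coeffs[0] = avgs[0]
    return coeffs


def haar_reconstruct(coeffs):
    values, width = [coeffs[0]], 1
    while width < len(coeffs):
        values = [x for k, v in enumerate(values)
                  for x in (v + coeffs[width + k], v - coeffs[width + k])]
        width *= 2
    return values


def subset_error(data, coeffs, kept):
    """Maximum absolute error when only the coefficients in `kept` survive."""
    pruned = [c if i in kept else 0.0 for i, c in enumerate(coeffs)]
    return max(abs(d - a) for d, a in zip(data, haar_reconstruct(pruned)))


def space_bounded(data, budget):
    """Smallest maximum error using at most `budget` original coefficients."""
    coeffs = haar_decompose(data)
    nonzero = [i for i, c in enumerate(coeffs) if c != 0]
    best = None
    for size in range(min(budget, len(nonzero)) + 1):
        for kept in combinations(nonzero, size):
            err = subset_error(data, coeffs, kept)
            if best is None or err < best[0]:
                best = (err, kept)
    return best


def error_bounded(data, eps):
    """Fewest original coefficients whose maximum error is at most `eps`."""
    coeffs = haar_decompose(data)
    nonzero = [i for i, c in enumerate(coeffs) if c != 0]
    for size in range(len(nonzero) + 1):
        for kept in combinations(nonzero, size):
            if subset_error(data, coeffs, kept) <= eps:
                return size, kept
    return len(nonzero), tuple(nonzero)  # fallback: keep everything
```

For the data `[0, 0, 2, 6]`, one retained coefficient cannot bring the maximum error below 4, while an error bound of 2 needs two coefficients. The enumeration is exponential in the number of coefficients, so it is only usable on toy inputs.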

How does it work? Space-Bounded Problem
GK04,05: Global Tabulation
Guha05: Local Tabulation
- Tabulate four one-dimensional arrays: [...]
- Extract [...] from these four, delete them
- At most [...] arrays concurrently stored
- Derive solution at the top, solve the problem again below
- [...] time, [...] space
[Figure: Haar tree with the root, a node i and its children i_L and i_R, edge signs + and -; S = subset of selected ancestors]

How does it work? Error-Bounded Problem
Muthukrishnan05
- At [...] levels from the bottom, stop recursion and enter local search
- [...] time, [...] space
- No need to tabulate
- The solution to this problem is more economical
- The dual Space-Bounded Problem is solved indirectly via binary search
[Figure: Haar tree with the root, a node i and its children i_L and i_R, edge signs + and -; S = subset of selected ancestors]

Unrestricted Synopses [GH05,06]
- Forget about the actual coefficient values
- Choose a best set of non-zero wavelet terms of any values
- In practice: examined values are multiples of a resolution step δ

Unrestricted Synopses [GH05,06]
- Approximation quality better than restricted
- Time asymptotically linear in n
But:
- Examined values bounded by M [GH05]
- Multiple guesses of the error result [GH06]
- Space-Bounded Problem: two-dimensional tabulation E(b,v) on each tree node
→ High running time and space demands
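A small brute-force comparison illustrates why unrestricted terms can approximate better than restricted ones. This is only an exhaustive search on a four-value input, not the algorithm of [GH05,06]; the function `best_synopsis`, the candidate-list interface, and the choice to bound the grid by the largest absolute data value are assumptions of this sketch.

```python
from itertools import combinations, product


def haar_decompose(data):
    """Unnormalized Haar coefficients in heap order (index 0 = average)."""
    n = len(data)
    coeffs, avgs = [0.0] * n, [float(x) for x in data]
    while len(avgs) > 1:
        half = len(avgs) // 2
        for k in range(half):
            coeffs[half + k] = (avgs[2 * k] - avgs[2 * k + 1]) / 2
        avgs = [(avgs[2 * k] + avgs[2 * k + 1]) / 2 for k in range(half)]
    coeffs[0] = avgs[0]
    return coeffs


def haar_reconstruct(coeffs):
    values, width = [coeffs[0]], 1
    while width < len(coeffs):
        values = [x for k, v in enumerate(values)
                  for x in (v + coeffs[width + k], v - coeffs[width + k])]
        width *= 2
    return values


def best_synopsis(data, budget, candidates):
    """Exhaustively pick at most `budget` terms; candidates[i] lists the
    values that term i may take. Returns (max abs error, {index: value})."""
    n = len(data)
    best_err, best_terms = max(abs(d) for d in data), {}  # empty synopsis
    for size in range(1, budget + 1):
        for positions in combinations(range(n), size):
            for values in product(*(candidates[i] for i in positions)):
                coeffs = [0.0] * n
                for i, z in zip(positions, values):
                    coeffs[i] = z
                approx = haar_reconstruct(coeffs)
                err = max(abs(d - a) for d, a in zip(data, approx))
                if err < best_err:
                    best_err, best_terms = err, dict(zip(positions, values))
    return best_err, best_terms


data = [0, 0, 2, 6]
delta = 1.0                                   # resolution step
bound = int(max(abs(d) for d in data) / delta)
grid = [k * delta for k in range(-bound, bound + 1) if k != 0]

# Restricted: each term may only take its original Haar coefficient value.
restricted_err, restricted_terms = best_synopsis(
    data, 1, [[c] for c in haar_decompose(data)])
# Unrestricted: each term may take any non-zero multiple of delta on the grid.
unrestricted_err, unrestricted_terms = best_synopsis(
    data, 1, [grid] * len(data))
```

With a budget of one term, the restricted search cannot do better than a maximum error of 4, while the unrestricted search reaches 3 by storing the value 3 at the root instead of the true average 2.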

Our Approach: Wavelet Synopses with Predefined Error Bounds
Error-Bounded Problem, DP algorithm:
- Demarcates examined values using the error bound ε
- Tabulates only S(v), one dimension per node
Space-Bounded Problem, Enhanced Solution:
- Calculate an upper bound for the error, use it to bound values
Space-Bounded Problem, Indirect Solution:
- Use binary search on the Error-Bounded Problem

How does it work?
- One-dimensional tabulation on values only
- Examined incoming values v bounded by the error bound
- Examined assigned values z also bounded
- Strong version of the problem: minimize error within space
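The one-dimensional tabulation can be sketched as a memoized recursion over the Haar tree, followed by the indirect space-bounded solution via binary search on the error bound. This is a minimal sketch in the spirit of the slide, not the author's implementation: it assumes maximum absolute error, values restricted to integer multiples of δ, and a heap-ordered tree; the names `min_terms` and `min_error_for_budget` and the tolerance `tol` are hypothetical.

```python
import math
from functools import lru_cache


def min_terms(data, eps, delta):
    """Fewest non-zero terms (with values k * delta) such that every
    reconstructed value is within `eps` of the data. Returns math.inf
    if the grid is too coarse for any solution."""
    n = len(data)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    # An incoming value is the average of the leaves below it, so values
    # outside [min - eps, max + eps] can never lead to a feasible subtree.
    lo = math.ceil((min(data) - eps) / delta)
    hi = math.floor((max(data) + eps) / delta)

    @lru_cache(maxsize=None)
    def S(node, v):
        """Minimum terms in the subtree of `node` given incoming value v*delta."""
        if node >= n:  # leaf: nodes n..2n-1 map to data[0..n-1]
            return 0 if abs(data[node - n] - v * delta) <= eps else math.inf
        best = math.inf
        # Assigned value z is bounded so both children stay within [lo, hi].
        for z in range(max(lo - v, v - hi), min(hi - v, v - lo) + 1):
            cost = (z != 0) + S(2 * node, v + z) + S(2 * node + 1, v - z)
            best = min(best, cost)
        return best

    # The root term sets the value that flows into node 1.
    return min(((z != 0) + S(1, z) for z in range(lo, hi + 1)),
               default=math.inf)


def min_error_for_budget(data, budget, delta, tol=1e-6):
    """Indirect space-bounded solution: bisect on eps until the
    error-bounded routine needs at most `budget` terms."""
    low, high = 0.0, max(abs(d) for d in data)  # high works with zero terms
    if min_terms(data, low, delta) <= budget:
        return low
    while high - low > tol:
        mid = (low + high) / 2
        if min_terms(data, mid, delta) <= budget:
            high = mid
        else:
            low = mid
    return high
```

The bisection relies on the fact that the number of required terms cannot grow when ε grows. For `[0, 0, 2, 6]` with δ = 1, a bound of ε = 3 needs one term and ε = 2 needs two, so a budget of one term leads the search to an error of 3.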

Complexity
- Error-Bounded Problem: Time [...] or [...]; Space [...] or [...]
- Space-Bounded Problem: Time [...] vs. [...]; Space [...] vs. [...]

Experiments: Error-Bounded Problem

Experiments: Space-Bounded Problem

Related Work
M. Garofalakis and A. Kumar. Deterministic wavelet thresholding for maximum-error metrics. PODS 2004.
S. Guha. Space efficiency in synopsis construction algorithms. VLDB 2005.
S. Guha and B. Harb. Wavelet synopses for data streams: minimizing non-Euclidean error. KDD 2005.
S. Muthukrishnan. Subquadratic algorithms for workload-aware Haar wavelet synopses. FSTTCS 2005.
S. Guha and B. Harb. Approximation algorithms for wavelet transform coding of data streams. SODA 2006.

Thank you! Questions?