CMU SCS 15-826: Multimedia Databases and Data Mining Lecture #7: Spatial Access Methods - Metric trees C. Faloutsos.

CMU SCS Copyright: C. Faloutsos (2014)

#2 Must-read material
- MM Textbook, Chapter 5
- Roberto F. Santos Filho, Agma Traina, Caetano Traina Jr., and Christos Faloutsos: Similarity search without tears: the OMNI family of all-purpose access methods. ICDE, Heidelberg, Germany, April 2001 (code available online).

#3 Outline
Goal: ‘Find similar / interesting things’
- Intro to DB
- Indexing - similarity search
- Data Mining

#4 Indexing - Detailed outline
- primary key indexing
- secondary key / multi-key indexing
- spatial access methods
  - problem definition
  - z-ordering
  - R-trees
  - misc
- fractals
- text

#5 SAMs - Detailed outline
- spatial access methods
  - problem definition
  - z-ordering
  - R-trees
  - misc topics
    - metric trees
- fractals
- text, ...

#6 Metric trees: What if we only have a distance function d(o1, o2)? (Applications?)

#7 Metric trees: (Assumption: d() is a metric: positive, symmetric, and obeying the triangle inequality.) Then we can use some variation of ‘Vantage Point’ trees [Yianilos]. There are many variations (GNAT trees [Brin95], MVP-trees [Ozsoyoglu+], ...).
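The metric assumption can be spot-checked on a sample of objects. The following is a minimal sketch, not part of the lecture: `is_metric` and `edit_distance` are illustrative names, and the string edit distance is just one plausible answer to the "Applications?" question. A brute-force check like this can only refute the assumption on the given sample; it cannot prove it in general.

```python
from itertools import product


def edit_distance(s, t):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                # delete cs
                           cur[j - 1] + 1,             # insert ct
                           prev[j - 1] + (cs != ct)))  # substitute or match
        prev = cur
    return prev[-1]


def is_metric(objects, d, tol=1e-9):
    """Check the metric axioms for d() on every pair and triple of the sample."""
    for a, b in product(objects, repeat=2):
        if d(a, b) < -tol:                     # non-negativity
            return False
        if a == b and abs(d(a, b)) > tol:      # d(x, x) = 0
            return False
        if abs(d(a, b) - d(b, a)) > tol:       # symmetry
            return False
    for a, b, c in product(objects, repeat=3):
        if d(a, c) > d(a, b) + d(b, c) + tol:  # triangle inequality
            return False
    return True


words = ["cat", "cart", "dog", ""]
print(is_metric(words, edit_distance))                     # edit distance passes
print(is_metric([0, 1, 2], lambda a, b: (a - b) ** 2))     # squared difference fails
```

Squared difference fails because d(0, 2) = 4 exceeds d(0, 1) + d(1, 2) = 2, so the pruning arguments used by metric trees would not be safe for it.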

#8 Metric trees: Finally, M-trees [Ciaccia, Patella, Zezula, VLDB 97]. M-trees = ‘ball-trees’: objects are grouped in spheres.

#9 Metric trees: Finally, M-trees [Ciaccia, Patella, Zezula, VLDB 97]. M-trees = ‘ball-trees’: Minimum Bounding spheres.
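A bounding sphere lets a range query skip a whole group after a single distance computation: by the triangle inequality, if d(q, center) > r + radius, no member can lie within r of q. The sketch below shows that test on a flat list of balls. It is a simplification for illustration only (a real M-tree nests balls in a paged tree); the `Ball` class and `range_search` function are hypothetical names.

```python
import math


class Ball:
    """A group of objects enclosed by a sphere around a chosen center."""

    def __init__(self, center, members, d):
        self.center = center
        self.members = members
        # Covering radius: distance from the center to its farthest member.
        self.radius = max(d(center, m) for m in members)


def range_search(balls, q, r, d):
    """Return (objects within distance r of q, number of distance computations)."""
    hits, computed = [], 0
    for ball in balls:
        computed += 1
        if d(q, ball.center) > r + ball.radius:
            continue  # the whole ball is too far away: prune it
        for m in ball.members:
            computed += 1
            if d(q, m) <= r:
                hits.append(m)
    return hits, computed


d = math.dist  # Euclidean distance, standing in for any metric
balls = [
    Ball((0, 0), [(0, 0), (1, 0), (0, 1)], d),
    Ball((10, 10), [(10, 10), (11, 10), (10, 11)], d),
]
hits, computed = range_search(balls, (0.5, 0), 1.0, d)
print(hits, computed)
```

In this example the second ball is discarded after one distance computation, so its three members are never examined.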

#12 Metric trees: Search (range and k-nn): like R-trees. Split?

#13 Metric trees: Search (range and k-nn): like R-trees. Split? Several criteria:
- minimize max radius (or sum of radii)
- (even: random!)
Algorithm?

#14 Metric trees: Search (range and k-nn): like R-trees. Split? Several criteria:
- minimize max radius (or sum of radii)
- (even: random!)
Algorithm? E.g., similar to the quadratic split of Guttman.
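The "minimize max radius" criterion can be illustrated with an exhaustive sketch: try every pair of objects as the two new centers, assign each object to its nearer center, and keep the pair whose larger covering radius is smallest. This is an assumption-laden illustration rather than the exact procedure from the slides or the M-tree paper; the function name `split_min_max_radius` is hypothetical, and a practical implementation would likely cache distances or sample candidate pairs.

```python
from itertools import combinations


def split_min_max_radius(objects, d):
    """Split an overflowing node into two balls, minimizing the larger radius.

    Returns (cost, (center1, group1, radius1), (center2, group2, radius2)).
    """
    best = None
    for p1, p2 in combinations(objects, 2):
        g1, g2 = [], []
        for o in objects:
            # Assign each object to the nearer of the two candidate centers.
            (g1 if d(o, p1) <= d(o, p2) else g2).append(o)
        r1 = max((d(o, p1) for o in g1), default=0.0)
        r2 = max((d(o, p2) for o in g2), default=0.0)
        cost = max(r1, r2)  # use r1 + r2 here for the "sum of radii" criterion
        if best is None or cost < best[0]:
            best = (cost, (p1, g1, r1), (p2, g2, r2))
    return best


points = [0, 1, 2, 10, 11, 12]
cost, left, right = split_min_max_radius(points, lambda a, b: abs(a - b))
print(cost, left, right)
```

On these six 1-d points the best split promotes 1 and 11 as centers, giving two balls of radius 1.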

#15 Metric trees - variations: OMNI tree [Filho+, ICDE2001]

#16 Metric trees - OMNI trees: How to turn objects into vectors? (Assume that distance computations are expensive; we need to answer range/nn queries quickly.)

#17 Metric trees - OMNI trees: How to turn objects into vectors? A: pick n ‘anchor’ objects; record the distance of each object from them -> n-d vector.
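The anchor idea fits in a few lines. The sketch below maps strings to 2-d vectors using two anchors and the Hamming distance; the function names, the choice of anchors, and the choice of distance are all assumptions made for illustration.

```python
def hamming(s, t):
    """Number of differing positions between two equal-length strings (a metric)."""
    return sum(a != b for a, b in zip(s, t))


def omni_coordinates(objects, anchors, d):
    """Map each object to the vector of its distances from the n anchors."""
    return [[d(o, a) for a in anchors] for o in objects]


names = ["karolin", "kathrin", "kerstin"]
anchors = ["karolin", "kerstin"]  # n = 2 anchors, chosen arbitrarily here
coords = omni_coordinates(names, anchors, hamming)
print(coords)  # [[0, 3], [3, 4], [3, 0]]
```

Each object now has ordinary numeric coordinates, even though the original objects are strings with no natural vector representation.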

#20 Metric trees - OMNI trees: We could put OMNI coordinates in an R-tree (or other SAM, or even do a seq. scan) and still answer range and nn queries! (See [Filho’01] for details.)
[Figure: range query of radius r, with anchor1 and anchor2 and the distances d1 and d2 from them.]

#21 OMNI trees - range queries: ... and still answer range and nn queries! (See [Filho’01] for details.)
[Figure: range query of radius r around the query object, with anchor1 and the distances d1 and d2.]
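One way to see why range queries still work on OMNI coordinates: for any anchor a, the triangle inequality gives d(q, o) >= |d(q, a) - d(o, a)|, so an object whose coordinate differs from the query's by more than r on any anchor cannot be an answer. The sketch below applies this filter with a sequential scan and then verifies the survivors with the real distance. It is a simplified illustration under assumed names (`omni_range_query`), not the algorithm of [Filho’01] verbatim.

```python
import math


def omni_range_query(objects, coords, anchors, q, r, d):
    """Range query using precomputed anchor distances as a cheap filter.

    Returns (answers, number of objects that needed a real distance computation).
    """
    q_coords = [d(q, a) for a in anchors]  # one distance per anchor
    hits, refined = [], 0
    for o, c in zip(objects, coords):
        # Filter step: |d(q, a) - d(o, a)| > r for some anchor rules o out.
        if any(abs(qc - oc) > r for qc, oc in zip(q_coords, c)):
            continue
        # Refinement step: only survivors pay for an actual distance call.
        refined += 1
        if d(q, o) <= r:
            hits.append(o)
    return hits, refined


d = math.dist  # Euclidean distance, standing in for an expensive metric
objects = [(0, 0), (1, 0), (5, 5), (9, 9), (10, 10)]
anchors = [(0, 0), (10, 10)]
coords = [[d(o, a) for a in anchors] for o in objects]  # built once, offline

hits, refined = omni_range_query(objects, coords, anchors, (0.5, 0.5), 1.0, d)
print(hits, refined)
```

Here three of the five objects are discarded by coordinate comparisons alone. The filter never drops a true answer, although it may let through candidates that the refinement step then rejects.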

#22 OMNI trees - range queries: Result: faster than M-trees and seq. scanning (especially if distance computations are expensive).
[Figure: range query of radius r around the query object, with anchor1 and the distances d1 and d2.]

#23 Metric trees - OMNI trees: Q1: how to choose anchors? Q2: ... and how many?

#24 Conclusions for SAMs
- z-ordering and R-trees for low-d points and regions: very successful
- M-trees & variants for metric datasets
- beware of the ‘dimensionality curse’
  - Estimate ‘intrinsic’ dimensionality (‘fractals’)
  - Project to lower dimensions (‘SVD/PCA’)

#25-#29 References

Aurenhammer, F. (Sept. 1991). “Voronoi Diagrams - A Survey of a Fundamental Geometric Data Structure.” ACM Computing Surveys 23(3).

Bentley, J. L., B. W. Weide, et al. (Dec. 1980). “Optimal Expected-Time Algorithms for Closest Point Problems.” ACM Trans. on Mathematical Software (TOMS) 6(4).

Burkhard, W. A. and R. M. Keller (Apr. 1973). “Some Approaches to Best-Match File Searching.” Comm. of the ACM (CACM) 16(4).

Böhm, C., S. Berchtold, and D. A. Keim (2001). “Searching in high-dimensional spaces: Index structures for improving the performance of multimedia databases.” ACM Comput. Surv. 33(3).

Chávez, E., G. Navarro, R. A. Baeza-Yates, and J. L. Marroquín (2001). “Searching in metric spaces.” ACM Comput. Surv. 33(3).

Ciaccia, P., M. Patella, et al. (1997). “M-tree: An Efficient Access Method for Similarity Search in Metric Spaces.” VLDB.

Filho, R. F. S., A. Traina, et al. (2001). “Similarity search without tears: the OMNI family of all-purpose access methods.” ICDE, Heidelberg, Germany.

Friedman, J. H., F. Baskett, et al. (Oct. 1975). “An Algorithm for Finding Nearest Neighbors.” IEEE Trans. on Computers (TOC) C-24.

Fukunaga, K. and P. M. Narendra (July 1975). “A Branch and Bound Algorithm for Computing k-Nearest Neighbors.” IEEE Trans. on Computers (TOC) C-24(7).

Shapiro, M. (May 1977). “The Choice of Reference Points in Best-Match File Searching.” Comm. of the ACM (CACM) 20(5).

Shasha, D. and T.-L. Wang (Apr. 1990). “New Techniques for Best-Match Retrieval.” ACM TOIS 8(2).

Traina, C., A. J. M. Traina, et al. (2000). “Slim-Trees: High Performance Metric Trees Minimizing Overlap Between Nodes.” EDBT, Konstanz, Germany.