Skyline query with R*-Tree: Branch and Bound Skyline (BBS) Algorithm

Slides:



Advertisements
Similar presentations
Ken C. K. Lee, Baihua Zheng, Huajing Li, Wang-Chien Lee VLDB 07 Approaching the Skyline in Z Order 1.
Advertisements

The A-tree: An Index Structure for High-dimensional Spaces Using Relative Approximation Yasushi Sakurai (NTT Cyber Space Laboratories) Masatoshi Yoshikawa.
1 A FAIR ASSIGNMENT FOR MULTIPLE PREFERENCE QUERIES Leong Hou U, Nikos Mamoulis, Kyriakos Mouratidis Gruppo 10: Paolo Barboni, Tommaso Campanella, Simone.
The Skyline Operator (Stephan Borzsonyi, Donald Kossmann, Konrad Stocker) Presenter: Shehnaaz Yusuf March 2005.
Multidimensional Indexing
Presented by Russell Myers Paper by Ming-Chuan Wu and Alejandro P. Buchmann.
July 29HDMS'08 Caching Dynamic Skyline Queries D. Sacharidis 1, P. Bouros 1, T. Sellis 1,2 1 National Technical University of Athens 2 Institute for Management.
Bitmap Index Buddhika Madduma 22/03/2010 Web and Document Databases - ACS-7102.
RRT-Connect path solving J.J. Kuffner and S.M. LaValle.
Query Optimization 3 Cost Estimation R&G, Chapters 12, 13, 14 Lecture 15.
Spatio-Temporal Databases. Introduction Spatiotemporal Databases: manage spatial data whose geometry changes over time Geometry: position and/or extent.
Branch and Bound Algorithm for Solving Integer Linear Programming
Multiple Object Class Detection with a Generative Model K. Mikolajczyk, B. Leibe and B. Schiele Carolina Galleguillos.
Spatial Indexing I Point Access Methods. Spatial Indexing Point Access Methods (PAMs) vs Spatial Access Methods (SAMs) PAM: index only point data Hierarchical.
Spatio-Temporal Databases. Outline Spatial Databases Temporal Databases Spatio-temporal Databases Multimedia Databases …..
Spatial Indexing I Point Access Methods. Spatial Indexing Point Access Methods (PAMs) vs Spatial Access Methods (SAMs) PAM: index only point data Hierarchical.
FLANN Fast Library for Approximate Nearest Neighbors
AAU A Trajectory Splitting Model for Efficient Spatio-Temporal Indexing Presented by YuQing Zhang  Slobodan Rasetic Jorg Sander James Elding Mario A.
SUBSKY: Efficient Computation of Skylines in Subspaces Authors: Yufei Tao, Xiaokui Xiao, and Jian Pei Conference: ICDE 2006 Presenter: Kamiru Superviosr:
Maximal Vector Computation in Large Data Sets The 31st International Conference on Very Large Data Bases VLDB 2005 / VLDB Journal 2006, August Parke Godfrey,
1 Progressive Computation of Constrained Subspace Skyline Queries Evangelos Dellis 1 Akrivi Vlachou 1 Ilya Vladimirskiy 1 Bernhard Seeger 1 Yannis Theodoridis.
Merge Sort. What Is Sorting? To arrange a collection of items in some specified order. Numerical order Lexicographical order Input: sequence of numbers.
Towards Robust Indexing for Ranked Queries Dong Xin, Chen Chen, Jiawei Han Department of Computer Science University of Illinois at Urbana-Champaign VLDB.
Collective Buffering: Improving Parallel I/O Performance By Bill Nitzberg and Virginia Lo.
Fast Packet Classification Using Bloom filters Authors: Sarang Dharmapurikar, Haoyu Song, Jonathan Turner, and John Lockwood Publisher: ANCS 2006 Present:
Efficient Progressive Processing of Skyline Queries in Peer-to-Peer Systems INFOSCALE’06.
Indexing and Visualizing Multidimensional Data I. Csabai, M. Trencséni, L. Dobos, G. Herczegh, P. Józsa, N. Purger Eötvös University,Budapest.
Nearest Neighbor Queries Chris Buzzerd, Dave Boerner, and Kevin Stewart.
Efficient Processing of Top-k Spatial Preference Queries
Zhuo Peng, Chaokun Wang, Lu Han, Jingchao Hao and Yiyuan Ba Proceedings of the Third International Conference on Emerging Databases, Incheon, Korea (August.
Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality Piotr Indyk, Rajeev Motwani The 30 th annual ACM symposium on theory of computing.
Multimedia and Time-Series Data When Is “ Nearest Neighbor ” Meaningful? Group member: Terry Chan, Edward Chu, Dominic Leung, David Mak, Henry Yeung, Jason.
1 CSIS 7101: CSIS 7101: Spatial Data (Part 1) The R*-tree : An Efficient and Robust Access Method for Points and Rectangles Rollo Chan Chu Chung Man Mak.
Indexing OLAP Data Sunita Sarawagi Monowar Hossain York University.
Introduction to Database Systems1 External Sorting Query Processing: Topic 0.
1 Review of report "LSDX: A New Labeling Scheme for Dynamically Updating XML Data"
Spatio-Temporal Databases. Term Project Groups of 2 students You can take a look on some project ideas from here:
Parallel Computation of Skyline Queries COSC6490A Fall 2007 Slawomir Kmiec.
1 Spatial Query Processing using the R-tree Donghui Zhang CCIS, Northeastern University Feb 8, 2005.
CSIS7101 – Advanced Database Technologies Spatio-Temporal Data (Part 3) The MV3-Tree: A Spatio-Temporal Access Method for Timestamp and Interval Queries.
Data Structures I (CPCS-204) Week # 2: Algorithm Analysis tools Dr. Omar Batarfi Dr. Yahya Dahab Dr. Imtiaz Khan.
Online Skyline Queries. Agenda Motivation: Top N vs. Skyline „Classic“ Algorithms –Block-Nested Loop Algorithm –Divide & Conquer Algorithm Online Algorithm.
Indexing Multidimensional Data
School of Computing Clemson University Fall, 2012
Tian Xia and Donghui Zhang Northeastern University
Strategies for Spatial Joins
Spatio-Temporal Databases
Updating SF-Tree Speaker: Ho Wai Shing.
UNIT 11 Query Optimization
Database Management System
Instance Based Learning
GC 211:Data Structures Algorithm Analysis Tools
Spatial Indexing I Point Access Methods.
Sameh Shohdy, Yu Su, and Gagan Agrawal
Chapter 15 QUERY EXECUTION.
Evaluation of Relational Operations: Other Operations
Introduction to Spatial Databases
Lecture#12: External Sorting (R&G, Ch13)
Spatio-Temporal Databases
GC 211:Data Structures Algorithm Analysis Tools
Junqi Zhang+ Xiangdong Zhou+ Wei Wang+ Baile Shi+ Jian Pei*
Locality Sensitive Hashing
Distributed Probabilistic Range-Aggregate Query on Uncertain Data
Xu Zhou Kenli Li Yantao Zhou Keqin Li
Lecture 2- Query Processing (continued)
Spatial Indexing I R-trees
Implementation of Relational Operations
CSE 373, Copyright S. Tanimoto, 2001 Asymptotic Analysis -
The Skyline Query in Databases Which Objects are the Most Important?
Efficient Processing of Top-k Spatial Preference Queries
Presentation transcript:

Skyline query with R*-Tree: Branch and Bound Skyline (BBS) Algorithm Tod Knott Jamie Vetrovsky Justin Guting Richard Ghrist

Introduction to topic The paper describes a skyline in the following way: “The skyline of a set of d- dimensional points contains the points that are not dominated by any other point on all dimensions”. Skylines can be computed in a variety of different ways including: Block nested loops Divide and conquer Bitmap and Index Nearest Neighbor

Branch and Bound Skyline (BBS) Algorithm The paper introduces the Branch and Bound Skyline (BBS) algorithm. The BBS algorithm is built upon the shortcomings of nearest neighbor (NN) in order to create a more effective and efficient skyline computation. It is a progressive algorithm based on NN search. It is IO optimal, meaning it performs a single access to R-Tree nodes that may contain a skyline point. Much Smaller overhead than NN. Does not retrieve duplicates skyline points.

Branch and Bound Skyline (BBS) Algorithm Cont. Motivation for the development of BBS stems from improving the NN algorithm. NN has some advantages including: Quick returning of the initial skyline points Applicability to arbitrary data distributions and dimensions. But NN suffers from major drawbacks though: Need for duplicate elimination is d>2 Multiple accesses to the same node Large overhead.

Details of BBS Design Based on an R tree, but can use any multidimensional access methods. At a basic level, BBS will pick a known skyline point, and compare to an MBR. Point A dominates MBR E, thus we can remove all points in MBR E as potential candidates for skyline points.

Design of BBS

Our implementation Hardware used: Windows 7 Professional Server Software used: Apache Web Server HTML/Javascript D3.js Ruby on Rails

Problems we had Finding an R-Tree implementation in Ruby to use. Writing said R-Tree since there is no good implementation in Ruby to use. R-tree becomes inefficient in higher dimensional space. Visualizing points in higher dimensional space becomes difficult

Future work/updates for this Multidimensions Runtime Visualizations in k dimensions Past 3d would be confusing to humans in our 3d world Making current algorithm better Interactive visualization

Questions?