Bin Cui, Hua Lu, Quanqing Xu, Lijiang Chen, Yafei Dai, Yongluan Zhou ICDE 08 Parallel Distributed Processing of Constrained Skyline Queries by Filtering.

Presentation transcript:

Bin Cui, Hua Lu, Quanqing Xu, Lijiang Chen, Yafei Dai, Yongluan Zhou ICDE 08 Parallel Distributed Processing of Constrained Skyline Queries by Filtering 1

Outline: Introduction; Problem Definition; Parallel Distributed Skyline Processing; Experimental Study; Conclusion 2

Introduction A distributed computing environment consists of multiple computers (sites). The query originator Sorg communicates directly with every other site, and all sites can compute at the same time (in parallel). For instance, stock information databases are available at different places, such as the New York Stock Exchange, London Stock Exchange, and Tokyo Stock Exchange. For each stock, an agent needs to take multiple attributes into consideration, so a skyline query against those distributed databases helps the agent find the interesting stocks. 3

Problem Definition Sorg communicates directly with every other site Si. Example: D = {p(2,6), q(2,4), r(3,3)}. q dominates p, while q and r are not dominated by any point, so the skyline of D is {q, r}. 4
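The dominance test behind this example can be sketched in a few lines of Python. This is only an illustration, assuming that smaller values are better in every dimension (which is what makes q dominate p on the slide); the function names `dominates` and `skyline` are chosen here and do not come from the slides.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """a dominates b: no worse in every dimension, strictly better in one.

    Assumption: smaller values are better in all dimensions.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Naive quadratic skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# The slide's dataset D: p(2,6), q(2,4), r(3,3)
D = {"p": (2, 6), "q": (2, 4), "r": (3, 3)}
result = skyline(list(D.values()))
print(result)  # [(2, 4), (3, 3)], i.e. {q, r}
```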

Parallel Distributed Skyline Processing Three steps: (1) compute local skylines and rMBBs in parallel; (2) parallel distributed query execution; (3) merge. 5

(Cont.) Computing local skylines and rMBBs in parallel. Green block: MBR. Skyline: {(1,4),(3,3),(5,2)} 6

(Cont.) Blue block: skyline and rMBB (reduced MBB). The rMBB encloses only the local skyline points {(1,4),(3,3),(5,2)}. 7
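As a small illustration, the rMBB of a local skyline can be computed by taking the per-dimension minimum and maximum over the skyline points only. The helper name `rmbb` is hypothetical; the input is the local skyline from the slide.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def rmbb(local_skyline: List[Point]) -> Tuple[Point, Point]:
    """Return (rMBB.min, rMBB.max): the tightest box around the local skyline."""
    dims = range(len(local_skyline[0]))
    lower = tuple(min(p[d] for p in local_skyline) for d in dims)
    upper = tuple(max(p[d] for p in local_skyline) for d in dims)
    return lower, upper


box = rmbb([(1, 4), (3, 3), (5, 2)])
print(box)  # ((1, 2), (5, 4))
```

Because non-skyline points are ignored, this box is never larger than the MBR of the full local dataset.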

(Cont.) Parallel Distributed Query Execution Each site has an rMBB and a local skyline set. The rMBB is represented by two points: its lower-left corner rMBB.min and its upper-right corner rMBB.max. 8

(Cont.) [Figure: rMBB1 and rMBB2 with their lower-left corners rMBB1.min and rMBB2.min and the regions rMBB1.min.DR and rMBB2.min.DR] 9

(Cont.) Execution plan: partitioning. Grouping is based on which rMBBs are incomparable. Partitioned into: {{A},{B,C,D,E},{F,G}} 10

(Cont.) Though B and D are incomparable with each other, they are assigned to the same group as C and E, because each of them is comparable with C (and E). 11

(Cont.) Picking filtering points: 1. MaxDist: maximize the distance between the filtering points, so that their dominating regions overlap little. 2. MaxSum: maximize the filtering points' dominating region, so that the dominating region of each filtering point is large. 3. Random. 12

(Cont.) Assume 2 filtering points. MaxDist: choose (1,5) and (6,2). 13
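A minimal sketch of the MaxDist choice for two filtering points follows. It assumes the local skyline is the four points that appear in the slides' example, (1,5), (2,4), (4,3) and (6,2), and that "distance" means Euclidean distance; the slides do not name the metric, and `max_dist_pair` is a hypothetical helper.

```python
from itertools import combinations
from math import dist
from typing import List, Tuple

Point = Tuple[float, float]


def max_dist_pair(points: List[Point]) -> Tuple[Point, Point]:
    """Pick the two skyline points that are farthest apart (Euclidean, assumed)."""
    return max(combinations(points, 2), key=lambda pair: dist(*pair))


skyline_points = [(1, 5), (2, 4), (4, 3), (6, 2)]
chosen = max_dist_pair(skyline_points)
print(chosen)  # ((1, 5), (6, 2))
```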

(Cont.) Assume 2 filtering points. MaxSum (maximum dominating region): choose (2,4) and (4,3). Dominating region per pair: (1,5),(2,4): 4; (1,5),(4,3): 4; (1,5),(6,2): 0; (2,4),(4,3): 6 (max); (2,4),(6,2): 4; (4,3),(6,2): 4. 14
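The per-pair values on this slide can be reproduced under one reading of "dominating region", which is an assumption rather than something the slides spell out: the area, inside the rMBB, that is dominated by at least one of the two points, measured up to rMBB.max = (6,5). The sketch below uses inclusion-exclusion for two points in two dimensions; the helper names are hypothetical.

```python
from itertools import combinations
from typing import Tuple

Point = Tuple[float, float]


def dr_area(p: Point, upper: Point) -> float:
    """Area of the box between p and rMBB.max, i.e. the part of the rMBB p dominates."""
    return max(upper[0] - p[0], 0) * max(upper[1] - p[1], 0)


def union_dr_area(p: Point, q: Point, upper: Point) -> float:
    """Area dominated by p or q: sum of both areas minus their overlap."""
    overlap_corner = (max(p[0], q[0]), max(p[1], q[1]))
    return dr_area(p, upper) + dr_area(q, upper) - dr_area(overlap_corner, upper)


points = [(1, 5), (2, 4), (4, 3), (6, 2)]
upper = (6, 5)  # rMBB.max of these four points

scores = {pair: union_dr_area(*pair, upper) for pair in combinations(points, 2)}
best = max(scores, key=scores.get)
print(scores[((2, 4), (4, 3))])  # 6
print(best)                      # ((2, 4), (4, 3))
```

Under this reading the pair (2,4),(4,3) scores 4 + 4 - 2 = 6, which matches the maximum shown on the slide.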

(Cont.) 15

(Cont.) Computing local skylines and rMBBs in parallel. 16

(Cont.) 17

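(Cont.) Partitioned into {{A,B},{C,D}}. rMBB A: incomparable with C, D; comparable with B. rMBB B: comparable with A. rMBB C: incomparable with A, B; comparable with D. rMBB D: comparable with C. 18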
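One way to read the grouping rule is as connected components: sites whose rMBBs are comparable, directly or through a chain of other sites, end up in the same group. The sketch below takes the comparable pairs as given input, since the slides do not state the geometric test for comparability; `partition` is a hypothetical helper built on a small union-find.

```python
from typing import Dict, Iterable, List, Tuple


def partition(sites: Iterable[str], comparable_pairs: Iterable[Tuple[str, str]]) -> List[List[str]]:
    """Group sites so that comparable sites (transitively) share a group."""
    parent: Dict[str, str] = {s: s for s in sites}

    def find(s: str) -> str:
        # Follow parent links to the group representative, halving the path.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for a, b in comparable_pairs:
        parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for s in parent:
        groups.setdefault(find(s), []).append(s)
    return sorted(groups.values())


# Comparable pairs from the slide's table: A with B, C with D.
groups = partition(["A", "B", "C", "D"], [("A", "B"), ("C", "D")])
print(groups)  # [['A', 'B'], ['C', 'D']]
```

The transitive grouping also covers the earlier case of two mutually incomparable rMBBs that land in one group because both are comparable with a third.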

(Cont.) Assume 1 filtering point. A picks (2,4) (dominating region: (1,5): 0, (2,4): 2, (4,3): 0). (2,4) is compared with B’s points: (2,4) dominates (2,7) and (5,4). Skyline of partition {A,B}: {(1,5),(2,4),(4,3)} 19
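The filtering step for group {A,B} can be sketched as follows. Two things are assumptions here: that site B's local skyline consists of exactly the two points mentioned, (2,7) and (5,4), and that a point's dominating region is measured up to the site's rMBB.max, which reproduces the slide's values 0, 2 and 0. All function names are hypothetical.

```python
from math import prod
from typing import List, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """Smaller is better (assumed): no worse everywhere, strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points: List[Point]) -> List[Point]:
    return [p for p in points if not any(dominates(q, p) for q in points)]


def pick_filter_point(local_skyline: List[Point]) -> Point:
    """Pick the skyline point with the largest dominating region inside the rMBB."""
    dims = range(len(local_skyline[0]))
    upper = tuple(max(p[d] for p in local_skyline) for d in dims)  # rMBB.max
    return max(local_skyline, key=lambda p: prod(upper[d] - p[d] for d in dims))


site_a = [(1, 5), (2, 4), (4, 3)]
site_b = [(2, 7), (5, 4)]  # assumed local skyline of site B

f = pick_filter_point(site_a)
survivors_b = [p for p in site_b if not dominates(f, p)]  # what B still has to send
group_skyline = skyline(site_a + survivors_b)

print(f)              # (2, 4)
print(survivors_b)    # []
print(group_skyline)  # [(1, 5), (2, 4), (4, 3)]
```

In this example the single filtering point removes both of B's points, so B sends nothing and the group skyline equals A's local skyline.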

(Cont.) Assume 1 filtering point. C picks (6,2). (6,2) is compared with D’s points: (6,2) dominates (8,2). (10,1) is compared with D’s (10,0): (10,1) is dominated by (10,0). Skyline of partition {C,D}: {(6,2),(10,0)} 20

(Cont.) Merge Skyline of{A,B},{C,D}: {(1,5),(2,4),(4,3),(6,2),(10,0)} 21
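A short sketch of the merge step: the group skylines are combined into one result. The slides only say "Merge", so whether a final dominance check is needed is not stated; the hypothetical `merge` helper below performs one defensively, and for this example it removes nothing.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """Smaller is better (assumed)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def merge(*group_skylines: List[Point]) -> List[Point]:
    """Union the group skylines and drop any point dominated across groups."""
    candidates = [p for group in group_skylines for p in group]
    return sorted(p for p in candidates if not any(dominates(q, p) for q in candidates))


final = merge([(1, 5), (2, 4), (4, 3)], [(6, 2), (10, 0)])
print(final)  # [(1, 5), (2, 4), (4, 3), (6, 2), (10, 0)]
```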

Experimental Study Independent Datasets 22

(Cont.) AntiCorrelated Datasets 23

(Cont.) NBA Dataset 24

(Cont.) Performance with Different Numbers of Filtering Points 25

Conclusion Percentage of filtering points: 10% is better. MaxSum is better than MaxDist and Random. 26