Presented by: Dardan Xhymshiti, Fall 2015. Type: Research paper, International Conference on Very Large Data Bases.

1 Presented by: Dardan Xhymshiti Fall 2015

2
• Type: Research paper
• International Conference on Very Large Data Bases
• Authors:
  - Yoonjae Park, Seoul National University, Seoul, Korea (yjpark@kdd.snu.ac.kr)
  - Jun-Ki Min, Korea Univ. of Tech. & Edu., Cheonan, Korea (jkmin@kut.ac.kr)
  - Kyuseok Shim, Seoul National University, Seoul, Korea (shim@kdd.snu.ac.kr)

3
• MapReduce: a programming model for processing and generating large data sets with a parallel, distributed algorithm on a cluster.
• Skyline operator: used in a query to filter the results from a database so that only the objects not dominated by any other object are kept. An object dominates another if it is at least as good in every dimension and strictly better in at least one.
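To make the two definitions concrete, here is a minimal Python sketch of a conventional (non-probabilistic) skyline computed in a map/reduce style. It is not the algorithm from the paper: the function names, the round-robin partitioning, and the hotel data are all assumptions made for illustration, and smaller values are assumed to be better in every dimension. The sketch relies on the fact that every global skyline point is also a skyline point of its own partition, so merging the local skylines and filtering once more gives the global result.

```python
from collections import defaultdict


def dominates(p, q):
    """True if p dominates q: no worse in every dimension, strictly better in one.

    Assumption for this sketch: smaller values are better.
    """
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points):
    """Naive quadratic skyline: keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def map_phase(points, num_partitions):
    """Map step: emit (partition key, point) pairs, here by simple round-robin."""
    for i, p in enumerate(points):
        yield i % num_partitions, p


def reduce_phase(pairs):
    """Reduce step: group the points by key and compute one local skyline per key."""
    groups = defaultdict(list)
    for key, p in pairs:
        groups[key].append(p)
    return {key: skyline(pts) for key, pts in groups.items()}


def mapreduce_skyline(points, num_partitions=2):
    """Merge the local skylines and filter once more to get the global skyline."""
    local = reduce_phase(map_phase(points, num_partitions))
    candidates = [p for pts in local.values() for p in pts]
    return skyline(candidates)


# Hypothetical data: (price, distance to the beach), both to be minimized.
hotels = [(50, 8), (60, 3), (80, 1), (70, 5), (90, 9)]
print(sorted(mapreduce_skyline(hotels)))
```

In this example (70, 5) is dominated by (60, 3) and (90, 9) by (50, 8), so neither appears in the skyline. In a real cluster the map and reduce steps would run on different machines; here they are ordinary functions called in sequence.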

4
• Applications that produce large volumes of uncertain data:
  - Social networks
  - Data integration
  - Sensor data management
• Sources of data uncertainty:
  - Data randomness
  - Data incompleteness
  - Limitations of measuring equipment

5
• Big uncertain data calls for advanced analysis queries such as the skyline.

6

7

8

9
• The skyline probability of an instance is the probability that it appears in a possible world and is not dominated by any instance of the other objects in that possible world.
• The skyline probability of an object is the sum of the skyline probabilities of all its instances.
• Similarly, for the continuous model, the skyline probability of an object is defined using its uncertainty region and pdf.
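The discrete-model definitions above can be illustrated with a small Python sketch. The slide text does not give a formula, so the closed form used here is an assumption: objects are taken to be independent, the instances of one object are mutually exclusive, and the skyline probability of an instance is its own probability multiplied, over every other object, by one minus the total probability of that object's instances that dominate it. The data layout, the function names, and the sample objects are hypothetical, and smaller values are assumed to be better.

```python
def dominates(p, q):
    """True if p dominates q (assumption: smaller is better in every dimension)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def instance_skyline_probability(obj_name, instance, objects):
    """P(instance exists) times P(no other object realizes a dominating instance)."""
    point, prob = instance
    result = prob
    for other_name, other_instances in objects.items():
        if other_name == obj_name:
            continue  # instances of the same object never coexist in one world
        dominating_mass = sum(pr for pt, pr in other_instances if dominates(pt, point))
        result *= 1.0 - dominating_mass
    return result


def object_skyline_probability(obj_name, objects):
    """Sum of the skyline probabilities of all instances of the object."""
    return sum(
        instance_skyline_probability(obj_name, inst, objects)
        for inst in objects[obj_name]
    )


# Hypothetical uncertain objects: name -> list of (point, existence probability).
objects = {
    "A": [((1, 4), 0.5), ((3, 3), 0.5)],
    "B": [((2, 2), 0.5), ((5, 5), 0.5)],
}

for name in objects:
    print(name, object_skyline_probability(name, objects))
```

For object A, instance (1, 4) is never dominated and contributes 0.5, while (3, 3) is dominated by B's instance (2, 2) with probability 0.5 and contributes 0.25, giving 0.75 in total. For object B, instance (5, 5) is dominated by every instance of A, so only (2, 2) contributes and the total is 0.5.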

10

11

12
• Proposes parallel algorithms using MapReduce to process probabilistic skyline queries over uncertain data modeled by both discrete and continuous models.
• Three filtering methods to identify probabilistic non-skyline objects in advance.
• A single-phase MapReduce algorithm, PS-QP-MR.
• An enhanced algorithm, PS-QPF-MR, which additionally applies the three filtering methods.
• Brute-force algorithms PS-BR-MR and PS-BRF-MR, which partition the data randomly and apply the filtering methods.

13
• Several algorithms have been proposed for skyline queries:
  - Nearest Neighbor (NN) (Kossmann).
  - Papadias improved on the NN algorithm by using a branch-and-bound strategy.
• Techniques have been proposed for processing queries over uncertain data, such as probabilistic top-k.
• Serial algorithms for probabilistic skyline processing over uncertain data have been introduced.

