Published by Ashlie Mathews. Modified over 9 years ago.
1
Efficient Computation of Combinatorial Skyline Queries
Authors: Yu-Chi Chung, I-Fang Su, and Chiang Lee
Source: Information Systems, 38 (2013), pp. 369-387
Reporter: Yueh-Lin Lin
2
Outline
Introduction
Related Work
Combinatorial Skyline Query Processing
  The Brute-Force Method
  The Decomposition Algorithm (DA)
  The Improved Decomposition Algorithm (IDA)
Performance Evaluation
Conclusions
3
Introduction
The skyline operator has received considerable attention from the database community.
It is important in numerous disciplines:
  data mining, multi-criteria decision making, and market analysis
4
Skyline Example
Mercedes-Benz plans to increase automobile sales and is considering advertising.
TV is the most effective mass-market advertising format.
Each advertising slot has two attributes: advertising cost and audience number.
Benz tries to find the best advertising slot: one with a lower cost and a larger audience.
The slots that meet Benz's needs form a skyline.
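The skyline in this example can be sketched in a few lines of Python. This is only an illustration of the general skyline definition, not code from the paper: the slot names and numbers are made up, and the quadratic scan stands in for faster methods such as BBS.

```python
from typing import List, Tuple

# Hypothetical slot record: (name, cost, audience). Values are invented.
Slot = Tuple[str, float, float]

def dominates(a: Slot, b: Slot) -> bool:
    """a dominates b if it costs no more, reaches no fewer viewers,
    and is strictly better in at least one of the two attributes."""
    no_worse = a[1] <= b[1] and a[2] >= b[2]
    strictly_better = a[1] < b[1] or a[2] > b[2]
    return no_worse and strictly_better

def skyline(slots: List[Slot]) -> List[Slot]:
    """Keep every slot that no other slot dominates (naive O(n^2) scan)."""
    return [s for s in slots if not any(dominates(o, s) for o in slots)]

slots = [("A", 10, 50), ("B", 20, 80), ("C", 25, 70), ("D", 40, 120), ("E", 15, 40)]
result = skyline(slots)
print([s[0] for s in result])
```

Here slot C is dominated by B (cheaper and a larger audience) and E is dominated by A, so only A, B, and D remain.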
5
Skyline Example
6
Motivation
7
Combinations of Two Advertising Slots
8
Combinatorial Skyline Query (CSQ)
9
Observation
10
Challenge
11
Related Work
After the skyline operator was introduced, many algorithms were proposed for skyline query processing:
  BBS, bitmap, etc.
Variations of the skyline:
  subspace skyline, k-dominant skyline, dynamic skyline, etc.
The concept of combination is not mentioned in previous work
  Top-k combinatorial skyline queries (DASFAA 2010)
12
Problem
13
Combinatorial Skyline Query Processing
The Brute-Force Method
14
The Brute-Force Method Example
15
The Brute-Force Method Example
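A minimal sketch of what a brute-force approach looks like: enumerate every combination of k slots, aggregate each one, and run a plain skyline over the results. The aggregation rule (summing cost and audience over the members) and the sample data are assumptions made for illustration; the slides do not show the paper's exact formulation.

```python
from itertools import combinations
from typing import List, Tuple

Slot = Tuple[str, float, float]               # hypothetical: (name, cost, audience)
Combo = Tuple[Tuple[str, ...], float, float]  # (member names, total cost, total audience)

def dominates(a: Combo, b: Combo) -> bool:
    """Lower total cost and higher total audience are better."""
    no_worse = a[1] <= b[1] and a[2] >= b[2]
    strictly_better = a[1] < b[1] or a[2] > b[2]
    return no_worse and strictly_better

def brute_force_csq(slots: List[Slot], k: int) -> List[Combo]:
    # Enumerate all C(n, k) combinations; summing attributes is an assumption.
    combos = [
        (tuple(s[0] for s in g), sum(s[1] for s in g), sum(s[2] for s in g))
        for g in combinations(slots, k)
    ]
    # Naive skyline over the enumerated combinations.
    return [c for c in combos if not any(dominates(o, c) for o in combos)]

slots = [("A", 10, 50), ("B", 20, 80), ("C", 25, 70), ("D", 40, 120)]
result = brute_force_csq(slots, 2)
print([c[0] for c in result])
```

With four slots and k = 2 there are only six combinations, but the count grows as C(n, k), which is why enumerating them all becomes expensive.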
16
The Decomposition Algorithm (DA)
The brute-force method incurs high computation overhead because it enumerates all combinations.
The Decomposition Algorithm finds the combinatorial skyline tuples without enumerating all combinations.
17
DA Example
18
The Improved Decomposition Algorithm (IDA)
19
Enhanced Pruning Example
20
The Improved Decomposition Algorithm Example
21
Performance Evaluation
22
Scalability with respect to Data Size
Query Processing Time
23
Scalability with respect to Data Size
Query Processing Time
24
Comparison on Real Dataset
25
The Real Dataset
Processing Time
Dimensionality
26
The Real Dataset
Processing Time
Cardinality
27
Conclusions
Proposed a new type of query:
  the combinatorial skyline query
Proposed two algorithms:
  DA
  IDA
The experimental results show that IDA outperforms DA on all performance metrics.
28
On Skyline Groups
Authors: Nan Zhang, Chengkai Li, Naeemul Hassan, Sundaresan Rajasekaran, and Gautam Das
Source: IEEE Transactions on Knowledge and Data Engineering, Vol. 26, No. 4, April 2014, pp. 942-956
Reporter: Yueh-Lin Lin
29
Outline
Introduction
Skyline Group Problem
Finding Skyline Groups
Techniques
Algorithms
Experiments
Conclusions
Comments
30
Motivation
31
Challenge
32
Techniques
33
Skyline Group Problem
34
Aggregate Functions
35
Finding Skyline Groups
36
Finding Skyline Groups
37
Techniques
38
Output Compression
The number of skyline groups may be large, and many of them share the same aggregate vector.
Main idea: instead of storing all skyline groups, store
  the distinct skyline aggregate vectors
  one skyline group for each skyline vector
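The idea of output compression can be sketched as a dictionary keyed by aggregate vector. The code below is a hypothetical illustration, not the paper's algorithm: it enumerates groups naively, uses SUM as the aggregate, and assumes larger values are better in every attribute.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Vector = Tuple[int, ...]
Group = Tuple[Vector, ...]

def aggregate(group: Group) -> Vector:
    """SUM aggregate: add the members attribute by attribute."""
    return tuple(sum(col) for col in zip(*group))

def dominates(u: Vector, v: Vector) -> bool:
    """Assumes larger is better in every attribute."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))

def compressed_skyline_groups(tuples: List[Vector], k: int) -> Dict[Vector, Group]:
    # Keep only the first group seen for each distinct aggregate vector.
    representative: Dict[Vector, Group] = {}
    for g in combinations(tuples, k):
        representative.setdefault(aggregate(g), g)
    vectors = list(representative)
    # Return the non-dominated vectors, each with its one representative group.
    return {v: representative[v] for v in vectors
            if not any(dominates(u, v) for u in vectors)}

tuples = [(4, 0), (0, 4), (3, 1), (1, 3), (0, 0)]
result = compressed_skyline_groups(tuples, 2)
print(len(result), result[(4, 4)])
```

In this toy input the groups {(4, 0), (0, 4)} and {(3, 1), (1, 3)} both aggregate to (4, 4), so six skyline groups are reported as five vectors with one representative group each.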
39
Input Pruning
40
Search Space Pruning: Anti-Monotonicity
Find and leverage two anti-monotonic properties for skyline search, by analogy to the Apriori algorithm:
  Order-Specific Anti-Monotonic Property (OSM): SUM, MIN, and MAX
  Weak Candidate-Generation Property (WCM): MIN and MAX
The challenge is to find anti-monotonic properties that hold for skyline search.
The main contribution is not about proving the properties, but rather about finding the right ones that can effectively prune the search space.
41
Algorithm
Dynamic Programming Algorithm Based on Order-Specific Property
Iterative Algorithm Based on Weak Candidate-Generation Property
42
Dynamic Programming Algorithm Based on Order-Specific Property
43
Dynamic Programming Algorithm Based on Order-Specific Property
44
Experiments
The algorithms are implemented in C++
Environment:
  Dell PowerEdge 2900 III server
  Linux kernel 2.6.27-7
  Dual Quad-Core Xeon 2.0 GHz
  8 GB RAM
  250 GB HDD in RAID5
45
Datasets
NBA players (2009 season): 512 tuples (players), 5 attributes
Stocks (2009/12/31): 35000 tuples (stocks), 4 attributes
Synthetic data: 1-10 million tuples, 5 attributes
46
Aggregate Functions & Methods Compared
Aggregate functions: SUM, MIN, and MAX
Two algorithms compared with the baseline method:
  Order-Specific Property (OSM)
  Weak Candidate-Generation Property (WCM)
47
Comparison of Various Methods: SUM
48
Effect of Input Pruning
49
Conclusions
The novel problem of computing skyline groups
The novel algorithmic techniques:
  output compression
  input pruning
  search space pruning
Experiments on real and synthetic data sets evaluate the proposed algorithms.
50
Comments
Group skyline with constraints: NBA teams have salary limits
Parallel computing: MapReduce
51
Q&A