Creating Competitive Products Qian Wan [1], Raymond Chi-Wing Wong [1], Ihab F. Ilyas [2], M. Tamer Ozsu [2], Yu Peng [1] [1] Hong Kong University of Science and Technology [2] University of Waterloo

Presentation transcript:

Creating Competitive Products
Qian Wan [1], Raymond Chi-Wing Wong [1], Ihab F. Ilyas [2], M. Tamer Özsu [2], Yu Peng [1]
[1] Hong Kong University of Science and Technology
[2] University of Waterloo
Presented by Qian Wan
Prepared by Qian Wan

Outline
Background – Skyline, Related Work
Motivation – Example, Problem Definition
Algorithm – Framework, Grouping, Pruning
Experiments – Synthetic, Real data – 6 factors, 4 measurements
Conclusions
Creating Competitive Products | VLDB '09

Skyline
Definition – the skyline contains the points that are not dominated by any other point
Hotel searching problem – Distance to beach vs. Price
  – Dominance
  – Skyline
[Figures: hotels H1 to H9 plotted on Dist (distance to beach) and Price axes; a second plot shows H1 and H2]
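
The dominance test and the skyline definition on this slide can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: it assumes that smaller values are better on every attribute, and the hotel coordinates are made-up numbers, since the plotted values are not part of the transcript.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b: no worse on every attribute, strictly better on at least one.

    Assumption: smaller is better on every attribute.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points: List[Point]) -> List[Point]:
    """Quadratic-time skyline: keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical hotels as (distance_to_beach, price); the values are invented.
hotels = [(1.0, 200.0), (2.0, 150.0), (3.0, 160.0), (4.0, 100.0), (5.0, 120.0)]
hotel_skyline = skyline(hotels)
print(hotel_skyline)
```

The nested scan is the simplest possible formulation; the algorithms cited on the Related Work slide exist precisely because this quadratic approach does not scale.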

Related Work
Skyline Queries in DBMS [S. Börzsönyi, 2001]
Single-Table Skyline Queries
  – Bitmaps [K.L. Tan, 2001], Nearest Neighbor [D. Kossmann, 2002], Branch and Bound Skylines [D. Papadias, 2005]
Multi-Table Skyline Queries
  – Natural Join [W. Jin, 2007] [D. Sun, 2008]
  – Our Work: joins different source tables via a "Cartesian product"-like procedure

A Travel Agency's Database
Source Tables
  Hotel | Distance-to-beach | Hotel-class | Hotel-cost
  H1 | 100 | 3 | …
  … | … | … | …
  Flight | No-of-stops | Flight-cost
  F1 | 0 | 120
  … | … | …
Existing Vacation Packages
  Package | No-of-stops | Distance-to-beach | Hotel-class | Price
  P… | … | … | … | …
Newly Created Vacation Packages
  Package | No-of-stops | Distance-to-beach | Hotel-class | Price
  Q1 (F1, H1) | … | … | … | …
  Q2 (F1, H2) | … | … | … | …
  Q3 (F1, H3) | … | … | … | …
  …
  Q24 (F4, H6) | … | … | … | …
1. Direct attributes
2. Indirect attributes
3. One indirect attribute characteristic, e.g. Travel Agency (Price), PC Manufacturer (Price)
Skyline tuples

Finding Competitive Products
Given
  – a set of source tables
  – market packages T_E
  – new packages T_Q
a tuple q in T_Q is said to be a competitive product if q is in the skyline with respect to the market packages and the new packages taken together.

Naïve Solution
Source Tables
  Hotel | Distance-to-beach | Hotel-class | Hotel-cost
  H1 | 100 | 3 | …
  H2 | … | … | …
  H3 | … | … | …
  H4 | 150 | 2 | …
  H5 | … | … | …
  H6 | … | … | …
  Flight | No-of-stops | Flight-cost
  F1 | 0 | 120
  F2 | 1 | 100
  F3 | 2 | 80
  F4 | 2 | 90
Existing Vacation Packages
  Package | No-of-stops | Distance-to-beach | Hotel-class | Price
  P… | … | … | … | …
Newly Created Vacation Packages
  Package | No-of-stops | Distance-to-beach | Hotel-class | Price
  Q1 (F1, H1), Q2 (F1, H2), Q3 (F1, H3), …, Q7 (F2, H1), …, Q13 (F3, H1), …, Q24 (F4, H6)
1. Intra-dominance checking
2. Inter-dominance checking
Competitive Products
  Package | No-of-stops | Distance-to-beach | Hotel-class | Price
  Q1 (F1, H1), Q2 (F1, H2), Q3 (F1, H3), …, Q7 (F2, H1), …, Q13 (F3, H1)
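
The naïve solution on this slide (build every flight and hotel combination, then run intra- and inter-dominance checks) could look roughly like the sketch below. Several things here are assumptions rather than facts from the slides: the flight rows follow the slide, but the hotel costs and the existing package are invented; only two hotels are used to keep the example small; hotel class is negated so that smaller is better on every attribute; and the function name `naive_competitive` is hypothetical.

```python
from itertools import product


def dominates(a, b):
    """a dominates b when smaller is better on every attribute."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def naive_competitive(flights, hotels, existing):
    """Enumerate all combinations, then keep those nothing else dominates."""
    candidates = {}
    for (f_name, (stops, f_cost)), (h_name, (dist, h_class, h_cost)) in product(
        flights.items(), hotels.items()
    ):
        # Price is the single indirect attribute: the sum of the two costs.
        # Hotel class is negated so that a higher class counts as a smaller value.
        candidates[(f_name, h_name)] = (stops, dist, -h_class, f_cost + h_cost)
    everything = list(candidates.values()) + list(existing)
    # One scan covers both checks: other new packages (intra) and market packages (inter).
    return {k: v for k, v in candidates.items()
            if not any(dominates(o, v) for o in everything)}


# Flights as (no_of_stops, flight_cost), taken from the slide.
flights = {"F1": (0, 120), "F2": (1, 100), "F3": (2, 80), "F4": (2, 90)}
# Hotels as (distance_to_beach, hotel_class, hotel_cost); the costs are hypothetical.
hotels = {"H1": (100, 3, 200), "H4": (150, 2, 150)}
# One hypothetical existing package: (stops, distance, -class, price).
existing = [(1, 100, -3, 290)]

competitive = naive_competitive(flights, hotels, existing)
print(sorted(competitive))
```

The cost of this approach is the full Cartesian product, which is what the later pruning steps try to avoid materializing.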

Algorithm Overview
Intra-dominance checking
  – To find the skyline in the source tables
Inter-dominance checking
  – Skyline in existing market packages
  – R*-tree indexes on existing market packages
  – Full pruning
  – Partial pruning
Post-processing

Intra-dominance Checking
Source Tables
  Hotel | Distance-to-beach | Hotel-class | Hotel-cost
  H1 | 100 | 3 | …
  H2 | … | … | …
  H3 | … | … | …
  H4 | 150 | 2 | …
  H5 | … | … | …
  H6 | … | … | …
  Flight | No-of-stops | Flight-cost
  F1 | 0 | 120
  F2 | 1 | 100
  F3 | 2 | 80
  F4 | 2 | 90
Skyline Tuples of Source Tables
  Hotel: H1 (100, 3, …), three further hotels (values …), H4 (150, 2, …)
  Flight: F1 (0, 120), F2 (1, 100), F3 (2, 80)
Newly Created Vacation Packages (conceptual)
  Q1 (F1, H1), Q2 (F1, H2), Q3 (F1, H3), …, Q7 (F2, H1), …, Q13 (F3, H1), …, Q15 (F3, H5)
1. No intra-dominance checking (one indirect attribute)
2. No competitive products are missed
Competitive Products (conceptual)
  Q1 (F1, H1), Q2 (F1, H2), Q3 (F1, H3), …, Q7 (F2, H1), …, Q13 (F3, H1)
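
A small sketch of the idea on this slide: take the skyline of each source table first, then combine only the surviving tuples. The flight rows follow the slide, where F4 (2 stops, cost 90) is dominated by F3 (2 stops, cost 80). The hotel costs and the extra hotel H5 are hypothetical, hotel class is negated so that smaller is better, and the helper names are illustrative only.

```python
from itertools import product


def dominates(a, b):
    """a dominates b when smaller is better on every attribute."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def table_skyline(table):
    """Skyline of one source table, over its direct attributes and its cost."""
    return {name: row for name, row in table.items()
            if not any(dominates(other, row) for other in table.values())}


def combine(flight, hotel):
    """Build a package tuple; price is the one indirect attribute (sum of costs)."""
    stops, f_cost = flight
    dist, neg_class, h_cost = hotel
    return (stops, dist, neg_class, f_cost + h_cost)


# Flights as (no_of_stops, flight_cost), taken from the slide.
flights = {"F1": (0, 120), "F2": (1, 100), "F3": (2, 80), "F4": (2, 90)}
# Hotels as (distance_to_beach, -hotel_class, hotel_cost); costs and H5 are hypothetical.
hotels = {"H1": (100, -3, 200), "H4": (150, -2, 150), "H5": (150, -2, 180)}

sky_flights = table_skyline(flights)
sky_hotels = table_skyline(hotels)

candidates = {
    (f_name, h_name): combine(f, h)
    for (f_name, f), (h_name, h) in product(sky_flights.items(), sky_hotels.items())
}
print(sorted(sky_flights), sorted(sky_hotels), len(candidates))
```

On this sample, no combination built from skyline tuples dominates another, which is consistent with the slide's note that intra-dominance checking is unnecessary when there is a single indirect attribute: a package that is no worse on all direct attributes must be built from strictly more expensive skyline tuples, so its summed price is higher.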

Inter-dominance Checking
Existing Vacation Packages → Skyline in Existing Vacation Packages
  Package | No-of-stops | Distance-to-beach | Hotel-class | Price
  P… | … | … | … | …
  – No competitive products are missed
  – An R*-tree speeds up the inter-dominance checking
  – Inter-dominance checking → range query on a spatial index
[Figure: R*-tree with nodes R0 to R5]
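
The slide's point that inter-dominance checking reduces to a range query can be illustrated as follows. A new package q is dominated exactly when some market package lies in the box of points that are no worse than q on every attribute (and differs from q). In this sketch a linear scan stands in for the R*-tree range query, which is a simplification and not the indexed lookup the slides describe. All tuple values are invented, and smaller is assumed to be better on every attribute.

```python
def dominates(a, b):
    """a dominates b when smaller is better on every attribute."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points):
    return [p for p in points if not any(dominates(o, p) for o in points)]


def in_dominance_box(e, q):
    """True when e is no worse than q on every attribute (e lies in q's query box)."""
    return all(x <= y for x, y in zip(e, q))


def dominated_by_market(q, market):
    # Linear scan standing in for a range query on a spatial index.
    return any(in_dominance_box(e, q) and e != q for e in market)


# Hypothetical market packages: (stops, distance, -hotel_class, price).
market = [(1, 100, -3, 290), (1, 100, -3, 310), (2, 120, -2, 400), (0, 200, -1, 150)]
market_sky = skyline(market)

q_bad = (1, 100, -3, 300)   # beaten by the first market package
q_good = (0, 100, -3, 320)  # non-stop and close to the beach, so nothing beats it

print(market_sky)
print(dominated_by_market(q_bad, market_sky), dominated_by_market(q_good, market_sky))
```

Checking against the market skyline gives the same answers as checking against the full market, because anything dominated by a non-skyline package is also dominated by the skyline package above it.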

Full Pruning
[Figure: the skyline tuples of the source tables (hotels H1 to H5, with H1 = (100, 3, …) and H4 = (150, 2, …); flights F1 = (0, 120), F2 = (1, 100), F3 = (2, 80)) are partitioned into groups A1, A2 and B1, B2. Combinations of groups such as C1 = {A1, B1} and C4 = {A2, B2} cover the newly created vacation packages (conceptual) Q1 (F1, H1) to Q15 (F3, H5). These are compared with the existing vacation packages, and full pruning removes whole groups, leaving the competitive products Q1 (F1, H1), Q2 (F1, H2), Q3 (F1, H3), …, Q7 (F2, H1), …, Q13 (F3, H1).]

Full Pruning
Existing Vacation Packages
  Package | No-of-stops | Distance-to-beach | Hotel-class | Price
  P… | … | … | … | …
Groups and their best representatives
  Groups: C1, C2, …, Ci, …, Cj, …, Ck
  Best Representative: B1, B2, …, Bi, …, Bj, …, Bk
Example group members
  Package | No-of-stops | Distance-to-beach | Hotel-class | Price
  Q (F2, H4) | … | … | … | …
  Q' (F2, H5) | … | … | … | …
  Min (best representative) | … | … | … | …
Quality of the best representative (tightness of each group): clustering, e.g. k-means
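
One way to read the "best representative" on these slides is as a virtual tuple holding the component-wise minimum of a group; if a market package dominates the combined representative, every real combination in the group is dominated too and the whole group can be dropped. The sketch below follows that reading, which is an assumption and not confirmed by the transcript. The groups are given by hand rather than produced by clustering, all values are illustrative, and smaller is better on every attribute (hotel class negated).

```python
def dominates(a, b):
    """a dominates b when smaller is better on every attribute."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def best_representative(group):
    """Virtual tuple: component-wise minimum, so it is no worse than any member."""
    return tuple(min(column) for column in zip(*group))


def combine(flight, hotel):
    stops, f_cost = flight
    dist, neg_class, h_cost = hotel
    return (stops, dist, neg_class, f_cost + h_cost)


def fully_pruned(flight_group, hotel_group, market):
    """True when one market package dominates the group's combined representative."""
    rep = combine(best_representative(flight_group), best_representative(hotel_group))
    return any(dominates(e, rep) for e in market)


# One group per source table (hypothetical grouping and hotel costs).
flight_group = [(1, 100), (2, 80)]               # (stops, cost)
hotel_group = [(100, -3, 200), (150, -2, 150)]   # (distance, -class, cost)

market_a = [(1, 100, -3, 220)]  # beats the representative (1, 100, -3, 230)
market_b = [(1, 100, -3, 290)]  # too expensive to beat the representative

print(fully_pruned(flight_group, hotel_group, market_a))
print(fully_pruned(flight_group, hotel_group, market_b))
```

The tighter a group is, the closer its representative sits to the real members and the more often this test succeeds, which matches the slide's remark about tightness and clustering.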

Partial Pruning
Full pruning prunes all members in the group
Partial pruning prunes some members in the group
  – Direct attributes do not change
  – Estimate the best possible value for indirect attributes
  – Use tuples in T_E' to conduct a range query in each source table
Eliminate dominated combinations if
  – they are dominated on all direct attributes
  – they are dominated on all indirect attributes according to their best estimation
Partial pruning is used when full pruning cannot be applied

Partial Pruning
[Figure: same layout as the full pruning example. The skyline tuples of the source tables (hotels H1 to H5; flights F1 = (0, 120), F2 = (1, 100), F3 = (2, 80)) form groups A1 and B1 with combination C1 = {A1, B1}. The newly created vacation packages Q1 (F1, H1) to Q15 (F3, H5) are compared with the existing vacation packages; the case shown is one that full pruning cannot eliminate, and the competitive products are Q1 (F1, H1), Q2 (F1, H2), Q3 (F1, H3), …, Q7 (F2, H1), …, Q13 (F3, H1).]

Meta Transformation
Existing Vacation Packages
  Package | No-of-stops | Distance-to-beach | Hotel-class | Price
  P… | … | … | … | …
Projection of one package onto each source table
  Package | No-of-stops | Price
  P2 | 1 | 170
  Package | Distance-to-beach | Hotel-class | Price
  P… | … | … | …
Source-table groups A1 and B1, each with a Min row (extracted as "Min1100")
  Hotel | Distance-to-beach | Hotel-class | Hotel-cost
  H1 | 100 | 3 | …
  H… | … | … | …
  Flight | No-of-stops | Flight-cost
  F1 | 0 | 120
  F2 | 1 | 100
Meta-Hotel
  Hotel | Distance-to-beach | Hotel-class | Hotel-cost
  H… | … | … | …
Meta-Flight
  Flight | No-of-stops | Flight-cost
  F1 | 0 | 200
  F2 | 1 | 180
No inter-dominance checking for {F2} × {H2}
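
The partial pruning rule and the meta transformation can be sketched together. On the slide, the meta-flight costs (200 and 180) are consistent with adding a minimum hotel cost of 80 to the original flight costs (120 and 100); this sketch adopts that reading as an assumption. A source tuple is marked when a market package is no worse on its direct attributes and cheaper than its best possible price, and every pair of marked tuples is eliminated without a further inter-dominance check. The hotel rows, the full market package and the strict price comparison (chosen so that ties are never pruned) are all assumptions of this sketch.

```python
from itertools import product


def dominates(a, b):
    """a dominates b when smaller is better on every attribute."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def combine(flight, hotel):
    stops, f_cost = flight
    dist, neg_class, h_cost = hotel
    return (stops, dist, neg_class, f_cost + h_cost)


def partial_prune(e, flights, hotels):
    """Return the (flight, hotel) name pairs that market package e eliminates."""
    e_stops, e_dist, e_neg_class, e_price = e
    min_hotel_cost = min(h[2] for h in hotels.values())
    min_flight_cost = min(f[1] for f in flights.values())
    # Meta-flight: direct attribute unchanged, cost replaced by the best possible price.
    hit_flights = {n for n, (stops, cost) in flights.items()
                   if e_stops <= stops and e_price < cost + min_hotel_cost}
    # Meta-hotel: same idea, using the cheapest flight.
    hit_hotels = {n for n, (dist, neg_class, cost) in hotels.items()
                  if e_dist <= dist and e_neg_class <= neg_class
                  and e_price < min_flight_cost + cost}
    return set(product(hit_flights, hit_hotels))


flights = {"F1": (0, 120), "F2": (1, 100)}  # (stops, cost), from the slide
hotels = {"H1": (100, -3, 120), "H2": (150, -2, 80), "H3": (50, -4, 200)}  # hypothetical

# Hypothetical market package with 1 stop and price 170, like P2 on the slide.
e = (1, 120, -3, 170)
pruned = partial_prune(e, flights, hotels)
print(pruned)
```

With these sample values the only eliminated pair is F2 with H2, mirroring the slide's "{F2} × {H2}" note; the pruning is sound because the real price of a pair is never below either best-case estimate.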

Post-processing
More than one indirect attribute
  – Calculation
Previous algorithm → intra-dominance checking
  – Any existing skyline algorithm
  – Post-processing cost depends on the number of competitive products

Experiments
Pentium IV 2.4GHz PC with 4GB memory, Linux platform, C++
Synthetic anti-correlated datasets
Real datasets, Travel Agency A and Travel Agency B
  – A: 296 packages, 1014 hotels and 4394 flights
  – B: 149 packages, 995 hotels and 866 flights
Implementation
  – Algorithm for Creating Competitive Products (ACCP)
  – Baseline algorithm
  – Naïve algorithm
Feature comparison (columns: Skyline in tables | R*-tree | Full & Partial Pruning)
  ACCP | Yes
  Baseline | Yes … No
  Naïve | No

Synthetic Datasets
  Parameter | Default value
  No. of attributes in each source table | 4
  No. of indirect attributes in a product table | 1
  No. of source tables | 2
  No. of clusters in each source table | 2
  Size of existing packages | 5M
  Size of each source table | 100k
Schema is similar to our example
Anti-correlated
6 factors
Measurements
  – Execution time
  – Pruning power
  – Ratio of competitive products out of all combinations
  – Memory usage

Experiments
From 100k to 500k
  – Full pruning & partial pruning
  – Pruning power slightly increases
[Plots: series labelled T_Q, T_Q', T_R and SKY]
  Parameter | Default value
  No. of attributes in each source table | 4
  No. of indirect attributes in a product table | 1
  No. of source tables | 2
  No. of clusters in each source table | 6
  Size of existing packages | 5M
  Size of each source table | 100k

Experiments
From 2.5M to 10M
  – More competitive
  – Slightly decreases
  Parameter | Default value
  No. of attributes in each source table | 4
  No. of indirect attributes in a product table | 1
  No. of source tables | 2
  No. of clusters in each source table | 6
  Size of existing packages | 5M
  Size of each source table | 100k

Experiments
Travel Agency A, Package Generation Set
1. A: 296 packages, 1014 hotels and 4394 flights. B: 149 packages, 995 hotels and 866 flights
2. Source tables from B, and packages from A
3. Vary discount from 0 to …
4. Efficiency: ACCP (44.74s) and Baseline (84.47s)
5. |SKY|/|T_Q|
6. |DOM|/|T_E|
[Plots: series labelled DOM and SKY]

Conclusions
Creating Competitive Products
  – Example
  – Problem Definition
Algorithms
  – Framework
  – Intra-dominance checking
  – Inter-dominance checking
  – Post-processing
Experiments
  – Synthetic anti-correlated datasets
  – Real datasets

Thank you! Q&A