
1 Creating Competitive Products
Qian Wan [1], Raymond Chi-Wing Wong [1], Ihab F. Ilyas [2], M. Tamer Ozsu [2], Yu Peng [1]
[1] Hong Kong University of Science and Technology
[2] University of Waterloo
Presented by Qian Wan
Prepared by Qian Wan

2 Outline
Background – Skyline, Related Work
Motivation – Example, Problem Definition
Algorithm – Framework, Grouping, Pruning
Experiments – Synthetic, Real data – 6 factors, 4 measurements
Conclusions
Creating Competitive Products | VLDB '09

3 Skyline
Definition – Skyline contains the points which are not dominated by others
Hotel searching problem – Distance to beach vs. Price – Dominance – Skyline
[Figure: hotels H1 to H9 plotted by distance to beach (Dist) against Price; a second panel shows only H1 and H2]
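The dominance test and the skyline from this slide can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes smaller values are better on every attribute, and the hotel names and (distance, price) values below are hypothetical, since the figure's coordinates are not in the transcript.

```python
from typing import Dict, Sequence

Point = Sequence[float]


def dominates(a: Point, b: Point) -> bool:
    """a dominates b: no worse on every attribute, strictly better on at least one.

    Assumption: smaller is better on every attribute.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points: Dict[str, Point]) -> Dict[str, Point]:
    """Keep the points that no other point dominates (quadratic scan)."""
    return {
        name: p
        for name, p in points.items()
        if not any(dominates(q, p) for q in points.values())
    }


# Hypothetical hotels as (distance to beach, price).
hotels = {"A": (100, 300), "B": (200, 200), "C": (250, 250), "D": (400, 100)}

print(sorted(skyline(hotels)))  # ['A', 'B', 'D']  (C is dominated by B)
```

The quadratic scan is only for clarity; the related-work slide lists faster single-table skyline algorithms.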

4 Related Work
Skyline Queries in DBMS [S. Börzsönyi, 2001]
Single-Table Skyline Queries – Bitmaps [K.L. Tan, 2001], Nearest Neighbor [D. Kossmann, 2002], Branch and Bound Skylines [D. Papadias, 2005]
Multi-Table Skyline Queries – Natural Join [W. Jin, 2007] [D. Sun, 2008] – Our Work: join different source tables via a Cartesian-product-like procedure


6 A Travel Agency's Database
Source Tables
Hotel (Hotel: Distance-to-beach, Hotel-class, Hotel-cost): H1: 100, 3, [missing]; H2: 200, 2, 90; …
Flight (Flight: No-of-stops, Flight-cost): F1: 0, 120; …
Existing Vacation Packages (Package: No-of-stops, Distance-to-beach, Hotel-class, Price): P1: 0, 130, 2, 250; P2: 1, 140, 2, 170; P3: 1, 300, 1, 150; P4: 1, 150, 4, 300
Newly Created Vacation Packages (Package: No-of-stops, Distance-to-beach, Hotel-class, Price): Q1 (F1, H1): 0, 100, 3, 220; Q2 (F1, H2): 0, 200, 2, 210; Q3 (F1, H3): 0, 400, 1, 200; …; Q24 (F4, H6): 2, 200, 3, 210
1. Direct attributes 2. Indirect attributes 3. One indirect attribute characteristic, e.g. Travel Agency (Price), PC Manufacture (Price)
Skyline tuples

7 Finding Competitive Products
Given
– a set of source tables
– market packages T_E
– new packages T_Q
Then a tuple q in T_Q is said to be a competitive product if q is in the skyline with respect to the new packages and the market packages together (T_Q ∪ T_E)

8 Naïve Solution
Source Tables
Hotel (Hotel: Distance-to-beach, Hotel-class, Hotel-cost): H1: 100, 3, [missing]; H2: 200, 2, 90; H3: 400, 1, 80; H4: 150, 2, [missing]; H5: 170, 2, 140; H6: 200, 3, 120
Flight (Flight: No-of-stops, Flight-cost): F1: 0, 120; F2: 1, 100; F3: 2, 80; F4: 2, 90
Newly Created Vacation Packages (Package: No-of-stops, Distance-to-beach, Hotel-class, Price): Q1 (f1, h1): 0, 100, 3, 220; Q2 (f1, h2): 0, 200, 2, 210; Q3 (f1, h3): 0, 400, 1, 200; …; Q7 (f2, h1): 1, 100, 3, 200; …; Q13 (f3, h1): 2, 100, 3, 180; …; Q24 (f4, h6): 2, 200, 3, 210
Existing Vacation Packages (Package: No-of-stops, Distance-to-beach, Hotel-class, Price): P1: 0, 130, 2, 250; P2: 1, 140, 2, 170; P3: 1, 300, 1, 150; P4: 1, 150, 4, 300
1. Intra-dominance checking 2. Inter-dominance checking
Competitive Products (Package: No-of-stops, Distance-to-beach, Hotel-class, Price): Q1 (f1, h1): 0, 100, 3, 220; Q2 (f1, h2): 0, 200, 2, 210; Q3 (f1, h3): 0, 400, 1, 200; …; Q7 (f2, h1): 1, 100, 3, 200; …; Q13 (f3, h1): 2, 100, 3, 180
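The naïve solution on this slide can be sketched as: build every flight and hotel combination, then keep the new packages that nothing in the new or existing packages dominates. This is an illustrative sketch under stated assumptions, not the authors' implementation. Smaller is assumed better on every attribute and the price is assumed to be flight cost plus hotel cost. H1's cost of 100 is an assumption inferred from Q1's price (220 = 120 + 100), and H4 is left out because its cost is missing from the transcript.

```python
from itertools import product


def dominates(a, b):
    """a dominates b (assumption: smaller is better on every attribute)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


# Flight: (no. of stops, flight cost), values from the slide.
flights = {"F1": (0, 120), "F2": (1, 100), "F3": (2, 80), "F4": (2, 90)}
# Hotel: (distance to beach, hotel class, hotel cost).
# H1's cost (100) is an inferred assumption; H4 is omitted (cost missing).
hotels = {
    "H1": (100, 3, 100),
    "H2": (200, 2, 90),
    "H3": (400, 1, 80),
    "H5": (170, 2, 140),
    "H6": (200, 3, 120),
}
# Existing package: (stops, distance, class, price), values from the slide.
existing = {
    "P1": (0, 130, 2, 250),
    "P2": (1, 140, 2, 170),
    "P3": (1, 300, 1, 150),
    "P4": (1, 150, 4, 300),
}


def combine(flight, hotel):
    """Direct attributes are copied; the price is the sum of the two costs."""
    stops, flight_cost = flight
    distance, hotel_class, hotel_cost = hotel
    return (stops, distance, hotel_class, flight_cost + hotel_cost)


def naive_competitive(flights, hotels, existing):
    new = {(f, h): combine(flights[f], hotels[h]) for f, h in product(flights, hotels)}
    everything = list(new.values()) + list(existing.values())
    # Intra- and inter-dominance checking in one pass over all packages.
    return sorted(k for k, q in new.items() if not any(dominates(p, q) for p in everything))


print(naive_competitive(flights, hotels, existing))
# [('F1', 'H1'), ('F1', 'H2'), ('F1', 'H3'), ('F2', 'H1'), ('F3', 'H1')]
```

Under these assumptions the result corresponds to Q1, Q2, Q3, Q7 and Q13, the competitive products listed on the slide.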


10 Algorithm Overview
Intra-dominance checking – To find the skyline in the source tables
Inter-dominance checking – Skyline in existing market packages – R*-tree index on existing market packages – Full pruning – Partial pruning
Post-processing

11 Intra-dominance Checking
Source Tables
Hotel (Hotel: Distance-to-beach, Hotel-class, Hotel-cost): H1: 100, 3, [missing]; H2: 200, 2, 90; H3: 400, 1, 80; H4: 150, 2, [missing]; H5: 170, 2, 140; H6: 200, 3, 120
Flight (Flight: No-of-stops, Flight-cost): F1: 0, 120; F2: 1, 100; F3: 2, 80; F4: 2, 90
Skyline Tuples of Source Tables
Hotel: H1: 100, 3, [missing]; H2: 200, 2, 90; H3: 400, 1, 80; H4: 150, 2, [missing]; H5: 170, 2, 140
Flight: F1: 0, 120; F2: 1, 100; F3: 2, 80
Newly Created Vacation Packages (conceptual) (Package: No-of-stops, Distance-to-beach, Hotel-class, Price): Q1 (f1, h1): 0, 100, 3, 220; Q2 (f1, h2): 0, 200, 2, 210; Q3 (f1, h3): 0, 400, 1, 200; …; Q7 (f2, h1): 1, 100, 3, 200; …; Q13 (f3, h1): 2, 100, 3, 180; …; Q15 (f3, h5): 2, 170, 3, 200
1. NO intra-dominance checking (one indirect attribute)
2. NO competitive products are missed
Competitive Products (conceptual): Q1 (f1, h1): 0, 100, 3, 220; Q2 (f1, h2): 0, 200, 2, 210; Q3 (f1, h3): 0, 400, 1, 200; …; Q7 (f2, h1): 1, 100, 3, 200; …; Q13 (f3, h1): 2, 100, 3, 180
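The first step on this slide, taking the skyline of each source table before any combination is formed, can be sketched as follows. This is a rough illustration only. It assumes smaller is better on every attribute and that the price is a sum of costs, so a dominated hotel or flight can only produce dominated packages. H1's cost of 100 is an inferred assumption and H4 is omitted because its cost is missing from the transcript.

```python
def dominates(a, b):
    """a dominates b (assumption: smaller is better on every attribute)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(table):
    """Rows of a source table that no other row dominates."""
    return {k: v for k, v in table.items() if not any(dominates(o, v) for o in table.values())}


# Hotel: (distance to beach, hotel class, hotel cost); H1's cost is assumed.
hotels = {
    "H1": (100, 3, 100),
    "H2": (200, 2, 90),
    "H3": (400, 1, 80),
    "H5": (170, 2, 140),
    "H6": (200, 3, 120),
}
# Flight: (no. of stops, flight cost).
flights = {"F1": (0, 120), "F2": (1, 100), "F3": (2, 80), "F4": (2, 90)}

hotel_sky = skyline(hotels)
flight_sky = skyline(flights)

print(sorted(hotel_sky))   # ['H1', 'H2', 'H3', 'H5']  (H6 is dominated)
print(sorted(flight_sky))  # ['F1', 'F2', 'F3']        (F4 is dominated by F3)
print(len(hotels) * len(flights), "->", len(hotel_sky) * len(flight_sky))  # 20 -> 12
```

Only combinations of the surviving rows then need to be considered, which matches the slide's reduced source tables.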


13 Inter-dominance Checking
Existing Vacation Packages (Package: No-of-stops, Distance-to-beach, Hotel-class, Price): P1: 0, 130, 2, 250; P2: 1, 140, 2, 170; P3: 1, 300, 1, 150; P4: 1, 150, 4, 300
Skyline in Existing Vacation Packages: P1: 0, 130, 2, 250; P2: 1, 140, 2, 170; P3: 1, 300, 1, 150
No competitive products are missed
R* Tree will speed up the inter-dominance checking
[Figure: spatial index with nodes R0 to R5]
Inter-dominance checking → range query (spatial index)


15 Full Pruning
Existing Vacation Packages (Package: No-of-stops, Distance-to-beach, Hotel-class, Price): P1: 0, 130, 2, 250; P2: 1, 140, 2, 170; P3: 1, 300, 1, 150
Skyline Tuples of Source Tables
Hotel (Hotel: Distance-to-beach, Hotel-class, Hotel-cost): H1: 100, 3, [missing]; H2: 200, 2, 90; H3: 400, 1, 80; H4: 150, 2, [missing]; H5: 170, 2, 140
Flight (Flight: No-of-stops, Flight-cost): F1: 0, 120; F2: 1, 100; F3: 2, 80
Newly Created Vacation Packages (conceptual) (Package: No-of-stops, Distance-to-beach, Hotel-class, Price): Q1 (f1, h1): 0, 100, 3, 220; Q2 (f1, h2): 0, 200, 2, 210; Q3 (f1, h3): 0, 400, 1, 200; …; Q7 (f2, h1): 1, 100, 3, 200; …; Q13 (f3, h1): 2, 100, 3, 180; …; Q15 (f3, h5): 2, 170, 3, 200
Competitive Products: Q1 (f1, h1): 0, 100, 3, 220; Q2 (f1, h2): 0, 200, 2, 210; Q3 (f1, h3): 0, 400, 1, 200; …; Q7 (f2, h1): 1, 100, 3, 200; …; Q13 (f3, h1): 2, 100, 3, 180
Groups: A1, A2, B1, B2; C1 = {A1, B1}, C4 = {A2, B2}
Full Pruning

16 Full Pruning
Existing Vacation Packages (Package: No-of-stops, Distance-to-beach, Hotel-class, Price): P1: 0, 130, 2, 250; P2: 1, 140, 2, 170; P3: 1, 300, 1, 150
Groups C1, C2, …, Ci, …, Cj, …, Ck with best representatives B1, B2, …, Bi, …, Bj, …, Bk
Package (No-of-stops, Distance-to-beach, Hotel-class, Price): Q (f2, h4): 1, 150, 4, 250; Q' (f2, h5): 1, 170, 4, 240
Best Representative (No-of-stops, Distance-to-beach, Hotel-class, Price): Min: 1, 150, 4, 240
Quality of best representative (tightness of each group): clustering, e.g. KMeans
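One way to read the best representative on this slide: take the attribute-wise minimum over a group, and if some existing package dominates that minimum, every member of the group is dominated and the whole group can be dropped. The sketch below illustrates that reading at the package level. It is an assumption-laden illustration (smaller is better on every attribute; the function names are hypothetical), not the authors' grouping procedure.

```python
def dominates(a, b):
    """a dominates b (assumption: smaller is better on every attribute)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def best_representative(group):
    """Attribute-wise minimum: no member of the group can beat it on any attribute."""
    return tuple(min(column) for column in zip(*group))


def fully_pruned(group, existing):
    """True if an existing package dominates the representative, hence every member."""
    rep = best_representative(group)
    return any(dominates(p, rep) for p in existing)


# Existing packages P1 to P3 from the slide: (stops, distance, class, price).
existing = [(0, 130, 2, 250), (1, 140, 2, 170), (1, 300, 1, 150)]

# The two packages shown on the slide: Q (f2, h4) and Q' (f2, h5).
group = [(1, 150, 4, 250), (1, 170, 4, 240)]
print(best_representative(group))     # (1, 150, 4, 240)
print(fully_pruned(group, existing))  # True: P2 dominates the representative

# A group made of Q1 and Q7 from the earlier slides cannot be fully pruned.
other = [(0, 100, 3, 220), (1, 100, 3, 200)]
print(fully_pruned(other, existing))  # False
```

A tighter group gives a representative closer to its members, which is why the slide ties the quality of the representative to clustering.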


18 Partial Pruning
Full pruning prunes all members in the group
Partial pruning prunes some members in the group
Direct attributes do not change
Estimate the best possible value for indirect attributes
Use tuples in T_E' to conduct a range query in each source table
Eliminate dominated combinations if
– they are dominated on all direct attributes
– they are dominated on all indirect attributes according to their best estimation
Partial pruning is used when full pruning cannot be applied

19 Partial Pruning
Existing Vacation Packages (Package: No-of-stops, Distance-to-beach, Hotel-class, Price): P1: 0, 130, 2, 250; P2: 1, 140, 2, 170; P3: 1, 300, 1, 150
Skyline Tuples of Source Tables
Hotel (Hotel: Distance-to-beach, Hotel-class, Hotel-cost): H1: 100, 3, [missing]; H2: 200, 2, 90; H3: 400, 1, 80; H4: 150, 2, [missing]; H5: 170, 2, 140
Flight (Flight: No-of-stops, Flight-cost): F1: 0, 120; F2: 1, 100; F3: 2, 80
Newly Created Vacation Packages (Package: No-of-stops, Distance-to-beach, Hotel-class, Price): Q1 (f1, h1): 0, 100, 3, 220; Q2 (f1, h2): 0, 200, 2, 210; Q3 (f1, h3): 0, 400, 1, 200; …; Q7 (f2, h1): 1, 100, 3, 200; …; Q13 (f3, h1): 2, 100, 3, 180; …; Q15 (f3, h5): 2, 170, 3, 200
Competitive Products: Q1 (f1, h1): 0, 100, 3, 220; Q2 (f1, h2): 0, 200, 2, 210; Q3 (f1, h3): 0, 400, 1, 200; …; Q7 (f2, h1): 1, 100, 3, 200; …; Q13 (f3, h1): 2, 100, 3, 180
Groups: A1, B1; C1 = {A1, B1}
Full Pruning

20 Meta Transformation
Existing Vacation Packages (Package: No-of-stops, Distance-to-beach, Hotel-class, Price): P1: 0, 130, 2, 250; P2: 1, 140, 2, 170; P3: 1, 300, 1, 150
Package (No-of-stops, Distance-to-beach, Hotel-class, Price): P2: 1, 140, 2, 170
Package (No-of-stops, Price): P2: 1, 170
Package (Distance-to-beach, Hotel-class, Price): P2: 140, 2, 170
Meta-Hotel (Hotel: Distance-to-beach, Hotel-class, Hotel-cost): H1: 100, 3, 200; H2: 200, 2, 190; H3: 400, 1, 180
Meta-Flight (Flight: No-of-stops, Flight-cost): F1: 0, 200; F2: 1, 180
No inter-dominance checking for {F2} × {H2}
Min: 1, 100
Min: 400, 1, 80
Source tables
Hotel (Hotel: Distance-to-beach, Hotel-class, Hotel-cost): H1: 100, 3, [missing]; H2: 200, 2, 90; H3: 400, 1, 80
Flight (Flight: No-of-stops, Flight-cost): F1: 0, 120; F2: 1, 100
Groups: A1, B1
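The Meta-Hotel and Meta-Flight numbers on this slide are consistent with adding the cheapest cost from the other source table to each row, which gives the best possible price of any package built from that row. The sketch below reproduces those numbers and the {F2} × {H2} observation. It is an interpretation of the slide, not the authors' code. Smaller is assumed better on every attribute, the function names are hypothetical, and H1's cost of 100 is an assumption (it matches Meta-Hotel H1 = 200 with a cheapest flight of 100).

```python
def dominates(a, b):
    """a dominates b (assumption: smaller is better on every attribute)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


# Flight: (stops, cost). Hotel: (distance, class, cost); H1's cost is assumed.
flights = {"F1": (0, 120), "F2": (1, 100)}
hotels = {"H1": (100, 3, 100), "H2": (200, 2, 90), "H3": (400, 1, 80)}


def meta_hotels(hotels, flights):
    """Replace each hotel cost by the best price a package with that hotel can reach."""
    cheapest_flight = min(cost for _, cost in flights.values())
    return {h: (d, c, cost + cheapest_flight) for h, (d, c, cost) in hotels.items()}


def meta_flights(flights, hotels):
    """Replace each flight cost by the best price a package with that flight can reach."""
    cheapest_hotel = min(cost for _, _, cost in hotels.values())
    return {f: (s, cost + cheapest_hotel) for f, (s, cost) in flights.items()}


mh = meta_hotels(hotels, flights)
mf = meta_flights(flights, hotels)
print(mh)  # {'H1': (100, 3, 200), 'H2': (200, 2, 190), 'H3': (400, 1, 180)}
print(mf)  # {'F1': (0, 200), 'F2': (1, 180)}

# Project existing package P2 = (stops, distance, class, price) onto each table.
p2 = (1, 140, 2, 170)
p2_flight = (p2[0], p2[3])
p2_hotel = (p2[1], p2[2], p2[3])

dominated_flights = sorted(f for f, m in mf.items() if dominates(p2_flight, m))
dominated_hotels = sorted(h for h, m in mh.items() if dominates(p2_hotel, m))
print(dominated_flights, dominated_hotels)  # ['F2'] ['H2']
```

Because the real price of F2 with H2 can only be at least as high as either best estimate, P2 dominates that combination without it ever being materialized.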


22 Post-processing
More than one indirect attribute
– Calculation: previous algorithm → intra-dominance checking
– Any existing skyline algorithm
– Post-processing cost depends on the number of competitive products


24 Experiments
Pentium IV 2.4GHz PC with 4GB memory, Linux platform, C++
Synthetic anti-correlated datasets
Real datasets, Travel Agency A and Travel Agency B
– A: 296 packages, 1014 hotels and 4394 flights
– B: 149 packages, 995 hotels and 866 flights
Implementation
– Algorithm for Creating Competitive Products (ACCP)
– Baseline algorithm
– Naïve algorithm
Comparison (columns: Skyline in tables, R* Tree, Full & Partial Pruning): ACCP: Yes; Baseline: Yes, No; Naïve: No

25 Synthetic Datasets
Parameters (default value)
– No. of attributes in each source table: 4
– No. of indirect attributes in a product table: 1
– No. of source tables: 2
– No. of clusters in each source table: 2
– Size of existing packages: 5M
– Size of each source table: 100k
Schema is similar to our example
Anti-correlated
6 factors
Measurement – Execution time – Pruning power – Ratio of competitive products out of all combinations – Memory usage

26 Experiments
From 100k to 500k
Full pruning & partial pruning
T_Q, T_Q', T_R SKY
Pruning power slightly increases
Parameters (default value)
– No. of attributes in each source table: 4
– No. of indirect attributes in a product table: 1
– No. of source tables: 2
– No. of clusters in each source table: 6
– Size of existing packages: 5M
– Size of each source table: 100k

27 Experiments
From 2.5M to 10M
More competitive
Slightly decreases
Parameters (default value)
– No. of attributes in each source table: 4
– No. of indirect attributes in a product table: 1
– No. of source tables: 2
– No. of clusters in each source table: 6
– Size of existing packages: 5M
– Size of each source table: 100k

28 Experiments
Travel Agency A Package Generation Set
1. A: 296 packages, 1014 hotels and 4394 flights. B: 149 packages, 995 hotels and 866 flights
2. Source tables from B, and packages from A
3. Vary discount from 0 to 0.50
4. Efficiency: ACCP (44.74s) and Baseline (84.47s)
5. |SKY|/|T_Q|
6. |DOM|/|T_E|
[Figure: curves labeled DOM and SKY]


30 Conclusions
Creating Competitive Products – Example – Problem Definition
Algorithms – Framework – Intra-dominance checking – Inter-dominance checking – Post-processing
Experiments – Synthetic anti-correlated datasets – Real datasets

31 THANK YOU! Q&A

