Toward Practical Query Pricing with QueryMarket. Paraschos Koutris, Prasang Upadhyaya, Magdalena Balazinska, Bill Howe, Dan Suciu. University of Washington.


1 Toward Practical Query Pricing with QueryMarket
Paraschos Koutris, Prasang Upadhyaya, Magdalena Balazinska, Bill Howe, Dan Suciu
University of Washington
SIGMOD 2013

2 Motivation
Data is increasingly bought and sold on the web.
Websites that sell data:
– Xignite (financial)
– Gnip (social)
Data marketplace services:
– Windows Azure Marketplace
– Infochimps
– Factual
– DataMarket

3 A Pricing Scenario (1)
English-German dictionary T:

english   german
thanks    Danke
car       Auto
day       Tag
road      Strasse
road      Weg
…         …

Pricing schemes:
– Sell the whole table T for a fixed price.
  Q: translate only the word “thanks”. The user pays for redundant information.
– Price per output tuple.
  Q: Does the word “thanks” translate to “Auto”? An empty result still carries information.

4 A Pricing Scenario (2)
English-German dictionary T:

english   german
thanks    Danke
car       Auto
day       Tag
road      Strasse
road      Weg
…         …

Word Frequency Stats WF:

word       frequency   genre     rank
rock       0.025       music     20
pop        0.030       music     10
database   0.001       science   1453
…          …           …         …

Current systems do not sell queries that combine datasets.
Queries issued by a user may have overlapping content:
– Q1: Return all translations to German of the top 10 words in the genre “music”
– Q2: Return all translations to German of the top 20 words in the genre “music”

5 How to Price Data
English-German dictionary T:

english   german
thanks    Danke
car       Auto
day       Tag
road      Strasse
road      Weg
…         …

Price points:
– are selection queries on a single table
– exhaust the possible values (Col_A) of some attribute A
– may select on values not in the active domain

Examples:
p(σ[T.english=‘thanks’]) = $0.1
p(σ[T.english=‘day’]) = $0.1
p(σ[T.english=‘road’]) = $0.15
p(σ[T.english=‘cat’]) = $0.05
p(σ[T.english=‘car’]) = $0.1
p(σ[T.german=‘Auto’]) = $0.5
…

6 QueryMarket: Contributions
A formal pricing framework where:
– sellers specify a set of price points as selection queries
– buyers can purchase any query on the database
– the system automatically computes the price of the query
Support efficient computation of prices for a large class of SQL queries.
Support the necessary functionality for a marketplace:
– Pricing queries with overlapping information content
– Database updates
– Revenue sharing among different sellers

7 Outline
1. The Pricing Framework
2. Computing the Price
3. Query History
4. Revenue Sharing

8 The Pricing Framework
The seller defines price points (view-price pairs): S = {(V_1, p_1), (V_2, p_2), …}
A buyer can buy any query Q.
The system will compute price_D^S(Q).
[Diagram labels: Seller, price points, Buyer, Q(D)?, Pricing System + Database D, price_D^S(Q)]
[Koutris et al., PODS 2012]

9 Properties of Prices
Arbitrage-free: given D, price_D(Q) is arbitrage-free if, for all views V_1, …, V_k that determine Q:
    price_D(Q) ≤ price_D(V_1) + … + price_D(V_k)
Discount-free: price_D(Q) must not offer additional discounts except for the explicit price points defined by the seller.
We say that the views V_1, …, V_k determine Q if one can compute Q(D) from V_1(D), …, V_k(D) without access to D.

10 The Pricing Formula
Arbitrage-price: the price of the cheapest set of views from the price points S that determines the query Q.
The arbitrage-price is unique, arbitrage-free, discount-free, and agrees with the price points.

Example: Q(y) = R(x), S(x,y)

Table R    Table S
A          A     B
a_1        a_1   b
           a_2   b

Col_A = {a_1, a_2, a_3}, Col_B = {b}
Price points: each selection on R.A costs $1, on S.A $2, on S.B $3.
– {σ[R.A=a_1], σ[S.B=b]} determines Q: cost = 1 + 3 = 4
– {σ[R.A=a_1], σ[S.A=a_1]} also determines Q: cost = 1 + 2 = 3 (cheapest possible)
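
The definition of the arbitrage-price can be checked directly on this small example. The sketch below is a brute-force illustration, not the system's algorithm: it assumes that "determines" means every database that can be built from Col_A and Col_B and agrees with D on the purchased views also gives the same answer to Q. The helper names (`powerset`, `view_answer`, `determines`, `arbitrage_price`) are made up for this illustration.

```python
from itertools import chain, combinations, product

COL_A = ["a1", "a2", "a3"]
COL_B = ["b"]

# The database D from the slide.
R = frozenset({"a1"})
S = frozenset({("a1", "b"), ("a2", "b")})

# Price points as (relation, attribute, value) -> price: R.A $1, S.A $2, S.B $3.
PRICE_POINTS = {("R", "A", a): 1 for a in COL_A}
PRICE_POINTS.update({("S", "A", a): 2 for a in COL_A})
PRICE_POINTS.update({("S", "B", b): 3 for b in COL_B})


def powerset(items):
    """All subsets of items, as tuples."""
    items = list(items)
    return chain.from_iterable(combinations(items, k) for k in range(len(items) + 1))


def query(r, s):
    """Q(y) = R(x), S(x, y)."""
    return frozenset(y for (x, y) in s if x in r)


def view_answer(point, r, s):
    """Answer of the selection view named by a price point on database (r, s)."""
    rel, attr, value = point
    if rel == "R":
        return frozenset(x for x in r if x == value)
    idx = 0 if attr == "A" else 1
    return frozenset(t for t in s if t[idx] == value)


# Every database over the given columns (assumption: these are all possible worlds).
ALL_DATABASES = [
    (frozenset(r), frozenset(s))
    for r in powerset(COL_A)
    for s in powerset(product(COL_A, COL_B))
]


def determines(points):
    """True if the views fix the answer of Q on every database agreeing with D."""
    target = query(R, S)
    bought = [view_answer(p, R, S) for p in points]
    for r, s in ALL_DATABASES:
        same_views = [view_answer(p, r, s) for p in points] == bought
        if same_views and query(r, s) != target:
            return False
    return True


def arbitrage_price():
    """Cost of the cheapest set of price points that determines Q."""
    return min(
        sum(PRICE_POINTS[p] for p in subset)
        for subset in powerset(PRICE_POINTS)
        if determines(subset)
    )


print(arbitrage_price())
```

Enumerating all subsets and all databases is exponential, so this only works for toy inputs; it is meant to make the definition concrete, not to suggest a practical method.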

11 Outline
1. The Pricing Framework
2. Computing the Price
3. Query History
4. Revenue Sharing

12 Computing the Price
The problem of computing the arbitrage-price is coNP-complete, even for SELECT-PROJECT-JOIN queries.
For some queries, the price can be computed fast: selections, joins without projection.
We describe pricing as an Integer Linear Program (ILP) and then use fast ILP solvers (e.g. GLPK, CPLEX).
Classes of queries supported:
– Selections/Projections/Joins
– Unions
– User-Defined Functions (UDF)
– Bundles of queries

13 ILP Construction (1)
Price the query Q(x,y) = R(x), S(x,y).
Introduce a 0/1 variable x[attribute, value] for each price point: x[R.A,a_2], x[S.A,a_1], x[S.B,b], …

Table R    Table S
A          A     B
a_1        a_1   b
           a_2   b

Col_A = {a_1, a_2, a_3}, Col_B = {b}
Price points: each selection on R.A costs $1, on S.A $2, on S.B $3.

14 ILP Construction (2)
Query: Q(x,y) = R(x), S(x,y)

Table R    Table S
A          A     B
a_1        a_1   b
           a_2   b

Col_A = {a_1, a_2, a_3}, Col_B = {b}

Minimize (independent of the query):
    price = x[R.A,a_1] + x[R.A,a_2] + x[R.A,a_3] + 2x[S.A,a_1] + 2x[S.A,a_2] + 2x[S.A,a_3] + 3x[S.B,b]
Constraints:
– (a_1,b) in Q: x[R.A,a_1] ≥ 1 and x[S.A,a_1] + x[S.B,b] ≥ 1
– (a_2,b) not in Q: x[R.A,a_2] ≥ 1
– (a_3,b) not in Q: x[R.A,a_3] + x[S.A,a_3] + x[S.B,b] ≥ 1
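
As a rough illustration, the constraints on this slide can be generated from the tables and the program solved by enumeration. The generation rule below is inferred from the slide's three cases (for a tuple in Q, every atom must be covered; for a tuple not in Q, at least one false atom must be covered), and the exhaustive search is only a stand-in for an ILP solver such as GLPK or CPLEX. The function names are hypothetical.

```python
from itertools import product

COL_A = ["a1", "a2", "a3"]
COL_B = ["b"]
R = {"a1"}
S = {("a1", "b"), ("a2", "b")}

# Price of one selection on each attribute, as on the slide.
PRICE = {"R.A": 1, "S.A": 2, "S.B": 3}


def build_constraints():
    """One list of variables per constraint; each list must sum to at least 1."""
    constraints = []
    for a, b in product(COL_A, COL_B):
        r_cover = [("R.A", a)]               # views that reveal whether R(a) holds
        s_cover = [("S.A", a), ("S.B", b)]   # views that reveal whether S(a, b) holds
        if a in R and (a, b) in S:
            # (a, b) is in Q: both atoms must be revealed.
            constraints.append(r_cover)
            constraints.append(s_cover)
        else:
            # (a, b) is not in Q: reveal at least one atom that is false.
            cover = []
            if a not in R:
                cover += r_cover
            if (a, b) not in S:
                cover += s_cover
            constraints.append(cover)
    return constraints


def solve(constraints):
    """Minimum total price over all 0/1 assignments satisfying the constraints."""
    variables = sorted({v for c in constraints for v in c})
    best = None
    for bits in product((0, 1), repeat=len(variables)):
        x = dict(zip(variables, bits))
        if all(sum(x[v] for v in c) >= 1 for c in constraints):
            price = sum(PRICE[attr] * x[(attr, val)] for (attr, val) in variables)
            best = price if best is None else min(best, price)
    return best


print(solve(build_constraints()))
```

Variables that appear in no constraint (here x[S.A,a2]) are left out of the search, since setting them to 0 can only lower the objective.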

15 ILP Construction (3)
Projection: Q(y) = R(x), S(x,y)

Table R    Table S
A          A     B
a_1        a_1   b
a_2        a_2   b

Col_A = {a_1, a_2, a_3}, Col_B = {b}

Introduce a new variable z_i for each tuple in Q_full (the query without the projection).
Constraints:
– (a_1,b) in Q_full: x[R.A,a_1] ≥ z_1 and x[S.A,a_1] + x[S.B,b] ≥ z_1
– (a_2,b) in Q_full: x[R.A,a_2] ≥ z_2 and x[S.A,a_2] + x[S.B,b] ≥ z_2
– (b) in Q: z_1 + z_2 ≥ 1
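
The projection constraints can be tried out the same way. In this sketch the five constraints are copied from the slide, the objective reuses the prices $1 (R.A), $2 (S.A) and $3 (S.B) from the earlier slides, and the z variables are given cost 0, which is an assumption made here since the slide does not restate the objective. Exhaustive search again stands in for a real ILP solver.

```python
from itertools import product

# Objective coefficients; z1 and z2 are auxiliary and assumed to cost nothing.
COST = {
    "x[R.A,a1]": 1, "x[R.A,a2]": 1,
    "x[S.A,a1]": 2, "x[S.A,a2]": 2,
    "x[S.B,b]": 3,
    "z1": 0, "z2": 0,
}

# Each constraint is (left-hand variables, right-hand side), meaning sum(lhs) >= rhs.
# The right-hand side is either a variable name or the constant 1.
CONSTRAINTS = [
    (["x[R.A,a1]"], "z1"),
    (["x[S.A,a1]", "x[S.B,b]"], "z1"),
    (["x[R.A,a2]"], "z2"),
    (["x[S.A,a2]", "x[S.B,b]"], "z2"),
    (["z1", "z2"], 1),
]


def solve():
    """Minimum objective value over all feasible 0/1 assignments."""
    names = list(COST)
    best = None
    for bits in product((0, 1), repeat=len(names)):
        v = dict(zip(names, bits))
        feasible = all(
            sum(v[name] for name in lhs) >= (v[rhs] if isinstance(rhs, str) else rhs)
            for lhs, rhs in CONSTRAINTS
        )
        if feasible:
            price = sum(COST[name] * v[name] for name in names)
            best = price if best is None else min(best, price)
    return best


print(solve())
```

Setting z_1 = 1 forces the solver to pay for one witness of (b), for example x[R.A,a1] and x[S.A,a1], while z_2 = 0 switches off the constraints for the other witness.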

16 QueryMarket System
Runs on top of any SQL database.
Information stored in the database:
– Price points are stored in the database in price tables
– An index table keeps track of the price tables
The dataset:
– English-German translation: T_en,gr(w, w’)
– English-French translation: T_en,fr(w, w’)
– UDF to find hashtags: IsHashtag(w)
– Word frequency stats: WF(w, genre, frequency, rank)

17 Price Computation (1)
Small dataset where columns have size ~10^2.
[Plot panels: selections; 2-way joins without projections; 2-way joins with projections; 3-way join]

18 Price Computation (2)
Larger dataset where columns have size ~10^3.
[Plot panels: selections; 2-way joins without projections; 2-way joins with projections; 3-way join]

19 Outline
1. The Pricing Framework
2. Computing the Price
3. Query History
4. Revenue Sharing

20 Query History
A user asks a sequence of queries over time, with varying information overlap: Q = Q_1, Q_2, …, Q_k
Experiment with 30 selection/join queries:
– Oblivious pricing: each query is priced independently
– Bundle pricing: each query Q_i is priced p(Q_1, …, Q_i) − p(Q_1, …, Q_{i−1})
– View pricing: when a query is purchased, the purchased views are free for later queries
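
The three strategies can be compared on a toy history. The sketch below assumes a simplified model that is not taken from the slides: each query comes with an explicit list of alternative view sets that determine it, and a bundle's price is the cheapest union of one alternative per query. The view names, prices and the history are invented for illustration.

```python
from itertools import product

# Hypothetical price points.
PRICES = {"a": 2, "b": 3}

# Hypothetical history: each query lists the view sets that determine it.
HISTORY = [
    [frozenset({"a"}), frozenset({"b"})],  # Q1: either view suffices
    [frozenset({"b"})],                    # Q2
    [frozenset({"a", "b"})],               # Q3
]


def cost(views):
    return sum(PRICES[v] for v in views)


def bundle_price(queries):
    """Cheapest union of one determining view set per query (0 for no queries)."""
    return min(cost(set().union(*choice)) for choice in product(*queries))


def oblivious(history):
    """Each query priced on its own."""
    return [bundle_price([q]) for q in history]


def bundle(history):
    """Q_i is charged p(Q_1..Q_i) - p(Q_1..Q_{i-1})."""
    return [
        bundle_price(history[: i + 1]) - bundle_price(history[:i])
        for i in range(len(history))
    ]


def view(history):
    """Views bought for earlier queries are free for later ones."""
    owned, charges = set(), []
    for alternatives in history:
        best = min(alternatives, key=lambda alt: cost(alt - owned))
        charges.append(cost(best - owned))
        owned |= best
    return charges


print(oblivious(HISTORY), bundle(HISTORY), view(HISTORY))
```

On this history the oblivious charges add up to 10, while bundle and view pricing both add up to 5. View pricing only needs the set of views already bought, whereas bundle pricing re-prices the whole history at each step; the two totals coincide here but need not in general.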

21 Query History (2)
[Plot not preserved in the transcript]

22 View Pricing
View pricing is our proposed strategy:
– Computationally efficient
– Low storage overhead
– Close to the optimal (bundle) price
View pricing can be used for dynamic databases: if a view V is purchased at some point and then updated, the user pays only an update price.

23 Outline
1. The Pricing Framework
2. Computing the Price
3. Query History
4. Revenue Sharing

24 Revenue Sharing
How is the revenue shared between sellers if several datasets contribute to the answer?
What if the cheapest set of views that determines a query is not unique?
Example:
– Q(‘sigmod13’) = isHashtag(‘sigmod13’), isNoun(‘sigmod13’)
– Seller 1 charges $1 per entry for isHashtag; seller 2 charges the same for isNoun
– If both isHashtag and isNoun are false and each entry costs $1, purchasing either entry answers Q

25 Revenue Sharing: Solution
For a seller s, share(s, Q) is the maximum revenue of s over all minimum-cost sets of price points that determine Q.
share(s, Q) can be computed in our framework.
Solution: split price(Q) among the sellers in proportion to their shares.
Example:
– Both shares are $1
– Each seller receives $0.5, since their shares are equal
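
The split can be traced on the hashtag example. This sketch assumes that the sets of price points determining Q are already known and passed in as a list; it also reads the example as seller 1 owning the isHashtag entry and seller 2 owning the isNoun entry, which is an interpretation of the slide. All names are illustrative.

```python
# Price point -> (owning seller, price). Ownership is an assumed reading of the example.
POINTS = {
    "isHashtag('sigmod13')": ("seller1", 1.0),
    "isNoun('sigmod13')": ("seller2", 1.0),
}

# Both entries are false, so either one alone is enough to answer Q.
DETERMINING_SETS = [{"isHashtag('sigmod13')"}, {"isNoun('sigmod13')"}]


def cost(points):
    return sum(POINTS[p][1] for p in points)


def revenue_of(seller, points):
    """What the seller would earn if exactly these price points were bought."""
    return sum(price for owner, price in (POINTS[p] for p in points) if owner == seller)


def split_revenue(determining_sets):
    """Split price(Q) among sellers in proportion to share(s, Q)."""
    price = min(cost(s) for s in determining_sets)
    cheapest = [s for s in determining_sets if cost(s) == price]
    sellers = sorted({owner for owner, _ in POINTS.values()})
    # share(s, Q): best case for s over all minimum-cost determining sets.
    share = {s: max(revenue_of(s, pts) for pts in cheapest) for s in sellers}
    total = sum(share.values())
    if total == 0:
        return {s: 0.0 for s in sellers}
    return {s: price * share[s] / total for s in sellers}


print(split_revenue(DETERMINING_SETS))
```

Here price(Q) is $1, both shares are $1, and each seller receives $0.5, matching the example on the slide.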

26 Conclusions
QueryMarket: the first system that supports pricing a large class of SQL queries within a formal framework.
We presented solutions to address the requirements of a real-world marketplace.
Future work includes:
– Scaling the price computation (bucketization)
– Full SQL support (aggregates, negation)
– Query answering under a limited budget

27 Thank you!

