Stochastic Skyline Operator

Presentation on theme: "Stochastic Skyline Operator"— Presentation transcript:

1 Stochastic Skyline Operator
Xuemin Lin School of Computer Science University of New South Wales Australia Joint Work with: Ying Zhang (UNSW), Wenjie Zhang (UNSW), Muhammad Aamir Cheema (UNSW)

2 Introduction: Skyline
A user preference ≺ is given on each dimension of R^d. For two points u and v in R^d, u dominates v (u ≺ v) if ∀ i (1 ≤ i ≤ d), u.i ⪯ v.i, and ∃ j (1 ≤ j ≤ d), u.j ≺ v.j. Skyline: the points not dominated by any other point. Multi-criteria optimal decision making: the skyline is the minimum set of candidates for the best option under any monotonic function.
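
The dominance test and the skyline definition above can be sketched in a few lines of Python. This is a minimal quadratic scan, assuming that smaller values are preferred on every dimension; the names `dominates` and `skyline` and the sample points are illustrative, not taken from the slides.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]

def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """u dominates v: no worse on every dimension, strictly better on at least one.
    Assumption: smaller values are preferred on every dimension."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))

def skyline(points: List[Point]) -> List[Point]:
    """Points not dominated by any other point (simple O(n^2) scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical sample points in R^2.
points = [(1, 4), (3, 2), (2, 5), (4, 3), (5, 1)]
print(skyline(points))  # [(1, 4), (3, 2), (5, 1)]
```

Here (2, 5) is dominated by (1, 4) and (4, 3) is dominated by (3, 2), so only three points remain.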

3 Skyline of Uncertain Objects
Probabilistic Skyline (VLDB07, PODS09, etc.): skyline probabilities are computed over possible worlds, giving each object's probability of not being dominated by any other object. Does it provide a minimal candidate set of optimal solutions? How should optimal options be defined? How can the minimum candidate set be characterized?

4 Expected Utility & Stochastic Order
Expected Utility Principle: given a set 𝒰 of uncertain objects and a decreasing utility function f, select U in 𝒰 to maximize E[f(U)]. Stochastic Order: given a family ℱ of utility functions, U ≺ℱ V if E[f(U)] ≥ E[f(V)] for each f in ℱ. Decreasing Multiplicative Functions: ℱ = { f(x) = ∏ fi(x.i), 1 ≤ i ≤ d }, where each fi is nonnegative and decreasing. Lower orthant order: the stochastic order defined over the family of decreasing multiplicative functions.
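
As a rough illustration of the expected utility principle, the sketch below evaluates E[f(U)] for discrete uncertain objects under one decreasing multiplicative function. The exponential factors, the helper names, and the two objects `U` and `V` are assumptions chosen for the example; the slides do not prescribe a particular f.

```python
import math
from typing import Callable, List, Sequence, Tuple

Instance = Tuple[Tuple[float, ...], float]  # (point, probability)

def expected_utility(obj: List[Instance],
                     f: Callable[[Sequence[float]], float]) -> float:
    """E[f(U)] for a discrete uncertain object given as (point, probability) pairs."""
    return sum(p * f(x) for x, p in obj)

def multiplicative(rates: Sequence[float]) -> Callable[[Sequence[float]], float]:
    """Hypothetical utility f(x) = prod_i exp(-rates[i] * x[i]).
    With rates >= 0, every factor is nonnegative and decreasing."""
    return lambda x: math.prod(math.exp(-r * xi) for r, xi in zip(rates, x))

# Hypothetical objects: every instance of V is worse than the matching instance of U.
U = [((1.0, 2.0), 0.5), ((2.0, 1.0), 0.5)]
V = [((2.0, 3.0), 0.5), ((3.0, 2.0), 0.5)]

f = multiplicative([0.5, 1.0])
print(expected_utility(U, f) > expected_utility(V, f))  # True
```

Checking a single f only illustrates the principle; the stochastic order requires the inequality to hold for every f in the family.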

5 Example
Athlete   Instance 1 / probability   Instance 2 / probability
A         (1,4) / 0.5                (3,2) / 0.5
B         (2,5) / 0.5                (4,3) / 0.5
C         (5,1) / 0.01               (3,4) / 0.99
Utility function: nonnegative decreasing.
1. B is never preferred by the expected utility principle!
2. Psky(A) = 1, Psky(B) = 0.5, Psky(C) = 0.01
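
The skyline probabilities in this example can be reproduced by enumerating possible worlds. The sketch below is a brute-force check (exponential in the number of objects), assuming smaller values are better and that objects are independent; the function names are illustrative.

```python
from itertools import product

# Instances and probabilities of athletes A, B, C from the example.
objects = {
    "A": [((1, 4), 0.5), ((3, 2), 0.5)],
    "B": [((2, 5), 0.5), ((4, 3), 0.5)],
    "C": [((5, 1), 0.01), ((3, 4), 0.99)],
}

def dominates(u, v):
    """u is no worse than v everywhere and strictly better somewhere."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))

def skyline_probabilities(objs):
    """Sum the probabilities of the possible worlds in which each object is in the skyline."""
    names = list(objs)
    psky = {name: 0.0 for name in names}
    # A possible world picks exactly one instance per object.
    for world in product(*(objs[name] for name in names)):
        prob = 1.0
        for _, p in world:
            prob *= p
        pts = [x for x, _ in world]
        for i, name in enumerate(names):
            others = (pts[j] for j in range(len(pts)) if j != i)
            if not any(dominates(q, pts[i]) for q in others):
                psky[name] += prob
    return psky

print({k: round(v, 6) for k, v in skyline_probabilities(objects).items()})
# {'A': 1.0, 'B': 0.5, 'C': 0.01}
```

C ends up with a tiny skyline probability even though B is the object that no decreasing utility function would ever pick over A.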

6 Contributions
Introduce a novel skyline operator: the stochastic skyline. It guarantees the minimal candidate set of optimal solutions regarding decreasing multiplicative functions. Computing the stochastic skyline is NP-complete regarding the dimensionality d. Novel statistics-based pruning techniques. Efficient partition-based verification algorithms: polynomial if d is fixed.

7 Problem Statement
Stochastic Order (lower orthant order): given U and V, U stochastically dominates V (U ≺sd V) if U.cdf(x) ≥ V.cdf(x) for every x and there exists a y such that U.cdf(y) > V.cdf(y). U.cdf(x): the probability mass of U in the rectangular region R((0,0,…,0), x); see the shaded region. Stochastic Skyline: the objects in 𝒰 not stochastically dominated by any other object. Problem Statement: efficiently compute the stochastic skyline for the discrete case.
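
For discrete objects, U.cdf(x) is just the total probability of the instances inside R((0,…,0), x). A minimal sketch, assuming each object is a list of (point, probability) pairs and using athletes A and B from the earlier example:

```python
def cdf(obj, x):
    """Probability mass of obj inside the rectangle R((0,...,0), x)."""
    return sum((p for inst, p in obj if all(a <= b for a, b in zip(inst, x))), 0.0)

A = [((1, 4), 0.5), ((3, 2), 0.5)]
B = [((2, 5), 0.5), ((4, 3), 0.5)]

print(cdf(A, (3, 4)), cdf(B, (3, 4)))  # 1.0 0.0
print(cdf(A, (4, 3)), cdf(B, (4, 3)))  # 0.5 0.5
```

At both points A.cdf is at least B.cdf, which is consistent with A stochastically dominating B; a full test has to cover every x, not just two sample points.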

8 Minimality of stochastic skyline
The stochastic skyline removes all objects not preferred by any non-negative decreasing multiplicative function!

9 Framework
Phase 1: filtering. Remove non-promising objects.
Phase 2: verification. Test stochastic dominance between two objects. BBS combined with a heap: thanks to the “near” progressiveness, in most cases only one of U ≺sd V or V ≺sd U needs to be tested (not both).

10 Testing if U ≺sd V
Violation point: a point x in R^d_+ is a violation point regarding U ≺sd V if U.cdf(x) < V.cdf(x). Testing algorithm: if there are no violation points, then U ≺sd V. Testing only at the instances is not enough.
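
A small counterexample shows why checking only the instance locations can miss a violation point. The objects `U` and `V` below are hypothetical, constructed for this illustration; the helper names are also assumptions.

```python
def cdf(obj, x):
    """Probability mass of obj inside the rectangle R((0,...,0), x)."""
    return sum((p for inst, p in obj if all(a <= b for a, b in zip(inst, x))), 0.0)

def is_violation(u, v, x):
    """x is a violation point regarding u ≺sd v if u.cdf(x) < v.cdf(x)."""
    return cdf(u, x) < cdf(v, x)

# Hypothetical objects, each a list of (point, probability) pairs.
U = [((1, 1), 0.5), ((3, 3), 0.5)]
V = [((1, 2), 0.5), ((2, 1), 0.5)]

instance_points = [inst for inst, _ in U + V]
print(any(is_violation(U, V, x) for x in instance_points))  # False
print(is_violation(U, V, (2, 2)))                           # True
```

The point (2, 2) is not an instance of either object, yet V has all of its mass below it while U has only half, so U does not stochastically dominate V.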

11 Reduce to Grid Points
Test whether U.cdf ≥ V.cdf against grid points only (see (a)). Test the switching grid points only (see solid lines in (b)).
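
The grid reduction can be sketched as a brute-force test: both cdfs are step functions that change only at instance coordinates, so it suffices to evaluate them on the grid spanned by those coordinates. The code below is an illustrative assumption, not the switching-point algorithm of the slides, and it enumerates up to m^d grid points. It uses the three athletes from the earlier example.

```python
from itertools import product

EPS = 1e-12  # tolerance for floating-point sums of probabilities

def cdf(obj, x):
    """Probability mass of obj inside the rectangle R((0,...,0), x)."""
    return sum((p for inst, p in obj if all(a <= b for a, b in zip(inst, x))), 0.0)

def grid_points(*objs):
    """All combinations of the per-dimension coordinates of the given objects' instances."""
    d = len(objs[0][0][0])
    axes = [sorted({inst[i] for obj in objs for inst, _ in obj}) for i in range(d)]
    return product(*axes)

def stochastically_dominates(u, v):
    # A violation, if any exists, also shows up on the grid built from v's coordinates.
    no_violation = all(cdf(u, x) >= cdf(v, x) - EPS for x in grid_points(v))
    # Strictness: u must be strictly better at some grid point.
    strictly_better = any(cdf(u, x) > cdf(v, x) + EPS for x in grid_points(u, v))
    return no_violation and strictly_better

def stochastic_skyline(objs):
    """Names of the objects not stochastically dominated by any other object."""
    return [n for n in objs
            if not any(stochastically_dominates(objs[m], objs[n]) for m in objs if m != n)]

objects = {
    "A": [((1, 4), 0.5), ((3, 2), 0.5)],
    "B": [((2, 5), 0.5), ((4, 3), 0.5)],
    "C": [((5, 1), 0.01), ((3, 4), 0.99)],
}
print(stochastic_skyline(objects))  # ['A', 'C']
```

B is removed because A stochastically dominates it, while C survives: at x = (5, 1) C has mass 0.01 and A has none.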

12 Algorithm
Given a rectangular region R(x, y), if U.cdf(x) ≥ V.cdf(y), then there is no violation point in R(x, y). Partition-based testing algorithm: get the switching points; perform an initial check; iteratively partition the grid, throwing away non-promising sub-grids.
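
The pruning condition above lends itself to a recursive sketch: discard a sub-grid when U.cdf at its lower corner already reaches V.cdf at its upper corner, otherwise split it. The version below is a simplified assumption that works on the grid of V's coordinates and omits the switching-point step and the aggregate R-trees; all names are illustrative.

```python
EPS = 1e-12  # tolerance for floating-point sums of probabilities

def cdf(obj, x):
    """Probability mass of obj inside the rectangle R((0,...,0), x)."""
    return sum((p for inst, p in obj if all(a <= b for a, b in zip(inst, x))), 0.0)

def has_violation(u, v, axes, lo, hi):
    """Search the sub-grid with index corners lo..hi (inclusive) for a violation point."""
    x_lo = tuple(axis[i] for axis, i in zip(axes, lo))
    x_hi = tuple(axis[i] for axis, i in zip(axes, hi))
    if cdf(u, x_lo) >= cdf(v, x_hi) - EPS:
        return False  # pruning condition: no violation point in R(x_lo, x_hi)
    if lo == hi:
        return True   # single grid point with u.cdf < v.cdf
    # Split the sub-grid in half along its widest dimension.
    k = max(range(len(lo)), key=lambda i: hi[i] - lo[i])
    mid = (lo[k] + hi[k]) // 2
    left_hi = hi[:k] + (mid,) + hi[k + 1:]
    right_lo = lo[:k] + (mid + 1,) + lo[k + 1:]
    return (has_violation(u, v, axes, lo, left_hi)
            or has_violation(u, v, axes, right_lo, hi))

def violates(u, v):
    """True if some point x has u.cdf(x) < v.cdf(x)."""
    d = len(v[0][0])
    axes = [sorted({inst[i] for inst, _ in v}) for i in range(d)]
    lo = tuple(0 for _ in axes)
    hi = tuple(len(axis) - 1 for axis in axes)
    return has_violation(u, v, axes, lo, hi)

A = [((1, 4), 0.5), ((3, 2), 0.5)]
B = [((2, 5), 0.5), ((4, 3), 0.5)]
C = [((5, 1), 0.01), ((3, 4), 0.99)]
print(violates(A, B), violates(A, C))  # False True
```

The worst case still visits every grid point, but sub-grids that satisfy the pruning condition are discarded without being enumerated.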

13 Complexity
The algorithm runs in O(dm log m + m^d (T(U_artree) + T(V_artree))) time, where m is the number of instances in V. NP-complete regarding d: convert (the decision version of) the minimal set cover problem to a special case of the testing problem.

14 Filtering Techniques
Pruning Rule 1: throw away fully dominated entries.
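
One way to read "fully dominated" is through minimum bounding rectangles (MBRs): if the upper corner of U's MBR dominates the lower corner of V's MBR, every instance of U dominates every instance of V, so V can be discarded without a cdf test. The sketch below applies this to whole objects; the slide speaks of entries, so treating an object as an entry is an assumption, as are the function names and the objects `U` and `V`.

```python
def mbr(obj):
    """Lower and upper corners of the minimum bounding rectangle of obj's instances."""
    d = len(obj[0][0])
    lo = tuple(min(inst[i] for inst, _ in obj) for i in range(d))
    hi = tuple(max(inst[i] for inst, _ in obj) for i in range(d))
    return lo, hi

def fully_dominates(u, v):
    """Sufficient (not necessary) condition for u to stochastically dominate v."""
    _, u_hi = mbr(u)
    v_lo, _ = mbr(v)
    return (all(a <= b for a, b in zip(u_hi, v_lo))
            and any(a < b for a, b in zip(u_hi, v_lo)))

# Hypothetical objects with separated MBRs.
U = [((1, 1), 0.5), ((2, 2), 0.5)]
V = [((3, 2), 0.5), ((4, 5), 0.5)]
print(fully_dominates(U, V))  # True

# Athletes A and B from the earlier example: the MBRs overlap, so the rule does not fire.
A = [((1, 4), 0.5), ((3, 2), 0.5)]
B = [((2, 5), 0.5), ((4, 3), 0.5)]
print(fully_dominates(A, B))  # False
```

The second case shows that this is only a filter: A does stochastically dominate B, but that has to be established in the verification phase.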

15 Filtering Techniques
Pruning Rule 2: apply Cantelli’s inequality to get upper bounds.
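
Cantelli's inequality states that P(X ≤ μ − λ) ≤ σ²/(σ² + λ²) for λ > 0. The slide does not spell out how the bound is applied, so the sketch below shows one plausible use under stated assumptions: bounding U.cdf(x) from above using only the per-dimension mean and variance of U, since U.cdf(x) ≤ P(U.i ≤ x.i) for every dimension i. All function names are illustrative.

```python
def cantelli_upper_bound(mean, var, t):
    """Upper bound on P(X <= t); falls back to the trivial bound 1 when t >= mean."""
    if t >= mean:
        return 1.0
    lam = mean - t
    return var / (var + lam * lam)

def cdf_upper_bound(obj, x):
    """Upper bound on obj.cdf(x) from per-dimension means and variances only."""
    bounds = []
    for i, t in enumerate(x):
        mean = sum(p * inst[i] for inst, p in obj)
        var = sum(p * (inst[i] - mean) ** 2 for inst, p in obj)
        bounds.append(cantelli_upper_bound(mean, var, t))
    return min(bounds)

def cdf(obj, x):
    """Exact probability mass of obj inside R((0,...,0), x), for comparison."""
    return sum((p for inst, p in obj if all(a <= b for a, b in zip(inst, x))), 0.0)

B = [((2, 5), 0.5), ((4, 3), 0.5)]  # athlete B from the earlier example
x = (1, 3.5)
print(cdf_upper_bound(B, x), cdf(B, x))  # 0.2 0.0
```

If such an upper bound on U.cdf(x) falls below V.cdf(x), then x is a violation point and U cannot stochastically dominate V, without touching U's individual instances.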

16 Size Estimation
Expected size: the size of the stochastic skyline in R^d is bounded by that of the conventional skyline in R^(d+1); i.e., ln^d(n)/(d+1)!
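
To get a feel for the magnitude, the expression can be evaluated directly. The helper below simply computes ln^d(n)/(d+1)! as written on the slide; the function name is an assumption and the sample values of n and d are arbitrary.

```python
import math

def skyline_size_estimate(n: int, d: int) -> float:
    """Evaluate ln(n)^d / (d+1)! as stated on the slide."""
    return math.log(n) ** d / math.factorial(d + 1)

for d in (2, 3, 4):
    print(d, skyline_size_estimate(1_000_000, d))
```

The estimate grows polylogarithmically in n for a fixed d, so the expected skyline stays small relative to the number of objects.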

17 Empirical Study
C++ with STL, compiled with GNU GCC, on a 2.4GHz machine running Debian.
Real dataset: NBA players’ game-by-game statistics. Synthetic datasets: anti-correlated, correlated, independent.

21 Summary
A novel skyline operator: the stochastic skyline. It guarantees minimality. Testing the stochastic order (lower orthant order) is NP-complete. Novel efficient algorithms to compute the stochastic order. Future work: what if ℱ is the set of all decreasing functions?

22 Thank you!

