Dynamic Covering for Recommendation Systems Ioannis Antonellis Anish Das Sarma Shaddin Dughmi.

1 Dynamic Covering for Recommendation Systems Ioannis Antonellis Anish Das Sarma Shaddin Dughmi

2 Outline
Covering & Recommendations
Succinct Dynamic Covering
Results:
  o Upper Bounds
  o Lower Bounds

3 Max k-cover Problem
Input:
  o integer k
  o items: X = {1, 2, ..., n}
  o sets: I = {S1, ..., Sm}, each Si a subset of X
Output: a subset of I of size at most k that maximizes the number of items covered
[Figure: sets A, B, C and items 1 to 5]
  o k=1, Solution: A (cover size = 3)
  o k=2, Solutions: A,C (cover size = 4); A,B (cover size = 4); B,C (cover size = 4)

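4 Max k-cover Problem
NP-hard (the decision version is NP-complete)
Greedy Algorithm
  o pick the set that covers the most still-uncovered items
  o iterate until k sets are chosen
Greedy achieves a 1 - ((k-1)/k)^k >= 1 - 1/e ≈ 0.63 approximation
[Figure: sets A, B, C and items 1 to 5]
  o k=1, Solution: A (cover size = 3)
  o k=2, Solutions: A,C (cover size = 4); A,B (cover size = 4); B,C (cover size = 4)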
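The greedy rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `greedy_max_k_cover` and the concrete sets A, B, C are assumptions, chosen only so that A covers 3 items and every pair of sets covers 4, as the slide states. Ties are broken by set name, which is also an assumption.

```python
from typing import Dict, List, Set


def greedy_max_k_cover(sets: Dict[str, Set[int]], k: int) -> List[str]:
    """Pick up to k set names; each round takes the set that covers
    the most still-uncovered items (ties broken by name)."""
    covered: Set[int] = set()
    chosen: List[str] = []
    for _ in range(k):
        best_name = None
        best_gain = 0
        for name in sorted(sets):
            if name in chosen:
                continue
            gain = len(sets[name] - covered)  # newly covered items
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None:  # nothing left to gain
            break
        chosen.append(best_name)
        covered |= sets[best_name]
    return chosen


# Hypothetical instance consistent with the sizes quoted on the slide.
example_sets = {"A": {1, 2, 3}, "B": {3, 4}, "C": {2, 5}}

picked = greedy_max_k_cover(example_sets, 2)
cover = set().union(*(example_sets[name] for name in picked))
```

On this instance greedy first takes A, then either remaining set adds one new item, so the two chosen sets cover 4 of the 5 items.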

5 Max k-cover in Recommendations
Alice views and rates movies
Netflix would like to recommend new movies for Alice to watch
Important problem:
  o Find users "similar" to Alice
  o Find users who cover a large set of Alice's likes and dislikes

6 Netflix example
Each user is identified by the subset of movies they like or have viewed
  o Alice likes {A, B, C}
  o Fred likes {A, D}
  o Bob likes {B, E}
  o Ben likes {C, F}
  o Jim likes {A, B, F}
  o James likes {A, B, F}
Ben and Jim in conjunction cover all of Alice's likes
Fred, Bob and Ben in conjunction cover all of Alice's likes
Jim and James add the same value
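The three claims in this example can be checked mechanically. The sketch below uses the like-sets exactly as given on the slide; the helper name `covered` is an assumption introduced for illustration.

```python
likes = {
    "Fred": {"A", "D"},
    "Bob": {"B", "E"},
    "Ben": {"C", "F"},
    "Jim": {"A", "B", "F"},
    "James": {"A", "B", "F"},
}
alice = {"A", "B", "C"}


def covered(query, users, likes):
    """Return the items of `query` liked by at least one of `users`."""
    union = set()
    for user in users:
        union |= likes[user]
    return query & union


ben_jim = covered(alice, ["Ben", "Jim"], likes)
fred_bob_ben = covered(alice, ["Fred", "Bob", "Ben"], likes)
jim_only = covered(alice, ["Jim"], likes)
jim_james = covered(alice, ["Jim", "James"], likes)
```

Adding James to Jim leaves the covered part of Alice's likes unchanged, which is the sense in which the two users "add the same value".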

7 k-covering vs nearest neighbor
for k=1, the two are equivalent (dot product similarity)
covering allows for diversifying recommendations
want to cover all genres liked by a user
  o consider a user who likes 100 thriller movies and 10 comedies
  o want "similar" users to cover as many movies as possible
  o k-nearest neighbor attempts to find many similar users, not to cover as many movies as possible
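The k=1 equivalence follows because the dot product of two 0/1 indicator vectors equals the size of the intersection of the underlying sets. A small sketch, reusing the like-sets from the Netflix example; the helper names `indicator` and `dot` are assumptions.

```python
universe = ["A", "B", "C", "D", "E", "F"]
likes = {
    "Fred": {"A", "D"},
    "Bob": {"B", "E"},
    "Ben": {"C", "F"},
    "Jim": {"A", "B", "F"},
}
query = {"A", "B", "C"}  # Alice's likes


def indicator(items, universe):
    """0/1 vector marking which universe elements are in `items`."""
    return [1 if x in items else 0 for x in universe]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


q_vec = indicator(query, universe)
scores = {name: dot(q_vec, indicator(s, universe)) for name, s in likes.items()}
overlaps = {name: len(query & s) for name, s in likes.items()}

# Nearest neighbor under dot product and the best single cover coincide.
nearest = max(scores, key=scores.get)
best_cover = max(overlaps, key=overlaps.get)
```

For k > 1 the two objectives diverge: the top-k users by dot product may all overlap on the same items, while covering rewards users who contribute new ones.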

8 oDesk example
Online labor marketplace
clients post jobs and/or invite contractors
contractors apply to jobs
Contractor recommendations for clients
  o Bob invites/interviews/hires contractors
  o find clients "similar" to Bob
Job recommendations for contractors
  o Alice applies to jobs
  o find contractors "similar" to Alice

9 Succinct Dynamic Covering (SDC)
Input:
  o integer k
  o items: X = {1, 2, ..., n}
  o sets: I = {S1, ..., Sm}, each Si a subset of X
  o query Q, a subset of X
Output: a subset of I of size at most k that maximizes the number of items of Q covered
However, we further constrain the problem:
  o space-constrained: statically preprocess (X, I) and store a small sketch, much smaller than O(mn)
  o dynamic: Q is not known a priori when the sketch is created

10 Notice two twists
dynamic
  o for each user the set of movies that need to be covered is different
  o covering is not static
space-constrained
  o real-time, interactive recommendations
  o the whole Netflix graph is huge
      - 10 million users
      - 100k movies
      - popular movies have been viewed many times
  o cannot process the entire graph at query time

11 Ad serving
online advertisers
  o bid on webpages matching relevancy criteria
  o target certain user demographics
When a user visits a page, ad servers:
  o have some (imprecise) idea of the user's demographic (e.g. from click logs)
  o try to pick a set of ads that cover many user demographics
  o need to solve the SDC problem

12 Ad serving
space-constrained:
  o set system consists of users, webpages and clicks
dynamic:
  o each user view of each page is associated with a different user demographic
[Figure: ads A, B, C; webpages 1 to 5; user-visited pages]

13 Coverage Oracle
Offline stage:
  o Input:
      - integer k
      - items: X = {1, 2, ..., n}
      - sets: I = {S1, ..., Sm}, each Si a subset of X
  o Output: data structure D
Dynamic stage:
  o Input: query Q, a subset of X
  o Output: use D to find a subset of I of size at most k that maximizes the number of items of Q covered

14 Outline
Covering & Recommendations
Succinct Dynamic Covering
Results:
  o Upper Bounds
  o Lower Bounds

15 Results
given space limitations
  o interested in approximate solutions for SDC
space vs approximation ratio tradeoffs
ε ∈ [0, 1/2]
δ1, δ2: non-negative integers, not both zero

16 Simple Deterministic Algorithm
For every item, "remember" one set
break ties arbitrarily
m/k approximation, linear space
[Figure: two sets/items diagrams]
k=2: OPT = 16, APPROX = 8, ratio = 16/8 = 2
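One plausible reading of this slide, sketched in Python. The slide only says that each item remembers one set; the query rule used here (return the k remembered sets hit by the most query items) and the tie-breaking by set name are assumptions, as are the function names `build_sketch` and `query_sketch` and the small example instance. The sketch stores one entry per item, so its size is linear in n.

```python
from collections import Counter
from typing import Dict, List, Set


def build_sketch(sets: Dict[str, Set[int]]) -> Dict[int, str]:
    """Offline stage: map each item to one set that contains it.
    Visiting sets in name order makes the tie-breaking deterministic."""
    sketch: Dict[int, str] = {}
    for name in sorted(sets):
        for item in sets[name]:
            sketch.setdefault(item, name)  # keep the first set seen
    return sketch


def query_sketch(sketch: Dict[int, str], q: Set[int], k: int) -> List[str]:
    """Dynamic stage: answer using only the sketch, never the full sets."""
    votes = Counter(sketch[item] for item in q if item in sketch)
    ranked = sorted(votes, key=lambda name: (-votes[name], name))
    return ranked[:k]


# Hypothetical instance (not from the slides).
example_sets = {"A": {1, 2, 3}, "B": {3, 4}, "C": {2, 5}}
sketch = build_sketch(example_sets)

answer_k1 = query_sketch(sketch, {3, 4, 5}, 1)
answer_k2 = query_sketch(sketch, {4, 5}, 2)
```

For the query {3, 4, 5} with k=1 the sketch returns A, which covers one query item, while B would cover two. The sketch has forgotten that item 3 also belongs to B, which illustrates why this approach gives only an approximation.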

17 Better Deterministic Algorithm
Find the unchosen set containing the most uncovered items. Iterate.
similar to previous algorithm, order is fixed
sqrt(n/k) approximation, linear space
[Figure: two sets/items diagrams]
k=2: OPT = 16, APPROX = 16, ratio = 16/16 = 1

18 Randomized Algorithm
m^ε / sqrt(k) approximation
n m^(1-2ε) space
Find an unchosen set containing at least n/(m^ε sqrt(k)) items. Choose it and iterate.
For every remaining unchosen set, choose n/m^(2ε) items uniformly at random from the uncovered items

20 Lower Bound
holds for deterministic oracles only
proof somewhat involved, uses the probabilistic method
matches randomized upper bound
Open problem: randomized lower bound

21 Related work
distance oracles in graphs (Thorup and Zwick)
set cover in the streaming model (sets are streams or items are streams)
nearest neighbor (NN) search:
  o for k=1, SDC and NN are equivalent using the dot product similarity
  o no locality-sensitive hashing for dot product (Charikar), so no hope for signature schemes for SDC

22 Summary
Introduced the Succinct Dynamic Covering problem
Applications in many real-world recommendation systems
Approximation ratio and space tradeoffs
Deterministic and randomized upper bounds
Deterministic lower bound

23 Thank you!

