Evaluating, Combining and Generalizing Recommendations with Prerequisites
Aditya Parameswaran, Stanford University
(with Profs. Hector Garcia-Molina and Jeffrey D. Ullman)
Statistics (at Stanford): >10,000 registered users, 37,000 listed courses, 160,000 evaluations
Overall: 172 universities, 100,000 users
Course Recommendations
Instead of a traditional ranked list, recommend a good package satisfying:
– Prerequisites (e.g., algebra before calculus)
– Requirements (e.g., > 3 math courses)
– Planning constraints (e.g., no two in same slot)
Prior work: recent work on recommending packages
– Yahoo! Travel Plans [De Choudhury et al., WWW 10]
– Yahoo! Composite items [Roy et al., SIGMOD 10]
– Minimizing Cost [Xie et al., RecSys 10]
Intuitive Example
Nodes represent all items not yet taken
Edges imply prerequisites
[Figure: prerequisite graph with node scores A(5), B(6), C(3), D(7), E(2), G(7), H(8), I(2), J(8), K(9)]
Prerequisites: NO, score: 32
Prerequisites: YES, score: 29
Example: General Prerequisites
[Figure: prerequisite graph over Arithmetic, Algebra, Geometry, Set Theory, Probability, Statistics, Optimization, Algorithms, Information Theory, Adv. Math]
Formal Problem
Directed acyclic graph G(V, E), with some nodes labeled AND or OR
Every node x has a score(x)
Recommend a set A of k = |A| courses such that
– $\sum_{a \in A} \mathrm{score}(a)$ is maximized
– Prerequisites of all nodes in A are met
Graph classes: Chain Graphs, AND Graphs, OR Graphs, AND-OR Graphs
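The problem statement can be made concrete with a small brute-force sketch. The data layout (the dicts `score`, `prereqs`, `label`) and the function names are assumptions made for illustration, not the authors' implementation. It assumes a node labeled AND needs all of its prerequisites in the package and a node labeled OR needs at least one. Exhaustive search is only viable for tiny graphs.

```python
from itertools import combinations

def is_feasible(package, prereqs, label):
    """True if every node in the package has its prerequisites met inside it."""
    chosen = set(package)
    for node in chosen:
        needed = prereqs.get(node, [])
        if not needed:
            continue
        if label.get(node, "AND") == "AND":
            ok = all(p in chosen for p in needed)   # AND: all prerequisites
        else:
            ok = any(p in chosen for p in needed)   # OR: at least one
        if not ok:
            return False
    return True

def best_package(score, prereqs, label, k):
    """Try every k-subset; return (total score, package) or None."""
    best = None
    for package in combinations(sorted(score), k):
        if is_feasible(package, prereqs, label):
            total = sum(score[x] for x in package)
            if best is None or total > best[0]:
                best = (total, package)
    return best

# Hypothetical toy graph (not the one shown on the slides).
score = {"a": 5, "b": 6, "c": 9, "d": 8}
prereqs = {"c": ["a", "b"], "d": ["a", "b"]}
label = {"c": "AND", "d": "OR"}
```

With k = 2 the highest-scoring pair, c and d, is infeasible because neither prerequisite is included, so the search settles on b and d instead.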
Outline of Work
Complexity
– Chain Graphs: PTIME DP
– AND / OR / AND-OR: NP-Hard
Adaptable Approx Algorithms
– 1) Breadth First 2) Greedy 3) Top Down
Worst case per structure
Complexity: DP > Greedy > Top Down > BF
Merge Algorithm
Experiments
Extensions to Fuzzy Prerequisites
Graph classes: OR Graphs, AND-OR Graphs, AND Graphs, Chain Graphs
Slide callouts: For Chain Graphs; Sample
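Chain Graph Algorithm
[Figure: DP table with one column per chain (Chain 0, Chain 1, …, Chain i-1, Chain i, …, Chain n) and one row per package size (0, …, j, …, k)]
To pick j items from the first i chains:
– Pick x items from the first i-1 chains
– Pick the first j - x items from the ith chain
B[j, i] = score of the best feasible set of j items from the first i chains
$B[j, i] = \max_{x} \{ B[x, i-1] + S_i(j - x) \}$, where $S_i(t)$ is the total score of the first t items of the ith chain
Complexity: $O(nk^2)$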
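A minimal sketch of this recurrence, assuming each chain is given as a list of scores in prerequisite order (the input format and the function name `best_chain_package` are illustrative choices, not from the talk). The table is kept as a single rolling column, and the two nested loops over j and x for each of the n chains give the O(nk^2) cost.

```python
def best_chain_package(chains, k):
    """Best total score of a feasible package of exactly k items.

    chains: list of score lists, each in prerequisite order.
    Returns None if fewer than k items exist.
    """
    NEG = float("-inf")
    # B[j] = best score using j items from the chains processed so far.
    B = [0] + [NEG] * k
    for chain in chains:
        # prefix[t] = total score of the first t items of this chain.
        prefix = [0]
        for s in chain:
            prefix.append(prefix[-1] + s)
        new_B = [NEG] * (k + 1)
        for j in range(k + 1):
            for x in range(j + 1):       # x items from earlier chains
                t = j - x                # t items from this chain
                if t < len(prefix) and B[x] != NEG:
                    new_B[j] = max(new_B[j], B[x] + prefix[t])
        B = new_B
    return None if B[k] == NEG else B[k]
```

For example, with chains `[[1, 10], [5, 1], [4]]` and k = 3, the best package takes both items of the first chain and the first item of the second.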
Breadth First Algorithm: Illustration
k = 4
– Add items until k = 4
– Swap items
[Figure: prerequisite graph with node scores A(5), B(6), C(3), D(7), E(2), G(7), H(8), I(2), J(8), K(9)]
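The "add items until k" phase might be sketched as follows. This is an assumed reading of the slide, not the authors' code: at each step it adds the highest-scoring item whose prerequisites are all already chosen (AND semantics). The swap phase is left out because the slide does not specify it, and the name `frontier_pick` and the dict-based input are hypothetical.

```python
def frontier_pick(score, prereqs, k):
    """Repeatedly add the best item whose prerequisites are already chosen."""
    chosen = []
    chosen_set = set()
    while len(chosen) < k:
        # Frontier: unchosen items with every prerequisite already in the package.
        frontier = [
            x for x in score
            if x not in chosen_set
            and all(p in chosen_set for p in prereqs.get(x, []))
        ]
        if not frontier:
            break  # fewer than k items are reachable
        # Break score ties by name so the result is deterministic.
        best = max(frontier, key=lambda x: (score[x], x))
        chosen.append(best)
        chosen_set.add(best)
    return chosen

# Hypothetical toy instance: "b" is valuable but hidden behind "a".
toy_score = {"a": 2, "b": 9, "c": 5, "d": 4}
toy_prereqs = {"b": ["a"]}
```

On the toy instance with k = 2 this picks c and d (total 9) and never reaches b, although a and b together score 11, which illustrates why a later correction step such as swapping can help.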
Top Down & Greedy Algorithms
Algorithms between extremes
– Efficient: Breadth First
– Inefficient but Exact: Dynamic Programming
Top Down is the reverse of Breadth First
– Add best items first, then try to add prerequisites
Greedy reasons about entire chains at once
– Tries to add prefixes of chains with high avg score
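One way to read "add prefixes of chains with high avg score" is sketched below for chain graphs only. The selection rule (take the remaining prefix with the highest average score that still fits in the budget, then repeat), the list-of-lists input, and the name `greedy_prefixes` are assumptions for illustration; the talk does not spell out the exact rule.

```python
def greedy_prefixes(chains, k):
    """Total score of a package built by repeatedly taking the best-average prefix.

    chains: list of score lists, each in prerequisite order.
    """
    taken = [0] * len(chains)    # items already taken from each chain
    total, remaining = 0, k
    while remaining > 0:
        best = None              # (average, chain index, prefix length, prefix sum)
        for i, chain in enumerate(chains):
            run = 0
            rest = chain[taken[i]:]
            # Only consider prefixes that fit in the remaining budget.
            for length, s in enumerate(rest[:remaining], start=1):
                run += s
                cand = (run / length, i, length, run)
                if best is None or cand[0] > best[0]:
                    best = cand
        if best is None:
            break                # nothing left to add
        _, i, length, run = best
        taken[i] += length
        total += run
        remaining -= length
    return total
```

With chains `[[1, 10], [5, 1], [4]]` and k = 3, the first pick is the whole first chain (average 5.5), which a purely item-by-item strategy would not reach because its first item scores only 1.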
Breadth First: Worst Case
Worst case in terms of:
– d: maximum length of chain
– m: max difference in score in a given chain
[Figure: lots of chains of depth d with item scores a - ε and a + m - ε, and lots of singleton elements of score a]
Ignoring ε terms, k singletons score ka, while k/d whole chains score $\frac{k}{d}\,(da + (d-1)m)$
difference = $\frac{k}{d}\,(da + (d-1)m) - ka = \frac{k}{d}\,(d-1)m$
Experimental Setup
Measure: how we perform, as a fraction of DP (for chains) and of no-prereqs (for AND graphs)
Vary:
– n: number of components
– d: max depth of chain / component
– p: probability of a long chain / large component
– k: size of package
Score: exponentially distributed
Graph types: Chain Graphs, AND Graphs
Three algorithms: greedy, td, bf
Combinations: Top-2 of td, bf; Merge-2 of td, bf; Top-3 of greedy, td, bf
Chain Graphs on Varying k
[Plot: x-axis: size of the desired package; y-axis: ratio of Dynamic Programming solution]
AND Graphs on Varying p
[Plot: x-axis: probability of long component; y-axis: ratio of no-prerequisite solution]
Conclusions
Dynamic Programming Algorithm
– Only for Chain Graphs
– Guaranteed best recommendations
Greedy Value Algorithm
– Adaptable to any structure
– Almost as good recommendations as DP
– With less complexity
Top Down and Breadth First
– Even better complexity
– Not as good recommendations as Greedy
– Can be improved using Merge algorithm
Chain Graphs on Varying p
[Plot: x-axis: probability of a long chain (k small); y-axis: ratio of Dynamic Programming solution]