Optimizing Submodular Functions


Optimizing Submodular Functions
CS246: Mining Massive Datasets, Jure Leskovec, Stanford University, http://cs246.stanford.edu
(Also today: final exam logistics. Please fill out course evaluation forms -- thanks!)

Announcement: Final Exam Logistics

Final: Logistics
Date: Tuesday, March 20, 3:30-6:30 pm
Location: Gates B01 (last name A-H), Skilling Auditorium (last name I-N), NVIDIA Auditorium (last name O-Z)
SCPD: Your exam monitor will receive the exam this week. You may come to Stanford to take the exam, or take it at most 48 hours before the exam time; email the exam PDF to cs246.mmds@gmail.com by Tuesday, March 20, 11:59 pm Pacific Time. Call Jessica at +1 561-543-1855 if you have questions during the exam, or email the course staff mailing list.

Final: Instructions
The final exam is open book and open notes. A calculator or computer is REQUIRED. You may only use your computer to do arithmetic calculations (i.e., the buttons found on a standard scientific calculator) and to read course notes or the textbook; no internet/Google/Python access is allowed. Anyone who brings a power strip to the final exam gets 1 point extra credit for helping out their classmates.


Recommendations: Diversity
Redundancy leads to a bad user experience: users get bored when every recommendation looks the same. And because we are uncertain about the user's information need, we shouldn't put all our eggs in one basket. How do we optimize for diversity directly?

Covering the day's news
Motivation: turn down the noise in the blogosphere by presenting users with a small set of representative posts that cover the day's important stories. We refer to this as coverage. [Figure: word cloud for Monday, January 14, 2013, where headline size reflects frequency: "France intervenes," "Chuck for Defense," "Argo wins big," "Hagel expects fight," "New gun proposals."]

Encode Diversity as Coverage
Idea: Encode diversity as a coverage problem. Example: given the word cloud of news for a single day, we want to select articles so that most words are "covered."

Diversity as Coverage
[Figure: a day's word cloud with articles selected to cover it.]

What is being covered?
Q: What is being covered? A: Concepts (in our case: named entities, e.g., France, Mali, Hagel, Pentagon, Obama, Romney, Zero Dark Thirty, Argo, NFL).
Q: Who is doing the covering? A: Documents (e.g., "Hagel expects fight").

Simple Abstract Model
Suppose we are given a set of documents D. Each document d covers a set $X_d$ of words/topics/named entities from a universe W. For a set of documents $A \subseteq D$ we define $F(A) = \left|\bigcup_{d \in A} X_d\right|$.
Goal: we want to find $\max_{|A| \le k} F(A)$.
Note: F is a set function, $F : \text{Sets} \to \mathbb{N}$.
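To make the abstract model concrete, here is a minimal Python sketch of the coverage function F. The toy documents and their concept sets are made up for illustration, not from the lecture.

```python
# Toy corpus: each document covers a set of concepts (illustrative only).
docs = {
    "d1": {"Obama", "France", "Mali"},
    "d2": {"Obama", "Romney"},
    "d3": {"Argo", "NFL", "Romney"},
}

def coverage(A):
    """F(A) = |union of X_d for d in A|: number of distinct concepts covered."""
    covered = set()
    for d in A:
        covered |= docs[d]
    return len(covered)

print(coverage({"d1"}))        # 3
print(coverage({"d1", "d2"}))  # 4 ("Obama" is counted only once)
```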

Maximum Coverage Problem
Given a universe of elements $W = \{w_1, \dots, w_n\}$ and sets $X_1, \dots, X_m \subseteq W$.
Goal: Find k sets $X_i$ that cover the most of W. More precisely: find k sets $X_i$ whose union has the largest size.
Bad news: this is a known NP-complete problem.

Simple Greedy Heuristic
Greedy algorithm: start with $A_0 = \{\}$. For $i = 1 \dots k$: find the document d maximizing $F(A_{i-1} \cup \{d\})$ and let $A_i = A_{i-1} \cup \{d\}$, where $F(A) = \left|\bigcup_{d \in A} X_d\right|$.
Example: evaluate $F(\{d_1\}), \dots, F(\{d_m\})$, pick the best (say $d_1$). Then evaluate $F(\{d_1, d_2\}), \dots, F(\{d_1, d_m\})$, pick the best (say $d_2$). Then evaluate $F(\{d_1, d_2, d_3\}), \dots, F(\{d_1, d_2, d_m\})$, pick the best. And so on.
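A sketch of this greedy loop over the toy corpus above; the function name `greedy_max_coverage` is my own, not from the lecture.

```python
def greedy_max_coverage(docs, k):
    """Pick k documents; each round, add the one with the largest marginal gain."""
    A, covered = [], set()
    for _ in range(min(k, len(docs))):
        # Marginal gain of d is the number of new concepts it adds: |X_d \ covered|.
        best = max((d for d in docs if d not in A),
                   key=lambda d: len(docs[d] - covered))
        A.append(best)
        covered |= docs[best]
    return A, covered

print(greedy_max_coverage(docs, 2))  # (['d1', 'd3'], six concepts covered)
```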

Simple Greedy Heuristic
Goal: maximize the covered area. [Figures: greedy adds one set at a time, each step covering the most new area.]

When the Greedy Heuristic Fails
Goal: maximize the size of the covered area. [Figure: three overlapping sets A, B, C.] Greedy first picks A and then C, but the optimal way would be to pick B and C.

Approximation Guarantee
Greedy produces a solution A where $F(A) \ge (1 - 1/e) \cdot OPT$, i.e., $F(A) \ge 0.63 \cdot OPT$ [Nemhauser, Fisher, Wolsey '78].
The claim holds for functions F with two properties:
F is monotone: if $A \subseteq B$ then $F(A) \le F(B)$, and $F(\{\}) = 0$ (adding more docs doesn't decrease coverage).
F is submodular: adding an element to a set gives less improvement than adding it to one of its subsets.
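The guarantee can be sanity-checked on the toy instance by comparing greedy to brute-force OPT; exhaustive search is exponential, so this sketch only works on tiny inputs.

```python
from itertools import combinations
from math import e

def opt_coverage(docs, k):
    """Brute-force OPT: the best coverage over all subsets of size k."""
    return max(len(set().union(*(docs[d] for d in S)))
               for S in combinations(docs, k))

# On the toy corpus, greedy coverage is within (1 - 1/e) of OPT, as guaranteed.
_, covered = greedy_max_coverage(docs, 2)
assert len(covered) >= (1 - 1 / e) * opt_coverage(docs, 2)
```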

Submodularity: Definition
A set function F is called submodular if, for all $A, B \subseteq W$:
$F(A) + F(B) \ge F(A \cup B) + F(A \cap B)$
[Figure: Venn diagram of A and B, their union and intersection.]

Submodularity: Or Equivalently
Diminishing-returns characterization. Equivalent definition: F is submodular if, for all $A \subseteq B$ and $d \notin B$:
$F(A \cup \{d\}) - F(A) \ge F(B \cup \{d\}) - F(B)$
The gain of adding d to the small set A (large improvement) is at least the gain of adding d to the large set B (small improvement).
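The inequality can be checked numerically for the coverage function from the earlier sketch; the particular sets below are arbitrary illustrations.

```python
# Diminishing returns: the gain of d3 on a small set is >= its gain on a superset.
A = {"d1"}
B = {"d1", "d2"}  # A is a subset of B
gain_A = coverage(A | {"d3"}) - coverage(A)  # 3: Argo, NFL, Romney are all new
gain_B = coverage(B | {"d3"}) - coverage(B)  # 2: Romney is already covered by d2
assert gain_A >= gain_B
```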

Example: Set Cover
Natural example: sets $d_1, \dots, d_m$ with $F(A) = \left|\bigcup_{i \in A} d_i\right|$ (the size of the covered area).
Claim: F(A) is submodular! The gain of adding d to a small collection A is at least the gain of adding d to a larger collection $B \supseteq A$: $F(A \cup \{d\}) - F(A) \ge F(B \cup \{d\}) - F(B)$.

Submodularity: Diminishing Returns
Submodularity is the discrete analogue of concavity. [Figure: F plotted against solution size |A|; the curve flattens, so $F(B \cup \{d\}) - F(B) \le F(A \cup \{d\}) - F(A)$ for $A \subseteq B$. Adding d to B helps less than adding it to A!]

Submodularity & Concavity
Marginal gain: $\Delta_F(d \mid A) = F(A \cup \{d\}) - F(A)$.
Submodularity: $F(A \cup \{d\}) - F(A) \ge F(B \cup \{d\}) - F(B)$ for $A \subseteq B$.
Concavity: $f(a + d) - f(a) \ge f(b + d) - f(b)$ for $a \le b$.

Submodularity: Useful Fact
Let $F_1, \dots, F_m$ be submodular and $\lambda_1, \dots, \lambda_m > 0$. Then $F(A) = \sum_{i=1}^m \lambda_i F_i(A)$ is submodular. Submodularity is closed under non-negative linear combinations!
This is an extremely useful fact. The average of submodular functions is submodular: $F(A) = \sum_i P_i \cdot F_i(A)$. Multicriterion optimization: $F(A) = \sum_i \lambda_i F_i(A)$.
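For instance, a weighted mix of two coverage objectives, say one over a politics vocabulary and one over a movies vocabulary, is again submodular; the split and weights below are invented for illustration.

```python
# Hypothetical per-topic concept sets for the same three documents.
docs_politics = {"d1": {"Obama", "France", "Mali"}, "d2": {"Obama", "Romney"}, "d3": {"Romney"}}
docs_movies = {"d1": set(), "d2": set(), "d3": {"Argo", "Zero Dark Thirty"}}

def make_coverage(doc_sets):
    """Build the coverage function F_i over one vocabulary."""
    return lambda A: len(set().union(*(doc_sets[d] for d in A)))

F_politics = make_coverage(docs_politics)
F_movies = make_coverage(docs_movies)

def F(A):
    # Non-negative linear combination of submodular functions: still submodular.
    return 0.7 * F_politics(A) + 0.3 * F_movies(A)

print(F({"d1", "d3"}))  # 0.7*4 + 0.3*2 = 3.4
```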

Back to our problem
Recall: Q: What is being covered? A: Concepts (named entities). Q: Who is doing the covering? A: Documents.

Back to our Concept Cover Problem
Objective: pick k docs that cover the most concepts. F(A) is the number of concepts covered by A; the elements are concepts, and the sets are the concepts appearing in each doc. F(A) is submodular and monotone, so we can use greedy to optimize F.

The Set Cover Problem
Objective: pick k docs that cover the most concepts.
The good: it penalizes redundancy, and it is submodular.
The bad: it ignores concept importance, and all-or-nothing coverage is too harsh.

Probabilistic Set Cover

Concept importance
Objective: pick k docs that cover the most concepts, where each concept c has an importance weight $w_c$.

Document coverage function
$\text{cover}_d(c)$: the probability that document d covers concept c (e.g., how strongly d covers c). Previously, a set of docs A covered c if at least one of the docs in A contained c. Now that document coverage is probabilistic, we define set coverage in the analogous way.

Probabilistic Set Cover
Document coverage function: $\text{cover}_d(c)$ is the probability that document d covers concept c. ($\text{cover}_d(c)$ can also model how relevant concept c is for user u.)
Set coverage function: the probability that at least one document in A covers c, i.e., $\text{cover}_A(c) = 1 - \prod_{d \in A} (1 - \text{cover}_d(c))$.
Objective: $F(A) = \sum_c w_c \cdot \text{cover}_A(c)$, where the $w_c$ are the concept weights.
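A minimal sketch of this objective; the concept weights and per-document coverage probabilities below are invented for illustration.

```python
# Concept weights w_c and probabilistic coverage cover_d(c) (illustrative values).
w = {"Obama": 0.5, "Argo": 0.3, "NFL": 0.2}
cover_p = {
    "d1": {"Obama": 0.9, "Argo": 0.0, "NFL": 0.0},
    "d2": {"Obama": 0.4, "Argo": 0.8, "NFL": 0.1},
}

def prob_coverage(A):
    """F(A) = sum_c w_c * (1 - prod_{d in A} (1 - cover_d(c)))."""
    total = 0.0
    for c, weight in w.items():
        p_missed = 1.0
        for d in A:
            p_missed *= 1.0 - cover_p[d][c]  # prob. that d fails to cover c
        total += weight * (1.0 - p_missed)   # prob. at least one doc covers c
    return total

print(round(prob_coverage({"d1", "d2"}), 3))  # 0.73
```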

Optimizing F(A)
This objective function is also submodular (it has the intuitive diminishing-returns property), so the greedy algorithm gives a $(1 - 1/e) \approx 63\%$ approximation, i.e., a near-optimal solution.

Summary: Probabilistic Set Cover
Objective: pick k docs that cover the most concepts. Each concept c has an importance weight $w_c$, and documents partially cover concepts via $\text{cover}_d(c)$.

Lazy Optimization of Submodular Functions

Submodular Functions: Greedy Is Slow
The greedy algorithm is slow! At each iteration we need to re-evaluate the marginal gains $F(A \cup \{x\}) - F(A)$ of all remaining documents. Runtime is $O(|D| \cdot K)$ function evaluations for selecting K documents out of the set D. [Figure: in each round, the document with the highest marginal gain is added.]

Speeding up Greedy [Leskovec et al., KDD '07]
In round i we have $A_{i-1} = \{d_1, \dots, d_{i-1}\}$ and pick $d_i = \arg\max_{d \in D} F(A_{i-1} \cup \{d\}) - F(A_{i-1})$; that is, greedy maximizes the marginal benefit $\Delta_i(d) = F(A_{i-1} \cup \{d\}) - F(A_{i-1})$.
Observation: by submodularity, $\Delta_i(d) \ge \Delta_j(d)$ for every $d \in D$ and $i < j$, since $A_{i-1} \subseteq A_{j-1}$. Marginal benefits $\Delta_i(d)$ only shrink as i grows: selecting document d in step i covers more new words than selecting d at a later step j > i.

Lazy Greedy [Leskovec et al., KDD '07]
Idea: use $\Delta_i$ as an upper bound on $\Delta_j$ for $j > i$ (valid by submodularity: $F(A \cup \{d\}) - F(A) \ge F(B \cup \{d\}) - F(B)$ for $A \subseteq B$).
Lazy Greedy: keep an ordered list of marginal benefits $\Delta_i$ from previous iterations. Re-evaluate $\Delta_i$ only for the top element, then re-sort and prune. If the fresh value removes the element from the top, move on to the new top element; continue until the top remains unchanged, then pick it.
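A sketch of lazy greedy for the set-cover objective, using a priority queue of stale gains (valid upper bounds by submodularity); the names are my own, not from the paper.

```python
import heapq

def lazy_greedy(docs, k):
    """Lazy greedy: stale marginal gains are upper bounds, so only the
    top of the heap ever needs a fresh evaluation."""
    A, covered = [], set()
    # heapq is a min-heap, so store negated gains; the initial gain is |X_d|.
    heap = [(-len(X), d) for d, X in docs.items()]
    heapq.heapify(heap)
    while len(A) < k and heap:
        _, d = heapq.heappop(heap)
        fresh = len(docs[d] - covered)       # re-evaluate only this element
        if not heap or fresh >= -heap[0][0]:
            # Even the fresh gain keeps d on top, so d is a valid greedy pick.
            A.append(d)
            covered |= docs[d]
        else:
            # Fresh gain fell below the next upper bound: push back and retry.
            heapq.heappush(heap, (-fresh, d))
    return A, covered

print(lazy_greedy(docs, 2))  # same picks as plain greedy, with fewer evaluations
```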

Summary so far
Diversity can be formulated as set cover; set cover is a submodular optimization problem; it can be (approximately) solved using the greedy algorithm; and lazy greedy gives a significant speedup. [Figure: running time in seconds (lower is better) vs. number of blogs selected, comparing exhaustive search over all subsets, naive greedy, and lazy greedy.]

But what about personalization?
[Figure: recommendations such as "Election trouble," "Songs of Syria," "Sandy delays model."]

Concept Coverage
So far we assumed the same concept weighting for all users. [Figure: one shared concept cloud (France, Mali, Hagel, Pentagon, Obama, Romney, Zero Dark Thirty, Argo, NFL) and the selected posts "France intervenes," "Chuck for Defense," "Argo wins big."]

Personal Concept Weights
Each user has different preferences over concepts. [Figure: the same concept cloud weighted two ways, one for a politico and one for a movie buff.]

Personal concept weights
Assume each user u has a different preference vector $w_c(u)$ over concepts c. Goal: learn the personal concept weights from user feedback.

Interactive Concept Coverage
[Figure: the concept cloud and the selected posts "France intervenes," "Chuck for Defense," "Argo wins big."]

Multiplicative Weights (MW)
Multiplicative Weights algorithm: assume each concept c has weight $w_c$. We recommend document d and receive feedback, say $r = +1$ or $-1$. Update the weights: for each concept $c \in d$, set $w_c \leftarrow \beta^r w_c$. That is, if concept c appears in doc d and we received positive feedback (r = +1), we increase $w_c$ by multiplying it by $\beta$ ($\beta > 1$); otherwise we decrease it (divide by $\beta$). Finally, normalize the weights so that $\sum_c w_c = 1$.
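A sketch of this update rule; the value of $\beta$ and the toy weights are illustrative choices, not from the lecture.

```python
BETA = 1.5  # beta > 1 (hypothetical choice)

def mw_update(w, doc_concepts, r):
    """Scale w_c by beta**r for every concept c in the recommended document
    (r = +1 multiplies by beta, r = -1 divides), then renormalize to sum 1."""
    for c in doc_concepts:
        w[c] *= BETA ** r
    total = sum(w.values())
    return {c: v / total for c, v in w.items()}

w = {"Obama": 0.4, "Argo": 0.4, "NFL": 0.2}
w = mw_update(w, {"Argo"}, r=+1)  # positive feedback on an Argo story
print(w)  # Argo's weight rises to 0.5; the others shrink proportionally
```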

Summary of the Algorithm
Steps of the algorithm: identify candidate items to recommend; identify concepts (what makes items redundant?); weigh concepts by their general importance; define the item-concept coverage function; select items using probabilistic set cover; obtain feedback and update the weights.