LEARNING TO MODEL RELATEDNESS FOR NEWS RECOMMENDATION
Authors: Yuanhua Lv et al. (UIUC, Yahoo! Labs)
Presenter: Robbie
WWW 2011
OUTLINE
- Introduction and Motivation
- Modeling Relatedness
- Experiments
- Conclusion
INTRODUCTION
- Post-click news recommendation: given a seed news article the user has just read, recommend related candidate news articles
MOTIVATION
- Promote users' continued navigation on the visited website
- Yahoo! and Google focus on initial clicks; post-click news recommendation is largely under-explored
- Current systems mainly depend on editors' manual effort
- No existing method has been proposed to model relatedness directly
MODELING RELATEDNESS
Four aspects:
- Relevance
- Novelty
- Connection clarity
- Transition smoothness
RELEVANCE AND NOVELTY
- A related article should be similar to the seed, but not a duplicate
- Novelty often stands in contrast to relevance
- The same set of features is used to measure both:
  - cosine similarity
  - BM25
  - language models with Dirichlet prior smoothing
  - language models with Jelinek-Mercer smoothing
CONNECTION CLARITY
- Relevance and novelty can only model word overlap between two articles s and d
- Example:
  - s: "White House: Obamas earn $5.5 million in 2009"
  - d: "Obama's oil spill bill seeks $118 million, oil company"
  - The two overlap in words yet discuss different topics
- s and d must also be topically cohesive
- Connection clarity measures the topical cohesion of two news articles
TRANSITION SMOOTHNESS
- Example:
  - s: "Toyota dismisses account of runaway Prius"
  - d1: "What to do if your car suddenly accelerates"
  - d2: "Toyota to build Prius at 3rd Japan plant: report"
  - smooth(s, d1) > smooth(s, d2)
- Definition: measures how well a user's reading interests can transition from s to d
- Transition smoothness is directional, from s to d, i.e. from the "known" to the "novel"
LEARNING A RELATEDNESS FUNCTION
CONSTRUCTING A TEST COLLECTION
- Yahoo! News articles from March 1st to June 30th, 2010
- In each run, randomly sample 549 seed news articles from June 10th to June 20th, each with at least 2,000 visits
- Perform redundancy detection
EDITORIAL JUDGMENTS
- Judged by a group of professional news editors from a commercial online news website
- 4-point relatedness scale: "very related", "somewhat related", "redundant", "unrelated"
- For any document with two different judgments, the judgment with the higher ratio is selected
- High agreement on relative relatedness (80.8%) inspires learning relatedness functions from pairwise preference information
EXPERIMENTS: COMPARING INDIVIDUAL RETRIEVAL MODELS
- "Body" is the best field; title and abstract may lose information
- Cosine similarity performs as well as, or even better than, language models in some cases, but its NDCG@1 is the worst
- It is effective for redundancy detection, since it tends to bring redundant documents to the top
EXPERIMENTS: COMPARING MACHINE-LEARNED RELATEDNESS MODELS
EXPERIMENTS: ANALYZING THE UNIFIED RELATEDNESS MODEL
- Cosine similarity is significantly worse than BM25 as an individual relatedness function, but is the most important feature in the unified model
- Connection clarity and transition smoothness together contribute 7 of the 15 features
CONCLUSIONS
- First attempt at post-click news recommendation
- Proposes four aspects to characterize news relatedness
- Future work:
  - Incorporate non-content features into the unified relatedness function
  - Document- and user-adaptive measures would be more accurate
RELEVANCE AND NOVELTY
- Problem: top-ranked documents may be redundant or unrelated articles
- Solution: passage retrieval
EXPERIMENTS: PASSAGE RETRIEVAL EVALUATION
- Fixed-length arbitrary passage retrieval (length 250, set empirically)
- Passage retrieval doesn't help in most cases
- But it clearly improves NDCG@1, probably because it relaxes the concern of ranking redundant documents at the top
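Fixed-length passage retrieval can be sketched as scoring the best 250-term window of the candidate against the seed. The half-overlapping stride and the generic `score_fn` hook (e.g. cosine or BM25 from earlier) are assumptions:

```python
def max_passage_score(seed_terms, doc_terms, score_fn, length=250, step=125):
    """Fixed-length arbitrary passage retrieval: slide a fixed window over
    the candidate's term list and keep the best window score vs. the seed."""
    if len(doc_terms) <= length:
        return score_fn(seed_terms, doc_terms)
    best = float("-inf")
    for start in range(0, len(doc_terms) - length + 1, step):
        best = max(best, score_fn(seed_terms, doc_terms[start:start + length]))
    return best
```

Scoring the best passage rather than the whole body limits the advantage of near-duplicate articles, consistent with the NDCG@1 improvement noted above.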