Intro to Content Optimization

Presentation transcript:

Intro to Content Optimization
Yury Lifshits, Yahoo! Research
Largely based on slides by Bee-Chung Chen, Deepak Agarwal & Pradheep Elango

Outline
- High-level overview
- Explore/Exploit algorithm for Yahoo! Frontpage

High-Level Overview of Content Optimization

New Research Area
Content optimization = optimizing publishing choices for every user visit under a given objective function

Publishing Choices
- Top stories
- Related stories
- Tweets, updates, trending queries/topics
- Headlines: text, pictures, text length
- Ads
- Layout: modules, order of modules
- Menu items
- Balance between types/topics of content

Opportunities (User Visits)
- Time
- Context
- Search query
- Referrer, user session
- User demographics
- User history
- User interest profile
- User social graph (social targeting)

Objective function
- Clicks
- Time spent
- Engagement: comments, shares, sign-ups
- Actions on subsequent pages
- Ad revenue
- Off-line conversion
- Long-term objectives
- Plus business-rule constraints

Item inventory: articles, web pages, ads, …
Opportunity: users, queries, pages, …
- Use an automated algorithm to select item(s) to show
- Get feedback (click, time spent, …)
- Refine the models
- Repeat (a large number of times)
- Measure metric(s) of interest (total clicks, total revenue, …)

Some examples
- Simple version: I have an important module on my page; its content inventory is obtained from a third-party source and further refined through editorial oversight. Can I algorithmically recommend content on this module? I want to drive up total CTR on this module.
- More advanced: I got an X% lift in CTR, but I have additional information on other downstream utilities (e.g. dwell time). Can I increase downstream utility without losing too many clicks?
- Highly advanced: There are multiple modules running on my website. How do I take a holistic approach and optimize them simultaneously?

Modeling: Key Components
- Feature construction. Content: IR, clustering, taxonomy, entity, … User profiles: clicks, views, social, community, …
- Offline models (logistic, GBDT, …), which initialize the online models
- Online models: fine-resolution corrections (item and user level), quick updates
- Explore/Exploit (adaptive sampling)

Tasks of Content Optimization
- Understand content (offline)
- Serve content to optimize our objectives (online)
- Quickly learn from feedback using ML/statistics (offline + online)
- Constantly enhance our content inventory to improve future performance (offline)
- Constantly enhance our user understanding to improve future performance (offline + online)
- Iterate

Science of Content Optimization
Large-scale machine learning & statistics:
- Offline models
- Online models
- Collaborative filtering
- Explore/Exploit

Explore/Exploit Algorithm for Yahoo! Frontpage Story Selection

Modules on the page:
- Recommend applications
- Recommend search queries
- Recommend news articles
- Recommend packages (image, title, summary, links to other pages): pick 4 out of a dynamic pool of K items, K = 20 ~ 40; routes traffic to other pages

Problems in this example
- Optimize CTR on different modules together in a holistic way: Today Module, Trending Now, Personal Assistant, News, Ads. Treat them as independent?
- For a given module, optimize some combination of CTR, downstream engagement and perhaps revenue

Single Module CTR Optimization Problem
- Pick the top n items (stories) from an item pool of K items (stories) for each user visit to the Yahoo! homepage, in order to maximize the number of clicks in the Today Module
- In general, items can be articles, ads, modules, or configuration parameters of the page layout
- One may also replace clicks with any performance metric observable with low latency

Simplified Single Module Problem
- Only consider the first position: about 2/3 of clicks happen at the first position
- Pick the best one from K items (stories)
- The single best one for all users: no personalization in this talk
- Best means having the highest click-through rate (CTR)
How to solve this “simple” problem?
- Can’t we just show (exploit) the item having the highest CTR?
- We need to explore every available item (using some fraction of traffic) to estimate its CTR
- Explore too little → unreliable CTR estimates
- Explore too much → little traffic left to show the best item
- How much traffic should we allocate to each item now, in order to maximize the total number of clicks in the future (e.g., in a week)?
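The trade-off above can be tried out in a small simulation. The sketch below uses ε-greedy, a simple baseline scheme, not the Bayesian scheme developed later in this talk. The function name, the CTR values, and the traffic volume are illustrative assumptions.

```python
import random

def epsilon_greedy(true_ctrs, n_visits, epsilon, seed=0):
    """Serve visits one at a time with epsilon-greedy; return (total clicks, views per item)."""
    rng = random.Random(seed)
    k = len(true_ctrs)
    views = [0] * k
    clicks = [0] * k
    total_clicks = 0
    for _ in range(n_visits):
        if sum(views) == 0 or rng.random() < epsilon:
            # Explore: show a uniformly random item.
            item = rng.randrange(k)
        else:
            # Exploit: show the item with the highest estimated CTR
            # (items never shown are treated as having CTR 0).
            item = max(range(k),
                       key=lambda i: clicks[i] / views[i] if views[i] else 0.0)
        click = 1 if rng.random() < true_ctrs[item] else 0
        views[item] += 1
        clicks[item] += click
        total_clicks += click
    return total_clicks, views

# Hypothetical pool of three stories; compare several exploration rates.
for eps in (0.0, 0.1, 0.5):
    total, views = epsilon_greedy([0.02, 0.05, 0.10], n_visits=10_000, epsilon=eps)
    print(f"epsilon={eps}: clicks={total}, views per item={views}")
```

Varying `epsilon` shows both failure modes: with very little exploration the scheme can lock onto a poor item, and with a lot of exploration much of the traffic goes to items already known to be weak.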

Example Scenario
- 5-minute intervals, 100 visits per interval
- One new story arrives every interval; each story expires after 4 intervals
- Every story is either “strong” (100% CTR) or “weak” (0% CTR)
- A new story is strong with probability 75% and weak with probability 25%
- What is the optimal strategy to allocate the 100 views of the next interval among the current 4 stories?
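A small simulator makes it easy to experiment with this scenario. The sketch below compares two hand-written heuristics; neither is claimed to be the optimal strategy the slide asks about. All names (`simulate`, `exploit_only`, `probe_then_exploit`) are hypothetical, and the allocation for an interval is assumed to be fixed before any of that interval's clicks are seen.

```python
import random

def simulate(policy, n_intervals=2000, views=100, lifetime=4, p_strong=0.75, seed=0):
    """Average clicks per interval achieved by an allocation policy."""
    rng = random.Random(seed)
    stories = []  # each story is a dict with keys "age", "strong", "known"
    total_clicks = 0
    for _ in range(n_intervals):
        # Drop stories that have already been live for `lifetime` intervals.
        stories = [s for s in stories if s["age"] + 1 < lifetime]
        for s in stories:
            s["age"] += 1
        stories.append({"age": 0, "strong": rng.random() < p_strong, "known": False})
        allocation = policy(stories, views)
        for story, n in zip(stories, allocation):
            if n > 0:
                total_clicks += n if story["strong"] else 0
                story["known"] = True  # CTR is 0% or 100%, so one view reveals the type
    return total_clicks / n_intervals

def _known_strong(stories):
    return [i for i, s in enumerate(stories) if s["known"] and s["strong"]]

def _unknown(stories):
    return [i for i, s in enumerate(stories) if not s["known"]]

def exploit_only(stories, views):
    """All views to a known strong story; gamble on the newest story only when none exists."""
    alloc = [0] * len(stories)
    strong = _known_strong(stories)
    target = strong[-1] if strong else _unknown(stories)[-1]
    alloc[target] = views
    return alloc

def probe_then_exploit(stories, views):
    """One probing view to every unknown story; the rest to a known strong story if any."""
    alloc = [0] * len(stories)
    unknown = _unknown(stories)
    for i in unknown:
        alloc[i] = 1
    strong = _known_strong(stories)
    target = strong[-1] if strong else unknown[-1]
    alloc[target] += views - len(unknown)
    return alloc

print("exploit only      :", simulate(exploit_only))
print("probe then exploit:", simulate(probe_then_exploit))
```

Both policies see the same sequence of stories for a given seed, so the comparison is paired. Other allocation rules can be plugged in by writing a function with the same `(stories, views)` signature.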

Sequential Decision Problem
[Timeline figure: past intervals t–2 and t–1, now (t), and clicks in the future; items 1, 2, …, K receive x1%, x2%, …, xK% of page views]
Determine (x1, x2, …, xK) based on clicks and views observed before t in order to maximize the expected total number of clicks in the future

Modeling the Uncertainty, NOT just the Mean
Simplified setting: two items
- Item A: we know its CTR (say, it has been shown 1 million times)
- Item B: we are uncertain about its CTR (it has been shown only 100 times)
[Figure: probability density over CTR for Item A and Item B]
- If we only make a single decision, give 100% of page views to Item A
- If we make multiple decisions in the future, explore Item B, since its CTR can potentially be higher
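To illustrate why the uncertainty matters, the sketch below puts a Beta posterior on each item's CTR and estimates the probability that Item B is actually better than Item A. The Beta(1, 1) prior, the click counts, and the function names are assumptions made for this example; the slide only gives the view counts.

```python
import random

def beta_posterior(clicks, views, prior_a=1.0, prior_b=1.0):
    """Beta posterior parameters for a CTR after observing clicks out of views."""
    return prior_a + clicks, prior_b + views - clicks

def mean_and_std(a, b):
    """Mean and standard deviation of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var ** 0.5

def prob_b_beats_a(post_a, post_b, n_samples=20_000, seed=0):
    """Monte Carlo estimate of P(CTR of B > CTR of A) under the two posteriors."""
    rng = random.Random(seed)
    wins = sum(rng.betavariate(*post_b) > rng.betavariate(*post_a)
               for _ in range(n_samples))
    return wins / n_samples

# Hypothetical counts: A has 50,000 clicks in 1,000,000 views, B has 4 clicks in 100 views.
post_a = beta_posterior(50_000, 1_000_000)
post_b = beta_posterior(4, 100)
print("A mean/std:", mean_and_std(*post_a))
print("B mean/std:", mean_and_std(*post_b))
print("P(B > A) ~", prob_b_beats_a(post_a, post_b))
```

With these numbers Item B has the lower observed CTR, yet its posterior is wide enough that it may well be the better item, which is the case for spending some views on it when further decisions follow.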

CTR Curves of Some Items in Two Days Each curve is the 1st-position CTR of an item over time CTRs are estimated using 1% random data (See our WWW’09 paper for more information)

Characteristics of Our Application
- Non-stationary CTR: the CTR of each item changes over time
- Dynamic item pools: items come and go with short lifetimes (~10 hr)
- Batch serving: for scalability reasons, data is processed in batches (e.g., one batch per minute), so we need to provide a sampling plan for each batch (time interval)

Bayesian Explore/Exploit
- Adaptation of Whittle (1988) to our problem setting
- With approximations to ensure computational feasibility
- With a time-series model to track the non-stationary CTR
- It provides an approximately Bayes-optimal solution
Development
- Bayes-optimal solution to a simplified case: two items, two intervals
- Near-optimal solution to the general case, using the above solution as a building block

Bayesian Solution: Two Items, Two Intervals
- Two time intervals: t = 0 (now, N0 views) and t = 1 (N1 views)
- Item P: we are uncertain about its CTR, p0 at t = 0 and p1 at t = 1
- Item Q: we know its CTR exactly, q0 at t = 0 and q1 at t = 1
- Question: what fraction x of the N0 views should go to item P, with (1 − x) going to item Q?
- To determine x, we need to estimate what would happen in the future
- Serving fraction x to item P yields c clicks (not yet observed; a random variable)
- Assume we observe c; we can then update p1
- If x and c are given, the optimal solution at t = 1 is: give all views to Item P iff E[ p1(x, c) | x, c ] > q1
[Figure: CTR densities of Item Q and Item P at t = 0 (q0, p0) and at t = 1 (q1, p1(x, c))]

The Two Item, Two Interval Case
Expected total number of clicks in the two intervals:
- E[#clicks] at t = 0: N0·x·E[p0] (item P) + N0·(1 − x)·q0 (item Q)
- E[#clicks] at t = 1, showing the item with the higher E[CTR]: N1·E_c[ max{ E[p1(x, c) | x, c], q1 } ]
- Equivalently: (N0·q0 + N1·q1) + Gain(x, q0, q1), where N0·q0 + N1·q1 is E[#clicks] if we always show item Q
Gain(x, q0, q1) is the gain of exploring the uncertain item P using x: the expected number of additional clicks if we explore item P with fraction x of the views in interval 0, compared to a scheme that only shows the certain item Q in both intervals. From the terms above, Gain(x, q0, q1) = N0·x·(E[p0] − q0) + N1·E_c[ max{ E[p1(x, c) | x, c] − q1, 0 } ]
Solution: argmax_x Gain(x, q0, q1)
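The gain can be evaluated by brute force for small problems. The sketch below is one way to do so under assumptions that are not in the slides: item P's CTR has a Beta(a, b) prior and stays the same across both intervals, the number of clicks c on P is therefore Beta-Binomial, and Gain is taken as N0·x·(E[p0] − q0) + N1·E_c[max(E[p1 | x, c] − q1, 0)]. All function names are hypothetical, and the grid search is only for illustration.

```python
import math

def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_binomial_pmf(c, n, a, b):
    """P(c clicks in n views) when the CTR has a Beta(a, b) prior."""
    log_comb = math.lgamma(n + 1) - math.lgamma(c + 1) - math.lgamma(n - c + 1)
    return math.exp(log_comb + log_beta(a + c, b + n - c) - log_beta(a, b))

def gain(x, n0, n1, a, b, q0, q1):
    """Expected extra clicks from giving fraction x of interval-0 views to item P."""
    n = round(x * n0)                      # views given to P at t = 0
    p_mean = a / (a + b)                   # prior mean CTR of P
    now = n * (p_mean - q0)                # expected click difference at t = 0
    future = 0.0
    for c in range(n + 1):
        post_mean = (a + c) / (a + b + n)  # posterior mean CTR of P after c clicks
        # At t = 1 all views go to P only if its posterior mean beats q1.
        future += beta_binomial_pmf(c, n, a, b) * max(post_mean - q1, 0.0)
    return now + n1 * future

def best_fraction(n0, n1, a, b, q0, q1, grid=100):
    """Grid search for the fraction x that maximizes the gain."""
    xs = [i / grid for i in range(grid + 1)]
    return max(xs, key=lambda x: gain(x, n0, n1, a, b, q0, q1))

# Hypothetical numbers: P has prior mean CTR 0.04, Q has known CTR 0.05.
x_star = best_fraction(n0=1000, n1=1000, a=2.0, b=48.0, q0=0.05, q1=0.05)
print("best x:", x_star, "gain:", gain(x_star, 1000, 1000, 2.0, 48.0, 0.05, 0.05))
```

Even though P's prior mean is below Q's CTR in this example, a small amount of exploration can have positive gain, because a lucky outcome in interval 0 lets the scheme switch to P in interval 1.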

Two Items, Two Intervals: Normal Approximation
- Approximate the distribution of E[ p1(x, c) | x, c ] by a normal distribution
- This is a reasonable approximation because of the central limit theorem
- Proposition: using the approximation, the Bayes-optimal solution x can be found in time O(log N0)
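Under a normal approximation the expectation in the gain has a closed form, since E[max(X − q, 0)] = σ·φ(z) + (μ − q)·Φ(z) with z = (μ − q)/σ for X ~ N(μ, σ²). The sketch below applies this under assumptions not stated in the slides: a stationary Beta(a, b) prior on item P's CTR, for which the posterior mean after n views has mean a/(a + b) and variance σ0²·n/(a + b + n), where σ0² is the prior variance. Function names are hypothetical, and no attempt is made to reproduce the O(log N0) search from the proposition.

```python
import math

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def gain_normal(x, n0, n1, a, b, q0, q1):
    """Gain of exploring item P with fraction x, using a normal approximation."""
    gamma = a + b
    p_mean = a / gamma
    prior_var = a * b / (gamma ** 2 * (gamma + 1.0))
    n = x * n0                                  # views given to P at t = 0
    now = n * (p_mean - q0)
    if n == 0:
        # Nothing is learned, so the posterior mean equals the prior mean.
        return now + n1 * max(p_mean - q1, 0.0)
    # Standard deviation of the posterior mean, seen before observing the clicks.
    sigma1 = math.sqrt(prior_var * n / (gamma + n))
    z = (p_mean - q1) / sigma1
    expected_excess = sigma1 * normal_pdf(z) + (p_mean - q1) * normal_cdf(z)
    return now + n1 * expected_excess

# Hypothetical numbers: P has prior mean CTR 0.04, Q has known CTR 0.05.
for x in (0.0, 0.05, 0.2, 0.5):
    print(x, gain_normal(x, 1000, 1000, 2.0, 48.0, 0.05, 0.05))
```

Because the gain is now a smooth function of x, a numerical one-dimensional search can replace the exhaustive sum over click counts.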

Bayesian Solution: General Case
From two items to K items
- Very difficult problem
- Note: c = [c1, …, cK], where ci is a random variable representing the number of clicks on item i that we may get
- Apply Whittle’s Lagrange relaxation (1988) to our problem setting
- Relax Σ_i zi(c) = 1, for all c, to E_c[ Σ_i zi(c) ] = 1
- Apply Lagrange multipliers (q1 and q2) to enforce the constraints
- We essentially reduce the K-item case to K independent two-item sub-problems (which we have solved)

Bayesian Solution: General Case
- From two intervals to multiple intervals: approximate multiple intervals by two stages
- Non-stationary CTR: incorporate a time-series model (WWW’09) into our solution
- Coarse-grained personalization: partition the user-feature space into segments (e.g., with a decision tree), then explore/exploit the most popular items for each segment

Simulation Experiment: Different Traffic Volume
- Simulation with ground truth estimated based on real data (WWW’09)
- Setting: 16 live items per interval
- Scenarios: web sites with different traffic volume (x-axis)

Simulation Experiment: Different Sizes of the Item Pool
- Simulation with ground truth estimated based on real data
- Setting: 1000 views per interval; average item lifetime = 20 intervals
- Scenarios: different sizes of the item pool (x-axis)

Experimental Result: Controlled Bucket Test
Bayes2x2, B-UCB1 and ε-Greedy were implemented in production and used to serve 3 random samples of real users on a Yahoo! site
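For readers unfamiliar with the UCB family, the sketch below shows the index rule of plain UCB1: estimated CTR plus a confidence bonus that shrinks as an item accumulates views. This is the textbook single-visit rule, not the batched B-UCB1 variant used in the bucket test, whose details are not given in these slides. Function names and the example counts are illustrative assumptions.

```python
import math

def ucb1_scores(clicks, views):
    """UCB1 index per item: estimated CTR plus sqrt(2 ln(total views) / item views)."""
    total = sum(views)
    scores = []
    for c, n in zip(clicks, views):
        if n == 0:
            scores.append(math.inf)  # items never shown are tried first
        else:
            scores.append(c / n + math.sqrt(2.0 * math.log(total) / n))
    return scores

def ucb1_pick(clicks, views):
    """Index of the item with the highest UCB1 score."""
    scores = ucb1_scores(clicks, views)
    return max(range(len(scores)), key=scores.__getitem__)

# A lightly explored item can outrank a well-known item with a higher observed CTR.
print(ucb1_pick(clicks=[50, 0], views=[1000, 10]))
```

The bonus term is what drives exploration here: an item with few views keeps a high index until enough traffic has shown that its CTR is low.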

Characteristics of Different Schemes
Why does the Bayesian solution perform better? Characterize each scheme along three dimensions:
- Exploitation regret: the regret of a scheme when it is showing the item it thinks is best (which may not actually be the best). 0 means the scheme always picks the actual best item. It quantifies the scheme’s ability to find good items (lower → better)
- Exploration regret: the regret of a scheme when it is exploring the items it feels uncertain about. It quantifies the price of exploration (lower → better)
- Fraction of exploitation (higher → better); fraction of exploration = 1 − fraction of exploitation
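The three dimensions can be computed from a serving log when the true CTRs are known, as in a simulation. The sketch below assumes that regret for one visit is the best item's true CTR minus the served item's true CTR, and that each log entry records whether the scheme was exploiting; both the log format and the function name are hypothetical.

```python
def characterize(log, true_ctrs):
    """Summarize a scheme from a log of (item_index, was_exploiting) pairs."""
    best = max(true_ctrs)
    exploit_regrets = [best - true_ctrs[i] for i, exploiting in log if exploiting]
    explore_regrets = [best - true_ctrs[i] for i, exploiting in log if not exploiting]

    def average(values):
        return sum(values) / len(values) if values else 0.0

    return {
        "exploitation_regret": average(exploit_regrets),
        "exploration_regret": average(explore_regrets),
        "exploitation_fraction": len(exploit_regrets) / len(log) if log else 0.0,
    }

# Tiny hypothetical log: item 0 is truly best; the scheme exploits three times
# (once on the wrong item) and explores once.
summary = characterize([(0, True), (1, False), (0, True), (1, True)], [0.10, 0.05])
print(summary)
```

Keeping the two regrets separate shows whether a scheme loses clicks because it misidentifies the best item or because it pays too much while exploring.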

Characteristics of Different Schemes
- Exploitation regret: ability to find good items (lower → better)
- Exploration regret: price of exploration (lower → better)
- Fraction of exploitation (higher → better)
[Figure: two panels plotting exploitation regret against exploration regret and against exploitation fraction, each with the “Good” region marked]

Summary
- Explore/exploit is an effective strategy to maximize CTR in a content display system
- Ongoing research: explore/exploit for personalized recommendation, for page layout optimization, and in the presence of business constraints

Thank You!! Questions