Predicting Winning Price In Real Time Bidding With Censored Data Tejaswini Veena Sambamurthy Weicong Chen.

Presentation transcript:

Predicting Winning Price In Real Time Bidding With Censored Data Tejaswini Veena Sambamurthy Weicong Chen

Topics Covered
- Introduction
- Problem Statement
- Specification
- Methodology: Linear Regression Model, Censored Regression Model, Mixture Model
- Datasets
- Experiments: Settings, Winning Price Patterns, Evaluations
- Related Work
- Reviews

Introduction

What is RTB?
- Real-time bidding (RTB) is a relatively new method of buying and selling online display advertising in real time, one ad impression at a time.
What is SSP?
- Publishers manage and sell their inventories of ad impressions via a Supply-Side Platform (SSP).
What is DSP?
- Advertisers bid on and manage their ads across multiple inventory sources via a Demand-Side Platform (DSP).

Introduction: Censoring
- The winning price is observable only to the DSP that wins the bid. A losing DSP observes only a lower bound on the winning price, namely its own bidding price.
- Censoring also occurs because of the soft floor price.
- Publishers can set a soft floor price and a hard floor price for each ad impression. The hard floor price is the minimum bid needed to have any chance of winning. The soft floor price is higher than the hard floor price; its use is shown in the flow chart on the next slide.
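The censoring described above can be illustrated with a minimal sketch. It ignores soft and hard floor prices, and the names `BidFeedback`, `observe`, and `market_price` are hypothetical; the rule that a DSP wins only when its bid is strictly above the price is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BidFeedback:
    won: bool
    winning_price: Optional[float]  # observed only when the DSP wins
    lower_bound: Optional[float]    # the DSP's own bid, all it knows after a loss


def observe(bid: float, market_price: float) -> BidFeedback:
    """Return what a DSP learns after one auction.

    Assumption: the DSP wins when its bid is strictly above market_price,
    the (hypothetical) price it would have to pay to win.
    """
    if bid > market_price:
        return BidFeedback(won=True, winning_price=market_price, lower_bound=None)
    # On a loss the winning price stays hidden; only winning_price >= bid is known.
    return BidFeedback(won=False, winning_price=None, lower_bound=bid)
```

A winning bid yields a fully observed price, while a losing bid yields only a right-censored observation.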

Introduction Mechanism of modern RTB

Problem Statement
The goal of this paper is to learn the winning price from the perspective of a DSP, using historical RTB bids, while handling the various kinds of winning-price censoring. The aim is to propose a machine-learning model that helps a DSP predict the winning price and thus decide its bidding price.

Specifications
How to solve these difficulties?
- A linear regression model based only on the observed bids.
- A censored regression model that also accounts for the censored data.
- A mixture model that combines the two models, weighted by the winning rate of the DSP.
Experiments were conducted on real RTB data sets.
- Results show that the mixture model in general clearly outperforms linear regression in prediction accuracy.

Math Terminologies I
Regression: In statistical modeling, regression analysis is a statistical process for estimating the relationships among variables. It helps one understand how the typical value of the dependent variable changes when any one of the independent variables is varied while the other independent variables are held fixed.
Likelihood function: a function of the parameters of a statistical model, given an observed outcome.
Probability function: a function of the outcome, given a fixed parameter value.

Methodology: Modeling winning price

How do we measure how well the linear regression model fits? With the negative log-likelihood function.
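The slide's formula did not survive in this transcript. As a hedged reconstruction, a standard form of the negative log-likelihood for linear regression with Gaussian noise is given here. All symbols are assumptions introduced for illustration: x_i is the feature vector of bid i, w_i its observed winning price, \beta the regression weights, \sigma the noise standard deviation, \phi the standard normal density, and \mathcal{W} the set of winning bids.

```latex
\begin{align*}
w_i &= \beta^{\top} x_i + \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2) \\
-\log L(\beta, \sigma) &= \sum_{i \in \mathcal{W}} -\log\left[ \frac{1}{\sigma}\,
  \phi\!\left( \frac{w_i - \beta^{\top} x_i}{\sigma} \right) \right]
\end{align*}
```

For a fixed \sigma, minimizing this quantity over \beta is equivalent to ordinary least squares on the winning bids.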

Methodology: Modeling Censored Data So modeling censored data is necessary!

Methodology: Modeling Censored Data It measures how well the linear regression model approximates the censored data.
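Since the slide's formula is missing, here is a minimal sketch of a censored (Tobit-style) negative log-likelihood under an assumed Gaussian noise model. Winning bids contribute a density term, and losing bids contribute the probability that the winning price is at least the DSP's own bid. The function names and the data layout (lists of `(features, value)` pairs) are hypothetical.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def censored_nll(beta, sigma, winning, losing):
    """Negative log-likelihood over winning and losing bids.

    winning: list of (x, w) pairs, w = observed winning price
    losing:  list of (x, b) pairs, b = own bid, a lower bound on the price
    """
    total = 0.0
    for x, w in winning:
        z = (w - dot(beta, x)) / sigma
        total -= math.log(normal_pdf(z) / sigma)
    for x, b in losing:
        # P(winning price >= b) = Phi((beta.x - b) / sigma)
        z = (dot(beta, x) - b) / sigma
        # Clamp to avoid log(0) when the CDF underflows.
        total -= math.log(max(normal_cdf(z), 1e-300))
    return total
```

A losing bid far below the predicted price adds almost nothing to the loss, while a losing bid above the predicted price is penalized, which pushes predictions for lost auctions upward.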

Math Terminologies II

Methodology: Modeling Censored Data

Methodology-Mixture model
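The deck describes the mixture model only as a combination of the two models weighted by the DSP's winning rate. One plausible reading, sketched here as an assumption rather than the paper's exact formula, is a convex combination of the two predictions. The function and parameter names are hypothetical.

```python
def mixture_prediction(win_rate, linear_pred, censored_pred):
    """Blend the two predictions using the estimated winning rate.

    Assumption: when the DSP is likely to win, the bid resembles the
    observed (winning) data, so the linear model gets more weight;
    otherwise the censored model dominates.
    """
    if not 0.0 <= win_rate <= 1.0:
        raise ValueError("win_rate must lie in [0, 1]")
    return win_rate * linear_pred + (1.0 - win_rate) * censored_pred
```

For example, with a winning rate of 0.25 the censored model's prediction carries three quarters of the weight.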

Dataset

Basic Statistics of the iPinYou Dataset
(Table not preserved in this transcript. Column headings: estimated winning rate, winning rate area under curve, average winning price for winning bids and for censored bids.)

Experiment Settings
- Since both DSPs can provide only the winning prices of the bids they won, winning and losing bids are simulated as follows:
- For each historical bid, the bidding price originally offered by the DSP is divided by a factor (default: 2) to obtain the new bidding price.
- The new bidding price is compared with the original winning price to produce a simulated winning or losing bid.
- The proposed mixture model requires the winning rate as input, which is predicted from the simulated bidding results.
- CTR (click-through rate) is an important feature for predicting the winning rate.
- CTR and the winning rate are predicted in an online fashion to simulate the online bidding environment.
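The simulation procedure can be sketched as follows. The function name, the returned field names, and the tie rule (a bid equal to the winning price counts as a loss) are assumptions for illustration. The true winning price is kept so that predictions on simulated losing bids can still be evaluated.

```python
def simulate_bid(original_bid, original_winning_price, factor=2.0):
    """Turn one historical winning bid into a simulated win or loss."""
    new_bid = original_bid / factor
    won = new_bid > original_winning_price  # assumed tie rule: equality loses
    return {
        "bid": new_bid,
        "won": won,
        # What the DSP would observe: the price on a win, nothing on a loss.
        "observed_winning_price": original_winning_price if won else None,
        # Ground truth retained for evaluation only.
        "true_winning_price": original_winning_price,
    }
```

With the default factor, a historical bid of 100 that paid 40 remains a win at a new bid of 50, while one that paid 60 becomes a simulated loss.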

The Difference in Winning Price Patterns
Question: Do the winning price patterns on the observed data differ from those on the censored data?
The average winning price on losing bids is usually higher than the average winning price on winning bids.

The Difference in Winning Price Patterns

Results show that the winning price patterns on winning and losing bids differ; otherwise the results of the two estimators would be similar. To mitigate this effect, the interaction between the estimated winning rate and the training features is considered: the model is trained on both the original features and the features after interaction with the estimated winning rate. The different patterns of winning and losing bids can thus be distinguished.
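One simple way to realize the interaction described above is to append a copy of each feature multiplied by the estimated winning rate. The deck does not state the exact form of the interaction, so the multiplicative form and the function name here are assumptions.

```python
def add_win_rate_interactions(features, win_rate):
    """Return the original features followed by win_rate * feature terms."""
    return list(features) + [win_rate * f for f in features]
```

A linear model trained on this augmented vector can learn weights that shift with the winning rate, which lets it treat likely wins and likely losses differently.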

Does the censored regression model predict the winning price better than the linear regression model trained only on the historical winning bids?

Evaluation on Mixture Model

Related Work
- Zhang et al. used the bidding price, as an upper bound of the cost, as the winning price.
- Ghosh et al. considered both fully and partially observed winning price information. Since their main goal was to handle budget constraints, they simply assumed the winning price was drawn i.i.d. from a CDF.
- Other studies are on the seller side (SSP).

Thank you! Questions?