The Value of Purchase History Data in Target Marketing (1996)


The Value of Purchase History Data in Target Marketing (1996) Peter Rossi - Robert McCulloch - Greg Allenby

Authors

Peter Rossi: James Collins Professor of Marketing, Statistics, and Economics at UCLA. PhD from the University of Chicago. Professor at Chicago when this paper was published.

Robert McCulloch: Professor of Statistics, Arizona State University. PhD from the University of Minnesota. Professor at Chicago when this paper was published.

Greg Allenby: Professor of Marketing/Statistics at Ohio State University. PhD from the University of Chicago.

Rossi, Allenby, McCulloch (2005)

Motivation

Motivation
Consumer-level purchase data are increasingly available, yet few firms take advantage of them.
Data processing and storage costs are rapidly declining.
The authors believe that large gains in revenue can be obtained by implementing a customer-centric targeted marketing strategy.

Goals of Paper
Assess how targeted marketing strategies, each utilizing different information content, affect expected revenue.
Propose a hierarchical Bayesian model to accomplish this.

Random Coefficients Model for Consumer Heterogeneity

Random Coefficients Model for Customer Heterogeneity
Goals of the proposed model:
Accommodate household-specific inferences from individual-level parameter estimates.
Allow for both observed and unobserved heterogeneity.

Random Coefficients Model for Customer Heterogeneity
Recall the random intercept model [1]: illustration in two dimensions; model; distribution for the individual-level intercepts.
[1] Fahrmeir, Ludwig, Thomas Kneib, Stefan Lang, and Brian D. Marx. "Mixed Models." Regression: Models, Methods and Applications. Berlin: Springer, 2013.

Random Coefficients Model for Customer Heterogeneity
Recall the random intercept and slope model [2]: illustration in two dimensions; model; distribution for the individual-level slopes.
[2] Fahrmeir, Ludwig, Thomas Kneib, Stefan Lang, and Brian D. Marx. "Mixed Models." Regression: Models, Methods and Applications. Berlin: Springer, 2013.
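The equations for these two recap slides did not survive in the transcript. As a reminder only, the models are usually written as below in standard mixed-model notation. The symbols (i for individual, j for observation, γ for random effects, τ₀², σ², Q for variances) are assumptions taken from common textbook usage, not from the slides.

```latex
% Random intercept model: each individual i gets its own intercept shift
y_{ij} = \beta_0 + \beta_1 x_{ij} + \gamma_{0i} + \varepsilon_{ij},
\qquad \gamma_{0i} \sim N(0, \tau_0^2),
\qquad \varepsilon_{ij} \sim N(0, \sigma^2)

% Random intercept and slope model: the slope also varies by individual
y_{ij} = \beta_0 + \beta_1 x_{ij} + \gamma_{0i} + \gamma_{1i} x_{ij} + \varepsilon_{ij},
\qquad (\gamma_{0i}, \gamma_{1i})^{\top} \sim N(0, Q)
```

The hierarchical model in the paper extends this idea by letting the whole household coefficient vector vary, with a mean that depends on demographics.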

Random Coefficients Model for Customer Heterogeneity
Hierarchical Bayesian Model
[Diagram: four levels, top to bottom. Demographics: βh ~ N(Δzh, Vβ). Betas: β1, β2, …, βh. Utility: yh1, yh2, …, yht for each household h. Choice: Ih1, Ih2, …, Iht for each household h.]

Random Coefficients Model for Customer Heterogeneity
Hierarchical Bayesian Model
Demographics: zh is a d×1 vector of demographic variables; Δ is the effect of the demographic variables on β.
Betas: βh is the household-level coefficient vector; its mean, Δzh, is a function of zh.
Utility: yht, the utility (linear predictor), follows a multivariate regression; X contains the vector of product features, brand loyalties, and log prices.
Choice: Iht, the choice, is defined as the alternative with the maximum utility on each choice occasion (probit link function).
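The four levels above can be read as a generative process. The following is a minimal simulation sketch, not the authors' estimation code. The function name, the array shapes, and the use of undifferenced utilities are simplifying assumptions made for illustration.

```python
import numpy as np

def simulate_household_choices(z_h, Delta, V_beta, X, Lambda, rng):
    """Simulate one household from the hierarchy (illustrative only).

    z_h:    (d,)      demographic vector
    Delta:  (k, d)    maps demographics to the mean of beta_h
    V_beta: (k, k)    covariance of beta_h around that mean
    X:      (T, J, k) attributes of J brands on T choice occasions
    Lambda: (J, J)    covariance of the utility errors
    """
    # Betas level: beta_h ~ N(Delta z_h, V_beta)
    beta_h = rng.multivariate_normal(Delta @ z_h, V_beta)
    T, J, _ = X.shape
    # Utility level: y_ht = X_ht beta_h + eps_ht, eps_ht ~ N(0, Lambda)
    eps = rng.multivariate_normal(np.zeros(J), Lambda, size=T)
    y = X @ beta_h + eps  # shape (T, J)
    # Choice level: the brand with the largest utility is chosen
    choices = y.argmax(axis=1)
    return beta_h, y, choices

rng = np.random.default_rng(0)
d, k, T, J = 2, 3, 5, 4  # hypothetical sizes
beta_h, y, choices = simulate_household_choices(
    z_h=np.array([1.0, 0.5]),
    Delta=np.ones((k, d)),
    V_beta=np.eye(k),
    X=rng.normal(size=(T, J, k)),
    Lambda=np.eye(J),
    rng=rng,
)
```

Estimation runs in the opposite direction: only the choices and X are observed, and βh and the utilities are inferred.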

Random Coefficients Model for Customer Heterogeneity
Specification of Priors
Three parameters are common across households:
Λ (Lambda): covariance matrix of the utility errors, ε
Δ: effect of demographics on the mean of β
Vβ: covariance matrix of β
Conjugate priors are used for Δ and Vβ; all priors are very non-informative:
Δ: Normal
Vβ: Inverse Wishart
Λ: independent inverted gamma distributions
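As a rough illustration of these prior families, the sketch below draws one value of each parameter using SciPy. The slide gives only the families, so every hyperparameter here (degrees of freedom, scales, the normal standard deviation of 10) is a placeholder assumption. Treating the entries of Δ as independent normals is a simplification rather than the paper's exact conjugate form.

```python
import numpy as np
from scipy.stats import invgamma, invwishart

def draw_from_priors(k, d, J, rng):
    """Draw (Delta, V_beta, Lambda) from diffuse priors (hypothetical settings)."""
    # V_beta ~ Inverse Wishart; a small df keeps the prior diffuse
    V_beta = invwishart.rvs(df=k + 3, scale=np.eye(k), random_state=rng)
    # Delta ~ Normal with a large variance (independent entries: an assumption)
    Delta = rng.normal(0.0, 10.0, size=(k, d))
    # Lambda: diagonal, each variance from an independent inverted gamma
    lam_diag = invgamma.rvs(a=2.0, scale=1.0, size=J, random_state=rng)
    return Delta, V_beta, np.diag(lam_diag)

rng = np.random.default_rng(1)
Delta, V_beta, Lambda = draw_from_priors(k=3, d=2, J=4, rng=rng)
```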

Random Coefficients Model for Customer Heterogeneity
Why use a Bayesian Model?
Reasons to use a hierarchical Bayesian approach:
We can make household-specific inferences from individual-level parameter estimates.
We need a model that can effectively estimate parameters given only a few (in some cases one) observations for each individual (Gibbs sampling).
We can characterize uncertainty around household-level parameters.

Assessing Value of Information Sets (Model Estimation)

Alternative Information Sets
Goal: assess the information content of the various data/information sets available for targeted marketing.
The information sets vary along two dimensions: short vs. long purchase histories, and causal data vs. no causal data.
The authors propose a typology for the various information sets (Table 1).

Description of Data
Scanner panel dataset of tuna purchases in Springfield, Missouri.
400 of 775 households randomly selected.
Households have at least 1.5 years in the data set.
Five tuna brands.

Predictor Variables
Brands/Features: Chicken of the Sea (water) [REFERENCE]; Starkist (water); House Brand (water); Chicken of the Sea (oil); Starkist (oil); Price (log)
Demographics (Δ): household income; family size; retirement (dummy); unemployed (dummy); female head of household (dummy)
Causal: in-store display (dummy); feature ads (dummy)

Effect of demographics (Δ) on βh
[Figure annotations: "Low income, unemployed"; "Retired, unemployed"]
High variance: demographics offer limited explanation.

Effects of brand (βh) on choice
[Figure: individual-level estimates for 10 households, ranging from strong to weak brand preference.]
"Marg" refers to the marginal predictive distribution, or the overall effect; this would be used in the 'base' information set.
"One observation" is the model fit with data on the first observed purchase only.
Note how informative the parameter estimates from the 'full information' model are; this decreases as information is removed.
"Demos only" uses demographic data only and has very limited value.
Similarity of the top two plots: adding causal data yields little information gain.
Increased variation; wider point estimates.

Effects of price (βh) on choice
Individual-level price sensitivities.
The full information model gives the best estimates of the price coefficient.
Even one observation yields a decent estimate of the price coefficient.

Model Estimation Conclusion: Our analysis of model parameters suggests that individual-level purchase history data could be of great use in customizing marketing activities. Further, demographic information is of limited value. We will formalize this analysis next.

Assessing Value of Information Sets (Formal Metric)

Metrics For Couponing in Targeted Marketing
F is the face value of the coupon.
Pr(i) is the purchase probability of the ith choice occasion.
M is the manufacturer margin.
Goal: find the F that maximizes revenue (π).
'Plugging in' model point estimates will overstate revenue; we want to incorporate uncertainty.
Choose F to maximize expected net revenue, averaged over the distribution of β (decision-theoretic approach).
Quantities defined on the slide: Incremental Sales; Expected Net Revenue (π).
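The contrast between the plug-in and decision-theoretic calculations can be sketched numerically. The code below is not the paper's multinomial probit. It assumes a simple two-coefficient binary probit (intercept and log-price coefficient), a coupon that lowers the shelf price by F, and simulated stand-ins for posterior draws of β. All names and numbers are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def purchase_prob(beta, price, F):
    """Illustrative purchase probability at net price (price - F)."""
    intercept, price_coef = beta[..., 0], beta[..., 1]
    return norm.cdf(intercept + price_coef * np.log(price - F))

def plug_in_revenue(draws, price, M, F):
    """Net revenue evaluated at the posterior mean of beta (plug-in)."""
    return purchase_prob(draws.mean(axis=0), price, F) * (M - F)

def expected_revenue(draws, price, M, F):
    """Net revenue averaged over posterior draws (decision-theoretic)."""
    return purchase_prob(draws, price, F).mean() * (M - F)

def optimal_face_value(draws, price, M, grid):
    """Grid search for the face value F that maximizes expected net revenue."""
    revenues = np.array([expected_revenue(draws, price, M, F) for F in grid])
    best = int(revenues.argmax())
    return grid[best], revenues[best]

rng = np.random.default_rng(0)
# Stand-in for posterior draws of (intercept, log-price coefficient)
draws = rng.normal([0.5, -2.0], [0.3, 0.5], size=(500, 2))
grid = np.linspace(0.0, 0.30, 7)
F_star, rev_star = optimal_face_value(draws, price=0.70, M=0.35, grid=grid)
```

Comparing `plug_in_revenue` with `expected_revenue` over the same grid shows how the two calculations can diverge when the posterior for a household is wide.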

Targeted Marketing Optimization
[Figure: expected revenue from one household for values of F, with the optimal F marked.]
Dots: parameter estimates from posterior means (plug-in).
Bars: mean from the decision-theoretic calculation.
Note that the optimal price is the same here, but not for all households.
High level of uncertainty from the small number of household observations.

What is the value of targeted marketing?
We can compare strategies by computing expected revenue per household under each targeting strategy, e.g., full targeting, choices only, blanket, etc.
We use the optimal F, F*, to calculate expected revenue (Π), then average over all households.
Gain Relative to Blanket: (Revenue_target - Revenue_blanket) / (Revenue_blanket - Revenue_none)
This lets us compare the implementation of a targeting strategy with the implementation of a blanket strategy.
Large gains: even the one-observation information set yields a 56% increase in revenue.
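The gain metric is simple arithmetic, shown here as a small helper. The function name and the revenue figures in the example are made up for illustration and are not results from the paper.

```python
def gain_relative_to_blanket(rev_target, rev_blanket, rev_none):
    """(Revenue_target - Revenue_blanket) / (Revenue_blanket - Revenue_none)."""
    denom = rev_blanket - rev_none
    if denom == 0:
        raise ValueError("blanket and no-coupon revenue are equal; gain is undefined")
    return (rev_target - rev_blanket) / denom

# Hypothetical per-household revenues: no coupon, blanket coupon, targeted coupon
gain = gain_relative_to_blanket(rev_target=4.5, rev_blanket=3.0, rev_none=2.0)
```

A gain of 0 means targeting adds nothing beyond a blanket coupon. A gain of 1 means targeting adds as much again as the blanket coupon adds over issuing no coupon.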

Conclusion
Targeted couponing strategies based on customer purchase history can have large, positive effects on revenue.
Even strategies that use short purchase histories can have a positive impact.
As data processing and storage costs continue to decline, this type of modeling will become increasingly easy for managers to implement.