Published by Elijah Riley. Modified over 9 years ago.
1
Using R for Marketing Research
Dan Toomey
2/23/2015
dan@dantoomeysoftware.com
2
Outline
Overview of R
– Language Basics
– Basic Operations
– Tools: R, RStudio, others
Validate Data
Analyze Ad Campaign Effectiveness
Sales Impact Drivers
Determine Optimal Pricing
3
Language Basics
Uses libraries (packages) of open source code for specific applications
Mostly statistics processing
Simple syntax
Free tools to use
Multiple platforms
4
Basic Operations
Load data
– Variety of formats
Compute metrics
– Again, mostly interested in statistics
Display metrics
– Textual
– Graphical
5
Load Data
> # libraries used
> library(s20x)
> library(car)
> # load and display data
> df <- read.csv("http://www.dataapple.net/wp-content/uploads/2013/04/grapeJuice.csv", header=TRUE)
> head(df)
  sales price ad_type price_apple price_cookies
1   222  9.83       0        7.36          8.80
2   201  9.72       1        7.43          9.62
3   247 10.15       1        7.66          8.90
4   169 10.04       0        7.57         10.26
5   317  8.38       1        7.33          9.54
6
Validate Data
> par(mfrow = c(1,2))
> boxplot(df$sales, horizontal = TRUE, xlab="sales")
> # histogram to explore the data distribution shape
> hist(df$sales, main="", xlab="sales", prob=TRUE)
> lines(density(df$sales), lty="dashed", lwd=2.5, col="red")
7
Data Validation
The box plot shows no outliers
Sales look approximately normally distributed
8
Ad Effectiveness
> sales_ad_nature <- subset(df, ad_type==0)
> sales_ad_family <- subset(df, ad_type==1)
> # graph the two
> par(mfrow = c(1,2))
> hist(sales_ad_nature$sales, main="", xlab="sales with nature production theme ad", prob=TRUE)
> lines(density(sales_ad_nature$sales), lty="dashed", lwd=2.5, col="red")
> hist(sales_ad_family$sales, main="", xlab="sales with family health caring theme ad", prob=TRUE)
> lines(density(sales_ad_family$sales), lty="dashed", lwd=2.5, col="red")
9
Ad Effectiveness
Sales under both ad themes look approximately normally distributed
10
Test Null Hypothesis (Normal)
> shapiro.test(sales_ad_nature$sales)

	Shapiro-Wilk normality test

data:  sales_ad_nature$sales
W = 0.9426, p-value = 0.4155

> shapiro.test(sales_ad_family$sales)

	Shapiro-Wilk normality test

data:  sales_ad_family$sales
W = 0.8974, p-value = 0.08695
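The null hypothesis of the Shapiro-Wilk test is that the sample comes from a normal distribution, so a p-value above the chosen significance level means normality is not rejected. A minimal sketch of that reading, using the two p-values printed above; the 5% level is an assumption, since the deck does not state one.

```r
# p-values copied from the Shapiro-Wilk output on the slide
p_values <- c(nature = 0.4155, family = 0.08695)

# Assumption: the conventional 5% significance level (not stated in the deck)
alpha <- 0.05

# TRUE would mean "reject the null hypothesis of normality" for that group
reject_normality <- p_values < alpha
print(reject_normality)
```

Neither group falls below the assumed threshold, although the family-theme group (p = 0.08695) is much closer to it than the nature-theme group.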
11
Test Means
> t.test(sales_ad_nature$sales, sales_ad_family$sales)

	Welch Two Sample t-test

data:  sales_ad_nature$sales and sales_ad_family$sales
t = -3.7515, df = 25.257, p-value = 0.0009233
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -92.92234 -27.07766
sample estimates:
mean of x mean of y
 186.6667  246.6667
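To read the t-test output, it can help to recompute the difference in means and compare it with the confidence interval. The sketch below uses only the numbers printed above; the variable names are illustrative.

```r
# Sample means and 95% confidence interval copied from the t.test output
mean_nature <- 186.6667
mean_family <- 246.6667
ci <- c(lower = -92.92234, upper = -27.07766)

# Difference in means (nature minus family), the quantity the interval estimates
diff_means <- mean_nature - mean_family

# The interval is symmetric around the estimated difference
ci_midpoint <- mean(ci)

# If zero lies outside the interval, "no difference" is rejected at the 5% level
excludes_zero <- ci[["upper"]] < 0 || ci[["lower"]] > 0

print(c(diff_means = diff_means, ci_midpoint = ci_midpoint))
print(excludes_zero)
```

The family-theme ad sold about 60 more units on average, and the interval excluding zero agrees with the small p-value (0.0009233).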
12
Sales Drivers
> pairs(df, col="blue", pch=20)
> pairs20x(df)
13
Regression Model
> sales.reg <- lm(sales~price+ad_type+price_apple+price_cookies, df)
> summary(sales.reg)

Call:
lm(formula = sales ~ price + ad_type + price_apple + price_cookies, data = df)

Residuals:
    Min      1Q  Median      3Q     Max
-36.290 -10.488   0.884  10.483  29.471

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)    774.813    145.349   5.331 1.59e-05 ***
price          -51.239      5.321  -9.630 6.83e-10 ***
ad_type         29.742      7.249   4.103 0.000380 ***
price_apple     22.089     12.512   1.765 0.089710 .
price_cookies  -25.277      6.296  -4.015 0.000477 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 18.2 on 25 degrees of freedom
Multiple R-squared:  0.8974, Adjusted R-squared:  0.881
F-statistic: 54.67 on 4 and 25 DF,  p-value: 5.318e-12
14
Test Data
Check distribution of residuals
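The deck shows no code for this step. One possible way to check the residuals is sketched below. It is not the author's method: the data are simulated (an assumption) so that the block runs without downloading anything, and with the deck's model `residuals(sales.reg)` would take the place of `residuals(fit)`.

```r
# Assumption: simulated data stands in for the grape juice data set
set.seed(1)
sim <- data.frame(price = runif(30, 8, 11))
sim$sales <- 700 - 50 * sim$price + rnorm(30, sd = 18)
fit <- lm(sales ~ price, data = sim)

# Residuals of the fitted model
res <- residuals(fit)

# Visual checks: histogram with density curve, and a normal Q-Q plot
par(mfrow = c(1, 2))
hist(res, main = "", xlab = "residuals", prob = TRUE)
lines(density(res), lty = "dashed", lwd = 2.5, col = "red")
qqnorm(res)
qqline(res)

# Formal check, in the same style as the earlier normality tests
shapiro_res <- shapiro.test(res)
print(shapiro_res)
```

Roughly bell-shaped residuals centered on zero, with points close to the Q-Q line, suggest the normal-error assumption behind the regression p-values is reasonable.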
15
Multicollinearity
> vif(sales.reg)
        price       ad_type   price_apple price_cookies
     1.246084      1.189685      1.149248      1.099255
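A variance inflation factor equals 1 / (1 - R²), where R² comes from regressing that predictor on the other predictors, so values near 1 indicate little collinearity. The sketch below applies this to the values printed above. The cutoff of 5 is a common rule of thumb assumed here, not something the deck states.

```r
# VIF values copied from the vif(sales.reg) output on the slide
vif_values <- c(price = 1.246084, ad_type = 1.189685,
                price_apple = 1.149248, price_cookies = 1.099255)

# Assumption: a rule-of-thumb cutoff; some analysts use 10 instead
vif_threshold <- 5

# Predictors whose VIF exceeds the cutoff (none expected here)
flagged <- names(vif_values)[vif_values > vif_threshold]
print(flagged)

# R-squared of each predictor against the others, implied by VIF = 1 / (1 - R^2)
implied_r2 <- 1 - 1 / vif_values
print(round(implied_r2, 3))
```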
16
Price Elasticity
Sales = 774.81 – (51.24 * price) + (29.74 * ad_type) + (22.1 * price_apple) – (25.28 * price_cookies)
PE = (ΔQ/Q) / (ΔP/P) = (ΔQ/ΔP) * (P/Q) = -51.24 * 0.045 = -2.3
P/Q = 9.738 / 216.7 = 0.045
The PE indicates that a 10% decrease in price will increase sales by about 23%
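The elasticity arithmetic can be reproduced in a few lines. This sketch uses only the figures on the slide; the variable names are illustrative.

```r
# Figures copied from the slide
slope_price <- -51.24   # ΔQ/ΔP, the price coefficient of the sales model
p <- 9.738              # P as given on the slide
q <- 216.7              # Q as given on the slide

# Point price elasticity: (ΔQ/ΔP) * (P/Q)
pe <- slope_price * (p / q)
print(round(pe, 1))

# Approximate percentage change in sales for a 10% price cut
pct_price_change <- -10
pct_sales_change <- pe * pct_price_change
print(round(pct_sales_change))
```

Because this is a point elasticity, the 23% figure is a local approximation around the chosen P and Q and becomes less reliable for large price changes.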
17
Optimal Pricing
Substituting mean values for the other variables in the above model, we come down to
Sales = 772.64 – 51.24*price
Estimated profit, with unit cost C = 5:
Y = (price – C) * Sales Quantity = (price – 5) * (772.64 – 51.24*price)
which reduces to
Y = –51.24 * price^2 + 1028.84 * price – 3863.2
> f <- function(x) {
    result <- -51.24*x^2 + 1028.84*x - 3863.2
    return(result)
  }
> # have R optimize the function
> optimize(f, lower=0, upper=20, maximum=TRUE)
$maximum
[1] 10.03942

$objective
[1] 1301.28
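Since the profit function is a downward-opening quadratic, its maximum can also be found analytically at price = -b / (2a), which gives a cross-check on the numerical result from `optimize`. The sketch below takes the slide's reduced sales equation and unit cost as given; the variable names are illustrative.

```r
# Values taken from the slide
intercept <- 772.64   # reduced sales equation: Sales = intercept + slope * price
slope     <- -51.24
unit_cost <- 5

# Expand (price - unit_cost) * (intercept + slope * price)
# into a * price^2 + b * price + c0
a  <- slope
b  <- intercept - slope * unit_cost
c0 <- -intercept * unit_cost

# Vertex of the parabola: the profit-maximizing price
p_star <- -b / (2 * a)
max_profit <- a * p_star^2 + b * p_star + c0

print(c(a = a, b = b, c0 = c0))
print(c(price = p_star, profit = max_profit))
```

The expanded coefficients match the slide (1028.84 and -3863.2), and the vertex agrees with the `optimize` output of roughly 10.04 for the price and 1301.28 for the profit.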
18
Predict Sales
> inputData <- data.frame(price=10, ad_type=1, price_apple=7.659, price_cookies=9.738)
> predict(sales.reg, inputData, interval="p")
       fit      lwr      upr
1 215.1978 176.0138 254.3817
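The `fit` value is the regression equation evaluated at the given inputs, which can be checked by hand. The sketch below uses the rounded coefficients from the regression summary, so it should agree with `predict` only up to rounding.

```r
# Coefficients copied (rounded) from the summary(sales.reg) output
coefs <- c("(Intercept)" = 774.813, price = -51.239, ad_type = 29.742,
           price_apple = 22.089, price_cookies = -25.277)

# Inputs from the slide, with a leading 1 for the intercept term
x <- c(1, price = 10, ad_type = 1, price_apple = 7.659, price_cookies = 9.738)

# Point prediction: sum of coefficient * input
fit_by_hand <- sum(coefs * x)
print(fit_by_hand)
```

In the `predict` call, `interval="p"` is matched to `"prediction"`, so `lwr` and `upr` bound a single future observation and are wider than a confidence interval for the mean would be.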
19
References
Analyze sales pricing: http://www.r-bloggers.com/data-analysis-for-marketing-research-with-r-language-1/