Causal Inference in R Ana Daglis, Farfetch x.


1 Causal Inference in R Ana Daglis, Farfetch x

2 Farfetch
[Figure: customers and boutiques]

3 One of the most common questions we face in marketing is how to measure incremental effects
How much incremental revenue did the new pricing strategy drive?
What impact did the new feature on the website have?
How many incremental conversions were achieved by increasing the commission rate for our affiliates?
…

4 The gold standard method for estimating causal effects is a randomised experiment
[Figure: 50% of visitors see Version A (10% conversion); 50% of visitors see Version B (15% conversion)]
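As a rough illustration of how such an experiment is read out, the sketch below compares the two conversion rates with base R's `prop.test`. The visitor counts are invented for the example (the slide gives only the percentages), so the numbers are an assumption, not results from a real test.

```r
# Hypothetical traffic: 1,000 visitors per version (counts are invented).
visitors    <- c(A = 1000, B = 1000)
conversions <- c(A = 100,  B = 150)   # 10% and 15% conversion, as on the slide

# Two-sample test for equality of proportions (stats package, base R).
test <- prop.test(x = conversions, n = visitors)

# Estimated uplift of Version B over Version A.
uplift <- unname(conversions["B"] / visitors["B"] -
                 conversions["A"] / visitors["A"])

print(uplift)
print(test$conf.int)   # 95% confidence interval for p_A - p_B
print(test$p.value)
```

Because visitors are assigned at random, the difference in conversion rates can be read as the causal effect of the change.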

5 However, A/B tests are often too expensive to run, or cannot be run at all, e.g. for legal reasons
[Figure: Version A and Version B; 100% of visitors see Version B (15% conversion)]

6 Example: financial performance of company A
[Figure: actual share price over time, with the point where the scandal broke marked]

7 Approach: estimate the share price had the scandal not happened
[Figure: actual and predicted share price, with the point where the scandal broke marked]

8 By comparing the actual and predicted share price, we can estimate the drop in stock value due to the scandal
[Figure: actual and predicted share price; the gap between them after the scandal broke is the drop in stock value due to the scandal]

9 Thanks to a fully Bayesian approach, we can quantify the uncertainty in our predictions
[Figure: actual share price, predicted share price and its 95% credible interval, with the point where the scandal broke marked]
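A minimal sketch of what the 95% credible interval means in practice: given posterior draws of the counterfactual price at one time point, the interval is simply the 2.5% and 97.5% quantiles of those draws. The draws and the observed price below are simulated stand-ins (an assumption for illustration); in a real analysis they would come from the fitted model.

```r
set.seed(42)

# Hypothetical posterior draws of the counterfactual share price at one
# time point (stand-ins for the model's MCMC output).
draws  <- rnorm(5000, mean = 100, sd = 4)
actual <- 85   # hypothetical observed price after the event

# Point prediction and 95% credible interval for the counterfactual.
point_prediction <- mean(draws)
credible_95 <- quantile(draws, probs = c(0.025, 0.975))

# Each draw of the counterfactual implies a draw of the effect.
effect_draws <- actual - draws
effect_95 <- quantile(effect_draws, probs = c(0.025, 0.975))

print(credible_95)
print(effect_95)
```

If the interval for the effect excludes zero, the data are hard to reconcile with "no effect" under the model.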

10 How do we construct the counterfactual estimate?
[Figure: training period before the scandal broke and prediction period after it; series shown are the actual share price, the predicted share price with its 95% credible interval, and the share prices of Company B and Company C]
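The train-then-predict idea can be sketched with a deliberately simplified stand-in: an ordinary linear regression on the control series instead of the Bayesian structural time series model used by Causal Impact. All series below are simulated, and the variable names (`company_a`, `company_b`, `company_c`, `event`) are hypothetical.

```r
set.seed(1)
n <- 100
event <- 71                                  # scandal breaks at time 71 (invented)

# Control series that are assumed not to be affected by the event.
company_b <- 50 + cumsum(rnorm(n))
company_c <- 80 + cumsum(rnorm(n))

# Affected series: tracks the controls, then drops by 5 after the event.
company_a <- 10 + 0.6 * company_b + 0.4 * company_c + rnorm(n, sd = 0.5)
company_a[event:n] <- company_a[event:n] - 5

prices <- data.frame(company_a, company_b, company_c)
train  <- prices[1:(event - 1), ]            # training period
post   <- prices[event:n, ]                  # prediction period

# Learn the relationship to the controls before the event ...
fit <- lm(company_a ~ company_b + company_c, data = train)

# ... and use it to predict what would have happened without the event.
predicted <- predict(fit, newdata = post)
estimated_drop <- mean(post$company_a - predicted)
print(estimated_drop)
```

The regression only illustrates the logic; it has no trend or seasonal component and does not give the Bayesian credible intervals described in the talk.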

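11 Causal Impact methodology is based on a Bayesian structural time series model
Most general form of the model:
$$y_t = Z_t^{\top} \alpha_t + \varepsilon_t \quad \text{(observation equation)}$$
$$\alpha_{t+1} = T_t \alpha_t + R_t \eta_t \quad \text{(state equation)}$$
Causal Impact model:
$$y_t = \mu_t + \tau_t + x_t^{\top} \beta + \varepsilon_t \quad \text{(observation equation)}$$
$$\mu_{t+1} = \mu_t + \delta_t + \eta_{\mu,t} \quad \text{(state equation)}$$
$$\delta_{t+1} = \delta_t + \eta_{\delta,t} \quad \text{(state equation)}$$
$$\tau_{t+1} = -\sum_{i=0}^{S-2} \tau_{t-i} + \eta_{\tau,t} \quad \text{(state equation)}$$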

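12 The model has 5 main parameters: 4 variance terms $\sigma_\varepsilon^2$, $\sigma_\mu^2$, $\sigma_\delta^2$, $\sigma_\tau^2$ and regression coefficients $\beta$
$$y_t = \mu_t + \tau_t + x_t^{\top} \beta + \varepsilon_t, \qquad \varepsilon_t \sim \mathcal{N}(0, \sigma_\varepsilon^2)$$
$$\mu_{t+1} = \mu_t + \delta_t + \eta_{\mu,t}, \qquad \eta_{\mu,t} \sim \mathcal{N}(0, \sigma_\mu^2)$$
$$\delta_{t+1} = \delta_t + \eta_{\delta,t}, \qquad \eta_{\delta,t} \sim \mathcal{N}(0, \sigma_\delta^2)$$
$$\tau_{t+1} = -\sum_{i=0}^{S-2} \tau_{t-i} + \eta_{\tau,t}, \qquad \eta_{\tau,t} \sim \mathcal{N}(0, \sigma_\tau^2)$$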
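To make the roles of the four variances and the regression coefficients concrete, here is a small base R simulation of the model as written on this slide. Every numeric value (series length, season length `S`, standard deviations, `beta`, initial seasonal states) is an arbitrary choice for illustration.

```r
set.seed(7)
n <- 120          # number of time points
S <- 7            # number of seasons (e.g. day of week)

# The four standard deviations and the regression coefficients.
sigma_eps <- 1; sigma_mu <- 0.1; sigma_delta <- 0.01; sigma_tau <- 0.05
beta <- c(0.8, -0.3)
x <- cbind(rnorm(n), rnorm(n))               # two simulated covariates

mu <- delta <- tau <- numeric(n)
tau[1:(S - 1)] <- c(2, -1, 0, 1, -2, 0.5)    # initial seasonal states

for (t in 1:(n - 1)) {
  mu[t + 1]    <- mu[t] + delta[t] + rnorm(1, 0, sigma_mu)   # local level
  delta[t + 1] <- delta[t] + rnorm(1, 0, sigma_delta)        # slope
  if (t >= S - 1) {
    # Seasonal state: minus the sum of the previous S - 1 seasonal states.
    tau[t + 1] <- -sum(tau[(t - S + 2):t]) + rnorm(1, 0, sigma_tau)
  }
}

# Observation equation: trend + seasonality + regression + noise.
y <- mu + tau + as.vector(x %*% beta) + rnorm(n, 0, sigma_eps)
```

With small state variances, the trend and seasonal pattern change slowly and most of the short-term variation in `y` comes from the covariates and the observation noise.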

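13 We impose an inv-gamma prior on $\sigma_\varepsilon^2$, with parameters $s_\varepsilon$ and $v_\varepsilon$ selected based on the expected goodness-of-fit
$$y_t = \mu_t + \tau_t + x_t^{\top} \beta + \varepsilon_t, \qquad \varepsilon_t \sim \mathcal{N}(0, \sigma_\varepsilon^2)$$
$$\mu_{t+1} = \mu_t + \delta_t + \eta_{\mu,t}, \qquad \eta_{\mu,t} \sim \mathcal{N}(0, \sigma_\mu^2)$$
$$\delta_{t+1} = \delta_t + \eta_{\delta,t}, \qquad \eta_{\delta,t} \sim \mathcal{N}(0, \sigma_\delta^2)$$
$$\tau_{t+1} = -\sum_{i=0}^{S-2} \tau_{t-i} + \eta_{\tau,t}, \qquad \eta_{\tau,t} \sim \mathcal{N}(0, \sigma_\tau^2)$$
Priors:
$$\sigma_\varepsilon^2 \sim \text{Inv-Gamma}(s_\varepsilon, v_\varepsilon)$$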

14 We impose weak priors on $\sigma_\mu^2$, $\sigma_\delta^2$ and $\sigma_\tau^2$, reflecting the assumption that errors are small in the state process
$$y_t = \mu_t + \tau_t + x_t^{\top} \beta + \varepsilon_t, \qquad \varepsilon_t \sim \mathcal{N}(0, \sigma_\varepsilon^2)$$
$$\mu_{t+1} = \mu_t + \delta_t + \eta_{\mu,t}, \qquad \eta_{\mu,t} \sim \mathcal{N}(0, \sigma_\mu^2)$$
$$\delta_{t+1} = \delta_t + \eta_{\delta,t}, \qquad \eta_{\delta,t} \sim \mathcal{N}(0, \sigma_\delta^2)$$
$$\tau_{t+1} = -\sum_{i=0}^{S-2} \tau_{t-i} + \eta_{\tau,t}, \qquad \eta_{\tau,t} \sim \mathcal{N}(0, \sigma_\tau^2)$$
Priors:
$$\sigma_\mu^2, \sigma_\delta^2, \sigma_\tau^2 \sim \text{Inv-Gamma}(1, 0.01 \times \mathrm{Var}(y))$$
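To get a feel for this prior, one can draw from it in base R. The sketch assumes the two arguments of Inv-Gamma are a shape of 1 and a scale of 0.01 × Var(y); the slide does not state the parameterisation, so this reading is an assumption, and the series `y` is simulated.

```r
set.seed(3)
y <- cumsum(rnorm(200))          # hypothetical observed series
scale <- 0.01 * var(y)           # assumed scale parameter of the prior

# If G ~ Gamma(shape = a, rate = b), then 1 / G ~ Inv-Gamma(shape = a, scale = b).
sigma2_draws <- 1 / rgamma(10000, shape = 1, rate = scale)

# Typical size of a state error under the prior, relative to sd(y).
prior_median <- median(sigma2_draws)
print(sqrt(prior_median) / sd(y))
```

Under this reading the prior median of each state variance is a small fraction of the variance of `y`, while the heavy right tail of the inverse-gamma still allows larger values if the data call for them.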

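15 We let the model choose an appropriate set of controls by placing a spike and slab prior over coefficients $\beta$
$$y_t = \mu_t + \tau_t + x_t^{\top} \beta + \varepsilon_t, \qquad \varepsilon_t \sim \mathcal{N}(0, \sigma_\varepsilon^2)$$
$$\mu_{t+1} = \mu_t + \delta_t + \eta_{\mu,t}, \qquad \eta_{\mu,t} \sim \mathcal{N}(0, \sigma_\mu^2)$$
$$\delta_{t+1} = \delta_t + \eta_{\delta,t}, \qquad \eta_{\delta,t} \sim \mathcal{N}(0, \sigma_\delta^2)$$
$$\tau_{t+1} = -\sum_{i=0}^{S-2} \tau_{t-i} + \eta_{\tau,t}, \qquad \eta_{\tau,t} \sim \mathcal{N}(0, \sigma_\tau^2)$$
Priors, where $\varrho_j = 1$ if covariate $j$ is included in the model and $\varrho_j = 0$ otherwise:
$$p(\varrho) = \prod_{j=1}^{J} \pi_j^{\varrho_j} (1 - \pi_j)^{1 - \varrho_j}$$
$$\beta_\varrho \mid \sigma_\varepsilon^2 \sim \mathcal{N}\left(0, \; n \sigma_\varepsilon^2 (X^{\top} X)^{-1}\right)$$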
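A single draw from a spike and slab prior of this form can be sketched in base R. The inclusion probabilities, the fixed observation variance and the design matrix are all invented, and the slab covariance is computed on the included columns of X only, which is an assumption about how the formula on the slide is applied.

```r
set.seed(11)
n <- 50                        # number of observations
J <- 4                         # number of candidate covariates
X <- matrix(rnorm(n * J), n, J)
pi_j <- rep(0.5, J)            # prior inclusion probabilities (assumed)
sigma2_eps <- 1                # observation variance, fixed for illustration

# Spike: inclusion indicators rho_j ~ Bernoulli(pi_j).
rho <- rbinom(J, size = 1, prob = pi_j)

# Slab: included coefficients ~ N(0, n * sigma2_eps * (X'X)^{-1}).
beta <- numeric(J)             # excluded coefficients stay exactly zero
included <- which(rho == 1)
if (length(included) > 0) {
  X_inc <- X[, included, drop = FALSE]
  slab_cov <- n * sigma2_eps * solve(crossprod(X_inc))
  # chol() returns upper-triangular U with t(U) %*% U equal to slab_cov,
  # so t(U) %*% z has covariance slab_cov for standard normal z.
  beta[included] <- as.vector(t(chol(slab_cov)) %*% rnorm(length(included)))
}

print(rho)
print(beta)
```

The point mass at zero (the spike) is what lets the posterior drop controls that do not help predict the response.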

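16 The inference can be performed in R with just 6 lines of code
library(CausalImpact)
pre.period <- as.Date(c(" ", " "))    # start and end of the pre-event period
post.period <- as.Date(c(" ", " "))   # start and end of the post-event period
impact <- CausalImpact(data, pre.period, post.period)   # data: response in the first column, controls in the rest
plot(impact)
summary(impact)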
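For a version that runs end to end, the sketch below fills in simulated data along the lines of the example in the CausalImpact package documentation. The series, the dates and the size of the effect are all made up; they are not the data from the talk. It assumes the CausalImpact and zoo packages are installed.

```r
library(CausalImpact)

set.seed(1)
# Simulated control series and a response that depends on it.
x1 <- 100 + arima.sim(model = list(ar = 0.999), n = 100)
y  <- 1.2 * x1 + rnorm(100)
y[71:100] <- y[71:100] + 10            # simulated effect of +10 from point 71

# Attach daily dates; the response must be the first column.
time.points <- seq.Date(as.Date("2014-01-01"), by = 1, length.out = 100)
data <- zoo::zoo(cbind(y, x1), time.points)

# Points 1-70 form the pre-period, points 71-100 the post-period.
pre.period  <- as.Date(c("2014-01-01", "2014-03-11"))
post.period <- as.Date(c("2014-03-12", "2014-04-10"))

impact <- CausalImpact(data, pre.period, post.period)
summary(impact)
```

The pre-period is used to learn the relationship between the response and the controls; the post-period is where the counterfactual is predicted and compared with the observed values.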

17 Results can be plotted and summarised in a table
The cumulative panel only makes sense when the metric is additive, such as clicks or the number of orders; it is not meaningful for a metric such as a share price.

18 The package can even write a report for you!
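Both the choice of plot panels and the written report are available from the fitted object. The sketch below uses simulated data (invented, not from the talk) and assumes the CausalImpact package is installed; it plots only the original and pointwise panels, which is one way to leave out the cumulative panel for non-additive metrics, and then prints the verbal report.

```r
library(CausalImpact)

set.seed(1)
# Simulated control series and response with an effect from point 71 onwards.
x1 <- 100 + arima.sim(model = list(ar = 0.999), n = 100)
y  <- 1.2 * x1 + rnorm(100)
y[71:100] <- y[71:100] + 10

impact <- CausalImpact(cbind(y, x1),
                       pre.period = c(1, 70),
                       post.period = c(71, 100))

# Plot the original and pointwise panels only, omitting the cumulative panel.
p <- plot(impact, c("original", "pointwise"))
print(p)

# Summary table, followed by the automatically written report.
summary(impact)
summary(impact, "report")
```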

19 Additional considerations
It is important that the covariates included in the model are not themselves affected by the event. For each covariate included, it is critical to reason about why this is the case.
The model can be validated by running the Causal Impact analysis on an 'imaginary event' before the actual event. We should not see any significant effect, and the actual and predicted lines should match reasonably closely before the actual event.
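The 'imaginary event' check can be sketched as follows, again on simulated data and assuming the CausalImpact package is installed. The real event is placed at point 71 and the imaginary one at point 51; both positions are arbitrary choices for the example.

```r
library(CausalImpact)

set.seed(1)
x1 <- 100 + arima.sim(model = list(ar = 0.999), n = 100)
y  <- 1.2 * x1 + rnorm(100)
y[71:100] <- y[71:100] + 10            # the real event happens at point 71

# Keep only the data before the real event and pretend that an event
# occurred at point 51.
pre_event <- cbind(y, x1)[1:70, ]
placebo <- CausalImpact(pre_event,
                        pre.period = c(1, 50),
                        post.period = c(51, 70))

# For a well-behaved model the estimated effect should be close to zero,
# with an interval that covers zero.
s <- placebo$summary["Average", ]
print(s[, c("AbsEffect", "AbsEffect.lower", "AbsEffect.upper", "p")])
```

A clear "effect" at the imaginary event would suggest that the controls or the model do not track the response well enough to trust the real analysis.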

20 References
K. H. Brodersen, F. Gallusser, J. Koehler, N. Remy, S. L. Scott (2015). Inferring Causal Impact Using Bayesian Structural Time-Series Models.
S. L. Scott, H. Varian (2013). Predicting the Present with Bayesian Structural Time Series. present-with-bsts.pdf

21 Thank you!

