1
Rethinking Our Models or… ‘How Not to Kill Anyone with Regression Analysis’ IoF Insight SiG Conference 7th November 2016 Stuart McCoy, Data Strategy Consultant DM Insight
2
Modelling Relationships
Linear and Multiple Regression, CHAID, fuzzy models, clustering, etc.
Regression analysis is a widely used statistical technique.
Identify and model relationships between variables.
The model produced predicts a variable of interest, e.g. response.
Statistical modelling requires special attention from the analyst. Each process step – from model specification and data collection, to model building and model validation, to interpreting the developed model – needs to be carefully examined and executed. A small mistake in any of these steps may lead to an erroneous model.
Statistical modelling, even in the right hands, can be dangerous.
3
Avoid Killing Anyone with Statistical Analysis
4
Statistics vs. Knowledge of Fundraising
In 99% of cases, don’t rely on the statistics alone. Use your intelligence and the charity’s vast experience of fundraising.
Ask:
What is the model telling you, and why?
What is the model not telling you that you expected it to tell you?
Do we need more than one model to answer the same question?
5
Channels
6
The Most Important Question is…
…‘What are we Modelling?’
7
What are we Modelling? ‘Who is most likely to respond?’
…or could we model: ‘Who is going to respond and become profitable?’
…or: ‘Who is going to respond and become a long-term loyal supporter?’
Is value also important? (Some propositions generate a higher response but at a lower average value.)
Asking the right question is half the battle.
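The difference between modelling response and modelling value can be sketched with a toy ranking. This is only an illustration: the three supporters, their response probabilities and their average gifts are made-up figures, and "expected value" here is simply probability of response multiplied by average gift.

```python
# Hypothetical supporters: all figures are invented for illustration.
supporters = [
    {"id": "A", "p_response": 0.10, "avg_gift": 8.0},
    {"id": "B", "p_response": 0.04, "avg_gift": 40.0},
    {"id": "C", "p_response": 0.07, "avg_gift": 15.0},
]

def expected_value(s):
    """Expected income per contact: chance of responding x average gift."""
    return s["p_response"] * s["avg_gift"]

# Ranking 1: 'Who is most likely to respond?'
by_response = [s["id"] for s in
               sorted(supporters, key=lambda s: s["p_response"], reverse=True)]

# Ranking 2: 'Who is going to respond and be worth the most?'
by_value = [s["id"] for s in
            sorted(supporters, key=expected_value, reverse=True)]

print("Ranked by response:      ", by_response)
print("Ranked by expected value:", by_value)
```

In this toy case the two questions produce opposite orderings, which is why the question being modelled needs to be agreed before any model is built.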
8
The 2nd Most Important Question is…
…‘So What?’
9
Application
Certain variables may correlate highly with the dependent variable, but ‘so what?’
There may be historical reasons for this, e.g. regionally localised fundraising historically, or there may be little we can do with a variable such as gender.
What is it that we are modelling?
In what circumstances might we not need a model? Raffle Model example (also commonly legacy models for small charities)
What’s the strategic goal? (NB: it may not always be maximising income)
Other ways of segmenting: historical outbound comms.
‘Tipping Point’ analysis
10
Requirements Gathering
11
Model Misspecification
Statistical analysis is used to identify a relationship between two or more variables and to enable prediction (of response, value, engagement, affluence, etc.).
The first step is to specify the model, that is, to define the dependent and explanatory variables. It is easy to commit a common mistake here: misspecification of the model.
Model misspecification means that not all of the relevant predictors are considered, or that the model is fitted without one or more significant predictors.
Just because a regression analysis indicates a strong relationship between two variables, they are not necessarily functionally related.
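A minimal sketch of what leaving out a relevant predictor can do, using simulated data. Everything here is an assumption for illustration: the variable names (tenure, engagement, value), the true coefficients (1.0 and 2.0) and the strength of the link between the two predictors are invented, and the regressions are fitted with hand-written least-squares formulas rather than a statistics package.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def slope_one(x, y):
    """Least-squares slope of y on a single predictor x."""
    return cov(x, y) / cov(x, x)

def slopes_two(x1, x2, y):
    """Least-squares slopes of y on two predictors (closed form)."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    s1y, s2y = cov(x1, y), cov(x2, y)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

rng = random.Random(0)
n = 5000
# Hypothetical standardised predictors: engagement is partly driven by tenure.
tenure = [rng.gauss(0, 1) for _ in range(n)]
engagement = [0.8 * t + rng.gauss(0, 0.6) for t in tenure]
# True relationship used to simulate value: 1.0 * tenure + 2.0 * engagement.
value = [1.0 * t + 2.0 * e + rng.gauss(0, 1)
         for t, e in zip(tenure, engagement)]

# Misspecified model: engagement is left out, so tenure absorbs its effect.
misspecified = slope_one(tenure, value)
# Fuller model: both predictors included.
b_tenure, b_engagement = slopes_two(tenure, engagement, value)

print(f"tenure coefficient, engagement omitted:  {misspecified:.2f}")
print(f"tenure coefficient, engagement included: {b_tenure:.2f}")
print(f"engagement coefficient:                  {b_engagement:.2f}")
```

With engagement omitted, the tenure coefficient comes out at roughly 2.6 rather than the true 1.0, because tenure is picking up the effect of the missing variable. The model still fits, but it tells the wrong story about what drives value.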
12
Drivers
13
Drivers or Explanatory Variables
Response, Value, Profitability, Engagement, Affluence?
Demographics: Age, Acorn/Mosaic/Sonar, Census, Gender, Geography
Behaviour: Transactions, non-financial actions, in-bound comms, complaints
Attitudes: Survey response, modelled attitudes
Communications: Received a welcome call
14
Descriptors vs. Selection Variables
Leave out ‘Recency’ in legacy models? Age? In Mem, Miss aged 50+, previous enquirers, considerers, intenders
15
Correlation
16
Causation or Correlation
Various statistical techniques will show you whether two variables are correlated, but they are unable to tell you whether:
causality exists, and
if true causality does exist, which variable is the explanatory variable.
Spurious correlation: a mathematical relationship in which two variables have no direct causal connection, yet it may be wrongly inferred that they do, due to either coincidence or the presence of a third, unseen factor (referred to as a "confounding factor" or "lurking variable").
Looking at UK rates of diagnosis of heart disease in Scotland and Northern regions. Increased significantly in recent years but…
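The lurking-variable case can be sketched with simulated data. This is an illustration under stated assumptions, not an analysis of any real dataset: the variables a, b and lurking are invented, a and b have no direct link to each other, and the partial correlation formula is used as one simple way of "controlling for" the third factor.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    syy = sum((q - my) ** 2 for q in y)
    return sxy / (sxx * syy) ** 0.5

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / ((1 - rxz ** 2) * (1 - ryz ** 2)) ** 0.5

rng = random.Random(7)
n = 5000
# Hypothetical unseen factor that drives both observed variables.
lurking = [rng.gauss(0, 1) for _ in range(n)]
# a and b each depend on the lurking factor, but not on each other.
a = [z + rng.gauss(0, 0.5) for z in lurking]
b = [z + rng.gauss(0, 0.5) for z in lurking]

raw = pearson(a, b)
controlled = partial_corr(a, b, lurking)

print(f"correlation of a and b:                      {raw:.2f}")
print(f"correlation after controlling for the factor: {controlled:.2f}")
```

The raw correlation is strong (around 0.8) even though neither variable influences the other; once the lurking factor is accounted for, it falls to roughly zero. In practice the third factor is often not in the data at all, which is where knowledge of fundraising has to fill the gap.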
20
Autocorrelation
Who has propensity models? How many descriptive variables does each have?
Multicollinearity exists when two or more of the predictors in a regression model are moderately or highly correlated. Unfortunately, when it exists, it can wreak havoc on our analysis and thereby limit the insight and conclusions we can draw, e.g. Tenure, LTV, Age heavily correlated.
Get the equation, i.e. the model.
If anyone builds you a model with 20+ variables, send it back. Lots of variables will already have been totally explained away by the first 7 or 8.
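One quick way to look for heavily correlated predictors before building a model is to check every pair. The sketch below is an assumption-laden illustration: the four variables are simulated, the 0.7 cut-off is an arbitrary rule of thumb chosen for this example, and the variance inflation factor shown is the simple two-predictor version, 1 / (1 - r²), not the full VIF a statistics package would compute against all other predictors.

```python
import random
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    syy = sum((q - my) ** 2 for q in y)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(1)
n = 2000
# Hypothetical supporter file: LTV and age both grow with tenure,
# while the survey score is unrelated to the other three.
tenure = [rng.uniform(0, 20) for _ in range(n)]
predictors = {
    "tenure_years": tenure,
    "lifetime_value": [60 * t + rng.gauss(0, 150) for t in tenure],
    "age": [40 + 1.5 * t + rng.gauss(0, 6) for t in tenure],
    "survey_score": [rng.gauss(5, 2) for _ in range(n)],
}

THRESHOLD = 0.7  # illustrative cut-off, not a standard
flagged = []
for name_a, name_b in combinations(predictors, 2):
    r = pearson(predictors[name_a], predictors[name_b])
    pairwise_vif = 1 / (1 - r * r)  # two-predictor VIF only
    if abs(r) >= THRESHOLD:
        flagged.append((name_a, name_b))
    print(f"{name_a:15s} {name_b:15s} r={r:+.2f}  VIF={pairwise_vif:.1f}")

print("Heavily correlated pairs:", flagged)
```

In this simulated file all three pairs among tenure, lifetime value and age are flagged, so putting all three into one model adds little beyond the first.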
21
Common Misapplication
(i) why only ever contacting your 'best' supporters and prospects via propensity models spells the death of best practice CRM (and the optimisation of income and supporter engagement)
(ii) when models misinform rather than inform the future, e.g. when the profile of the supporter pool you are modelling has changed in recent years (this is becoming more common, especially with legacy supporters)
(iii) when successive model refreshes and rebuilds become a self-fulfilling prophecy / modelling past strategy?
(iv) when 'good' statistical models miss good prospects
(v) data issues: autocorrelation, insufficient volumes, drivers of response that are not evident in the supporter pool, useful data enhancements for data modelling
22
Common Misapplication
(vi) when descriptive variables are better used in the criteria that define a contact segment rather than actually sitting in the model itself
(vii) being mindful of response channels, e.g. a supporter might be modelled as 'highly responsive' when in fact they are only highly responsive to, say, DM rather than TMK. One client used a direct mail raffles model as a targeting tool for raffles telemarketing and lost a lot of money. A focus on the product, not the person!
(viii) variables that are better left out of certain types of propensity models
(ix) when the organisation is now capturing items of data that it didn't use to capture (or hardly used to), such as proximity to cause, opens/click-throughs, questionnaire response, even some external data overlays, these variables will not be evident in significant volumes across the cohort of supporters that you're basing your model on, e.g. existing pledgers.
Non-linear relationships: decreasing marginal utility
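The point about non-linear relationships can be shown with a toy curve. This sketch assumes, purely for illustration, an engagement score that grows with the logarithm of the number of previous gifts, so each extra gift adds less than the one before; the variable names and the choice of a log curve are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    syy = sum((q - my) ** 2 for q in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical data: 1 to 100 previous gifts, with a score that rises
# quickly at first and then flattens out (decreasing marginal utility).
gifts = list(range(1, 101))
engagement = [math.log1p(g) for g in gifts]

# Extra engagement from the 2nd gift versus from the 100th gift.
gain_early = engagement[1] - engagement[0]
gain_late = engagement[99] - engagement[98]

# A straight-line fit on the raw count understates the relationship;
# transforming the predictor first captures it.
r_raw = pearson(gifts, engagement)
r_log = pearson([math.log1p(g) for g in gifts], engagement)

print(f"gain from 2nd gift: {gain_early:.3f}, from 100th gift: {gain_late:.3f}")
print(f"correlation with raw gift count: {r_raw:.3f}")
print(f"correlation with log gift count: {r_log:.3f}")
```

A linear model fed the raw count would treat the 100th gift as being worth as much as the 2nd. Transforming or banding the variable is one common way of handling that.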
23
Testing
24
Gains Chart Back testing
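The numbers behind a gains chart can be sketched as follows. This is a minimal illustration with simulated data: the scores stand in for a model's output on a past campaign (the back test), the response rule is invented, and cumulative_gains is a hypothetical helper written for this example rather than a function from any library.

```python
import random

def cumulative_gains(scores, responded, bins=10):
    """Share of all responders captured in the top 10%, 20%, ... of the
    file when supporters are ranked from highest to lowest score."""
    ranked = [r for _, r in sorted(zip(scores, responded),
                                   key=lambda pair: pair[0], reverse=True)]
    total = sum(ranked)
    n = len(ranked)
    gains = []
    for k in range(1, bins + 1):
        cutoff = round(n * k / bins)
        gains.append(sum(ranked[:cutoff]) / total)
    return gains

rng = random.Random(42)
n = 10000
# Hypothetical back test: model scores for a past mailing, plus whether
# each supporter actually responded (higher scores respond more often).
scores = [rng.random() for _ in range(n)]
responded = [1 if rng.random() < 0.3 * s * s else 0 for s in scores]

gains = cumulative_gains(scores, responded)
for decile, share in enumerate(gains, start=1):
    print(f"top {decile * 10:3d}% of file -> {share:.0%} of responders")
```

A model no better than random selection would capture about 10% of responders in each decile. The further the early deciles sit above that line, the more the model is adding.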
25
Response through the Percentiles
26
Uncovering New Supporters
27
Measuring Performance
If you’re not measuring it, you’re not managing it...
28
Profiling What do you know about your supporters?
Use as a basis for targeting and segmentation
Informs engagement
Demographics
Behaviour
29
High Net Worth Individuals
Charitable Giving High Net Worth Banks Also Acorn, Mosaic, etc.
30
Measuring Engagement Prospect profiling Flat models
X-tab for strategic decision making
31
Strategy Matrix So we start with three basic questions.
32
Engagement & Strategic Insight