Exit-poll analysis and prediction Stephen Fisher Following: Curtice, John and David Firth (2008) Exit polling in a cold climate: the BBC-ITV experience.

Presentation transcript:

Exit-poll analysis and prediction
Stephen Fisher
Following: Curtice, John and David Firth (2008) Exit polling in a cold climate: the BBC-ITV experience in Britain 2005, J. R. Stat. Soc. A, 171.
Presentation to the NCRM-BPC Opinion Polls Conference, British Academy, 20th January 2010

Broad research design principles: 1
Model the pattern of change across constituencies in the share of the vote since the last election
– i.e. estimate the change since 2005 rather than the 2010 result directly, because shares vary more between constituencies than changes do.
– Also assess, and allow for, different swings in different seats.

Broad research design principles: 2
Estimate probabilities for each party winning each seat
– i.e. allow for the random/unexplained variation between constituencies in the prediction
Predicted number of seats for each party is the sum of the probabilities across constituencies
Primary aim is to predict seat totals, not share of the vote
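
The seat-total rule above can be written compactly. The notation here is an assumption introduced for illustration and does not appear on the slides: \hat{\pi}_{pc} is the estimated probability that party p wins constituency c, C is the number of constituencies, and \hat{S}_p is the predicted seat total for party p.

```latex
\[
\hat{S}_p \;=\; \sum_{c=1}^{C} \hat{\pi}_{pc},
\qquad
\sum_{p} \hat{\pi}_{pc} = 1 \quad \text{for every constituency } c .
\]
```

Because the probabilities in each seat sum to one, the predicted totals across parties sum to C, so no seat is counted twice or dropped.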

Infrastructure
2010 exit poll will be a joint BBC/ITN/Sky project
Fieldwork by MORI and NOP, as in 2005
People contributing to the analysis: Jouni Kuha (LSE), John Curtice (Strathclyde), Clive Payne (Oxford), Rob Ford (Manchester)
Debt of gratitude, and computer code, owed to David Firth

Selection of constituencies
Revisit all 107 viable locations from 2005
Top up to 130 by sampling the kinds of constituencies thought to be useful and currently under-represented.
– e.g. new decision to explicitly attempt to have a group of Lab-LD seats
Pick the most representative polling station in the constituency for the new locations

Exit poll locations in 2005

Statistical analysis
Model the change in the share of the vote since the last election
Consider data with and without interviewer guesses for those who refused
– In 2005 ignoring the guesses and refusals worked best
Consider lots of different predictor variables
– E.g. census data, market research data, strategic situation, incumbency
– Expectations of geographical variation informed by practice with pre-election day polls
But keep the final model simple (N=130)
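
A minimal sketch of the kind of simple model described above: the change in a party's share regressed on a single predictor. It assumes ordinary least squares with one explanatory variable. The data values, the variable names and the helper `fit_line` are hypothetical and are not taken from the exit-poll analysis itself.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (intercept, slope, resid_sd), where resid_sd is the residual
    standard deviation, i.e. an estimate of the unexplained variation.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # n - 2 degrees of freedom: two parameters were estimated.
    resid_sd = (sum(r * r for r in residuals) / (n - 2)) ** 0.5
    return intercept, slope, resid_sd


# Hypothetical data: one predictor (for example a census percentage) and the
# observed change in a party's share at five sampled polling stations.
predictor = [10.0, 20.0, 30.0, 40.0, 50.0]
change_in_share = [2.0, 3.5, 4.0, 6.5, 7.0]

intercept, slope, resid_sd = fit_line(predictor, change_in_share)
print(f"intercept={intercept:.2f} slope={slope:.3f} resid_sd={resid_sd:.2f}")
```

The residual standard deviation matters as much as the coefficients here, because it is the input a probabilistic seat prediction needs.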

Producing predictions
Generate predicted probabilities for each party winning each seat from the statistical regression models of the data.
– Using estimates of both explained and unexplained variance.
Sum the probabilities for each party across constituencies to estimate the total number of seats for the party.
In 2005 the exit-poll data suggested a Lab majority of 100 under uniform change, but the method accurately predicted a majority of 66
– Allowing for explained variation (the regression) and for unexplained variation (probabilistic prediction) each accounted for about half of the difference between the uniform-change figure and the final prediction.
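
A minimal sketch of turning a predicted result into a win probability and then into a seat total. It assumes a two-party contest and normally distributed unexplained variation, which is a simplification chosen for illustration, not the multi-party method used for the exit poll. The predicted leads and `resid_sd` are hypothetical values.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def win_probability(predicted_lead, resid_sd):
    """P(actual lead > 0) if actual lead ~ Normal(predicted_lead, resid_sd)."""
    return normal_cdf(predicted_lead / resid_sd)


# Hypothetical predicted leads (percentage points) for one party in five seats.
predicted_leads = [8.0, 3.0, 0.5, -2.0, -9.0]
resid_sd = 3.0  # hypothetical unexplained variation, in percentage points

probs = [win_probability(lead, resid_sd) for lead in predicted_leads]

# Probabilistic seat total: sum of the win probabilities.
expected_seats = sum(probs)

# Deterministic alternative: count the seats where the party is ahead.
seats_if_no_noise = sum(1 for lead in predicted_leads if lead > 0)

print(f"expected seats = {expected_seats:.2f}, point count = {seats_if_no_noise}")
```

The two totals differ because seats with small predicted leads contribute only part of a seat, while seats that are narrowly behind contribute a little.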

Probabilistic prediction compared with the swingometer
What would be the effect of allowing for unexplained variation in a swingometer estimate of the result?
– Depends on the distribution of seats according to marginality.
– Smooths the relationship between predicted swing and predicted seats.

Need for probabilistic prediction in 2005

Marginal Lab-Con seats for 2010
E.g. the probabilistic method would predict fewer Con seats from a 7% swing than the swingometer, because it would allow for the seats immediately either side of the 7% point to split between Con and Lab, and there are more to the left than right.
– But not much difference.
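
A minimal sketch of the asymmetry argument, assuming normally distributed noise around the swing each seat requires. The list of required swings and the noise standard deviation are hypothetical, chosen so that more seats lie just to the left of the 7% point than to the right.

```python
from statistics import NormalDist


def swingometer_gains(swings_needed, swing):
    """Deterministic count: every seat needing less than the swing changes hands."""
    return sum(1 for need in swings_needed if need < swing)


def probabilistic_gains(swings_needed, swing, sd):
    """Expected gains when each seat's outcome carries Normal(0, sd) noise."""
    noise = NormalDist(0.0, sd)
    return sum(noise.cdf(swing - need) for need in swings_needed)


# Hypothetical swings (percentage points) needed for Con to take each seat:
# four seats just below the 7% point, two just above it.
swings_needed = [6.0, 6.2, 6.5, 6.8, 7.5, 8.0]

deterministic = swingometer_gains(swings_needed, 7.0)
probabilistic = probabilistic_gains(swings_needed, 7.0, sd=1.5)

print(f"swingometer: {deterministic} gains, probabilistic: {probabilistic:.2f} gains")
```

With more seats just below the swing point than just above it, the probabilistic total falls short of the swingometer count, and the gap stays small when the imbalance is small.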

Simulation 1: Stability of notional 2005 results
Rerun 2005 notional results but adding noise to allow probabilistic results
– Seats with a 2005 margin <1% become 50:50
– Seats with a 2005 margin c.4% become 90:10
Changes the seat totals from Con 210, Lab 348 to Con 212, Lab 346 with LD unchanged.
– i.e. not much change
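
One way to reproduce noise of roughly this size, sketched under an assumption: the noise on the 2005 percentage margin is normal with mean zero, and its standard deviation is calibrated so that a 4-point margin gives a 90:10 split. The slides do not state the noise model actually used, so `noise_sd` and `hold_probability` are illustrative only.

```python
from statistics import NormalDist

# Assumption: noise on the 2005 margin is Normal(0, noise_sd).
# Calibrate noise_sd so that a 4-point margin gives a 90:10 split.
z_90 = NormalDist().inv_cdf(0.90)
noise_sd = 4.0 / z_90


def hold_probability(margin_2005, sd=noise_sd):
    """Probability that the 2005 winner still leads after adding the noise."""
    return NormalDist(0.0, sd).cdf(margin_2005)


for margin in (0.0, 0.5, 1.0, 4.0, 10.0):
    print(f"margin {margin:4.1f}% -> hold probability {hold_probability(margin):.2f}")
```

Under this assumption the calibrated standard deviation is a little over 3 points, and margins under 1% give hold probabilities between 0.5 and about 0.63, which matches the 50:50 figure only approximately.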

Simulation 2: Poll projection
ukpollingreport.co.uk average of polls:
– Con 41, Lab 29, LD 18
– i.e. +8, -7, -5 since 2005
Uniform swing: Con majority of 44
Probabilistic projection: Con majority of 48
Very little difference
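
A minimal sketch of the uniform-change arithmetic behind the first projection. The national changes are the ones quoted above; the constituency shares and the helper `apply_uniform_change` are hypothetical.

```python
def apply_uniform_change(shares_2005, national_change):
    """Add the national change in each party's share to the seat's 2005 shares."""
    return {
        party: share + national_change.get(party, 0.0)
        for party, share in shares_2005.items()
    }


# National changes since 2005 from the poll average (percentage points).
national_change = {"Con": 8.0, "Lab": -7.0, "LD": -5.0}

# Conventional two-party swing from Lab to Con: half the gap between the changes.
con_lab_swing = (national_change["Con"] - national_change["Lab"]) / 2

# Hypothetical seat: Lab led Con by 10 points in 2005, so a 5-point swing takes it.
seat_2005 = {"Lab": 45.0, "Con": 35.0, "LD": 20.0}
projected = apply_uniform_change(seat_2005, national_change)
winner = max(projected, key=projected.get)

print(f"Lab-to-Con swing: {con_lab_swing} points; projected winner: {winner}")
```

Uniform change assigns every seat to a single winner. The probabilistic projection would instead give each party a win probability in each seat and sum those probabilities.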

Simulation 2: Seats in the balance for the Tories under the simulation
constituency                        wp05  sp05  pctmaj05  conprw  labprw  ldprw
Dagenham & Rainham                  LAB   CON
Erewash                             LAB   CON
Norwich South                       LAB   LD
Bath                                LD    CON
Leeds North East                    LAB   CON
Crewe & Nantwich                    LAB   CON
Ochil & South Perthshire            LAB   SNP
Oxford West & Abingdon              LD    CON
Newport West                        LAB   CON
Warwickshire North                  LAB   CON
Hampstead & Kilburn                 LAB   LD
Coventry South                      LAB   CON
Dorset Mid & Poole North            LD    CON
Argyll & Bute                       LD    CON
Telford                             LAB   CON
Berwickshire, Roxburgh & Selkirk    LD    CON
Winchester                          LD    CON
Luton South                         LAB   CON
Brighton Pavilion                   LAB   CON
St. Austell & Newquay               LD    CON

Simulation 2 – Distribution of Predicted Probabilities

Particular difficulties for exit poll prediction in 2010
Boundary changes
– Possible errors in both dependent and explanatory variables
Expenses scandal
– More MPs stepping down
– Potentially more variance between constituencies
Census data old (2001)
High expectations from 2005!