Modelling voter preferences: a multilevel, longitudinal approach Dr. Edward Fieldhouse, Jerry Johnson, Prof. Andrew Pickles, Dr. Kingsley Purdam, Nick.


Modelling voter preferences: a multilevel, longitudinal approach. Dr. Edward Fieldhouse, Jerry Johnson, Prof. Andrew Pickles, Dr. Kingsley Purdam, Nick Shryane. Cathie Marsh Centre for Census and Survey Research, University of Manchester, UK

Some limitations in modelling voter preferences: dichotomous response models, ‘minor parties’ and non-voting; handling the complexity of voter preferences and party positions in ideological space; the assumption of Independence of Irrelevant Alternatives; contextual influences on voting.

A Simplified conceptual model

Data and methods: British Election Panel Study, eight waves; information on preferences, voting, rankings of parties, left-right placement and tactical voting; multilevel design (occasion/person/location); Random Utility Models; Generalised Linear Latent and Mixed Models.

Some assumptions of party identification: stable, even when vote switching takes place; enduring, across several consecutive years; resilient to ephemeral political events; relevant to only a small proportion of the electorate.

BEPS 2001 Party ID

Party ID and vote

Party ID and SoF

Random Utility Models: U, the subjective value of a choice (i.e. its utility), is modelled as comprising two parts: V, the measured characteristics of the chooser or choice alternative (e.g. age, cost), and $\varepsilon$, a random component representing unmeasured idiosyncrasies, so that $U = V + \varepsilon$.

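Binary choice: There will be a utility associated with each choice alternative. For example, with two alternatives: $U_1 = V_1 + \varepsilon_1$ and $U_0 = V_0 + \varepsilon_0$. Utility maximisation: alternative 1 will be chosen if $U_1 > U_0$, or equivalently if $\varepsilon_1 - \varepsilon_0 > -(V_1 - V_0)$.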

If  1 and  0 have type-1 extreme value (Gumbel) distributions then  1 -  0 has a logistic distribution, and therefore the probability that U 1 is greater than U 0 is Utility Logit

Choice Logit: When V is parameterised as a linear combination of subject-specific covariates X, i.e. $V_j = X\beta_j$, the coefficients for the reference category are set to zero for identification ($\beta_0 = 0$), yielding the familiar logit model: $P(y = 1 \mid X) = \frac{\exp(X\beta_1)}{1 + \exp(X\beta_1)}$, i.e. the probability that alternative 1 is chosen in preference to the reference (alternative 0).

Polytomous choice: When choosing among more than two alternatives, utility can be decomposed as before, e.g. for three alternatives: $U_0 = V_0 + \varepsilon_0$, $U_1 = V_1 + \varepsilon_1$, $U_2 = V_2 + \varepsilon_2$.

Assuming (  1 -  0 ) and (  2 -  0 ) are independent logistic distributions yields the familiar multinomial logit model: Multinomial logit

Assuming (  1 -  0 ) and (  2 -  0 ) are independent logistic distributions allowed specification of the multinomial logit model Independence from irrelevant alternatives This assumption of independence is known as “independence from irrelevant alternatives” (IIA) However, it is usually implausible to assume that (  1 -  0 ) and (  2 -  0 ) are independent.

Latent random variables: The correlation between random components due to violation of IIA can be modelled by introducing shared random effects, u: $U_j = V_j + u_j + \varepsilon_j$, $j = 0, 1, 2$.

Latent variable distribution: We assume that $(\varepsilon_1 - \varepsilon_0)$ and $(\varepsilon_2 - \varepsilon_0)$ have logistic distributions. The latent variables are specified as $\eta_1 = u_1 - u_0$ and $\eta_2 = u_2 - u_0$, and are distributed bivariate normal. The latent variables reflect the propensity to favour one choice over another once the effect of the explanatory variables (X) has been accounted for.

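Multinomial model with latent variables: Allowing for correlation among utilities with latent variables gives the following model: $P(y = j \mid X, \eta) = \frac{\exp(X\beta_j + \eta_j)}{1 + \exp(X\beta_1 + \eta_1) + \exp(X\beta_2 + \eta_2)}$ for $j = 1, 2$, where $\eta_1$ and $\eta_2$ are the latent variables and $\beta_0 = 0$, $\eta_0 = 0$ for the reference alternative.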

Multinomial model with latent variables: In general, the latent variables that give rise to the correlation among choices can be poorly identified. This can be overcome by using ranked preferences instead of first choices.

A model of ranked preferences: The Luce model for ranked preferences is a direct extension of the random utility derivation of the multinomial choice model. With three alternatives, first-choice probabilities are as for the original model. Second-choice probabilities, conditional on the first choice, are given by the same multinomial form, but with the first choice excluded from the choice set.

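Multinomial logit for rankings: For example, with three alternatives, the probability that choice 1 will be ranked first, followed by choice 2 second (with the final choice redundant) is: $P(1 \succ 2 \succ 0 \mid X) = \frac{\exp(X\beta_1)}{1 + \exp(X\beta_1) + \exp(X\beta_2)} \times \frac{\exp(X\beta_2)}{1 + \exp(X\beta_2)}$, with $\beta_0 = 0$ for the reference alternative 0.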
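The sequential logic of the Luce model (pick the best alternative, remove it, pick again from what remains) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name and the utility values are assumptions, and alternative 0 plays the role of the reference category with utility zero.

```python
import math
from itertools import permutations


def ranking_probability(ranking, v):
    """Probability of a complete ranking under the Luce (exploded logit) model.

    ranking -- alternatives ordered from most to least preferred
    v       -- dict mapping alternative -> systematic utility V
    """
    remaining = list(ranking)
    prob = 1.0
    for alt in ranking[:-1]:  # the last place is determined by the others
        denom = sum(math.exp(v[a]) for a in remaining)
        prob *= math.exp(v[alt]) / denom
        remaining.remove(alt)  # shrink the choice set for the next stage
    return prob


# Hypothetical utilities; alternative 0 is the reference with V = 0.
v = {0: 0.0, 1: 0.7, 2: 0.2}
p_120 = ranking_probability((1, 2, 0), v)
total = sum(ranking_probability(r, v) for r in permutations(v))
print(f"P(1 first, 2 second) = {p_120:.4f}; all rankings sum to {total:.4f}")
```

Because each stage is itself a multinomial logit over the remaining alternatives, the probabilities of all six possible rankings sum to one.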

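Multinomial logit for rankings with latent variables: Allowing for correlations among utilities with latent variables gives: $P(1 \succ 2 \succ 0 \mid X, \eta) = \frac{\exp(X\beta_1 + \eta_1)}{1 + \exp(X\beta_1 + \eta_1) + \exp(X\beta_2 + \eta_2)} \times \frac{\exp(X\beta_2 + \eta_2)}{1 + \exp(X\beta_2 + \eta_2)}$, where $\eta_1$ and $\eta_2$ are the latent variables.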
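Because the latent variables are unobserved, the probability of an observed ranking has to be averaged over their bivariate normal distribution. The sketch below shows the idea with simple Monte Carlo averaging, which is swapped in here purely for illustration and is not a description of how the estimation software integrates. The function names, the utilities and the variance and correlation values are all hypothetical.

```python
import math
import random
from itertools import permutations


def ranking_probability(ranking, v):
    """Luce (exploded logit) probability of a complete ranking."""
    remaining = list(ranking)
    prob = 1.0
    for alt in ranking[:-1]:
        denom = sum(math.exp(v[a]) for a in remaining)
        prob *= math.exp(v[alt]) / denom
        remaining.remove(alt)
    return prob


def draw_latent(rng, var1, var2, corr):
    """Draw (eta1, eta2) from a zero-mean bivariate normal distribution."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    eta1 = math.sqrt(var1) * z1
    eta2 = math.sqrt(var2) * (corr * z1 + math.sqrt(1.0 - corr ** 2) * z2)
    return eta1, eta2


def marginal_ranking_probability(ranking, v, var1, var2, corr,
                                 n_draws=20_000, seed=1):
    """Average the conditional ranking probability over latent draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        eta1, eta2 = draw_latent(rng, var1, var2, corr)
        # The latent variables shift the utilities of alternatives 1 and 2
        # relative to the reference alternative 0.
        shifted = {0: v[0], 1: v[1] + eta1, 2: v[2] + eta2}
        total += ranking_probability(ranking, shifted)
    return total / n_draws


# Hypothetical utilities and latent-variable parameters.
v = {0: 0.0, 1: 0.7, 2: 0.2}
marginal = {r: marginal_ranking_probability(r, v, var1=2.0, var2=1.0, corr=0.8)
            for r in permutations(v)}
print(marginal)
```

With a positive correlation, rankings that place alternatives 1 and 2 next to each other become more likely than the plain model would predict, which is the pattern the latent variables are meant to capture.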

GLLAMM: Such models can be estimated using GLLAMM (Generalized Linear, Latent and Mixed Models; Rabe-Hesketh, Pickles & Skrondal, 2001). GLLAMM is a Stata programme freely available from

Latent variables structure and political theory: A fundamental way in which political parties are characterised is by where they fall along a uni-dimensional “left-right” continuum (cf. spatial models of political preference by Downs [1957] and Black [1958]).

Latent variables structure and political theory: Conventionally, in the UK the Conservative party is seen as the most right-wing of the major parties, with Labour as the most left-wing. The Liberal Democrats are seen as occupying the middle ground, but closer to Labour than to the Conservatives.

Latent variables structure and political theory: If this is so, ranked preferences for Labour and the Liberal Democrats should be clustered together to a greater extent than preferences for the Conservatives and the Liberal Democrats (or indeed, for the Conservatives and Labour).

Latent variables structure and political theory: In terms of the latent variables, those who prefer Labour to the Conservatives will have positive $u_{lab} - u_{con}$. The same people are also likely to prefer the Liberal Democrats over the Conservatives, and thus also have positive $u_{libdem} - u_{con}$. Therefore, the latent variables should be positively correlated.

Political preference in the UK: Data: British Election Panel Study, 2001 wave (N = 1560 voting-age respondents living in England; Scottish- and Welsh-based respondents are excluded). Party approval ratings were used to construct ranked preferences for the three major parties: Conservative, Labour and Liberal Democrat. First-place ties were split where possible by the respondents’ stated party ID; 80 first-placed ties remained after this.
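One way the ranking construction described on this slide could be coded is sketched below. This is an assumption-laden illustration, not the authors' procedure: the rating scale, the function name, the party labels and the handling of ties below first place are all hypothetical.

```python
def rank_parties(ratings, party_id=None):
    """Turn approval ratings into ranks (1 = most preferred).

    ratings  -- dict mapping party -> approval score (higher = more approval)
    party_id -- the respondent's stated party ID, or None if not given
    Tied parties share a rank, except that a first-place tie is split when
    the stated party ID is one of the tied parties.
    """
    # Group parties by score, highest score first.
    groups = {}
    for party, score in ratings.items():
        groups.setdefault(score, []).append(party)
    ordered = [groups[s] for s in sorted(groups, reverse=True)]

    # Split a first-place tie using party ID where that is possible.
    top = ordered[0]
    if len(top) > 1 and party_id in top:
        rest = [p for p in top if p != party_id]
        ordered = [[party_id], rest] + ordered[1:]

    ranks = {}
    position = 1
    for group in ordered:
        for party in group:
            ranks[party] = position  # tied parties share a rank
        position += len(group)
    return ranks


# Hypothetical respondent: Labour and LibDem tie on approval, party ID is Labour.
example = rank_parties({"Con": 3, "Lab": 7, "LibDem": 7}, party_id="Lab")
print(example)
```

A respondent whose party ID is missing, or is not among the tied parties, keeps the first-place tie, which corresponds to the ties that the slide reports as remaining.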

Political preference in the UK: Example party ranking data. Columns: IDNo, Conrank, Labrank, LibDem_rank

Political preference in the UK: Party preference ranks were modelled in GLLAMM using multinomial logistic regression with two latent variables. The covariates included were age and sex. First, though, a multinomial logit model with no latent variables was fitted, for comparison.

Model 0: Multinomial logit of ranked party preference
Baseline category: Conservative
Log likelihood:
Party     Parameter    Est.    SE     Sig.
Labour    Intercept                   <.001
Labour    Age                         <.001
Labour    Sex                         <.05
LibDem    Intercept    .68     .17    <.001
LibDem    Age                         ns
LibDem    Sex                         ns

Model 1: ranked preference with two latent variables
Baseline category: Conservative
Log likelihood:
Party               Parameter    Est.    SE     Sig.
Labour              Intercept                   <.001
Labour              Age                         <.01
Labour              Sex                         ns
LibDem              Intercept                   <.001
LibDem              Age                         <.05
LibDem              Sex                         ns
Latent variables    Var(1)
Latent variables    Var(2)
Latent variables    Corr(1,2)    1.00

Model 1: ranked preference with two latent variables: Model 1 is a massive improvement in fit over Model 0. The latent variables are both significant, indicating a tendency to rank both Labour and the Liberal Democrats differently from the Conservatives. The variance for Lab. vs. Con. is greater than that for LD. vs. Con.: Lab. is more ‘distant’ from Con. than is LD. The two latent variables are highly correlated: the tendency to choose Labour over the Conservatives is related to the tendency to choose the LibDems over the Conservatives. This violates IIA, invalidating Model 0.

Uni-dimensional preference structure: The strong correlation between the latent variables implies that only one latent dimension is required to model ranked party preferences (the “left-right” dimension?).

Model II: One-factor model of ranked party preference: A single-factor model was fitted to the data, whereby the second latent variable, $\eta_2$ (the propensity to choose LibDem over Conservative), was defined as a function of $\eta_1$ (Labour vs. Conservative): $\eta_2 = \lambda \eta_1$, where $\lambda$ is a ‘scale’ factor, to account for the different ‘distances’ between Lab-Con and LD-Con.
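A short worked consequence of the one-factor restriction may help. The notation is an assumption of this sketch: $\eta_1$ and $\eta_2$ denote the two latent variables, $\lambda$ the scale factor, and $\psi$ the variance of $\eta_1$.

```latex
\begin{aligned}
\eta_2 &= \lambda \eta_1, \qquad \operatorname{Var}(\eta_1) = \psi \\
\operatorname{Var}(\eta_2) &= \lambda^2 \psi \\
\operatorname{Cov}(\eta_1, \eta_2) &= \lambda \psi \\
\operatorname{Corr}(\eta_1, \eta_2) &= \frac{\lambda \psi}{\sqrt{\psi}\,\sqrt{\lambda^2 \psi}}
  = \frac{\lambda}{|\lambda|} = 1 \quad (\lambda > 0)
\end{aligned}
```

Under these assumptions a positive scale factor fixes the correlation at one, and a scale factor below one gives the second latent variable a smaller variance than the first, so the one-factor model has one fewer free parameter than a model with two freely correlated latent variables.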

Model II: one-factor model of ranked party preference
Baseline category: Conservative
Log likelihood:
Party               Parameter                    Est.    SE     Sig.
Labour              Intercept                                   <.001
Labour              Age                                         <.01
Labour              Sex                                         ns
LibDem              Intercept                                   <.001
LibDem              Age                                         <.05
LibDem              Sex                                         ns
Latent variables    Var(1)
Latent variables    Scale factor ($\lambda$)     .65     .03

Model II: one-factor model of ranked party preference: Model II fits at least as well as Model 1 (the difference in log-likelihoods is not significant). The coefficients are virtually identical to those of Model 1. The scale factor ($\lambda$) is less than one, indicating that the Liberal Democrats are closer to the Conservatives than is Labour.

Summary: A traditional multinomial logit model, fitted to political party preference in the UK, provided a poor fit to the data because it failed to account for violation of IIA (the correlation between choices). Latent variables were included to account for this. A model with one latent variable fitted the data as well as the model with two, indicating that UK party preferences seem to fit a one-dimensional spatial model.