Www.ioe.ac.uk/bedfordgroup Analysing Variability Between Neighbourhoods By Exploiting Survey Design Features Paper for What is Multilevel Modelling? session.

Presentation transcript:

Analysing Variability Between Neighbourhoods By Exploiting Survey Design Features. Paper for the "What is Multilevel Modelling?" session, Research Methods Festival, Oxford, 2 July. Ian Plewis, Institute of Education, University of London.

Sample surveys with a clustered design tend to be more cost-efficient than surveys using simple random samples. Clustering does, however, complicate the analysis because cases within a cluster are, on average, more similar than cases in different clusters. The degree of similarity is represented by the intra-cluster (or intra-class) correlation. A number of statistical packages can adjust standard errors to allow for clustering.
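To give a feel for how much clustering can matter, the sketch below uses Kish's approximate design effect, deff = 1 + (m - 1) * rho, where m is the cluster size and rho the intra-cluster correlation. This is a standard textbook approximation rather than anything taken from the slides: it assumes equal cluster sizes and ignores stratification and weighting, so it is only a rough guide for a design like the MCS. The function names are illustrative. The example values (mean cluster size 47, intra-cluster correlation 0.26) are the ones quoted elsewhere in this presentation for the neighbourhood measure.

```python
def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Kish's approximate design effect: 1 + (m - 1) * rho.

    Assumes equal-sized clusters; with very unequal clusters it is
    only a rough approximation.
    """
    if mean_cluster_size < 1:
        raise ValueError("cluster size must be at least 1")
    return 1.0 + (mean_cluster_size - 1.0) * icc


def se_inflation(mean_cluster_size: float, icc: float) -> float:
    """Factor by which a simple-random-sample standard error is
    multiplied under the same approximation (square root of deff)."""
    return design_effect(mean_cluster_size, icc) ** 0.5


if __name__ == "__main__":
    # Values quoted in the presentation: mean ward size 47, ICC 0.26.
    print(round(design_effect(47, 0.26), 2))
    print(round(se_inflation(47, 0.26), 2))
```

Under these assumptions the design effect is close to 13, which suggests that standard errors computed as if the sample were a simple random sample could be far too small for a strongly clustered outcome.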

However, the clustering might be informative, in the sense that the clusters represent neighbourhoods (or institutions) that could exert an independent or contextual effect on a social or developmental process. In other words, clustering is not necessarily a statistical nuisance. Rather, it can be exploited to throw more light on social processes.

The MCS Population

The Millennium Cohort Study population is a population of children defined as: all children born between 1 September 2000 and 31 August 2001 (for England and Wales), and between 23 November 2000 and 11 January 2002 (for Scotland and Northern Ireland), alive and living in the UK at age nine months, and eligible to receive Child Benefit at that age; and, after nine months, for as long as they remain living in the UK at the time of sampling.

MCS Target Sample, Sweep 1

All children living in the selected wards. Number of selected wards by stratum:
ENGLAND: Advantaged       110
ENGLAND: Disadvantaged     71
ENGLAND: Ethnic            19
WALES: Advantaged          23
WALES: Disadvantaged       50
SCOTLAND: Advantaged       32
SCOTLAND: Disadvantaged    30
N. IRELAND: Advantaged     23
N. IRELAND: Disadvantaged  40
TOTAL                     398

The observed mean cluster size across the UK is 47, but cluster sizes range from 7 to 403.

Example from Sweep 1 of MCS:

We can generate a measure of the main respondent's perceptions of her neighbourhood from a set of five items about vandalism, pollution, etc. This measure can vary from 0 to 15 and, although skewed to the right, will be treated as having a Normal distribution.

We will attempt to explain the variation in this measure, initially in terms of individual characteristics, using a multiple regression model with the following predictors: mother's age, number of children, lone parent status, receiving benefits, and ethnic group (8 categories).

Figure 1: Within, between, and total regressions. (Snijders, T and Bosker, R (1999), Multilevel Analysis. London: Sage Publications)

Table 1: Multiple Regression Estimates
Columns: Estimate; s.e.
Rows: Mother's age; No. of children; Lone parent status; On benefits; Ethnic group: Mixed, Indian, Pakistani, Bangladeshi, Black Caribbean, Black African, Other.
The model also includes dummies for stratum to allow for the unequal probabilities of selection. R² = 0.18

The multiple regression model ignores ward, yet we would expect measures of neighbourhoods to vary between wards. We first fit a simple two-level model, including just a random intercept, to estimate the variation between wards (level-two variance) and compare it with the variation within wards (level-one variance). We can represent the relative strengths of the two sources of variation by the intra-cluster correlation. The estimate is 0.26, which is substantial, and there is therefore a prima facie case for including ward in any model.
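For reference, a random intercept model of this kind and the intra-cluster correlation it implies can be written as follows. The notation is an assumption introduced here, not taken from the slides: i indexes respondents, j indexes wards, u_j is the ward-level random intercept and e_ij the respondent-level residual. Individual-level covariates could be added to the first line without changing the definition of the correlation.

```latex
\begin{align}
y_{ij} &= \beta_0 + u_j + e_{ij}, \\
u_j &\sim N(0, \sigma_u^2), \qquad e_{ij} \sim N(0, \sigma_e^2), \\
\rho &= \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}.
\end{align}
```

On this reading, an estimate of 0.26 would mean that about a quarter of the total variance in the neighbourhood measure lies between wards.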

Table 2: Comparing Estimates from a Multiple Regression and a Two Level Model
Columns: Estimate (MR); s.e.; Estimate (MLM); s.e.
Rows: Mother's age; No. of children; Lone parent status; On benefits; Ethnic group: Mixed, Indian, Pakistani, Bangladeshi, Black Caribbean, Black African, Other; Between ward variance (n.a. for MR); Within ward variance.

We have one external measure at the ward level – the Child Poverty Index (part of the Index of Multiple Deprivation or IMD2000). Does this explain variation between wards in neighbourhood satisfaction?

Table 3: Comparing Estimates from Two Level Models without and with CPI
Columns: Estimate (MLM); s.e.; Estimate (+CPI); s.e.
Rows: Mother's age; No. of children; Lone parent status; On benefits; Ethnic group: Mixed, Indian, Pakistani, Bangladeshi, Black Caribbean, Black African, Other; CPI; Between ward variance; Within ward variance.

If CPI is instead included in the single-level multiple regression model, its estimated coefficient has a much lower standard error.

Why did the estimated coefficients for some of the ethnic groups change so much when we moved from multiple regression (where the estimate is a function of the within-group and between-group estimates) to a multilevel model (where the within and between regressions are assumed to be the same)? Perhaps the ethnic group estimates vary from ward to ward. In other words, perhaps there are random slopes. It would be difficult to allow each ethnic group to have its own random slope, so instead let us look at a white/non-white split.
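The statement that the single-level estimate is a function of the within and between group estimates can be illustrated for one predictor. For ordinary least squares, the total slope equals eta^2 * b_between + (1 - eta^2) * b_within, where eta^2 is the share of the predictor's variation that lies between groups (the decomposition discussed by Snijders and Bosker). The sketch below checks this on a small invented data set; the function name and the toy numbers are hypothetical and have nothing to do with the MCS data.

```python
from collections import defaultdict


def within_between_total(groups, x, y):
    """Return (b_within, b_between, b_total, eta_squared) for one predictor.

    b_between is the slope of group means on group means, weighted by
    group size; b_within is the pooled slope of deviations from group means.
    """
    n = len(x)
    by_group = defaultdict(list)
    for g, xi, yi in zip(groups, x, y):
        by_group[g].append((xi, yi))

    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx_w = sxy_w = sxx_b = sxy_b = 0.0
    for pairs in by_group.values():
        m = len(pairs)
        gmx = sum(p[0] for p in pairs) / m
        gmy = sum(p[1] for p in pairs) / m
        # Between-group sums of squares and cross-products.
        sxx_b += m * (gmx - mean_x) ** 2
        sxy_b += m * (gmx - mean_x) * (gmy - mean_y)
        # Within-group sums of squares and cross-products.
        for xi, yi in pairs:
            sxx_w += (xi - gmx) ** 2
            sxy_w += (xi - gmx) * (yi - gmy)

    b_within = sxy_w / sxx_w
    b_between = sxy_b / sxx_b
    b_total = (sxy_w + sxy_b) / (sxx_w + sxx_b)
    eta_sq = sxx_b / (sxx_w + sxx_b)
    return b_within, b_between, b_total, eta_sq


# Invented toy data: three "wards", negative slope within each ward,
# positive slope across ward means.
wards = [1, 1, 1, 2, 2, 2, 3, 3, 3]
x = [1, 2, 3, 2, 3, 4, 3, 4, 5]
y = [5, 4, 3, 7, 6, 5, 9, 8, 7]

b_w, b_b, b_t, eta_sq = within_between_total(wards, x, y)
print(b_w, b_b, b_t, eta_sq)
```

When the within and between slopes differ as sharply as in this toy example, a model that forces them to be equal and a single-level regression that blends them can give quite different coefficients, which is one possible reading of the changes seen for the ethnic group estimates.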

Table 4: Random Slopes Model
Columns: Estimate (MLM); s.e.
Rows: Non white; % Coverage interval: -0.39 to 1.9; Between ward variance (intercept); Between ward variance (slope); Correlation, intercept and slope: -0.45; Within ward variance.

There are good reasons to suppose that some of the ward variation in both intercept and slope can be explained by the proportion of white respondents in the ward.
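A coverage interval of the kind reported in Table 4 describes the range within which a given share of the ward-specific slopes is expected to lie. Assuming the random slopes are Normally distributed, it is the mean slope plus or minus a Normal quantile times the square root of the between-ward slope variance. The sketch below shows that calculation; the function name, the 95% default level, and the input values are hypothetical, because the table's estimates are not available here.

```python
from statistics import NormalDist


def coverage_interval(mean_slope: float, slope_variance: float,
                      level: float = 0.95) -> tuple[float, float]:
    """Interval expected to contain `level` of the group-specific slopes,
    assuming slopes ~ Normal(mean_slope, slope_variance)."""
    if slope_variance < 0:
        raise ValueError("variance cannot be negative")
    if not 0 < level < 1:
        raise ValueError("level must be between 0 and 1")
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * slope_variance ** 0.5
    return mean_slope - half_width, mean_slope + half_width


# Hypothetical inputs: mean slope 0.75, between-ward slope variance 0.25.
low, high = coverage_interval(0.75, 0.25)
print(round(low, 2), round(high, 2))
```

An interval that spans zero, as the one in Table 4 does, indicates that the non-white coefficient is expected to be positive in some wards and negative in others.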

Table 5: Random Slopes Model with Proportion White
Columns: Estimate (MLM); s.e.
Rows: Non white; Proportion white; Non white*proportion white: 0.34; Between ward variance (intercept); Between ward variance (slope); Correlation, intercept and slope: -0.51; Within ward variance.
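One way to write a model with the terms listed in Table 5 is given below. The symbols are assumptions introduced for illustration: NW_ij is an indicator for a non-white respondent, PW_j is the proportion of white respondents in ward j, and u_0j and u_1j are the ward-level random intercept and random slope. Any further covariates in the fitted model are omitted.

```latex
\begin{align}
y_{ij} &= \beta_0 + \beta_1 \mathrm{NW}_{ij} + \beta_2 \mathrm{PW}_j
          + \beta_3 \mathrm{NW}_{ij}\,\mathrm{PW}_j
          + u_{0j} + u_{1j}\,\mathrm{NW}_{ij} + e_{ij}, \\
\Delta_j &= \beta_1 + \beta_3 \mathrm{PW}_j + u_{1j}.
\end{align}
```

Here Delta_j is the expected difference between non-white and white respondents in ward j. The cross-level interaction beta_3 is what lets this difference depend on the ward's composition, and u_1j captures whatever ward-to-ward variation in the difference remains.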

Conclusions
1. Multilevel modelling, carefully used, can throw light on complex processes and give us a better understanding of within and between group relations.
2. Our results show that there are differences between white and non-white respondents in their perceptions of their neighbourhood. However, the differences between the two groups are more marked in wards with a low proportion of white respondents.