Using M-Quantile Models to measure the Contextual Value- Added of Schools in London Nikos Tzavidis (University Manchester) James Brown (Institute of Education)

Slides:



Advertisements
Similar presentations
Contextual effects In the previous sections we found that when regressing pupil attainment on pupil prior ability schools vary in both intercept and slope.
Advertisements

Properties of Least Squares Regression Coefficients
11-1 Empirical Models Many problems in engineering and science involve exploring the relationships between two or more variables. Regression analysis.
Qualitative predictor variables
Kin 304 Regression Linear Regression Least Sum of Squares
The Simple Regression Model
The choice between fixed and random effects models: some considerations for educational research Claire Crawford with Paul Clarke, Fiona Steele & Anna.
FTP Biostatistics II Model parameter estimations: Confronting models with measurements.
3.2 OLS Fitted Values and Residuals -after obtaining OLS estimates, we can then obtain fitted or predicted values for y: -given our actual and predicted.
6-1 Introduction To Empirical Models 6-1 Introduction To Empirical Models.
11 Simple Linear Regression and Correlation CHAPTER OUTLINE
Prediction, Correlation, and Lack of Fit in Regression (§11. 4, 11
Using Linked Administrative Data to Enhance Policy Relevant Analysis: Experience from the ADMIN Research Centre James Brown (University of Southampton)
1 Contextual Value Added and Data For Dummies The mystery explained.
Some Terms Y =  o +  1 X Regression of Y on X Regress Y on X X called independent variable or predictor variable or covariate or factor Which factors.
Chapter 13 Multiple Regression
The Simple Linear Regression Model: Specification and Estimation
NCRM Annual Meeting January People Lorraine Dearden (Dir)John ‘Mac’ McDonald Sophia Rabe-HeskethAnna Vignoles Kirstine HansenNikos Tzavidis James.
Chapter 10 Simple Regression.
BA 555 Practical Business Analysis
Basic Statistical Concepts Psych 231: Research Methods in Psychology.
REGRESSION AND CORRELATION
11-1 Empirical Models Many problems in engineering and science involve exploring the relationships between two or more variables. Regression analysis.
Correlation and Regression Analysis
Simple Linear Regression Analysis
So are how the computer determines the size of the intercept and the slope respectively in an OLS regression The OLS equations give a nice, clear intuitive.
-- Preliminary, Do Not Quote Without Permission -- VALUE-ADDED MODELS AND THE MEASUREMENT OF TEACHER QUALITY Douglas HarrisTim R. Sass Dept. of Ed. LeadershipDept.
3. Multiple Regression Analysis: Estimation -Although bivariate linear regressions are sometimes useful, they are often unrealistic -SLR.4, that all factors.
Regression and Correlation Methods Judy Zhong Ph.D.
FFT Data Analysis Project – Supporting Self Evaluation  Fischer Family Trust / Fischer Education Project Extracts may be reproduced for non commercial.
Information for parents on the effectiveness of English secondary schools for different types of pupil Lorraine Dearden, John Micklewright and Anna Vignoles.
Inference for regression - Simple linear regression
Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved Section 10-3 Regression.
Chapter 11 Simple Regression
Copyright © 2014, 2011 Pearson Education, Inc. 1 Chapter 25 Categorical Explanatory Variables.
Simple Linear Regression
1 FORECASTING Regression Analysis Aslı Sencer Graduate Program in Business Information Systems.
BPS - 3rd Ed. Chapter 211 Inference for Regression.
1 1 Slide © 2007 Thomson South-Western. All Rights Reserved Chapter 13 Multiple Regression n Multiple Regression Model n Least Squares Method n Multiple.
Intergenerational Poverty and Mobility. Intergenerational Mobility Leblanc’s Random Family How does this excerpt relate to what we have been talking about?
Introduction Multilevel Analysis
1 1 Slide © 2008 Thomson South-Western. All Rights Reserved Chapter 15 Multiple Regression n Multiple Regression Model n Least Squares Method n Multiple.
Applied Quantitative Analysis and Practices LECTURE#23 By Dr. Osman Sadiq Paracha.
Key issues from yesterday and for panel Lorraine Dearden.
Introduction to Multilevel Modeling Stephen R. Porter Associate Professor Dept. of Educational Leadership and Policy Studies Iowa State University Lagomarcino.
Mike Treadaway Director of Research Fischer Family Trust Using FFT Live Secondary Schools.
Inference for Regression Simple Linear Regression IPS Chapter 10.1 © 2009 W.H. Freeman and Company.
1 11 Simple Linear Regression and Correlation 11-1 Empirical Models 11-2 Simple Linear Regression 11-3 Properties of the Least Squares Estimators 11-4.
The Choice Between Fixed and Random Effects Models: Some Considerations For Educational Research Clarke, Crawford, Steele and Vignoles and funding from.
Multiple Linear Regression ● For k>1 number of explanatory variables. e.g.: – Exam grades as function of time devoted to study, as well as SAT scores.
Lecture 7: What is Regression Analysis? BUEC 333 Summer 2009 Simon Woodcock.
Chapter 13 Multiple Regression
Correlation Assume you have two measurements, x and y, on a set of objects, and would like to know if x and y are related. If they are directly related,
September 18-19, 2006 – Denver, Colorado Sponsored by the U.S. Department of Housing and Urban Development Conducting and interpreting multivariate analyses.
Multiple Regression. Simple Regression in detail Y i = β o + β 1 x i + ε i Where Y => Dependent variable X => Independent variable β o => Model parameter.
Robust Estimators.
28. Multiple regression The Practice of Statistics in the Life Sciences Second Edition.
4 basic analytical tasks in statistics: 1)Comparing scores across groups  look for differences in means 2)Cross-tabulating categoric variables  look.
Biostatistics Regression and Correlation Methods Class #10 April 4, 2000.
BPS - 5th Ed. Chapter 231 Inference for Regression.
Hertfordshire County Council The Role of the Secondary Assessment Co-ordinator Day One 5 th July 2005.
Lecture 6 Feb. 2, 2015 ANNOUNCEMENT: Lab session will go from 4:20-5:20 based on the poll. (The majority indicated that it would not be a problem to chance,
The simple linear regression model and parameter estimation
11-1 Empirical Models Many problems in engineering and science involve exploring the relationships between two or more variables. Regression analysis.
POSC 202A: Lecture Lecture: Substantive Significance, Relationship between Variables 1.
I271B Quantitative Methods
Simple Linear Regression
Basic Practice of Statistics - 3rd Edition Inference for Regression
Algebra Review The equation of a straight line y = mx + b
Introductory Statistics
Presentation transcript:

Using M-Quantile Models to measure the Contextual Value- Added of Schools in London Nikos Tzavidis (University Manchester) James Brown (Institute of Education)

2 Outline Looking at Pupil/School Performance using the NPD The approach ‘currently’ used by DfE Outline to using quantile or M-quantile approaches M-quantiles for exploring pupil and school performance Measuring and mapping performance across local authorities in London Outcome data for 2007/8

3 The National Pupil Database A major Admin database held by DCSF that is utilised by researchers studying pupil and school performance… Longitudinal record of a pupils’ attendance at State schools in England (updated each term) with performance data linked in at KS1, KS2, (KS3), KS4, and now KS5 with further extensions into further/higher education Limited covariates on individual pupils and their family background Language, ethnicity, fsm, income deprivation of area, in care,… Possible use of linkage with the LSYPE, which does have the more detailed family background (parental education)…

4 Measuring School Performance Raw exam scores can be misleading for measuring school performance Do not reflect different intakes of schools Value Added (VA) Models: Provide a better measure of performance by accounting for pupil prior attainment Contextualised Value Added (CVA) Models: Extension of VA models that also account for pupil characteristics (gender, age, deprivation) and the context within the school

5 DfE CVA Model Concept: Include school-specific random effects to account for the between school variation beyond that explained by the variation in model covariates. Captures the fact that pupil performance within a school is correlated, even after controlling for characteristics Notation: (j = School, i = Pupil) Variable of interest:y ij Covariate information: x ij School level random effect:u j Pupil level random effect:e ij

6 DfE CVA Model (random effects model) Dependent variable: Capped total Key Stage 4 score (best 8 GCSEs) Covariates (pupil level): Pupil prior attainment, fsm, income deprivation, special education needs, age, pupil mobility, gender, in care, ethnicity, English as an additional language, interaction terms Covariates (school level): School mean prior attainment, school mean spread y ij = α + βx ij + u j + e ij School random effect u j : Measures an unknown underlying level of performance for each school Normally distributed with constant variance…

7 Random Intercepts

8

9

10 Things to consider… Why should these (random) school effects be normally distributed with a constant variance? Problems with capping in high performing schools… Possible lack of outlier robustness Outlier schools and outlier pupils… The DfE CVA model assumes a random intercepts specification, what if random slopes provide a better fit? Losing information by simply summarising the school impact as a single value… Is the school impact really the same for all pupils in the school?

11 The Quantile Approach The conventional definition of a regression model as a model for the mean of Y|X can be extended considerably We view regression analysis as aimed at modelling the entire conditional distribution f(Y|X) Regression quantiles, and the easier to compute regression M- quantiles, offer a deeper understanding of the structure of conditional distributions In this presentation we don’t distinguish between regression quantiles and regression M-quantiles as both will serve the same purpose

12 The Quantile Approach A Linear Model for Regression Quantiles y i = α q + β q x i + e i Estimation of Regression Quantiles Computation:Simplex Algorithm (quantile regression) Weighted Least Squares (M-quantile regression) Implemented in:Rquantreg library Stataqreg For M-quantiles (Chambers and Tzavidis, 2006) we use the rlm function in R modified to qrlm

13 Interpreting Regression Quantiles For each value of q, the corresponding model shows how the qth percentile (quantile) of f(Y|X) varies with X q = 0.5 line shows how the “middle” (median) of f(Y|X) changes with X q = 0.1 line separates the “top” 90% of f(Y|X) from the “bottom” 10% it represents the behaviour of units that are “better” than the “worst” 10% and “worse” than the “best” 90% the ‘fitted’ regression quantiles do not need to be parallel but they should not cross… (if they do implies poor model specification)

14 Interpreting Regression Quantiles Comparing the fitted mean line (red) with fitted lines (grey) covering quantiles at 1%, 5%, 25%, 50%, 75%, 95% and 99%.

15 M-quantile Coefficients Individual level pupil data (y i, x i ) on Y and X Linear regression M-quantiles m q (x i ) = α q + β q x i For fixed x, m q (x i ) is continuous in q each sample value (y i, x i ) will lie on one and only one regression M-quantile line We refer to the q-value q i of this regression M-quantile as the M- quantile coefficient or q value of the corresponding pupil The M-quantile coefficients lie between 0 and 1 and characterize where the pupils lie in the conditional distribution f(Y|X)

16 Properties of the q i ’s The q i ’s represent dimensionless measures of the residual heterogeneity in Y after accounting for heterogeneity in X The q i ’s satisfy 4 conditions that a good measure of performance should satisfy (Kokic et al., 1997) They lie between 0 and 1 The poorest performing pupils given their x’s (prior attainment,…) have a performance measure close to zero The best performing pupils have a performance measure close to one The distribution of the performance measure should not depend on the level of inputs x (i.e. the pupil’s prior attainment)

17 An Alternative Measure of School Performance… With the q i we have a measure of the individual pupils’ efficiencies conditional on a set of covariates  But so far we have not imposed any structure on the data... IF the school has any additional impact on the performance of the pupils within that school it will be evidenced by pupils ‘on average’ being more or less efficient  We define Q j (a measure of this school efficiency) as the mean of the pupil q i ’s within school j This approach in small area estimation (Chambers and Tzavidis, 2006) shows that M-quantile coefficients are able to capture group effects  and tend to be highly correlated with standard random effects

18 How does it work? Step 1: Define a grid of q-values, e.g. g =(0.001,…,0.999) that “covers” the conditional distribution of Y and X

19 How does it work? Step 2: Fit an M-quantile model for each q-value in g and estimate the unique M-quantile coefficient q i for each pupil in the sample Step 3: The q i ’s describe pupil differences after controlling for X. Higher q i ’s imply better performance – Pupil efficiency

20 How does it work? Step 4: Using the q i ’s of pupils in the same school, estimate a school M-quantile coefficient, Q j, using the mean or the median Measure of school performance given by Q j : Higher → Better

21 ‘Properties’ of using quantiles for School Performance Less assumptions (should cope better with the capping of the performance measure) with respect to the structure of the school effects  the data guide the modelling process Less assumptions do generally imply less efficiency  working with London we still have a lot of data... Outlier robustness automatically achieved by using M-quantiles Defining a measure of individual (pupil) efficiencies allows flexible summaries at the school (and other) levels  BUT what about measuring the uncertainty associated with Q j ?

22 MSE Estimation of the Performance Measure Estimated using non-parametric bootstrap (Tzavidis et al. 2010) Starting from the original sample s, fit the M-quantile model and compute B bootstrap finite populations are generated by sampling From each bootstrap population, select L samples using simple random sampling without replacement within the schools For each bootstrap sample L implement the estimation procedure

23 Evaluating the MSE Estimator Use model-based simulation. Bootstrap is embedded within the Monte-Carlo Population data are generated using We generate in total 250 populations of size units in 40 schools From each of the 250 populations take independent samples by randomly selecting pupils within the 40 schools. MSE estimation: For each Monte-Carlo sample we implement the bootstrap scheme with B=1 and L=250

24 Evaluating the MSE Estimator Across schools distribution of the true (Monte Carlo) MSE, averages over Monte Carlo simulations of estimated mean squared error, relative bias (%) of the bootstrap MSE estimator and coverage rates of nominal 95% CIs

25 Evaluating the MSE Estimator True (empirical) Mean Square Error (Red Line) and estimated Mean Square Error (black line) of the M-quantile measure of school performance.

26 Modelling Pupil Performance across London We now implement the approach to the data for pupils within schools in London using linked NPD/PLASC Data (2007/08 cohort) Performance at age 11 (prior attainment) performance at age 16 (the outcome) – NPD Data Linked PLASC data to provide the background information Our working model specification similar to the CVA model Ray (2006)

27 Modelling Pupil Performance across London Comparing the fitted mean line (red) with fitted lines (grey) covering quantiles at 1%, 5%, 25%, 50%, 75%, 95% and 99%.

28 Pupil Performance within and across schools Caterpillar plot for school comparing males and females

29 Performance across LAs By estimating a pupil level effect we do not pre-impose a structure on the data To get a school performance value we are averaging across pupils in the school... The school has an impact when the average differs from the overall mean (close to 0.5) BUT We can estimate for other structures by aggregating our pupils by the desired structure the impact of the structure being represented by an average efficiency for the pupils different to the overall average

30 Performance across LAs (marginal measure)

31 Performance across LAs (conditional measure)

32 Performance across LAs (difference from mean)

33 Discussion... Utilising quantile models leads to a ‘robust’ measure of the relative performance of the individual pupils Measures how efficient a pupil is relative to others with the same inputs (prior attainment) and context We can then impose ‘structure’ on the data to see if that has any influence on the performance of the ‘group’ of students School effects Average pupil performance at the LA level

34 Some References Tzavidis, N. and Brown J. J. (2010). Using M-quantile models as an alternative to random effects to model the contextual value-added of schools in London. DoQSS Working Papers 1011, Department of Quantitative Social Science - Institute of Education, University of London. Chambers, R. and Tzavidis, N. (2006). M-quantile Models for Small Area Estimation. Biometrika, 93, Kokic, P. N., Chambers, R. L., Breckling, J. U. and Beare, S. (1997). A measure of production performance. Journal of Business and Economic Statistics, 10, Tzavidis, N., Marchetti, S. and Chambers, R. (2010). Robust prediction of small area means and quantiles. [To appear in the Australian and New Zealand Journal of Statistics]. Institute of Education University of London 20 Bedford Way London WC1H 0AL Tel +44 (0) Fax +44 (0) Web

35 Modelling Pupil Performance across London Comparing the multilevel and the M-quantile CVA measures