How Big was Response Bias in England to PISA 2003? John Micklewright & Sylke V. Schnepf July 2008.

Presentation transcript:


2 Motivation
PISA: Programme for International Student Assessment
OECD reports for 2003 exclude the UK because of perceived non-response bias in England
Simon Briscoe, Economics Editor at The Financial Times: the exclusion is among the ‘Top 20’ recent threats to public confidence in official statistics in the UK
Presentation draws on: Response Bias in England in PISA 2000 and 2003, DfES Research Report 771

3 PISA response rates (%): England

4 ‘Bias’ in what?
Mean
Variance
% beneath a given threshold

5 Five groups of 15-year-olds
i. all pupils in England (less permitted exclusions from the target population)
ii. all pupils in sampled schools (initial, 1st repl., 2nd repl.)
iii. all pupils in responding schools
iv. sampled pupils in responding schools
v. responding pupils

6 What we do
Show how (i) mean, (ii) variance, and (iii) % < thresholds of KS3 and KS4 scores change from group to group
Estimate biases in these 3 measures of PISA scores in group v, focusing on the problem of pupil response
Look at both 2000 and 2003
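The three measures named on this slide can be computed for any group of pupils along the following lines. This is only a minimal sketch: the function name `summarise`, the threshold of 40, and the scores are hypothetical values made up for illustration, not data from the study.

```python
import numpy as np

def summarise(scores, threshold):
    """Return the mean, variance and % below a threshold for one group."""
    scores = np.asarray(scores, dtype=float)
    return {
        "mean": scores.mean(),
        "variance": scores.var(ddof=1),            # sample variance
        "pct_below": 100.0 * np.mean(scores < threshold),
    }

# Hypothetical domestic test scores for two of the nested groups.
groups = {
    "iv. sampled pupils in responding schools": [28, 35, 41, 44, 50, 52, 58, 63],
    "v. responding pupils": [35, 41, 44, 50, 52, 58, 63],
}

for name, scores in groups.items():
    print(name, summarise(scores, threshold=40))
```

Comparing the output from one group to the next shows how each measure shifts as pupils drop out of the sample.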

7 KS4 v. PISA scores - respondents
[Scatter plot of KS4 score and PISA maths score for respondents; R = 0.78]

8 Means and SDs of KS4 scores, 2003

9 % below or above KS4 thresholds, 2003

10 Creating response weights
Model the probability of pupil response for sampled pupils in responding schools (group iv):
Prob(response_i) = F(domestic test score_i, other characteristics_i)
[Figure: predicted probability of response plotted against KS4 score]
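A response model of this kind can be turned into weights roughly as follows. The sketch assumes that F is a logistic function with the domestic test score as the only predictor, and that the weight is the inverse of the predicted response probability; the slide does not specify either choice. All data are simulated, and the helper `fit_logit` is written here for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated data (not from the study): a standardised domestic test score and
# a PISA-like score correlated with it at about 0.78.
domestic = rng.normal(0.0, 1.0, n)
noise = rng.normal(0.0, 1.0, n)
pisa = 500 + 100 * (0.78 * domestic + np.sqrt(1 - 0.78**2) * noise)

# Assumed response mechanism: higher-scoring pupils are more likely to respond.
responded = rng.random(n) < 1 / (1 + np.exp(-(1.0 + 0.8 * domestic)))

def fit_logit(X, y, n_iter=25):
    """Fit a logistic regression by Newton-Raphson and return the coefficients."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1 / (1 + np.exp(-X @ beta))
        gradient = X.T @ (y - p)
        hessian = X.T @ (X * (p * (1 - p))[:, None])
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta

# Model Prob(response_i) using the score known for every sampled pupil.
X = np.column_stack([np.ones(n), domestic])
beta = fit_logit(X, responded.astype(float))
p_hat = 1 / (1 + np.exp(-X @ beta))

# Response weight for each respondent: inverse of the predicted probability.
weights = 1.0 / p_hat[responded]

mean_all = pisa.mean()                      # unobservable in a real survey
mean_unweighted = pisa[responded].mean()
mean_weighted = np.average(pisa[responded], weights=weights)

print(f"all sampled pupils:      {mean_all:.1f}")
print(f"respondents, unweighted: {mean_unweighted:.1f}")
print(f"respondents, weighted:   {mean_weighted:.1f}")
```

In the simulation the weighted respondent mean sits much closer to the mean for all sampled pupils than the unweighted one does, which is the sense in which the difference between the two estimates the bias due to pupil response.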

11 Bias due to pupil response
Extent of bias, calculated as number of PISA score points (PISA international mean = 500, international SD = 100)
Compare with the estimated standard errors (SEs) of the mean and SD for England in 2003:
SE (mean) = 2.9 (reading and maths) and 3.0 for science
SE (SD) = 1.6 (reading and maths) and 1.7 for science
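To put these standard errors in context, the short calculation below takes the bias in the mean of about 6 points that the presentation quotes and expresses it relative to each SE and to the international SD. Applying the same 6-point figure to all three domains is an assumption made here for illustration.

```python
bias_mean = 6.0                 # approximate bias in the mean, in PISA points
international_sd = 100.0
se_mean = {"reading": 2.9, "maths": 2.9, "science": 3.0}

# Bias expressed as a multiple of the standard error of the mean.
bias_in_ses = {domain: bias_mean / se for domain, se in se_mean.items()}

# Bias expressed as a share of the international standard deviation.
bias_in_sds = bias_mean / international_sd

for domain, ratio in bias_in_ses.items():
    print(f"{domain}: bias is about {ratio:.1f} SEs")
print(f"bias is {bias_in_sds:.2f} of an international SD")
```

On these figures the bias is roughly twice the standard error of the mean, while amounting to only a few hundredths of an international standard deviation.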

12 However…
Adjusting the mean by the estimated bias of about 6 points moves England one place in the country rankings
‘Post-stratification’ weights are used in many surveys – they could be provided with the data for England
Think of possible biases in countries just above the 80% student response threshold (Australia, Austria, Canada, Ireland, Poland and the USA)
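As a generic illustration of post-stratification weighting, the sketch below reweights respondents so that each stratum's weighted share matches its known population share. The strata (bands of domestic test score), the population shares and the scores are all hypothetical, and this cell-based scheme is a textbook simplification rather than the weighting used in the study.

```python
import numpy as np

def post_stratification_weights(strata, population_shares):
    """Weight each respondent by population share / sample share of their stratum."""
    strata = np.asarray(strata)
    weights = np.empty(len(strata), dtype=float)
    for stratum, pop_share in population_shares.items():
        in_stratum = strata == stratum
        if not in_stratum.any():
            raise ValueError(f"no respondents in stratum {stratum!r}")
        weights[in_stratum] = pop_share / in_stratum.mean()
    return weights

# Hypothetical respondents: low scorers are under-represented.
strata = ["low"] * 2 + ["mid"] * 4 + ["high"] * 4
scores = np.array([420, 440, 490, 500, 510, 520, 570, 580, 590, 600], dtype=float)
population_shares = {"low": 0.3, "mid": 0.4, "high": 0.3}

weights = post_stratification_weights(strata, population_shares)

print("unweighted mean:", scores.mean())
print("weighted mean:  ", np.average(scores, weights=weights))
```

Because low scorers are under-represented among these made-up respondents, the weighted mean comes out below the unweighted one.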

13 Conclusions
1. It is possible to assess reliably both the direction and magnitude of response biases in England in 2003, and hence compare UK scores with those for other countries
2. The extent of the bias in means and SDs of scores is equal to about double the SEs, but the impact on ‘league table’ rankings is small
3. Biases were similar in 2000, when pupil response in England just reached the required threshold