Guide to the PISA Data Analysis Manual: Computation of Standard Errors for Multistage Samples


What is a Standard Error (SE)
In PISA, as well as in IEA studies, results are based on a sample:
– Published statistics are therefore estimates (of the means, of the standard deviations, of the regression coefficients, …)
– The uncertainty due to the sampling process has to be quantified: Standard Errors, Confidence Intervals, P values
OECD (2001). Knowledge and Skills for Life: First Results from PISA 2000. Paris: OECD.

What is a Standard Error (SE)
Let us imagine a teacher wishing to implement the mastery learning approach, as conceptualized by B.S. Bloom.
– Need to assess students after each lesson
– With 36 students and 5 lessons per day…

What is a Standard Error (SE)
Description of the population distribution
– The teacher decides to randomly draw 2 students' tests to decide whether remediation is needed
– How many samples of 2 students can be drawn from a population of 36 students?

What is a Standard Error (SE)
Number of possible samples of size n from a population of size N
– If the thickness of a coin is 1 mm, then 1 billion coins lined up on edge correspond to 1000 km
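The number of possible samples of size n drawn without replacement from N units is the binomial coefficient N! / (n! (N - n)!). A minimal sketch, assuming the teacher example with N = 36 students; the function name is illustrative only:

```python
from math import comb

def number_of_samples(population_size, sample_size):
    """Number of distinct samples of `sample_size` units drawn without replacement."""
    return comb(population_size, sample_size)

print(number_of_samples(36, 2))   # 630
print(number_of_samples(36, 4))   # 58905

# Coin analogy: 1 billion coins of 1 mm each, converted from mm to km
print(1_000_000_000 * 1 / 1_000_000)  # 1000.0
```

The count grows very quickly with the sample size, which is why enumerating all possible samples is only feasible for toy populations.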

What is a Standard Error (SE)

Graphical representation of the population mean estimate for all possible samples

What is a Standard Error (SE)
The sampling distribution on the previous slide has:
– a mean of 10
– a Standard Deviation (STD) of 1.7
– The STD of a sampling distribution is called the Standard Error (SE)

The sampling distribution of the mean looks like a normal distribution

What is a Standard Error (SE)
Let us count the number of samples with a mean included between
– [(10 - 1.96 SE); (10 + 1.96 SE)]
– [(10 - 3.30); (10 + 3.30)]
– [6.70; 13.30]
– There are 598 samples, thus 94.9 % of all possible samples
With a population N(10, 5.83), 95% of all possible samples of size 2 will have a population mean estimate included between 6.70 and 13.30
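The counting exercise can be reproduced by brute force. The sketch below assumes a population of 36 scores with a triangular shape (1 student at 5, 2 at 6, up to 6 at 10, then back down to 1 at 15). These scores are an assumption: they are not given in the transcript, but they match the reported mean of 10, variance of 5.83 and extreme sample means of 5.5 and 14.5.

```python
from itertools import combinations
from math import sqrt
from statistics import pstdev, pvariance

# Assumed population (not given in the transcript): triangular, 36 students
population = [score for score in range(5, 16) for _ in range(6 - abs(score - 10))]
N = len(population)
mu = sum(population) / N

def sample_means(n):
    """Mean of every possible sample of n distinct students."""
    return [sum(s) / n for s in combinations(population, n)]

means = sample_means(2)
se = pstdev(means)  # SD of the sampling distribution = standard error

# Same quantity from the finite-population formula for sampling without replacement
se_formula = sqrt(pvariance(population) / 2 * (N - 2) / (N - 1))

low, high = mu - 1.96 * se, mu + 1.96 * se
inside = sum(low <= m <= high for m in means)
print(len(means), round(se, 2), inside, round(100 * inside / len(means), 1))
# 630 1.68 598 94.9
```

Under this assumed population, the enumeration gives the 598 samples (94.9 %) quoted on the slide.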

Sampling distribution of the mean estimates of all possible samples of size 4

What is a Standard Error (SE)
The sampling distribution on the previous slide has:
– a mean of 10
– a Standard Deviation of 1.15

Distribution of the scores versus distribution of the mean estimates What is a Standard Error (SE)

What is a Standard Error (SE)
The sampling variance of the mean is inversely proportional to the sample size:
– If two students are sampled, then the smallest possible mean is 5.5 and the highest possible mean is 14.5
– If four students are sampled, it ranges from 6 to 14
– If 10 students are sampled, it ranges from 7 to 13
The sampling variance is proportional to the variance:
– If the scores are reported out of 20, with a sample of size 2, then the mean ranges from 5.5 to 14.5
– If the scores are reported out of 40 (multiplied by 2), then it ranges from 11 to 29

What is a Standard Error (SE)
            Individual 1   Individual 2   Individual 3   Individual 4   Mean estimate
Sample 1    X_11           X_21           X_31           X_41           μ̂_1
Sample 2    X_12           X_22           X_32           X_42           μ̂_2
Sample 3    X_13           X_23           X_33           X_43           μ̂_3
Sample 4    X_14           X_24           X_34           X_44           μ̂_4
Sample 5    X_15           X_25           X_35           X_45           μ̂_5
Sample 6    X_16           X_26           X_36           X_46           μ̂_6
Sample 7    X_17           X_27           X_37           X_47           μ̂_7
Sample 8    X_18           X_28           X_38           X_48           μ̂_8
Sample 9    X_19           X_29           X_39           X_49           μ̂_9
…           …              …              …              …              …
Sample x    X_1x           X_2x           X_3x           X_4x           μ̂_x

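What is a Standard Error (SE)
As we don't know the variance in the population, the SE for a mean μ̂ obtained from a sample is calculated as
SE(μ̂) = σ̂ / √n
where σ̂ is the standard deviation estimated from the sample and n is the sample size. Similarly, the SE for a percentage P (expressed as a proportion between 0 and 1) is calculated as
SE(P) = √(P (1 - P) / n)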
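For a simple random sample, the usual estimators are the sample standard deviation divided by the square root of n for a mean, and the square root of P(1 - P)/n for a proportion. A minimal sketch of both; the function names and the example numbers are illustrative assumptions:

```python
from math import sqrt
from statistics import stdev

def se_mean(values):
    """SE of a mean under SRS: sample SD (n - 1 denominator) over sqrt(n)."""
    return stdev(values) / sqrt(len(values))

def se_proportion(p, n):
    """SE of a proportion p (between 0 and 1) estimated from n cases under SRS."""
    return sqrt(p * (1 - p) / n)

print(round(se_mean([8, 12]), 2))            # 2.0
print(round(se_proportion(0.5, 100), 2))     # 0.05
```

Both formulas assume independent observations, which is why they do not carry over directly to multistage samples.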

What is a Standard Error (SE)
Linear regression assumptions:
– Homoscedasticity: the variance of the error terms is constant for each value of X
– Linearity: the relationship between each X and Y is linear
– Normality: the error terms are normally distributed
– Independence of error terms: successive residuals are not correlated
If these assumptions do not hold, the SE of the regression coefficients is biased

With a multistage sample design, errors will be correlated

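Standard Errors for multistage samples
Multistage samples are usually implemented in International Surveys in Education:
– schools (PSU = primary sampling units)
– classes
– students
If schools, classes and students are considered as infinite populations and if units are selected according to an SRS procedure, then:
σ²(μ̂) = σ²_schools / n_schools + σ²_classes / (n_schools · n_classes) + σ²_students / (n_schools · n_classes · n_students)
where n_classes is the number of classes sampled per school and n_students the number of students sampled per class.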

Standard Errors for multistage samples
PISA:
– two-stage samples: schools and then students
IEA:
– two-stage samples: schools and then 1 class per selected school
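For a two-stage design of the PISA type, with infinite populations and SRS at both stages, the sampling variance of the mean is the between-school variance divided by the number of schools, plus the within-school variance divided by the total number of students. The sketch below compares it with the variance obtained when the same data are treated as one simple random sample. All numbers are hypothetical:

```python
from math import sqrt

def two_stage_variance(between_var, within_var, n_schools, n_per_school):
    """Sampling variance of the mean: schools sampled first, then students."""
    return between_var / n_schools + within_var / (n_schools * n_per_school)

def srs_variance(between_var, within_var, n_schools, n_per_school):
    """Sampling variance of the mean if the same students were an SRS."""
    return (between_var + within_var) / (n_schools * n_per_school)

# Hypothetical values: total variance 10000, 40 % of it between schools
between_var, within_var = 4000.0, 6000.0
n_schools, n_per_school = 150, 35

v_two_stage = two_stage_variance(between_var, within_var, n_schools, n_per_school)
v_srs = srs_variance(between_var, within_var, n_schools, n_per_school)
rho = between_var / (between_var + within_var)   # intraclass correlation
design_effect = v_two_stage / v_srs

print(round(sqrt(v_two_stage), 2), round(sqrt(v_srs), 2))  # 5.27 1.38
print(round(design_effect, 1))                             # 14.6
```

In this setting the ratio of the two variances equals 1 + (n_per_school - 1) × rho, so the more schools differ from each other, the more an SRS formula understates the SE.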

Standard Error for multistage samples
Three fictitious examples in PISA
If considered as an SRS or random assignment to schools

Standard Error for multistage samples

Standard Error for multistage samples
Variance Decomposition for Reading Literacy in PISA 2000

Impact of the stratification variables on sampling variance
– N1, N2 considered as constant
– Independent samples, so COV = 0

Standard Error for multistage samples
Effect                         Sum of Squares   Degrees of Freedom   Mean Square
Gender (50 girls + 50 boys)    2500             L-1 (1)              2500
Error                          7500             N-L (98)             76.53
Total                          10000            N-1 (99)
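The ANOVA table can be read as a comparison of two sampling variances for the mean: one from an SRS of 100 students, which uses the total variance, and one from a sample stratified by gender with fixed stratum sizes, which uses only the within-stratum (error) variance. A sketch using the sums of squares from the table; reading the table this way, and the proportional allocation it implies, are assumptions:

```python
n_students, n_strata = 100, 2
ss_total, ss_gender = 10000.0, 2500.0
ss_error = ss_total - ss_gender

ms_total = ss_total / (n_students - 1)          # total variance estimate
ms_error = ss_error / (n_students - n_strata)   # pooled within-stratum variance

var_mean_srs = ms_total / n_students            # gender left to chance
var_mean_stratified = ms_error / n_students     # 50 girls and 50 boys by design

print(round(ms_error, 2))              # 76.53
print(round(var_mean_srs, 4))          # 1.0101
print(round(var_mean_stratified, 4))   # 0.7653
```

The part of the variance explained by the stratification variable no longer contributes to the sampling variance, which is why an efficient stratification variable reduces the SE.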

Standard Error for multistage samples
School and within-school variances of student performance in reading, and intraclass correlation with and without control of the explicit stratification variables (OECD, PISA 2000 database)
Columns: Country | School variance | Within-school variance | School variance under control of stratification | Rho | Rho under control of stratification
Countries: AUT, BEL, CHE, CZE, DNK, ESP, FIN, FRA, GBR, GRC, HUN, IRL, ISL, ITA

Standard Error for multistage samples
Consequences of considering PISA samples as simple random samples:
– In most cases, underestimation of the sampling variance estimates
– Non-significant effects will be reported as significant
– How can we measure the risk? Computation of the Type I error

Consequences: Type I error underestimation

Standard Error for multistage samples
Consequences: Type I error underestimation
Columns: Sampling variance | Standard error | Ratio | Ratio × Z score | Type I error
Rows: Unbiased estimate, Biased estimate
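The columns of this table suggest the following computation: take the ratio of the biased (SRS) standard error to the unbiased one, multiply the critical Z score by that ratio, and read the actual Type I error off the normal distribution. A sketch assuming a two-sided test at the nominal 5 % level and expressing the ratio through the design effect (unbiased variance divided by SRS variance); the function names are illustrative:

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def actual_type_one_error(design_effect, z_critical=1.96):
    """Real Type I error when the SRS standard error is used by mistake."""
    ratio = 1 / sqrt(design_effect)          # biased SE / unbiased SE
    return 2 * (1 - normal_cdf(z_critical * ratio))

print(round(actual_type_one_error(1.0), 2))   # 0.05
print(round(actual_type_one_error(4.0), 2))   # 0.33
```

With a design effect of 4 the SRS standard error is half the correct one, so a test announced at 5 % rejects a true null hypothesis about one time in three.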

Sampling Design Effect Standard Error for multistage samples

Standard Error for multistage samples
Sampling design effect in PISA 2000 Reading
Columns: Country | SDE | Type I
Countries: Australia, Austria, Belgium, Brazil, Canada, Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Italy, Japan, Korea, Latvia, Liechtenstein, Luxembourg, Mexico, Netherlands, New Zealand, Norway, Poland, Portugal, Russian Federation, Spain, Sweden, Switzerland, United Kingdom, United States

Standard Error for multistage samples
Factors influencing the SE other than the sample size:
– School variance, depending on the variable: usually high for performance, low for other variables
– Efficiency of the stratification variables: a stratification variable can be efficient for some variables and not for others
– Population parameter estimates

Standard Error for multistage samples
A few examples (PISA 2000, Belgium)
– Intraclass correlation
  Performance in reading: Rho = 0.60, SDE = 7.19
  Social Background (HISEI): Rho = 0.24, SDE = 3.45
  Enjoyment of Reading: Rho = 0.10, SDE = 1.86
– Statistics
  Regression analysis: Reading = HISEI + GENDER
    Intercept: SDE = 5.50
    HISEI: SDE = 3.78
    GENDER: SDE = 3.91
  Logistic regression: Level (0/1, Reading below or above 500) = HISEI
    Intercept: SDE = 2.39
    HISEI: SDE = 2.09

Standard Error for multistage samples
Very few mathematical solutions exist for the estimation of the sampling variance for multistage samples:
– For mean estimates, under the conditions of a Simple Random Sample (SRS) or of a stratified Probability Proportional to Size (PPS) sample but with no stratification variables
– No mathematical solutions for other statistics
– Hence the use of replication methodologies for the estimation of the sampling variance
For SRS:
– Jackknife: n replications of n-1 cases
– Bootstrap: an infinite number of samples of n cases randomly drawn with replacement

Replication methods for SRS
Jackknife for SRS
[Table: student values and their mean for the full sample and for each of the 10 replications]

Replication methods for SRS
Jackknife for SRS
– Estimation of the SE by replication
– Estimation by using the mathematical formula
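For the mean of a simple random sample, the jackknife and the formula give the same standard error. A minimal sketch with 10 hypothetical student values (10 to 19, an assumption since the table values are not in the transcript): each replication drops one student, and the squared deviations of the replicate means from the full-sample mean are summed and multiplied by (n - 1)/n.

```python
from math import sqrt
from statistics import stdev

values = list(range(10, 20))   # hypothetical scores of 10 students
n = len(values)
full_mean = sum(values) / n

# One replication per student: the mean of the n - 1 remaining cases
replicate_means = [(sum(values) - v) / (n - 1) for v in values]

se_jackknife = sqrt((n - 1) / n * sum((m - full_mean) ** 2 for m in replicate_means))
se_formula = stdev(values) / sqrt(n)

print(round(se_jackknife, 4), round(se_formula, 4))   # 0.9574 0.9574
```

The equality is exact for the mean. The value of the jackknife is that the same recipe also works for statistics that have no closed-form SE.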

Replication methods for SRS

Replication methods for SRS
Bootstrap for SRS
[Table: student values and mean by bootstrap sample structure; ranges reported for three structures: from 13.7 to 15.4, from 12.9 to 16.1, from 10 to 19]
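A bootstrap estimate of the SE draws many samples of n cases with replacement from the observed sample and takes the standard deviation of their means. A sketch on 10 hypothetical values (10 to 19); the number of replicates and the seed are arbitrary choices, and a finite number of draws only approximates the theoretical infinite set:

```python
import random
from statistics import stdev

values = list(range(10, 20))   # hypothetical scores of 10 students

def bootstrap_se(data, n_replicates=2000, seed=1):
    """SD of the means of `n_replicates` resamples drawn with replacement."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    n = len(data)
    means = [sum(rng.choices(data, k=n)) / n for _ in range(n_replicates)]
    return stdev(means)

print(bootstrap_se(values))
```

The result varies slightly with the seed and should be close to, though usually a little below, the formula-based SE, because resampling uses the variance with an n denominator.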

Replication methods for multistage sample
Jackknife for unstratified Multistage Sample
[Table: Schools 1 to 10 by replicates R1 to R10]
Each replicate = contribution of a school

Replication methods for multistage sample
Jackknife for stratified Multistage Sample
[Table: pseudo-stratum and school by replicates R1 to R5]
Each replicate = contribution of a pseudo-stratum
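The two jackknife variants for multistage samples can be expressed as tables of replicate weight factors, with one row per school and one column per replicate. The sketch below assumes the usual conventions: in the unstratified case (JK1) each replicate drops one school and inflates the others by G/(G - 1); in the stratified case (JK2) each replicate drops one school of a pseudo-stratum and doubles the other. Always dropping the first school of the pair is a simplification made here; in practice the choice is random.

```python
def jk1_factors(n_schools):
    """Rows = schools, columns = replicates; replicate r drops school r."""
    g = n_schools
    return [[0.0 if school == rep else g / (g - 1) for rep in range(g)]
            for school in range(g)]

def jk2_factors(n_pairs):
    """Rows = schools (two per pseudo-stratum), columns = replicates."""
    rows = []
    for pair in range(n_pairs):
        rows.append([0.0 if rep == pair else 1.0 for rep in range(n_pairs)])  # dropped
        rows.append([2.0 if rep == pair else 1.0 for rep in range(n_pairs)])  # doubled
    return rows

for row in jk2_factors(3):
    print(row)
```

In both cases the factors of a replicate add up to the number of schools, so each replicate still represents the whole sample.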

Replication methods for multistage sample
Balanced Repeated Replication
[Table: pseudo-stratum and school by replicates R1 to R8]
Each replicate = an estimate of the sampling variance

Replication methods for multistage sample
How to form the pseudo-strata, i.e. how to pair schools?
[Table columns: ID | Size | From | To | Sampled; total size 400]
– Within explicit strata, with a systematic sampling procedure, schools are sequentially selected
– Pairs are formed according to the sequence: School 1 with School 5, School 8 with School 10

Replication methods for multistage sample
How to form the pseudo-strata, i.e. how to pair schools?
IEA TIMSS / PIRLS procedure [Table columns: ID | Participation | Pseudo-stratum]
PISA procedure [Table columns: ID | Participation | Pseudo-stratum]

Replication methods for multistage sample
Balanced Repeated Replication
[Table columns: Stratum 1 | Stratum 2 | Stratum 3 | Stratum 4]
– With L pseudo-strata, there are 2^L possible combinations
– If 4 strata, then 16 combinations
– Same efficiency with a Hadamard matrix of order 4

Replication methods for multistage sample
Hadamard Matrix
– Each row is orthogonal to all other rows, i.e. the sum of the products is equal to 0
– For each combination, schools are selected according to this matrix
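A Hadamard matrix whose order is a power of 2 can be built with the Sylvester construction, and its rows can then be used to decide which school of each pseudo-stratum is kept in a replicate. The sketch below is one possible reading of the slide: +1 keeps the first school of the pair (weight factor 2, the other 0) and -1 keeps the second. That mapping is an assumed convention.

```python
def hadamard(order):
    """Sylvester construction; `order` must be a power of 2."""
    h = [[1]]
    while len(h) < order:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h

def brr_factors(h_row):
    """Weight factors of the two schools of each pseudo-stratum for one replicate."""
    factors = []
    for sign in h_row:
        factors.extend([2.0, 0.0] if sign == 1 else [0.0, 2.0])
    return factors

H = hadamard(4)
for row in H:
    print(row, brr_factors(row))

# Orthogonality: the sum of products of any two distinct rows is 0
print(all(sum(a * b for a, b in zip(H[i], H[j])) == 0
          for i in range(4) for j in range(4) if i != j))   # True
```

With 4 pseudo-strata, these 4 balanced replicates stand in for the 16 possible half-samples.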

Replication methods for multistage sample

Replication methods for multistage sample
Fay's method
[Table: pseudo-stratum and school by replicates R1 to R8]
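Fay's method is a variant of BRR in which no school is dropped entirely: the weight factors 0 and 2 are replaced by k and 2 - k, with k between 0 and 1. A short sketch, taking one row of +1/-1 signs as given and using k = 0.5 as an illustrative default; which school of the pair receives k is an assumed convention:

```python
def fay_factors(signs, k=0.5):
    """Weight factors of the two schools of each pseudo-stratum for one replicate."""
    factors = []
    for sign in signs:
        factors.extend([2 - k, k] if sign == 1 else [k, 2 - k])
    return factors

print(fay_factors([1, -1, 1, -1]))
# [1.5, 0.5, 0.5, 1.5, 1.5, 0.5, 0.5, 1.5]
```

Because every school keeps a positive weight, statistics computed on small subgroups remain defined in every replicate.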

Replication methods for multistage sample
General formula
– BRR / Fay: each replicate is an estimate of the sampling variance
  C = average
  Same number of replicates for each country
– JK2: each replicate corresponds to the contribution of a pseudo-stratum to the sampling variance estimate
  C = sum
  Possibility of a different number of replicates
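All of these methods share one shape: the sampling variance is a constant C times the sum of squared differences between each replicate estimate and the full-sample estimate. A sketch of that general form, assuming the usual constants: 1/G for BRR, 1/(G(1 - k)²) for Fay, 1 for JK2, and (G - 1)/G for JK1, which is not listed on the slide but is used later in the deck. The function name and the toy numbers are illustrative.

```python
def replicate_variance(full_estimate, replicate_estimates, method, fay_k=0.5):
    """C * sum of squared deviations of the replicate estimates."""
    g = len(replicate_estimates)
    ss = sum((r - full_estimate) ** 2 for r in replicate_estimates)
    constants = {
        "BRR": 1 / g,                        # average of the replicates
        "Fay": 1 / (g * (1 - fay_k) ** 2),   # average, rescaled for k
        "JK2": 1.0,                          # sum over pseudo-strata
        "JK1": (g - 1) / g,
    }
    return constants[method] * ss

replicates = [9.0, 11.0, 9.0, 11.0]   # toy replicate estimates
for method in ("BRR", "Fay", "JK2", "JK1"):
    print(method, replicate_variance(10.0, replicates, method))
# BRR 1.0, Fay 4.0, JK2 4.0, JK1 3.0
```

The same function applies to any statistic (mean, percentage, regression coefficient): only the way the replicate estimates are produced changes.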

Replication methods for multistage sample
In the case of infinite populations, the sampling variance of the mean estimate consists of 2 components in the case of a PISA sampling design:
– Between-school variance
– Within-school variance
Replication methods, by removing entire schools, only integrate the uncertainty due to the selection of schools, not due to the selection of students within schools

Replication methods for multistage sample
Let us imagine an educational system with no school variance and with infinite populations of students within schools

Replication methods for multistage sample
An illustration of this mathematical equality:
– Estimation of the SE by JK1 replication method
[Table: school means for the full sample and for replicates R1 to R10, with the mean of each column]

Replication methods for multistage sample
An illustration of this mathematical equality:
– Estimation of the SE by the formula
[Table: School | School mean; Mean = 145, SS = 8250, MS]
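The equality between the JK1 estimate and the formula can be checked numerically. The sketch below assumes 10 school means of 100, 110, up to 190. These values are not in the transcript; they are chosen because they reproduce the mean of 145 and the sum of squares of 8250 shown on the slide.

```python
from math import sqrt

school_means = [100.0 + 10 * i for i in range(10)]   # assumed values
g = len(school_means)
full_mean = sum(school_means) / g
ss = sum((m - full_mean) ** 2 for m in school_means)

# Formula: the school means are treated as a simple random sample of size g
ms = ss / (g - 1)
se_formula = sqrt(ms / g)

# JK1: each replicate drops one school and averages the others
replicate_means = [(sum(school_means) - m) / (g - 1) for m in school_means]
se_jk1 = sqrt((g - 1) / g * sum((r - full_mean) ** 2 for r in replicate_means))

print(full_mean, ss)                              # 145.0 8250.0
print(round(se_formula, 3), round(se_jk1, 3))     # 9.574 9.574
```

Each school mean already carries the sampling error of its own students, so removing whole schools captures the within-school component as well as the between-school one.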