ICCS 2009 IDB Workshop, 18th February 2010, Madrid
Training Workshop on the ICCS 2009 database: Weighting and Variance Estimation

Presentation transcript:

Content of this presentation
Analyzing weighted data
Standard errors
  – What are they?
  – Why do we need them?
  – How do we estimate them?

What are sampling weights?
Values assigned to all sampling units
  – Weighted results from the sample can be generalized for the whole population
  – Weights allow unbiased estimates of population parameters
Based on the sample selection probabilities
  – Applied at each sampling stage
Adjusted to correct for non-response
  – Applied at each sampling stage

Weights in ICCS
The ICCS data contain several weight variables
  – Total Student Weight: TOTWGTS
  – Total Teacher Weight: TOTWGTT
  – Total School Weight: TOTWGTC
The IDB Analyzer automatically selects the correct weight

Analyzing weighted data – a simple example
(Figure labels: 1:10 and 1:1)

Unweighted mean

Weighted mean
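The contrast between the unweighted and the weighted mean can be sketched in a few lines of Python. The scores and weights below are invented for illustration and are not taken from the slides. The sketch assumes the labels 1:10 and 1:1 are selection rates, so that a unit sampled at a rate of 1 in 10 carries a weight of 10 and a unit sampled at 1 in 1 carries a weight of 1.

```python
def weighted_mean(values, weights):
    """Return the weighted mean: sum(w * x) / sum(w)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("sum of weights must be non-zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical scores: the first two units were sampled 1 in 10 (weight 10),
# the last two were sampled 1 in 1 (weight 1).
scores = [400.0, 420.0, 600.0, 620.0]
weights = [10.0, 10.0, 1.0, 1.0]

unweighted = sum(scores) / len(scores)
weighted = weighted_mean(scores, weights)

print(f"unweighted mean: {unweighted:.2f}")
print(f"weighted mean:   {weighted:.2f}")
```

In this made-up example the over-represented high scorers pull the unweighted mean upward, while the weighted mean gives each unit the influence it has in the population.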

Example using ICCS data
Civic knowledge score in an ICCS country
Unweighted: average of

Example using ICCS data
Difference: 10.1 score points
Reason for the difference: over-sampling of students in private schools
  – 13.7% of the tested students
  – 5.9% of the sum of weights

What are standard errors?
The standard error of an estimate is the standard deviation of the sampling distribution associated with it
The sampling distribution is the distribution of the statistic for all possible samples of the same size and method
Since we do not select all possible samples, we can only estimate the standard error

What are standard errors good for?
The ICCS results are based on samples
All ICCS results are therefore estimates of unknown population values
Standard errors can be used to measure how close these estimates are likely to be to the true population values

Confidence Intervals
Let ε stand for any statistic of interest
A 95% confidence interval is defined as ε ± 1.96 × SE(ε)
This is the black bar in Table 3.4
With a confidence of 95%, the true mean lies between the lower and upper bounds of this interval
Take rounding into account!
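As a small illustration of the interval described above, the following Python sketch computes a 95% confidence interval from an estimate and its standard error. The function name and the numbers are hypothetical, and the multiplier 1.96 assumes an approximately normal sampling distribution.

```python
def confidence_interval(estimate, standard_error, z=1.96):
    """Return (lower, upper) bounds of estimate ± z * standard_error.

    z = 1.96 corresponds to a 95% interval under a normal approximation.
    """
    margin = z * standard_error
    return estimate - margin, estimate + margin


# Hypothetical values, not taken from any ICCS table.
lower, upper = confidence_interval(estimate=500.0, standard_error=2.0)
print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
```

Because published tables usually show rounded estimates and standard errors, bounds recomputed from rounded values can differ slightly from the published bounds.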

Estimating standard errors
In a simple random sample, estimating the standard error of a mean x̄ is easy
  – Just divide the standard deviation of the sample (s) by the square root of the sample size (n): SE(x̄) = s / √n
In a complex sample design such as that of ICCS, estimating the standard error is not as straightforward
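For the simple random sample case, the calculation can be sketched with the Python standard library alone. The data are invented, and the sketch deliberately ignores weights, clustering and stratification, so it is not appropriate for ICCS data.

```python
import math
import statistics


def srs_standard_error(sample):
    """Standard error of the mean under simple random sampling: s / sqrt(n)."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return s / math.sqrt(n)


# Hypothetical observations.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
se = srs_standard_error(sample)
print(f"mean = {statistics.mean(sample)}, SE = {se:.4f}")
```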

Complex sample design
Clustered sample
  – students within a school are more similar to each other than students from different schools
Stratification
  – usually increases sampling precision
Weights
  – complicate the calculations

Why not just use SPSS?
Standard software packages like SPSS will not give correct estimates for standard errors
The software assumes that the data are from a simple random sample and therefore uses a formula that is incorrect for a complex design
Generally, the estimated standard error will be too small

Jackknife Repeated Replication
Solution: Jackknife Repeated Replication (JRR)
Used for estimating standard errors in complex designs
Basic idea: systematically re-compute a statistic on a set of replicated samples
  – setting the weights to zero for one school at a time
  – while doubling the weights of another school
Estimate the variability of the statistic from the differences between the full-sample estimate and the replicate estimates

The JRR in ICCS
Jackknife variance estimation in ICCS
  – Participating schools are paired according to the order in which they were sampled
  – These school pairs are called jackknife zones: JKZONES (JKZONET, JKZONEC)
  – One school in each zone is randomly assigned an indicator of 1 (0 for the other school): JKREPS (JKREPT, JKREPC)
  – This indicator decides whether a school gets its replicate weight doubled or zeroed
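The replication scheme described above can be sketched in plain Python. This is a simplified illustration, not the IDB Analyzer's code. It assumes that an indicator of 1 means the weight is doubled and 0 means it is set to zero within the zone being replicated, and that the standard error is the square root of the summed squared differences between each replicate estimate and the full-sample estimate. The function names and the data are hypothetical; the lists play the roles of a score variable, TOTWGTS, JKZONES and JKREPS.

```python
import math


def weighted_mean(values, weights):
    """Weighted mean: sum(w * x) / sum(w)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def jrr_standard_error(values, weights, zones, reps):
    """Return (full-sample weighted mean, JRR standard error).

    For each jackknife zone, one replicate is formed: inside that zone,
    weights are doubled where the indicator is 1 and zeroed where it is 0
    (assumed convention); outside the zone, weights are left unchanged.
    """
    full = weighted_mean(values, weights)
    sum_sq = 0.0
    for zone in sorted(set(zones)):
        rep_weights = [
            w * 2 * r if z == zone else w
            for w, z, r in zip(weights, zones, reps)
        ]
        replicate = weighted_mean(values, rep_weights)
        sum_sq += (replicate - full) ** 2
    return full, math.sqrt(sum_sq)


# Hypothetical data: four schools (one record each) in two zones.
values = [500.0, 520.0, 480.0, 510.0]
weights = [10.0, 10.0, 10.0, 10.0]
zones = [1, 1, 2, 2]
reps = [1, 0, 0, 1]

full, se = jrr_standard_error(values, weights, zones, reps)
print(f"weighted mean = {full:.2f}, JRR standard error = {se:.4f}")
```

With real data every student record carries the zone and indicator of its school, so all students of one school are doubled or zeroed together.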

A look inside the IDB Analyzer

Standard error:
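The formula on this slide did not survive in the transcript. A commonly used form of the JRR standard error, given here as an assumption rather than as the slide's exact notation, is:

```latex
\[
  \widehat{SE}(\hat{\varepsilon})
  = \sqrt{\sum_{h=1}^{H} \left( \hat{\varepsilon}_{h} - \hat{\varepsilon} \right)^{2}}
\]
```

Here $\hat{\varepsilon}$ is the statistic computed with the full-sample weights, $\hat{\varepsilon}_{h}$ is the same statistic computed with the replicate weights for jackknife zone $h$, and $H$ is the number of jackknife zones.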

Example using ICCS data
Standard error of the teacher age
SPSS just can't do that

SE and plausible values
For ICCS achievement data, the standard error consists of two components
Sampling error
  – this is what we just discussed
Additionally: measurement error
  – resulting from the use of plausible values
This is the topic of the next presentation

Conclusion
Use the sampling weights!
Compute standard errors using the JRR!

Thank you for your attention!