HLM with Educational Large-Scale Assessment Data: Restrictions on Inferences due to Limited Sample Sizes Sabine Meinck International Association.


1 HLM with Educational Large-Scale Assessment Data: Restrictions on Inferences due to Limited Sample Sizes
Sabine Meinck, International Association for the Evaluation of Educational Achievement (IEA), Data Processing and Research Center, Hamburg, Germany
Caroline Vandenplas, University of Lausanne, Switzerland

2 Introduction: Inferences and Sample Sizes
Usually, researchers are not interested in features of the sample itself but want to infer population features from it. Remember your statistics course: when inferring from sample data to a population, you only estimate a population feature, so you need to indicate the uncertainty (precision) of your estimate. The related measure is the standard error (s.e.). Using the standard error, you can build a confidence interval (CI) around the mean, and you can test for group differences. Look up the formula for estimating the s.e. in your statistics books or, when working with LSA data, in the User Guides!
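To make the s.e. and CI concrete, here is a minimal Python sketch. It assumes a simple random sample and a normal approximation (z = 1.96 for a 95% CI); the scores are made up, and the function name `mean_with_ci` is hypothetical. Real LSA data come from complex sample designs, so the procedures in the User Guides should be used there instead.

```python
import math
import statistics


def mean_with_ci(values, z=1.96):
    """Return (mean, s.e., CI lower, CI upper), assuming simple random sampling."""
    n = len(values)
    mean = statistics.fmean(values)
    # s.e. of the mean = sample standard deviation / sqrt(n)
    se = statistics.stdev(values) / math.sqrt(n)
    return mean, se, mean - z * se, mean + z * se


# Made-up achievement scores, for illustration only.
scores = [480, 510, 495, 530, 470, 505, 520, 490]
mean, se, ci_low, ci_high = mean_with_ci(scores)
print(f"mean={mean:.1f}, s.e.={se:.2f}, 95% CI=({ci_low:.1f}, {ci_high:.1f})")
```

Two group means can be compared in the same spirit: if their confidence intervals are far apart, the difference is unlikely to be due to sampling alone.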

3 Introduction: Inferences and Sample Sizes
Sometimes, we cannot detect “significant” differences between the groups to be compared. However, the fact that we cannot detect a difference does not mean there is none! Based on our data, we just don’t know… How can we get more precise results? One possibility: increase the sample size. How and why does this work?

4 The Sample is Picturing the Population…
What if a picture is our population? It has about 340,000 pixels (if I remember correctly…).

5 The Sample is Picturing the Population…
[Figure: the picture sampled at sample sizes ,000, 10,000 and 50,000]
We can play with the sample size to change the precision of our picture. As the sample size increases, the sampling error decreases. The measure of this precision is the standard error (s.e.).
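The effect of sample size on precision can be tried out with a small simulation. The sketch below is an illustration only: the "picture" is a hypothetical list of 340,000 random grey-scale values, the sample sizes are chosen for speed rather than taken from the slide, and `empirical_se` is a made-up helper name.

```python
import random
import statistics


def empirical_se(population, sample_size, replicates, rng):
    """Standard deviation of the sample mean across repeated simple random samples."""
    means = [
        statistics.fmean(rng.sample(population, sample_size))
        for _ in range(replicates)
    ]
    return statistics.stdev(means)


rng = random.Random(2012)  # fixed seed so the run is repeatable
# Hypothetical "picture": 340,000 grey-scale pixel values between 0 and 255.
population = [rng.randint(0, 255) for _ in range(340_000)]

se_by_size = {
    n: empirical_se(population, n, replicates=100, rng=rng)
    for n in (100, 1_000, 10_000)
}
for n, se in se_by_size.items():
    print(f"n={n:>6}: empirical s.e. of the mean = {se:.2f}")
```

Each tenfold increase in sample size shrinks the empirical s.e. by roughly a factor of three, not ten.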

6 Introduction: Inferences and Sample Sizes
This relationship between sample size and s.e. holds for any estimated parameter: percentages, correlation coefficients, regression coefficients, etc. It also holds for any estimated coefficient of a hierarchical model! The relationship is not linear, though, and can depend on many factors.
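As a worked illustration of the non-linearity, take the simplest case: the mean of a simple random sample of size n with sample standard deviation s. The symbols n, s, m and ρ are introduced here for illustration only. The second line uses the well-known design-effect approximation for equal-sized clusters of m units with intra-class correlation ρ; it is a textbook approximation, not a result from this study.

```latex
\[
  \mathrm{s.e.}(\bar{y}) = \frac{s}{\sqrt{n}}
  \quad\Longrightarrow\quad
  \frac{s}{\sqrt{4n}} = \frac{1}{2}\cdot\frac{s}{\sqrt{n}}
\]
\[
  \mathrm{s.e.}_{\text{clustered}}(\bar{y}) \approx \frac{s}{\sqrt{n}}\,\sqrt{1 + (m - 1)\,\rho}
\]
```

Quadrupling the sample size only halves the s.e., and with clustered samples the gain also depends on how the additional units are spread over clusters.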

7 Connection to HLM
What is HLM? HLM = Hierarchical Linear Modeling.
For a nice introduction to HLM see, e.g., Snijders & Bosker (1999).
It is an analysis method that addresses the hierarchical structure of data/populations.
Almost all datasets from large-scale assessments (LSA) display a hierarchical structure, e.g.: students nested in classes/schools (TIMSS, PIRLS, PISA, …); students nested in teachers (TIPI); teachers nested in schools (ICCS, ICILS).
Effects playing out at different levels of a hierarchy can be disentangled.
We can specify “fixed” and “random” effects. Example of random effects: if we know that the slope of some predictor differs a lot between clusters, we may want to let it vary at random.
HLM is an enhancement of linear regression analysis…

8 From Linear Regression to HLM
Assume we are trying to predict student math achievement from SES scores.
Linear regression model: $y_i = \beta_0 + \beta_1 x_i + \varepsilon$, for $i = 1, \dots, n$, where $y_i$ = student achievement, $x_i$ = student SES, $\beta_0$ and $\beta_1$ are unknown coefficients, and $\varepsilon$ = error term (residual).
The subscript “i” denotes students in the sample.
The coefficients and the residual variance are model parameters to be estimated, and all are subject to sampling error!
This model confounds effects between and within groups!
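A minimal sketch of fitting this single-level model by ordinary least squares, using the textbook closed-form formulas, is given below. The data are made up and `ols_with_se` is a hypothetical helper. The standard errors it returns assume independent observations from a simple random sample, which is exactly the assumption that clustered LSA data violate.

```python
import math


def ols_with_se(x, y):
    """Fit y = b0 + b1 * x by least squares; return (b0, b1, se_b0, se_b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    # Residual variance with n - 2 degrees of freedom (two estimated coefficients).
    residual_var = sum(r * r for r in residuals) / (n - 2)
    se_b1 = math.sqrt(residual_var / sxx)
    se_b0 = math.sqrt(residual_var * (1.0 / n + mean_x ** 2 / sxx))
    return b0, b1, se_b0, se_b1


# Made-up SES scores and math achievement for five students.
ses = [-1.5, -0.5, 0.0, 0.5, 1.5]
math_scores = [440, 480, 500, 530, 550]
b0, b1, se_b0, se_b1 = ols_with_se(ses, math_scores)
print(f"intercept={b0:.1f} (s.e. {se_b0:.2f}), slope={b1:.1f} (s.e. {se_b1:.2f})")
```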

9 Example: What if we have a case like this?

10 Example: What if we have a case like this?
Linear regression, no consideration of cluster effects
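What ignoring clusters can do is easy to show with a hypothetical data set (the numbers below are invented and are not the data behind the slide's figure). Within every school the slope is negative, yet the pooled regression across all students yields a positive slope, because it mixes between-school and within-school effects.

```python
def slope(x, y):
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical data: three schools, three students each, as (SES, achievement).
schools = {
    "A": [(-1.5, 460), (-1.0, 450), (-0.5, 440)],
    "B": [(-0.5, 510), (0.0, 500), (0.5, 490)],
    "C": [(0.5, 560), (1.0, 550), (1.5, 540)],
}

# Pooled regression: all students in one pot, school membership ignored.
pooled = [pair for students in schools.values() for pair in students]
pooled_slope = slope([x for x, _ in pooled], [y for _, y in pooled])

# Separate regression within each school.
within_slopes = {
    name: slope([x for x, _ in students], [y for _, y in students])
    for name, students in schools.items()
}

print("pooled slope:", pooled_slope)
print("within-school slopes:", within_slopes)
```

With these numbers the pooled slope is 36 while each within-school slope is -20, so the single-level model would point in the wrong direction for every individual school.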

11 Example – Individuals Belong to Clusters!
Intercepts can vary (“random intercepts”); slopes are fixed (parallel lines).
Slopes can vary (“random slopes”).

12 From Linear Regression to HLM
Extending the model allows us to disentangle the effects at different levels (= Hierarchical Linear Model). E.g., we can specify a model with one explanatory variable at each of level 1 and level 2, in which the intercept and the slope are random. Example research question: Does the influence of SES on achievement vary between schools, i.e., does SES affect students’ achievement with different magnitudes, or even in different directions, in different schools? We acknowledge the fact that within each school, we may have a different regression line. Residuals at both levels are assumed to follow normal distributions with zero means.

13 From Linear Regression to HLM
Hierarchical model:
The first equation is the normal regression equation.
Specifying β0 (second equation), we acknowledge the fact that each school has a different intercept AND we consider the average SES at school level.
Specifying β1 (last equation), we acknowledge the fact that each school has a regression line with a different slope.
U0, U1 and ε are residual terms whose variances are estimated. All model parameters to be estimated are subject to sampling error!
Residuals at both levels are assumed to follow normal distributions with zero means.
The research question can be answered by examining U1 and its significance.
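One common way to write a two-level model matching this description is sketched below. It follows the notation of standard multilevel textbooks and is not necessarily the authors' exact formulation: the school index j, the γ coefficients and the school-mean SES \bar{x}_j are assumptions introduced here.

```latex
\begin{align*}
  y_{ij}     &= \beta_{0j} + \beta_{1j}\, x_{ij} + \varepsilon_{ij}
             && \text{(level 1: student } i \text{ in school } j\text{)} \\
  \beta_{0j} &= \gamma_{00} + \gamma_{01}\, \bar{x}_{j} + U_{0j}
             && \text{(random intercept, with school-mean SES)} \\
  \beta_{1j} &= \gamma_{10} + U_{1j}
             && \text{(random slope)}
\end{align*}
```

In this notation, the question of whether the influence of SES varies between schools comes down to whether the variance of U_{1j} differs significantly from zero.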

14 Purpose of this Research
There is increasing demand to apply hierarchical linear models to educational large-scale assessment (LSA) data.
FAQ (from secondary researchers and survey designers): “How many units do I need at the different levels of the hierarchy to do meaningful multilevel modeling?”
“Meaningful” = achieving parameter estimates with certain precision levels.

15 Data and Methods
We used a Monte Carlo simulation study mimicking typical large-scale assessment data: samples were selected from a virtual population with 2 hierarchical levels.
Varied parameters: sample sizes at both levels; intra-class correlation coefficient; selection probabilities of level 2 units; covariance distributions between levels; model complexity.
288 sampling scenarios were explored, each with 6,000 replicates.
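The general idea of such a simulation can be sketched in a few lines of Python. This is a heavily simplified stand-in, not the authors' design: it draws from a normal model instead of a finite virtual population, uses equal selection probabilities, estimates only the grand mean rather than fitting a hierarchical model, and uses far fewer replicates. All names and numeric settings (`simulate_mean_se`, ICC of 0.3, 300 replicates) are assumptions chosen for illustration.

```python
import math
import random
import statistics


def simulate_mean_se(n_schools, n_students, icc, replicates, seed, total_sd=100.0):
    """Empirical s.e. of the grand mean under two-stage sampling.

    School effects and student residuals are normal, with the total variance
    split between the two levels according to the intra-class correlation.
    """
    rng = random.Random(seed)
    sd_between = total_sd * math.sqrt(icc)
    sd_within = total_sd * math.sqrt(1.0 - icc)
    estimates = []
    for _ in range(replicates):
        scores = []
        for _ in range(n_schools):
            school_effect = rng.gauss(0.0, sd_between)
            scores.extend(
                500.0 + school_effect + rng.gauss(0.0, sd_within)
                for _ in range(n_students)
            )
        estimates.append(statistics.fmean(scores))
    # Spread of the estimate across replicates = empirical standard error.
    return statistics.stdev(estimates)


# Same total sample size (1,000 students), distributed differently over levels.
se_many_schools = simulate_mean_se(50, 20, icc=0.3, replicates=300, seed=1)
se_few_schools = simulate_mean_se(20, 50, icc=0.3, replicates=300, seed=1)
print(f"50 schools x 20 students: s.e. = {se_many_schools:.2f}")
print(f"20 schools x 50 students: s.e. = {se_few_schools:.2f}")
```

Even in this toy setting, the same number of students gives a more precise estimate when spread over more schools, which shows why sample sizes at both levels have to be varied separately.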

16 Research Question
What is the association between sample sizes and the precision of the estimated parameters of hierarchical models under varying population conditions?

17 Results
The precision of all explored parameters increases when the sample size increases, in all scenarios.
In opposition to what is often suggested in the literature, no rule of thumb concerning required sample sizes can be given! Rather, required sample sizes depend heavily on the parameter of interest:
Sampling precision levels vary extremely for different model parameters.
Sample size requirements differ widely for the estimation of fixed model parameters vs. the estimation of variances.
If the research interest is in macro-level regression coefficients: increase the number of sampled clusters.
If the focus is on variance estimates: the level at which the sample size is increased is less important.

18 Results
With the sample sizes typically employed in LSA, some parameters cannot be estimated with sufficient precision (i.e., being significantly different from zero) in relatively simple hierarchical models.
With such imprecise parameters, group differences are even less likely to be detected.
Mean of random intercepts, residual variance: about 1% to 5%.
Slope of random intercepts: > 100% (s.e. bigger than the parameter).
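Reading the percentages above as the standard error relative to the size of the estimate (an interpretation suggested by the remark that the s.e. is bigger than the parameter), the check can be sketched as follows. The helper names and the example numbers are hypothetical, and the z = 1.96 threshold assumes a normal approximation at the 5% level.

```python
def relative_se(estimate, se):
    """Standard error expressed as a fraction of the absolute estimate."""
    return abs(se / estimate)


def differs_from_zero(estimate, se, z=1.96):
    """True if the estimate is significantly different from zero at about the 5% level."""
    return abs(estimate) > z * se


# Made-up examples: a precisely estimated mean and an imprecise slope.
examples = {"mean": (500.0, 10.0), "slope": (2.0, 2.5)}
for name, (estimate, se) in examples.items():
    print(
        f"{name}: relative s.e. = {relative_se(estimate, se):.0%}, "
        f"significant = {differs_from_zero(estimate, se)}"
    )
```

A relative s.e. above 100% always fails this test, since the estimate is then less than one standard error away from zero.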

19 Conclusions
Be mindful when phrasing your research questions and interpreting your results:
Do the data at hand actually suit the type of analysis?
What are the relevant group differences that you want to detect?
Examine the standard errors of the explored model parameters closely.
The full paper with practical guidelines and detailed results can be downloaded at (IERI Monograph Series Special Issue 1, October 2012).

20 Thank you for your attention!
We are grateful to the National Center for Education Statistics (NCES), U.S. Department of Education, who funded the project. Contact:

