Sample enumeration: Forecasting from statistical models Dr Vernon Gayle and Dr Paul Lambert (Stirling University) Tuesday 29th April 2008.



Communicating Results (to non-technically informed audiences)
Davies (1992) Sample enumeration
Payne (1998) Labour Party campaign data
Gayle et al. (2002)
War against the uninformed use of odds (e.g. on breakfast TV)

Sample Enumeration Methods
In a nutshell: "What if?" For example, what if the gender effect were removed?
1. Fit a model (e.g. logit).
2. Focus on a comparison (e.g. boys and girls).
3. Use the fitted model to estimate a fitted value for each individual in the comparison group.
4. Sum these fitted values and construct a sample-enumerated percentage for the group.
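Steps 3 and 4 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the intercept, the `parent_degree` covariate and its coefficient, and the four example boys are all hypothetical. The only value taken from the slides is the boys' odds of 0.66, used here as the gender coefficient on the log-odds scale. The model is assumed to have been fitted already (step 1).

```python
import math


def inv_logit(z):
    """Convert a log-odds value into a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def sample_enumerate(rows, coefs, intercept, overrides):
    """Mean fitted probability (as a %) for a group after applying overrides.

    `overrides` holds the "what if" changes, e.g. {"boy": 0} switches the
    gender effect off while leaving every other covariate untouched.
    """
    total = 0.0
    for row in rows:
        counterfactual = {**row, **overrides}
        z = intercept + sum(coefs[name] * counterfactual[name] for name in coefs)
        total += inv_logit(z)
    return 100.0 * total / len(rows)


# Hypothetical fitted logit: only the boys' odds of 0.66 comes from the slides.
intercept = 0.2
coefs = {"boy": math.log(0.66), "parent_degree": 0.9}

# Hypothetical comparison group (step 2): four boys with differing covariates.
boys = [
    {"boy": 1, "parent_degree": 0},
    {"boy": 1, "parent_degree": 1},
    {"boy": 1, "parent_degree": 0},
    {"boy": 1, "parent_degree": 1},
]

fitted_pct = sample_enumerate(boys, coefs, intercept, {})
what_if_pct = sample_enumerate(boys, coefs, intercept, {"boy": 0})
print(f"Boys, as fitted: {fitted_pct:.1f}%")
print(f"Boys, gender effect removed: {what_if_pct:.1f}%")
```

Each boy keeps his own covariate values, so the "what if" percentage reflects the group's actual composition rather than a single "average" pupil.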

Naïve Odds
Naïvely presenting odds ratios is widespread (e.g. Connolly 2006).
In this model, read naïvely (after controlling for other factors):
Girls have odds of 1.0
Boys have odds of 0.58
We should avoid this where possible!

Logit Model
Example from YCS 11 (these pupils took GCSEs in 2001)
y = 1: 5+ GCSE passes (A*-C)
X variables: gender; family social class (NS-SEC); ethnicity; housing tenure; parental education; parental employment; school type; family type

Naïve Odds
Example from YCS 11 (these pupils took GCSEs in 2001)
In this model, read naïvely (after controlling for other factors):
Girls have odds of 1.0
Boys have odds of 0.66
We should avoid this where possible!
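One way to see why a bare odds figure is hard for a lay audience to interpret: the same odds ratio implies very different percentage-point gaps depending on the baseline. The sketch below applies the 0.66 from this slide to three hypothetical baseline probabilities for girls. The baselines and the function name are assumptions made for illustration, and this is offered as one possible reading of the slide's warning rather than the authors' own argument.

```python
def apply_odds_ratio(baseline_prob, odds_ratio):
    """Probability implied by multiplying the baseline odds by an odds ratio."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


ODDS_RATIO = 0.66  # boys' odds relative to girls, from the slide

gaps = {}
for girls_prob in (0.10, 0.50, 0.90):  # hypothetical baselines
    boys_prob = apply_odds_ratio(girls_prob, ODDS_RATIO)
    gaps[girls_prob] = 100.0 * (girls_prob - boys_prob)
    print(f"girls {girls_prob:.0%} -> boys {boys_prob:.1%} "
          f"(gap {gaps[girls_prob]:.1f} points)")
```

The gap in percentage points is largest near a 50% baseline and shrinks towards either extreme, which a single odds figure does not convey.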

Sample Enumeration Results
Percentage with 5+ GCSEs (A*-C)
All: 52%
Girls: 58%
Boys: 47%
(Sample enumeration estimate, boys): (50%)
Observed difference: 11%
Difference due ‘directly’ to gender: 3%
Difference due to other things: 8%
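The decomposition on this slide is simple arithmetic on three of the reported percentages. The snippet below reproduces it using the slide's rounded figures; the variable names are illustrative.

```python
# Rounded percentages reported on the slide.
girls_pct = 58
boys_pct = 47
boys_what_if_pct = 50  # sample enumeration estimate for boys

observed_gap = girls_pct - boys_pct           # total girl-boy difference
direct_gender = boys_what_if_pct - boys_pct   # removed with the gender effect
other_things = girls_pct - boys_what_if_pct   # remains after removing it

print(observed_gap, direct_gender, other_things)  # prints: 11 3 8
```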

Pseudo Confidence Interval
Sample enumeration, male effect
Upper bound: 50.32%
Estimate: 49.81%
Lower bound: 49.30%
Bootstrapping was used to construct a pseudo confidence interval (1,000 replications).
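A percentile bootstrap of this kind might look like the sketch below. It makes several assumptions that the slide does not confirm: the interval is taken as the 2.5th and 97.5th percentiles, the model coefficients are held fixed (a fuller bootstrap would refit the model in every replication), and the "what if" fitted probabilities are simulated, hypothetical values rather than YCS data.

```python
import random


def bootstrap_percentile_interval(values, replications=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of `values`, as a %."""
    rng = random.Random(seed)
    n = len(values)
    estimates = []
    for _ in range(replications):
        # Resample individuals with replacement and recompute the percentage.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        estimates.append(100.0 * sum(resample) / n)
    estimates.sort()
    tail = (1.0 - level) / 2.0
    lower = estimates[int(tail * replications)]
    upper = estimates[int((1.0 - tail) * replications) - 1]
    return lower, upper


# Hypothetical "what if" fitted probabilities for 500 boys.
data_rng = random.Random(1)
what_if_probs = [data_rng.uniform(0.2, 0.8) for _ in range(500)]

estimate = 100.0 * sum(what_if_probs) / len(what_if_probs)
lower, upper = bootstrap_percentile_interval(what_if_probs)
print(f"Lower {lower:.2f}%  Estimate {estimate:.2f}%  Upper {upper:.2f}%")
```

Fixing the seed makes the interval reproducible between runs; the width shrinks as the group size grows.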

Reference
A technical explanation of the issue is given in:
Davies, R.B. (1992) ‘Sample Enumeration Methods for Model Interpretation’, in P.G.M. van der Heijden, W. Jansen, B. Francis and G.U.H. Seeber (eds) Statistical Modelling. Elsevier.