UK Longitudinal Studies Centre
Weighting the BHPS
Nick Buck
Institute for Social and Economic Research, University of Essex
Outline
- Brief description of the BHPS
- What this means for weighting
- How we calculate BHPS weights
- Basic use of BHPS weights
- Implications of new samples for weighting
Overview of the BHPS
- Repeated data collection about a panel of individuals
- Usually interview all members of respondent households
- Follow individuals as they move between different households
- Short intervals between surveys allow collection of continuous information
- Use a household panel where household context is important (e.g. income dynamics)
- Collects information on changing household units
BHPS overview
- Annual survey of members of an initial sample of 5511 households interviewed in autumn 1991
- Random sample representative of the population of Great Britain (south of the Caledonian Canal)
- Follow sample members as they move and form new households
- Sample increased by births, new household members, ...; reduced by deaths, refusals, moves out of scope
BHPS sample members
- Original sample members (OSMs): in originally selected households (plus births to OSMs); always eligible for interview if living in GB/UK
- Temporary sample members (TSMs): people who join households with OSMs; indeterminate selection probabilities; only eligible for interview if living with an OSM
- Permanent sample members (PSMs): a subset of TSMs who continue to be eligible for interview if no longer living with an OSM
Implications of BHPS design for weighting
- Analysis at both individual and household levels
- Design for both longitudinal and cross-sectional analysis: separate weights?
- Different types of respondent: full-interview respondents, children etc.
- Different types of sample membership status
- Presents issues of both unequal selection probabilities and unequal response probabilities (attrition)
Types of weights
- Longitudinal: for respondents at all waves up to the last wave in the sequence (individual level only, not household)
- Cross-sectional: for all respondents at a wave, regardless of sample status and response history; available at individual and household levels
- Weights for full-interview respondents, and weights for all individuals (including children) in respondent households
- So, for the main sample, four person-level weights and one household-level weight
Longitudinal weights
- Based on a sequence of attrition models, one for each wave; the adjustment from each model is multiplied by the previous wave's weight (cf. Lynn discussion)
- Models are based on a CHAID algorithm to identify the imputation classes that best predict response
- Post-stratified against the 1991 census
- Trimmed to a maximum value of 2.5
- Only available for individuals who responded at each wave up to the latest (but sample children who reach 16 and are then interviewed have respondent weights)
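The wave-on-wave update described on this slide can be sketched as follows. This is only an illustration, not the BHPS production procedure: it assumes the attrition adjustment is the inverse of a predicted response probability for the respondent's imputation class, it omits post-stratification, and the function name and example figures are hypothetical. Only the cap of 2.5 comes from the slide.

```python
def longitudinal_weight(prev_weight, response_prob, cap=2.5):
    """Carry a weight forward one wave.

    prev_weight   -- the respondent's weight at the previous wave
    response_prob -- assumed predicted probability of responding at this
                     wave (e.g. the response rate in the respondent's class)
    cap           -- trimming threshold (2.5 is the value on the slide)
    """
    if not 0.0 < response_prob <= 1.0:
        raise ValueError("response_prob must be in (0, 1]")
    # Inverse-probability adjustment, then trim large weights.
    return min(prev_weight / response_prob, cap)


# Hypothetical respondents: (previous-wave weight, predicted response probability)
panel = [(1.0, 0.9), (1.2, 0.8), (2.0, 0.5)]
weights = [longitudinal_weight(w, p) for w, p in panel]
```

The third case shows the effect of trimming: the untrimmed weight would be 4.0, but it is held at 2.5.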
Cross-sectional weights
- Enumerated individual weight computed in a single step, adjusting for attrition from the first wave (and multiplied by the first-wave weight)
- TSMs given a weight on the basis of the fair shares method (i.e. sum of OSM weights divided by total number of OSMs + TSMs)
- Respondent weight based on the probability of responding, given membership of an enumerated household
- Cross-sectional household weight = enumerated individual weight
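The fair shares rule quoted on this slide is simple arithmetic, shown here as a minimal sketch. The function name and the example household are hypothetical; the formula is the one given on the slide.

```python
def fair_share_weight(osm_weights, n_tsm):
    """Fair-share weight for a household.

    osm_weights -- weights of the OSMs in the household
    n_tsm       -- number of TSMs in the household
    Returns the sum of OSM weights divided by the number of OSMs + TSMs.
    """
    n_members = len(osm_weights) + n_tsm
    if n_members == 0:
        raise ValueError("household has no members")
    return sum(osm_weights) / n_members


# Hypothetical household: two OSMs with weights 1.2 and 0.9, joined by one TSM.
share = fair_share_weight([1.2, 0.9], n_tsm=1)  # (1.2 + 0.9) / 3
```

In this example the TSM would receive a weight of about 0.7.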
Variables used in construction of weights
- Respondent weight models use a wide range of social, economic and demographic characteristics from the previous wave, measured at both household and individual level (also whether recorded as a mover between waves)
- Enumerated individual weights use individual demographic characteristics, plus measures of the socio-economic status of the household, including household-level measures (e.g. income and tenure) and characteristics of key individuals in the household
Advice on using weights
- Analysis of change up to wave W should use the wave W longitudinal weight, which adjusts for attrition losses
- Cross-sectional analysis can use either the cross-sectional weight or the longitudinal weight; the former includes many more cases
- If using cross-sectional weights, temporary sample members must be included in the analysis
- Weights affect standard errors and significance tests; SPSS, for example, does not adjust for this
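Why the choice of software matters for standard errors can be illustrated with a small sketch. It compares two ways of computing the standard error of a weighted mean: one that reads weights as case counts, and a linearisation estimator that reads them as sampling weights. Which formula any particular package applies is an assumption here, clustering and stratification are ignored, and the data are invented.

```python
import math


def weighted_mean(y, w):
    """Weighted mean of y with weights w."""
    return sum(wi * yi for yi, wi in zip(y, w)) / sum(w)


def se_frequency_weights(y, w):
    """SE when weights are read as case counts (N = sum of weights)."""
    n = sum(w)
    m = weighted_mean(y, w)
    var = sum(wi * (yi - m) ** 2 for yi, wi in zip(y, w)) / (n - 1)
    return math.sqrt(var / n)


def se_probability_weights(y, w):
    """Linearisation SE of a weighted mean, treating weights as sampling
    weights (no clustering or strata)."""
    n = len(y)
    m = weighted_mean(y, w)
    s = sum((wi * (yi - m)) ** 2 for yi, wi in zip(y, w))
    return math.sqrt(n / (n - 1) * s) / sum(w)


# Invented data: four cases with unequal weights that sum to 4.
y = [10.0, 12.0, 20.0, 30.0]
w = [0.5, 0.5, 1.0, 2.0]
mean = weighted_mean(y, w)
se_freq = se_frequency_weights(y, w)
se_prob = se_probability_weights(y, w)
```

Both approaches give the same point estimate, but in this example the sampling-weight standard error is larger. With equal weights the two formulas coincide.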
BHPS samples
- Initial 1991 GB sample
- ECHP low-income sub-sample
- Scotland and Wales extension samples: about 1500 extra households in each, 1999->
- Northern Ireland extension sample: 2000 extra households, >
- Each contains OSMs, TSMs and PSMs
New samples and weighting
- Cases in new samples have different selection probabilities from the initial sample
- ECHP sample marginally raises the selection probability of low-income households
- Scotland, Wales and Northern Ireland samples substantially raise selection probabilities in these areas
- New cross-sectional weights incorporating ECHP
- Full new set of weights from 1999 onwards (rebased on all OSM respondents in that year), and a further new set from 2001
New samples and weighting II
- So, increasing proliferation of weights and increasing complexity of weight selection: need to decide what the sample of interest is (this will depend on both substantive concerns and the years for which data are required)
- Data set contains two versions, one for GB/UK analysis, the other for separate analysis of Scotland, Wales etc.
- New weights have higher variance, so lower effective sample size. For example, at wave [...] sample longitudinal weight increases variance by 14%, all sample weight by 47%.
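The link between weight variability and effective sample size can be made concrete with Kish's approximation, sketched below. The slide does not say how its variance inflation figures were computed, so the use of this particular formula is an assumption, and the function names and example weights are hypothetical.

```python
def kish_design_effect(weights):
    """Kish's approximate variance inflation from unequal weights:
    deff = n * sum(w^2) / (sum(w))^2, equal to 1 + CV^2 of the weights."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


def effective_sample_size(weights):
    """Effective sample size under the same approximation: n / deff."""
    return sum(weights) ** 2 / sum(w * w for w in weights)


# Hypothetical weights for four cases.
w = [0.5, 0.5, 1.0, 2.0]
deff = kish_design_effect(w)      # 4 * 5.5 / 16 = 1.375
n_eff = effective_sample_size(w)  # 16 / 5.5, about 2.9 of 4 cases
```

Under this approximation, a weight that increases variance by 14% corresponds to an effective sample size of n / 1.14, and one that increases it by 47% to n / 1.47.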
Gross usual pay: confidence intervals
Table columns: Weight, Country, Mean, Lower bound, Upper bound
Table rows, each reported for England, Wales and Scotland:
- Unweighted
- XRWTSW1 (STATA)
- XRWTSW1 (SPSS)
- XRWTSW1 with design effect
Further issues
- Computing weights for other response samples
- Arguments for not using weights
- Arguments for constructing special-purpose weights
- Pooled multi-wave samples
- Statistical procedures which do not accept weights