Plausibility Ranges for Population Estimates: Focusing on ranges for children
Outline: Aims; Data sources; Approaches; Results for children; Research on other age groups; Summary of benefits
Aim: Explore and combine administrative sources to produce a high estimate and a low estimate of the population.
Data sources: Patient Register (0-15). The GP Patient Register is compared with the usually resident population aged 0 to 15; the two overlap for usual residents registered with a GP. Coverage differences: short-term migrants; list inflation or registration lag for out-migrants; multiple or duplicate NHS numbers; school boarders; non-registration or registration lag for in-migrants; erroneous list cleaning. The data were adjusted to remove short-term migrants and duplicate records.
Data sources: Child Benefit (0-15). Child Benefit data are compared with the usually resident population aged 0 to 15; the two overlap for usual residents registered for Child Benefit. Coverage differences: change of details lag for out-migrants; children living abroad; short-term immigrants; non-registration or registration lag; change of details lag for in-migrants; school boarders; clerical claims. It was not possible to adjust the Child Benefit data to compensate for these coverage differences.
Data sources: Live births (age 0). Live births are compared with the usually resident population aged 0; the two overlap for usual residents born in LA j. Coverage differences: immigration between birth and mid-year; in-migrants between birth and mid-year; emigration between birth and mid-year; out-migrants between birth and mid-year; infant mortality. The live births data have been adjusted for infant mortality (IMR). To allow for internal migration between birth and mid-year, live births net of infant mortality have been redistributed to local authorities using Child Benefit data.
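The slide does not say how the redistribution using Child Benefit data was carried out. As a rough illustration only, the sketch below assumes a simple proportional allocation: the national total of live births net of infant deaths is shared out across local authorities in proportion to each authority's Child Benefit count at age 0. The function name, the input dictionaries and the figures are all hypothetical.

```python
def redistribute_births(births_by_la, infant_deaths_by_la, child_benefit_age0_by_la):
    """Share national births net of infant deaths across LAs.

    ASSUMPTION: allocation is proportional to each LA's Child Benefit
    count at age 0. The source does not specify the actual method.
    """
    national_total = sum(births_by_la.values()) - sum(infant_deaths_by_la.values())
    cb_total = sum(child_benefit_age0_by_la.values())
    return {
        la: national_total * cb_count / cb_total
        for la, cb_count in child_benefit_age0_by_la.items()
    }


# Hypothetical figures for two local authorities, "A" and "B".
births = {"A": 1000, "B": 500}
infant_deaths = {"A": 5, "B": 3}
child_benefit_age0 = {"A": 900, "B": 600}

redistributed = redistribute_births(births, infant_deaths, child_benefit_age0)
print(redistributed)
```

Under this assumption the national total is preserved, and only its split between local authorities changes.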
Data sources: School Census (3-15). The School Census is compared with the usually resident population aged 3 to 15; the two overlap for usual residents at a state-maintained school. Coverage differences: short-term immigrants; change of details lag for out-migrants; multiple pupil reference numbers; school boarders; children aged 3 and 4; attendance lag for immigrants; children at independent schools, pupil referral units or home educated; change of details lag for in-migrants. Although the School Census was available at individual record level, it was not possible to make any adjustments for over-coverage.
Aggregate data: tolerance range approach. Step 1: for each LA j, take the high source and the low source, then calculate the Mid-point (LA j) and the Difference (LA j) between them. For example: Patient Register = 3000 (high source), Child Benefit = 2600 (low source), Mid-point = 2800, Difference = 400.
Aggregate data: tolerance range approach. Step 2: Range size (LA j) = 2 x Difference (LA j). Using the Difference of 400 from Step 1, Range size (LA j) = 800. Centred on the Mid-point (LA j) of 2800, this gives Lower limit (LA j) = 2400 and Upper limit (LA j) = 3200.
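The arithmetic in Steps 1 and 2 can be sketched in Python as follows. The function name is hypothetical, and placing the range symmetrically about the mid-point is an assumption, though it is consistent with the limits of 2400 and 3200 in the worked example.

```python
def tolerance_range(high_source, low_source):
    """Return (mid_point, difference, range_size, lower_limit, upper_limit)."""
    # Step 1: mid-point of, and difference between, the two sources.
    mid_point = (high_source + low_source) / 2
    difference = abs(high_source - low_source)
    # Step 2: the range is twice the difference.
    # ASSUMPTION: the range is centred on the mid-point.
    range_size = 2 * difference
    lower_limit = mid_point - range_size / 2
    upper_limit = mid_point + range_size / 2
    return mid_point, difference, range_size, lower_limit, upper_limit


# Worked example from the slides: Patient Register = 3000, Child Benefit = 2600.
mid_point, difference, range_size, lower_limit, upper_limit = tolerance_range(3000, 2600)
print(mid_point, difference, range_size, lower_limit, upper_limit)
```

Because the range is twice the difference, the limits sit half a difference beyond each source rather than at the source values themselves.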
Aggregate data: tolerance range approach. Step 3: Percentage range size (LA j) = Range size (LA j) / Mid-point (LA j). Range size (%) is restricted to lie between a min % and a max % to prevent very narrow or wide ranges. Chart: size of range (% of mid-point) against rank of LAs, with min % and max % marked and an annotation of 10% of LAs.
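A minimal sketch of Step 3, assuming the restriction is a simple clamp of the percentage range size between a minimum and a maximum. The slide does not give the actual min % and max % values, so the 5 and 20 used here are hypothetical, as is the function name.

```python
def restricted_range(mid_point, range_size, min_pct, max_pct):
    """Clamp the percentage range size and rebuild the limits.

    ASSUMPTION: min_pct and max_pct are supplied by the analyst;
    the source does not state their values.
    """
    pct = 100 * range_size / mid_point
    clamped_pct = min(max(pct, min_pct), max_pct)
    # Rebuild a range centred on the mid-point from the clamped percentage.
    half_width = mid_point * clamped_pct / 100 / 2
    return pct, clamped_pct, mid_point - half_width, mid_point + half_width


# Mid-point 2800 and range size 800 from Steps 1 and 2; thresholds hypothetical.
pct, clamped_pct, lower_limit, upper_limit = restricted_range(2800, 800, min_pct=5, max_pct=20)
print(round(pct, 1), clamped_pct, lower_limit, upper_limit)
```

With these illustrative thresholds the unrestricted range of about 28.6% of the mid-point is capped at 20%, narrowing the limits to 2520 and 3080.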
Record-level data: linkage approach
Record-level sources for LA j (e.g. School Census and Patient Register) are linked, giving a linked dataset plus unlinked School Census records and unlinked Patient Register records, from which a lower limit and an upper limit are set. A high linkage rate gives a narrow range for LA j.
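The linkage idea can be sketched with Python sets. This assumes, for illustration only, that both sources carry a shared record identifier; real linkage would match on personal details. It also assumes the lower limit counts only linked records and the upper limit counts linked plus unlinked records from both sources. All names and identifiers are hypothetical.

```python
def linkage_limits(school_census_ids, patient_register_ids):
    """Return (lower_limit, upper_limit, linkage_rate) for one LA.

    ASSUMPTIONS: records link on an exact shared identifier; the lower
    limit is the linked count and the upper limit adds the unlinked
    records from both sources.
    """
    school = set(school_census_ids)
    register = set(patient_register_ids)
    linked = school & register
    unlinked_school = school - register
    unlinked_register = register - school
    lower_limit = len(linked)
    upper_limit = len(linked) + len(unlinked_school) + len(unlinked_register)
    linkage_rate = lower_limit / upper_limit if upper_limit else 0.0
    return lower_limit, upper_limit, linkage_rate


# Hypothetical identifiers for a very small LA.
lower_limit, upper_limit, linkage_rate = linkage_limits(
    ["a", "b", "c", "d"], ["b", "c", "d", "e"]
)
print(lower_limit, upper_limit, linkage_rate)
```

The more records that link, the fewer unlinked records are left to widen the gap between the two limits, which is why a high linkage rate gives a narrow range.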
Summary of approaches (age group: approach). Under 1s: tolerance range, combining Patient Register and Live Births adjusted with Child Benefit. 1 to 4 year olds: tolerance range, combining Patient Register and Child Benefit. 5 to 7, 8 to 11 and 12 to 15 year olds: lower limit from the linkage approach, using linked Patient Register and School Census (England); upper limit from the tolerance range, combining Patient Register and Child Benefit.
Results: data sources summary
Results: LA example - Adur (males)
Results: LA example - Adur (females)
Results: summary, e.g. all LAs (males 8-11). Relatively few areas have estimates out of range. Where areas have estimates out of range, it is often by a small amount. It is rare for areas to have estimates more than 5% above the upper limit or below the lower limit. Ranges are quite narrow for ages 0 and 1-4, and more areas are slightly out of range.
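A check of this kind might look like the sketch below. It assumes the percentage is measured relative to whichever limit has been breached; the slides do not state the denominator. The function name and the example estimates are hypothetical.

```python
def percent_out_of_range(estimate, lower_limit, upper_limit):
    """Return how far an estimate lies outside its range, as a percentage.

    Positive means above the upper limit, negative means below the lower
    limit, and 0.0 means within range.
    ASSUMPTION: the percentage is relative to the breached limit.
    """
    if estimate > upper_limit:
        return 100 * (estimate - upper_limit) / upper_limit
    if estimate < lower_limit:
        return 100 * (estimate - lower_limit) / lower_limit
    return 0.0


# Hypothetical estimates checked against limits of 2400 and 3200.
above = percent_out_of_range(3300, 2400, 3200)
below = percent_out_of_range(2300, 2400, 3200)
inside = percent_out_of_range(2800, 2400, 3200)
print(above, round(below, 2), inside)
```

An area would count as more than 5% out of range when the absolute value of this figure exceeds 5.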
Plausibility ranges for children. Plausibility ranges are a proof of concept at this stage. The project allowed us to demonstrate techniques using aggregate and record-level data. Results were published on 27 March 2012 (report and Excel-based tool). Results were discussed with LAs at roadshows. We plan to evaluate the ranges further in future.
Research: age group. This is the age at which people are most likely to migrate. Sources: L2, HESA, Patient Register. Areas where the Patient Register is lower than the population estimates are predominantly university towns. Tested approach with HESA and PRDS linkage. Further work on matching is required.
Research: 25-59/64 age group. Investigated the use of confidence intervals around estimates, using the Local Labour Market Database (L2). For quinary age groups the sample size is often small. It is difficult to exclude short-term migrants from the latest tax-year data. Not yet able to apply a universal method for all LAs using the L2.
Research: over retirement age group. The Patient Register (PR) and the Work and Pensions Longitudinal Study (WPLS) were compared. The data sources were often very close to each other, potentially leading to ranges that were not diagnostically useful. Large differences between the sources for females aged and males aged. Surprising result: the population estimate is higher than both PR and WPLS in the 90+ age group.
Summary of benefits. Gathered together metadata and research on administrative sources in one report. Knowledge of administrative sources was fed back to the teams quality assuring the 2011 Census. Helped inform future population estimates methods (e.g. school boarders). Found evidence that a small number of LAs may have had an undercount of 0 and 1 year olds at the 2001 Census. Ranges may be used in quality assuring estimates in future.