Analysis of Age and Sex Data United Nations Statistics Division


1 Analysis of Age and Sex Data United Nations Statistics Division

2 Evaluation method of age and sex distribution data
Basic tools
  Graphical analysis
    Population pyramids
    Graphical cohort analysis
  Age and sex ratios
  Summary indices of error in age-sex data
    Whipple’s index
    Myers’ Blended Method
The use of stable population theory
Uses of consecutive censuses
Focus of the presentation

3 Importance of analysis of age-sex structure
Planning purposes – health services, education programs, transportation, labour supply
Social science, economics, gender studies
Studying population dynamics – fertility, mortality, migration
Insight into the quality of census enumeration
Has a strong effect on other characteristics of a population
Determined by fertility, mortality and migration, and follows fairly recognizable patterns

4 What to look for at the evaluation
Possible data errors in the age-sex structure, including:
  Age misreporting (age heaping and/or age exaggeration)
  Coverage errors – net under/over enumeration (by age or sex)
Significant discrepancies in age-sex structure due to extraordinary events
  High migration, war, famine, HIV/AIDS epidemic etc.

5 Approaches to collecting age and its impact on quality
Age - the interval of time between the date of birth and the date of the census, expressed in completed solar years
Two approaches to collecting data on age:
Date of birth (year, month and day) - suggested
  More precise information
Completed age (age at the individual’s last birthday)
  Less accurate
  Misunderstanding: the last, the next or the nearest birthday?
  Rounding to nearest age ending in 0 or 5 (age heaping)
  Children under 1 - may be reported as 1 year of age
For the first approach, one needs to be confident that most of the population know their birthdays by date. It is important that all are using the same calendar (e.g. in countries that use both the Western and Islamic calendars).

6 Basic graphical methods- Population Pyramid
Basic procedure for assessing the quality of census data on age and sex
Displays the size of the population enumerated in each age group (or cohort) by sex
The base of the pyramid is mainly determined by the level of fertility in the population, while how fast it narrows toward the peak is determined by previous levels of mortality and fertility
The levels of migration by age and sex also affect the shape of the pyramid

7 Population pyramid (1) – high population growth
Increasing growth with declining mortality
Smaller size of bars under age 5 indicates that fertility is starting to decline (could also be under-reporting of young children – need to know about context)
Wide base indicates high fertility
Source: Tabulated using data from United Nations Demographic Yearbook

8 Population pyramid (2) – low population growth
Birth cohorts small due to WWII
“Baby boom”
Flattening base indicates long-term low fertility

9 Population pyramid (3) - detecting errors
What does this pyramid suggest?
Under-enumeration of young children (< age 2)
Age misreporting errors (heaping) among adults
Source: U.S. Census Bureau, Evaluating Censuses of Population and Housing

10 Population pyramid (3) - detecting errors
What does this pyramid suggest?
High fertility level
Smaller population in age group – extraordinary events in ?
Fewer males relative to females at ages 20 – 44 - labor out-migration?
Source: Tabulated using data from U.S. Census Bureau, Evaluating Censuses of Population and Housing

11 Population pyramid (4)- detecting errors
Age heaping
Declining fertility
Liberia shows less severe age heaping than in the Yemen example, but it does not appear to be concentrated only on 0 and 5
Qatar shows massive in-migration of males, and to a lesser extent females
Bottom of pyramid shows that fertility is fairly constant

12 Population pyramid (5)- detecting errors
Effect of labour migration

13 Population pyramid (5) - line instead of bars
Data source: Tabulated using data from United Nations Demographic Yearbook

14 Population pyramid (5) - line instead of bars
Single year age Data source: Tabulated using data from United Nations Demographic Yearbook

15 Population pyramid (5) - line instead of bars
Shows the same data as the population pyramid in the previous slide, but in line-graph form.
The single-year plot also shows that age heaping is considerable. The 5-year age group plot is much smoother, so analyzing by 5-year age groups will reduce much of the heaping effect.
The single-year plot shows a decline in numbers under age 5. Fertility decline is unlikely to be so rapid, which would suggest under-enumeration of children; note the aversion to age 1.
The 5-year plot smooths this out a lot and follows the general population trend, suggesting that the issue may be more one of age displacement than one of under-enumeration.
Note that heaping is also common for age data given in day/month/year format: certain days and months may be preferred, as well as ages. The same technique just used for age can be used to plot the distribution of the population, or of a particular age group, by reported day or month of birth.
Data source: Tabulated using data from United Nations Demographic Yearbook

16 School attendance – quality assessment
Expected pattern?

17 Age at death of children (in months) declared by the mother, Nepal 1975
Quality assessment

18 Basic graphical methods - Graphical cohort analysis
Tracking actual cohorts over multiple censuses
The size of each cohort should decline from one census to the next due to mortality, provided there is no significant international migration
The age structure (the lines) for the censuses should follow the same pattern in the absence of census errors
An important advantage - it is possible to evaluate the effects of extraordinary events and other distorting factors by following actual cohorts over time

19 Graphical cohort analysis – Example (1)
ALGERIA, male population by age group, 1998 and 2008

Age group    1998      2008
0-4
5-9
10-14
15-19
20-24
25-29
30-34
35-39        841768
40-44        691275
45-49        565289    817004
50-54        371843    682357
55-59        345318    547181
60-64        301247    354694
65-69        252003    314958
70-74        163292    248672
75-79        107732    181478

Data is organized by birth cohort
Exclude open-ended age category
New cohorts will be added and older cohorts will be lost as we progress to later censuses
People who were born in the same years are compared in the analysis
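The cohort comparison described on this slide can be sketched in a few lines of Python. This is an illustrative sketch, not part of the original presentation: it assumes the figures above are male counts for 1998 and 2008 aligned by age group (the reading that makes every cohort shrink over the ten years), and the function names `shift_age_group` and `cohort_survival` are hypothetical.

```python
# Male counts by five-year age group as read from the Algeria slide
# (assumed alignment: first value per row is 1998, second is 2008).
males_1998 = {
    "35-39": 841768, "40-44": 691275, "45-49": 565289,
    "50-54": 371843, "55-59": 345318, "60-64": 301247,
    "65-69": 252003, "70-74": 163292, "75-79": 107732,
}
males_2008 = {
    "45-49": 817004, "50-54": 682357, "55-59": 547181,
    "60-64": 354694, "65-69": 314958, "70-74": 248672,
    "75-79": 181478,
}


def shift_age_group(label, years):
    """Return the age-group label the same cohort occupies `years` later."""
    low, high = (int(part) for part in label.split("-"))
    return f"{low + years}-{high + years}"


def cohort_survival(earlier, later, interval=10):
    """Ratio of later to earlier count for each cohort seen in both censuses."""
    ratios = {}
    for group, count in earlier.items():
        later_group = shift_age_group(group, interval)
        if later_group in later and count > 0:
            ratios[group] = later[later_group] / count
    return ratios


survival = cohort_survival(males_1998, males_2008)
for group, ratio in survival.items():
    print(f"aged {group} in 1998: {ratio:.3f} of the cohort counted in 2008")
```

In a closed population every ratio should be below 1. A ratio above 1 points to under-enumeration at the earlier census, over-enumeration at the later one, age misreporting, or in-migration.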

20 Graphical cohort analysis – Example (1)
Algeria, 1998 and 2008 Censuses

21 Graphical cohort analysis – Example (2)
Animation 1: To follow a cohort over time, we look at the vertical line that passes through the survival curves.
For the 1948 – 1952 cohort (age 30 – 34 at time of first enumeration), the cohort size declines over time, as we should expect. Note that the vertical gap between the 1992 and 2002 lines is larger than the gap between the earlier lines for males, as we should expect given accelerated mortality for men in particular as they age. The decline for women is smaller.
For the 1978 – 1982 cohort (age 0 – 4 at time of first enumeration), the cohort size increases substantially for both sexes from 1982 to 1992. As large-scale immigration at this age (and in this country) is very unlikely, it is almost certain that the cohort was significantly under-enumerated in 1982.
Also for this cohort, the vertical gap between the 1992 and 2002 censuses is quite large, particularly for males. As even HIV/AIDS mortality is not expected to be so high in this age group (20 – 24 at time of the 2002 census), this is likely due to out-migration of labor to South Africa and elsewhere (note that we can see this throughout the prime working ages in 2002: see how far below the other lines the curve falls).

22 Graphical cohort analysis – Example (3)

23 Graphical cohort analysis – Example (2)

24 Age ratios (1)
In the absence of sharp changes in fertility or mortality, significant levels of migration or other distorting factors, the enumerated size of a particular cohort should be approximately equal to the average size of the immediately preceding and following cohorts
The ratio of the count for a particular cohort to the average of the counts for the adjacent cohorts (the age ratio) should be approximately equal to 1 (or 100 if multiplied by a constant of 100)
Significant departures from this “expected” ratio indicate either the presence of error in the census enumeration or of other factors

25 Age ratios (2)
Age ratio for the age category x to x+4:

$$ {}_5AR_x = \frac{2 \cdot {}_5P_x}{{}_5P_{x-5} + {}_5P_{x+5}} $$

${}_5AR_x$ = The age ratio for the age group x to x+4
${}_5P_x$ = The enumerated population in the age category x to x+4
${}_5P_{x-5}$ = The enumerated population in the adjacent lower age category
${}_5P_{x+5}$ = The enumerated population in the adjacent higher age category
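As a worked check of the formula, here is a sketch using the 1998 Algerian male counts quoted on slide 19 for ages 35-39 (841768), 40-44 (691275) and 45-49 (565289). Reading these three figures as adjacent five-year groups from the same census is an assumption about the flattened table.

$$ {}_5AR_{40} = \frac{2 \times 691275}{841768 + 565289} = \frac{1382550}{1407057} \approx 0.983 $$

A value this close to 1 would not, on its own, flag the 40-44 group as affected by heaping or omission.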

26 Age ratios (3) - example Source: Tabulated using data from United Nations Demographic Yearbook

27 Age ratios (3) - example

Age    Total Pop
5      107400
6      101476
7      100834
8      99548
9      92673
10     109332
11     69642
12     90873
13     72411
14     79408
15     82098
16     72428
17     64152
18     89467
19     67550
20     93943

Note that 5-year age groups smooth the line out quite a bit and restrict the range of values (i.e. the discrepancies between the groups). This is why analysis is often done in 5-year age groups, to mitigate the effects of age heaping.
This picture generally confirms what we saw in the population pyramid (the unusually long bars), which shows some preference for ages ending in 0 and 5 but also for other ages (for the younger population, 12 and 18, which may be significant ages for some reason).
Ratios get considerably more skewed at older ages. It may be that older people are less likely to know their actual age.
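The age ratios for the single-year counts on this slide can be recomputed with a short Python sketch. It assumes the same formula as on slide 25, applied to adjacent single years of age rather than five-year groups; the function name `age_ratios` is hypothetical.

```python
# Single-year total population counts from the slide (ages 5 to 20).
single_year_pop = {
    5: 107400, 6: 101476, 7: 100834, 8: 99548, 9: 92673,
    10: 109332, 11: 69642, 12: 90873, 13: 72411, 14: 79408,
    15: 82098, 16: 72428, 17: 64152, 18: 89467, 19: 67550,
    20: 93943,
}


def age_ratios(pop_by_age):
    """Age ratio = 2 * P(x) / (P(x-1) + P(x+1)) for every age with both neighbours."""
    ratios = {}
    for age in sorted(pop_by_age):
        lower, upper = age - 1, age + 1
        if lower in pop_by_age and upper in pop_by_age:
            neighbours = pop_by_age[lower] + pop_by_age[upper]
            if neighbours > 0:
                ratios[age] = 2 * pop_by_age[age] / neighbours
    return ratios


ratios = age_ratios(single_year_pop)
for age, ratio in ratios.items():
    print(f"age {age:2d}: ratio {ratio:.2f}")
```

Ratios well above 1 mark ages that attract reports (here 10, 12 and 18), and ratios well below 1 mark the neighbouring ages that lose them. The first and last ages have no ratio because one neighbour is missing.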

28 Comparison by sex may help identify extraordinary events and/or patterns of age misreporting by sex

29 Age ratios (3) – example –Yemen

30 Age ratios (3) – example –Yemen
Age misreporting increases with age

31 Sex ratios (1) - calculation
Sex Ratio = 5Mx / 5Fx
5Mx = Number of males enumerated in a specific age group
5Fx = Number of females enumerated in the same age group
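A minimal Python sketch of this calculation follows. The function name `sex_ratios` and the counts in the usage example are hypothetical, chosen only to show the arithmetic; they are not census data.

```python
def sex_ratios(males, females):
    """Return males / females for every age group present in both inputs."""
    ratios = {}
    for group, male_count in males.items():
        female_count = females.get(group, 0)
        if female_count > 0:  # skip groups where the ratio is undefined
            ratios[group] = male_count / female_count
    return ratios


# Hypothetical counts, for illustration only.
example_males = {"20-24": 950, "25-29": 900}
example_females = {"20-24": 1000, "25-29": 1000}

for group, ratio in sex_ratios(example_males, example_females).items():
    print(f"{group}: {ratio:.2f}")
```

Multiplying each ratio by 100 gives the more common "males per 100 females" form.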

32 Sex ratios (2) - example

33 Sex ratios (1) – cohort analysis
Fluctuation due to age misreporting – different levels for males and females?
Two censuses indicate an excess of male population in age group 55-59

34 Sex ratios (3) – cohort analysis
In general we should expect the sex ratio (SR) to decline over subsequent censuses, due to excess male mortality relative to female mortality.
First, the bump in the year olds in 2000 (the birth cohort) clearly does not show up in the other censuses, which suggests that there is an age misreporting issue and the bump is not “real”.
The data are also unexpected for the series of cohorts born in the 1930s and 1940s: normally we would not expect a sex ratio over 1 at these ages, so possible historical causes of excess female mortality in these age groups need to be investigated.
(The 10-year gap is fudged a bit: there is actually an 8-year gap between the oldest two censuses, but this should not cause a great difference over 5-year age groups.)

35 Summary indices - Whipple's Index
Developed to reflect preference for or avoidance of a particular terminal digit or of each terminal digit
Ranges between 100, representing no preference for “0” or “5”, and 500, indicating that only digits “0” and “5” were reported in the census
If heaping on terminal digits “0” and “5” is measured:

$$ \text{Index} = \frac{P_{25} + P_{30} + \dots + P_{55} + P_{60}}{\tfrac{1}{5} \sum_{a=23}^{62} P_a} \times 100 $$

where $P_a$ is the population enumerated at single year of age $a$.
Source: Shryock and Siegel, 1976, Methods and Materials of Demography

36 Whipple's Index (2)
If the heaping on terminal digit “0” is measured:

$$ \text{Index} = \frac{P_{30} + P_{40} + P_{50} + P_{60}}{\tfrac{1}{10} \sum_{a=23}^{62} P_a} \times 100 $$

The choice of the range 23 to 62 is standard, but largely arbitrary. In computing indexes of heaping, ages during childhood and old age are often excluded because they are more strongly affected by other types of errors of reporting than by preference for specific terminal digits
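Both variants of the index can be sketched with one small Python function. This is an illustrative sketch of the index as commonly defined over ages 23 to 62, not code from the presentation; the function name `whipple_index` and its parameters are hypothetical, and the input is assumed to be a dict mapping single year of age to an enumerated count.

```python
def whipple_index(pop_by_age, digits=(0, 5), low=23, high=62):
    """Whipple's index for the given terminal digits over ages low..high.

    digits=(0, 5) measures heaping on "0" and "5" (range 100 to 500);
    digits=(0,) measures heaping on "0" only.
    """
    ages = range(low, high + 1)
    total = sum(pop_by_age.get(age, 0) for age in ages)
    if total == 0:
        raise ValueError("no population in the selected age range")
    heaped = sum(pop_by_age.get(age, 0) for age in ages if age % 10 in digits)
    # Under no digit preference, each terminal digit holds 1/10 of the total.
    return 100.0 * heaped * 10 / (len(digits) * total)


# Two limiting cases built from synthetic counts, for illustration only.
uniform = {age: 1000 for age in range(0, 100)}
fully_heaped = {age: 1000 for age in range(0, 100, 5)}

print(whipple_index(uniform))               # no preference for 0 or 5
print(whipple_index(fully_heaped))          # every age ends in 0 or 5
print(whipple_index(uniform, digits=(0,)))  # terminal digit 0 only
```

Keeping the arithmetic in whole numbers until the final division avoids small floating-point drift in the two limiting cases.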

37 Whipple's Index (3)
The index can be summarized through the following categories:

Category                Value of Whipple’s Index
Highly accurate data    <= 105
Fairly accurate data    105 – 109.9
Approximate data        110 – 124.9
Rough data              125 – 174.9
Very rough data         >= 175
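The categories can be turned into a small lookup, sketched below in Python. The function name `whipple_category` is hypothetical, and the lower bound of each middle category is an assumption: each band is taken to start where the previous one ends.

```python
def whipple_category(index):
    """Map a Whipple's index value to the quality label used on this slide."""
    if index <= 105:
        return "Highly accurate data"
    if index < 110:   # assumed band: above 105, up to 109.9
        return "Fairly accurate data"
    if index < 125:   # assumed band: 110 to 124.9
        return "Approximate data"
    if index < 175:   # assumed band: 125 to 174.9
        return "Rough data"
    return "Very rough data"


for value in (100, 108, 120, 150, 200):
    print(value, "->", whipple_category(value))
```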

38 Whipple’s index around the world
Many of the countries that continue to have high Whipple’s Index values are in Sub-Saharan Africa
Note that data are for the most recent census conducted between 1985 and 2003
Data source: Demographic Yearbook special issue on age heaping

39 Improvement in the accuracy of age reporting over time
Shows long-term trend of reduction in value of Whipple’s index, i.e. improvement of age reporting as measured by age heaping

40 Summary indices – Myers' Blended Index
It is conceptually similar to Whipple's index, except that the index considers preference for (or avoidance of) ages ending in each of the digits 0 to 9 in deriving an overall age accuracy score
The theoretical range of Myers' Index is from 0 to 90, where 0 indicates no age heaping and 90 indicates the extreme case where all recorded ages end in the same digit
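The blending idea can be sketched in Python as follows. This is one common formulation, stated here as an assumption rather than as the presentation's own procedure: for each terminal digit d, counts are summed over eight ages starting at 10+d and over eight ages starting at 20+d, weighted by d+1 and 9-d respectively, and the index is half the sum of absolute deviations of the resulting percentages from 10. The function name `myers_blended_index` and its parameters are hypothetical.

```python
def myers_blended_index(pop_by_age, start=10, decades=8):
    """Return (index, percent share per terminal digit) for single-year counts."""
    blended = []
    for digit in range(10):
        # Ages start+digit, start+digit+10, ... (e.g. 10+d to 80+d).
        first = sum(pop_by_age.get(start + digit + 10 * k, 0)
                    for k in range(decades))
        # Same number of ages, beginning ten years later (e.g. 20+d to 90+d).
        second = sum(pop_by_age.get(start + 10 + digit + 10 * k, 0)
                     for k in range(decades))
        blended.append((digit + 1) * first + (9 - digit) * second)
    total = sum(blended)
    if total == 0:
        raise ValueError("no population in the selected age range")
    percents = [100.0 * value / total for value in blended]
    index = sum(abs(share - 10.0) for share in percents) / 2
    return index, percents


# Two limiting cases built from synthetic counts, for illustration only.
uniform = {age: 1000 for age in range(0, 100)}
all_on_zero = {age: 1000 for age in range(0, 100, 10)}

print(myers_blended_index(uniform)[0])      # no digit preference
print(myers_blended_index(all_on_zero)[0])  # every age ends in 0
```

The returned percentages show which digits are over-reported (above 10) and which are avoided (below 10), which the single summary value hides.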

41 Summary indices – Myers' Blended Index

42 Summary indices – Myers' Blended Index
Age misreporting
Ages ending with 0 and 5: over-counting
Ages ending with other digits (particularly with 1, 3, 6, 7 and 9): under-counting

43 Conclusion: Uses and limitations
Assessment of the age and sex structure of the population enumerated in a census is typically the first step taken in evaluating a census by means of demographic methods
Demographic methods provide:
  A quick and inexpensive indication of the general quality of data
  Evidence on the specific segments of the population in which the presence of error is likely
  “Historical” information which may be useful for interpreting the results of evaluation studies based on other methods, and in determining how the census data should be adjusted for use in demographic analyses

44 Conclusion: Uses and limitations
The major limitation of age and sex structure analysis is that it is not possible to derive separate numerical estimates of the magnitude of coverage and content error on the basis of such analyses alone
It is often possible to assess particular types of errors which are likely to affect the census counts for particular segments of the population
Estimates of coverage error from other sources are often required to verify these observations

45 References
Shryock and Siegel, 1976, Methods and Materials of Demography
IUSSP Tools for Demographic Estimation
PAS-Population Analysis Spreadsheets /uscbtoolsdownload.html

