Basic Statistics Some general principles Contributors Mark Dancox SWPHO Shelly Bradley EMPHO Jacq Clarkson Somerset PCT Dave Jephson EMPHO.


1 Basic Statistics Some general principles Contributors Mark Dancox SWPHO Shelly Bradley EMPHO Jacq Clarkson Somerset PCT Dave Jephson EMPHO

2 Learning Objectives To learn, understand and use methods of quantifying disease in populations Understand the need for standardisation and know how to apply it To understand the advantages and disadvantages of different methods.

3 Epidemiology “the study of the frequency, distribution and determinants of health problems and disease in human populations” The unit of interest is the population

4 Methods of quantifying disease Absolute counts Ratio Proportion Percentage Rate Special measures –prevalence, –incidence

5 Many sources of information about the occurrence of disease in a population….

6

7 Absolute counts of disease Number of cases of disease that occurred in a specific population –100 cases of lung cancer in Area A –50 cases of lung cancer in Area B Cannot conclude that lung cancer is more frequent in Area A – Need to know the size of the population and the time period involved

8 Ratio A ratio is simply one number divided by another, e.g. the ratio of men to women with a particular disease. This may be misleading, however, as the number of women in a population is not always equal to the number of men, especially in elderly populations.

9 Ratio - example In 2005 there were a total of 512,692 deaths from all causes across all ages. –269,368 of these were deaths occurring in females. –243,324 of these were deaths occurring in males. Thus the ratio of female to male deaths is about 1.1 (= 269,368/243,324). Source: ONS

10 Proportion Those who are included in the numerator are also included in the denominator; a proportion is often expressed as a percentage. As the numerator and the denominator have the same units, these divide out, leaving a dimensionless quantity (a number without units). E.g. we could look at the proportion of all men who have a particular disease.

11 Percentage Similar to proportion – the percentage is the proportion multiplied by 100. In 2005 there were a total of 512,692 deaths from all causes across all ages. –269,368 of these were deaths occurring in females. –243,324 of these were deaths occurring in males. The proportion of deaths in 2005 that occurred among females is 0.525 (= 269,368/512,692). The percentage of deaths in 2005 that occurred among females is 52.5% (= 0.525 × 100). Source: ONS
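As a quick check of the arithmetic on the ratio, proportion and percentage slides, here is a minimal Python sketch using the 2005 ONS figures quoted above. The variable names are illustrative and not part of the original slides.

```python
# Figures quoted on the slides (ONS, 2005 deaths, all causes, all ages).
female_deaths = 269_368
male_deaths = 243_324
total_deaths = female_deaths + male_deaths  # 512,692

# Ratio: one number divided by another (numerator is NOT part of the denominator).
female_to_male_ratio = female_deaths / male_deaths

# Proportion: the numerator is included in the denominator.
female_proportion = female_deaths / total_deaths

# Percentage: the proportion multiplied by 100.
female_percentage = female_proportion * 100

print(round(female_to_male_ratio, 2))  # 1.11
print(round(female_proportion, 3))     # 0.525
print(round(female_percentage, 1))     # 52.5
```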

12 Rate The denominator of a rate consists of all those, and only those, who might appear in the numerator. That is, those AT RISK. A rate specifies the time period during which the outcomes occur.

13 Rate - example In 2004-2006 there were 3,232 conceptions amongst under 18s in the South West. This is expressed as a rate per 1,000 females aged 15-17: in the South West, the under-18 conception rate for the 2004-2006 period was 33.7 per 1,000 females aged 15-17. Source: Health Profiles

14 Prevalence Prevalence quantifies the proportion of individuals in a population who have the disease at a specific instant. Note: No time period is involved here.

15 Example In 24 practices in Scotland with a total male population of size 60,577 there were 577 male patients with epilepsy. Thus the prevalence of epilepsy in this population is 577/60,577 ≈ 0.0095, i.e. about 9.5 per 1,000 males.
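The prevalence calculation for the epilepsy example can be sketched in Python as follows. The function name `prevalence` is an illustrative choice; the two counts are the ones given on the slide.

```python
def prevalence(existing_cases: int, population: int) -> float:
    """Proportion of the population who have the disease at a given instant."""
    return existing_cases / population

# Counts from the slide: 577 male patients with epilepsy among 60,577 males.
p = prevalence(577, 60_577)

print(f"{p:.4f}")                    # 0.0095
print(f"{p * 1000:.1f} per 1,000")   # 9.5 per 1,000
```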

16 Prevalence – example (ii) In 2003-2005 –21.5% of adults in the South West smoke –15.3% of adults in the South West binge drink –25.9% of adults in the South West eat healthily –12.6% of adults in the South West were physically active –23.2% of adults in the South West were obese Source: Health Profiles (using Health Survey for England)

17 Incidence Incidence quantifies the number of new cases of disease that develop in a population of individuals at risk during a specified time period. The denominator, “population at risk”, should consist of the entire population in which new cases can occur.

18 Incidence – things to remember Those having the disease or those who cannot develop the disease because of age or immunization should not be included in the denominator. When persons not at risk are included in the denominator, the resulting rate underestimates the true incidence. If the proportion not at risk is relatively small then including these persons in the denominator won’t significantly influence the incidence rate.

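19 Example If in the previous example there were 165 new cases (of epilepsy) seen during one year, then, taking the population at risk to be the 60,577 - 577 = 60,000 males without epilepsy at the start, the incidence rate would be 165/60,000 = 2.75 per 1,000 per year.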
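A short Python sketch of the incidence calculation, under the assumption (following slide 18) that the 577 existing cases are removed from the denominator. It also shows how little the result changes here if they are left in, because the proportion not at risk is small. Variable names are illustrative.

```python
total_males = 60_577     # male population from the prevalence example
existing_cases = 577     # already have epilepsy, so not at risk
new_cases = 165          # new cases seen during one year

# Assumption: the population at risk excludes the existing cases.
at_risk = total_males - existing_cases  # 60,000

incidence_at_risk = new_cases / at_risk * 1000   # per 1,000 per year
incidence_whole = new_cases / total_males * 1000 # denominator wrongly includes cases

print(f"{incidence_at_risk:.2f} per 1,000 per year")  # 2.75 per 1,000 per year
print(f"{incidence_whole:.2f} per 1,000 per year")    # 2.72 per 1,000 per year
```

Including the people not at risk gives the slightly lower figure, which is the underestimation described on slide 18.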

20 Incidence and prevalence – Quick Questions Cases of cold infections in class 4J. Class size: 20. [Chart of cases across January, February and March not reproduced.] What is the period prevalence during February? 6/20 = 30%. What is the point prevalence on the 28th February? 1/20 = 5%. What is the incidence in February? 4/?

21 How are incidence and prevalence related? For diseases with a low incidence rate but where those with the disease are affected for a long time period e.g. diabetes or asthma, the prevalence will be high relative to the incidence. If the rate of development of a disease is high, but it has a short duration, the prevalence will be low relative to the incidence. Prevalence = Incidence x Average Duration of Disease
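The relationship Prevalence = Incidence x Average Duration can be illustrated with a small Python sketch. The numbers below are hypothetical and chosen only to show the two cases described on the slide; they are not taken from the presentation. The relationship is an approximation that holds best when the disease is in a steady state and prevalence is low.

```python
def approx_prevalence(incidence_per_1000_per_year: float,
                      mean_duration_years: float) -> float:
    """Approximate prevalence per 1,000 as incidence x average duration."""
    return incidence_per_1000_per_year * mean_duration_years

# Hypothetical chronic condition: low incidence, long duration.
chronic = approx_prevalence(2.0, 20.0)

# Hypothetical acute infection: high incidence, short duration (about a week).
acute = approx_prevalence(200.0, 0.02)

print(f"chronic: {chronic:.1f} per 1,000")  # chronic: 40.0 per 1,000
print(f"acute:   {acute:.1f} per 1,000")    # acute:   4.0 per 1,000
```

The chronic condition has one hundredth of the incidence but ten times the prevalence, which is the pattern the slide describes for diseases such as diabetes or asthma.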

22 Incidence and prevalence Sick population (Prevalence) Healthy population Incidence (new cases) die (mortality) recover

23 However… Changes in prevalence over time can be due to changes in incidence rates and/or changes in the duration of the disease. –Decreasing incidence rates due to new preventive measures might result in a low annual incidence in a population where the prevalence is higher. –Preventive measures may also increase the chance of survival for those already with the disease, thus affecting prevalence

24 When to use incidence or prevalence Prevalence rates: –descriptive studies –can calculate the effect of a particular disease in a community –can predict the health care requirements Incidence rates: –studying aetiology –can establish the sequence of events –not susceptible to bias by survival

25 Exercise 1 Calculate the prevalence of diabetes using the QOF data and summarise the data for each practice and for each PCT

26 Crude Rate Is the number of cases in a population divided by the total population during a specific time interval. Provides information on the experience of the population. Useful for the allocation of health resources and public health planning. However if comparing heart disease rates between two populations where one population had a larger proportion of young people then differences in rates might simply reflect the relationship between heart disease and age.

27 Example: Crude cancer rates in the U.S.

28 Category-Specific Rates To account for different population distributions of a factor of interest we can present and compare category-specific rates. These are calculated on a subgroup of the population which is defined by stratifying the populations into categories, e.g. age. They permit comparisons between different categories within the same population.

29 Age specific cancer rates

30

31 Adjusting the cancer rates..

32 Exercise 3 Calculate the age specific diabetes prevalence for practices in Somerset

33 Factors affecting disease Health is affected by many factors, as summarised by Dahlgren and Whitehead’s diagram…

34 One famous determinant… The number of cholera infections fell after John Snow removed the handle from the Broad Street pump.

35 Factors affecting disease Age a confounding factor Occurrence of disease in one area may appear to be higher than in another because: – population structures are different –one area is older than another Relying on crude rates can be misleading Need to ‘standardise’ for the effect of age

36 Question If you are comparing a rate in one area with another, what assumptions are you making?

37 Standardisation

38 Age standardisation Occurrence of disease in one area may appear to be higher than in another because: –Population structures are different –One area is older than another Standardisation used to adjust for the effects of age on mortality rates or other rates Direct or Indirect Involves the calculation of numbers of expected events which are then compared with numbers of observed events

39 Two Methods of standardisation Different methods available but ‘Direct’ and ‘Indirect’ methods are most common Can calculate confidence intervals for each Which method used depends on the comparisons to be performed and the availability of data

40 When to use which? No right or wrong approach, but… –Direct standardisation useful to compare different areas or through time –Indirect Standardisation useful to determine if disease incidence is high or low in one area.

41 But…. Cannot compare different diseases Do what is possible with the data available Can only calculate a DSR if we have population data by age and counts by age. Use an indirectly standardised rate if only overall counts available.

42 Standardisation What do the ‘standards’ refer to? –Population Structure –Reference Rates Try to ensure that the standard population shares characteristics with the study population.

43 Direct Standardisation Uses a standard population in calculations. For a specific population (the study population) age specific rates are determined from observed data The observed age specific rates are then used to determine the number of expected events in a standard population

44 Example of direct standardisation
Age band | Age-specific rate in study pop. (per 100,000) | Proportion of reference pop. in age group | Expected number in reference population
30-39 | 90 | 0.35 | 31.50 (= 0.35 × 90)
40-49 | 355 | 0.35 | 124.25
50-59 | 960 | 0.30 | 288.00
Standardised rate (direct method) = 31.50 + 124.25 + 288.00 = 443.75
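The direct standardisation example can be reproduced with a few lines of Python. This is a minimal sketch using the three age bands from the slide; the rates are assumed to be per 100,000, and the variable names are illustrative.

```python
age_bands = ["30-39", "40-49", "50-59"]

# Observed age-specific rates in the study population (assumed per 100,000).
study_rates = [90, 355, 960]

# Proportion of the standard (reference) population in each age band.
standard_weights = [0.35, 0.35, 0.30]

# The weights must describe the whole standard population.
assert abs(sum(standard_weights) - 1.0) < 1e-9

# Expected events in the standard population if the study rates applied to it.
expected = [rate * w for rate, w in zip(study_rates, standard_weights)]

# Directly standardised rate: the sum of the weighted age-specific rates.
dsr = sum(expected)

for band, e in zip(age_bands, expected):
    print(f"{band}: {e:.2f}")
print(f"DSR: {dsr:.2f}")  # DSR: 443.75
```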

45 Choice of standard population Usually European Standard Population Other standards can be used –Males compared to persons –Specific area to wider region

46 European Standard Population structure

47 Pros and Cons of DSRs Pros: –Able to compare different areas with each other. –Can look at trends through time. Cons: –Rare diseases may have no events in specific age bands, so age-specific rates may be unavailable. –May need to merge events from different years or combine age bands.

48 Exercise 4 Direct Standardisation

49 Indirect Standardisation Uses standard set of age-specific rates and compares the observed number of events in a study population to those expected, assuming the reference rates apply Used to determine if disease incidence is high or low in one area.

50 Example of Indirect standardisation
Age band | No. in study population | Age-specific rate in reference pop. | Expected cases
30-39 | 100 | 0.08 | 8
40-49 | 350 | 0.06 | 21
50-59 | 900 | 0.10 | 90
Total expected cases = 8 + 21 + 90 = 119
If 130 cases in total were observed this would give an SER of 130/119 = 1.09 (109 when expressed relative to a baseline of 100). When the event is death this method gives us the SMR (standardised mortality ratio).
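A minimal Python sketch of the indirect standardisation example, using the study population sizes, reference rates and observed total from the slide. The variable names are illustrative, and the final line scales the ratio by 100 to match the convention used when interpreting SMRs.

```python
age_bands = ["30-39", "40-49", "50-59"]

# Number of people in the study population in each age band.
study_population = [100, 350, 900]

# Age-specific rates in the reference population (events per person).
reference_rates = [0.08, 0.06, 0.10]

# Expected events if the reference rates applied to the study population.
expected_by_band = [n * r for n, r in zip(study_population, reference_rates)]
total_expected = sum(expected_by_band)

observed = 130  # total events actually observed in the study population

ratio = observed / total_expected   # standardised event ratio
ratio_x100 = ratio * 100            # same ratio on the "100 = as expected" scale

print(f"expected: {total_expected:.0f}")  # expected: 119
print(f"ratio:    {ratio:.2f}")           # ratio:    1.09
print(f"x100:     {ratio_x100:.0f}")      # x100:     109
```

Only the total observed count is needed for the study population, which is why this method remains usable when age-specific counts are unavailable.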

51 Types of Indirectly Standardised Rates If the outcome is mortality the indirectly standardised rate is known as a Standardised Mortality Ratio (SMR). If the outcome is incidence the indirectly standardised rate is known as a Standardised Incidence Ratio (SIR).

52 Interpretation of SMRs SMR < 100 : lower rate than expected SMR = 100 : expected rate SMR > 100 : higher rate than expected An SMR of 180 represents a mortality rate that is 80% higher than expected.

53 Pros and cons of indirectly standardised rates Pros: –Can use where diseases are rare –Don’t need information for all age groups –Just need total number of observed and expected counts Cons: –Cannot compare SMRs with each other –Cannot look at trends through time

54 Exercise 5: Indirect Standardisation

55 General Points NCHOD/ ONS publish DSRs and SMRs A SIR is an indirectly standardised rate for the incidence of disease Indirect and Direct standardisation are equally valid. The choice to use one or other of these is partly dependent on the comparisons to be performed and on the limitations of the available data.

56 A final note! Standardisation can be used in many areas. Although we’ve looked at mortality, the technique can be applied in other ways: –Prescriptions –Hat ownership –Lollipops

57 Some Questions… Which type of rate would you use if you are: 1. Trying to monitor a trend 2. Trying to pick out which wards have a mortality rate higher than the regional average 3. Looking at infant mortality 4. Trying to estimate the impact on resources next year

58 Finding out more: APHO http://www.apho.org.uk/resource/item.aspx?RID=48457

59 Finding out more The East Region Public Health Observatory has also produced a useful briefing note on standardisation

60 Finding out more Lots of useful information can be found at the HealthKnowledge website… http://www.healthknowledge.org.uk/

61 Finding out more The NCHOD website also contains useful information on methodology… http://www.nchod.nhs.uk/

62 Finding out more Some further references of interest: –Bland, M. Introduction to Medical Statistics. Third Edition. Oxford University Press, 2000. –Hennekens CH, Buring JE. Epidemiology in Medicine, Lippincott Williams & Wilkins, 1987. –Larson, H.J. Introduction to Probability Theory and Statistical Inference. Third Edition. Wiley, 1982

63 Epidemiology “the study of the frequency, distribution and determinants of health problems and disease in human populations” The unit of interest is the population

64 Types of analytical study Observational studies –Cross sectional study (may be descriptive or analytical) –Case control study –Cohort study Intervention study (experimentation) –Randomised controlled trial (RCT)

65 Types of Risk We may be interested in the incidence of disease in a group exposed to some risk, compared with the incidence in a group not exposed (controls). We will look at several measures of risk: –Attributable Risk –Relative Risk –Odds Ratio

66 Cross-sectional study Information on health status and other characteristics is collected from each subject at one point in time. Cross-sectional studies can be descriptive...(e.g. The prevalence of cough in population)...or analytical –(e.g. the association between cough and risk factors such as type of house lived in or whether person is a smoker)

67 Cohort Study Follow up two groups over time and compare the occurrence of disease. One group is exposed to a possible risk factor for the disease, while the other is not (the control group). The exposure is the starting point; the disease is the outcome of interest. Risk of disease = (number with disease) / (number with disease + number without disease). We may be interested in comparing the risk of disease in the exposed group with the risk in the group not exposed.

68 Calculating Relative Risk $I_e = a/(a+b)$; $I_c = c/(c+d)$; Relative Risk $= I_e / I_c$; Attributable Risk $= I_e - I_c$; Odds Ratio $= (a/c)/(b/d) = ad/(bc)$
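The three measures on this slide can be sketched as one small Python function. The cell layout is an assumption inferred from the formulas (the two by two table itself is an image in the original): a = exposed with disease, b = exposed without disease, c = unexposed with disease, d = unexposed without disease. The function name and the counts in the usage example are hypothetical.

```python
def two_by_two_measures(a: int, b: int, c: int, d: int) -> dict:
    """Risk measures from a 2x2 table.

    Assumed layout: a = exposed & diseased, b = exposed & not diseased,
    c = unexposed & diseased, d = unexposed & not diseased.
    No zero-cell handling: a zero in b, c, or a whole row raises ZeroDivisionError.
    """
    risk_exposed = a / (a + b)      # I_e
    risk_unexposed = c / (c + d)    # I_c
    return {
        "relative_risk": risk_exposed / risk_unexposed,
        "attributable_risk": risk_exposed - risk_unexposed,
        "odds_ratio": (a * d) / (b * c),  # equals (a/c)/(b/d)
    }

# Hypothetical counts, for illustration only.
m = two_by_two_measures(a=20, b=80, c=10, d=90)
print(round(m["relative_risk"], 2))      # 2.0
print(round(m["attributable_risk"], 2))  # 0.1
print(round(m["odds_ratio"], 2))         # 2.25
```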

69 A two by two table used in epidemiology

70 A two by two table used in cohort studies $I_e$ = risk of disease in exposed $= a/(a+b)$; $I_c$ = risk of disease in unexposed $= c/(c+d)$

71 Types of Risk Attributable Risk is the disease rate in exposed persons minus that in unexposed persons. Relative Risk is the ratio of the disease rate in exposed persons to that in people who are unexposed. Not always possible to calculate the relative risk. In this case we use the Odds-Ratio.

72 Case Control Study Compares people with a condition (cases) to a similar group of people without the condition (controls). Often used to investigate the source of an outbreak of disease. The aim is to identify the risk factors which may have caused the cases to get the condition in the first place. The incidence (risk) cannot be calculated, because subjects are selected on the basis of already having the disease; the odds of exposure are calculated instead. For rare diseases, the relative risk of disease is estimated by the odds ratio.

73 A two by two table used in case-control studies Odds Ratio $= (a/c)/(b/d) = (a/b)/(c/d) = ad/(bc)$, i.e. Odds ratio = (odds of disease in exposed) / (odds of disease in unexposed). If a disease is rare, the odds ratio approximates the relative risk of disease.

74 Calculating Risk – an imaginary example

75 About the example From the data we can see that –70 out of 100 smokers have lung cancer. –30 out of 100 non-smokers have lung cancer. The Relative Risk is the ratio of the disease rate in exposed persons to that in people who are unexposed, in this case –(70/100) divided by (30/100), or 2.33. The Odds Ratio is given by the ratio of the odds for each group, in this case –(70/30) divided by (30/70) = 5.44.
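The figures in this imaginary example can be checked with exact fractions in Python. This is a small sketch using only the counts stated on the slide; the attributable risk is added using the formula from slide 68, and the variable names are illustrative.

```python
from fractions import Fraction

# Counts from the imaginary example: 70 of 100 smokers and
# 30 of 100 non-smokers have lung cancer.
risk_smokers = Fraction(70, 100)
risk_non_smokers = Fraction(30, 100)

relative_risk = risk_smokers / risk_non_smokers      # 7/3
attributable_risk = risk_smokers - risk_non_smokers  # 2/5

# Odds = (number with disease) / (number without disease) in each group.
odds_smokers = Fraction(70, 30)
odds_non_smokers = Fraction(30, 70)
odds_ratio = odds_smokers / odds_non_smokers         # 49/9

print(relative_risk, round(float(relative_risk), 2))          # 7/3 2.33
print(attributable_risk, round(float(attributable_risk), 2))  # 2/5 0.4
print(odds_ratio, round(float(odds_ratio), 2))                # 49/9 5.44
```

The odds ratio (5.44) is much larger than the relative risk (2.33) here because the disease is common in both groups; the two only approximate each other when the disease is rare.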

76 Exercise 2 Using the hayfever prevalence data, calculate the relative risk, attributable risk and odds ratio for developing hayfever for eczema and non-eczema sufferers.

77 Finding out more Some further references of interest: –Bland, M. Introduction to Medical Statistics. Third Edition. Oxford University Press, 2000. –Hennekens CH, Buring JE. Epidemiology in Medicine, Lippincott Williams & Wilkins, 1987. –Larson, H.J. Introduction to Probability Theory and Statistical Inference. Third Edition. Wiley, 1982

