Basic Statistics: Some general principles. Contributors: Mark Dancox (SWPHO), Shelly Bradley (EMPHO), Jacq Clarkson (Somerset PCT), Dave Jephson (EMPHO).


Learning Objectives To learn, understand and use methods of quantifying disease in populations Understand the need for standardisation and know how to apply it To understand the advantages and disadvantages of different methods.

Epidemiology “the study of the frequency, distribution and determinants of health problems and disease in human populations” The unit of interest is the population

Methods of quantifying disease Absolute counts Ratio Proportion Percentage Rate Special measures –prevalence, –incidence

Many sources of information about the occurrence of disease in a population….

Absolute counts of disease Number of cases of disease that occurred in a specific population –100 cases of lung cancer in Area A –50 cases of lung cancer in Area B Cannot conclude that lung cancer is more frequent in Area A – Need to know the size of the population and the time period involved

Ratio A ratio is simply one number divided by another, e.g. the ratio of men to women with a particular disease. This may be misleading, however, because the number of women in a population is not always equal to the number of men, especially in elderly populations.

Ratio - example In 2005 there were a total of 512,692 deaths from all causes across all ages. –269,368 of these were deaths occurring in females. –243,324 of these were deaths occurring in males. Thus the ratio of female to male deaths is about 1.1 (= 269,368/243,324). Source: ONS

Proportion Those who are included in the numerator are also included in the denominator, often expressed as a percentage. As the numerator and the denominator have the same units these divide out, leaving a dimensionless quantity (a number without units). E.g we could look at the proportion of all men with a particular disease.

Percentage Similar to proportion – the percentage is the proportion multiplied by 100. In 2005 there were a total of 512,692 deaths from all causes across all ages. –269,368 of these were deaths occurring in females. –243,324 of these were deaths occurring in males. The proportion of deaths in 2005 that occurred among females is 0.525 (= 269,368/512,692). The percentage of deaths in 2005 that occurred among females is 52.5% (= 0.525 × 100). Source: ONS
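The ratio, proportion and percentage from the 2005 deaths example can be reproduced in a few lines of Python. This is only a minimal sketch; the variable names are illustrative and are not part of the original slides.

```python
# Deaths in 2005, all causes, all ages (figures quoted from ONS in the slides).
female_deaths = 269_368
male_deaths = 243_324
total_deaths = female_deaths + male_deaths

# Ratio: one number divided by another; the numerator is not part of the denominator.
female_to_male_ratio = female_deaths / male_deaths

# Proportion: everyone counted in the numerator is also in the denominator.
female_proportion = female_deaths / total_deaths

# Percentage: the proportion multiplied by 100.
female_percentage = female_proportion * 100

print(f"Ratio of female to male deaths: {female_to_male_ratio:.2f}")
print(f"Proportion of deaths among females: {female_proportion:.3f}")
print(f"Percentage of deaths among females: {female_percentage:.1f}%")
```

Note that the ratio can exceed 1, whereas the proportion always lies between 0 and 1.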

Rate The denominator of a rate consists of all those, and only those, who might appear in the numerator. That is, those AT RISK. A rate specifies the time period during which the outcomes occur.

Rate - example In there were 3,232 conceptions amongst under 18s in the South West. Expressed as a rate per 1,000 females aged In the South West, the under-18 conception rate for the period was 33.7 per 1,000 females aged Source: Health Profiles

Prevalence Prevalence quantifies the proportion of individuals in a population who have the disease at a specific instant. Note: No time period is involved here.

Example In 24 practices in Scotland with a total male population of size 60,577 there were 577 male patients with epilepsy. Thus the prevalence of epilepsy in this population is 577/60,577 = 0.0095, or about 9.5 per 1,000 men.

Prevalence – example (ii) In –21.5% of adults in the South West smoke –15.3% of adults in the South West binge drink –25.9% of adults in the South West eat healthily –12.6% of adults in the South West were physically active –23.2% of adults in the South West were obese Source: Health Profiles (using Health Survey for England)

Incidence Incidence quantifies the number of new cases of disease that develop in a population of individuals at risk during a specified time period. The denominator, “population at risk”, should consist of the entire population in which new cases can occur.

Incidence – things to remember Those having the disease or those who cannot develop the disease because of age or immunization should not be included in the denominator. When persons not at risk are included in the denominator, the resulting rate underestimates the true incidence. If the proportion not at risk is relatively small then including these persons in the denominator won't significantly influence the incidence rate.

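Example If in the previous example there were 165 new cases (of epilepsy) seen during one year, then, taking the population at risk as the 60,577 men minus the 577 who already have epilepsy, the incidence rate would be 165/60,000 = 2.75 per 1,000 per year.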
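The epilepsy figures can be used to sketch both measures in Python. The function names are hypothetical, and the sketch assumes that all 577 existing cases are removed from the population at risk, following the rule that those who already have the disease do not belong in the incidence denominator.

```python
def prevalence(existing_cases: int, population: int) -> float:
    """Proportion of the population with the disease at one instant."""
    return existing_cases / population


def incidence_rate(new_cases: int, population: int, existing_cases: int) -> float:
    """New cases per person at risk over the period.

    Assumption: everyone who already has the disease is excluded
    from the population at risk.
    """
    at_risk = population - existing_cases
    return new_cases / at_risk


male_population = 60_577   # 24 practices in Scotland
existing_cases = 577       # male patients with epilepsy
new_cases = 165            # new cases seen during one year

prev = prevalence(existing_cases, male_population)
inc = incidence_rate(new_cases, male_population, existing_cases)

print(f"Prevalence: {prev * 1000:.1f} per 1,000")
print(f"Incidence: {inc * 1000:.2f} per 1,000 per year")
```

Using the full 60,577 as the incidence denominator would change the result only slightly here, because the proportion not at risk is small.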

Incidence and prevalence – Quick Questions Cases of cold infections in class 4J. Class size: 20.
[Chart: timeline of cases across January, February and March]
What is the period prevalence during February? 6/20 = 30%
What is the point prevalence on the 28th February? 1/20 = 5%
What is the incidence in February? 4/?

How are incidence and prevalence related? For diseases with a low incidence rate but where those with the disease are affected for a long time period e.g. diabetes or asthma, the prevalence will be high relative to the incidence. If the rate of development of a disease is high, but it has a short duration, the prevalence will be low relative to the incidence. Prevalence = Incidence x Average Duration of Disease
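The relationship Prevalence = Incidence x Average Duration can be illustrated with a short Python sketch. The two conditions and all of their numbers are hypothetical, invented only to show the contrast between a long-lasting and a short-lived disease; the relationship itself is an approximation that holds best when the disease is uncommon and incidence and duration are stable over time.

```python
def approximate_prevalence(incidence_per_person_year: float,
                           mean_duration_years: float) -> float:
    """Prevalence approximated as incidence x average duration of disease."""
    return incidence_per_person_year * mean_duration_years


# Hypothetical chronic condition: low incidence, long duration.
chronic = approximate_prevalence(incidence_per_person_year=0.002,
                                 mean_duration_years=20)

# Hypothetical acute infection: high incidence, lasts about one week.
acute = approximate_prevalence(incidence_per_person_year=0.2,
                               mean_duration_years=7 / 365)

print(f"Chronic condition prevalence: {chronic:.3%}")
print(f"Acute infection prevalence: {acute:.3%}")
```

Even though the acute infection has an incidence one hundred times higher in this sketch, its prevalence is lower because each case lasts such a short time.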

Incidence and prevalence [Diagram: people move from the healthy population into the sick population (prevalence) through incidence (new cases), and leave the sick population when they die (mortality) or recover]

However… Changes in prevalence over time can be due to changes in incidence rates and/or changes in the duration of the disease. –Decreasing incidence rates due to new preventive measures might result in a low annual incidence in a population where the prevalence is higher. –Preventive measures may also increase the chance of survival for those already with the disease, thus affecting prevalence

When to use incidence or prevalence
Prevalence rates:
–descriptive studies
–can calculate the effect of a particular disease in a community
–can predict the health care requirements
Incidence rates:
–studying aetiology
–can establish the sequence of events
–not susceptible to bias by survival

Exercise 1 Calculate the prevalence of diabetes using the QOF data and summarise the data for each practice and for each PCT

Crude Rate Is the number of cases in a population divided by the total population during a specific time interval. Provides information on the experience of the population. Useful for the allocation of health resources and public health planning. However if comparing heart disease rates between two populations where one population had a larger proportion of young people then differences in rates might simply reflect the relationship between heart disease and age.

Example: Crude cancer rates in the U.S.

Category-Specific Rates To account for different population distributions of a factor of interest we can present and compare category-specific rates. These are calculated on a subgroup of the population which is defined by stratifying the populations into categories e.g age. They permit comparisons between different categories within the same population.
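A small Python sketch shows why crude rates can mislead and what category-specific rates add. The two areas and all counts are hypothetical, chosen so that both areas have identical age-specific rates but different age structures.

```python
# Hypothetical counts, invented purely for illustration.
# Each entry maps an age band to (cases, population).
area_a = {"under 65": (80, 80_000), "65 and over": (200, 20_000)}
area_b = {"under 65": (50, 50_000), "65 and over": (500, 50_000)}


def crude_rate(area, per=1_000):
    """Total cases divided by total population, expressed per `per` people."""
    cases = sum(c for c, _ in area.values())
    population = sum(p for _, p in area.values())
    return cases / population * per


def age_specific_rates(area, per=1_000):
    """Cases divided by population within each age band."""
    return {band: c / p * per for band, (c, p) in area.items()}


print("Crude rate, area A:", crude_rate(area_a))
print("Crude rate, area B:", crude_rate(area_b))
print("Age-specific rates, area A:", age_specific_rates(area_a))
print("Age-specific rates, area B:", age_specific_rates(area_b))
```

Area B has the higher crude rate only because a larger share of its population is in the older band; within each age band the two areas are the same.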

Age specific cancer rates

Adjusting the cancer rates..

Exercise 3 Calculate the age specific diabetes prevalence for practices in Somerset

Factors affecting disease Health is affected by many factors, as summarised by Dahlgren and Whitehead’s diagram…

One famous determinant… The number of cholera infections fell after John Snow removed the handle from the Broad Street pump.

Factors affecting disease Age: a confounding factor. Occurrence of disease in one area may appear to be higher than in another because: –population structures are different –one area is older than another. Relying on crude rates can be misleading. Need to ‘standardise’ for the effect of age.

Question If you are comparing a rate in one area with another, what assumptions are you making?

Standardisation

Age standardisation Occurrence of disease in one area may appear to be higher than in another because: –Population structures are different –One area is older than another Standardisation used to adjust for the effects of age on mortality rates or other rates Direct or Indirect Involves the calculation of numbers of expected events which are then compared with numbers of observed events

Two Methods of standardisation Different methods available but ‘Direct’ and ‘Indirect’ methods are most common Can calculate confidence intervals for each Which method used depends on the comparisons to be performed and the availability of data

When to use which? No right or wrong approach, but… –Direct standardisation useful to compare different areas or through time –Indirect Standardisation useful to determine if disease incidence is high or low in one area.

But…. Cannot compare different diseases Do what is possible with the data available Can only calculate a DSR if we have population data by age and counts by age. Use an indirectly standardised rate if only overall counts available.

Standardisation What do the ‘standards’ refer to? –Population Structure –Reference Rates Try to ensure that the standard population shares characteristics with the study population.

Direct Standardisation Uses a standard population in calculations. For a specific population (the study population) age specific rates are determined from observed data The observed age specific rates are then used to determine the number of expected events in a standard population

Example of direct standardisation
[Table columns: Age band | Age-specific rate in study population (per 100,000) | Proportion of reference population in age group | Expected number in reference population]
Standardised rate (direct method) = … + … + … = … × 90 = 31.5
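The direct method can be sketched in Python as follows. All figures are hypothetical (they are not the values from the slide's table, and the standard population here is not the European Standard Population), and the function names are illustrative.

```python
# Hypothetical study population: age band -> (observed events, population).
study = {
    "0-44": (5, 50_000),
    "45-64": (27, 30_000),
    "65+": (80, 20_000),
}

# Hypothetical standard population (NOT the European Standard Population).
standard_population = {"0-44": 60_000, "45-64": 25_000, "65+": 15_000}


def directly_standardised_rate(study, standard, per=100_000):
    """Apply the study's age-specific rates to the standard population."""
    expected_events = 0.0
    for band, (events, population) in study.items():
        age_specific_rate = events / population
        expected_events += age_specific_rate * standard[band]
    return expected_events / sum(standard.values()) * per


def crude_rate(study, per=100_000):
    """Total events divided by total population, ignoring age structure."""
    events = sum(e for e, _ in study.values())
    population = sum(p for _, p in study.values())
    return events / population * per


dsr = directly_standardised_rate(study, standard_population)
crude = crude_rate(study)
print(f"Crude rate: {crude:.1f} per 100,000")
print(f"Directly standardised rate: {dsr:.1f} per 100,000")
```

The standardised rate is lower than the crude rate in this sketch because the hypothetical standard population is younger than the study population.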

Choice of standard population Usually European Standard Population Other standards can be used –Males compared to persons –Specific area to wider region

European Standard Population structure

Pros and Cons of DSRs
Pros:
–Able to compare different areas with each other
–Can look at trends through time
Cons:
–Rare diseases may have no events in specific age bands so age specific rates may be unavailable
–May need to merge events from different years or combine age bands

Exercise 4 Direct Standardisation

Indirect Standardisation Uses standard set of age-specific rates and compares the observed number of events in a study population to those expected, assuming the reference rates apply Used to determine if disease incidence is high or low in one area.

Example of indirect standardisation
[Table columns: Age | No. in study population | Age-specific rate in reference population | Expected cases]
Total expected cases = … + … + … = 119
If 130 cases in total were observed this would give an SER of 130/119 = 1.09.
When the event is death this method gives us the SMR (standardised mortality ratio).
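The indirect method can be sketched in Python. The age bands, study population sizes and reference rates below are hypothetical, chosen only so that the expected total comes to 119 as in the example; the 130 observed cases are taken from the example itself.

```python
# Hypothetical rows, invented so that the expected cases sum to 119.
study_population = {"0-44": 10_000, "45-64": 5_000, "65+": 2_000}
reference_rates = {"0-44": 0.001, "45-64": 0.01, "65+": 0.0295}

observed_cases = 130  # figure quoted in the example


def expected_cases(population, rates):
    """Events expected if the reference rates applied to the study population."""
    return sum(population[band] * rates[band] for band in population)


def standardised_ratio(observed, expected, scale=1):
    """Observed divided by expected; use scale=100 for the usual SMR convention."""
    return observed / expected * scale


expected = expected_cases(study_population, reference_rates)
ratio = standardised_ratio(observed_cases, expected)
ratio_x100 = standardised_ratio(observed_cases, expected, scale=100)

print(f"Expected cases: {expected:.0f}")
print(f"Standardised ratio: {ratio:.2f}")
print(f"Expressed on the 100 scale: {ratio_x100:.0f}")
```

Only the total observed count is needed from the study area, which is why this method still works when events are too rare to give stable age-specific rates.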

Types of Indirectly Standardised Rates If the outcome is mortality, the indirectly standardised rate is known as a Standardised Mortality Ratio (SMR). If the outcome is incidence, the indirectly standardised rate is known as a Standardised Incidence Ratio (SIR).

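Interpretation of SMRs SMRs are usually multiplied by 100, so that 100 means observed equals expected (a ratio of 1.09 becomes an SMR of 109). SMR < 100: lower rate than expected. SMR = 100: expected rate. SMR > 100: higher rate than expected. An SMR of 180 represents a mortality rate that is 80% higher than expected.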

Pros and cons of indirectly standardised rates
Pros:
–Can use where diseases are rare
–Don’t need information for all age groups
–Just need total number of observed and expected counts
Cons:
–Cannot compare SMRs with each other
–Cannot look at trends through time

Exercise 5: Indirect Standardisation

General Points NCHOD/ ONS publish DSRs and SMRs A SIR is an indirectly standardised rate for the incidence of disease Indirect and Direct standardisation are equally valid. The choice to use one or other of these is partly dependent on the comparisons to be performed and on the limitations of the available data.

A final note! Standardisation can be used in many areas. Although we’ve looked at mortality, the technique can be applied in other ways: –Prescriptions –Hat ownership –Lollipops

Some Questions… Which type of rate would you use if you are:
1. Trying to monitor a trend
2. Trying to pick out which wards have a mortality rate higher than the regional average
3. Looking at infant mortality
4. Trying to estimate the impact on resources next year

Finding out more: APHO

Finding out more The East Region Public Health Observatory has also produced a useful briefing note on standardisation

Finding out more Lots of useful information can be found at the HealthKnowledge website…

Finding out more The NCHOD website also contains useful information on methodology…

Finding out more Some further references of interest: –Bland, M. Introduction to Medical Statistics. Third Edition. Oxford University Press, –Hennekens CH, Buring JE. Epidemiology in Medicine, Lippincott Williams & Wilkins, –Larson, H.J. Introduction to Probability Theory and Statistical Inference. Third Edition. Wiley, 1982


Types of analytical study Observational studies –Cross sectional study (may be descriptive or analytical) –Case control study –Cohort study Intervention study (experimentation) –Randomised controlled trial (RCT)

Types of Risk We may be interested in comparing the incidence of disease in a group exposed to some risk with the incidence in a group not exposed (controls). We will look at several measures of risk: –Attributable Risk –Relative Risk –Odds Ratio

Cross-sectional study Information on health status and other characteristics is collected from each subject at one point in time. Cross-sectional studies can be descriptive...(e.g. The prevalence of cough in population)...or analytical –(e.g. the association between cough and risk factors such as type of house lived in or whether person is a smoker)

Cohort Study Follow up two groups over time and compare the occurrence of disease. One group is exposed to a possible risk factor for the disease, while the other is not (the control group). The exposure is the starting point; the disease is the outcome of interest. Risk of disease = (Number with disease) / (Number with + without disease). We may be interested in comparing the risk of disease in the exposed group with the risk in a group not exposed.

Calculating Relative Risk
I_e = a/(a+b)
I_c = c/(c+d)
Relative Risk = I_e / I_c
Attributable Risk = I_e - I_c
Odds Ratio = (a/c) / (b/d)

A two by two table used in epidemiology

A two by two table used in cohort studies
I_e = Risk of disease in exposed = a/(a+b)
I_c = Risk of disease in unexposed = c/(c+d)

Types of Risk Attributable Risk is the disease rate in exposed persons minus that in unexposed persons. Relative Risk is the ratio of the disease rate in exposed persons to that in people who are unexposed. Not always possible to calculate the relative risk. In this case we use the Odds-Ratio.

Case Control Study Compares people with a condition (cases) to a similar group of people without the condition (controls). Often used to investigate the source of an outbreak of disease. The incidence risk cannot be calculated, as subjects are selected on the basis of having the disease in the first place. The aim is to identify the risk factors which may have caused the cases to get the condition. Calculate the odds of exposure instead. For rare diseases, the relative risk of disease is estimated by the odds ratio.

A two by two table used in case-control studies
Odds Ratio = (a/c) / (b/d)
Odds ratio = (Odds of disease in exposed) / (Odds of disease in unexposed)
If a disease is rare, the odds ratio approximates the relative risk of disease.

Calculating Risk – an imaginary example

About the example From the data we can see that – 70 out of 100 smokers have lung cancer. – 30 out of 100 non smokers have lung cancer. The Relative Risk is the ratio of the disease rate in exposed persons to that in people who are unexposed, in this case –70/100 divided by 30/100 or 2.33 The Odds-Ratio is given by the ratio of the odds for each group, in this case – 70/30 divided by 30/70 = 5.44
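The three measures for the imaginary smoking example can be computed from a two by two table in Python. This is a minimal sketch: the `TwoByTwo` class is hypothetical, and the cell labels follow the formulas above (a and b are the exposed with and without disease, c and d the unexposed with and without disease).

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    a: int  # exposed, with disease
    b: int  # exposed, without disease
    c: int  # unexposed, with disease
    d: int  # unexposed, without disease

    def risk_exposed(self) -> float:
        return self.a / (self.a + self.b)

    def risk_unexposed(self) -> float:
        return self.c / (self.c + self.d)

    def relative_risk(self) -> float:
        return self.risk_exposed() / self.risk_unexposed()

    def attributable_risk(self) -> float:
        return self.risk_exposed() - self.risk_unexposed()

    def odds_ratio(self) -> float:
        # (a/c) / (b/d) and (a/b) / (c/d) both simplify to ad / bc.
        return (self.a * self.d) / (self.b * self.c)


# Imaginary example: 70 of 100 smokers and 30 of 100 non-smokers have lung cancer.
table = TwoByTwo(a=70, b=30, c=30, d=70)

print(f"Relative risk: {table.relative_risk():.2f}")
print(f"Attributable risk: {table.attributable_risk():.2f}")
print(f"Odds ratio: {table.odds_ratio():.2f}")
```

The odds ratio is much larger than the relative risk here because the disease is common in this imaginary table; the two only approximate each other when the disease is rare.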

Exercise 2 Using the hayfever prevalence data, calculate the relative risk, attributable risk and odds ratio for developing hayfever for eczema and non-eczema sufferers.
