Health and Disease in Populations 2002 Sources of variation (1) Paul Burton, Jane Hutton
Informal lecture objectives
• Objective 1
  • To enable the student to distinguish between observed data and the underlying tendencies which give rise to those data
• Objective 2
  • To understand the concepts of sources of variation and randomness
Formal lecture objectives for Random Variation (1) and (2)
• Objective 1: Distinguish between observed epidemiological quantities (incidence, prevalence, incidence rate ratio, etc.) and their true or underlying values.
• Objective 2: Discuss how observed epidemiological quantities depart from their true values because of random variation.
Formal lecture objectives for Sources of Variation
• Objective 3: Describe how observed values help us towards a knowledge of the true values by:
  • allowing us to test hypotheses about the true value (SoV 1)
  • allowing us to calculate a range within which the true value probably lies (SoV 2)
Drawing conclusions
• Experiment
  • Flip a coin 10 times
• Result
  • Observe 7 heads, 3 tails
• Conclusions
  • Data wrong (e.g. a miscount)
  • Artefact
  • Chance
  • The coin is biased towards heads
Tendency versus observation
• Coins tend to produce equal numbers of heads and tails, but what we observe may depart from this by random variation.
• Random variation in health
  • On average, there are 4 cases of meningitis per month in Leicester; some months we observe 10, some months 0.
  • Smokers tend to be less healthy than non-smokers; but if we pick a few people at random, we might find that the smokers are healthier than the non-smokers.
Tendency versus observation
• Epidemiologists, health planners etc. want to know about the underlying tendencies and patterns. However, as well as systematic variation, everything they observe is affected by random variation.
If we know about the underlying tendency, we can predict what we may reasonably expect to observe (probability theory).
Neonatal Intensive Care (NIC) cots
• True requirement (1992 figures): 1 cot per 1,000 live births per annum
• Health authority has approximately 12,000 live births per annum
• On average 12 NIC 'cots' will be required per year (this is the true tendency)
[Figure: distribution of the number of NIC cots required, with markers at 18 cots (95%) and 21 cots]
Obstetric beds (NIC cots)
• Often observe 8-16 cots being used
• Need 19 or more on 1 day per month
• Need 21 or more on 1% of days
• Hardly ever need more than 24 cots
• Provide 19 cots
• On average 12 are occupied = 63% occupancy
• True tendency → observed distribution: easy
• BUT how do we reverse the direction of inference?
  • Observed distribution → true tendency
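The figures on this slide can be checked with a short calculation. The sketch below assumes that the number of cots needed on any one day follows a Poisson distribution with a mean of 12. The slides do not name the distribution, so this is an assumption, and the function names are illustrative only.

```python
import math

MEAN_COTS = 12  # true tendency from the slide: 12 cots needed on average


def poisson_pmf(k, mean):
    """Probability of exactly k events when the true mean is `mean` (Poisson assumption)."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def prob_at_least(k, mean):
    """Probability of k or more events (upper tail)."""
    return 1.0 - sum(poisson_pmf(i, mean) for i in range(k))


def prob_between(low, high, mean):
    """Probability of between low and high events, inclusive."""
    return sum(poisson_pmf(i, mean) for i in range(low, high + 1))


if __name__ == "__main__":
    print(f"P(8 to 16 cots needed)   = {prob_between(8, 16, MEAN_COTS):.3f}")
    print(f"P(19 or more needed)     = {prob_at_least(19, MEAN_COTS):.4f}")
    print(f"P(21 or more needed)     = {prob_at_least(21, MEAN_COTS):.4f}")
    print(f"P(more than 24 needed)   = {prob_at_least(25, MEAN_COTS):.5f}")
    print(f"Occupancy with 19 cots   = {MEAN_COTS / 19:.0%}")
```

Under this assumption the upper-tail probabilities come out close to the slide's figures: 19 or more cots on roughly one day a month, and 21 or more on about 1% of days.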
Any questions?
Hypothesis testing
• Objective 3: Describe how observed values help us towards a knowledge of the true values by:
  • Allowing us to test hypotheses about the true value
Hypothesis testing
• An hypothesis: A statement that an underlying tendency of scientific interest takes a particular quantitative value
  • The coin is fair (the probability of heads is 0.5)
  • The new drug is no better than the standard treatment (the ratio of survival rates = 1.0)
  • The true prevalence of tuberculosis in a given population is 2 in 10,000
Testing hypotheses
Are the observed data consistent with the stated hypothesis?
• Informally?
• Formally?
Formally
• Calculate the probability of getting an observation as extreme as, or more extreme than, the one observed if the stated hypothesis were true.
• If this probability is very small, then either
  • something very unlikely has occurred; or
  • the hypothesis is wrong
• It is then reasonable to conclude that the data are incompatible with the hypothesis.
• The probability is called a p-value
Hypothesis: this coin is fair
• Observed data: 10 heads, 0 tails
• P-value: ≈ 0.002 (about 1 in 500; exactly 2 × 1/1,024)
• Conclusion: Data inconsistent with hypothesis; strong evidence against the hypothesis
• Prior beliefs relevant here:
  • 10 heads, 0 tails: (Is the coin biased?)
  • 10 survivors, 0 deaths on new treatment X: (Does X work if historically 50% died?)
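The p-value quoted for 10 heads out of 10 can be reproduced exactly. This is a minimal sketch, assuming the two-sided p-value is obtained by doubling the one-tail probability (as the "2 × 1/1,024" on the slide suggests); the function name `two_sided_p_fair_coin` is hypothetical.

```python
from fractions import Fraction
from math import comb


def two_sided_p_fair_coin(heads, flips):
    """Two-sided p-value for `heads` out of `flips` under H0: P(heads) = 0.5.

    Sums the probability of every result at least as extreme as the one
    observed (in the observed direction), then doubles it.
    """
    # Use the larger of heads/tails so that we always sum the upper tail.
    extreme = max(heads, flips - heads)
    tail = sum(Fraction(comb(flips, k), 2 ** flips)
               for k in range(extreme, flips + 1))
    # Doubling can exceed 1 when the result sits at the centre; cap at 1.
    return min(Fraction(1), 2 * tail)


if __name__ == "__main__":
    p = two_sided_p_fair_coin(10, 10)
    print(p, float(p))  # 1/512 0.001953125
```

Exact fractions are used so that the result can be compared directly with 2 × 1/1,024, which is a little under 1 in 500.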
An arbitrary convention
• P-value: p ≤ 0.05
  • Data inconsistent with hypothesis
  • Substantive evidence against the hypothesis
  • Reasonable to reject the hypothesis
  • Statistically significant
• P-value: p > 0.05
  • None of the above
• The mean surface temperature of the earth has increased by only 1°C over the last 50 years (p=0.1): this does not prove that there is no global warming!
Hypothesis tests
• The incidence of disease X in Warwickshire is significantly lower than in the rest of the UK (p=0.01)
• The death rate from disease Y is significantly higher in Barnsley than in Leicester (p=0.05)
• Patients on the new drug did not live significantly longer than those on the standard drug (p=0.4)
The null hypothesis
• The hypothesis to be tested is often called the null hypothesis (H0)
  • The ratio of death rates is 1.0
  • The prevalence in Warwickshire is the same as in Leicestershire
• p ≤ 0.05: substantial evidence against the hypothesis being tested, not that it is definitely false
• p > 0.05: data not inconsistent with the hypothesis. Little or no evidence against the hypothesis being tested, not that it is definitely true
An experiment: flip a coin 10 times
• Observed result: 7 heads, 3 tails
• Question:
  • Is the coin biased?
• P-value: p = 2 × P(7 or more heads) = 2 × (176/1,024) ≈ 0.34
An experiment: flip a coin 10 times
• Observed result: 7 heads, 3 tails
• Data consistent with the coin being unbiased. Weak evidence against the null hypothesis
• So: little evidence that the coin is biased
• But: does not prove that the coin is unbiased
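One way to see why 7 heads and 3 tails is only weak evidence is to simulate a fair coin many times and count how often the result is at least this far from 5 heads. The sketch below is an illustration only: the function name, the number of repeats and the seed are arbitrary choices, not part of the lecture.

```python
import random


def simulate_p_value(observed_heads, flips=10, repeats=100_000, seed=2002):
    """Estimate a two-sided p-value for a fair coin by simulation.

    Repeats the experiment `repeats` times with a truly fair coin and
    returns the fraction of experiments whose head count is at least as
    far from flips/2 as the observed count.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    expected = flips / 2
    observed_gap = abs(observed_heads - expected)
    as_extreme = 0
    for _ in range(repeats):
        heads = sum(rng.random() < 0.5 for _ in range(flips))
        if abs(heads - expected) >= observed_gap:
            as_extreme += 1
    return as_extreme / repeats


if __name__ == "__main__":
    estimate = simulate_p_value(7)
    print(f"Estimated p-value for 7 heads out of 10: {estimate:.3f}")
```

With a fair coin, roughly a third of simulated experiments are as lopsided as 7 to 3 (the exact value is 352/1,024), so the observed result is unremarkable under the null hypothesis.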
Problems
• Rejecting H0 is not always much use.
• p<0.05 is arbitrary; nothing special happens between p=0.049 and p=0.051
• p= and p=0.6 easy to interpret
• False positive results
• Statistical significance depends on sample size. Flip a coin 3 times: minimum p=0.25 (i.e. 2×1/8)
• Statistically significant ≠ clinically important
• Nevertheless, p values are used a lot
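The dependence on sample size can be made concrete. The most extreme possible result of n flips is all heads (or all tails), so the smallest two-sided p-value a fair-coin test can ever give is 2 × (1/2)^n. A minimal sketch, with hypothetical function names:

```python
def minimum_two_sided_p(flips):
    """Smallest two-sided p-value achievable with `flips` coin flips.

    This is the p-value for all heads (or all tails): 2 * (1/2) ** flips,
    capped at 1 for a single flip.
    """
    return min(1.0, 2 * 0.5 ** flips)


def smallest_sample_reaching(threshold=0.05):
    """Fewest flips for which the minimum p-value is at or below `threshold`."""
    flips = 1
    while minimum_two_sided_p(flips) > threshold:
        flips += 1
    return flips


if __name__ == "__main__":
    for n in (3, 5, 6, 10):
        print(f"{n:2d} flips: minimum p = {minimum_two_sided_p(n)}")
    print("Fewest flips that can reach p <= 0.05:", smallest_sample_reaching())
```

With 3 flips the minimum is 0.25, as on the slide; at least 6 flips are needed before even the most extreme result can reach p ≤ 0.05.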
A solution
• Objective 3: Describe how observed values help us towards a knowledge of the true values by:
  • Allowing us to test hypotheses about the true value
  • Providing us with a range within which the underlying tendency probably lies
Any questions?