Goldsmith’s teachers lecture 2010 Medical statistics Joan Morris Professor of Medical Statistics
Aims –To describe medical statistics –To give examples of where medical statistics has contributed to society –Use of statistics in screening –To mention some novel statistical methods
Statistics - definition Statistics is a mathematical science pertaining to the collection, analysis, interpretation or explanation, and presentation of data.
Data Collection
Florence Nightingale –She realised that soldiers were dying from malnutrition, poor sanitation, and lack of activity. –She kept meticulous records of the death toll in the hospitals as evidence of the importance of patient welfare.
National Data Collection –National Mortality Statistics –Health survey for England and Wales –Population statistics –… Large amounts of data are available on the web
Standardised mortality ratios: Mortality from skin cancer
All births in England and Wales according to maternal age (figure comparing two periods; the years and birth totals are garbled in this copy)
Epidemiology Comparisons of individuals –Observational: cross-sectional studies, case-control studies, cohort studies –Interventional: randomised controlled trials Comparisons of populations –Time trends –Ecological studies: geographical variations, age/sex patterns, social variations
Comparison of Individuals Study Design –Ensure “valid” data is collected –Ensure enough data is collected Main designs –Case control studies –Cohort studies –Clinical trials
Richard Doll (doctor) and Austin Bradford Hill (statistician)
Is there a relationship between smoking and lung cancer? British Doctors Cohort Study (BMJ 1994;309: ) 34,000 British male doctors replied to a postal questionnaire in 1951 and to further questionnaires in 1957, 66, 72, 78, 90, … The doctors were flagged at NHSCR and their death certificates were obtained as they died. Death rates in smokers and non-smokers were then compared.
Is there a relationship between smoking and lung cancer? Table: risk of dying from lung cancer compared to non-smokers, by number of cigarettes smoked per day. None: 1.0 (the values for the smoking categories are missing from this copy)
What causes Sudden Infant Death Syndrome ? Sudden Infant Death Syndrome Case Control Study Methods –Collected information about infants that were potential “SIDS” –Identified “similar” children who had not died –Compared the differences Results – Children who died were much more likely to have been put on their fronts to sleep than children who did not die
Randomised Controlled Trial A clinical trial is an experiment in which a treatment is administered to humans in order to evaluate its efficacy and safety Randomised = allocated to groups on basis of chance e.g. tossing a coin (ensures fair comparison) Controlled = a comparison group
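The coin-toss allocation described above can be sketched in a few lines. This is only an illustration: the participant labels, the `randomise` function and the seed are hypothetical, and real trials normally use more structured schemes (for example blocked or stratified randomisation).

```python
import random


def randomise(participants, seed=None):
    """Allocate each participant to an arm by a simulated fair coin toss."""
    rng = random.Random(seed)  # a seed makes the allocation reproducible
    groups = {"treatment": [], "control": []}
    for person in participants:
        arm = "treatment" if rng.random() < 0.5 else "control"
        groups[arm].append(person)
    return groups


# Hypothetical participant identifiers, for illustration only.
participants = [f"P{i:03d}" for i in range(1, 21)]
allocation = randomise(participants, seed=2010)
print(len(allocation["treatment"]), "treated,", len(allocation["control"]), "controls")
```

Because allocation depends only on chance, neither the doctor nor the patient can influence who receives the treatment, which is what makes the comparison fair.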
Can folic acid reduce neural tube defects (e.g. spina bifida) ? MRC Vitamin trial - randomised controlled trial Large: 1817 women who had had a previous NTD, 33 centres, 7 countries
Can folic acid reduce neural tube defects (e.g. spina bifida) ? Results : Women who did not receive folic acid were 3 times more likely to have a second NTD pregnancy Impact : Women are advised to take folic acid prior to becoming pregnant Majority of countries around the world fortify flour with folic acid
Collection of Data Study Design –Cohort –Case Control –Clinical Trial
Analysis Could the observed results have arisen by chance ? Given that we have a sample what can we say about the population from which the sample comes
Folic Acid vs Placebo for Neural Tube Defects
Table: folic acid (yes / no) by neural tube defect (yes / no / total); the cell counts are missing from this copy.
Risk of NTD in treated group = 1%
Risk of NTD in control group = 3.5%
Relative risk of NTD in treated group compared to control group = 1%/3.5% = 0.29
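The risk and relative risk arithmetic can be reproduced as a short sketch. The NTD counts (6 and 21) come from the lecture. The group sizes (593 and 602) are an assumption taken from the published MRC Vitamin Study report, not from this slide.

```python
def risk(events, total):
    """Proportion of a group that experienced the event."""
    return events / total


# NTD counts are from the lecture; group sizes are ASSUMED from the
# published trial report and do not appear on the slide.
ntd_fa, n_fa = 6, 593
ntd_placebo, n_placebo = 21, 602

risk_fa = risk(ntd_fa, n_fa)
risk_placebo = risk(ntd_placebo, n_placebo)
relative_risk = risk_fa / risk_placebo

print(f"Risk with folic acid:    {risk_fa:.1%}")
print(f"Risk without folic acid: {risk_placebo:.1%}")
print(f"Relative risk:           {relative_risk:.2f}")
```

A relative risk below 1 means the event was less common in the treated group; the reciprocal (about 3) is the "3 times more likely" figure quoted for women who did not receive folic acid.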
P values P is the probability of the observed event, or one more extreme, occurring if the null hypothesis is true. Null hypothesis: no difference between treatments. Here P answers the question: out of 27 babies with an NTD, what is the chance that 6 or fewer are in the FA group (and 21 or more in the placebo group) if FA has no effect?
P values These are just probability calculations, assuming prob NTD in group A = prob NTD in group B, and comparing the numbers of combinations that give a total of 27 NTDs
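As a rough sketch of this calculation, suppose the two groups are the same size (an assumption made here for simplicity). Under the null hypothesis each of the 27 NTDs is then equally likely to fall in either group, and the one-sided P value is a binomial tail probability built from counts of combinations.

```python
from math import comb


def binomial_lower_tail(k, n, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


total_ntds = 27          # from the lecture
ntds_in_fa_group = 6     # from the lecture

# p = 0.5 ASSUMES equal-sized groups, so an NTD is equally likely in either arm.
p_one_sided = binomial_lower_tail(ntds_in_fa_group, total_ntds)
print(f"One-sided P = {p_one_sided:.4f}")
```

With these assumptions the probability is about 0.003, so a split as uneven as 6 versus 21 would be very unusual if folic acid had no effect. An exact analysis would also account for the actual group sizes.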
Interpreting the results of a trial
Trial   A: died / treated   B: died / treated   RR death in A vs B   P       Is it due to chance or not?
1       4 / 20              2 / 20              2.0                  0.37    Mainly Yes
2       20 / 100            10 / 100            2.0                  0.07    Split Yes / No
3       40 / 200            20 / 200            2.0                  0.008   Mainly No
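The effect of sample size on the P value can be illustrated with a small sketch. It uses Pearson's chi-square test without continuity correction, which is an assumption: the lecture does not say which test produced its P values, so the numbers printed here need not match the slide exactly.

```python
from math import erfc, sqrt


def chi_square_p(died_a, n_a, died_b, n_b):
    """Two-sided P from Pearson's chi-square test on a 2x2 table (1 df, uncorrected)."""
    a, b = died_a, n_a - died_a
    c, d = died_b, n_b - died_b
    n = n_a + n_b
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper tail of chi-square is erfc(sqrt(chi2 / 2)).
    return erfc(sqrt(chi2 / 2))


# (died in A, treated in A, died in B, treated in B) for the three trials.
trials = [(4, 20, 2, 20), (20, 100, 10, 100), (40, 200, 20, 200)]

p_values = []
for died_a, n_a, died_b, n_b in trials:
    rr = (died_a / n_a) / (died_b / n_b)
    p = chi_square_p(died_a, n_a, died_b, n_b)
    p_values.append(p)
    print(f"n per group = {n_a:3d}  RR = {rr:.1f}  P = {p:.3f}")
```

The relative risk is 2.0 in every trial, but the P value falls as the trial grows, because a larger trial leaves less room for chance to explain the same difference.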
P values P < 0.05 is taken to mean statistical significance. This means that if there is no difference between treatments and you do 20 trials, on average one will be statistically significant by chance alone
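The "one in 20" statement can be checked with simple arithmetic. This sketch assumes the 20 trials are independent and that the treatments truly do not differ.

```python
alpha = 0.05     # significance threshold
n_trials = 20    # number of independent trials, all with no true difference

# Expected number of trials that are "significant" purely by chance.
expected_false_positives = alpha * n_trials

# Probability that at least one of the trials is "significant" by chance.
prob_at_least_one = 1 - (1 - alpha) ** n_trials

print(f"Expected false positives: {expected_false_positives:.1f}")
print(f"P(at least one):          {prob_at_least_one:.2f}")
```

So although one false positive is expected on average, the chance of seeing at least one among 20 such trials is roughly 64%, not 100%.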
Folic Acid vs Placebo for Neural Tube Defects RR = 0.29 P = Therefore we assume there is a real difference between the folic acid group and the placebo group But how big is the reduction ?
Folic Acid vs Placebo for Neural Tube Defects RR = 0.29; 95% Confidence Interval: 0.10 to 0.76; P = [value missing]. A 95% confidence interval means that, if the trial were repeated many times, 95% of the intervals calculated this way would contain the true reduction. It therefore gives an indication of the likely size of the reduction
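One common way to obtain a confidence interval for a relative risk is the normal approximation on the log scale, sketched below. The group sizes (593 and 602) are an assumption taken from the published trial report rather than from the lecture, and the lecture does not state which method gave its interval, so the result need not match 0.10 to 0.76.

```python
from math import exp, log, sqrt


def relative_risk_ci(events_t, n_t, events_c, n_c, z=1.96):
    """Relative risk with an approximate 95% CI from the log-scale normal approximation."""
    rr = (events_t / n_t) / (events_c / n_c)
    se_log_rr = sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    lower = exp(log(rr) - z * se_log_rr)
    upper = exp(log(rr) + z * se_log_rr)
    return rr, lower, upper


# NTD counts from the lecture; group sizes ASSUMED from the published report.
rr, lower, upper = relative_risk_ci(6, 593, 21, 602)
print(f"RR = {rr:.2f}, approximate 95% CI {lower:.2f} to {upper:.2f}")
```

Whatever the method, the interval lies wholly below 1, which is consistent with a real reduction in risk while still leaving its exact size uncertain.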
Folic Acid and NTD Dose Response
Interpretation The same proportional increase in serum folate has the same proportional reduction in NTD All women benefit from taking folic acid. There is not a threshold effect
So far…. Collection –Nightingale –National statistics –Study design Presentation –Estimates and confidence intervals Analysis –Vital to interpretation
Use of Statistics in Screening Screening is the identification, among apparently healthy individuals, of those who are sufficiently at risk from a specific disorder to benefit from a subsequent diagnostic test, procedure or direct preventive action. Screening for Heart Disease
Relative odds of major IHD event by fifths of the distribution of haemostatic and lipid markers for all men (——) and for men free of IHD at baseline examination ( ∘ ––– ∘ ). Yarnell J et al. Eur Heart J 2004;25: The European Society of Cardiology
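Biomarker ZZ (figure: distributions of the biomarker in affected and unaffected individuals)
Biomarker ZZ (figure: a cut-off divides the distributions into screen negative and screen positive)
Biomarker ZZ (figure: unaffected individuals who are screen positive are false positives; affected individuals who are screen negative are false negatives)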
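Screening for a medical disorder: good test (figure: distributions of the risk factor in unaffected and affected individuals)
Screening for a medical disorder: poor test (figure: distributions of the risk factor in unaffected and affected individuals)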
Is cholesterol any good for screening? (figure: distributions in affected and unaffected individuals; Risk screen converter)
Detection Rate False Positive Rate
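The link between the overlap of the two distributions and screening performance can be sketched with normal distributions. All parameters below are hypothetical: the marker is on a standardised scale, and the "poor" and "good" markers differ only in how far the affected mean lies from the unaffected mean.

```python
from statistics import NormalDist


def screening_performance(unaffected, affected, cutoff):
    """Return (detection rate, false positive rate) when values above cutoff are screen positive."""
    false_positive_rate = 1 - unaffected.cdf(cutoff)
    detection_rate = 1 - affected.cdf(cutoff)
    return detection_rate, false_positive_rate


# Hypothetical distributions on a standardised scale.
unaffected = NormalDist(mu=0.0, sigma=1.0)
poor_marker = NormalDist(mu=1.0, sigma=1.0)   # affected mean 1 SD above unaffected
good_marker = NormalDist(mu=3.0, sigma=1.0)   # affected mean 3 SD above unaffected

# Choose the cut-off that gives a 5% false positive rate.
cutoff = unaffected.inv_cdf(0.95)

dr_poor, fpr = screening_performance(unaffected, poor_marker, cutoff)
dr_good, _ = screening_performance(unaffected, good_marker, cutoff)
print(f"False positive rate: {fpr:.0%}")
print(f"Detection rate, poor marker: {dr_poor:.0%}")
print(f"Detection rate, good marker: {dr_good:.0%}")
```

At the same false positive rate, the marker with well-separated distributions detects most affected individuals, while the overlapping one misses the majority.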
4.2mm Hg
7.5mm Hg
Are there any good screening tests ? Antenatal screening for Down’s syndrome
Quadruple test markers: AFP, uE3, total hCG, inhibin-A (figure: distribution of each marker in Down’s syndrome and unaffected pregnancies)
Distribution of risk in Down’s syndrome and unaffected pregnancies using AFP, uE3, total hCG and inhibin-A measured at weeks (+ maternal age). Horizontal axis: risk of a Down’s syndrome pregnancy at term
Analysis based on…. Understanding probability Properties of combinations of random variables Properties of normal distribution Understanding hypothesis testing
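A much simplified sketch of how these ideas combine into a risk estimate: a prior risk (for example from maternal age) is converted to odds and multiplied by a likelihood ratio taken from the two normal distributions of a marker. The single marker, its parameters and the prior risk below are all hypothetical; real screening combines several correlated markers.

```python
from statistics import NormalDist


def updated_risk(prior_risk, marker_value, affected, unaffected):
    """Combine a prior risk with one marker's likelihood ratio to give a posterior risk."""
    prior_odds = prior_risk / (1 - prior_risk)
    likelihood_ratio = affected.pdf(marker_value) / unaffected.pdf(marker_value)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


# Hypothetical marker distributions on a standardised scale.
unaffected = NormalDist(mu=0.0, sigma=1.0)
affected = NormalDist(mu=1.0, sigma=1.0)

prior_risk = 1 / 500  # hypothetical age-related risk
risk_high_marker = updated_risk(prior_risk, 2.0, affected, unaffected)
risk_low_marker = updated_risk(prior_risk, 0.0, affected, unaffected)

print(f"Risk with high marker value: 1 in {1 / risk_high_marker:.0f}")
print(f"Risk with low marker value:  1 in {1 / risk_low_marker:.0f}")
```

A marker value that is more typical of affected pregnancies raises the risk above the prior, and a value more typical of unaffected pregnancies lowers it.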
Recent Developments Collection Analysis Interpretation or explanation Presentation
Collection Danish mother and child study –Recruiting people on the internet Linking data sets –Probabilistic linking, e.g. date of mother’s birth: fairly accurate; gestational age of baby: often wrong; weight of baby: REALLY ACCURATE !!!
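One widely used approach to probabilistic linking scores each pair of records by adding a weight per field, so that agreement on a reliable field counts for more than agreement on an unreliable one. The sketch below follows that idea (Fellegi-Sunter-style weights); it is not necessarily the method used in the lecture, and every probability in it is a hypothetical value chosen to mirror the accuracy ranking above.

```python
from math import log2

# For each field: m = P(agree | same person), u = P(agree | different people).
# All values are HYPOTHETICAL, for illustration only.
FIELDS = {
    "mother_dob": (0.95, 0.01),        # fairly accurate
    "gestational_age": (0.70, 0.10),   # often wrong
    "birth_weight": (0.99, 0.001),     # very accurate and highly discriminating
}


def match_weight(agreements):
    """Sum log2 likelihood-ratio weights over fields; higher means more likely a true match."""
    total = 0.0
    for field, agrees in agreements.items():
        m, u = FIELDS[field]
        total += log2(m / u) if agrees else log2((1 - m) / (1 - u))
    return total


all_agree = match_weight({"mother_dob": True, "gestational_age": True, "birth_weight": True})
gestation_differs = match_weight({"mother_dob": True, "gestational_age": False, "birth_weight": True})
weight_differs = match_weight({"mother_dob": True, "gestational_age": True, "birth_weight": False})

print(f"All fields agree:        {all_agree:.1f}")
print(f"Gestational age differs: {gestation_differs:.1f}")
print(f"Birth weight differs:    {weight_differs:.1f}")
```

A disagreement on gestational age barely dents the score because that field is often wrong anyway, whereas a disagreement on birth weight is strong evidence that the records belong to different babies.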
Analysis Meta-analysis Monte Carlo simulations Bayesian analysis Analysis of microarrays
Several studies looking at the same thing Each study may be relatively inconclusive because of too much uncertainty (too small) Statistical (mathematical) method of combining and presenting results from several studies Can indicate more robust results
Forest plot: pooled odds ratio for thiazolidinediones compared with other treatments for all cause mortality, showing the proportions dying in each group and the odds ratio for each study. Pooling relies on the logarithm of the odds ratio being approximately normally distributed. Eurich, D. T et al. BMJ 2007;335:497. Copyright ©2007 BMJ Publishing Group Ltd.
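A minimal sketch of how such a pooled odds ratio can be computed is inverse-variance (fixed-effect) pooling on the log scale. The three studies below are hypothetical counts, not data from the cited paper, and the paper itself may have used a different pooling model.

```python
from math import exp, log, sqrt


def log_odds_ratio(events_t, n_t, events_c, n_c):
    """Log odds ratio and its approximate variance from a 2x2 table."""
    a, b = events_t, n_t - events_t
    c, d = events_c, n_c - events_c
    return log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pool_fixed_effect(studies, z=1.96):
    """Inverse-variance weighted pooled odds ratio with an approximate 95% CI."""
    estimates = [log_odds_ratio(*study) for study in studies]
    weights = [1 / variance for _, variance in estimates]
    pooled = sum(w * lor for w, (lor, _) in zip(weights, estimates)) / sum(weights)
    se = sqrt(1 / sum(weights))
    return exp(pooled), exp(pooled - z * se), exp(pooled + z * se)


# HYPOTHETICAL studies: (deaths treated, n treated, deaths control, n control).
studies = [(12, 100, 18, 100), (30, 250, 40, 250), (8, 60, 10, 60)]

pooled_or, ci_lower, ci_upper = pool_fixed_effect(studies)
print(f"Pooled OR = {pooled_or:.2f} (95% CI {ci_lower:.2f} to {ci_upper:.2f})")
```

Larger studies have smaller variances and so receive more weight, and the pooled interval is narrower than that of any single study.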
Comparing institutions, individual doctors and identifying outliers What’s the problem? –Lots of variables important –Random variation –Random variation greater for smaller units or institutions Way of presenting the values for units so that this is taken into account
Funnel plot
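The control limits that give a funnel plot its shape can be sketched with a normal approximation to the binomial distribution. The benchmark rate and the three units below are hypothetical, and published funnel plots often use exact binomial limits instead.

```python
from math import sqrt


def funnel_limits(benchmark_rate, n, z=1.96):
    """Approximate 95% limits for an observed rate in a unit with n patients."""
    half_width = z * sqrt(benchmark_rate * (1 - benchmark_rate) / n)
    return max(0.0, benchmark_rate - half_width), min(1.0, benchmark_rate + half_width)


benchmark_rate = 0.10  # HYPOTHETICAL expected death rate

# HYPOTHETICAL units: (deaths, patients).
units = {"Unit A": (3, 20), "Unit B": (38, 400), "Unit C": (45, 300)}

outliers = []
for name, (deaths, patients) in units.items():
    rate = deaths / patients
    lower, upper = funnel_limits(benchmark_rate, patients)
    if not lower <= rate <= upper:
        outliers.append(name)
    print(f"{name}: rate {rate:.3f}, limits {lower:.3f} to {upper:.3f}")

print("Outside the funnel:", outliers)
```

Units A and C have the same death rate of 15%, but only the larger Unit C falls outside the limits: the funnel is wide for small units, where random variation is greater, and narrow for large ones.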
Conclusion As much about collection, interpretation and presentation as calculation Concepts of random variables, probability feed into analyses Making sense out of uncertainty Changing techniques as times change