Statistics for Neurosurgeons
A David Mendelow, Barbara A Gregson
Newcastle upon Tyne, England, UK
The Normal Distribution
Means and standard errors
Comparison of curves (significantly different vs. significantly bigger)
  Bell-shaped curves: Student’s t test
  Paired data: Student’s paired t test
  Skewed curves: non-parametric tests
    – Sign test (+ve, −ve)
    – Wilcoxon rank sum test
Types of data
  Binary, e.g. Yes/No, Male/Female
  Nominal, e.g. eye colour (blue/green/brown)
  Ordinal, e.g. normal/weak/paralysed, GCS eye
  Counts, e.g. no. of aneurysms, no. of operations
  Continuous, e.g. width of haematoma
Displaying data
  Bar chart
  Pie chart
  Histogram
  Box and whisker
  Scatterplot
Bar Chart and Pie Chart
  Total GCS at randomisation in STICH II (figures for the first 234 cases; median GCS = 13)
  Gender of patients in STICH II (figures for the first 234 cases)
Histograms (figures produced on 19/11/2009: 234 cases)
  Values in years: Mean = 63.8, Std = , Median = 65 years, Quartiles = 55, 74, Min = 20 years, Max = 94 years
  Values in ml: Mean = 39.5, Std = , Median = 35 ml, Quartiles = 22, 54, Min = 10 ml, Max = 96 ml
Boxplot (Box and Whisker Plot)
  Plot of volume of haematoma by age group in STICH.
Scatterplot
  Plot of 1,490 simultaneous end tidal and arterial CO₂ measurements. Dot areas are proportional to the number of measurements with that combination of values. End tidal CO₂ values tend to be lower than the corresponding PaCO₂ values (most points are below the equivalence line).
Summarising data
  Central tendency
    – Mean
    – Median
    – Mode
  Spread
    – Range
    – Interquartile range
    – Standard deviation/variance
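The measures listed above can be computed with Python's standard `statistics` module. This is a minimal sketch: the function name `summarise` and the sample volumes are made up for illustration (they are not STICH data), and the quartiles use the module's default interpolation method, which may differ slightly from other packages.

```python
import statistics

def summarise(values):
    """Return common measures of central tendency and spread."""
    # Quartile cut points (default 'exclusive' interpolation method)
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "sd": statistics.stdev(values),          # sample SD (n - 1 denominator)
        "variance": statistics.variance(values),  # sample variance
    }

# Hypothetical haematoma volumes in ml, for illustration only
volumes = [10, 22, 22, 35, 40, 54, 96]
summary = summarise(volumes)
```

For skewed data such as these volumes, the median and interquartile range are usually the more informative pair; the mean and standard deviation suit roughly bell-shaped data.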
Confidence intervals (95%)
  – statistic ± (1.96 × standard error)
  – e.g. difference between means ± (1.96 × standard error of difference)
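As a rough illustration of the ± 1.96 × standard error rule, the sketch below computes a 95% confidence interval for a single mean and for the difference between two independent means. The function names are hypothetical, and the 1.96 multiplier assumes samples large enough for the normal approximation; small samples would call for a t multiplier instead.

```python
import math
import statistics

def mean_ci95(values):
    """Mean with a 95% CI: mean ± 1.96 x standard error of the mean."""
    n = len(values)
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
    return mean, mean - 1.96 * se, mean + 1.96 * se

def diff_ci95(group_a, group_b):
    """Difference between two independent means ± 1.96 x SE of the difference."""
    diff = statistics.mean(group_a) - statistics.mean(group_b)
    se_diff = math.sqrt(
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    )
    return diff, diff - 1.96 * se_diff, diff + 1.96 * se_diff
```

If the interval for a difference excludes zero, the difference is significant at roughly the 5% level.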
Comparison of means
  Sample mean v population mean
    – One sample t-test
  Two small sample means
    – t-test (assuming equal variance)
    – t-test (assuming unequal variance)
  Two paired sample means
    – Paired t-test
  Large samples
    – z-test
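The t statistics behind these tests can be sketched with the standard library alone. The function names below are hypothetical, and each returns only the statistic and its degrees of freedom; turning these into a p-value needs t tables or a statistics package. The unequal-variance version is assumed here to be Welch's form with the Satterthwaite approximation for the degrees of freedom.

```python
import math
import statistics

def one_sample_t(sample, population_mean):
    """t statistic and degrees of freedom for sample mean v population mean."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - population_mean) / se, n - 1

def two_sample_t_equal_var(a, b):
    """Two-sample t statistic using a pooled variance estimate."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a)
              + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se, na + nb - 2

def two_sample_t_unequal_var(a, b):
    """Welch's t statistic with Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a) / na, statistics.variance(b) / nb
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df

def paired_t(first, second):
    """Paired t-test: a one-sample t-test on the within-pair differences."""
    diffs = [x - y for x, y in zip(first, second)]
    return one_sample_t(diffs, 0.0)
```

With large samples the same statistics can be referred to the normal distribution, which is the z-test in the list above.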
Comparison of tables (2x2)
  Fisher’s exact test
    p = (r1! r2! c1! c2!) / (n! a! b! c! d!)
  Chi-squared test
    Observed vs. expected frequencies
  Table layout
    a    b    r1
    c    d    r2
    c1   c2   n
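The factorial formula gives the probability of one particular table with fixed row and column totals. A sketch of how that builds into a two-sided Fisher's exact p-value follows. It assumes the common convention of summing the probabilities of all tables no more likely than the observed one; the function names are hypothetical.

```python
from math import comb

def table_probability(a, b, c, d):
    """Probability of one 2x2 table with fixed margins.

    Algebraically equal to r1! r2! c1! c2! / (n! a! b! c! d!).
    """
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2
    return comb(r1, a) * comb(r2, c) / comb(n, c1)

def fisher_exact_two_sided(a, b, c, d):
    """Sum the probabilities of all tables as or less likely than the observed."""
    r1, r2, c1 = a + b, c + d, a + c
    observed = table_probability(a, b, c, d)
    total = 0.0
    # x runs over every value the top-left cell can take given the margins
    for x in range(max(0, c1 - r2), min(r1, c1) + 1):
        p = table_probability(x, r1 - x, c1 - x, r2 - (c1 - x))
        if p <= observed * (1 + 1e-9):  # small tolerance for rounding
            total += p
    return total
```

For the table a = 3, b = 1, c = 1, d = 3, the single-table probability is 16/70 and the two-sided p-value is 34/70.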
Chi-squared test
    a    b    r1
    c    d    r2
    c1   c2   n
  McNemar’s χ² = (b − c)² / (b + c), where b and c are the discordant pairs
  degrees of freedom = (rows − 1)(columns − 1) = 1
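A sketch of the observed-versus-expected calculation for a 2x2 table, without continuity correction. The function names are hypothetical. The p-value uses the fact that a chi-squared variable with 1 degree of freedom is the square of a standard normal, so its upper tail can be written with `math.erfc`. McNemar's statistic is taken here in its usual form on the two discordant-pair counts of a paired table.

```python
import math

def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic and p-value (1 df) for a 2x2 table."""
    r1, r2, c1, c2 = a + b, c + d, a + c, b + d
    n = r1 + r2
    observed = [a, b, c, d]
    # Expected frequency = row total x column total / grand total
    expected = [r1 * c1 / n, r1 * c2 / n, r2 * c1 / n, r2 * c2 / n]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-squared, 1 df
    return chi2, p

def mcnemar(discordant_1, discordant_2):
    """McNemar's chi-squared (1 df) from the two discordant-pair counts."""
    chi2 = (discordant_1 - discordant_2) ** 2 / (discordant_1 + discordant_2)
    return chi2, math.erfc(math.sqrt(chi2 / 2))
```

When any expected frequency is small, Fisher's exact test is the safer choice.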
Relative risk, sensitivity and specificity
                   Test +ve   Test -ve
    Disease yes       a          b        r1
    Disease no        c          d        r2
  Sensitivity = a/r1
  Specificity = d/r2
  Positive predictive value = a/(a + c)
  Negative predictive value = d/(b + d)
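These four ratios translate directly into code. The sketch below uses the same cell labels as the table (rows are disease yes/no, columns are test +ve/-ve); the function name and the example counts are made up for illustration.

```python
def diagnostic_summary(a, b, c, d):
    """Diagnostic accuracy measures from a 2x2 table.

    a: diseased, test +ve      b: diseased, test -ve
    c: not diseased, test +ve  d: not diseased, test -ve
    """
    return {
        "sensitivity": a / (a + b),  # a / r1
        "specificity": d / (c + d),  # d / r2
        "ppv": a / (a + c),          # positive predictive value
        "npv": d / (b + d),          # negative predictive value
    }

# Hypothetical counts, for illustration only
result = diagnostic_summary(a=90, b=10, c=20, d=80)
```

Unlike sensitivity and specificity, the predictive values depend on how common the disease is in the group tested.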
Comparison of related values: a. Linear regression (best linear fit)
Linear regression (best linear fit)
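A minimal sketch of an ordinary least squares fit, assuming one predictor x and one outcome y. The function name is hypothetical. It returns the slope and intercept of the best-fitting line together with Pearson's correlation coefficient r.

```python
import math
import statistics

def least_squares_line(x, y):
    """Slope, intercept and Pearson r of the least squares line y = a + b*x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx  # the fitted line passes through (mx, my)
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r
```

A high r shows that two measurements are linearly related, not that they agree with each other.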
Comparison of related values: b. Bland-Altman plots
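A sketch of the quantities plotted when comparing two measurement methods in this way: for each pair, the difference is plotted against the mean of the two values, with the mean difference (bias) and 95% limits of agreement (bias ± 1.96 × SD of the differences) drawn as horizontal lines. The function name is hypothetical and no plotting library is assumed.

```python
import statistics

def agreement_summary(method_a, method_b):
    """Per-pair means and differences, bias, and 95% limits of agreement."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    means = [(a + b) / 2 for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return means, diffs, bias, limits
```

Plotting `diffs` against `means` then shows whether the disagreement changes with the size of the measurement.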
Statistical tests comparing two samples
  Binary
    – Large frequencies: χ², compare proportions, odds ratio
    – Small frequencies: Fisher’s exact
  Nominal, not ordered
    – Large frequencies: χ²
    – Small frequencies: combine categories
  Nominal, ordered
    – Large frequencies: χ² for trend
  Ordinal
    – Mann-Whitney U test
  Continuous
    – Large samples: Normal distribution for means
    – Small normal samples: Two sample t test
    – Small non-normal samples: Mann-Whitney U test
Statistical tests for paired or matched data
  Binary: McNemar
  Nominal: Stuart test
  Ordinal: Sign test
  Continuous (small, non-normal): Wilcoxon matched pairs
  Continuous (small, normal): Paired t-test
  Continuous (large): Normal distribution
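Of the paired tests, the sign test is the simplest to sketch: count how many pairs changed in each direction, drop the ties, and ask how likely such a split would be if each direction had probability 1/2. The function name is hypothetical, and the two-sided p-value is taken as twice the smaller binomial tail.

```python
from math import comb

def sign_test(first, second):
    """Two-sided sign test on paired observations (ties are dropped)."""
    diffs = [s - f for f, s in zip(first, second) if s != f]
    n = len(diffs)
    positives = sum(1 for d in diffs if d > 0)
    negatives = n - positives
    k = min(positives, negatives)
    # Binomial(n, 1/2) probability of k or fewer in the smaller group
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return positives, negatives, min(1.0, 2 * tail)
```

With 7 increases and 1 decrease out of 8 untied pairs, the two-sided p-value is 18/256, about 0.07.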
Choice of test for independent observations
  Outcome variable (columns): Nominal, Categ >2, Categ Ordered, Ordinal, Non-normal, Normal
  Input variable (rows):
    Nominal: χ² Fisher χ² χ² χ² trend Mann-Whitney Log rank Student’s t Normal test
    Categ >2: χ² χ² χ² χ² χ² χ² Kruskal-Wallis Analysis of variance
    Categ Ordered: χ² trend Mann-Whitney χ² χ² Kendall’s rank Linear regression
    Ordinal: Logistic regression Kruskal-Wallis Kendall’s rank Spearman rank Spearman rank Linear regression
    Non-normal: Logistic regression Kruskal-Wallis Kendall’s rank Spearman rank Spearman rank and linear regression
    Normal: Logistic regression Spearman rank Spearman rank and linear regression Pearson and Linear regression
Relative risk and odds ratios
              With disease   Without disease
    Male           a               b            r1
    Female         c               d            r2
  Risk for men p1 = a/r1
  Risk for women p2 = c/r2
    – Relative risk = p1/p2
  Odds for men = a/b
  Odds for women = c/d
    – Odds ratio = (a/b)/(c/d) = ad/bc
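The two ratios can be sketched in a few lines, using the same cell labels as the table. The function name and the example counts are hypothetical.

```python
def relative_risk_and_odds_ratio(a, b, c, d):
    """Relative risk and odds ratio for a 2x2 table.

    a, b: first group with / without disease
    c, d: second group with / without disease
    """
    p1 = a / (a + b)  # risk in the first group (a / r1)
    p2 = c / (c + d)  # risk in the second group (c / r2)
    relative_risk = p1 / p2
    odds_ratio = (a * d) / (b * c)
    return relative_risk, odds_ratio

# Hypothetical counts: 20 of 100 men and 10 of 100 women with disease
rr, odds = relative_risk_and_odds_ratio(20, 80, 10, 90)
```

In this example the relative risk is 2 and the odds ratio is 2.25; the two are close only when the disease is rare in both groups.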
Multivariate techniques
  Multiple linear regression
  Logistic regression
  Survival analysis
    – Kaplan-Meier
    – Cox proportional hazard model
Kaplan-Meier plot of survival: Early Surgery vs. Initial Conservative Treatment
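A sketch of the product-limit calculation behind a survival curve of this kind. At each time where a death occurs, the running survival probability is multiplied by the fraction of patients still at risk who survive that time; censored patients leave the at-risk group without changing the curve. The function name and the input layout (one follow-up time and one event flag per patient) are assumptions for illustration.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient
    events: True if death was observed at that time, False if censored
    Returns a list of (time, survival probability) at each death time.
    """
    survival = 1.0
    at_risk = len(times)
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve
```

Running this separately for each treatment arm gives the two step curves; comparing them formally would need a log rank test or a Cox model.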
Type I and type II errors
                           Null hypothesis
    Test result         False                True
    Significant         Power (1 − β)        Type I error (α)
    Not significant     Type II error (β)
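To make the link between the two error rates concrete, the sketch below approximates the power of a two-sided test comparing two means, given the type I error rate, the true difference, the standard deviation and the group size. It assumes a normal (z) approximation with equal group sizes and a common standard deviation; the function and parameter names are hypothetical.

```python
import math
from statistics import NormalDist

def power_two_sample_z(delta, sd, n_per_group, alpha=0.05):
    """Approximate power (1 - type II error) of a two-sided two-sample z-test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)        # critical value, about 1.96 for 0.05
    se = sd * math.sqrt(2 / n_per_group)      # SE of the difference in means
    shift = abs(delta) / se
    # Probability that the test statistic falls beyond either critical value
    return z.cdf(shift - z_alpha) + z.cdf(-shift - z_alpha)
```

When the true difference is zero the result equals the type I error rate; the type II error rate is one minus the returned power.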
ROC Curves
  Multiple chi-squared 2 x 2 tests
  See www.
Multiple 2x2 tables = ROC Curve
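The idea that each cut-off yields its own 2x2 table can be sketched as follows. Every distinct test value is used in turn as the threshold for calling the test positive, and the resulting sensitivity is paired with 1 − specificity. The function names are hypothetical, the input is assumed to contain both diseased and non-diseased cases, and the area under the curve is computed by the trapezium rule.

```python
def roc_points(scores, has_disease):
    """One (1 - specificity, sensitivity) point per threshold, plus the origin."""
    positives = sum(1 for d in has_disease if d)
    negatives = len(has_disease) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        # Cells of the 2x2 table when 'score >= threshold' counts as test +ve
        true_pos = sum(1 for s, d in zip(scores, has_disease)
                       if s >= threshold and d)
        false_pos = sum(1 for s, d in zip(scores, has_disease)
                        if s >= threshold and not d)
        points.append((false_pos / negatives, true_pos / positives))
    return points

def area_under_curve(points):
    """Trapezium-rule area under the ROC curve."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))
```

An area of 0.5 corresponds to a test no better than chance and 1.0 to perfect separation.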