Summary Statistics & Confidence Intervals Annie Herbert Medical Statistician Research & Development Support Unit Salford Royal NHS Foundation Trust



Timetable
Time | Task
60 mins | Presentation
20 mins | Coffee Break
90 mins | Practical Tasks in IT Room

Outline
- Sampling
- Summary statistics
- Confidence intervals
- Statistics packages

‘Population’ and ‘Sample’
- Studying population of interest.
- Usually would like to know typical value and spread of outcome measure in population.
- Data from entire population usually impossible or inefficient/expensive, so take a sample (even census data can have missing values).
- Sample must be representative of population. Randomise!

E.g. Randomised Controlled Trial (RCT)
Population → Sample → Randomisation → Group 1 / Group 2 → Outcome

Types of Data
Categorical
- Example: Yes/No, Blood Group
- Graphs: Bar Chart, Pie Chart
- Summary: Frequency (n), Proportion (%)
Numerical/Continuous
- Example: Weight, Pain Score
- Graphs: Histogram, Box and Whisker Plot
- Summary: Mean & Standard Deviation (SD), Median & Inter-quartile range (IQR)

Types of Average
(‘Average’ - a number which typifies a set of numbers)
- Mean = Total divided by n
- Median = Middle value
- Mode = Most common value/group (rarely used)

Types of Average - Example
Pain score data: 10, 8, 7, 7, 1, 7, 6, 5, 3, 4
Ordered: 1, 3, 4, 5, 6, 7, 7, 7, 8, 10
- Mean = ( … + 10) ÷ 10 = 5.8
- Median = (6+7) ÷ 2 = 6.5 (midpoint of the 5th and 6th ordered values)
- Mode = 7
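The three averages for the pain score data can be checked with a short sketch. Python and its standard `statistics` module are an assumption here; the presentation itself works with SPSS and StatsDirect.

```python
import statistics

# Pain score data from the slide
pain_scores = [10, 8, 7, 7, 1, 7, 6, 5, 3, 4]

mean = statistics.mean(pain_scores)      # total divided by n
median = statistics.median(pain_scores)  # midpoint of the two middle ordered values
mode = statistics.mode(pain_scores)      # most common value

print(f"Mean = {mean}, Median = {median}, Mode = {mode}")
```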

Mean or Median?
- Roughly Normally distributed: Mean or median (mean by convention)
- Skewed: Median (less affected by extreme values)

Variation and Spread
Standard Deviation (‘SD’)
- Average distance from mean
- Use alongside mean
Inter-Quartile Range (‘IQR’)
- Range in which middle 50% of the data lie (middle 50% when ordered)
- Use alongside median
Range
- Highest and lowest value
- Possibly quote in addition to SD/IQR

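Types of Variation - Example
Pain score data: 10, 8, 7, 7, 1, 7, 6, 5, 3, 4
Ordered: 1, 3, 4, 5, 6, 7, 7, 7, 8, 10
- SD = 2.6
- IQR = (3.75, 7.25) (lower quartile lies between the 2nd and 3rd ordered values, upper quartile between the 8th and 9th; the median lies between the 5th and 6th)
- Range = (1, 10)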
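A sketch of the same measures of spread in Python (an assumption; the slides do not show code). The slide does not say which quartile rule it uses. The default 'exclusive' method of `statistics.quantiles` happens to reproduce its figures, while other rules give slightly different quartiles for a sample this small.

```python
import statistics

# Pain score data from the slide
pain_scores = [10, 8, 7, 7, 1, 7, 6, 5, 3, 4]

# Sample standard deviation (divides by n - 1)
sd = statistics.stdev(pain_scores)

# Quartiles using the default 'exclusive' method (assumed to match the slide)
q1, q2, q3 = statistics.quantiles(pain_scores, n=4)

# Range quoted as (lowest, highest)
data_range = (min(pain_scores), max(pain_scores))

print(f"SD = {sd:.1f}, IQR = ({q1}, {q3}), Range = {data_range}")
```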

Standard Error
- Not the same as standard deviation.
- Calculated using a measure of variability and sample size (for a mean: SD ÷ √n).
- Used to construct confidence intervals.
- Not very informative when given alongside statistics or as error bars on a plot.
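To illustrate the difference from the SD, here is a minimal sketch of the standard error of a mean, SD ÷ √n, using the pain score data from the earlier slides. The helper name `standard_error` and the larger sample sizes are hypothetical, chosen only to show how the SE shrinks as n grows while the SD describes the same spread.

```python
import math
import statistics

pain_scores = [10, 8, 7, 7, 1, 7, 6, 5, 3, 4]


def standard_error(sd, n):
    """Standard error of a mean: SD divided by the square root of n."""
    return sd / math.sqrt(n)


sd = statistics.stdev(pain_scores)
se = standard_error(sd, len(pain_scores))
print(f"n = {len(pain_scores)}: SD = {sd:.2f}, SE = {se:.2f}")

# Hypothetical larger samples with the same SD: quadrupling n halves the SE
for n in (40, 160):
    print(f"n = {n}: SD = {sd:.2f}, SE = {standard_error(sd, n):.2f}")
```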

Sample statistic is the best guess of the (true) population value
- E.g. sample mean is the best estimate of the mean in the population.
- Mean is likely to be different if we take a new sample from the population.
- So we know the estimate is not likely to be exactly right.

Confidence Intervals (CIs)
- Confidence interval = “range of values that we can be confident will contain the true value of the population”.
- The “give or take a bit” for the best estimate.
- Convention is to use a 95% confidence interval (‘95% CI’).
- This still leaves a 5% chance that the interval does not contain the true value.
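As an illustration of “best estimate, give or take a bit”, the sketch below builds a 95% CI for the mean pain score as mean ± t × SE. This calculation is not on the slides. It assumes the usual t-based interval, with the critical value 2.262 (9 degrees of freedom) taken from a t table, because the Python standard library has no t distribution.

```python
import math
import statistics

pain_scores = [10, 8, 7, 7, 1, 7, 6, 5, 3, 4]

n = len(pain_scores)
mean = statistics.mean(pain_scores)
se = statistics.stdev(pain_scores) / math.sqrt(n)

# Two-sided 95% critical value of the t distribution with n - 1 = 9 degrees
# of freedom, looked up from a table (assumption: a t-based interval is wanted)
T_CRIT_95_DF9 = 2.262

margin = T_CRIT_95_DF9 * se  # the "give or take a bit"
ci_lower, ci_upper = mean - margin, mean + margin

print(f"Mean = {mean}, 95% CI {ci_lower:.2f} to {ci_upper:.2f}")
```

With only 10 observations the interval is wide, which matches the point that a single sample mean is unlikely to be exactly right.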

Example: Legislation for smoke-free workplaces and health of bar workers in Ireland: before and after study (Allwright et al; BMJ Oct 2005)
Measure | Before (N=138) | After (N=138) | Difference (95% CI)
Salivary cotinine (nmol/l), median |  |  | (-26.7 to -19.0)
Any respiratory symptoms, n (%) | 90 (65%) | 67 (49%) | -16.7 (-26.1 to -7.3)
Runny nose/sneezing, n (%) | 61 (44%) | 48 (35%) | -9.4 (-19.8 to 0.9)
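The percentages and differences in the two symptom rows follow directly from the counts, as the sketch below shows (Python is an assumption; `pct` is a hypothetical helper). The confidence intervals are taken from the paper and are not recomputed here, because the before and after measurements are paired and the slide does not state the method used.

```python
N = 138  # bar workers measured both before and after the legislation

# (count before, count after), as given on the slide
symptoms = {
    "Any respiratory symptoms": (90, 67),
    "Runny nose/sneezing": (61, 48),
}


def pct(count, total):
    """Count expressed as a percentage of the total."""
    return 100 * count / total


rows = {}
for name, (before, after) in symptoms.items():
    diff = pct(after, N) - pct(before, N)  # change in percentage points
    rows[name] = (round(pct(before, N)), round(pct(after, N)), round(diff, 1))
    print(f"{name}: {rows[name][0]}% -> {rows[name][1]}%, difference {rows[name][2]}")
```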

Example: Supplementary feeding with either ready-to-use fortified spread or corn-soy blend in wasted adults starting antiretroviral therapy in Malawi (MacDonald et al; BMJ May 2009)
“After 14 weeks, patients receiving fortified spread had a greater increase in BMI and fat-free body mass than those receiving corn-soy blend: 2.2 (SD 1.9) v 1.7 (SD 1.6) (difference 0.5, 95% confidence interval 0.2 to 0.8), and 2.9 (SD 3.2) v 2.2 (SD 3.0) kg (difference 0.7 kg, 0.2 to 1.2 kg), respectively.”

Example: Sample size matters
What proportion of patients attending clinic are satisfied?
Sample size | Number satisfied | Proportion satisfied | 95% CI for proportion
10 | 7 | 70% | 35% to 93%
 |  |  | 50% to 88%
 |  |  | 55% to 82%
 |  |  | 60% to 79%
 |  |  | 67% to 73%

Example: % confidence matters
What proportion of patients attending clinic are satisfied?
Sample size = 50, No. satisfied = 35, Proportion satisfied = 70%
- 90% CI: 58% to 81%
- 95% CI: 55% to 82%
- 99% CI: 51% to 85%
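Both effects, sample size and confidence level, can be explored with a sketch of an exact (Clopper-Pearson) interval for a proportion. The slides do not say which method produced their intervals. The exact method is an assumption, chosen because it is consistent with the quoted 35% to 93% for 7 out of 10. The function names are hypothetical, and the bounds are found by simple bisection so that only the standard library is needed.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) when X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _root(f):
    """Bisection on [0, 1] for a function that rises from negative to positive."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x, n, level=0.95):
    """Exact (Clopper-Pearson) CI for a proportion of x successes out of n."""
    tail = (1 - level) / 2
    # Lower bound: the p at which P(X >= x) equals the tail probability
    lower = 0.0 if x == 0 else _root(lambda p: (1 - binom_cdf(x - 1, n, p)) - tail)
    # Upper bound: the p at which P(X <= x) equals the tail probability
    upper = 1.0 if x == n else _root(lambda p: tail - binom_cdf(x, n, p))
    return lower, upper


# Sample size matters: same 70% satisfied, larger sample, narrower interval
for x, n in [(7, 10), (35, 50)]:
    lo, hi = exact_ci(x, n)
    print(f"n = {n}: {x / n:.0%} satisfied, 95% CI {lo:.0%} to {hi:.0%}")

# % confidence matters: same data, higher confidence, wider interval
for level in (0.90, 0.95, 0.99):
    lo, hi = exact_ci(35, 50, level)
    print(f"{level:.0%} CI: {lo:.0%} to {hi:.0%}")
```

A statistics package would normally be used for this; the bisection is only there to keep the sketch self-contained.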

p-values vs. Confidence Intervals
p-value:
- Weight of evidence to reject null hypothesis
- No clinical interpretation
Confidence Interval:
- Can be used to reject null hypothesis
- Clinical interpretation
- Effect size
- Direction of effect
- Precision of population estimate
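One way to read “can be used to reject null hypothesis” is to check whether the interval excludes the null value. The sketch below applies this to two intervals from the smoke-free workplaces example, treating a difference of 0 as the null value. The function name `excludes_null` is hypothetical, and the rule is the usual informal reading of a 95% CI rather than anything stated on the slides.

```python
def excludes_null(ci, null_value=0.0):
    """True if the null value lies outside the (lower, upper) interval."""
    lower, upper = ci
    return not (lower <= null_value <= upper)


# 95% CIs for the before/after difference (percentage points), from the example
differences = {
    "Any respiratory symptoms": (-26.1, -7.3),
    "Runny nose/sneezing": (-19.8, 0.9),
}

for name, ci in differences.items():
    verdict = "excludes 0" if excludes_null(ci) else "includes 0"
    print(f"{name}: 95% CI {ci[0]} to {ci[1]} {verdict}")
```

Unlike a p-value alone, each interval also shows the direction of the effect (both estimates are reductions) and how precisely it has been estimated.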

So… it’s not all about p-values! For some hypotheses p-value and CI will both indicate whether to reject it or not. A CI will also provide an estimate, as well as a range for that estimate. General medical journals prefer CI.

Statistical Packages
Package | Summary Statistics | Confidence Intervals
SPSS | Not user-friendly; gives a large choice of statistics to calculate | Doesn’t provide a CI for some key comparative statistics, e.g. simple percentage
StatsDirect | One right-click; will produce a set of 20 or so of the most commonly used statistics | Provides a CI for most statistics

Thanks for listening!