Introduction to Statistics Dr Linda Morgan Clinical Chemistry Division School of Clinical Laboratory Sciences.



Outline
Types of data
Descriptive statistics
Estimates and confidence intervals
Hypothesis testing
Comparing groups
Relation between variables
Statistical aspects of study design
Pitfalls

Types of data
Categorical data
–Ordered categorical data
Numerical data
–Discrete
–Continuous

Descriptive statistics
Categorical variables
Graphical representation – bar diagram
Numbers and proportions in each category

Descriptive statistics
Continuous variables
Distributions
–Gaussian
–Lognormal
–Non-parametric
Central tendency
–Mean
–Median
Scatter
–Standard deviation
–Range
–Interquartile range

Gaussian (normal) distribution

Gaussian (normal) distribution
Central tendency
Mean $= \sum x / n$
Scatter
Variance $= \sum (x - \text{mean})^2 / (n - 1)$
Standard deviation $= \sqrt{\text{variance}}$

Lognormal distribution

Mean $= \sum \log x / n$
Geometric mean = antilog of mean ($10^{\text{mean}}$)
Median
–Rank data in order
–Median = $(n+1)/2$ th observation
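The lognormal summary measures above can be sketched in a few lines of Python. This is only an illustration: the function names and the sample values are hypothetical, base-10 logs are assumed because the slide writes the antilog as a power of 10, and averaging the two middle observations for even n is the usual convention rather than something the slide states.

```python
import math

def geometric_mean(values):
    """Antilog (base 10) of the mean of the log10 values."""
    mean_log = sum(math.log10(v) for v in values) / len(values)
    return 10 ** mean_log

def median(values):
    """Middle observation of the ranked data, i.e. the (n+1)/2 th observation.

    For even n, (n+1)/2 falls between two ranks, so the two middle
    observations are averaged (usual convention, assumed here).
    """
    ranked = sorted(values)
    n = len(ranked)
    mid = n // 2
    if n % 2 == 1:
        return ranked[mid]
    return (ranked[mid - 1] + ranked[mid]) / 2

sample = [1.0, 10.0, 100.0]  # hypothetical, positively skewed values
gm = geometric_mean(sample)
med = median(sample)
print(gm, med)
```

For skewed data like this, the geometric mean and the median sit close together, while the arithmetic mean is pulled towards the largest value.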

Variability
Variance $= \sum (x - \text{mean})^2 / (n - 1)$
Standard deviation $= \sqrt{\text{variance}}$
Range
Interquartile range
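As a rough illustration of the mean, variance and standard deviation formulas, the following Python sketch computes them directly from their definitions. The function names and the data are hypothetical and not part of the lecture.

```python
import math

def mean(values):
    """Sum of the observations divided by n."""
    return sum(values) / len(values)

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / (len(values) - 1)

def standard_deviation(values):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(values))

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical measurements
m = mean(data)
var = sample_variance(data)
sd = standard_deviation(data)
value_range = max(data) - min(data)
print(m, var, sd, value_range)
```

The standard library's `statistics.mean`, `statistics.variance` and `statistics.stdev` use the same n - 1 definition and could be used instead.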

Variability of Sample Mean
The sample mean is an estimate of the population mean
The standard error of the mean (SEM) describes the distribution of the sample mean
Estimated SEM $= \mathrm{SD}/\sqrt{n}$
The distribution of the sample mean is Normal provided n is large

Standard error of the difference between two means
SEM $= \mathrm{SD}/\sqrt{n}$
Variance of the mean $= \mathrm{SD}^2/n$
Variance of the difference between two sample means = sum of the variances of the two means $= \mathrm{SD}_1^2/n_1 + \mathrm{SD}_2^2/n_2$
SE of difference between means $= \sqrt{\mathrm{SD}_1^2/n_1 + \mathrm{SD}_2^2/n_2}$
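A minimal Python sketch of these two standard errors follows. The function names and the input numbers are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math

def sem(sd, n):
    """Standard error of the mean: SD divided by the square root of n."""
    return sd / math.sqrt(n)

def se_difference(sd1, n1, sd2, n2):
    """SE of the difference between two independent sample means.

    The variances of the two means (SD^2 / n) are summed, then the
    square root is taken.
    """
    return math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

# Hypothetical groups: SD 10 with n = 25, and SD 12 with n = 36
sem_1 = sem(10.0, 25)
sem_2 = sem(12.0, 36)
se_diff = se_difference(10.0, 25, 12.0, 36)
print(sem_1, sem_2, se_diff)
```

Note that the SE of the difference is larger than either SEM alone, because the uncertainty in both means contributes to it.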

Variability of a sample proportion
Assume Normal distribution when $np$ and $n(1-p)$ are > 5
SE of a Binomial proportion $= \sqrt{pq/n}$, where $q = 1 - p$

Standard error of the difference between two proportions
$\mathrm{SE}(p_1 - p_2) = \sqrt{\mathrm{variance}(p_1) + \mathrm{variance}(p_2)} = \sqrt{p_1 q_1/n_1 + p_2 q_2/n_2}$

Confidence intervals of means
95% CI for the mean = sample mean $\pm\ 1.96 \times \mathrm{SEM}$
95% CI for difference between 2 means $= (\text{mean}_1 - \text{mean}_2) \pm 1.96 \times$ SE of difference

Confidence intervals of proportions
95% CI for proportion $= p \pm 1.96 \sqrt{pq/n}$
95% CI for difference between two proportions $= (p_1 - p_2) \pm 1.96 \times \mathrm{SE}(p_1 - p_2)$
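The confidence intervals for means and proportions all share the form "estimate ± 1.96 × SE", which the sketch below makes explicit. The helper names and example figures are hypothetical; the 1.96 multiplier relies on the Normal approximation described in the slides.

```python
import math

Z_95 = 1.96  # Normal multiplier for a 95% confidence interval

def ci95(estimate, se):
    """Return (lower, upper) limits: estimate +/- 1.96 x SE."""
    return (estimate - Z_95 * se, estimate + Z_95 * se)

def se_proportion(p, n):
    """SE of a Binomial proportion: sqrt(pq / n) with q = 1 - p."""
    return math.sqrt(p * (1 - p) / n)

def se_diff_proportions(p1, n1, p2, n2):
    """SE of the difference between two independent proportions."""
    return math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

def normal_approx_ok(p, n):
    """Rule of thumb from the slides: np and n(1 - p) both above 5."""
    return n * p > 5 and n * (1 - p) > 5

# Hypothetical example: 50 of 100 subjects have the characteristic
p, n = 0.5, 100
lower, upper = ci95(p, se_proportion(p, n))
print(normal_approx_ok(p, n), lower, upper)
```

The same `ci95` helper applies to a mean (pass the SEM) or to a difference (pass the SE of the difference).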

Hypothesis testing The null hypothesis The alternative hypothesis What is a P value?

Comparing 2 groups of continuous data Normal distribution: paired or unpaired t test Non-Normal distribution: transform data OR Mann-Whitney-Wilcoxon test

Paired t test We wish to compare the fasting blood cholesterol levels in 10 subjects before and after treatment with a new drug. What is the null hypothesis?

Paired t test
[Table: Subject number | Fasting cholesterol, pre-drug | Fasting cholesterol, post-drug | D]

Paired t test
Calculate the mean and SEM of D
The null hypothesis is that the mean of D = 0
The test statistic $t = (\mathrm{mean}(D) - 0) / \mathrm{SEM}(D)$

Paired t test Mean = 0.62 SEM = t = Degrees of freedom = n - 1 = 9 From tables of t, 2-tailed probability (P) is between 0.1 and 0.2 How would you interpret this?
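The paired t calculation can be sketched as below. The cholesterol values are invented for illustration (the lecture's own data table is not available in this transcript), so the output will not match the slide. Taking D as pre-drug minus post-drug is also an assumption. The standard library has no t distribution, so the resulting t is compared with tables of t, as in the lecture.

```python
import math
import statistics

def paired_t(pre, post):
    """Return (mean of D, SEM of D, t, degrees of freedom) for paired data.

    D is taken as pre minus post; this direction is an assumption.
    """
    d = [a - b for a, b in zip(pre, post)]
    n = len(d)
    mean_d = statistics.mean(d)
    sem_d = statistics.stdev(d) / math.sqrt(n)  # stdev divides by n - 1
    t = (mean_d - 0) / sem_d                    # null hypothesis: mean D = 0
    return mean_d, sem_d, t, n - 1

# Hypothetical fasting cholesterol values for 10 subjects
pre_drug = [6.1, 5.8, 7.2, 6.5, 5.9, 6.8, 7.0, 6.3, 5.6, 6.9]
post_drug = [5.7, 5.9, 6.4, 6.6, 5.2, 6.1, 6.9, 5.8, 5.7, 6.0]

mean_d, sem_d, t, df = paired_t(pre_drug, post_drug)
print(f"mean D = {mean_d:.2f}, SEM = {sem_d:.2f}, t = {t:.2f}, df = {df}")
```

If SciPy is available, `scipy.stats.ttest_rel` performs the same test and also returns the two-tailed P value.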

Comparing 2 groups of categorical data In a study of the effect of smoking on the risk of developing ischaemic heart disease, 250 men with IHD and 250 age-matched healthy controls were asked about their current smoking habits. What is the null hypothesis?

Results
70 of the 250 patients were smokers
30 of the healthy controls were smokers
          Smoker   Non-smoker   Total
IHD           70          180     250
Control       30          220     250
Total        100          400     500

Calculate expected values, E, for each cell (E = row total × column total / grand total)
          Smoker   Non-smoker   Total
IHD           50          200     250
Control       50          200     250
Total        100          400     500

Calculate (observed – expected) value, D
          Smoker           Non-smoker
IHD       70 – 50 = 20     180 – 200 = –20
Control   30 – 50 = –20    220 – 200 = 20

Calculate $D^2/E$
          Smoker        Non-smoker
IHD       400/50 = 8    400/200 = 2
Control   400/50 = 8    400/200 = 2

Calculate the sum of $D^2/E$ = 20
This is the test statistic, chi squared
Compare with tables of chi squared with (r-1)(c-1) degrees of freedom
In this case, chi squared with 1 df has a P value of <
How do you interpret this?
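The chi-squared steps for the smoking and IHD example can be reproduced with the short sketch below. The observed counts come from the study description (70 of 250 patients and 30 of 250 controls were smokers). The function names are hypothetical, and the P value shortcut used here is valid only for 1 degree of freedom, where chi squared is the square of a standard Normal variable.

```python
import math

# Observed counts: rows are IHD and Control, columns are Smoker and Non-smoker
observed = [[70, 250 - 70],
            [30, 250 - 30]]

def chi_squared(table):
    """Return (chi-squared statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (obs - expected) ** 2 / expected  # D^2 / E for this cell
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def p_value_1df(stat):
    """Upper-tail P value for chi squared with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))

stat, df = chi_squared(observed)
p = p_value_1df(stat)
print(stat, df, p)
```

For larger tables the P value needs the general chi-squared distribution, for example from statistical tables or a statistics package.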

Statistical analysis using computer software SPSS as an example

Planning Experimental design Suitable controls Database design

Statistical power
The power of a study to detect an effect depends on:
–The size of the effect
–The sample size
The probability of failing to detect an effect where one exists is called $\beta$
The power of a study is $100(1-\beta)$%
Wide confidence intervals indicate low statistical power

Statistical power The necessary sample size to detect the effect of interest should be calculated in advance Pilot data are usually required for these calculations

Statistical power - example
30% of the population are carriers of a genetic variant. You wish to test whether this variant increases the risk of Alzheimer's disease.
For P < 0.05, and 80% power, number of controls and cases required:
Control carriers   Case carriers   Sample size
30%                50%
30%                40%
30%                35%             1400
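One common way to estimate such sample sizes is the Normal-approximation formula for comparing two proportions, sketched below. This is an assumption: the lecture does not say which formula or software produced its figures, and the function name is hypothetical. The result is the number needed in each group.

```python
import math
from statistics import NormalDist

def sample_size_per_group(p_control, p_case, alpha=0.05, power=0.80):
    """Approximate number per group to detect p_control versus p_case.

    Uses n = (z_alpha/2 + z_beta)^2 * (p1*q1 + p2*q2) / (p1 - p2)^2,
    a simple two-sided Normal approximation without continuity correction.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    variance_sum = p_control * (1 - p_control) + p_case * (1 - p_case)
    n = (z_alpha + z_beta) ** 2 * variance_sum / (p_case - p_control) ** 2
    return math.ceil(n)

for p_case in (0.50, 0.40, 0.35):
    print(p_case, sample_size_per_group(0.30, p_case))
```

Under these assumptions, the 30% versus 35% comparison gives a figure of the same order as the 1400 in the table, and the required sample size falls sharply as the difference between the groups grows.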

Multiple testing
[Table: Number of tests | Probability of false positive]
Bonferroni correction: divide 0.05 by the number of tests to provide the required P value for hypothesis testing at the conventional level of statistical significance
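The growth in false positives with the number of tests, and the Bonferroni threshold, can be illustrated as follows. The calculation assumes the tests are independent and each uses a 0.05 significance level; the function names are hypothetical, and the lecture's own table values are not reproduced here.

```python
def prob_false_positive(num_tests, alpha=0.05):
    """Probability of at least one false positive among independent tests."""
    return 1 - (1 - alpha) ** num_tests

def bonferroni_threshold(num_tests, alpha=0.05):
    """Per-test P value threshold after Bonferroni correction."""
    return alpha / num_tests

for k in (1, 2, 5, 10, 20):
    print(k, round(prob_false_positive(k), 3), bonferroni_threshold(k))
```

With the corrected threshold, the overall chance of any false positive stays at or just below the conventional 0.05 level.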

Data trawling Decide in advance which statistical tests are to be performed Post hoc testing of subgroups should be viewed with caution Multiple correlations should be avoided

HELP!
“In house” support
Cripps Computing Centre
Trent Institute for Health Service Research
Practical Statistics for Medical Research, Douglas G Altman