Statistics for clinicians Biostatistics course by Kevin E. Kip, Ph.D., FAHA Professor and Executive Director, Research Center University of South Florida,

Presentation transcript:

Statistics for clinicians Biostatistics course by Kevin E. Kip, Ph.D., FAHA Professor and Executive Director, Research Center University of South Florida, College of Nursing Professor, College of Public Health Department of Epidemiology and Biostatistics Associate Member, Byrd Alzheimer’s Institute Morsani College of Medicine Tampa, FL, USA

SECTION 3.1 Module Overview and Introduction Confidence intervals, estimation of parameters, and hypothesis testing.

Module 3 Learning Objectives:
1. Describe the concepts of parameter estimation and confidence intervals
2. Apply the z and t distributions to calculate confidence intervals based on sample size
3. Select appropriate z and t values based on the desired confidence level
4. Calculate and interpret confidence intervals for means, proportions, and relative risk for one- and two-sample designs, including matched designs
5. Use SPSS to calculate confidence intervals
6. Distinguish the theoretical relationship between the risk ratio and odds ratio

Module 3 Learning Objectives (continued):
7. List the concept, guidelines, and primary steps involved in hypothesis testing
8. Differentiate between the “null” and “alternative” hypothesis
9. Understand and interpret parameters used in hypothesis testing (level of significance, p-value)
10. Differentiate type I and type II error and factors that impact statistical power
11. Calculate and interpret sample hypotheses:
   a) One-sample - continuous outcome
   b) One-sample - dichotomous outcome
   c) One-sample - categorical/ordinal outcome
   d) Matched design - continuous outcome

Assigned Reading: Textbook: Essentials of Biostatistics in Public Health Chapters 6 and 7

Key terms
Estimation: Process of determining a likely value for a population parameter (e.g. mean or proportion) based on a sample.
Point Estimate: Single-valued estimate of a population parameter, such as a mean or a proportion.
Confidence Interval (CI): Range of likely values for a population parameter with a level of confidence attached (e.g. 95% confidence that the interval contains the unknown parameter).
General form for a CI: point estimate ± margin of error
Common confidence levels are 90%, 95%, and 99% but, theoretically, any level between 0% and 100% can be selected.

SECTION 3.2 Use of the z and t distributions for calculation of confidence intervals

For the standard normal distribution, the following is true:
P(-1.96 < z < 1.96) = 0.95
i.e. there is a 95% probability that a standard normal variable, denoted z, will fall between -1.96 and 1.96.
Using the Central Limit Theorem, and some algebra, the 95% confidence interval (CI) for the population mean is:
X̄ ± 1.96 × σ / sqrt(n)
The general form for a CI can be written as:
point estimate ± z × SE(point estimate)
where z is the value from the standard normal distribution reflecting the desired confidence level, and SE is the standard error of the point estimate.
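The link between a confidence level and its z value can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper name `z_critical` is an illustrative assumption, not part of the course material.

```python
from statistics import NormalDist

def z_critical(confidence: float) -> float:
    """Two-sided critical z value for a confidence level such as 0.95."""
    alpha = 1.0 - confidence
    # Leave alpha/2 in each tail of the standard normal distribution.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```

The three common levels correspond to z values of about 1.645, 1.960, and 2.576.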

For the confidence interval formula for the mean, X̄ ± z × σ / sqrt(n) (or for any other parameter), we often do not know the true value of the population standard deviation (σ).
- For large sample sizes (n > 30), σ can be estimated from the sample standard deviation (s), and the z value is used based on the Central Limit Theorem.
- For small sample sizes (n < 30), the Central Limit Theorem does not apply, and instead, the t distribution is used (Table 2 of Appendix).
- t values depend on n: small samples have larger t values (less precision).
- t values are indexed by degrees of freedom (df = n - 1).

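Listing of Selected t Values for Confidence Intervals
[Table: rows indexed by df; columns give t values for confidence levels of 80%, 90%, 95%, 98%, and 99%]
Example: For a confidence interval of a mean with n < 30, use t:
X̄ ± t × s / sqrt(n), with df = n - 1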

SECTION 3.3 Calculation and interpretation of confidence intervals
One Sample
a) Continuous outcome
b) Dichotomous outcome

CI for One Sample – Continuous Outcome
Parameter: Mean Body Mass Index (BMI)
Sample N: 180 (n > 30, so use large sample – z value)
Sample Mean: 28.2
Sample SD: 5.4
Confidence Level: 95%
Z value: 1.96
95% Confidence Interval for μ:
28.2 ± 1.96 × (5.4 / sqrt(180)) = 28.2 ± 0.79 = (27.4, 29.0)

[Figure: number line for µ centered on the sample mean of 28.2, with lower limit 27.4 and upper limit 29.0]
95% C.I. = 28.2 ± 1.96 × (5.4 / sqrt(180)) = 28.2 ± 0.79 = (27.4, 29.0)
From the sample, we estimate the mean BMI as 28.2, and we are 95% confident that the true population mean lies between 27.4 and 29.0.
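The BMI interval can be reproduced with a few lines of code. This is a minimal sketch; the function name `mean_ci` and its argument names are assumptions chosen for illustration.

```python
from math import sqrt

def mean_ci(mean: float, sd: float, n: int, crit: float) -> tuple[float, float]:
    """Confidence interval for a mean: mean ± crit × sd / sqrt(n)."""
    margin = crit * sd / sqrt(n)
    return mean - margin, mean + margin

# BMI example from the slide: n = 180, mean = 28.2, SD = 5.4, z = 1.96
lower, upper = mean_ci(28.2, 5.4, 180, 1.96)
print(f"95% CI: ({lower:.1f}, {upper:.1f})")
```

The critical value is passed in explicitly so that the same function works with either a z value or a t value.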

CI for One Sample – Continuous Outcome (Practice)
Parameter: Mean diastolic blood pressure
Sample N: 503
Sample Mean: 80.69
Sample SD: [value missing in transcript]
Confidence Level: 95%
Z value: ___ or t value: ___
95% Confidence Interval for μ:

CI for One Sample – Continuous Outcome (Practice)
Parameter: Mean diastolic blood pressure
Sample N: 503 (large sample, n > 30)
Sample Mean: 80.69
Sample SD: [value missing in transcript]
Confidence Level: 95%
Z value: 1.96
95% Confidence Interval for μ:
80.69 ± 1.96 × (SD / sqrt(503)) = (79.8, 81.6)

CI for One Sample – Continuous Outcome (Practice)
Parameter: Mean diastolic blood pressure
Sample N: 503 (large sample, n > 30)
Sample Mean: 80.69
Confidence Level: 95%
Z value: 1.96
95% Confidence Interval for μ: 80.69 ± 1.96 × (SD / sqrt(503)) = (79.8, 81.6)
SPSS: Analyze > Compare Means > One-Sample T Test > Options: 95% confidence interval

CI for One Sample – Continuous Outcome (Practice)
Parameter: Mean resting pulse (beats per minute)
Sample N: 14
Sample Mean: 63.3
Sample SD: 9.5
Confidence Level: 95%
Z value: ___ or t value: ___
95% Confidence Interval for μ:

CI for One Sample – Continuous Outcome (Practice)
Parameter: Mean resting pulse (beats per minute)
Sample N: 14 (small sample, n < 30)
Sample Mean: 63.3
Sample SD: 9.5
Confidence Level: 95%
t value: 2.16 (df = n - 1 = 13)
95% Confidence Interval for μ:
63.3 ± 2.16 × (9.5 / sqrt(14)) = 63.3 ± 5.48 = (57.8, 68.8)
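For small samples, the only change is that the critical value comes from the t table rather than the z table. The sketch below hard-codes a short excerpt of standard 95% t values as an assumption (it is not the appendix table from the course) and looks up the value by df = n - 1.

```python
from math import sqrt

# Excerpt of standard two-sided 95% t critical values, keyed by degrees of freedom.
T_95 = {10: 2.228, 13: 2.160, 20: 2.086, 29: 2.045}

def t_interval_95(mean: float, sd: float, n: int) -> tuple[float, float]:
    """95% CI for a mean from a small sample, using t with df = n - 1."""
    t = T_95[n - 1]  # raises KeyError if df is not in the excerpt
    margin = t * sd / sqrt(n)
    return mean - margin, mean + margin

# Resting pulse example: n = 14, mean = 63.3, SD = 9.5
lower, upper = t_interval_95(63.3, 9.5, 14)
print(f"95% CI: ({lower:.1f}, {upper:.1f})")
```

Using z = 1.96 here instead of t = 2.16 would give a narrower interval that overstates the precision of a 14-person sample.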

CI for One Sample – Dichotomous Outcome
Parameter: Proportion of population treated for hypertension
Sample N: 3,532 (large sample, so use z value)
Sample Proportion: 0.345 (i.e. 1,219 / 3,532)
Confidence Level: 95%
Z value: 1.96
95% Confidence Interval for p:
p ± z × sqrt(p × (1 - p) / n) = 0.345 ± 1.96 × sqrt(0.345 × 0.655 / 3,532) = 0.345 ± 0.016 = (0.329, 0.361)
From the sample, we estimate the proportion of persons treated for hypertension to be 0.345, and we are 95% confident that the true proportion lies between 0.329 and 0.361.

CI for One Sample – Dichotomous Outcome (Practice)
Parameter: Proportion of population with diabetes
Sample N: 501
Sample Proportion: (91 / 501) = _______
Confidence Level: 95%
Z value: _______
95% Confidence Interval for p:

CI for One Sample – Dichotomous Outcome (Practice)
Parameter: Proportion of population with diabetes
Sample N: 501 (large sample, so use z value)
Sample Proportion: (91 / 501) = 0.182
Confidence Level: 95%
Z value: 1.96
95% Confidence Interval for p:
0.182 ± 1.96 × sqrt(0.182 × 0.818 / 501) = 0.182 ± 0.034 = (0.148, 0.215)
From the sample, we estimate the proportion of persons with diabetes to be 0.182, and we are 95% confident that the true proportion lies between 0.148 and 0.215.
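The same calculation for a proportion can be sketched in code, working from the raw counts so that no intermediate rounding is needed. The function name `proportion_ci` is an illustrative assumption.

```python
from math import sqrt

def proportion_ci(x: int, n: int, z: float = 1.96) -> tuple[float, float, float]:
    """Point estimate and large-sample CI for a proportion: p ± z × sqrt(p(1 - p) / n)."""
    p = x / n
    margin = z * sqrt(p * (1 - p) / n)
    return p, p - margin, p + margin

# Diabetes practice example: 91 of 501 persons
p, lower, upper = proportion_ci(91, 501)
print(f"p = {p:.3f}, 95% CI: ({lower:.3f}, {upper:.3f})")
```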

SECTION 3.4 Calculation and interpretation of confidence intervals
Two Samples – Matched
a) Continuous outcome

CI for Two Samples – Matched Continuous Outcome
- Often used for intervention studies with a pre- and post-measurement design (e.g. before and after treatment)
- Goal is to compare the mean score before and after the intervention
- Because the sample is matched (same persons completing pre- and post-measurements), cannot use aggregate means; instead, compute a difference score for each subject (table columns: Subject ID, Pre, Post, Difference)
- Parameter of interest is the mean difference, denoted μ_d
- The sample statistics are the mean of the difference scores, X̄_d, and the SD of the difference scores, s_d
- CI: X̄_d ± z × s_d / sqrt(n), where n is the number of persons

CI for Two Samples – Matched Continuous Outcome
Parameter: Mean difference in depressive symptom scores after taking a new drug: X̄_d = -12.7
Sample N: 100 (number of persons, not measurements)
Sample SD: SD of difference scores: s_d = 8.9
Confidence Level: 95%
Z value: 1.96
95% Confidence Interval for μ_d:
-12.7 ± 1.96 × (8.9 / sqrt(100)) = -12.7 ± 1.74 = (-14.4, -11.0)

CI for Two Samples – Matched Continuous Outcome (Practice)
Parameter: Mean difference in anxiety symptom scores after psychotherapy: X̄_d = -14.8
Sample N: 52 (number of persons, not measurements)
Sample SD: SD of difference scores: s_d = 9.6
Confidence Level: 90%
Z value: ______

CI for Two Samples – Matched Continuous Outcome (Practice)
Parameter: Mean difference in anxiety symptom scores after psychotherapy: X̄_d = -14.8
Sample N: 52 (number of persons, not measurements)
Sample SD: SD of difference scores: s_d = 9.6
Confidence Level: 90%
Z value: 1.645
90% Confidence Interval for μ_d:
-14.8 ± 1.645 × (9.6 / sqrt(52)) = -14.8 ± 2.19 = (-17.0, -12.6)
From the sample, we estimate a mean difference in anxiety scores of -14.8 after undergoing psychotherapy, and we are 90% confident that the true mean difference lies between -17.0 and -12.6.
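With raw data, the matched-design interval starts from the per-subject difference scores. The sketch below uses a small hypothetical data set of 8 subjects (invented for illustration, not taken from the course) and therefore a t value with df = 7 (2.365 for 95% confidence) instead of z.

```python
from math import sqrt
from statistics import mean, stdev

def paired_ci(pre: list[float], post: list[float], crit: float) -> tuple[float, float, float]:
    """Mean difference (post - pre) and its CI, computed from difference scores."""
    diffs = [after - before for before, after in zip(pre, post)]
    mean_d = mean(diffs)
    s_d = stdev(diffs)  # sample SD of the difference scores
    margin = crit * s_d / sqrt(len(diffs))
    return mean_d, mean_d - margin, mean_d + margin

# Hypothetical symptom scores before and after an intervention
pre = [24, 30, 28, 35, 22, 27, 31, 26]
post = [15, 20, 19, 22, 12, 18, 16, 14]
mean_d, lower, upper = paired_ci(pre, post, crit=2.365)  # t for df = 7, 95%
print(f"mean difference = {mean_d:.2f}, 95% CI: ({lower:.2f}, {upper:.2f})")
```

Note that n is the number of subjects (8), not the number of measurements (16).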

SECTION 3.5 Calculation and interpretation of confidence intervals
Two Samples - Independent
a) Continuous – mean difference
b) Dichotomous – risk difference
c) Dichotomous – risk ratio
d) Dichotomous – odds ratio

CI for Two Samples – Independent Continuous Outcome
- Common parameter of interest is the difference in means between the two groups, denoted for the population as μ1 - μ2 and estimated by the sample means X̄1 and X̄2
- Since there are 2 independent groups, we also have n1 and n2, and s1 and s2
- If the sample variances are approximately equal, then we can “pool” the standard deviations, s1 and s2. A typical rule of thumb to pool is: s1^2 / s2^2 > 0.5 and s1^2 / s2^2 < 2.0
- The pooled (common) standard deviation is a weighted average:
  Sp = sqrt(((n1 - 1) × s1^2 + (n2 - 1) × s2^2) / (n1 + n2 - 2))

CI for Two Samples – Independent Continuous Outcome
Parameter: Mean difference in systolic blood pressure between a sample of men and a sample of women
X̄men = 128.2; n1 = 1623; s1 = 17.5
X̄women = 126.5; n2 = 1911; s2 = 20.1
Note: s1^2 / s2^2 = 0.76, so can use pooled SD (Sp)
Confidence Level: 95%
Z value: 1.96
Sp = sqrt((1622 × 17.5^2 + 1910 × 20.1^2) / 3532) = sqrt(359.12) = 19.0
Formula: (X̄1 - X̄2) ± z × Sp × sqrt(1/n1 + 1/n2)

CI for Two Samples – Independent Continuous Outcome
Parameter: Mean difference in systolic blood pressure between a sample of men and women
X̄men = 128.2; n1 = 1623; s1 = 17.5
X̄women = 126.5; n2 = 1911; s2 = 20.1
Formula: (X̄1 - X̄2) ± z × Sp × sqrt(1/n1 + 1/n2)
(128.2 - 126.5) ± 1.96 × 19.0 × sqrt(1/1623 + 1/1911) = 1.7 ± 1.26 = (0.44, 2.96)
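The pooled-SD calculation and the rule-of-thumb check can be sketched together. The function names below are illustrative assumptions. Because the code carries the unrounded pooled SD (about 18.95) instead of the rounded 19.0 used on the slide, its interval is very slightly narrower than (0.44, 2.96).

```python
from math import sqrt

def pooled_sd(s1: float, n1: int, s2: float, n2: int) -> float:
    """Pooled SD: weighted average of the two sample variances."""
    return sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def mean_diff_ci(m1, s1, n1, m2, s2, n2, z=1.96):
    """CI for a difference in means of two independent groups, using the pooled SD."""
    ratio = s1**2 / s2**2
    if not 0.5 < ratio < 2.0:
        # Rule of thumb from the slides: pool only if the variance ratio is within (0.5, 2.0)
        raise ValueError(f"variance ratio {ratio:.2f} is outside (0.5, 2.0); do not pool")
    sp = pooled_sd(s1, n1, s2, n2)
    margin = z * sp * sqrt(1 / n1 + 1 / n2)
    diff = m1 - m2
    return diff - margin, diff + margin

# Systolic blood pressure: men vs. women
print(f"Sp = {pooled_sd(17.5, 1623, 20.1, 1911):.2f}")
lower, upper = mean_diff_ci(128.2, 17.5, 1623, 126.5, 20.1, 1911)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```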

CI for Two Samples – Independent Continuous Outcome (Practice)
Parameter: Mean difference in depression scores between a sample of men and women
X̄men = 5.77; n1 = 163; s1 = [value missing in transcript]
X̄women = 6.86; n2 = 333; s2 = [value missing in transcript]
Note: s1^2 / s2^2 = _________
Assume calculation of a 95% confidence interval

CI for Two Samples – Independent Continuous Outcome (Practice)
Parameter: Mean difference in depression scores between a sample of men and women
X̄men = 5.77; n1 = 163; s1 = [value missing in transcript]
X̄women = 6.86; n2 = 333; s2 = [value missing in transcript]
Note: s1^2 / s2^2 = 0.78, so can use pooled SD (Sp)
Sp = sqrt((162 × s1^2 + 332 × s2^2) / 494) = 8.39
X̄men - X̄women = 5.77 - 6.86 = -1.09
(5.77 - 6.86) ± 1.96 × 8.39 × sqrt(1/163 + 1/333) = (-2.66, 0.49)

CI for Two Samples – Independent Continuous Outcome (Practice)
Parameter: Mean difference in depression scores between a sample of men and women
X̄men = 5.77; n1 = 163
X̄women = 6.86; n2 = 333
SPSS: Analyze > Compare Means > Independent-Samples T Test > Test Variable, Grouping Variable > Options: CI percentage
From the sample, we estimate a mean difference in depression scores between men and women of -1.09, and we are 95% confident that the true mean difference lies between -2.66 and 0.49.

CI for Two Samples – Independent: Risk Difference
- Parameter of interest is the risk difference for the incidence proportions in the population, denoted as RD = p1 - p2
- For a sample, the point estimate for the risk difference is the difference in sample incidence proportions: RD = p1 - p2
Formula: RD ± z × sqrt(p1 × (1 - p1) / n1 + p2 × (1 - p2) / n2)
Example: Incidence of CVD in Smokers and Non-Smokers

                 No CVD   CVD        Total   Incidence
Current smoker   663      81 (x1)    744     p1 = 81 / 744 = 0.1089
Non-smoker       2757     298 (x2)   3055    p2 = 298 / 3055 = 0.0975
Total            3420     379        3799

CI for Two Samples – Independent: Risk Difference
- Example: Compare the incidence proportion of CHD among smokers (exposed) and non-smokers (not exposed)
Smokers: n1 = 744; w/CHD (x1) = 81; p1 = 0.1089
Non-smokers: n2 = 3055; w/CHD (x2) = 298; p2 = 0.0975
Confidence Level: 95%
Z value: 1.96
(0.1089 - 0.0975) ± 1.96 × sqrt(0.1089 × (1 - 0.1089) / 744 + 0.0975 × (1 - 0.0975) / 3055) = 0.0114 ± 0.0247 = (-0.0133, 0.0361)

CI for Two Samples – Independent: Risk Difference (Practice)
- Example: Compare the incidence proportion of sleep disorder among persons on statins (exposed) and not on statins (not exposed)
Confidence Level: 95%
Z value: _______

                  Sleep OK   Sleep Dx   Total   Incidence
Statin user       91         14 (x1)    105     p1 = 14 / 105 = 0.1333
Non-statin user   369        28 (x2)    397     p2 = 28 / 397 = 0.0705
Total             460        42         502

CI for Two Samples – Independent: Risk Difference (Practice)
- Example: Compare the incidence proportion of sleep disorder among persons on statins (exposed) and not on statins (not exposed)
Confidence Level: 95%
Z value: 1.96

                  Sleep OK   Sleep Dx   Total   Incidence
Statin user       91         14 (x1)    105     p1 = 14 / 105 = 0.1333
Non-statin user   369        28 (x2)    397     p2 = 28 / 397 = 0.0705
Total             460        42         502

(0.1333 - 0.0705) ± 1.96 × sqrt(0.1333 × (1 - 0.1333) / 105 + 0.0705 × (1 - 0.0705) / 397) = 0.0628 ± 0.0697 = (-0.007, 0.133)

CI for Two Samples – Independent: Risk Difference (Practice)
- Example: Compare the incidence proportion of sleep disorder among persons on statins (exposed) and not on statins (not exposed)
Confidence Level: 95%
Z value: 1.96
Statin users: p1 = 14 / 105 = 0.1333; non-statin users: p2 = 28 / 397 = 0.0705
RD = 0.0628; 95% C.I. = 0.0628 ± 0.0697 = (-0.007, 0.133)
From the sample, we estimate that the absolute risk of sleep disorder is 0.063 higher in statin users compared to non-users, and we are 95% confident that the true risk difference lies between -0.007 and 0.133.
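The risk difference interval for the statin example can be reproduced from the four counts. This is a minimal sketch; the function name `risk_difference_ci` is an illustrative assumption.

```python
from math import sqrt

def risk_difference_ci(x1: int, n1: int, x2: int, n2: int, z: float = 1.96):
    """Risk difference p1 - p2 with a large-sample CI.

    x1, x2 are the numbers of events; n1, n2 are the group sizes.
    """
    p1, p2 = x1 / n1, x2 / n2
    rd = p1 - p2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return rd, rd - z * se, rd + z * se

# Sleep disorder: 14 of 105 statin users vs. 28 of 397 non-users
rd, lower, upper = risk_difference_ci(14, 105, 28, 397)
print(f"RD = {rd:.3f}, 95% CI: ({lower:.3f}, {upper:.3f})")
```

Because this interval includes 0 (the null value for a difference), the data are compatible with no difference in absolute risk.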

CI for Two Samples – Independent: Risk Ratio
- Parameter of interest is the ratio of the incidence proportions for the population, denoted as RR = p1 / p2
- For a sample, the point estimate for the risk ratio (RR) is the ratio of the sample incidence proportions: RR = p1 / p2
- Note that the RR does not follow a normal distribution, but the natural log (ln) of the RR is approximately normally distributed and is used to calculate the confidence interval. This entails 2 steps:
  1. Calculate the CI for ln(RR)
  2. Calculate the CI for RR (i.e. transform)
CI for ln(RR): ln(RR) ± z × sqrt(((n1 - x1) / x1) / n1 + ((n2 - x2) / x2) / n2)
CI for RR: exp(Lower limit), exp(Upper limit)

CI for Two Samples – Independent: Risk Ratio
RR = p1 / p2
CI for ln(RR): ln(RR) ± z × sqrt(((n1 - x1) / x1) / n1 + ((n2 - x2) / x2) / n2)
CI for RR: exp(Lower limit), exp(Upper limit)
- Example: Compare future risk of CHD among smokers (exposed) and non-smokers (not exposed)
Smokers: n1 = 744; w/CHD (x1) = 81; p1 = 0.1089
Non-smokers: n2 = 3055; w/CHD (x2) = 298; p2 = 0.0975
Confidence Level: 95%
Z value: 1.96
RR = p1 / p2 = 0.1089 / 0.0975 = 1.12
CI for ln(RR): ln(1.12) ± 1.96 × sqrt((663 / 81) / 744 + (2757 / 298) / 3055) = 0.113 ± 0.232 = (-0.119, 0.345)
CI for RR: (exp(-0.119), exp(0.345)) = (0.89, 1.41)

CI for Two Samples – Independent: Risk Ratio (Practice)
RR = p1 / p2
CI for ln(RR): ln(RR) ± z × sqrt(((n1 - x1) / x1) / n1 + ((n2 - x2) / x2) / n2)
CI for RR: exp(Lower limit), exp(Upper limit)
- Example: Compare the future risk of sleep disorder among statin users (exposed) versus non-statin users (not exposed)
Confidence Level: 95%
Z value: _______

                  Sleep OK   Sleep Dx   Total   Incidence
Statin user       91         14 (x1)    105     p1 = 14 / 105 = 0.1333
Non-statin user   369        28 (x2)    397     p2 = 28 / 397 = 0.0705
Total             460        42         502

RR = p1 / p2 = _______
CI for ln(RR): _______

CI for Two Samples – Independent: Risk Ratio (Practice)
RR = p1 / p2
CI for ln(RR): ln(RR) ± z × sqrt(((n1 - x1) / x1) / n1 + ((n2 - x2) / x2) / n2)
CI for RR: exp(Lower limit), exp(Upper limit)
- Example: Compare the future risk of sleep disorder among statin users (exposed) versus non-statin users (not exposed)
Confidence Level: 95%
Z value: 1.96

                  Sleep OK   Sleep Dx   Total   Incidence
Statin user       91         14 (x1)    105     p1 = 14 / 105 = 0.1333
Non-statin user   369        28 (x2)    397     p2 = 28 / 397 = 0.0705
Total             460        42         502

RR = p1 / p2 = 0.1333 / 0.0705 = 1.89
CI for ln(RR): ln(1.89) ± 1.96 × sqrt((91 / 14) / 105 + (369 / 28) / 397) = 0.6366 ± 0.6044 = (0.0322, 1.24)
CI for RR: (exp(0.0322), exp(1.24)) = (1.03, 3.46)

CI for Two Samples – Independent: Risk Ratio (Practice)
- Example: Compare the future risk of sleep disorder among statin users (exposed) versus non-statin users (not exposed)
Confidence Level: 95%
Z value: 1.96
RR = p1 / p2 = 0.1333 / 0.0705 = 1.89
CI for ln(RR): (0.0322, 1.24)
CI for RR: (exp(0.0322), exp(1.24)) = (1.03, 3.46)
From the sample, we estimate that the risk of sleep disorder is 1.89 times as high in statin users as in non-users, and we are 95% confident that the true risk ratio lies between 1.03 and 3.46.

CI for Two Samples – Independent: Odds Ratio
- Conceptually similar to the risk ratio, yet the parameter of interest is the odds ratio (OR), defined as: Odds of exposure among cases / Odds of exposure among controls

              Cases   Controls
Exposed       a       b
Not exposed   c       d

OR = (a / c) / (b / d)
CI for ln(OR): ln(OR) ± z × sqrt(1/a + 1/b + 1/c + 1/d)
CI for OR: exp(Lower limit), exp(Upper limit)

Example: Prevalence of CVD in Smokers and Non-Smokers (95% C.I.)

                      CVD (D+)   No-CVD (D-)
Current smoker (E+)   81         663
Non-smoker (E-)       298        2757

OR = (81 / 298) / (663 / 2757) = 1.13; Z = 1.96
CI for ln(OR): ln(1.13) ± 1.96 × sqrt(1/81 + 1/663 + 1/298 + 1/2757) = 0.122 ± 0.260 = (-0.138, 0.382)
CI for OR: (exp(-0.138), exp(0.382)) = (0.87, 1.47)

CI for Two Samples – Independent: Odds Ratio (Practice)
OR = Odds of exposure among cases / Odds of exposure among controls
Prevalence of Sleep Disorder Among Statin and Non-Statin Users (95% C.I.)

                       Sleep Dx (cases)   Sleep OK (controls)
Statin user (E+)       14 (a)             91 (b)
Non-statin user (E-)   28 (c)             369 (d)

OR = (a / c) / (b / d) = _________; Z = ___________
CI for ln(OR): ln(OR) ± z × sqrt(1/a + 1/b + 1/c + 1/d)
CI for OR: exp(Lower limit), exp(Upper limit)

CI for Two Samples – Independent: Odds Ratio (Practice)
OR = Odds of exposure among cases / Odds of exposure among controls
Example: Prevalence of Sleep Disorder Among Statin and Non-Statin Users

                       Sleep Dx (cases)   Sleep OK (controls)
Statin user (E+)       14 (a)             91 (b)
Non-statin user (E-)   28 (c)             369 (d)

OR = (14 / 28) / (91 / 369) = 2.027; Z = 1.96
CI for ln(OR): ln(2.027) ± 1.96 × sqrt(1/14 + 1/91 + 1/28 + 1/369) = 0.7066 ± 0.6813 = (0.0253, 1.3879)
CI for OR: (exp(0.0253), exp(1.3879)) = (1.03, 4.01)

CI for Two Samples – Independent: Odds Ratio (Practice)
OR = Odds of exposure among cases / Odds of exposure among controls
Example: Prevalence of Sleep Disorder Among Statin and Non-Statin Users

                       Sleep Dx (cases)   Sleep OK (controls)
Statin user (E+)       14 (a)             91 (b)
Non-statin user (E-)   28 (c)             369 (d)

OR = (14 / 28) / (91 / 369) = 2.03
CI for ln(OR): (0.0253, 1.3879)
CI for OR: (exp(0.0253), exp(1.3879)) = (1.03, 4.01)
From the sample, we estimate that the odds of statin use among persons with sleep disorder are 2.03 times the odds of statin use among persons without sleep disorder, and we are 95% confident that the true odds ratio lies between 1.03 and 4.01.
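The odds ratio interval follows the same log-transform pattern, with a standard error built from the reciprocals of the four cell counts. This is a minimal sketch; the function name `odds_ratio_ci` is an illustrative assumption, and the cells a, b, c, d follow the cases/controls layout used in this section.

```python
from math import exp, log, sqrt

def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio (a/c)/(b/d) = ad/bc with a CI computed on the log scale.

    a: exposed cases, b: exposed controls, c: unexposed cases, d: unexposed controls.
    """
    odds_ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of ln(OR)
    return odds_ratio, exp(log(odds_ratio) - z * se), exp(log(odds_ratio) + z * se)

# Sleep disorder among statin users and non-users
or_, lower, upper = odds_ratio_ci(14, 91, 28, 369)
print(f"OR = {or_:.2f}, 95% CI: ({lower:.2f}, {upper:.2f})")
```

The formula requires all four cells to be non-zero; a table with an empty cell would need a different approach.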

SECTION 3.6 Use of SPSS to calculate confidence intervals

CI for Two Samples – Independent: Odds Ratio (Practice)
Example: Prevalence of Sleep Disorder Among Statin and Non-Statin Users

                       Sleep Dx (cases)   Sleep OK (controls)
Statin user (E+)       14 (a)             91 (b)
Non-statin user (E-)   28 (c)             369 (d)

OR = (14 / 28) / (91 / 369) = 2.03
CI for ln(OR): (0.0253, 1.3879)
CI for OR: (exp(0.0253), exp(1.3879)) = (1.03, 4.01)
SPSS: Analyze > Descriptive Statistics > Crosstabs > Row and Column Variable > Statistics (check “Risk”)

[Figure: number line for the odds ratio, bounded at 0 on the left and unbounded on the right, marking the null value 1.0, the lower limit 1.03, the point estimate OR = 2.03, and the upper limit 4.01]
Sleep disorder example: OR = 2.03; 95% C.I. = 1.03, 4.01
Note: The confidence interval for a continuous measure such as a mean or a difference in means is symmetric around the point estimate. In contrast, for the risk ratio and odds ratio, the confidence interval is skewed to the right of the point estimate (it extends farther above the estimate than below it). This is because:
a) Values for RR and OR have a lower bound of 0 yet no upper bound
b) The C.I. formulas are based on an exponential function

SECTION 3.7 Relationship between the risk ratio and the odds ratio

Odds Ratio & Risk Ratio
Relationship between RR and OR: The odds ratio will provide a good estimate of the risk ratio when:
1. The outcome (disease) is rare
OR
2. The effect size is small or modest

Odds Ratio & Risk Ratio
The odds ratio will provide a good estimate of the risk ratio when:
1. The outcome (disease) is rare

      D+   D-
E+    a    b
E-    c    d

OR = (a / c) / (b / d) = (ad) / (bc)
RR = (a / (a + b)) / (c / (c + d))
If the disease is rare, then cells (a) and (c) will be small relative to (b) and (d), so a + b ≈ b and c + d ≈ d. Therefore:
RR = (a / (a + b)) / (c / (c + d)) ≈ (a / b) / (c / d) = (ad) / (bc) = OR

Odds Ratio & Risk Ratio
The odds ratio will provide a good estimate of the risk ratio when:
2. The effect size is small or modest.

      D+    D-
E+    40    60
E-    120   180

OR = (40 / 120) / (60 / 180) = 0.333 / 0.333 = 1.0
RR = (40 / (40 + 60)) / (120 / (120 + 180)) = 0.40 / 0.40 = 1.0

Odds Ratio & Risk Ratio

      D+   D-
E+    20   30
E-    10   90

OR = (20 / 10) / (30 / 90) = 2.0 / 0.333 = 6.0
RR = (20 / 50) / (10 / 100) = 0.40 / 0.10 = 4.0
Finally, we expect the risk ratio to be closer to the null value of 1.0 than the odds ratio. Therefore, be especially cautious when interpreting the odds ratio as a measure of relative risk when the outcome is not rare and the effect size is large.
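The divergence between the two measures can be seen by computing both from the same 2 x 2 table. The sketch below uses the common-outcome table from this slide (20, 30, 10, 90) and, for contrast, a hypothetical rare-outcome table invented for illustration; the function name `or_and_rr` is also an assumption.

```python
def or_and_rr(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Return (odds ratio, risk ratio) for a 2 x 2 table.

    a: exposed with disease, b: exposed without disease,
    c: unexposed with disease, d: unexposed without disease.
    """
    odds_ratio = (a * d) / (b * c)
    risk_ratio = (a / (a + b)) / (c / (c + d))
    return odds_ratio, risk_ratio

# Common outcome, large effect (table from the slide): OR overstates RR
print(or_and_rr(20, 30, 10, 90))

# Hypothetical rare outcome (20 vs. 10 events per 10,000): OR is close to RR
print(or_and_rr(20, 9980, 10, 9990))
```

In the rare-outcome table the two measures agree to about two decimal places, while in the common-outcome table the odds ratio (6.0) is well above the risk ratio (4.0).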