Statistics for clinicians Biostatistics course by Kevin E. Kip, Ph.D., FAHA Professor and Executive Director, Research Center University of South Florida,


1 Statistics for clinicians
Biostatistics course by Kevin E. Kip, Ph.D., FAHA
Professor and Executive Director, Research Center, University of South Florida, College of Nursing
Professor, College of Public Health, Department of Epidemiology and Biostatistics
Associate Member, Byrd Alzheimer’s Institute, Morsani College of Medicine
Tampa, FL, USA

2 SECTION 4.1 Module Overview and Introduction
Hypothesis testing for 2 or more independent groups and non-parametric methods.

3 Module 4 Learning Outcomes:
1. Calculate and interpret 2 sample hypotheses:
   a) 2 samples – continuous outcome
   b) >2 samples – continuous outcome
   c) 2 samples – dichotomous outcome
   d) >2 samples – dichotomous outcome
2. Specify 2-sample hypotheses and conduct formal testing using SPSS
3. Differentiate between parametric and non-parametric tests
4. Identify properties of non-parametric tests

4 Module 4 Learning Outcomes:
5. Calculate and interpret non-parametric tests:
   a) 2 independent samples – Wilcoxon Rank Sum Test
   b) Matched samples – Wilcoxon Signed Rank Test
   c) >2 independent samples – Kruskal Wallis Test
6. Conduct and interpret non-parametric analyses using SPSS.

5 Assigned Reading:
Textbook: Essentials of Biostatistics in Public Health
Chapter 7, Sections 7.5 and 7.7 to 7.9 (pages 138-141 and 144-162)
Chapter 10

6 SECTION 4.2 Framework of hypothesis testing

7 General Steps for Hypothesis Testing:
1) Set up the hypothesis and determine the level of statistical significance (including 1 versus 2-sided hypothesis)
2) Select the appropriate test statistic
3) Set up the decision rule
4) Compute the test statistic
5) Conclusion (interpretation)

8 Hypothesis Testing Calculations:
1) Two Samples – Independent Groups
   a) Continuous outcome (Student's t test)
   b) Dichotomous outcome (risk difference or risk ratio; chi-square test)
2) More than 2 Samples – Independent Groups
   a) Continuous outcome (analysis of variance, ANOVA)
   b) Categorical outcome (chi-square test)

9 Framework of Hypothesis Testing
• The goal is to compare sample parameter estimates (e.g. mean, proportion) between 2 or more independent groups.
• The groups can be defined from a clinical trial, such as treatment versus placebo, or from an observational study, such as men versus women, or exposed versus not exposed.
• With 2 groups, one group serves as the “comparison” or “control” group representing the null value.
• Groups do not need to be of the same size.
• With more than 2 groups, we can test whether any groups differ (e.g. in their means) or whether groups differ in an ordered manner.

10 SECTION 4.3 Two-sample: independent groups – continuous outcome

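11 1. Two-Sample: Independent Groups-Continuous Outcome
• Parameter: difference in population means, μ1 – μ2
• H0: μ1 – μ2 = 0 (equivalently, μ1 = μ2)
• H1: μ1 > μ2; μ1 < μ2; or μ1 ≠ μ2
• Test statistics:
  n1 > 30 and n2 > 30: use z (critical value of z in Table 1C)
  $$ z = \frac{\bar{X}_1 - \bar{X}_2}{S_p \sqrt{1/n_1 + 1/n_2}} $$
  n1 < 30 or n2 < 30: use t (critical value of t in Table 2, d.f. = n1 + n2 – 2)
  $$ t = \frac{\bar{X}_1 - \bar{X}_2}{S_p \sqrt{1/n_1 + 1/n_2}} $$
  where S_p is the pooled standard deviation:
  $$ S_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} $$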

12 1. Two-Sample: Independent Groups-Continuous Outcome
Example: From the Framingham Heart Study (offspring), compare mean systolic blood pressure between men and women.

           n      X̄       s
Men      1623   128.2   17.5
Women    1911   126.5   20.1

1) Set up the hypothesis and determine the level of statistical significance (including 1 versus 2-sided hypothesis).
   H0: μ1 = μ2
   H1: μ1 ≠ μ2 (two-sided hypothesis)
   α = 0.05

13 1. Two-Sample: Independent Groups-Continuous Outcome
Example: From the Framingham Heart Study (offspring), compare mean systolic blood pressure between men and women.

           n      X̄       s
Men      1623   128.2   17.5
Women    1911   126.5   20.1

2) Select the appropriate test statistic: n1 > 30 and n2 > 30, so use z
3) Set up the decision rule: Reject H0 if z ≤ -1.96 or z ≥ 1.96
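The ±1.96 cut-off in the decision rule is the standard normal quantile that leaves α/2 = 0.025 in each tail. As a minimal sketch (the function names `two_sided_critical_z` and `reject_h0` are illustrative, not from the course), the value can be recovered with Python's standard library instead of Table 1C:

```python
from statistics import NormalDist


def two_sided_critical_z(alpha: float) -> float:
    """Positive critical value z* that leaves alpha/2 in each tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def reject_h0(z: float, alpha: float = 0.05) -> bool:
    """Two-sided decision rule: reject H0 when |z| >= z*."""
    return abs(z) >= two_sided_critical_z(alpha)


z_star = two_sided_critical_z(0.05)
print(round(z_star, 2))
```

A one-sided test would put all of α in one tail, so the cut-off would come from `inv_cdf(1 - alpha)` instead.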

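14 1. Two-Sample: Independent Groups-Continuous Outcome
Example: From the Framingham Heart Study (offspring), compare mean systolic blood pressure between men and women.

           n      X̄       s
Men      1623   128.2   17.5
Women    1911   126.5   20.1

4) Compute the test statistic:
$$ S_p = \sqrt{\frac{(1623 - 1)(17.5)^2 + (1911 - 1)(20.1)^2}{1623 + 1911 - 2}} = \sqrt{359.12} = 19.0 $$
$$ z = \frac{128.2 - 126.5}{19.0\sqrt{1/1623 + 1/1911}} = \frac{1.7}{0.64} = 2.66 $$
5) Conclusion: Reject H0 because 2.66 > 1.96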
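The hand calculation for this example can be checked from the summary statistics alone. The sketch below assumes the pooled-variance form of the statistic used in this module; the helper names `pooled_sd` and `two_sample_z` are made up for illustration.

```python
import math


def pooled_sd(n1, s1, n2, s2):
    """Pooled standard deviation Sp of two independent samples."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))


def two_sample_z(n1, mean1, s1, n2, mean2, s2):
    """Test statistic (mean1 - mean2) / (Sp * sqrt(1/n1 + 1/n2))."""
    se = pooled_sd(n1, s1, n2, s2) * math.sqrt(1 / n1 + 1 / n2)
    return (mean1 - mean2) / se


# Framingham offspring systolic blood pressure: men versus women
sp = pooled_sd(1623, 17.5, 1911, 20.1)
z = two_sample_z(1623, 128.2, 17.5, 1911, 126.5, 20.1)
print(round(sp**2, 2), round(sp, 1), round(z, 2))
```

The same function gives the t statistic when either group is small; only the reference distribution (t with n1 + n2 – 2 degrees of freedom) changes.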

15 1. Two-Sample: Independent Groups-Continuous Outcome (Practice)
Example: From the Heart SCORE Study, compare mean total cholesterol levels between men and women. (α = 0.05)

           n      X̄        s
Men      165   198.88   38.416
Women    337   222.23   42.023

1) Set up the hypothesis and determine the level of statistical significance (including 1 versus 2-sided hypothesis).
   H0: _________________
   H1: _________________
2) Select the appropriate test statistic: __________________
3) Set up the decision rule: _________________________________________

16 1. Two-Sample: Independent Groups-Continuous Outcome (Practice)
Example: From the Heart SCORE Study, compare mean total cholesterol levels between men and women. (α = 0.05)

           n      X̄        s
Men      165   198.88   38.416
Women    337   222.23   42.023

1) Set up the hypothesis and determine the level of statistical significance (including 1 versus 2-sided hypothesis).
   H0: μ1 = μ2
   H1: μ1 ≠ μ2 (two-sided hypothesis)
2) Select the appropriate test statistic: n1 > 30 and n2 > 30, so use z
3) Set up the decision rule: Reject H0 if z ≤ -1.96 or z ≥ 1.96

17 1. Two-Sample: Independent Groups-Continuous Outcome (Practice)
Example: From the Heart SCORE Study, compare mean total cholesterol levels between men and women. (α = 0.05)

           n      X̄        s
Men      165   198.88   38.416
Women    337   222.23   42.023

4) Compute the test statistic: _________________________________
5) Conclusion: _________________________________

18 1. Two-Sample: Independent Groups-Continuous Outcome (Practice)
Example: From the Heart SCORE Study, compare mean total cholesterol levels between men and women. (α = 0.05)

           n      X̄        s
Men      165   198.88   38.416
Women    337   222.23   42.023

4) Compute the test statistic:
$$ S_p = \sqrt{\frac{(165 - 1)(38.416)^2 + (337 - 1)(42.023)^2}{165 + 337 - 2}} = 40.875 $$
$$ z = \frac{198.88 - 222.23}{40.875\sqrt{1/165 + 1/337}} = \frac{-23.35}{3.884} = -6.01 $$
5) Conclusion: Reject H0 because abs(-6.01) > 1.96

19 1. Two-Sample: Independent Groups-Continuous Outcome (Practice)
Example: From the Heart SCORE Study, compare mean total cholesterol levels between men and women. (α = 0.05)
SPSS:
  Analyze > Compare Means > Independent Samples T-Test
  Test Variable: Total cholesterol
  Group Variable: Gender (defined as 1, 2)
  Options: 95% C.I.
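The 95% C.I. requested in the SPSS options can be approximated by hand from the same summary statistics. This is a rough sketch, assuming the pooled standard error used in the practice calculation and a normal quantile; SPSS itself uses a t quantile, which is nearly identical at these sample sizes. The function name `diff_means_ci` is hypothetical.

```python
import math
from statistics import NormalDist


def diff_means_ci(n1, mean1, s1, n2, mean2, s2, level=0.95):
    """Approximate confidence interval for mean1 - mean2 (pooled SE, normal quantile)."""
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    se = sp * math.sqrt(1 / n1 + 1 / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    diff = mean1 - mean2
    return diff - z_star * se, diff + z_star * se


# Heart SCORE total cholesterol: men minus women
low, high = diff_means_ci(165, 198.88, 38.416, 337, 222.23, 42.023)
print(round(low, 2), round(high, 2))
```

An interval that excludes 0 agrees with rejecting H0 at α = 0.05 in the two-sided test.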

20 SECTION 4.4 Two-sample: independent groups – dichotomous outcome

21 2. Two-Sample: Independent Groups-Dichotomous Outcome
• Parameter: Risk Difference (RD) (p1 – p2) or Risk Ratio (RR) (p1 / p2)
• H0: RD: p1 = p2, or p1 – p2 = 0; RR: p1 / p2 = 1.0
• H1: RD: p1 ≠ p2, or p1 – p2 ≠ 0; RR: p1 / p2 ≠ 1.0
• Test statistic (critical value of z in Table 1C):
  $$ z = \frac{p_1 - p_2}{\sqrt{p(1 - p)(1/n_1 + 1/n_2)}} $$
  where p is the overall proportion of outcomes in the two groups combined.
  Use z when min[n1 p1, n1(1 – p1)] > 5 and min[n2 p2, n2(1 – p2)] > 5
Note: p = proportion of successes (outcomes)

22 2. Two-Sample: Independent Groups-Dichotomous Outcome
Example: From the Framingham Heart Study (offspring), compare the prevalence of CVD between smokers and non-smokers.

               No CVD   CVD   Total
Smoker           663     81     744    p1 = 81/744 = 0.1089
Non-smoker      2757    298    3055    p2 = 298/3055 = 0.0975

Risk Difference (RD): p1 – p2 = 0.0114; Risk Ratio (RR): p1 / p2 = 1.12

1) Set up the hypothesis and determine the level of statistical significance (including 1 versus 2-sided hypothesis).
   H0: p1 = p2
   H1: p1 ≠ p2 (two-sided hypothesis)
   α = 0.05
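The two effect measures on this slide follow directly from the counts in the 2 × 2 table. A minimal sketch (the function name `risk_measures` and the dictionary keys are illustrative choices):

```python
def risk_measures(events1, n1, events2, n2):
    """Proportions, risk difference and risk ratio for two independent groups."""
    p1 = events1 / n1
    p2 = events2 / n2
    return {"p1": p1, "p2": p2, "rd": p1 - p2, "rr": p1 / p2}


# Framingham offspring: CVD among smokers (81 of 744) and non-smokers (298 of 3055)
m = risk_measures(81, 744, 298, 3055)
print({name: round(value, 4) for name, value in m.items()})
```

Because the slide subtracts proportions already rounded to four decimals, its risk difference can differ from the unrounded value in the last digit.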

23 2. Two-Sample: Independent Groups-Dichotomous Outcome
Example: From the Framingham Heart Study (offspring), compare the prevalence of CVD between smokers and non-smokers.

               No CVD   CVD   Total
Smoker           663     81     744    p1 = 81/744 = 0.1089
Non-smoker      2757    298    3055    p2 = 298/3055 = 0.0975

2) Select the appropriate test statistic: min[n1 p1, n1(1 – p1)] > 5 and min[n2 p2, n2(1 – p2)] > 5, so use z
3) Set up the decision rule: Reject H0 if z ≤ -1.96 or z ≥ 1.96

24 2. Two-Sample: Independent Groups-Dichotomous Outcome
Example: From the Framingham Heart Study (offspring), compare the prevalence of CVD between smokers and non-smokers.

               No CVD   CVD   Total
Smoker           663     81     744    p1 = 81/744 = 0.1089
Non-smoker      2757    298    3055    p2 = 298/3055 = 0.0975

4) Compute the test statistic:
$$ p = \frac{81 + 298}{744 + 3055} = \frac{379}{3799} = 0.0998 $$
$$ z = \frac{0.1089 - 0.0975}{\sqrt{0.0998(1 - 0.0998)(1/744 + 1/3055)}} \approx 0.93 $$
5) Conclusion: Do not reject H0 because -1.96 < 0.93 < 1.96
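The calculation in step 4 can be reproduced from the raw counts. The sketch below assumes the pooled-proportion form of the z statistic; `two_proportion_z` and `z_approx_ok` are hypothetical helper names. The hand calculation rounds intermediate values, so the unrounded result (about 0.92) can differ from the slide in the second decimal place without changing the conclusion.

```python
import math


def z_approx_ok(events, n):
    """Sample-size check: min[n*p, n*(1 - p)] > 5, i.e. both cell counts exceed 5."""
    return min(events, n - events) > 5


def two_proportion_z(events1, n1, events2, n2):
    """Return (pooled proportion, z statistic) for H0: p1 = p2."""
    p1, p2 = events1 / n1, events2 / n2
    p = (events1 + events2) / (n1 + n2)
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return p, (p1 - p2) / se


# Framingham offspring: CVD among smokers versus non-smokers
assert z_approx_ok(81, 744) and z_approx_ok(298, 3055)
p_pooled, z = two_proportion_z(81, 744, 298, 3055)
print(round(p_pooled, 4), round(z, 2))
```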

25 2. Two-Sample: Independent Groups-Dichotomous Outcome (Practice)
Example: From the Heart SCORE Study, compare the prevalence of diabetes by level of weekly exercise.

Exercise        No diabetes   Diabetes   Total
< 3 times/wk        177          18       195    p1 = _____________
> 3 times/wk        278          23       301    p2 = _____________

Risk Difference (RD): p1 – p2 = _______; Risk Ratio (RR): p1 / p2 = _______

1) Set up the hypothesis and determine the level of statistical significance (including 1 versus 2-sided hypothesis).
   H0: _____________________________
   H1: _____________________________
   α = 0.05

26 2. Two-Sample: Independent Groups-Dichotomous Outcome (Practice)
Example: From the Heart SCORE Study, compare the prevalence of diabetes by level of weekly exercise.

Exercise        No diabetes   Diabetes   Total
< 3 times/wk        177          18       195    p1 = 18/195 = 0.0923
> 3 times/wk        278          23       301    p2 = 23/301 = 0.0764

Risk Difference (RD): p1 – p2 = 0.0159; Risk Ratio (RR): p1 / p2 = 1.21

1) Set up the hypothesis and determine the level of statistical significance (including 1 versus 2-sided hypothesis).
   H0: p1 = p2
   H1: p1 ≠ p2 (two-sided hypothesis)
   α = 0.05

27 2. Two-Sample: Independent Groups-Dichotomous Outcome (Practice)
Example: From the Heart SCORE Study, compare the prevalence of diabetes by level of weekly exercise.

Exercise        No diabetes   Diabetes   Total
< 3 times/wk        177          18       195    p1 = _______
> 3 times/wk        278          23       301    p2 = _______

2) Select the appropriate test statistic: min[n1 p1, n1(1 – p1)] > 5 and min[n2 p2, n2(1 – p2)] > 5, so use z
3) Set up the decision rule: Reject H0 if: ________________________

28 2. Two-Sample: Independent Groups-Dichotomous Outcome (Practice)
Example: From the Heart SCORE Study, compare the prevalence of diabetes by level of weekly exercise.

Exercise        No diabetes   Diabetes   Total
< 3 times/wk        177          18       195    p1 = 18/195 = 0.0923
> 3 times/wk        278          23       301    p2 = 23/301 = 0.0764

2) Select the appropriate test statistic: min[n1 p1, n1(1 – p1)] > 5 and min[n2 p2, n2(1 – p2)] > 5, so use z
3) Set up the decision rule: Reject H0 if z ≤ -1.96 or z ≥ 1.96

29 2. Two-Sample: Independent Groups-Dichotomous Outcome (Practice)
Example: From the Heart SCORE Study, compare the prevalence of diabetes by level of weekly exercise.

Exercise        No diabetes   Diabetes   Total
< 3 times/wk        177          18       195    p1 = _______
> 3 times/wk        278          23       301    p2 = _______

4) Compute the test statistic:
   p = ______________ = __________
   z = __________________________________
5) Conclusion: ________________________________

30 2. Two-Sample: Independent Groups-Dichotomous Outcome (Practice)
Example: From the Heart SCORE Study, compare the prevalence of diabetes by level of weekly exercise.

Exercise        No diabetes   Diabetes   Total
< 3 times/wk        177          18       195    p1 = 18/195 = 0.0923
> 3 times/wk        278          23       301    p2 = 23/301 = 0.0764

4) Compute the test statistic:
$$ p = \frac{18 + 23}{195 + 301} = \frac{41}{496} = 0.0827 $$
$$ z = \frac{0.0923 - 0.0764}{\sqrt{0.0827(1 - 0.0827)(1/195 + 1/301)}} = 0.628 $$
5) Conclusion: Do not reject H0 because -1.96 < 0.628 < 1.96

31 2. Two-Sample: Independent Groups-Dichotomous Outcome (Practice)
Example: From the Heart SCORE Study, compare the prevalence of diabetes by level of weekly exercise (α = 0.05)
SPSS:
  Analyze > Descriptive Statistics > Crosstabs
  Row Variable: Exercise >3 times/week
  Column Variable: History of diabetes
  Statistics – Chi-square
  Cells – Observed, Expected
Note: For a 2 × 2 table, SPSS reports the Pearson chi-square without a continuity correction and lists the Yates-corrected statistic separately as “Continuity Correction”.
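For a 2 × 2 table, the uncorrected Pearson chi-square equals the square of the two-sample z statistic, so the Crosstabs output and the hand calculation lead to the same decision (the 0.05 critical value for 1 degree of freedom is 3.84 = 1.96²). The sketch below builds the statistic from observed and expected counts; `pearson_chi_square` is an illustrative name and no continuity correction is applied.

```python
def pearson_chi_square(table):
    """Uncorrected Pearson chi-square: sum of (observed - expected)^2 / expected."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Heart SCORE: rows = exercise level, columns = (no diabetes, diabetes)
observed = [[177, 18], [278, 23]]
chi2 = pearson_chi_square(observed)
print(round(chi2, 3), round(chi2 ** 0.5, 3))
```

The same function works unchanged for tables with more than two rows or columns, where the degrees of freedom become (rows – 1)(columns – 1).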

32 SECTION 4.5 More than two-samples: independent groups – continuous outcome

33 3. More Than Two Independent Groups-Continuous Outcome
• Parameter: difference in means for more than 2 groups (ANOVA)
• H0: μ1 = μ2 = … = μk
• H1: Means are not all equal
• Test statistic: F value; find the critical value in Table 4 (df1 = k – 1; df2 = N – k)
  $$ F = \frac{\sum_j n_j (\bar{X}_j - \bar{X})^2 / (k - 1)}{\sum_j \sum_i (X_{ij} - \bar{X}_j)^2 / (N - k)} $$
  where
  n_j = sample size in the jth group (e.g. j = 1, 2, 3, …)
  X̄_j = mean in the jth group
  X̄ = overall mean
  k = number of independent groups (k > 2)
  N = total number of observations in analysis
ANOVA Assumptions:
• Outcome follows a normal distribution (all groups)
• Variances approximately equal among groups

34 3. More Than Two Independent Groups-Continuous Outcome
$$ F = \frac{\text{“Between-group” variability}}{\text{“Residual or error” variability (“within-group” variability)}} $$
• The denominator reflects the variability in the outcome; the null hypothesis is that all groups are random samples.
• The F statistic assesses whether differences among the means (the numerator) are larger than expected by chance.
• The F statistic has 2 degrees of freedom: df1 (numerator) and df2 (denominator); df1 = k – 1; df2 = N – k
• Table 4 contains critical values for the F distribution

35 3. More Than Two Independent Groups-Continuous Outcome
Analysis of Variance (ANOVA) Table

Source of Variation                          Sum of Squares (SS)   Degrees of freedom (df)   Mean Squares (MS)       F
Between-group                                SSB                   k – 1                     MSB = SSB / (k – 1)     F = MSB / MSE
Error or residual (random) “within-group”    SSE                   N – k                     MSE = SSE* / (N – k)    ------
Total                                        SST                   N – 1                     ------

*Textbook on page 150 has typographical error

36 3. More Than Two Independent Groups-Continuous Outcome
Example: Weight Loss by Treatment (in Pounds)

       Low Calorie   Low Fat   Low Carb   Control
            8           2         3          2
            9           4         5          2
            6           3         4         -1
            7           5         2          0
            3           1         3          3
n           5           5         5          5
X̄          6.6         3.0       3.4        1.2

1) Set up the hypothesis and determine level of statistical significance
   H0: µ1 = µ2 = µ3 = µ4
   H1: Means are not all equal; α = 0.05
2) Select the appropriate test statistic: F = MSB / MSE

37 3. More Than Two Independent Groups-Continuous Outcome
Example: Weight Loss by Treatment (in Pounds)

       Low Calorie   Low Fat   Low Carb   Control
n           5           5         5          5
X̄          6.6         3.0       3.4        1.2

3) Set up the decision rule (see critical value in Table 4)
   df1 = k – 1 = 4 – 1 = 3
   df2 = N – k = 20 – 4 = 16
   Reject H0 if F > 3.24

38 3. More Than Two Independent Groups-Continuous Outcome
Example: Weight Loss by Treatment (in Pounds)

       Low Calorie   Low Fat   Low Carb   Control
n           5           5         5          5
X̄          6.6         3.0       3.4        1.2

4) Compute test statistic (ANOVA table): MSB = SSB / (k – 1); MSE = SSE / (N – k); F = MSB / MSE
$$ SSB = \sum_j n_j (\bar{X}_j - \bar{X})^2 = 5(6.6 - 3.6)^2 + 5(3.0 - 3.6)^2 + 5(3.4 - 3.6)^2 + 5(1.2 - 3.6)^2 = 45.0 + 1.8 + 0.2 + 28.8 = 75.8 $$
$$ SSE = \sum_j \sum_i (X_{ij} - \bar{X}_j)^2 = 21.4 + 10.0 + 5.4 + 10.6 = 47.4 $$
   (see Tables 7-24 to 7-28, page 151 of text)
   MSB = 75.8 / (4 – 1) = 25.3
   MSE = 47.4 / (20 – 4) = 3.0
   F = 25.3 / 3.0 = 8.43
5) Conclusion: Reject H0 because 8.43 > 3.24
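The ANOVA arithmetic can be reproduced from the raw weight-loss values. In the sketch below the data are read from the example table, with the third control value taken as -1, which is the value implied by the control mean of 1.2; `one_way_anova` is an illustrative name. The slide rounds the overall mean to 3.6 and rounds intermediate squared deviations, so unrounded arithmetic gives slightly different sums of squares and an F close to 8.56 rather than 8.43. The conclusion is the same.

```python
def one_way_anova(groups):
    """Return (SSB, SSE, F) for a one-way ANOVA on a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: n_j * (group mean - overall mean)^2
    ssb = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group (error) sum of squares: (observation - group mean)^2
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    msb = ssb / (k - 1)
    mse = sse / (n_total - k)
    return ssb, sse, msb / mse


weight_loss = [
    [8, 9, 6, 7, 3],   # low calorie
    [2, 4, 3, 5, 1],   # low fat
    [3, 5, 4, 2, 3],   # low carb
    [2, 2, -1, 0, 3],  # control (-1 inferred from the group mean of 1.2)
]
ssb, sse, f_stat = one_way_anova(weight_loss)
print(round(ssb, 2), round(sse, 2), round(f_stat, 2))
```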

39 3. More Than Two Independent Groups-Continuous Outcome
Example: Weight Loss by Treatment (in Pounds)

       Low Calorie   Low Fat   Low Carb   Control
n           5           5         5          5
X̄          6.6         3.0       3.4        1.2

ANOVA orthogonal contrasts of means: Sometimes, rather than testing for any difference among all the means, we wish to compare specific means, or to test whether the means increase or decrease in a monotonic (linear) manner. This can be achieved with orthogonal contrasts of the means.
• The sum of the coefficients in each linear contrast must equal zero
• In the example above:

Contrast                          Coefficients
µ1 versus (µ2, µ3, µ4)            -3    1    1    1
(µ1, µ2) versus (µ3, µ4)          -1   -1    1    1
(µ1, µ2, µ3) versus µ4            -1   -1   -1    3
Linear trend                      -2   -1    1    2
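A contrast is a weighted sum of the group means, with weights that add to zero. The following sketch checks that rule for the four coefficient sets on this slide and computes each contrast estimate from the weight-loss means; the dictionary labels and the function name `contrast_estimate` are illustrative, and no significance test of the contrasts is attempted here.

```python
contrasts = {
    "mu1 vs (mu2, mu3, mu4)": [-3, 1, 1, 1],
    "(mu1, mu2) vs (mu3, mu4)": [-1, -1, 1, 1],
    "(mu1, mu2, mu3) vs mu4": [-1, -1, -1, 3],
    "linear trend": [-2, -1, 1, 2],
}
group_means = [6.6, 3.0, 3.4, 1.2]  # low calorie, low fat, low carb, control


def contrast_estimate(coefficients, means):
    """Weighted sum of group means; coefficients must add to zero."""
    if sum(coefficients) != 0:
        raise ValueError("contrast coefficients must sum to zero")
    return sum(c * m for c, m in zip(coefficients, means))


estimates = {name: contrast_estimate(c, group_means) for name, c in contrasts.items()}
for name, value in estimates.items():
    print(f"{name}: {value:.1f}")
```

An estimate far from zero, relative to its standard error, indicates that the compared sets of means differ in the direction the coefficients encode.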

40 3. More Than Two Independent Groups-Continuous Outcome (Practice)
Example: Body Mass Index by Blood Pressure Classification

       Normal   Pre-hypertensive   Htn Stage I   Htn Stage II
n        88          191              139            55
X̄      28.42       29.43            30.75          33.39
s       5.37        5.75             5.89           6.39

1) Set up the hypothesis and determine level of statistical significance
   H0: __________________________________
   H1: ___________________________________
   α = 0.05
2) Select the appropriate test statistic: __________________

41 3. More Than Two Independent Groups-Continuous Outcome (Practice)
Example: Body Mass Index by Blood Pressure Classification

       Normal   Pre-hypertensive   Htn Stage I   Htn Stage II
n        88          191              139            55
X̄      28.42       29.43            30.75          33.39
s       5.37        5.75             5.89           6.39

1) Set up the hypothesis and determine level of statistical significance
   H0: µ1 = µ2 = µ3 = µ4
   H1: Means are not all equal; or
   H1: Means increase or decrease in a monotonic (linear) manner
   α = 0.05
2) Select the appropriate test statistic: F = MSB / MSE

42 3. More Than Two Independent Groups-Continuous Outcome (Practice)
Example: Body Mass Index by Blood Pressure Classification

       Normal   Pre-hypertensive   Htn Stage I   Htn Stage II
n        88          191              139            55
X̄      28.42       29.43            30.75          33.39
s       5.37        5.75             5.89           6.39

3) Set up the decision rule (see critical value in Table 4)
   Total N = n1 + n2 + n3 + n4 = ___________________________
   df1 = k – 1 = ___________
   df2 = N – k = ___________
   http://www.danielsoper.com/statcalc3/calc.aspx?id=4
   Reject H0 if: _______________________

43 3. More Than Two Independent Groups-Continuous Outcome (Practice)
Example: Body Mass Index by Blood Pressure Classification

       Normal   Pre-hypertensive   Htn Stage I   Htn Stage II
n        88          191              139            55
X̄      28.42       29.43            30.75          33.39
s       5.37        5.75             5.89           6.39

3) Set up the decision rule (see critical value in Table 4)
   Total N = n1 + n2 + n3 + n4 = 88 + 191 + 139 + 55 = 473
   df1 = k – 1 = 4 – 1 = 3
   df2 = N – k = 473 – 4 = 469
   http://www.danielsoper.com/statcalc3/calc.aspx?id=4
   Reject H0 if F > 2.62

44 3. More Than Two Independent Groups-Continuous Outcome (Practice)
Example: Body Mass Index by Blood Pressure Classification

       Normal   Pre-hypertensive   Htn Stage I   Htn Stage II
n        88          191              139            55
X̄      28.42       29.43            30.75          33.39
s       5.37        5.75             5.89           6.39

Source of Variation                          Sum of Squares (SS)   Degrees of freedom (df)   Mean Squares (MS)             F
Between-group                                SSB = 990.9           k – 1 = ________          MSB = SSB / (k – 1) = _____   F = MSB / MSE = _______
Error or residual (random) “within-group”    SSE = 15766.2         N – k = _______           MSE = SSE / (N – k) = _____   ------
Total                                        SST = 16757.1         N – 1 = _______           ------

4) Compute the test statistic: F = ____________ (N = ____)
5) Conclusion: ___________________________

45 3. More Than Two Independent Groups-Continuous Outcome (Practice)
Example: Body Mass Index by Blood Pressure Classification

       Normal   Pre-hypertensive   Htn Stage I   Htn Stage II
n        88          191              139            55
X̄      28.42       29.43            30.75          33.39
s       5.37        5.75             5.89           6.39

Source of Variation                          Sum of Squares (SS)   Degrees of freedom (df)   Mean Squares (MS)             F
Between-group                                SSB = 990.9           k – 1 = 3                 MSB = SSB / (k – 1) = 330.3   F = MSB / MSE = 9.83
Error or residual (random) “within-group”    SSE = 15766.2         N – k = 469               MSE = SSE / (N – k) = 33.6    ------
Total                                        SST = 16757.1         N – 1 = 472               ------

4) Compute the test statistic: F = MSB / MSE = 9.83 (N = 473)
5) Conclusion: Reject H0 because 9.83 > 2.62
   Linear trend: F = 11.27
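When only group sizes, means and standard deviations are available, the ANOVA table can be rebuilt approximately, because SSB depends only on the means and SSE equals the sum of (n_j – 1)s_j² over the groups. The sketch below applies this to the rounded summary statistics on the slide, so its sums of squares land close to, but not exactly on, the SPSS values of 990.9 and 15766.2; `anova_from_summary` is a hypothetical helper name.

```python
def anova_from_summary(sizes, means, sds):
    """Return (SSB, SSE, F) for a one-way ANOVA rebuilt from group n, mean and SD."""
    k = len(sizes)
    n_total = sum(sizes)
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total
    ssb = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    sse = sum((n - 1) * s ** 2 for n, s in zip(sizes, sds))
    f_stat = (ssb / (k - 1)) / (sse / (n_total - k))
    return ssb, sse, f_stat


# Body mass index by blood pressure class (Normal, Pre-htn, Stage I, Stage II)
sizes = [88, 191, 139, 55]
means = [28.42, 29.43, 30.75, 33.39]
sds = [5.37, 5.75, 5.89, 6.39]
ssb, sse, f_stat = anova_from_summary(sizes, means, sds)
print(round(ssb, 1), round(sse, 1), round(f_stat, 2))
```

The F value obtained this way stays well above the critical value of 2.62, in line with the conclusion on the slide.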

46 3. More Than Two Independent Groups-Continuous Outcome (Practice)
Example: Body mass index and blood pressure classification in the Heart SCORE Study (α = 0.05)
SPSS:
  Analyze > Compare Means > One-Way ANOVA
  Dependent Variable: Body mass index
  Group Variable (Factor): Blood pressure class
  Contrasts: -2 -1 1 2
  Options: Descriptive, Homogeneity of variance test, Means plot

