SA3202, Solution for Tutorial 4
The Seat Belt Data: The data were collected to assess the effect of wearing a seat belt on safety.
Response variable: Injury (fatal or non-fatal)
Explanatory variable: Seat Belt (yes or no)
Here p1 and p2 denote the proportions of fatal injuries with and without a seat belt, respectively.
(1) The estimate and standard error
for p1: e-3, e-5
for p2: e-3, e-4
95% and 99% CI
for p1: [1.1283e-3, e-3], [1.0946e-3, e-3]
for p2: [9.2791e-3,1.0230e-2], [9.1297e-3, e-2]
for p1-p2: [-9.007e-3,-8.032e-3], [ e-3, e-3]
Key note: for a 95% CI, use z(.975)=1.96; for a 99% CI, use z(.995)=2.576.
To test whether wearing a seat belt is useful for safety, we test H0: p1=p2 against H1: p1<p2.
The estimated pooled sample proportion: e-3
The standard error of the sample proportion difference under H0: e-4
z=
The value of z is highly significant compared with N(0,1). Therefore wearing a seat belt does make a difference (it is useful for safety).
The odds ratio (the log odds ratio): (-2.075)
The s.e.: e-3 (.05093)
The 95% and 99% CI are: [.1130,.1381], [.1091,.1420] ([ , ], [ , ])
Conclusions: Passengers with seat belts are less at risk of fatal injury than those without seat belts: the odds of a fatal injury with a seat belt are only about % of the odds without one.
4/13/2019 SA3202, Solution for Tutorial 4
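The interval computations above can be sketched as follows. This is a minimal illustration, not the tutorial's own code: the function names and the counts passed in at the end are hypothetical (the seat belt counts are not given on the slide), and the intervals are the usual Wald intervals, estimate ± z × s.e., with the standard error not computed under H0.

```python
from math import sqrt
from statistics import NormalDist


def wald_ci(estimate, se, level=0.95):
    """Wald interval: estimate +/- z * se, with z the standard normal quantile."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # z(.975) for 95%, z(.995) for 99%
    return estimate - z * se, estimate + z * se


def proportion_summary(x1, n1, x2, n2, level=0.95):
    """Estimates, standard errors and Wald CIs for p1, p2 and p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    se1 = sqrt(p1 * (1 - p1) / n1)
    se2 = sqrt(p2 * (1 - p2) / n2)
    # For a CI the s.e. of the difference is NOT pooled under H0.
    se_diff = sqrt(se1 ** 2 + se2 ** 2)
    return {
        "p1": (p1, se1, wald_ci(p1, se1, level)),
        "p2": (p2, se2, wald_ci(p2, se2, level)),
        "diff": (p1 - p2, se_diff, wald_ci(p1 - p2, se_diff, level)),
    }


# Hypothetical counts, for illustration only: 20 of 1000 vs 50 of 1000.
summary = proportion_summary(20, 1000, 50, 1000)
for name, (est, se, ci) in summary.items():
    print(name, round(est, 5), round(se, 5), [round(v, 5) for v in ci])
```

Passing `level=0.99` gives the wider 99% intervals.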
The Death Penalty Data: The data given in Dataset 9 are only partial; the more complete data are given in Dataset 16. Analysis based on the partial data leads to misleading conclusions. The data were collected to see whether race affects the verdict.
Response variable: Verdict
Explanatory variable: Defendant's Race
(1) The estimate and standard error
for p1: ,
for p2: ,
95% and 99% CI
for p1: [.06862,.16887], [ , ]
for p2: [.05629,.14853], [.04180, ]
for p1-p2: [ , ], [ , ]
In the context of race bias, it is proper to test H0: p1=p2 against H1: p1<p2.
The estimated pooled sample proportion: .1104
The standard error of the sample proportion difference under H0:
Z=.4706
We reject H0 only when Z is sufficiently negative. Here Z is positive, so we do not reject H0: these data show no evidence of race bias. However, this is not the end of the story. See Tutorial 5.
The odds ratio (the log odds ratio): 1.181 (.1664)
The s.e.: (.3539)
The 95% and 99% CI are: [ ,2.0003], [.1044, ] ([-.5273,.8601], [-.7452,1.0780])
Note that the CIs for the odds ratio (the log odds ratio) contain 1 (0).
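The one-sided test above can be sketched as follows. The counts (19 of 160 in group 1, 17 of 166 in group 2) are an assumption: they do not appear on the slide, and are used here only because they reproduce the reported pooled proportion .1104 and Z=.4706 up to rounding. The function name `pooled_z` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def pooled_z(x1, n1, x2, n2):
    """Two-proportion z statistic with the s.e. computed under H0: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion estimated under H0
    se0 = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se0, pooled, se0


# Assumed counts (not stated on the slide).
z, pooled, se0 = pooled_z(19, 160, 17, 166)

# H1: p1 < p2 rejects for small (negative) z, so the p-value is the lower tail.
p_value = NormalDist().cdf(z)
print(round(pooled, 4), round(z, 4), round(p_value, 3))
```

Because z is positive, the lower-tail p-value exceeds 0.5, which is why H0 is not rejected.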
The University Admission Data: The data were collected in the context of “sex discrimination” (against women, of course). The data here are only partial, so the conclusion may be misleading.
Response variable: Admission
Explanatory variable: Sex
(1) The estimate and standard error
for p1: ,
for p2: ,
95% and 99% CI
for p1: [.4264, ], [.42051, ]
for p2: [.2825,.32458], [.27589,.33119]
for p1-p2: [.1134,.1698], [.1046,.1787] (they do not contain 0)
H0: p1=p2 against H1: p1>p2 (sex discrimination against women: more men admitted)
The estimated pooled sample proportion: .3877
The standard error under H0:
Z=9.60
There appears to be strong evidence of sex discrimination against women. But this is a misleading conclusion based on partial data; see Tutorial 5 for details.
The odds ratio (the log odds ratio): (.6104)
The s.e.: .1176 (.06389)
The 95% and 99% CI are: [1.6105, ], [1.5381,2.1441] ([.4851,.7355], [.44577, ])
They do not contain 1 (0).
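The odds-ratio quantities above can be sketched as follows. The counts (1198 admitted and 1493 rejected in group 1; 557 admitted and 1278 rejected in group 2) are an assumption: they are not on the slide, and are used only because they reproduce the reported log odds ratio .6104 and standard errors .1176 and .06389. The reported odds-ratio intervals are consistent with a Wald interval on the odds-ratio scale, OR ± z × s.e.(OR), with s.e.(OR) = OR × s.e.(log OR), so that is what the sketch computes; the function name is hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist


def odds_ratio_summary(a, b, c, d, level=0.95):
    """2x2 table with a, c = positive/negative in group 1 and b, d in group 2."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    odds_ratio = (a * d) / (b * c)
    log_or = log(odds_ratio)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # s.e. of the log odds ratio
    se_or = odds_ratio * se_log                   # delta-method s.e. of the odds ratio
    return {
        "or": odds_ratio,
        "log_or": log_or,
        "se_or": se_or,
        "se_log": se_log,
        "ci_or": (odds_ratio - z * se_or, odds_ratio + z * se_or),
        "ci_log": (log_or - z * se_log, log_or + z * se_log),
    }


# Assumed counts (not stated on the slide).
s95 = odds_ratio_summary(1198, 557, 1493, 1278, level=0.95)
s99 = odds_ratio_summary(1198, 557, 1493, 1278, level=0.99)
print({k: v for k, v in s95.items()})
```

An alternative interval for the odds ratio exponentiates the endpoints of `ci_log`; it is always positive but is not the one the reported numbers match.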
Statistic                              Vitamin C data    Death Penalty Data
z1 (based on proportion difference)
z2 (based on odds ratio)
z3 (based on log odds ratio)
Key note: for CI computation, the SE should not be computed under H0, while for hypothesis testing, the SE should be computed under H0.
The values of z1 and z3 are usually quite close but may be quite different from z2. It is better not to use z2 because of power considerations.
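One way to compute the three statistics is sketched below. Several things here are assumptions rather than the tutorial's stated method: the counts (19, 17, 141, 149, which reproduce the death penalty values reported earlier), the function name `three_z`, and the form of the null standard errors. Under H0: p1=p2=p the delta method gives Var(log OR) ≈ (1/n1 + 1/n2)/(p(1-p)), and since OR=1 under H0 the same value serves as the standard error of the odds ratio.

```python
from math import log, sqrt


def three_z(a, b, c, d):
    """z1, z2, z3 for a 2x2 table (a, c = group 1; b, d = group 2), SEs under H0."""
    n1, n2 = a + c, b + d
    p1, p2 = a / n1, b / n2
    p = (a + b) / (n1 + n2)  # pooled proportion under H0
    se_diff0 = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    # Assumed null s.e. of log OR; also the null s.e. of OR because OR = 1 under H0.
    se_log0 = sqrt((1 / n1 + 1 / n2) / (p * (1 - p)))
    odds_ratio = (a * d) / (b * c)
    z1 = (p1 - p2) / se_diff0
    z2 = (odds_ratio - 1) / se_log0
    z3 = log(odds_ratio) / se_log0
    return z1, z2, z3


# Assumed death penalty counts (not stated on the slide).
z1, z2, z3 = three_z(19, 17, 141, 149)
print(round(z1, 4), round(z2, 4), round(z3, 4))
```

With these counts z1 and z3 agree to about three decimal places while z2 is noticeably larger, which matches the remark that z1 and z3 are usually close.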
This is simple algebra. Note that
Response    G1     G2     Total
Positive    a      b      a+b
Negative    c      d      c+d
Total       a+c    b+d    a+b+c+d
The estimates for p1 and p2 are a/(a+c) and b/(b+d).
The estimate for the common p is (a+b)/(a+b+c+d).
The estimated proportion difference is a/(a+c)-b/(b+d) = (ad-bc)/[(a+c)(b+d)].
The estimated variance under H0
= [1/(a+c)+1/(b+d)] * [(a+b)/(a+b+c+d)] * [(c+d)/(a+b+c+d)]
= (a+b+c+d)/[(a+c)(b+d)] * (a+b)(c+d)/(a+b+c+d)^2
= (a+b)(c+d)/[(a+b+c+d)(a+c)(b+d)]
Thus,
Z = {(ad-bc)/[(a+c)(b+d)]} / {(a+b)(c+d)/[(a+b+c+d)(a+c)(b+d)]}^{1/2}
  = (ad-bc) {(a+b+c+d)/[(a+b)(c+d)(a+c)(b+d)]}^{1/2}
The results follow.
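The identity can be checked numerically. The sketch below computes Z once from the proportions with the pooled standard error and once from the closed form; the function names and the test counts are hypothetical.

```python
from math import isclose, sqrt


def z_from_proportions(a, b, c, d):
    """Z as (p1_hat - p2_hat) divided by the standard error under H0."""
    n1, n2, n = a + c, b + d, a + b + c + d
    p1, p2, p = a / n1, b / n2, (a + b) / n
    return (p1 - p2) / sqrt(p * (1 - p) * (1 / n1 + 1 / n2))


def z_closed_form(a, b, c, d):
    """Z = (ad - bc) * sqrt(n / [(a+b)(c+d)(a+c)(b+d)])."""
    n = a + b + c + d
    return (a * d - b * c) * sqrt(n / ((a + b) * (c + d) * (a + c) * (b + d)))


# Arbitrary counts, for illustration only.
for table in [(3, 7, 11, 5), (19, 17, 141, 149), (40, 10, 60, 90)]:
    assert isclose(z_from_proportions(*table), z_closed_form(*table), rel_tol=1e-12)
```

Squaring the closed form gives n(ad-bc)^2/[(a+b)(c+d)(a+c)(b+d)], the Pearson chi-square statistic for a 2x2 table.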