Published by Maximilian McKenzie. Modified over 9 years ago.
1
Unintended Pregnancy in College-aged Women
Yi Su and Zhe Zhao
May 3, 2010
2
Survey Data
2008 subjects
65 questions
 - Age (range from 18-35)
 - Race
 - High School Class Size
 - Year in College
 - History of Sexuality (Hx Sex)
 - Use of Emergency Contraception (EC)
 - Questions involving Knowledge of EC
 - Questions involving Accessibility to EC
3
Client’s Goal
 - Obtain frequency tables for certain variables
 - Find relationship between knowledge of EC and given variables
 - Find relationship between accessibility of EC and given variables
4
Before Analysis…
The eight columns involving race should be represented by one variable showing all levels
 -- Race variable (SAS Code)
Variables should be created to summarize each subject’s knowledge of EC and level of accessibility to EC
 -- Knowledge_index, Access_indicator, Access_index (SAS Code)
5
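SAS Code
……
/* Race needs a LENGTH of at least $12 declared before this point;    */
/* otherwise the first assignment ('White') fixes its length at 5.    */
array races {*} Race_White Race_Af_American Race_Indig_Aborig RaceAsian_PI
                Race_Hispanic Race_Latino Race_Multiracial Race_NS;
/* Stop at the first race column answered 'Yes'; i ends at DIM(races)+1 if none is */
do i=1 to DIM(races) until (races{i}='Yes');
end;
select (i);
  when (1) Race='White';
  when (2) Race='Af_American';
  when (3) Race='Indig_Aborig';
  when (4) Race='Asian_PI';
  when (5) Race='Hispanic';
  when (6) Race='Latino';
  when (7) Race='Multiracial';
  otherwise Race='NS';   /* Race_NS answered 'Yes', or no column answered 'Yes' */
end;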
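The same collapsing rule can be sketched outside SAS. The Python below is only an illustration, not the authors' code: it assumes each subject is a dictionary keyed by the eight column names from the slide, and the function name `collapse_race` is hypothetical. As in the SAS loop, a subject with several columns answered 'Yes' gets the first one in array order.

```python
# Column name -> level label, in the same order as the SAS array.
RACE_COLUMNS = [
    ("Race_White", "White"),
    ("Race_Af_American", "Af_American"),
    ("Race_Indig_Aborig", "Indig_Aborig"),
    ("RaceAsian_PI", "Asian_PI"),
    ("Race_Hispanic", "Hispanic"),
    ("Race_Latino", "Latino"),
    ("Race_Multiracial", "Multiracial"),
    ("Race_NS", "NS"),
]


def collapse_race(row):
    """Return the label of the first race column answered 'Yes', else 'NS'."""
    for column, label in RACE_COLUMNS:
        if row.get(column) == "Yes":
            return label
    return "NS"  # no column answered 'Yes' (hypothetical row layout)
```

Because the first match wins, the order of the columns decides how multi-answer subjects are classified.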
6
SAS Code
……
array knowledge {*} Aware Other_EC EC_Effective Ec_Abortive Ec_Ovulate EC_egg
                    EC_trimester Ec_defects Student_health Pharmacy_OTC;
nmrt=0;   /* weighted sum of answered items */
dnmt=0;   /* sum of weights of answered items */
do i=1 to DIM(knowledge);
  if knowledge{i} ne 99 then do;          /* 99 = item not answered */
    if i in (1 2) then do;                /* weight 2 */
      dnmt=dnmt+2; nmrt=nmrt+2*knowledge{i};
    end;
    else if i in (3 4 9 10) then do;      /* weight 3 */
      dnmt=dnmt+3; nmrt=nmrt+3*knowledge{i};
    end;
    else do;                              /* weight 1 */
      dnmt=dnmt+1; nmrt=nmrt+knowledge{i};
    end;
  end;
end;                                      /* closes the do i= loop */
if dnmt ne 0 then Knowledge_Index=nmrt/dnmt;
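The index is a weighted mean over the answered items only. A minimal Python sketch of the same arithmetic follows; it assumes each item is scored 0 or 1 with 99 meaning "not answered" (the 0/1 scoring is an assumption, the slide only shows the 99 code), and the function names are hypothetical.

```python
MISSING = 99  # code for an unanswered item, as in the SAS step


def item_weight(position):
    """Weight by 1-based position in the SAS array: 2, 3 or 1."""
    if position in (1, 2):
        return 2
    if position in (3, 4, 9, 10):
        return 3
    return 1


def knowledge_index(answers):
    """Weighted mean of the answered items; None if nothing was answered."""
    numerator = 0
    denominator = 0
    for position, score in enumerate(answers, start=1):
        if score == MISSING:
            continue  # skipped items contribute to neither sum
        weight = item_weight(position)
        denominator += weight
        numerator += weight * score
    if denominator == 0:
        return None
    return numerator / denominator
```

For example, a subject who answered only the first four items with scores 1, 0, 1, 0 gets (2 + 3) / (2 + 2 + 3 + 3) = 0.5.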
7
Frequency Tables
 - All subjects
 - Hx sex Yes
 - Hx sex No
 - EC use Yes
 - EC use No
8
Frequency Tables (continued)
Create, customize and manage output via SAS ODS
Code
ods listing close;
ods html body='C:\Consulting for Melissa\OUTPUT\all_freq.xls' style=Minimal;
ods NOPROCTITLE;
proc freq data = mydata.Survey_V03;
……
ods html close;
Output: Excel
9
Relationship between Knowledge Index and Race
Knowledge Index: [0,1)
Race: Eight different races
10
Initial Try: ANOVA on original data
proc glm data=mydata.survey_final;
  class Race_Coded;
  model Knowledge_Index=Race_Coded;
  lsmeans Race_Coded/adjust=Tukey pdiff;
  output out=o p=pred r=resid;
run;
11
Test Results
Least Squares Means for effect Race_Coded
Pr > |t| for H0: LSMean(i)=LSMean(j)
Dependent Variable: Knowledge_Index

i/j       1       2       3       4       5       6       7       8
 1            0.9970  0.9917  0.9954  0.2634  0.9995  1.0000  1.0000
 2   0.9970           0.9999  0.9494  <.0001  1.0000  0.9993  0.9885
 3   0.9917  0.9999           0.9311  0.1671  0.9999  0.9971  0.9790
 4   0.9954  0.9494  0.9311           0.5436  0.9654  0.9927  0.9917
 5   0.2634  <.0001  0.1671  0.5436           0.0828  0.3404  0.0395
 6   0.9995  1.0000  0.9999  0.9654  0.0828           0.9999  0.9992
 7   1.0000  0.9993  0.9971  0.9927  0.3404  0.9999           1.0000
 8   1.0000  0.9885  0.9790  0.9917  0.0395  0.9992  1.0000
12
Model Diagnostics
13
Tests for Normality

Test                  Statistic            p Value
Shapiro-Wilk          W       0.81658      Pr < W      <0.0001
Kolmogorov-Smirnov    D       0.189102     Pr > D      <0.0100
Cramer-von Mises      W-Sq    17.61803     Pr > W-Sq   <0.0050
Anderson-Darling      A-Sq    104.3814     Pr > A-Sq   <0.0050
14
Model Diagnostics
15
Conclusion
 - Non-normality
 - Non-constant variance
16
Second Try: Box-Cox transformation for ANOVA
proc transreg data=mydata.survey_final;
  model boxcox(Knowledge_Index/parameter=0.00001)=class(Race_Coded);
run;
17
Box-Cox transformation result
The TRANSREG Procedure
Transformation Information for BoxCox(Knowledge_Index)

Lambda     R-Square    Log Like
 -3.00       0.00      -58498.8
 -2.75       0.00      -53051.7
 -2.50       0.00      -47621.1
 ……
  1.75       0.02        3953.4
  2.00       0.02        3989.9
  2.25       0.01        4018.2
  2.50       0.01        4039.6
  2.75       0.01        4055.2
  3.00 +     0.01        4065.8 <

< - Best Lambda
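For reference, the Box-Cox family that the lambda column indexes can be written in a few lines. This Python sketch is an illustration only, not the TRANSREG implementation; the `shift` argument is assumed to play the role of the `parameter=0.00001` option, which moves zero scores just above zero before transforming.

```python
from math import log


def box_cox(y, lam, shift=0.0):
    """Box-Cox transform of one value: ((y+shift)^lam - 1)/lam, or log at lam = 0."""
    shifted = y + shift
    if shifted <= 0:
        raise ValueError("Box-Cox needs a strictly positive value after the shift")
    if lam == 0:
        return log(shifted)
    return (shifted ** lam - 1) / lam
```

With lambda = 3 the transform is a linear rescaling of the cube of the shifted score, so an ANOVA on it and an ANOVA on the plain cube lead to the same tests.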
18
ANOVA with cube transformation
proc glm data=a;
  class Race_Coded;
  model Knowledgetr=Race_Coded;
  lsmeans Race_Coded/adjust=Tukey pdiff;
  output out=o p=pred r=resid;
run;
19
Pairwise Comparison Result
Least Squares Means for effect Race_Coded
Pr > |t| for H0: LSMean(i)=LSMean(j)
Dependent Variable: knowledgetr

i/j       1       2       3       4       5       6       7       8
 1            0.9974  0.9922  0.9791  0.3095  0.9996  1.0000  1.0000
 2   0.9974           0.9998  0.8736  <.0001  1.0000  0.9988  0.9743
 3   0.9922  0.9998           0.8427  0.2235  0.9999  0.9954  0.9606
 4   0.9791  0.8736  0.8427           0.4110  0.9052  0.9750  0.9735
 5   0.3095  <.0001  0.2235  0.4110           0.1135  0.3510  0.0368
 6   0.9996  1.0000  0.9999  0.9052  0.1135           0.9998  0.9976
 7   1.0000  0.9988  0.9954  0.9750  0.3510  0.9998           1.0000
 8   1.0000  0.9743  0.9606  0.9735  0.0368  0.9976  1.0000
20
Model Diagnostics
22
Conclusion
The transformation fixed the non-constant variance problem, but the residuals are still non-normal.
23
Third Try: Non-parametric test
Wilcoxon two sample test---- The Wilcoxon-Mann-Whitney test is a non-parametric analog to the independent-samples t-test and can be used when you do not assume that the dependent variable is normally distributed.
24
Kruskal-Wallis Test---- The Kruskal-Wallis test is used when you have one independent variable with two or more levels. In other words, it is the non-parametric version of one-way ANOVA. It also generalizes the Mann-Whitney test, since it permits more than two groups.
25
Kruskal-Wallis Test---- This test is an alternative to the independent-group ANOVA when the assumption of normality or equality of variance is not met. Like many non-parametric tests, it uses the ranks of the data rather than their raw values to calculate the statistic. Because it makes no distributional assumption, it is generally less powerful than ANOVA when the ANOVA assumptions do hold.
26
SAS code for Kruskal-Wallis Test
proc npar1way data=mydata.survey_final wilcoxon;
  class Race_Coded;
  var Knowledge_Index;
run;
27
Test Result
Kruskal-Wallis Test
Chi-Square         27.4365
DF                       7
Pr > Chi-Square     0.0003
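To make the rank-based statistic concrete, here is a small Python sketch of the Kruskal-Wallis H computation with midranks and the usual tie correction. It is an illustration written for this transcript, not the NPAR1WAY source, and the function name is hypothetical. H is referred to a chi-square distribution with (number of groups - 1) degrees of freedom, which is where the DF of 7 for eight race groups comes from.

```python
from collections import Counter


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H for a list of samples (lists of numbers)."""
    counts = Counter(value for group in groups for value in group)
    n_total = sum(counts.values())

    # Midrank of each distinct value: the average of the ranks it occupies.
    rank_of = {}
    next_rank = 1
    for value in sorted(counts):
        ties = counts[value]
        rank_of[value] = next_rank + (ties - 1) / 2
        next_rank += ties

    # Sum over groups of (rank sum)^2 / group size.
    weighted = sum(
        sum(rank_of[v] for v in group) ** 2 / len(group) for group in groups
    )
    h = 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

    # Tie correction; it is zero (and H undefined) only if every value is equal.
    tie_term = sum(t ** 3 - t for t in counts.values())
    return h / (1 - tie_term / (n_total ** 3 - n_total))
```

Only the ordering of the scores enters the calculation, which is why the heavy skew seen in the normality tests does not invalidate it.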
28
Pairwise Comparison
Not provided by proc npar1way
Solutions
1) Carry out all pairwise tests one by one, taking care to control the familywise error rate
2) SAS macro
29
Pairwise Comparison

Pair       P-value
0 and 1    0.4739
0 and 2    0.4692
0 and 3    0.2810
0 and 4    0.0371
0 and 5    0.6427
0 and 6    0.9011
0 and 7    0.9637
1 and 2    0.7909
1 and 3    0.1200
1 and 4    <.0001
1 and 5    0.7874
1 and 6    0.4259
1 and 7    0.2596
2 and 3    0.1546
2 and 4    0.0234
2 and 5    0.7117
2 and 6    0.4199
2 and 7    0.3096
3 and 4    0.0518
3 and 5    0.1860
3 and 6    0.3709
3 and 7    0.2524
4 and 5    0.0148
4 and 6    0.0469
4 and 7    0.0038
5 and 6    0.5720
5 and 7    0.5267
6 and 7    0.9816

Compare each P-value with the Bonferroni threshold 0.05/21=0.00238. Note that 28 pairs are listed, for which the threshold would be 0.05/28=0.00179; only the pair 1 and 4 falls below either value.
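The Bonferroni divisor is the number of pairwise tests, which follows from the number of groups: 7 groups give 21 pairs and 8 groups give 28. The Python sketch below computes the threshold and applies it to four of the smallest p-values from the table; it is an illustration only, the function name is hypothetical, and "<.0001" is entered as its upper bound 0.0001.

```python
from math import comb


def bonferroni_threshold(alpha, n_groups):
    """Per-comparison significance level for all pairwise tests among n_groups."""
    return alpha / comb(n_groups, 2)


# Smallest p-values copied from the slide; "<.0001" is stored as 0.0001.
slide_p_values = {(1, 4): 0.0001, (4, 7): 0.0038, (4, 5): 0.0148, (2, 4): 0.0234}

threshold = bonferroni_threshold(0.05, 8)  # 8 groups coded 0 to 7
significant = [pair for pair, p in slide_p_values.items() if p < threshold]
```

Several pairs that look significant at the unadjusted 0.05 level, such as 4 and 7, no longer pass once the familywise adjustment is applied.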
30
SAS Macro (1)
ODS OUTPUT WilcoxonScores=wlx(drop=variable);
ODS EXCLUDE WilcoxonScores;
proc npar1way data=mydata.survey_final wilcoxon;
  class Race_Coded;
  var Knowledge_Index;
run;
PROC PRINT DATA=wlx NOOBS; run;

* macro var k == number of groups;
DATA _null_;
  SET wlx nobs=nobs;
  CALL SYMPUT("k",LEFT(nobs));
run;
%put &k.;

PROC TRANSPOSE DATA=wlx OUT=cnts(drop=_name_ _label_) prefix=_n;
  var n;
  ID class;
run;
PROC TRANSPOSE DATA=wlx OUT=mns(drop=_name_ _label_) prefix=_mn;
  var meanscore;
  ID class;
run;
proc print data=cnts; RUN;
proc print data=mns; RUN;

%LET alpha=.05; * familywise pvalue ;
DATA results;
  SET cnts;
  SET mns;
  DROP nn _n1-_n&k. _mn1-_mn&k.;
  LENGTH reject $2;
  RETAIN reject ' ';
  LABEL compare='Critical Value'
        abs_diff='Absolute Difference in Mean Ranks';
31
SAS Macro (2)
  c = ((&k.*(&k.-1))/2);                 * number of pairwise tests;
  z = PROBIT( (1- ((&alpha./2)/ c) ) );  * multiplier ;
  nn = SUM(of _n1-_n&k.);                * total number of observations ;
  ARRAY nc{&k.} _n1 - _n&k.;
  ARRAY mn{&k.} _mn1 - _mn&k.;
  DO i = 1 to (&k.-1);
    DO j = (i+1) TO &k.;
      sc1 = mn{i};
      sc2 = mn{j};
      abs_diff = abs(sc1 - sc2);
      compare = z * SQRT( nn*(nn+1)/12 * ((1/nc{i}) + (1/nc{j})));
      IF abs_diff > compare then reject='**';
      * the ** marker is to denote any significant differences ;
      OUTPUT results;
      reject=' '; * reset marker to missing ;
    END;   * closes the j loop ;
  END;     * closes the i loop ;
RUN;

proc print data=results NOOBS label;
  var i j sc1 sc2 abs_diff compare reject;
  FORMAT abs_diff 6.3 compare 6.2;
run;
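The critical value in the macro is a Bonferroni-adjusted normal quantile multiplied by the standard error of a difference in mean ranks. A minimal Python sketch of that one formula follows; it mirrors the `c`, `z` and `compare` lines above, but the function and argument names are hypothetical and it is not a replacement for the macro.

```python
from math import sqrt
from statistics import NormalDist


def dunn_critical_value(n_i, n_j, n_total, n_groups, alpha=0.05):
    """Largest mean-rank difference between groups i and j still judged non-significant."""
    n_tests = n_groups * (n_groups - 1) / 2                    # c in the macro
    z = NormalDist().inv_cdf(1 - (alpha / 2) / n_tests)        # PROBIT(...) in the macro
    return z * sqrt(n_total * (n_total + 1) / 12 * (1 / n_i + 1 / n_j))
```

The 1/n_i + 1/n_j term means that comparisons involving a small group get a much larger critical value, so the same mean-rank difference can be significant for one pair and not for another.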
32
                                Absolute Difference    Critical
i  j      sc1        sc2        in Mean Ranks          Value       reject
1  2    1000.41     976.19        24.216                287.78
1  3    1000.41    1605.50       605.09                1257.79
1  4    1000.41     717.08       283.32                 193.10      **
1  5    1000.41    1026.73        26.318                322.07
1  6    1000.41    1138.90       138.49                 563.76
1  7    1000.41    1136.55       136.14                 390.23
1  8    1000.41...
2  3     976.19    1605.50       629.31                1288.92
2  4     976.19     717.08       259.11                 341.40
2  5     976.19    1026.73        50.533                427.78
2  6     976.19    1138.90       162.71                 630.15
2  7     976.19    1136.55       160.36                 481.19
2  8     976.19...
3  4    1605.50     717.08       888.42                1271.13
3  5    1605.50    1026.73       578.77                1297.00
3  6    1605.50    1138.90       466.60                1377.07
3  7    1605.50    1136.55       468.95                1315.59
3  8    1605.50...
4  5     717.08    1026.73       309.64                 370.76
4  6     717.08    1138.90       421.82                 592.93
4  7     717.08    1136.55       419.46                 431.29
4  8     717.08...
5  6    1026.73    1138.90       112.17                 646.53
5  7    1026.73    1136.55       109.82                 502.45
5  8    1026.73...
6  7    1138.90    1136.55         2.352                683.05
6  8    1138.90...
7  8    1136.55...
33
Test Results for Knowledge

Original ANOVA / Two Sample t-test
 - Year In College: Freshman worse than Junior, Senior and Graduate
 - Size HS Class: Not Significant
 - Race: White better than Asian_PI
 - Hx Sex: Hx_Yes better than Hx_No
 - False Alarm: False Alarm Yes better than False Alarm No
 - EC Use: EC Use Yes better than EC Use No
 - Birth Control: Birth Control Yes better than Birth Control No

Transformed ANOVA
 - Year In College: Same as above
 - Size HS Class: Not Significant
 - Race: Multiracial and White are better than Asian_PI

Non-Parametric
 - Year In College: Freshman worse than Junior and Graduate
 - Size HS Class: Not Significant
 - Race: White better than Asian_PI
 - Hx Sex: Hx_Yes better than Hx_No
 - False Alarm: False Alarm Yes better than False Alarm No
 - EC Use: EC Use Yes better than EC Use No
 - Birth Control: Birth Control Yes better than Birth Control No
34
Test Results for Access

Original ANOVA / Two Sample t-test
 - Year In College: Freshman worse than Sophomore, Junior, Senior and Graduate
 - Size HS Class: Not Significant
 - Race: White better than Asian_PI
 - Hx Sex: Not Significant
 - False Alarm: Not Significant
 - Birth Control: Not Significant

Transformed ANOVA
 - Year In College: Freshman worse than Graduate
 - Size HS Class: Not Significant
 - Race: White better than Asian_PI

Non-Parametric
 - Year In College: Freshman worse than Graduate
 - Size HS Class: Not Significant
 - Race: White better than Asian_PI
 - Hx Sex: Not Significant
 - False Alarm: Not Significant
 - Birth Control: Not Significant