Stata 2, Bivariate
  Hein Stigum (H.S.), Dec-18
  Presentation, data and programs at: http://folk.uio.no/heins/
  Timing:
    Intro and continuous symmetrical: 60 min
    Skewed, categorical and regression (not survival): 60 min
  Schedule: 8:30-9:30; Groups 15+60 min 9:30-10:45; Plenary 45 min 10:45-11:30
Data types
  Categorical data
    Nominal: married/single/divorced
    Ordinal: small/medium/large
    One set of methods for categorical data, e.g. proportion married
  Numerical data
    Discrete: number of children
    Continuous: weight
    One set of methods for numerical data, e.g. average weight
  Coding 1, 2, 3: is 2 twice as much as 1?
Data type dictates type of analysis
  Start with continuous data
Continuous symmetric outcome
  Example: Birth weight
Distribution
  kdensity weight
  drop if weight<2000
  kdensity weight
Central tendency and dispersion
  Mean and standard deviation: Std Dev describes the data
  Mean with confidence interval: Std Err describes the estimate
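The Stata output for this slide is not preserved in the transcript. As a minimal sketch of how the two summaries could be produced, the block below simulates a stand-in variable named weight (the values are made up, not the course data) and runs summarize for the mean and standard deviation and mean for the standard error and confidence interval.

```stata
* Simulated stand-in for the birth weight data (values are made up)
clear
set seed 2018
set obs 500
generate weight = rnormal(3500, 500)

* Mean and standard deviation: Std Dev describes the spread of the data
summarize weight

* Mean with standard error and 95% confidence interval: Std Err describes the estimate
mean weight
```

The standard deviation stays roughly constant as the sample grows, while the standard error shrinks.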
Compare groups, equal variance?
  Not equal.
  Compare boys and girls: focus on means or focus on the low tail gives opposite results!
2 independent samples
  Are birth weights the same for boys and girls?
  Density plot: to check for equal variance
  Scatterplot: to see linear/non-linear effect and look for outliers
2 independent samples test
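The test output for this slide is not preserved. A plausible sequence, sketched here on simulated data (the variables sex and weight and all values are assumptions), is to check the equal-variance assumption first and then run the t test with or without that assumption.

```stata
* Simulated stand-in data: 0/1 sex indicator and birth weight (values are made up)
clear
set seed 2018
set obs 500
generate byte sex = runiform() < 0.5
generate weight = rnormal(3500 + 120*sex, 500)

* Equal variances? Variance ratio test
sdtest weight, by(sex)

* Equal means, assuming equal variances
ttest weight, by(sex)

* Equal means, allowing unequal variances
ttest weight, by(sex) unequal
```

If sdtest suggests unequal variances, the unequal option is the safer choice.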
K independent samples
  Is birth weight the same over parity?
  Scatterplot: to see linear/non-linear effect and look for outliers. Equal means? Linear effect? Outliers?
  Density plot: to check for equal variance. Equal variances?
K independent samples test
  Equal means? Equal variances?
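Both questions on this slide can be addressed by one command, since oneway reports the ANOVA F test for equal means together with Bartlett's test for equal variances. The sketch below uses simulated data with three hypothetical parity groups.

```stata
* Simulated stand-in data: 3 parity groups (0, 1, 2) and birth weight (values are made up)
clear
set seed 2018
set obs 600
generate byte parity = floor(3*runiform())
generate weight = rnormal(3400 + 80*parity, 500)

* One-way ANOVA (equal means?); the output also reports
* Bartlett's test for equal variances
oneway weight parity, tabulate
```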
Continuous by continuous
  Does birth weight depend on gestational age?
  Scatterplot
  Scatterplot, outlier dropped
Continuous by continuous tests
  Cut gestational age into groups, then use t-test or ANOVA
  or
  Use linear regression with 1 covariate
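The two options can be sketched as follows. The data are simulated, and the group variable gestgr and the choice of three groups are assumptions made for illustration.

```stata
* Simulated stand-in data: gestational age in days and birth weight (values are made up)
clear
set seed 2018
set obs 500
generate gestAge = rnormal(280, 12)
generate weight  = 3500 + 20*(gestAge - 280) + rnormal(0, 450)

* Option 1: cut gestational age into 3 groups of roughly equal size, then ANOVA
egen gestgr = cut(gestAge), group(3)
oneway weight gestgr, tabulate

* Option 2: linear regression with gestational age as the single covariate
regress weight gestAge
```

Grouping discards information within each group, so the regression is usually the more efficient choice when the relationship looks linear in the scatterplot.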
Test situations
  1 sample: ttest weight=10
  2 independent samples: ttest weight, by(sex) (equal/unequal variances)
  K independent samples: oneway weight parity
  By continuous: regress weight gestAge
  2 dependent samples (paired): ttest weight_last_year = weight_today
    e.g. ttest weight0=weight1 (assumes paired test)
Continuous skewed outcome
  Example: Number of sexual partners
Distribution
  kdensity partners if partners<=50
  The 75% fractile is lower here than on the next page because observations with partners>50 are dropped here.
Central tendency and dispersion
  Median and percentiles:
    cci: binomial exact; conservative confidence interval
    normal: normal, based on observed centiles
    meansd: normal, based on mean and standard deviation
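The three keywords on this slide match the confidence interval options of Stata's centile command, so the slide's output was presumably produced with it. A minimal sketch on simulated, right-skewed data (the variable partners and its values are made up):

```stata
* Simulated stand-in data: right-skewed count (values are made up)
clear
set seed 2018
set obs 400
generate partners = floor(exp(rnormal(1, 1)))

* Quartiles with the default confidence intervals
centile partners, centile(25 50 75)

* Same centiles with the conservative binomial-based interval
centile partners, centile(25 50 75) cci
```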
2 independent samples
  Do males and females have the same number of partners?
  Scatterplot: to see linear/non-linear effect and look for outliers
  Density plot: to check for equal variance
  Unequal variance! Test somewhat problematic.
2 independent samples test
  Equal medians?
  Could also use a t-test, since the "difference in means" is probably normal with 400 observations, even though the underlying distributions are quite skewed. T-test gives p=0.0000.
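The slide's test output is not preserved. The sketch below shows three candidate commands on simulated data (variables and values are assumptions): the Wilcoxon rank-sum test, the nonparametric median test, and the t test mentioned in the note.

```stata
* Simulated stand-in data: 0/1 sex indicator and skewed partner counts (values are made up)
clear
set seed 2018
set obs 400
generate byte sex = runiform() < 0.5
generate partners = floor(exp(rnormal(1 + 0.5*sex, 1)))

* Wilcoxon rank-sum (Mann-Whitney) test
ranksum partners, by(sex)

* Nonparametric test of equal medians
median partners, by(sex)

* t test allowing unequal variances, for comparison
ttest partners, by(sex) unequal
```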
K independent samples
  Do partners vary with age?
  Scatterplot
  Scatterplot (partners<20)
  Density plot (partners<20)
  Scatter to see linear/non-linear effect and look for outliers.
  Problems with unequal variance.
K independent samples test
  Equal medians?
  Probably a cohort effect rather than an age effect.
  One-way ANOVA gives p=0.48, and Bartlett's test for equal variances gives p=0.000, that is, clearly unequal variances. Group sizes are also somewhat different. Both tests (Kruskal-Wallis and ANOVA) are shaky.
  Regroup to 2 groups, or remove outlier.
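The two tests named in the note can be run as sketched below. The data are simulated with four hypothetical age groups whose spread increases with the group number, to mimic the unequal-variance problem; none of the values come from the course data.

```stata
* Simulated stand-in data: 4 age groups with increasing spread (values are made up)
clear
set seed 2018
set obs 400
generate byte agegr = 1 + floor(4*runiform())
generate partners = floor(exp(rnormal(1, 0.6 + 0.2*agegr)))

* Kruskal-Wallis rank test across the age groups
kwallis partners, by(agegr)

* One-way ANOVA with Bartlett's test for equal variances, for comparison
oneway partners agegr
```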
Table of tests
  Categorical ordered: use nonparametric tests
  Mann-Whitney U = Wilcoxon rank sum
Categorical data
  Example: Being bullied
  Shown flowchart p3
  Boys more or less than girls (2-sided test)
Frequency and proportion
  Proportion:
  Proportion with CI:
  May standardize, adjust for clusters, or use bootstrap or jackknife estimates.
  May weight if stratified sample.
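The outputs for this slide are not preserved. One way to get a frequency table and a proportion with confidence interval is sketched below on a simulated 0/1 variable named bullied (prevalence made up).

```stata
* Simulated stand-in data: 1 = bullied (prevalence is made up)
clear
set seed 2018
set obs 500
generate byte bullied = runiform() < 0.2

* Frequency table
tabulate bullied

* Proportions with standard errors and 95% confidence intervals
proportion bullied
```

The proportion command is an estimation command, which is why it can also handle the standardization, cluster adjustment, resampling and weighting mentioned on the slide.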
Proportion, confidence interval
  x = "disease", n = total number
  proportion:
  standard error:
  confidence interval:
  How much must n increase to halve the standard error?
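The formulas on this slide did not survive extraction. Assuming the usual normal-approximation (Wald) formulas were intended, they would read as follows, with x the number with "disease" and n the total number:

```latex
\hat{p} = \frac{x}{n}, \qquad
\mathrm{se}(\hat{p}) = \sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}}, \qquad
95\%\ \mathrm{CI}:\ \hat{p} \pm 1.96\,\mathrm{se}(\hat{p})
```

Under these formulas the standard error is proportional to $1/\sqrt{n}$, so halving it requires four times as many observations.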
Crosstables
  Are boys bullied as much as girls?
  Equal proportions?
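The crosstable itself is not preserved. A minimal sketch of such a table with tests of equal proportions, on simulated data (variables and prevalences are assumptions):

```stata
* Simulated stand-in data: 0/1 sex indicator and 0/1 bullied (values are made up)
clear
set seed 2018
set obs 500
generate byte sex = runiform() < 0.5
generate byte bullied = runiform() < (0.15 + 0.10*sex)

* Crosstable with column percentages, Pearson chi-squared and Fisher's exact test
tabulate bullied sex, column chi2 exact
```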
Ordered categories, trend
  Does bullied vary with age?
  twoway (fpfitci bullied agegr) ///
         (lfit bullied agegr)
  Could also have used age as continuous. The data are not shown as two rugs.
Ordered categories, trend
  Equal proportions?
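The test used on this slide is not preserved. As one possible approach, not necessarily the one shown in the course, tabulate gives an overall test of equal proportions and tabodds adds a score test for trend across the ordered groups. The data below are simulated with three hypothetical age groups.

```stata
* Simulated stand-in data: 3 ordered age groups, bullying declining with age (values are made up)
clear
set seed 2018
set obs 600
generate byte agegr = 1 + floor(3*runiform())
generate byte bullied = runiform() < (0.30 - 0.05*agegr)

* Overall test of equal proportions across age groups
tabulate bullied agegr, column chi2

* Odds of being bullied by age group, with a score test for trend
tabodds bullied agegr
```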
Table of tests
  Categorical ordered: use nonparametric tests
  Mann-Whitney U = Wilcoxon rank sum
  For matched case-control data:
    McNemar for 2*2 tables
    symmetry for K*K tables (outcome with more than 2 categories)
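For the matched case-control entries, a minimal sketch on simulated pairs follows. Each row is one matched pair, and the variable names case_exp and control_exp are hypothetical.

```stata
* Simulated stand-in data: one row per matched pair, 0/1 exposure for case and control
clear
set seed 2018
set obs 200
generate byte case_exp    = runiform() < 0.40
generate byte control_exp = runiform() < 0.25

* McNemar's test for a 2*2 table of matched pairs
mcc case_exp control_exp

* Symmetry test, which also handles K*K tables
symmetry case_exp control_exp
```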