1
Stata 9, Summing up
2
Why Stata
Pro
  Aimed at epidemiology
  Many methods, growing
  Graphics
  Structured, programmable
  Coming soon to a course near you
Con
  Memory > file size
  Copy table
Notes: Used by leading universities and at many summer schools.
H.S.
3
Use
Import data
  DBMS-Copy
Do files
  Highlight commands, Ctrl-D
Full syntax
  [by varlist:] command [varlist] [if exp] [in range] [, opts]
  list if age<50
  list in 1/10
  regress y x1 x2 if deltabeta<0.3
Notes: Show open do files, copy from menu. "list if x<100" and "list in 1/10": may use both. regress … if db<0.3. Options: often advanced statistics.
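The pieces of the full syntax can be tried one at a time. A minimal sketch, assuming the auto dataset that ships with Stata as a stand-in for the course data; the variables (make, price, mpg, weight, foreign) and the cut-offs are illustrative and not taken from the slides.

```stata
* Sketch only: the shipped auto dataset stands in for the course data
sysuse auto, clear

* [if exp]: restrict to observations that meet a condition
list make mpg if mpg > 30

* [in range]: restrict to a range of observation numbers
list make price in 1/10

* if and in may be combined
list make price if foreign == 1 in 1/60

* [by varlist:] repeats the command within each group; sort orders the data first
by foreign, sort: summarize price

* [, opts]: options follow a comma; level(90) requests 90% confidence intervals
regress price mpg weight if price < 10000, level(90)
```

Running the lines from a do file (highlight, then Ctrl-D) keeps a record of what was done.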
4
Data check
  describe                            describe dataset
  summarize                           means ++
  list x1 x2 in 1/10                  list first 10 obs
  gen id=_n                           generate id
  numlabel x1 x2, add                 add value to label
  tab x1 x2, mis                      x1 by x2 including missing
  list id x1 if (x2==.)+(x3==.)==1    list if exactly one of x2, x3 is missing
  egen miss=rowmiss(x1 x2 x3)         number missing
  tab miss                            from 0-3 missing
  drop if x1<0                        drop obs with negative x1
  drop if x1>100 & x1<.               drop obs with large x1 (missing counts as large, so exclude it)
Notes: numlabel: for a variable with 3 categories, adds the coding to the label. The list example lists observations where one of two variables is missing. drop takes either a varlist or an if condition, not both.
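The missing-value checks can be sketched on real data. This example assumes the shipped auto dataset, where rep78 has some missing values; the variable choices and the cut-off of 4 are illustrative only. Note that numlabel expects the name of a value label (here origin, the label attached to foreign), which need not match the variable name.

```stata
* Sketch only: auto dataset as a stand-in; rep78 contains missing values
sysuse auto, clear
describe
summarize

generate id = _n                 // running observation number
numlabel origin, add             // origin is the value label of foreign
tabulate rep78 foreign, missing  // cross table including missing

* Number of missing values per observation across three variables
egen miss = rowmiss(rep78 price mpg)
tabulate miss
list id make rep78 if missing(rep78)

* Missing sorts above every number, so "rep78 > 4" alone also matches missing
count if rep78 > 4
count if rep78 > 4 & rep78 < .

* drop if removes whole observations, not single values
drop if rep78 > 4 & rep78 < .
```

If only the offending value should go, replace rep78 = . if ... keeps the observation and sets the value to missing.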
5
Graphics
Explore data
  kdensity y                          distribution
  scatter y x                         scatter
  twoway (scatter y x)(lfit y x)      scatter+line
Plot means
  graph bar (mean) y1 y2              mean of y1 and y2
  graph bar (mean) y, over(c)         mean y for values of c
Plot means using aggregate and twoway
  preserve
  collapse (mean) ym=y, by(c)         one line per c value
  line ym c                           line plot of mean(y) by c
  restore
Notes: (mean) may be replaced by median, count, p25, … Advanced: save results in macros, use in plot. The collapse approach gives the same means as bar over(c). Aggregate data: collapse gives means, contract gives frequencies. May add sd and count to collapse to get the CI of the mean, se(mean)=sd/sqrt(count).
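The remark about adding sd and count to collapse can be sketched as follows. It assumes the shipped auto dataset, with price as y and rep78 as the grouping variable c, both chosen for illustration. The interval uses a t quantile, which is one common choice and not something the slides specify.

```stata
* Sketch only: price plays the role of y, rep78 the role of c
sysuse auto, clear

preserve                          // keep a copy of the full data
drop if missing(rep78)            // avoid a separate group for missing c

* One line per value of c, holding mean, sd and count
collapse (mean) ym=price (sd) ysd=price (count) n=price, by(rep78)

generate se = ysd/sqrt(n)                      // se(mean) = sd/sqrt(count)
generate lo = ym - invttail(n - 1, 0.025)*se   // lower 95% limit (t based)
generate hi = ym + invttail(n - 1, 0.025)*se   // upper 95% limit (t based)
list rep78 n ym lo hi

* Means joined by a line, with capped spikes for the intervals
twoway (rcap lo hi rep78) (line ym rep78)

restore                           // back to the original data
```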
6
Help
  General              Examples
  help command         help table
  search keyword       search GAM
  findit keyword       findit GAM
Notes: findit key = search key, all. rc = return code.
7
Continuous symmetrical data
Univariate
  kdensity y                     distribution
  summarize y                    means ++
Bivariate
  sdtest y, by(sex)              equal variance?
  ttest y, by(sex)               equal means?
  oneway y parity3, tabulate     equal means?
Multivariable
  regress y x1 x2                linear regression
  dfbeta
Notes: Bivariate: sdtest and ttest for 2 groups, oneway for 3+ groups.
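A run through the same sequence, as a sketch on the shipped auto dataset. Here price stands in for y, foreign for the two-group variable and rep78 for the 3+ group variable; none of these come from the slides. The unequal option is shown as one possible follow-up when sdtest suggests unequal variances.

```stata
* Sketch only: price = y, foreign = 2 groups, rep78 = 3+ groups
sysuse auto, clear

* Univariate
kdensity price
summarize price

* Bivariate, 2 groups
sdtest price, by(foreign)          // equal variances?
ttest price, by(foreign)           // equal means, assuming equal variances
ttest price, by(foreign) unequal   // same test without that assumption

* Bivariate, 3+ groups
oneway price rep78, tabulate

* Multivariable, followed by delta-betas for each covariate
regress price mpg weight
dfbeta
```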
8
Some options
  mean y                       mean+ci
  mean y, cluster(region)
  mean y, standardize
  mean y, bootstrap
9
Continuous skewed data
Univariate
  kdensity y                   distribution
  summarize y, detail          medians ++
Bivariate
  table sex, c(median y)       medians
  ranksum y, by(sex)           equal medians?
  kwallis y, by(age3)          equal medians?
Multivariable
  regress y x1 x2              linear regression
  dfbeta
Notes: 2 groups: medians+ci, cci (conservative, broader ci, not assuming normality); Mann-Whitney U test = Wilcoxon rank sum. 3+ groups: Kruskal-Wallis. Linear regression, but look at influence.
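A sketch of the rank-based sequence on the shipped auto dataset, with price as the skewed outcome. The note on "medians+ci, cci" is read here as pointing to the centile command with its cci option; that reading is an assumption. tabstat is used for the group medians because the table syntax differs between Stata versions.

```stata
* Sketch only: price = skewed y, foreign = 2 groups, rep78 = 3+ groups
sysuse auto, clear

* Univariate
kdensity price
summarize price, detail

* Median with a conservative confidence interval (no normality assumed)
centile price, centile(50) cci

* Medians by group
tabstat price, by(foreign) statistics(median n)

* 2 groups: Wilcoxon rank sum (Mann-Whitney U)
ranksum price, by(foreign)

* 3+ groups: Kruskal-Wallis
kwallis price, by(rep78)
```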
10
Categorical data
Univariate
  tabulate y                   freq table
  proportion y                 prop with ci
Bivariate
  tabulate y sex, col chi2     column %, chi-square
Multivariable
  logistic y x1 x2             logistic regression
  binreg y x1 x2, rd           risk difference
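The categorical sequence can be sketched on the shipped auto dataset, treating foreign as the binary outcome. The covariates are illustrative only. binreg is left out of the sketch because whether a risk-difference model converges depends on the data.

```stata
* Sketch only: foreign = binary y, rep78 = second categorical variable
sysuse auto, clear

* Univariate
tabulate foreign
proportion foreign                 // proportions with confidence intervals

* Bivariate: column percentages and chi-square test
tabulate rep78 foreign, col chi2

* Multivariable: odds ratios from logistic regression
logistic foreign mpg weight
```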
11
Survival data
Set
  stset time, failure(status==1)
Univariate
  sts graph, fail gwood        KM failure+ci
Bivariate
  sts graph, fail by(x1)       KM failure
  sts test x1                  log rank test
Multivariable
  stcox x1 x2                  Cox regression
Notes: Analyse time to event, which is not fully observed. Two variables: time and failure. st = survival time. gwood = Greenwood ci.
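A sketch of the same steps on the cancer dataset that ships with Stata, assuming its variables studytime (time), died (failure indicator), drug and age. The derived variable treated is hypothetical and relies on drug==1 coding placebo, as the dataset's labels indicate.

```stata
* Sketch only: shipped cancer dataset as a stand-in for the course data
sysuse cancer, clear

* Declare the two survival variables: time and failure
stset studytime, failure(died)

* Hypothetical two-group variable: any active drug versus placebo
generate byte treated = drug > 1

* Kaplan-Meier failure curves by group, then the log-rank test
sts graph, failure by(treated)
sts test treated

* Cox regression: hazard ratios for treatment and age
stcox treated age
```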
12
Model building
Estimate
  regress y exp                exposure only
  est store m1                 store
  regress y exp x1 x2          exposure + conf.
  est store m2                 store
Compare
  est table m1 m2              confounding?
  est stat m1 m2               model fit
Interaction
  regress y exp x1 x1exp       with interaction term
  lincom exp+2*x1exp           effect of exp for x1=2
Notes: lincom = linear combination with ci.
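The estimate, compare and interaction steps can be sketched on the shipped auto dataset. Here foreign plays the exposure and weight the confounder; wt1000 and forwt are hypothetical helper variables, and the value 3 in lincom is an arbitrary illustration.

```stata
* Sketch only: foreign = exposure, weight = confounder
sysuse auto, clear

* Exposure only, then exposure plus confounder
regress price foreign
estimates store m1
regress price foreign weight
estimates store m2

* Does the exposure coefficient change when the confounder is added?
estimates table m1 m2, b(%9.1f) se
estimates stats m1 m2              // AIC and BIC for model fit

* Interaction: the product term is built by hand
generate wt1000 = weight/1000      // weight in units of 1000 lbs
generate forwt = foreign*wt1000
regress price foreign wt1000 forwt

* Effect of the exposure at wt1000 = 3, with confidence interval
lincom foreign + 3*forwt
```

The same pattern gives the exposure effect at any other value of the modifier by changing the multiplier in lincom.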
13
Model testing
Assumptions
  Independent errors           discuss
  Linear effects               categorize, plot coefs
  Constant error               plot resid (linear mod)
Influence
  Influential points           plot delta-beta
Notes: Linear effect on g(y). Prop hazards: plot Schoenfeld resid.
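For a linear model, the residual and delta-beta plots can be sketched as follows, assuming the shipped auto dataset. The variable names fitted, resid and db_weight are hypothetical. predict with dfbeta() is used so that the name of the delta-beta variable is chosen explicitly.

```stata
* Sketch only: checks after a linear regression on the auto dataset
sysuse auto, clear
generate id = _n
regress price mpg weight

* Constant error: residuals against fitted values should show no pattern
predict fitted, xb
predict resid, residuals
scatter resid fitted, yline(0)

* Influential points: delta-beta for weight, one value per observation
predict db_weight, dfbeta(weight)
scatter db_weight id, yline(0)
```

Observations that stand far from zero in the second plot move the weight coefficient the most and are worth a closer look.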
14
Regression with simple error structure
  regress      linear regression (also heteroscedastic errors)
  nl           nonlinear least squares
GLM
  logistic     logistic regression
  poisson      Poisson regression
  binreg       binary outcome; OR, RR, or RD effect measures
Conditional logistic
  clogit       for matched case-control data
Multiple outcome
  mlogit       multinomial logit (not ordered)
  ologit       ordered logit
Regression with complex error structure
  xtmixed      linear mixed models
  xtlogit      random effect logistic
Notes: Some of Stata’s regression commands. glm for other link-distribution combinations.
15
GLLAMM: Generalized Linear Latent And Mixed Models
Response types
  continuous
  ordered and unordered categories
  counts
  survival
Model types
  Generalized Linear Models (GLM)
  Structural Equation Models (SEM)
  Mixed Models
  Measurement Error models
Notes: SEM: intermediate variables. Mixed: hierarchical data, repeated measurements. If you have special data, find appropriate tools in Stata or in added programs.