Stata 9, Summing up.


Why Stata

Pro
  Aimed at epidemiology
  Many methods, growing
  Graphics
  Structured, programmable
  Coming soon to a course near you
Con
  Memory > file size (the dataset must fit in memory)
  Copy table

Notes: Used by leading universities and at many summer schools.

01.01.2019 H.S.

Use

Import data
  DBMS-Copy
Do files
  Highlight commands, Ctrl-D
Full syntax
  [by varlist:] command [varlist] [if exp] [in range] [, opts]
  list if age<50
  list in 1/10
  regress y x1 x2 if deltabeta<0.3

Notes: Show open do files, copy from menu. list if x<100 and list in 1/10: may use both. Regress … if db<0.3. Options are often advanced statistics.

Data check

  describe                            describe dataset
  summarize                           means ++
  list x1 x2 in 1/10                  list first 10 obs
  gen id=_n                           generate id
  numlabel x1 x2, add                 add value to label
  tab x1 x2, mis                      x1 by x2 including missing
  list id x1 if (x2==.)+(x3==.)==1    list if exactly one of x2, x3 is missing
  egen miss=rowmiss(x1 x2 x3)         number missing
  tab miss                            from 0-3 missing
  drop if x1<0                        drop observations with negative x1
  drop if x1>100 & x1<.               drop observations with large x1

Notes: numlabel: 3 categories, add coding to label. List if one of two variables is missing.
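The data-check steps above can be tried end to end on a small example. This is a minimal sketch, not the presenter's own program: it assumes the `auto` dataset that ships with Stata as a stand-in for real data, and the cut-off of 15000 is an arbitrary illustrative value.

```stata
* Sketch only: the shipped auto dataset stands in for real data
sysuse auto, clear
describe
summarize price mpg rep78

* Row identifier, then the number of missing values per observation
generate id = _n
egen miss = rowmiss(price mpg rep78)
tabulate miss

* List observations with at least one missing value
list id make rep78 if miss > 0

* Drop observations (not variables) with large values;
* "& price < ." keeps missing values from counting as large
drop if price > 15000 & price < .
```

Because Stata treats missing values as larger than any number, the `& price < .` condition is what stops observations with missing `price` from being dropped along with the large ones.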

Graphics

Explore data
  kdensity y                          distribution
  scatter y x                         scatter
  twoway (scatter y x)(lfit y x)      scatter + line
Plot means
  graph bar (mean) y1 y2              mean of y1 and y2
  graph bar (mean) y, over(c)         mean y for values of c
Plot means using aggregate and twoway
  preserve
  collapse (mean) ym=y, by(c)         one line per c value
  line ym c                           line plot of mean(y) by c
  restore

Notes: (mean) may use median, count, p25, … Advanced: save results in macros, use in plot. The aggregate plot is the same as bar over(c). Aggregate data: collapse gives means, contract gives frequencies. May add sd and count to collapse to get the CI of the mean, se(mean)=sd/sqrt(count).
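The note on adding sd and count to collapse can be sketched as follows. This is an illustration under assumptions: it uses the shipped `auto` dataset, with `mpg` as the outcome and `rep78` as the grouping variable, and the names `ym`, `ysd`, `yn`, `se`, `ci_lo` and `ci_hi` are made up for the example.

```stata
* Sketch only: mean mpg with 95% CI by repair record (auto dataset)
sysuse auto, clear
preserve

* Keep only observations with a known group
drop if missing(rep78)

* One row per group: mean, sd and count of mpg
collapse (mean) ym=mpg (sd) ysd=mpg (count) yn=mpg, by(rep78)

* se(mean) = sd/sqrt(count); CI from the t distribution
generate se = ysd/sqrt(yn)
generate ci_lo = ym - invttail(yn-1, 0.025)*se
generate ci_hi = ym + invttail(yn-1, 0.025)*se

* Means joined by a line, with capped CI spikes
twoway (rcap ci_lo ci_hi rep78) (connected ym rep78)

restore
```

Wrapping the steps in `preserve` and `restore` brings back the original data after the plot, since `collapse` replaces the dataset in memory.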

Help

General
  help command
  search keyword
  findit keyword
Examples
  help table
  search GAM
  findit GAM

Notes: findit key = search key, all. rc = return code.

Continuous symmetrical data

Univariate
  kdensity y                          distribution
  summarize y                         means ++
Bivariate
  sdtest y, by(sex)                   equal variance?
  ttest y, by(sex)                    equal means?
  oneway y parity3, tabulate          equal means?
Multivariable
  regress y x1 x2                     linear regression
  dfbeta

Notes: Bivariate: 2 groups, 3+ groups.

Some options

  mean y                              mean + ci
  mean y, cluster(region)
  mean y, standardize
  mean y, bootstrap

Continuous skewed data

Univariate
  kdensity y                          distribution
  summarize y, detail                 medians ++
Bivariate
  table sex, c(median y)              medians
  ranksum y, by(sex)                  equal medians?
  kwallis y, by(age3)                 equal medians?
Multivariable
  regress y x1 x2                     linear regression
  dfbeta

Notes: 2 groups: medians + ci; cci (conservative ci, broader) does not assume normality. Mann-Whitney U test = Wilcoxon rank sum. 3+ groups: Kruskal-Wallis. Linear regression, but look at influence.

Categorical data

Univariate
  tabulate y                          freq table
  proportion y                        prop with ci
Bivariate
  tabulate y sex, col chi2            column %, chi-square
Multivariable
  logistic y x1 x2                    logistic regression
  binreg y x1 x2, rd                  risk difference

Survival data

Set
  stset time, failure(status==1)
Univariate
  sts graph, fail gwood               KM failure + ci
Bivariate
  sts graph, fail by(x1)              KM failure
  sts test x1                         log rank test
Multivariable
  stcox x1 x2                         Cox regression

Notes: Analyse time to event; not fully observed. Two variables: time and failure. st = survival time. gwood = Greenwood ci.

Model building

Estimate
  regress y exp                       exposure only
  est store m1                        store
  regress y exp x1 x2                 exposure + conf.
  est store m2                        store
Compare
  est table m1 m2                     confounding?
  est stat m1 m2                      model fit
Interaction
  regress y exp x1 x1exp              with interaction term
  lincom exp+2*x1exp                  effect of exp for x1=2

Notes: lincom = linear combination with ci.
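The estimate, compare and interaction steps above can be sketched on a concrete example. This is an illustration only, assuming the shipped `auto` dataset, with `foreign` playing the role of the exposure and weight the role of x1; the variables `wt1000` and `wtfor` are hypothetical names created for the example.

```stata
* Sketch only: model building on the auto dataset
sysuse auto, clear

* Exposure only, then exposure plus a potential confounder
regress price foreign
estimates store m1
regress price foreign weight
estimates store m2

* Side-by-side coefficients (confounding?) and AIC/BIC (model fit)
estimates table m1 m2
estimates stats m1 m2

* Interaction term built by hand, as on the slide
generate wt1000 = weight/1000
generate wtfor = wt1000*foreign
regress price foreign wt1000 wtfor

* Effect of foreign at wt1000 = 2 (2000 lb), with CI
lincom foreign + 2*wtfor
```

Rescaling weight to thousands of pounds keeps the multiplier in `lincom` small and readable, matching the slide's `exp+2*x1exp` pattern.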

Model testing

Assumptions
  Independent errors                  discuss
  Linear effects                      categorize, plot coefs
  Constant error                      plot resid (linear mod)
Influence
  Influential points                  plot delta-beta

Notes: Linear effect on g(y). Prop hazards: plot Schoenfeld resid.

Some of Stata's regression commands

Regression with simple error structure
  regress      linear regression (also heteroscedastic errors)
  nl           nonlinear least squares
GLM
  logistic     logistic regression
  poisson      Poisson regression
  binreg       binary outcome; OR, RR, or RD effect measures
Conditional logistic
  clogit       for matched case-control data
Multiple outcome
  mlogit       multinomial logit (not ordered)
  ologit       ordered logit
Regression with complex error structure
  xtmixed      linear mixed models
  xtlogit      random effect logistic

Notes: glm for other link-distribution combinations.

GLLAMM
Generalized Linear Latent And Mixed Models

Response types
  continuous
  ordered and unordered categories
  counts
  survival
Model types
  Generalized Linear Models (GLM)
  Structural Equation Models (SEM)
  Mixed Models
  Measurement Error models

Notes: SEM: intermediate variables. Mixed: hierarchical data, repeated measurements. Have special data? Find appropriate tools in Stata or in added programs.
