Jul-15H.S.1 Linear Regression Hein Stigum Presentation, data and programs at:


CONCEPTS: Linear regression

Outcome and regression types
Numerical data
  – Discrete (number of partners): Poisson regression
  – Continuous (weight): Linear regression
Categorical data
  – Nominal (disease/no disease): Logistic regression
  – Ordinal (small/medium/large): Ordinal regression
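A minimal Stata sketch of how each outcome type could map to an estimation command. Only weight and gest are variable names taken from the slides; partners, disease, size and exposure are hypothetical names used for illustration.

```stata
* Hypothetical outcome and covariate names, for illustration only
* Discrete count outcome
poisson partners exposure
* Continuous outcome
regress weight gest
* Two-category (nominal) outcome, reported as odds ratios
logistic disease exposure
* Ordered categories
ologit size exposure
```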

Regression idea

Measures and Assumptions
Adjusted effects
  – b1 is the increase in weight per day of gestational age
  – b1 is adjusted for the covariate whose coefficient is b2
Assumptions
  – Independent errors
  – Linear effects
  – Constant error variance
Robustness
  – Influence
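Written out, the model behind these measures could look as follows. This is a sketch under assumptions: y is birth weight and x_1 is gestational age in days, as on the slide, while x_2 stands for one further covariate that the slide does not name, and e is the error term.

```latex
\[
y_i = b_0 + b_1 x_{1i} + b_2 x_{2i} + e_i ,
\qquad e_i \ \text{independent},\quad
\mathrm{E}(e_i) = 0,\quad \mathrm{Var}(e_i) = \sigma^2
\]
```

Holding x_2 fixed, a one-day increase in x_1 changes the expected outcome by b_1, which is what "adjusted" means here.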

Workflow
DAG
Plots: distribution and scatter
Bivariate analysis
Regression
  – Model estimation
  – Test of assumptions
      Independent errors (discuss)
      Linear effects (plot)
      Constant error variance (plot)
  – Robustness
      Influence

ANALYSIS: Continuous outcome, linear regression, birth weight

DAGs
Nodes in the diagram:
  – E: gestational age
  – D: birth weight
  – C1: sex
  – C2: parity
Associations: bivariate (unadjusted)
Causal effects: multivariable (adjusted)
Draw your assumptions before your conclusions

Plot outcome by exposure
Be clear on the research question:
  – Overall birth weight: linear regression
  – Low birth weight: logistic regression
  – Linear and logistic can give opposite results
Effects on linear regression:
  – May lead to non-constant error variance
  – May have highly influential outliers

Plot outcome by exposure, cont.
Linear effects? Yes
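The two plot slides could be reproduced along these lines. This is a sketch that assumes the birth data are in memory with the variables weight and gest named as on the later slides; the lowess smoother is an added aid for judging linearity, not something the slides specify.

```stata
* Distribution of the outcome, with a normal curve overlaid
histogram weight, normal
* Outcome by exposure, with a smoothed trend to judge linearity
twoway (scatter weight gest) (lowess weight gest)
```

A roughly straight smoothed curve supports the "Linear effects? Yes" reading; a fan-shaped scatter or isolated points hint at the variance and influence problems listed above.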

Bivariate analysis
Outcome: birth weight
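One way the unadjusted (bivariate) associations might be obtained in Stata. The slide shows only the results, so the choice of commands is an assumption; the variable names weight, gest, sex and parity follow the slides.

```stata
* Unadjusted slope per day of gestational age
regress weight gest
* Unadjusted difference between the two sexes
ttest weight, by(sex)
* Mean birth weight for each value of parity
tabulate parity, summarize(weight)
```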

REGRESSION: Continuous outcome, linear regression, birth weight

Categorical covariates
2 categories
  – OK, but know the coding
3+ categories
  – Use “dummies”
  – “Dummies” are 0/1 variables used to create contrasts
  – Want 3 categories for parity: 0, 1 and 2-7 children
  – Choose 0 as reference
  – Make dummies for the two other categories:
      generate Parity1 = (parity==1) if parity<.
      generate Parity2_7 = (parity>=2) if parity<.
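The "if parity<." condition matters because Stata treats a missing value as larger than any number. A small self-contained sketch with made-up parity values (not the course data) shows the difference; the variable wrong is a hypothetical name for the dummy created without the condition.

```stata
* Made-up parity values, including one missing, for illustration only
clear
input parity
0
1
3
.
end
* With the condition, a missing parity gives a missing dummy
generate Parity2_7 = (parity >= 2) if parity < .
* Without it, the missing parity is silently coded as 1
generate wrong = (parity >= 2)
list
```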

Model estimation
Syntax:
    regress weight gest sex Parity1 Parity2_7
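As an alternative to hand-made dummies, Stata's factor-variable notation (Stata 11 or later) can create the contrasts at estimation time. This is a sketch, not the method used on the slides; parity3 is a hypothetical name for the three-category version of parity.

```stata
* Collapse parity into 0, 1 and 2-7 children (parity3 is a hypothetical name)
recode parity (0=0) (1=1) (2/7=2), generate(parity3)
* ib0. makes category 0 the reference, as chosen on the slide
regress weight gest sex ib0.parity3
```

The coefficients for categories 1 and 2 of parity3 should match those of Parity1 and Parity2_7 in the dummy-variable model.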

Create meaningful constant
The constant is the expected birth weight at:
  – gest=0, sex=0, parity=0 (as estimated, not meaningful)
  – gest=280, sex=1, parity=0 (meaningful reference)
Alternative: center variables
    gen gest280=gest-280
  gest280 has a meaningful zero at 280 days
    gen sex0=sex-1
  sex0 has a meaningful zero at boys
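A sketch of how the centered variables might be used, assuming gest280, sex0 and the parity dummies have been created as on the slides. The lincom line is an added cross-check, not part of the slides: it computes the same expected birth weight from the uncentered model.

```stata
* Centered model: _cons is the expected weight at 280 days, boy, parity 0
regress weight gest280 sex0 Parity1 Parity2_7
* Cross-check from the uncentered model
regress weight gest sex Parity1 Parity2_7
lincom _cons + 280*gest + 1*sex
```

The slopes are identical in both models; only the constant changes, because centering moves the point at which all covariates equal zero.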

Model results

Test of assumptions
Independent residuals?
  – Discuss
Linear effects? Constant variance?
  – Plot residuals versus predicted y
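The residual plot could be produced with Stata's postestimation commands, as sketched below under the assumption that the variables from the slides are in memory. The plot against gest and the formal test are optional additions that the slide does not mention.

```stata
regress weight gest sex Parity1 Parity2_7
* Residuals versus predicted (fitted) values
rvfplot, yline(0)
* Residuals versus one covariate, to look for curvature
rvpplot gest, yline(0)
* Breusch-Pagan / Cook-Weisberg test for non-constant variance
estat hettest
```

A band of residuals of even width around zero supports both linear effects and constant variance; curvature or a funnel shape points to a violation.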

Violations of assumptions
Dependent residuals
  – Use linear mixed models
Non-linear effects
  – Add square term
  – Or use piecewise linear
Non-constant variance
  – Use robust variance estimation
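Each remedy has a direct counterpart in Stata. The following is a sketch under assumptions: motherid is a hypothetical cluster identifier, the knot at 259 days is an arbitrary illustrative choice, and mixed requires Stata 13 or later (xtmixed in earlier versions).

```stata
* Dependent residuals: random intercept per mother (motherid is hypothetical)
mixed weight gest sex Parity1 Parity2_7 || motherid:
* Non-linear effect: add a square term for gestational age
regress weight c.gest##c.gest sex Parity1 Parity2_7
* Non-linear effect: piecewise linear with one knot (259 is illustrative)
mkspline gest_lo 259 gest_hi = gest
regress weight gest_lo gest_hi sex Parity1 Parity2_7
* Non-constant variance: robust variance estimation
regress weight gest sex Parity1 Parity2_7, vce(robust)
```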

Influence

Measures of influence
Measure change in:
  – Predicted outcome
  – Deviance
  – Coefficients (beta)
Delta beta
  – Remove obs 1, see change
  – Remove obs 2, see change

Delta beta for gestational age
If obs nr 539 is removed, beta will change from 6 to 16
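A sketch of how a delta-beta plot for gestational age might be made in Stata. Note that Stata's dfbeta statistic is standardized (the change in the coefficient divided by its standard error), so it flags influential observations but does not show the raw change from 6 to 16. The names dbgest and obsnr are hypothetical, and obsnr equals the slide's observation number only if the data are in the same order.

```stata
regress weight gest sex Parity1 Parity2_7
* Standardized delta beta for the gest coefficient, one value per observation
predict dbgest, dfbeta(gest)
* Observation counter used to label the points
generate obsnr = _n
scatter dbgest obsnr, mlabel(obsnr)
```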

Removing outlier
Results shown for: full data, and outlier removed (final model)
One outlier affected two estimates
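The side-by-side comparison could be produced as sketched below. It assumes that observation 539 on the slide is row 539 of the data in memory; full and reduced are hypothetical names for the stored results.

```stata
* Full data
regress weight gest sex Parity1 Parity2_7
estimates store full
* Same model without observation 539
regress weight gest sex Parity1 Parity2_7 if _n != 539
estimates store reduced
* Coefficients and standard errors side by side
estimates table full reduced, b(%9.2f) se(%9.2f)
```

Comparing the two columns shows the raw change in each coefficient, which is the unstandardized delta beta described on the earlier slide.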

Summing up
DAGs
  – Guide analysis
Plots
  – Unequal variance, non-linearity, outliers
Bivariate analysis
Linear regression
  – Fit model
  – Check assumptions
  – Check robustness
  – Make meaningful constant