Xuhua Xia: Fitting Several Regression Lines


Fitting Several Regression Lines
Many applications of statistical analysis involve a continuous variable as the dependent variable (DV) but both continuous and categorical variables as independent variables (IVs).
–If the relationship between the DV and the continuous IVs is linear and the slope is the same in every group: ANCOVA.
–If the slopes differ among groups: the full model.
An illustrative data set will make this clear.

Fitting Several Regression Lines
Muscle strength (MS) depends on the diameter of the muscle fiber (D) and the type of muscle (TM). Identify the DV and the IVs. How do we incorporate the qualitative variable into the model? With dummy variables.

TM  D  MS
A   1  11.5
A   2  13.8
A   3  14.4
A   4  16.8
A   5  18.7
B   1  10.8
B   2  12.3
B   3  13.7
B   4  14.2
B   5  16.6
C   1  13.1
C   2  16.2
C   3  19.0
C   4  22.9
C   5  26.5
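The same data can be entered directly in R; a minimal sketch (the file DiffSlopeMuscle.txt read later in the slides is assumed to contain exactly these values):

```r
# Sketch: the muscle data from the table above, entered directly in R
md <- data.frame(
  TM = factor(rep(c("A", "B", "C"), each = 5)),  # muscle type
  D  = rep(1:5, times = 3),                      # fiber diameter
  MS = c(11.5, 13.8, 14.4, 16.8, 18.7,           # muscle strength
         10.8, 12.3, 13.7, 14.2, 16.6,
         13.1, 16.2, 19.0, 22.9, 26.5)
)
str(md)  # 15 observations of 3 variables
```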

Two Scenarios
–Same intercept vs. different intercepts
–Different slopes: full model
–Same slope: ANCOVA

Two Scenarios
Same intercept, different slopes (full model):
  Y1 = a + b1X
  Y2 = a + b2X
  Y1 - Y2 = (b1 - b2)X   (multiplicative effect)
Different intercepts, same slope (ANCOVA):
  Y1 = a1 + bX
  Y2 = a2 + bX
  Y1 - Y2 = a1 - a2      (additive effect)
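The two scenarios correspond to two lm() formulas; a minimal sketch using the values from the data slide:

```r
# Sketch: the two scenarios as model formulas (data as on the data slide)
md <- data.frame(TM = factor(rep(c("A", "B", "C"), each = 5)),
                 D  = rep(1:5, times = 3),
                 MS = c(11.5, 13.8, 14.4, 16.8, 18.7,
                        10.8, 12.3, 13.7, 14.2, 16.6,
                        13.1, 16.2, 19.0, 22.9, 26.5))
fitFull   <- lm(MS ~ D * TM, data = md)  # different slopes: full model
fitANCOVA <- lm(MS ~ D + TM, data = md)  # same slope: ANCOVA
length(coef(fitFull))    # 6 parameters: an intercept and a slope per group
length(coef(fitANCOVA))  # 4 parameters: three intercepts, one common slope
```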

Plot of MS vs D by TM

Objectives
–Obtain a regression equation relating MS to D for each TM.
–Compare the mean MS for the three TMs at a given level of D.
–Is it meaningful to compare the mean MS for the three TMs without specifying the level of D?
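Comparing the three TMs at a given level of D amounts to predicting from the fitted full model at that D; a sketch (D = 3, the mean diameter, is chosen purely for illustration):

```r
# Sketch: predicted mean MS for each muscle type at the same diameter D = 3
md <- data.frame(TM = factor(rep(c("A", "B", "C"), each = 5)),
                 D  = rep(1:5, times = 3),
                 MS = c(11.5, 13.8, 14.4, 16.8, 18.7,
                        10.8, 12.3, 13.7, 14.2, 16.6,
                        13.1, 16.2, 19.0, 22.9, 26.5))
fit <- lm(MS ~ D * TM, data = md)
predict(fit, newdata = data.frame(D = 3, TM = c("A", "B", "C")))
```

Without fixing D, a comparison of mean MS across TMs confounds muscle type with fiber diameter.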

Explaining the R functions
Every 'factor' variable (TM in our case) used in lm model fitting creates k - 1 dummy variables:
  DUMA = 0 (not created; A is the reference level)
  DUMB = 1 if TM = B, 0 otherwise
  DUMC = 1 if TM = C, 0 otherwise
MS = α + β1·DUMB + β2·DUMC + β3·D + β4·DUMB·D + β5·DUMC·D + ε
summary() prints estimates of the model coefficients.
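R builds these dummy variables automatically when it constructs the model matrix; a minimal sketch showing the columns it creates for the full model:

```r
# Sketch: treatment coding of a 3-level factor in the full-model design matrix
TM <- factor(rep(c("A", "B", "C"), each = 5))
D  <- rep(1:5, times = 3)
X  <- model.matrix(~ D * TM)
colnames(X)  # "(Intercept)" "D" "TMB" "TMC" "D:TMB" "D:TMC"
```

TMB and TMC are exactly DUMB and DUMC above; no column is made for the reference level A.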

Illustration with EXCEL

MS    TM  D  DUMB  DUMC  DUMB·D  DUMC·D
11.5  A   1  0     0     0       0
13.8  A   2  0     0     0       0
14.4  A   3  0     0     0       0
16.8  A   4  0     0     0       0
18.7  A   5  0     0     0       0
10.8  B   1  1     0     1       0
12.3  B   2  1     0     2       0
13.7  B   3  1     0     3       0
14.2  B   4  1     0     4       0
16.6  B   5  1     0     5       0
13.1  C   1  0     1     0       1
16.2  C   2  0     1     0       2
19.0  C   3  0     1     0       3
22.9  C   4  0     1     0       4
26.5  C   5  0     1     0       5

R functions
md <- read.table("DiffSlopeMuscle.txt", header = TRUE)
attach(md)
minX <- min(D); maxX <- max(D)
minY <- min(MS); maxY <- max(MS)
plot(D[TM == "A"], MS[TM == "A"], xlab = "D", ylab = "MS",
     xlim = c(minX, maxX), ylim = c(minY, maxY), pch = 16)
points(D[TM == "B"], MS[TM == "B"], col = 'red', pch = 16)
points(D[TM == "C"], MS[TM == "C"], col = 'blue', pch = 16)
# Will ANOVA reveal a difference in D among the three muscle types?
fitANOVA <- aov(D ~ TM); anova(fitANOVA)
# No significant difference in D, so the three muscle types have similar
# fiber diameters. Given similar diameters, which type has the greatest MS?
fitANOVA <- aov(MS ~ TM); anova(fitANOVA)
# Check the plot for slope heterogeneity
# Explicit test of slope heterogeneity
fit <- lm(MS ~ D * TM)
anova(fit)
# Check the interaction for significance: if not significant, then do ANCOVA
fit <- lm(MS ~ D + TM)
anova(fit)

R Output
> anova(fit)
The Type I ANOVA table shows that D, TM, and the D:TM interaction are all highly significant (the interaction p-value is on the order of 1e-06): the slopes differ among the three muscle types.
> summary(fit)
With treatment coding (A as the reference level), the fitted full model is
MS = 9.82 + 1.74D - 0.35·DUMB - 0.33·DUMC - 0.39·DUMB·D + 1.61·DUMC·D
A: MS = 9.82 + 1.74D
B: MS = (9.82 - 0.35) + (1.74 - 0.39)D = 9.47 + 1.35D
C: MS = (9.82 - 0.33) + (1.74 + 1.61)D = 9.49 + 3.35D
It might help to show the regression with dummy variables in EXCEL.
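The per-group lines can be assembled directly from the coefficient vector; a sketch with the same data:

```r
# Sketch: recovering the three group-specific equations from coef(fit)
md <- data.frame(TM = factor(rep(c("A", "B", "C"), each = 5)),
                 D  = rep(1:5, times = 3),
                 MS = c(11.5, 13.8, 14.4, 16.8, 18.7,
                        10.8, 12.3, 13.7, 14.2, 16.6,
                        13.1, 16.2, 19.0, 22.9, 26.5))
fit <- lm(MS ~ D * TM, data = md)
b <- coef(fit)
# A is the reference level: its line is the intercept and the D slope
lineA <- c(b["(Intercept)"], b["D"])
# B and C add their offsets to the reference intercept and slope
lineB <- c(b["(Intercept)"] + b["TMB"], b["D"] + b["D:TMB"])
lineC <- c(b["(Intercept)"] + b["TMC"], b["D"] + b["D:TMC"])
rbind(A = lineA, B = lineB, C = lineC)  # intercept and slope per group
```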

Type I and Type III SS
> anova(fit) reports Type I (sequential) SS and F-tests: each term is tested after the terms that precede it in the formula (D first, then TM adjusted for D, then D:TM adjusted for both). All three terms are highly significant.
> drop1(fit, ~., test = "F") reports Type III (marginal) SS and F-tests for the model MS ~ D * TM: each term is tested after all the other terms. Here D and the D:TM interaction remain highly significant.
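The distinction can be seen directly: with the interaction in the model, D's F-statistic differs between the two tables; a sketch:

```r
# Sketch: Type I (sequential) vs Type III-style (marginal) tests for D
md <- data.frame(TM = factor(rep(c("A", "B", "C"), each = 5)),
                 D  = rep(1:5, times = 3),
                 MS = c(11.5, 13.8, 14.4, 16.8, 18.7,
                        10.8, 12.3, 13.7, 14.2, 16.6,
                        13.1, 16.2, 19.0, 22.9, 26.5))
fit <- lm(MS ~ D * TM, data = md)
typeI   <- anova(fit)                   # D tested first, ignoring TM and D:TM
typeIII <- drop1(fit, ~ ., test = "F")  # D tested after TM and D:TM
typeI["D", "F value"]
typeIII["D", "F value"]  # smaller: with treatment coding this tests only the
                         # slope of the reference group A
```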

R functions
nd1 <- subset(md, subset = (TM == "A"))
nd2 <- subset(md, subset = (TM == "B"))
nd3 <- subset(md, subset = (TM == "C"))
nd1 <- nd1[order(nd1$D), ]
nd2 <- nd2[order(nd2$D), ]
nd3 <- nd3[order(nd3$D), ]
# Columns of each y: fit, lwr, upr (the 95% confidence interval)
y1 <- predict(fit, nd1, interval = "confidence")
y2 <- predict(fit, nd2, interval = "confidence")
y3 <- predict(fit, nd3, interval = "confidence")
par(mfrow = c(1, 3))
# Call plot() before lines()
plot(D[TM == "A"], MS[TM == "A"], xlab = "D", ylab = "MS",
     xlim = c(minX, maxX), ylim = c(minY, maxY), pch = 16)
points(D[TM == "B"], MS[TM == "B"], col = 'red', pch = 16)
points(D[TM == "C"], MS[TM == "C"], col = 'blue', pch = 16)
lines(nd1$D, y1[, 1], col = "black")
lines(nd1$D, y1[, 2], col = "black")
lines(nd1$D, y1[, 3], col = "black")
lines(nd2$D, y2[, 1], col = "red")
lines(nd2$D, y2[, 2], col = "red")
lines(nd2$D, y2[, 3], col = "red")
lines(nd3$D, y3[, 1], col = "blue")
lines(nd3$D, y3[, 2], col = "blue")
lines(nd3$D, y3[, 3], col = "blue")

95% CI plots