Published by Bruce Hunt. Modified over 9 years ago.
1
Creating Graphs on Saturn
  GOPTIONS DEVICE=png HTITLE=2 HTEXT=1.5 GSFMODE=replace;
  PROC REG DATA=agebp;
    MODEL sbp = age;
    PLOT sbp*age;
  RUN;
This will create the file sasgraph.png.
1. Transfer the file to the PC (binary mode)
2. Open Word
3. Choose Insert picture from file
  /* LP (LINEPRINTER) requests a text plot in the listing instead of a graphics file */
  PROC REG DATA=agebp LP;
    MODEL sbp = age;
    PLOT sbp*age;
  RUN;
2
Multiple Linear Regression
More than 1 independent variable
– See how combinations of several variables are associated with and can predict the dependent variable. How much of the total variability can be explained?
– Control for confounding (interested in the effect of one variable but want to "adjust" for another variable)
– Explore interactions
  PROC REG DATA=datasetname;
    MODEL depvar = x1;
    MODEL depvar = x1 x2;
    MODEL depvar = x1 x2 x3;
  RUN;
3
Questions Explored Using Multiple Regression
– How much of the variation in test scores among school districts can be explained by several district characteristics?
– Is calcium intake related to BP independent of age?
– Is the relationship between age and BP the same for men and women?
4
Reminder
– The Y variable is continuous and is normally distributed, with the same variability, for each combination of X’s
– The X variables can be continuous or indicator variables and do not need to be normally distributed
5
2 Factors
1. $Y = \beta_0 + \beta_1 X_1$
2. $Y = \beta_0 + \beta_2 X_2$
3. $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$
Do you get the same slope in models 1 and 3?
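In general the slopes differ. A standard least-squares identity, not stated on the slide, links the estimated slope of $X_1$ in model 1 to the estimates from model 3. The superscripts and the symbol $d$ are notation introduced here for illustration only.

```latex
b_1^{(1)} = b_1^{(3)} + b_2^{(3)}\, d,
\qquad
d = \text{least-squares slope from regressing } X_2 \text{ on } X_1
```

Under this identity the two slopes for $X_1$ agree only when $b_2^{(3)} = 0$ or $d = 0$, that is, when $X_2$ adds nothing to the fit or $X_1$ and $X_2$ are uncorrelated in the sample.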
6
Control for confounding
– The simple linear regression (SLR) model within each cohort is significant
– The overall (pooled) model is not significant (negative confounding)
7
Multiple Regression Equation
– The equation that describes how the mean value of y is related to $x_1, x_2, \dots, x_p$:
  $\mu_y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p$
– $\beta_0$ = mean of y when all x variables are equal to 0
– $\beta_i$ = change in mean y corresponding to a 1-unit change in $x_i$, holding all other predictors fixed
– Implied: the impact of $x_1$ is the same at each value of $x_2, x_3, \dots, x_p$
8
Multiple Regression Model
– The equation that describes how the dependent variable y is related to the independent variables $x_1, x_2, \dots, x_p$ and an error term is called the multiple regression model:
  $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon$
– $\varepsilon$ reflects how individuals deviate from others with the same values of the x’s
9
Estimated Multiple Regression Equation
– The estimated multiple regression equation is:
  $\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$
– $b_i$ estimates $\beta_i$
– $\hat{y}$ is the estimated (or predicted) value of y for a set of x’s
10
Estimation
– Least Squares Criterion: choose $b_0, b_1, \dots, b_p$ to minimize $\sum (y_i - \hat{y}_i)^2$
– Computation of Coefficient Values: the formulas for the regression coefficients $b_0, b_1, b_2, \dots, b_p$ involve the use of matrix algebra. We will use SAS to perform the calculations.
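For reference, the matrix form that SAS evaluates can be written compactly. This is the textbook closed-form least-squares solution, not something shown on the slide. It assumes $\mathbf{X}$ is the $n \times (p+1)$ matrix of predictors with a leading column of ones, $\mathbf{y}$ is the vector of observed responses, and $\mathbf{X}^\top \mathbf{X}$ is invertible.

```latex
\mathbf{b} =
\begin{pmatrix} b_0 \\ b_1 \\ \vdots \\ b_p \end{pmatrix}
= (\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top \mathbf{y}
```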
11
Find the best multidimensional plane
12
Testing for Significance: Global Test
– Hypotheses
  $H_0: \beta_1 = \beta_2 = \dots = \beta_p = 0$
  $H_a$: one or more of the parameters is not equal to zero
– Test Statistic
  $F = \mathrm{MSR}/\mathrm{MSE}$
– Rejection Rule
  Reject $H_0$ if $F > F_\alpha$, where $F_\alpha$ is based on an F distribution with $p$ d.f. in the numerator and $n - p - 1$ d.f. in the denominator.
13
Testing for Significance: Individual $\beta$’s
– Hypotheses
  $H_0: \beta_i = 0$
  $H_a: \beta_i \neq 0$
– Test Statistic
  $t = b_i / s_{b_i}$, where $s_{b_i}$ is the standard error of $b_i$
– Rejection Rule
  Reject $H_0$ for large negative or large positive values of t
Meaning: Is $X_i$ related to Y after taking into account all other variables in the model?
14
Possibilities
– X1 is related to Y alone, but after adjusting for X2, X1 is no longer related to Y
– X1 is not related to Y alone, but after adjusting for X2, X1 is related to Y
– The relation of X1 with Y gets stronger after adjusting for X2
– The relation of X1 with Y gets weaker after adjusting for X2
15
Pulmonary Function Example
Dependent Variable: Forced Expired Volume ($\mathrm{FEV}_{1.0}$)
Independent Variables:
– Age of person
– Smoking status of person
Questions:
– Is age related to FEV independent of smoking status?
– Is smoking status related to FEV independent of age?
– How much of the variability in FEV is explained by age and smoking combined?
16
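Model for FEV Example
$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2$
$X_1$ = smoking status (1 = smoker, 0 = non-smoker)
$X_2$ = age
Smokers: $\mathrm{FEV} = \beta_0 + \beta_1 + \beta_2\,\mathrm{age}$
Non-smokers: $\mathrm{FEV} = \beta_0 + \beta_2\,\mathrm{age}$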
17
Interpretation of Parameters
Smokers: $\mathrm{FEV} = \beta_0 + \beta_1 + \beta_2\,\mathrm{age}$
Non-smokers: $\mathrm{FEV} = \beta_0 + \beta_2\,\mathrm{age}$
– $\beta_1$ is the effect of smoking for fixed levels of age
– $\beta_2$ is the effect of age pooled over smokers and non-smokers
– This model assumes the relation of age to FEV is the same for smokers and non-smokers
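As a worked illustration of these two lines, the estimates reported in the PROC REG output later in this deck (intercept 5.58114, age -0.04702, smk -0.40384) can be plugged in. The age of 40 is a hypothetical value chosen here for illustration only.

```latex
\begin{aligned}
\widehat{\mathrm{FEV}}_{\text{non-smoker}} &= 5.58114 - 0.04702 \times 40 = 3.70034 \\
\widehat{\mathrm{FEV}}_{\text{smoker}} &= 5.58114 - 0.40384 - 0.04702 \times 40 = 3.29650
\end{aligned}
```

The two predictions differ by 0.40384, the estimated smoking coefficient, and that gap would be the same at any other age because the model uses a common age slope.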
18
  DATA fev;
    INFILE DATALINES;
    INPUT age smk fev;
    DATALINES;
  28 1 4.0
  30 1 3.9
  30 1 3.7
  31 1 3.6
  54 0 2.9
  More data
20
  PROC MEANS;
    VAR fev;
    CLASS smk;
  RUN;

The MEANS Procedure
Analysis Variable : fev

smk   N Obs    N        Mean     Std Dev     Minimum     Maximum
  0      15   15   3.6000000   0.4208834   2.9000000   4.3000000
  1      15   15   3.2933333   0.5257195   2.2000000   4.0000000
21
  PROC CORR DATA=fev;
  RUN;

Pearson Correlation Coefficients, N = 30
Prob > |r| under H0: Rho=0

            age        smk        fev
age     1.00000   -0.12788   -0.73024
                    0.5007     <.0001
smk    -0.12788    1.00000   -0.31620
         0.5007                0.0887
fev    -0.73024   -0.31620    1.00000
         <.0001     0.0887
22
  PROC REG;
    MODEL fev = age smk;
  RUN;

Dependent Variable: fev
Analysis of Variance

Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               2    4.96510 (SSR)       2.48255     32.08   <.0001
Error              27    2.08957 (SSE)       0.07739
Corrected Total    29    7.05467 (SST)

Root MSE          0.27819    R-Square   0.7038
Dependent Mean    3.44667
Coeff Var         8.07136

– The F value tests $H_0: \beta_1 = 0;\ \beta_2 = 0$
– R-Square is the proportion of variance explained by both variables
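The F value and R-Square can be rebuilt by hand from the sums of squares, which ties this output back to the global test. The DATA step below is a minimal sketch. The data set name anova_check, the variable names, and the 0.05 significance level are assumptions made here for illustration, while the sums of squares and sample size are copied from the output above.

```sas
/* Recompute the global F test from the ANOVA table (illustrative sketch) */
DATA anova_check;
  ssr = 4.96510;                        /* Model sum of squares           */
  sse = 2.08957;                        /* Error sum of squares           */
  n   = 30;                             /* number of observations         */
  p   = 2;                              /* number of predictors           */
  msr = ssr / p;                        /* mean square for the model      */
  mse = sse / (n - p - 1);              /* mean square error              */
  f   = msr / mse;                      /* global F statistic             */
  pvalue = 1 - PROBF(f, p, n - p - 1);  /* upper-tail p-value             */
  fcrit  = FINV(0.95, p, n - p - 1);    /* critical value, assumed alpha = 0.05 */
  rsq = ssr / (ssr + sse);              /* proportion of variance explained */
  PUT msr= mse= f= pvalue= fcrit= rsq=; /* results are written to the log */
RUN;
```

The values written to the log should agree, up to rounding, with the Mean Square, F Value, and R-Square entries in the table.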
23
  PROC REG;
    MODEL fev = age smk;
    MODEL fev = age;
    MODEL fev = smk;
  RUN;

Parameter Estimates

Model: fev = age smk ($R^2 = .7038$)
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1              5.58114          0.27653     20.18     <.0001
age          1             -0.04702          0.00634     -7.42     <.0001
smk          1             -0.40384          0.10242     -3.94     0.0005

Model: fev = age ($R^2 = .5333$)
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1              5.24787          0.32456     16.17     <.0001
age          1             -0.04382          0.00775     -5.66     <.0001

Model: fev = smk ($R^2 = .1000$)
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1              3.60000          0.12295     29.28     <.0001
smk          1             -0.30667          0.17388     -1.76     0.0887
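Each t Value in the two-predictor model is the estimate divided by its standard error, referred to a t distribution with n - p - 1 = 27 degrees of freedom. The sketch below recomputes them from the printed estimates. The data set name t_check and the variable names are assumptions introduced here for illustration.

```sas
/* Recompute t statistics and two-sided p-values for fev = age smk */
DATA t_check;
  INPUT variable $ estimate stderr;
  df = 27;                               /* error d.f. = 30 - 2 - 1       */
  t  = estimate / stderr;                /* t = b_i / SE(b_i)             */
  pvalue = 2 * (1 - PROBT(ABS(t), df));  /* two-sided p-value             */
  DATALINES;
age -0.04702 0.00634
smk -0.40384 0.10242
;
RUN;

PROC PRINT DATA=t_check;
RUN;
```

Because the inputs are rounded to five decimals, the recomputed values may differ slightly from the SAS output in the last digit.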
24
  PROC REG;
    MODEL fev = age smk;
  PROC REG;
    MODEL fev = age;
    WHERE smk = 0;
  PROC REG;
    MODEL fev = age;
    WHERE smk = 1;
  RUN;

All subjects (fev = age smk)
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1              5.58114          0.27653     20.18     <.0001
age          1             -0.04702          0.00634     -7.42     <.0001
smk          1             -0.40384          0.10242     -3.94     0.0005

Non-smokers (fev = age, smk = 0)
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1              5.24764          0.38050     13.79     <.0001
age          1             -0.03911          0.00887     -4.41     0.0007

Smokers (fev = age, smk = 1)
Variable    DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept    1              5.50002          0.36163     15.21     <.0001
age          1             -0.05508          0.00885     -6.22     <.0001
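The separate fits give different age slopes for non-smokers (-0.03911) and smokers (-0.05508), while the pooled model forces a single slope. One way to test whether the slopes really differ is an age-by-smoking product term, sketched below. It assumes the fev data set created earlier in the deck, and the names fev_int and age_smk are hypothetical, introduced here for illustration.

```sas
/* Interaction model: lets the age slope differ by smoking status */
DATA fev_int;
  SET fev;
  age_smk = age * smk;   /* product term: equals age for smokers, 0 otherwise */
RUN;

PROC REG DATA=fev_int;
  MODEL fev = age smk age_smk;
RUN;
```

In this model the coefficient on age_smk estimates the difference between the two age slopes, so it should come out near -0.05508 - (-0.03911) = -0.01597, and its t test addresses whether the relation of age to FEV is the same for smokers and non-smokers.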