MATH 3359 Introduction to Mathematical Modeling
Project: Multiple Linear Regression and Multiple Logistic Regression
Project
Dataset: any fields you are interested in, with a large sample size
Methods: simple/multiple linear regression and simple/multiple logistic regression
Due on April 23rd
Outline
Multiple Linear Regression: introduction; scatter plots of the data; fitting the multiple linear regression model; prediction
Multiple Logistic Regression: introduction; fitting the multiple logistic regression model
Exercises
Recall: Simple Linear Regression
Given a data set {(y_i, x_i), i = 1, ..., n} of n observations, where y_i is the dependent variable and x_i is the independent variable, the linear regression model is
y_i = β0 + β1·x_i + ε_i, i = 1, ..., n, or equivalently E(y_i) = β0 + β1·x_i,
where the errors ε_i are independent N(0, σ²).
Multiple Linear Regression
Given a data set of n observations, where y_i is the dependent variable and x_i1, x_i2, ..., x_ip are the independent variables, the linear regression model is
y_i = β0 + β1·x_i1 + β2·x_i2 + ... + βp·x_ip + ε_i,
where the errors ε_i are independent N(0, σ²).
Generally, we can transform the x_i's before plugging them into the model, and the terms need not be independent of one another.
1. Transformations: e.g. use log(x_i), sqrt(x_i) or x_i² in place of x_i.
2. Dependent case: the predictors may be functions of one another, e.g. including both x_1 and x_1² in the same model.
3. Cross-product terms: e.g. include the product x_1·x_2 as an extra term.
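A minimal sketch of how such terms are written in an R model formula; the data here are simulated purely for illustration and are not part of the original slides.

set.seed(1)
mydata <- data.frame(x1 = runif(50, 1, 10), x2 = runif(50))
mydata$y <- 2 + log(mydata$x1) + 0.5 * mydata$x1^2 - mydata$x2 + rnorm(50)
# A log transformation, a quadratic term (wrapped in I()), and a
# cross-product (interaction) term, all in one lm() formula
fit <- lm(y ~ log(x1) + I(x1^2) + x2 + x1:x2, data = mydata)
summary(fit)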
Example
The data set contains the selling price at auction of 32 antique grandfather clocks. The age of each clock and the number of people who made a bid are also recorded. The first few rows look like this:

Age  Bidders  Price
127  13       1235
115  12       1080
127   7        845
150   9       1522
156   6       1047
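For experimenting with the commands on the following slides, the rows above can be typed in by hand; a minimal sketch (the results on the later slides are based on all 32 clocks, so this toy data frame will give different numbers).

# First five rows of the auction data, entered manually for illustration
auction <- data.frame(
  Age     = c(127, 115, 127, 150, 156),
  Bidders = c(13, 12, 7, 9, 6),
  Price   = c(1235, 1080, 845, 1522, 1047)
)
head(auction)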
Recall: Scatter Plots — Function ‘plot’
plot(auction$Age, auction$Price, main = 'Relationship between Price and Age')
plot(auction$Bidders, auction$Price, main = 'Relationship between Price and Number of bidders')
plot(auction) produces a scatter plot of every pair of variables in the data frame.
Fit Multiple Linear Regression Model — Function ‘lm’ in R
reg = lm(formula, data)
summary(reg)
In our example, reg = lm(Price ~ Age + Bidders, data = auction)
> summary(reg)

Call:
lm(formula = Price ~ Age + Bidders, data = auction)

Residuals:
   Min     1Q Median     3Q    Max
-207.2 -117.8   16.5  102.7  213.5

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -1336.7221   173.3561  -7.711 1.67e-08 ***
Age            12.7362     0.9024  14.114 1.60e-14 ***
Bidders        85.8151     8.7058   9.857 9.14e-11 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Hence, the function of best fit is Price = 12.7362 * Age + 85.8151 * Bidders - 1336.7221
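The pieces of the fitted model can also be pulled out of the reg object directly; a minimal sketch:

coef(reg)          # estimated intercept and slopes for Age and Bidders
head(fitted(reg))  # fitted prices for the clocks in the data
head(resid(reg))   # residuals (observed minus fitted price)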
Prediction — Function ‘predict’ in R
To predict the average price of a clock with Age = 150 and Bidders = 10:
predict(reg, data.frame(Age = 150, Bidders = 10))
To predict the average prices of clocks with Age = 150, Bidders = 10 and with Age = 160, Bidders = 5:
predict(reg, data.frame(Age = c(150, 160), Bidders = c(10, 5)))
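predict() can also return interval estimates around these point predictions; a minimal sketch using the same new data:

new_clocks <- data.frame(Age = c(150, 160), Bidders = c(10, 5))
predict(reg, new_clocks, interval = "confidence")  # interval for the mean price
predict(reg, new_clocks, interval = "prediction")  # interval for a single clock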
Exercise
1. Download the data: http://www.statsci.org/data/multiple.html (‘Mass and Physical Measurements for Male Subjects’)
2. Import the txt file into R
3. Use ‘Mass’ as the response and ‘Fore’, ‘Waist’, ‘Height’ and ‘Thigh’ as independent variables
4. Make a scatter plot of the response against each of the independent variables
5. Fit the multiple linear regression
6. Predict ‘Mass’ with Fore = 30, Waist = 180, Height = 38, Thigh = 58 and with Fore = 29, Waist = 179, Height = 39, Thigh = 57
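A minimal sketch of steps 2–6, assuming the data file from the page above has been saved locally as 'mass.txt'; the file name is illustrative, and the column names are assumed to match those listed in the exercise.

mass <- read.table("mass.txt", header = TRUE)
# Scatter plot of the response against each independent variable
plot(mass$Fore,   mass$Mass, main = "Mass vs Fore")
plot(mass$Waist,  mass$Mass, main = "Mass vs Waist")
plot(mass$Height, mass$Mass, main = "Mass vs Height")
plot(mass$Thigh,  mass$Mass, main = "Mass vs Thigh")
# Fit the multiple linear regression and predict Mass for the two new cases
fit <- lm(Mass ~ Fore + Waist + Height + Thigh, data = mass)
summary(fit)
predict(fit, data.frame(Fore = c(30, 29), Waist = c(180, 179),
                        Height = c(38, 39), Thigh = c(58, 57)))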
Recall: Simple Logistic Regression
For a binary outcome with success probability p:
Odds: p / (1 - p)
Log-odds (logit): log(p / (1 - p))
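A quick numerical illustration of these two quantities in R; the probability value 0.75 is arbitrary.

p <- 0.75
odds <- p / (1 - p)      # 0.75 / 0.25 = 3
log_odds <- log(odds)    # log(3), about 1.099
c(odds = odds, log_odds = log_odds)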
Recall: Simple Logistic Regression
Logistic regression models the log-odds as a linear function of the independent variable:
log(p / (1 - p)) = β0 + β1·x.
Solving for p gives p = exp(β0 + β1·x) / (1 + exp(β0 + β1·x)), which is not a linear function of x.
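To see the shape implied by the model, the inverse of the log-odds transformation can be plotted; a minimal sketch, with arbitrary intercept and slope values.

# plogis() is the inverse logit: it maps the linear predictor b0 + b1*x
# back to a probability. The resulting curve is S-shaped, not a straight line.
curve(plogis(-2 + 0.8 * x), from = -5, to = 10,
      xlab = "x", ylab = "P(Y = 1)")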
Multiple Logistic Regression
With independent variables x_i1, ..., x_ip, the model for the log-odds becomes
log(p_i / (1 - p_i)) = β0 + β1·x_i1 + β2·x_i2 + ... + βp·x_ip,
where p_i = P(y_i = 1).
Example
The variables used are:
am: transmission (0 = automatic, 1 = manual)
hp: gross horsepower
wt: weight (in 1000 lbs)
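These three variables come from R's built-in mtcars data set, which the glm call on the next slide uses; a quick look:

# First rows of the response (am) and the two predictors (hp, wt)
head(mtcars[, c("am", "hp", "wt")])
# Counts of automatic (0) and manual (1) cars
table(mtcars$am)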
Multiple Logistic Regression — Function ‘glm’ in R
logreg = glm(formula, family = 'binomial', data = dataset)
glm: generalized linear model
family: the assumed distribution of the response ('binomial' for a binary outcome)
data: name of the dataset
In the example, reg = lm(am ~ hp + wt, data = mtcars)
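The example line above calls lm; for the binary response am, the binomial glm form at the top of this slide is what the log-odds interpretation on the later slides assumes. A minimal sketch (the object name logreg is illustrative, and its coefficient estimates will differ from the least-squares summary shown next):

# Logistic regression of transmission type on horsepower and weight;
# family = 'binomial' models the log-odds of am = 1 (manual)
logreg <- glm(am ~ hp + wt, family = 'binomial', data = mtcars)
summary(logreg)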
> summary(reg)

Call:
lm(formula = am ~ hp + wt, data = mtcars)

Residuals:
    Min      1Q  Median      3Q     Max
-0.6309 -0.2562 -0.1099  0.3039  0.5301

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.547430   0.211046   7.332 4.46e-08 ***
hp           0.002738   0.001192   2.297    0.029 *
wt          -0.479556   0.083523  -5.742 3.24e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Final model: log(odds of manual transmission) = 1.547430 + 0.002738 * hp - 0.479556 * wt
For every one-unit increase in hp, the log-odds of manual (versus automatic) transmission increase by 0.002738, so the odds of manual are multiplied by exp(0.002738) = 1.002742. For every one-unit increase in wt, the log-odds of manual (versus automatic) decrease by 0.479556, so the odds of manual are divided by exp(0.479556) = 1.615357 (i.e., multiplied by exp(-0.479556), about 0.62).
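Using the logreg object from the glm sketch above, these odds ratios can be computed directly; a minimal illustration:

# Multiplicative change in the odds of a manual transmission
# for a one-unit increase in each predictor
exp(coef(logreg))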
Exercise
1. Import the data from the web: http://www.ats.ucla.edu/stat/data/binary.csv
2. Fit the logistic regression with admit as the response and gre, rank and gpa as independent variables, using glm(formula, family = 'binomial', data = ...). What is the final logistic model? Are the three independent variables significant?
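A minimal sketch of the exercise setup, assuming the CSV at the URL above is still reachable; rank is kept numeric here, as the slide lists it, though converting it to a factor is a common alternative.

binary <- read.csv("http://www.ats.ucla.edu/stat/data/binary.csv")
# Logistic regression of admission on GRE score, class rank and GPA
logreg2 <- glm(admit ~ gre + rank + gpa, family = 'binomial', data = binary)
summary(logreg2)  # check which coefficients are significant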