Lecture 22 – Thurs., Nov. 25 Nominal explanatory variables (Chapter 9.3) Inference for multiple regression (Chapter 10.1-10.2)

Nominal Variables
To incorporate nominal variables in multiple regression analysis, we use indicator variables.
Indicator variable to distinguish between two groups: in an earlier example, time of onset (early vs. late) was a nominal variable. To incorporate it into the multiple regression analysis, we used an indicator variable, early, which equals 1 if onset was early and 0 if late.

Nominal Variables with More than Two Categories
To incorporate nominal variables with more than two categories, we use multiple indicator variables. If there are k categories, we need k − 1 indicator variables; the remaining category is the one for which all indicators equal 0.
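As a sketch of the k − 1 rule, the indicator columns can be built automatically; the example below uses pandas with a hypothetical color column (the data values are illustrative, not from the slides):

```python
import pandas as pd

# Hypothetical nominal variable with k = 3 categories, so k - 1 = 2
# indicators are needed; "other" serves as the baseline category.
cars = pd.DataFrame({"color": ["white", "silver", "other", "white", "other"]})

# get_dummies makes one 0/1 column per category; keep only k - 1 of them.
indicators = pd.get_dummies(cars["color"])[["white", "silver"]].astype(int)
print(indicators)
```

A baseline car ("other") is a row where both indicators are 0, matching the coding used for the auction-price example below.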

Nominal Explanatory Variables Example: Auction Car Prices
A car dealer wants to predict the auction price of a car.
–The dealer believes that odometer reading and car color are variables that affect a car's price (data from a sample of cars in auctionprice.JMP).
–Three color categories are considered: white, silver, and other colors.
Note: color is a nominal variable.

Indicator Variables in Auction Car Prices
I1 = 1 if the color is white, 0 if the color is not white
I2 = 1 if the color is silver, 0 if the color is not silver
The category "Other colors" is defined by I1 = 0 and I2 = 0.

Auction Car Price Model
Solution: the proposed model is
PRICE = β0 + β1(Odometer) + β2 I1 + β3 I2 + ε
The data record, for each car, its auction price, odometer reading, and color category (white, silver, or other color).

Example: Auction Car Price – The Regression Equation
From JMP we get the estimated regression equation
PRICE = b0 − 0.0555(Odometer) + 90.48(I1) + b3(I2)
The equation for an "other color" car (I1 = 0, I2 = 0): PRICE = b0 − 0.0555(Odometer)
The equation for a white car (I1 = 1, I2 = 0): PRICE = (b0 + 90.48) − 0.0555(Odometer)
The equation for a silver car (I1 = 0, I2 = 1): PRICE = (b0 + b3) − 0.0555(Odometer)
The three equations are parallel lines in Odometer; they differ only in their intercepts.
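Since the auctionprice.JMP data are not reproduced in these notes, a minimal sketch of such a fit can use simulated data; the "true" coefficients below (17000, −0.06, 100, 300) are assumptions for illustration, not the slide's estimates:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100

# Simulated stand-in for the auction data (assumed, not the real file):
odometer = rng.uniform(19000, 49000, n)
color = rng.integers(0, 3, n)           # 0 = other, 1 = white, 2 = silver
I1 = (color == 1).astype(float)         # white indicator
I2 = (color == 2).astype(float)         # silver indicator
price = 17000 - 0.06 * odometer + 100 * I1 + 300 * I2 + rng.normal(0, 50, n)

# Least-squares fit of PRICE on [1, Odometer, I1, I2]:
X = np.column_stack([np.ones(n), odometer, I1, I2])
b0, b1, b2, b3 = np.linalg.lstsq(X, price, rcond=None)[0]

# The fit is three parallel lines sharing the slope b1:
#   other:  PRICE = b0 + b1 * Odometer
#   white:  PRICE = (b0 + b2) + b1 * Odometer
#   silver: PRICE = (b0 + b3) + b1 * Odometer
```

The indicator coefficients b2 and b3 are the estimated intercept shifts for white and silver cars relative to the baseline "other" category.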

Example: Auction Car Price – The Regression Equation
From JMP we get the estimated regression equation
PRICE = b0 − 0.0555(Odometer) + 90.48(I1) + b3(I2)
A white car sells, on average, for $90.48 more than a car in the "Other colors" category.
A silver car sells, on average, for b3 dollars more than a car in the "Other colors" category.
For each additional mile on the odometer, the predicted auction price decreases by 5.55 cents.

Example: Auction Car Price – The Regression Equation
There is insufficient evidence to infer that a white car and a car of "other color" sell for different auction prices.
There is sufficient evidence to infer that a silver car sells for a higher price than a car in the "other color" category. (Data: Xm18-02b)

Shorthand Notation for Nominal Variables
Shorthand notation for a regression model with nominal variables: use all capital letters for nominal variables.
–Parallel regression lines model: μ{Y | X, GROUP} = GROUP + X
–Separate regression lines model: μ{Y | X, GROUP} = GROUP + X + GROUP × X

Nominal Variables in JMP
It is not necessary to create indicator variables yourself to represent a nominal variable. To use a nominal variable in JMP:
–Make sure that the nominal variable's modeling type is in fact nominal.
–Include the nominal variable in the Construct Model Effects box in Fit Model.
–JMP will create the indicator variables. The brackets indicate the category of the nominal variable for which the indicator variable is 1. JMP will leave out the level that is highest alphabetically or numerically.

Specially Constructed Explanatory Variables
Types of specially constructed explanatory variables:
–Powers of variables
–Products of variables (interactions)
–Indicator variables to represent nominal variables
–Transformations of variables (e.g., log)
Use a matrix of pairwise scatterplots to examine the data initially and look for needed transformations and powers of variables.
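The four kinds of specially constructed variables can be built directly from raw columns; the data frame below is hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical raw data: one numeric and one nominal column.
df = pd.DataFrame({"mass": [1.5, 2.0, 3.2, 4.1],
                   "type": ["bat", "bird", "bird", "bat"]})

df["mass_sq"] = df["mass"] ** 2                    # power of a variable
df["log_mass"] = np.log(df["mass"])                # transformation (log)
df["bird"] = (df["type"] == "bird").astype(int)    # indicator variable
df["mass_x_bird"] = df["mass"] * df["bird"]        # product (interaction)
```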

Inference for Multiple Regression
Chapter 10.2:
–Tests for single coefficients
–Confidence intervals for single coefficients
–Confidence intervals for the mean response at given values of the explanatory variables
–Prediction intervals for a future response at given values of the explanatory variables
Chapter 10.3:
–F-test for overall significance of the regression
–F-test for joint significance of several terms (will not cover)
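Tests and confidence intervals for single coefficients can be sketched from first principles on simulated data (the true coefficients 1.0, 2.0, −0.5 are assumptions, and the 95% t critical value is approximated by 2):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 60
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(0, 1.0, n)

X = np.column_stack([np.ones(n), x1, x2])
beta = np.linalg.lstsq(X, y, rcond=None)[0]

resid = y - X @ beta
p = X.shape[1]
sigma2 = resid @ resid / (n - p)          # estimated error variance
cov = sigma2 * np.linalg.inv(X.T @ X)     # covariance of the estimates
se = np.sqrt(np.diag(cov))

t_stats = beta / se                       # t statistics for H0: beta_j = 0
ci_low, ci_high = beta - 2 * se, beta + 2 * se   # approximate 95% CIs
```

A coefficient whose |t| exceeds the critical value (about 2 here) is statistically significant; JMP reports these same t ratios and their p-values in its Parameter Estimates table.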

Case Study
Question: do echolocating bats expend more energy than nonecholocating bats after accounting for body size?
Data: body mass and flight energy expenditure for 4 nonecholocating bats, 12 nonecholocating birds, and 4 echolocating bats.
Strategy: build a multiple regression model for mean energy expended as a function of type of flying vertebrate (echolocating bat, nonecholocating bat, nonecholocating bird) and body size.
–Explore (resolve the need for transformation)
–Test for interaction
–If there is no interaction, answer the question with the three parallel lines model

Coded Scatterplots
To construct a coded scatterplot, create columns energy nonecholocating bat, energy nonecholocating bird, and energy echolocating bat. The column energy nonecholocating bat should contain only the energies for nonecholocating bats and a blank for all other species, and similarly for the other two columns. Click Graph, Overlay Plot, put energy nonecholocating bat, energy nonecholocating bird, and energy echolocating bat in Y and mass in X.
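The same column-splitting trick can be mirrored outside JMP; the masses and energies below are hypothetical stand-ins for the bat/bird data:

```python
import numpy as np
import pandas as pd

# Hypothetical data in the layout the slide assumes (type codes:
# nbat = nonecholocating bat, nbird = nonecholocating bird,
# ebat = echolocating bat).
df = pd.DataFrame({
    "mass":   [780, 628, 258, 315, 24, 35, 7.7, 8.1],
    "energy": [43.7, 34.8, 23.3, 22.4, 2.5, 3.8, 1.1, 1.2],
    "type":   ["nbat", "nbat", "nbird", "nbird",
               "nbird", "nbird", "ebat", "ebat"],
})

# One energy column per type, blank (NaN) for the other types,
# ready to be overlaid against mass:
for t in ["nbat", "nbird", "ebat"]:
    df[f"energy_{t}"] = df["energy"].where(df["type"] == t)
```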

Coded Scatterplots

Separate/Parallel Regression Lines Model
Separate regression lines model: μ{lenergy | lmass, TYPE} = TYPE + lmass + TYPE × lmass
Parallel regression lines model: μ{lenergy | lmass, TYPE} = TYPE + lmass

Inferences for Echolocating Bats
Is the parallel regression lines model appropriate? Test H0: the coefficients on the TYPE × lmass interaction terms are zero.
There is no evidence against the parallel regression lines model, so we go ahead and use it to answer the question of interest: do echolocating bats use less energy than nonecholocating bats of the same body size, and than nonecholocating birds of the same body size?
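The check of the parallel-lines model is an extra-sum-of-squares F-test comparing the two models; here is a sketch on simulated data (all coefficients are assumptions, and the data are generated so that the parallel-lines model is true):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 60
lmass = rng.normal(2.0, 1.0, n)
grp = rng.integers(0, 3, n)
bat, bird = (grp == 1).astype(float), (grp == 2).astype(float)

# Simulated response with equal slopes (parallel lines are TRUE here):
y = 1.0 + 0.8 * lmass + 0.5 * bat - 0.3 * bird + rng.normal(0, 0.2, n)

def sse(X, y):
    """Residual sum of squares from a least-squares fit."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    r = y - X @ b
    return r @ r

ones = np.ones(n)
X_parallel = np.column_stack([ones, lmass, bat, bird])
X_separate = np.column_stack([ones, lmass, bat, bird,
                              lmass * bat, lmass * bird])

# Extra-sum-of-squares F statistic for H0: both interaction
# coefficients are zero (2 constraints, n - 6 residual df):
sse_r, sse_f = sse(X_parallel, y), sse(X_separate, y)
F = ((sse_r - sse_f) / 2) / (sse_f / (n - 6))
# Compare F with an F(2, n - 6) critical value (about 3.2 at the 5% level);
# a small F means no evidence against the parallel-lines model.
```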

Inferences for Echolocating Bats Cont.
There is no strong evidence that echolocating bats use less energy than either nonecholocating bats (p-value = 0.35) or nonecholocating birds (p-value = 0.77) of the same body size.
The 95% confidence interval for the difference in mean log energy between echolocating bats and nonecholocating bats of the same body size is (−0.51, 0.35). Exponentiating the endpoints, the 95% confidence interval for the ratio of median energy of echolocating bats to nonecholocating bats of the same body size is (e^−0.51, e^0.35) = (0.60, 1.42).
Summary of findings: although there is no strong evidence that echolocating bats use less energy than nonecholocating bats of the same body size, it is still plausible that they use quite a bit less energy (60% as much at the median). The study is inconclusive.
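The back-transformation from the log scale to a ratio of medians uses only exponentiation of the interval endpoints (endpoints taken from the slide):

```python
import math

# 95% CI endpoints for the difference in mean log energy (from the slide):
lo, hi = -0.51, 0.35

# Exponentiating turns a CI for a difference of log medians into a CI
# for the ratio of medians:
ratio_lo, ratio_hi = math.exp(lo), math.exp(hi)
print(round(ratio_lo, 2), round(ratio_hi, 2))   # → 0.6 1.42
```

The lower endpoint, exp(−0.51) ≈ 0.60, is where the "60% as much at the median" figure comes from.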

Prediction Intervals
To find a 95% prediction interval for the log energy of a flying vertebrate of a given type and mass:
–Fit the multiple regression model.
–Click the red triangle next to the response log energy, click Save Columns, then click Predicted Values and also Indiv Confid Interval. This saves the predicted value and the lower and upper 95% prediction interval endpoints for each observation in the data set.
–To get a prediction interval for X's that are not in the data set, enter a row with those X's and then exclude the observation (an excluded row does not affect the fit but still receives a saved prediction interval).
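What JMP saves as an individual prediction interval can be computed by hand; the sketch below uses simulated data (the coefficients, the new point x0, and the t critical value of 2 are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 40
x = rng.uniform(0, 4, n)                    # e.g. log mass
g = rng.integers(0, 2, n).astype(float)     # a single indicator, for brevity
y = 0.5 + 1.2 * x + 0.4 * g + rng.normal(0, 0.3, n)

X = np.column_stack([np.ones(n), x, g])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
sigma2 = resid @ resid / (n - 3)
XtX_inv = np.linalg.inv(X.T @ X)

# Prediction interval at X's NOT in the data set (the slide's excluded row):
x0 = np.array([1.0, 2.5, 1.0])              # intercept, x = 2.5, g = 1
yhat = x0 @ beta
# The "+ 1" adds the variance of a new observation to that of the fit:
se_pred = np.sqrt(sigma2 * (1 + x0 @ XtX_inv @ x0))
interval = (yhat - 2 * se_pred, yhat + 2 * se_pred)
```

The "+ 1" term is what distinguishes a prediction interval for an individual response from a confidence interval for the mean response, which omits it.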