Just one quick favor: please take just a minute to complete the Course Evaluations online, using your phone or laptop. Check for a link or go to… (Lecture and Lab)
Introduction to Statistics for the Social Sciences SBS200, COMM200, GEOG200, PA200, POL200, or SOC200 Lecture Section 001, Spring 2016 Room 150 Harvill Building 9:00 - 9:50 Mondays, Wednesdays & Fridays
Schedule of readings. Before our fourth and final exam (May 2nd): OpenStax Chapters 1-13 (Chapter 12 is emphasized); Plous Chapter 17: Social Influences; Chapter 18: Group Judgments and Decisions. Study guide for Exam 4 is online. Stats Review by Jonathon & Nick: Wednesday evening (April 27th), Time: 6:30-8:30, Location: ILC 120, Cost: $5.00
Homework. On class website: please complete homework worksheet #27, "Completing 6 types of analysis using Excel: Semester Summary". Due: Friday, April 29th
By the end of lecture today (4/27/16): Multiple Regression. Using multiple predictor (independent) variables to make predictions about a single predicted (dependent) variable. Multiple regression coefficients (b): one regression coefficient for each independent variable
No more labs this semester
Sample memorandum and general instructions for Project 4 are both online Due this week in class
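Review: the observed correlation exceeds the critical value (0.71 > 0.632), so it is significant. r² is about 0.50: 50% of the variance is explained, so the other 50% has yet to be explained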
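The comparison on this slide can be checked with a few lines of arithmetic. This is only a sketch: it assumes 0.71 is the observed r and 0.632 is a critical value read from a table (the slide does not state the degrees of freedom or alpha).

```python
# Observed correlation and the critical value quoted on the slide.
r_observed = 0.71
r_critical = 0.632  # assumption: taken from a critical-r table; df and alpha are not stated

# The correlation is significant when its absolute value exceeds the critical value.
is_significant = abs(r_observed) > r_critical

# Coefficient of determination: the proportion of variance explained.
r_squared = r_observed ** 2
unexplained = 1 - r_squared

print(is_significant)
print(round(r_squared, 2), round(unexplained, 2))
```

Squaring r is what turns "0.71" into "about 50% explained".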
Summary Slope: as sales calls increase by one, more systems should be sold Intercept: suggests that we can assume each salesperson will sell at least systems Review
Some useful terms: Regression uses the predictor (independent) variable to make predictions about the predicted (dependent) variable. Coefficient of correlation is the name for “r”. Coefficient of determination is the name for “r²” (remember it is never negative, so it carries no direction info). Coefficient of regression is the name for “b”. A residual is found by y – y’
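To make these terms concrete, here is a minimal sketch that computes each one for a simple regression. The five data points are made up for illustration and are not from the lecture.

```python
import math

# Made-up illustration data (not from the lecture): x = predictor, y = predicted variable.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squares and cross-products around the means.
ss_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
ss_xx = sum((xi - mean_x) ** 2 for xi in x)
ss_yy = sum((yi - mean_y) ** 2 for yi in y)

r = ss_xy / math.sqrt(ss_xx * ss_yy)   # coefficient of correlation
r_squared = r ** 2                      # coefficient of determination
b = ss_xy / ss_xx                       # coefficient of regression (slope)
a = mean_y - b * mean_x                 # intercept

predicted = [a + b * xi for xi in x]                    # y'
residuals = [yi - yp for yi, yp in zip(y, predicted)]   # y - y'

print(round(r, 3), round(r_squared, 3), round(b, 3), round(a, 3))
print([round(e, 2) for e in residuals])
```

The residuals come out both positive and negative and sum to zero, while r² stays between 0 and 1.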
Pop Quiz
1. How does a multiple regression differ from a simple regression? (Give an example of each)
Simple regression has one predictor variable and one predicted variable. Multiple regression has multiple predictor variables and one predicted variable. Examples: simple regression: predicting sales from number of sales calls made; multiple regression: predicting job success from age, niceness, and harshness.
2. How many dependent variables are in a simple regression and in a multiple regression?
One in each. Simple regression has one independent variable and one dependent variable. Multiple regression has multiple independent variables and one dependent variable.
3. How are “slopes”, “b”s, and “regression coefficients” related?
All are names for the same thing.
4. Please name each symbol: r, r², b, y – y’
r: coefficient of correlation. r²: coefficient of determination. b: coefficient of regression. y – y’: residual.
5. What possible values can each of these have? r, r², b, y – y’, standard error of the estimate
r: can vary from -1 to +1. r²: can vary from 0 to +1. b: any number. y – y’: any number. Standard error of the estimate: any non-negative number.
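The difference between a residual and the standard error of the estimate can be seen in a short sketch. The residuals below are made up, and the formula assumes the usual simple-regression convention of dividing the sum of squared errors by n - 2.

```python
import math

# Made-up residuals (y - y') from a simple regression with five observations.
residuals = [-0.8, 0.6, 1.0, -0.6, -0.2]

n = len(residuals)
sum_squared_errors = sum(e ** 2 for e in residuals)

# Standard error of the estimate for simple regression: sqrt(SSE / (n - 2)).
standard_error = math.sqrt(sum_squared_errors / (n - 2))

print(min(residuals), max(residuals))  # individual residuals can be negative or positive
print(round(standard_error, 4))        # the standard error is never negative
```

Because the residuals are squared before they are summed, the standard error cannot fall below zero even though individual residuals can.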
Multiple Linear Regression - Example. Can we predict heating cost? Three variables are thought to relate to the heating costs: (1) the mean daily outside temperature, (2) the number of inches of insulation in the attic, and (3) the age in years of the furnace. To investigate, Salisbury's research department selected a random sample of 20 recently sold homes. It determined the cost to heat each home last January.
The Multiple Regression Equation: Interpreting the Regression Coefficients. b1 = -4.583 is the regression coefficient for mean outside temperature (X1). The coefficient is negative and shows a negative relationship between heating cost and temperature: as the outside temperature increases, the cost to heat the home decreases. The numeric value of the regression coefficient provides more information. If we increase temperature by 1 degree and hold the other two independent variables constant, we can estimate a decrease of $4.583 in monthly heating cost.
The Multiple Regression Equation: Interpreting the Regression Coefficients. b2 = -14.83 is the regression coefficient for mean attic insulation (X2). The coefficient is negative and shows a negative relationship between heating cost and insulation: the more insulation in the attic, the less the cost to heat the home. So the negative sign for this coefficient is logical. For each additional inch of insulation, we expect the cost to heat the home to decline $14.83 per month, regardless of the outside temperature or the age of the furnace.
The Multiple Regression Equation: Interpreting the Regression Coefficients. b3 = 6.10 is the regression coefficient for age of the furnace (X3). The coefficient is positive and shows a positive relationship between heating cost and furnace age: as the age of the furnace goes up, the cost to heat the home increases. Specifically, for each additional year older the furnace is, we expect the cost to increase $6.10 per month, holding the other two independent variables constant.
Applying the Model for Estimation What is the estimated heating cost for a home if: the mean outside temperature is 30 degrees, there are 5 inches of insulation in the attic, and the furnace is 10 years old?
[Figure: correlations with Heating Cost, reported as r(18), for Average Temperature, Insulation, and Age of Furnace; the values are not shown]
Y’ = a - 4.583 x1 - 14.83 x2 + 6.10 x3
Y’ = a - 4.583(30) - 14.83(5) + 6.10(10). To see the effect of a furnace that is one year older, either calculate the predicted heating cost using the new value for the age of the furnace, Y’ = a - 4.583(30) - 14.83(5) + 6.10(11), or use the regression coefficient for the furnace ($6.10) to estimate the change. The two homes differ by only one year of furnace age, and the predicted heating cost changes by $6.10.
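The one-year comparison can be sketched in code. The three coefficients are the ones quoted on the slides; the intercept does not survive in this transcript, so `HYPOTHETICAL_INTERCEPT` is a placeholder chosen only so the code runs. The difference between the two predictions does not depend on it.

```python
# Regression coefficients quoted on the slides (dollars per unit change).
B_TEMPERATURE = -4.583   # per degree of mean outside temperature
B_INSULATION = -14.83    # per inch of attic insulation
B_FURNACE_AGE = 6.10     # per year of furnace age

# Placeholder only: the real intercept is not given in this transcript.
HYPOTHETICAL_INTERCEPT = 400.0


def predict_heating_cost(temperature, insulation, furnace_age,
                         intercept=HYPOTHETICAL_INTERCEPT):
    """Return Y' = a + b1*x1 + b2*x2 + b3*x3 for one home."""
    return (intercept
            + B_TEMPERATURE * temperature
            + B_INSULATION * insulation
            + B_FURNACE_AGE * furnace_age)


cost_age_10 = predict_heating_cost(30, 5, 10)
cost_age_11 = predict_heating_cost(30, 5, 11)
change = cost_age_11 - cost_age_10

print(round(change, 2))  # equals the furnace coefficient
```

Changing the placeholder intercept shifts both predictions by the same amount, which is why the change equals b3 no matter what the intercept is.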
[Figure: correlations with GPA, reported as r(7), for High School GPA (r(7) = 0.50), SAT (Verbal), and SAT (Mathematical); the remaining values are not shown]
Answers: No; Yes; No; High School GPA. Y’ = a + b1 x1 + b2 x2 + b3 x3
Y’ = ( ) + ( )(460) + ( )(430) + (1.2)(2.8)
Y’ = ( ) + ( )(460) + ( )(430) + (1.2)(3.8)
Yes, use the regression coefficient for the HS GPA (1.2) to estimate the change: (1.2)(3.8 - 2.8) = 1.2
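The shortcut on this slide (multiply the coefficient by the change in the predictor) can be written as a one-line helper. The function name is hypothetical; the coefficient 1.2 and the GPA values 2.8 and 3.8 come from the slides.

```python
def predicted_change(coefficient, old_value, new_value):
    """Change in Y' when one predictor moves and the others are held constant."""
    return coefficient * (new_value - old_value)


B_HS_GPA = 1.2  # regression coefficient for High School GPA, from the slide

change = predicted_change(B_HS_GPA, 2.8, 3.8)
print(round(change, 2))
```

No intercept or other coefficients are needed, because every term that stays the same cancels out when the two predictions are subtracted.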
Yes – close enough
95% interval: Y’ ± (1.96)(289.06) = Y’ ± 566.56. 99% interval: Y’ ± (2.58)(289.06) = Y’ ± 745.77
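The two margins can be reproduced directly. This sketch assumes 289.06 is the standard error of the estimate and that 1.96 and 2.58 are the z values for 95% and 99% intervals; the predicted value at the center of the interval is missing from the transcript, so `HYPOTHETICAL_PREDICTION` is a placeholder.

```python
STANDARD_ERROR = 289.06  # from the slide; assumed to be the standard error of the estimate


def prediction_interval(predicted, z, standard_error=STANDARD_ERROR):
    """Return (lower, upper) as predicted -/+ z * standard_error."""
    margin = z * standard_error
    return predicted - margin, predicted + margin


margin_95 = 1.96 * STANDARD_ERROR
margin_99 = 2.58 * STANDARD_ERROR
print(round(margin_95, 2), round(margin_99, 2))

# Placeholder center value; the slide's predicted value is not recoverable.
HYPOTHETICAL_PREDICTION = 1000.0
lower, upper = prediction_interval(HYPOTHETICAL_PREDICTION, 1.96)
print(round(lower, 2), round(upper, 2))
```

The 99% interval is wider than the 95% interval because a larger z value multiplies the same standard error.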
Independent variable: number of doors (2 levels: 2-doors vs 4-doors); quasi-experiment; between participants. Dependent variable: price of car (dollars); ratio scale. Two-tailed test. Means: $23,807.14 (2-door) and $20,… (4-door). alpha = …; p = …. The average price for 2-door cars was $23,807.14, while the average price for 4-door cars was $20,…. A t-test was conducted and found this to be a significant difference, t(802) = …; p < 0.01
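For readers who want to see where a t value with n1 + n2 - 2 degrees of freedom comes from, here is a minimal sketch of an independent-samples t-test with pooled variance. The six prices are made up (in thousands of dollars); they are not the lecture data, and the choice of the pooled-variance form is an assumption.

```python
import math
from statistics import mean, variance

# Made-up prices in thousands of dollars (not the lecture data).
two_door = [22, 24, 26]
four_door = [18, 20, 22]

n1, n2 = len(two_door), len(four_door)
df = n1 + n2 - 2  # degrees of freedom for the pooled-variance t-test

# Pool the two sample variances, weighting each by its degrees of freedom.
pooled_variance = ((n1 - 1) * variance(two_door)
                   + (n2 - 1) * variance(four_door)) / df

# Standard error of the difference between the two means.
standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))

t = (mean(two_door) - mean(four_door)) / standard_error
print(df, round(t, 3))
```

With 804 cars split into two groups, the same formula gives the df of 802 reported on the slide.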
Independent variable: size of the engine (3 levels: 4- versus 6- versus 8-cylinder engines); quasi-experiment; between participants. Dependent variable: price of car (dollars); ratio scale. Means: $17,862, $20,081, $38,968. alpha = …; p = 0.(17 zeroes, then)69755. The average prices were $17,862, $20,081 and $38,968 for the 4-, 6-, and 8-cylinder engines (respectively). An ANOVA was conducted and found this to be a significant difference, F(2,801) = 345; p < …
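The F ratio and its two degrees of freedom can be sketched for a one-way ANOVA as follows. The nine values are made up for illustration and are not the lecture data; only the group labels echo the slide.

```python
from statistics import mean

# Made-up values for three groups (not the lecture data).
groups = {
    "4-cylinder": [1, 2, 3],
    "6-cylinder": [2, 3, 4],
    "8-cylinder": [6, 7, 8],
}

all_values = [v for values in groups.values() for v in values]
grand_mean = mean(all_values)

# Between-groups variability: how far each group mean is from the grand mean.
ss_between = sum(len(values) * (mean(values) - grand_mean) ** 2
                 for values in groups.values())

# Within-groups variability: how far each score is from its own group mean.
ss_within = sum((v - mean(values)) ** 2
                for values in groups.values() for v in values)

df_between = len(groups) - 1                # number of groups minus 1
df_within = len(all_values) - len(groups)   # number of scores minus number of groups

f_ratio = (ss_between / df_between) / (ss_within / df_within)
print(df_between, df_within, round(f_ratio, 2))
```

With three engine sizes and 804 cars, the same rules give the F(2,801) reported on the slide.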
The relationship between mileage and car price was weak and not significant, r(802) = …; n.s.
df = 802; b = …; r = …; a = 24,765. Lower price goes with higher x (mileage). For each additional mile driven (as x goes up by 1), the cost of the car goes down by … cents. The base cost for the car (before taking into account the mileage) is $24,765. Predicted price at 30,000 miles: Y’ = 24,765 + (30,000)(b) = $19,…. r² = 0.0204, or 2.04%: the proportion of total variance for price of car that was accounted for by miles was 2.04%
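The step from r² to a percentage, and back to the size of r, can be sketched as below. It assumes the 2.04% on the slide is r² expressed as a percentage; the sign of r is taken from the slide's statement that price goes down as mileage goes up.

```python
import math

r_squared = 0.0204  # proportion of variance in price accounted for by mileage (2.04%)

percent_explained = r_squared * 100

# r squared has no direction, so the sign must come from the slope.
slope_is_negative = True  # from the slide: cost goes down as mileage goes up
r_magnitude = math.sqrt(r_squared)
r = -r_magnitude if slope_is_negative else r_magnitude

print(round(percent_explained, 2), round(r, 3))
```

This is the reverse of squaring r, and it shows why r² alone cannot say whether the relationship is positive or negative.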
Y’ = cost of car (dollars); predictors: mileage and size of car. Y’ = ( )(mileage) + ( )(car size). Answers: yes; … cents; $4,027.67