STAT 250 Dr. Kari Lock Morgan


STAT 250 Dr. Kari Lock Morgan Multiple Regression SECTIONS 10.1, 10.3 Multiple explanatory variables (10.1, 10.3)

More than 2 variables! For the rest of the course, we’ll finally get beyond one or two variables!

Question of the Day: How can we predict body fat percentage from easy measurements?

Predicting Body Fat Percentage: The percentage of a person’s weight that is made up of body fat is often used as an indicator of health and fitness. Accurate measures of percent body fat are hard to get (for example, immersing the body in water to estimate density, then applying a formula). Another option: build a model that predicts percent body fat from easy-to-obtain measurements.

Body Fat Data: Measurements were collected on 100 men. Response variable: percent body fat. Explanatory variables: age (in years), weight (in pounds), height (in inches), neck circumference (in cm), chest circumference (in cm), abdomen circumference (in cm), ankle circumference (in cm), biceps circumference (in cm), wrist circumference (in cm). A sample taken from data provided by Johnson R., "Fitting Percentage of Body Fat to Simple Body Measurements," Journal of Statistics Education, 1996.

Multiple Regression: Multiple regression extends simple linear regression to include multiple explanatory variables: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$. Each $x_i$ is a different explanatory variable, and $k$ is the number of explanatory variables.

Three Explanatory Variables: We’ll start with three explanatory variables: age, weight, height. The regression equation is…
(a) Bodyfat = 16.81 + 0.05 Age + 0.023 Weight + 0.256 Height
(b) Bodyfat = 0.05 Age + 0.023 Weight + 0.256 Height
(c) Bodyfat = 49.6 + 0.1653 Age + 0.2264 Weight - 1.1169 Height
(d) Bodyfat = 0.1653 Age + 0.2264 Weight - 1.1169 Height

Predicting Percent Body Fat: What can we do with this? Make predictions, interpret coefficients, do inference, interpret R², and more!

Making Predictions: If you are male, you can use this to predict your percent body fat! (Females can try too, just for practice, but it won’t be accurate. Why not?) Units: age in years, weight in pounds, height in inches.
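The prediction step can be sketched in a few lines of Python. This is a minimal illustration, assuming the three-variable equation with intercept 49.6 given elsewhere in these slides; the function name `predict_body_fat` and the example person are made up for illustration.

```python
def predict_body_fat(age_years, weight_lb, height_in):
    """Predicted percent body fat from the three-variable model in the slides.

    Coefficients come from the slide equation; the model was fit on men only.
    """
    return 49.6 + 0.1653 * age_years + 0.2264 * weight_lb - 1.1169 * height_in


# Hypothetical example: a 30-year-old man, 180 lb, 70 in tall.
example = predict_body_fat(30, 180, 70)
print(f"Predicted percent body fat: {example:.1f}")
```

This gives only a point estimate; it says nothing about how far an individual's actual body fat may fall from the prediction.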

Percent Body Fat

Interpreting Coefficients: Intercept: a man who is 0 years old, weighs 0 lbs, and is 0 inches tall would have a predicted 49.6% body fat. Slopes: Keeping weight and height constant, predicted percent body fat increases by 0.1653 for every additional year. Keeping age and height constant, predicted percent body fat increases by 0.2264 for every additional pound.

Interpreting Coefficients: Which of the following is a correct interpretation? (a) Keeping age and weight constant, height decreases by 1.117 for every additional percent of body fat. (b) Keeping age and weight constant, percent body fat decreases by 1.117 for every additional inch. (c) Predicted body fat decreases by 1.117 for every additional inch.

Minitab Output

Inference: Are our explanatory variables significant predictors? All of the p-values corresponding to the explanatory variables are very small. Age, weight, and height are all significant predictors of percent body fat (given the other variables in the model).

R²: R² is the proportion of the variability in the response variable, Y, that is explained by the fitted model. For simple linear regression, R² = r² (R² is just the sample correlation squared). R² is also called the coefficient of determination.
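A small numerical check of this definition, as a sketch only: the data below are made up for illustration (not the body fat data), and `r_squared` is a hypothetical helper. It computes R² as one minus the ratio of residual variability to total variability, then compares it with the squared sample correlation for a simple linear regression.

```python
import numpy as np


def r_squared(y, y_hat):
    """Proportion of variability in y explained by the fitted values y_hat."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    sse = np.sum((y - y_hat) ** 2)      # variability left over after the fit
    sst = np.sum((y - y.mean()) ** 2)   # total variability in y
    return 1.0 - sse / sst


# Made-up data for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.7])

# Least squares line; np.polyfit returns the slope first for degree 1.
slope, intercept = np.polyfit(x, y, 1)
y_hat = intercept + slope * x

r = np.corrcoef(x, y)[0, 1]
print(r_squared(y, y_hat), r ** 2)  # the two values agree for simple linear regression
```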

R²: How much does the variability in Y decrease if you know X?

R²: About 55% of the variability in percent body fat is explained by age, weight, and height. Can we do better?

Comparing with BMI: BMI is used more commonly than percent body fat because it is easy to calculate. Currently, our predicted percent body fat is not using much more information than BMI (just age as an extra predictor). What’s wrong with body mass index (BMI) as an indicator of health and fitness? How might we improve our model to fix this problem?
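For reference, BMI depends only on weight and height. The sketch below uses the standard definition (weight in kilograms divided by height in meters squared), written with the usual conversion factor of 703 for pounds and inches. The function name `bmi` and the example values are illustrative and do not come from the slides.

```python
def bmi(weight_lb, height_in):
    """Body mass index from weight in pounds and height in inches.

    Uses the standard imperial-unit form: 703 * weight / height^2.
    """
    return 703.0 * weight_lb / height_in ** 2


# Hypothetical example: 180 lb, 70 in tall.
print(f"BMI: {bmi(180, 70):.1f}")
```

Because the formula sees only weight and height, two people with the same weight and height get the same BMI regardless of where that weight is carried.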

New Model: Bodyfat = -55.9 + 0.0067 Age - 0.1724 Weight + 0.099 Height + 1.066 Abdomen. Anything look odd about this equation? Model without Abdomen: Bodyfat = 49.6 + 0.1653 Age + 0.2264 Weight - 1.117 Height. What’s going on?

Significance: Which explanatory variable(s) are significant? (a) All of them: age, weight, height, abdomen (b) Weight and height (c) Weight, height, abdomen (d) Weight and abdomen (e) Abdomen only

Multiple Regression: The coefficient for each explanatory variable is the predicted change in y for a one-unit change in x, given the other explanatory variables in the model! The p-value for each coefficient indicates whether it is a significant predictor of y, given the other explanatory variables in the model! If explanatory variables are associated with each other, coefficients and p-values will change depending on what else is included in the model.
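A simulation can make this concrete. The sketch below uses synthetic data (not the body fat data) in which `x1` and `x2` are strongly associated and the response depends directly on `x2` only; the variable names, the seed, and the helper `fit` are all assumptions made for illustration. The coefficient on `x1` is large when `x1` is the only predictor and close to zero once `x2` is in the model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Two associated explanatory variables; y depends directly on x2 only.
x1 = rng.normal(size=n)
x2 = x1 + 0.5 * rng.normal(size=n)
y = 2.0 * x2 + rng.normal(size=n)


def fit(columns, y):
    """Least squares coefficients (intercept first) for the given predictor columns."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs


b_alone = fit([x1], y)      # model with x1 only
b_both = fit([x1, x2], y)   # model with x1 and x2

print("x1 coefficient, x1 alone:      ", round(b_alone[1], 3))
print("x1 coefficient, x2 also in model:", round(b_both[1], 3))
```

The same mechanism is one plausible reading of why the weight and height coefficients change so much once abdomen circumference is added.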

Full Model

Which explanatory variable(s) are significant? (a) All of them (b) Weight and abdomen (c) Neck only (d) Abdomen and wrist

Insignificant Terms: What should we do with the insignificant variables? Keep them in the model? Take them out of the model? Deciding which variables to keep in the model (variable selection) is an entire subfield of statistics, and beyond the scope of this class. Want to learn more about it? Take STAT 462!

Explaining Variability: How much of the variability in percent body fat is explained by this model? Which of the following would tell us this? (a) p-value (b) correlation (c) slope coefficients (d) R² (e) confidence interval

Full Model

Question #2 of the Day: What will I get on the final exam?

Model Output: All grades are in percent form (0 to 100). You can predict your final exam score based on your performance so far! You have a point estimate… what do you really want?

Uncertainty? To get an exact prediction interval, use “Predict for Regression” in Minitab. To get an approximate interval, take your predicted value and add and subtract 2 × S, where S is the residual standard error reported in the regression output.
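The rough interval is simple arithmetic. In the sketch below, the function name and the numbers (a predicted score of 80 and S = 8) are hypothetical placeholders, not values from the course output.

```python
def approx_prediction_interval(predicted, s):
    """Rough prediction interval: predicted value plus or minus 2 * S."""
    margin = 2 * s
    return predicted - margin, predicted + margin


# Hypothetical numbers: predicted final exam score of 80 with S = 8.
low, high = approx_prediction_interval(80.0, 8.0)
print(f"Approximate interval: {low:.0f} to {high:.0f}")
```

Since grades are percentages, an upper limit above 100 would simply be read as 100.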

Significance: WileyPlus and Clicker grades are not significant in the model. Does this mean that they are not significantly associated with Final Exam score? (a) Yes (b) No

Significance: Clicker is still not significant in the model. Does this mean coming to class doesn’t matter? (a) Yes (b) No

Clicker: Can we conclude that coming to class improves your score on the final exam? (a) Yes (b) No

Multiple Regression: Coefficients and p-values depend on the other explanatory variables included in the model!

Lots More! The goal of this class was to expose you to multiple regression as a way to incorporate more than two variables. This one class does NOT cover everything you should know about regression! If you really want to use multiple regression for data analysis, take STAT 462! (Or consult with a statistician.)

To Do: Do HW 10.13 (due Wednesday, 4/19).