Influential Observations in Regression Measurements on Heat Production as a Function of Body Mass and Work Effort. M. Greenwood (1918). “On the Efficiency.

Slides:



Advertisements
Similar presentations
Claude Beigel, PhD. Exposure Assessment Senior Scientist Research Triangle Park, USA Practical session metabolites Part II: goodness of fit and decision.
Advertisements

Computational Statistics. Basic ideas  Predict values that are hard to measure irl, by using co-variables (other properties from the same measurement.
Section 10-3 Regression.
Objectives 10.1 Simple linear regression
Ridge Regression Population Characteristics and Carbon Emissions in China ( ) Q. Zhu and X. Peng (2012). “The Impacts of Population Change on Carbon.
Chapter 27 Inferences for Regression This is just for one sample We want to talk about the relation between waist size and %body fat for the complete population.
Multivariate Data Analysis Chapter 4 – Multiple Regression.
Lesson #32 Simple Linear Regression. Regression is used to model and/or predict a variable; called the dependent variable, Y; based on one or more independent.
Slide Copyright © 2010 Pearson Education, Inc. Active Learning Lecture Slides For use with Classroom Response Systems Business Statistics First Edition.
C82MCP Diploma Statistics School of Psychology University of Nottingham 1 Linear Regression and Linear Prediction Predicting the score on one variable.
Regression Model Building Setting: Possibly a large set of predictor variables (including interactions). Goal: Fit a parsimonious model that explains variation.
Week 14 Chapter 16 – Partial Correlation and Multiple Regression and Correlation.
Chapter 12 Section 1 Inference for Linear Regression.
Linear Regression/Correlation
Regression, Residuals, and Coefficient of Determination Section 3.2.
Forecasting Revenue: An Example of Regression Model Building Setting: Possibly a large set of predictor variables used to predict future quarterly revenues.
Linear Regression – Estimating Demand Elasticities U.S. Sugar Consumption H. Schultz (1933). “A Comparison of Elasticities of Demand Obtained.
Linear Regression.
Inference for regression - Simple linear regression
Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved Section 10-3 Regression.
Data Collection & Processing Hand Grip Strength P textbook.
1 Chapter 3: Examining Relationships 3.1Scatterplots 3.2Correlation 3.3Least-Squares Regression.
Mass & Volume Lab: Measuring Volume
Statistics for Business and Economics Dr. TANG Yu Department of Mathematics Soochow University May 28, 2007.
STAT E100 Section Week 3 - Regression. Review  Descriptive Statistics versus Hypothesis Testing  Outliers  Sample vs. Population  Residual Plots.
+ Chapter 12: Inference for Regression Inference for Linear Regression.
Linear Regression Chapter 8.
Xuhua Xia Polynomial Regression A biologist is interested in the relationship between feeding time and body weight in the males of a mammalian species.
Regression. Population Covariance and Correlation.
Welcome to Econ 420 Applied Regression Analysis Study Guide Week One Ending Sunday, September 2 (Note: You must go over these slides and complete every.
Copyright © 2014 McGraw-Hill Education. All rights reserved. No reproduction or distribution without the prior written consent of McGraw-Hill Education.
Lecture 8 Simple Linear Regression (cont.). Section Objectives: Statistical model for linear regression Data for simple linear regression Estimation.
Inference for Regression Simple Linear Regression IPS Chapter 10.1 © 2009 W.H. Freeman and Company.
Regression Regression relationship = trend + scatter
Anaregweek11 Regression diagnostics. Regression Diagnostics Partial regression plots Studentized deleted residuals Hat matrix diagonals Dffits, Cook’s.
Inference for regression - More details about simple linear regression IPS chapter 10.2 © 2006 W.H. Freeman and Company.
Simple Linear Regression ANOVA for regression (10.2)
Lecture 10 Chapter 23. Inference for regression. Objectives (PSLS Chapter 23) Inference for regression (NHST Regression Inference Award)[B level award]
Copyright © 2011 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill/Irwin Model Building and Model Diagnostics Chapter 15.
Creating a Residual Plot and Investigating the Correlation Coefficient.
Lecture 18 Today: More Chapter 9 Next day: Finish Chapter 9.
Statistical Data Analysis 2010/2011 M. de Gunst Lecture 10.
Regression Analysis1. 2 INTRODUCTION TO EMPIRICAL MODELS LEAST SQUARES ESTIMATION OF THE PARAMETERS PROPERTIES OF THE LEAST SQUARES ESTIMATORS AND ESTIMATION.
Inference for regression - More details about simple linear regression IPS chapter 10.2 © 2006 W.H. Freeman and Company.
Chapter 9 Regression Wisdom
Residuals, Influential Points, and Outliers
Individual observations need to be checked to see if they are: –outliers; or –influential observations Outliers are defined as observations that differ.
Chapter 14 Introduction to Regression Analysis. Objectives Regression Analysis Uses of Regression Analysis Method of Least Squares Difference between.
732G21/732G28/732A35 Lecture 3. Properties of the model errors ε 4. ε are assumed to be normally distributed
Correlation & Simple Linear Regression Chung-Yi Li, PhD Dept. of Public Health, College of Med. NCKU 1.
CHAPTER 3 Describing Relationships
Statistical Quality Control, 7th Edition by Douglas C. Montgomery.
Non-Linear Models Tractable non-linearity Intractable non-linearity
Regression Diagnostics
Week 14 Chapter 16 – Partial Correlation and Multiple Regression and Correlation.
Chapter 12: Regression Diagnostics
Regression Statistics
Regression Model Building - Diagnostics
AP Stats: 3.3 Least-Squares Regression Line
Linear Regression/Correlation
Motivational Examples Three Types of Unusual Observations
Influential Observations in Regression
Residuals, Influential Points, and Outliers
Correlation and Regression
Regression Model Building - Diagnostics
Outliers and Influence Points
24/02/11 Tutorial 2 Inferential Statistics, Statistical Modelling & Survey Methods (BS2506) Pairach Piboonrungroj (Champ)
Ch 4.1 & 4.2 Two dimensions concept
2k Factorial Design k=2 Ex:.
Introductory Statistics
Presentation transcript:

Influential Observations in Regression Measurements on Heat Production as a Function of Body Mass and Work Effort. M. Greenwood (1918). “On the Efficiency of Muscular Work,” Proc. Roy. Soc. Of London, Series B, Vol. 90, #627, pp

Data Description Study involved Algerians accustomed to heavy labor. Experiment consisted of several hours on stationary bicycle. Dependent (Response) Variable: –Heat Production (Calories) Independent (Explanatory/Predictor) Variables: –Work Effort (Calories) –Body Mass (kg) Model: –H =     W   M 

Raw Data (Table III, p.203)

Estimated Regression Coefficients Note that that we can conclude, controlling for the other factor:  Work Effort increase  Heat Production increases (p =.0136)  Body Mass increase does not  Heat Production increases (p =.1957)

Plot of Residuals versus Fitted Values Huge, Positive, Residual

Influential Measures (I) Note: n=37, p*=3 Parameters

Standardized / Studentized Residuals

Influential Measures (II)

Influential Measures (III)

Diagnosing Influential Observations Clearly, Observation #19 exerts a huge influence (although it has a small hat or leverage value, so it must be near center of Mass/Work observations Upon further review to author’s original calculations provided in paper, the mean and S.D. are much to high for H (but exactly the same for M and W). Could observation been a “typo”? Try replacing H 19 =3936 with H 19 =2936 Note: Do not do this arbitrarily, check your data sources in practice

Analysis with Corrected Data Point Note that both factors are significant, and that the intercept and body mass coefficients have changed drastically

Plot of Residuals versus Predicted Values