Linear Regression t-Tests Cardiovascular fitness among skiers.


Cardiovascular fitness is measured by the time required to run to exhaustion on a treadmill. In the following study, cardiovascular fitness is compared with performance in a 20-km ski race. The data are for biathletes, as reported in an article on sports physiology: "Physiological Characteristics and Performance of Top U.S. Biathletes" (Medicine and Science in Sports and Exercise, 1995). The two variables are x = treadmill time (minutes) and y = 20-km ski time (minutes).

When we encounter data in ordered pairs, we usually examine the data first by making a scatterplot. First, enter the data into lists on the calculator, then set up the scatterplot. The scatterplot suggests a negative linear relationship between treadmill time and ski race time. Note that while the calculator screens do not have labeled axes, that is a limitation of the display; when you draw your plots on paper you should always label the axes and show the scale.
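The same first look can be sketched in Python. The data values below are hypothetical stand-ins (the original table did not survive transcription), chosen with n = 11 points to match the degrees of freedom used later; only the shape of the relationship matters here.

```python
import numpy as np

# Hypothetical treadmill times (min) and 20-km ski times (min) for n = 11 biathletes.
x = np.array([7.5, 8.0, 8.5, 9.0, 9.5, 10.0, 10.5, 11.0, 11.5, 12.0, 12.5])
y = 110.0 - 4.5 * x + np.array([0.3, -0.2, 0.1, 0.4, -0.3, 0.0, 0.2, -0.4, 0.1, -0.1, 0.3])

# The correlation coefficient summarizes the direction and strength of the
# linear trend a scatterplot would show.
r = np.corrcoef(x, y)[0, 1]
print(round(r, 3))  # strongly negative: longer treadmill endurance, faster ski time
```

A scatterplot itself would be one `matplotlib` call (`plt.scatter(x, y)`), with the axes labeled as the slide recommends.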

We perform linear regression to obtain the equation of the best-fit line. On the TI-83, press STAT, arrow to CALC, and choose LinReg(a+bx). Recall that L1 and L2 are the default lists, so they do not have to be specified, but Y1 does need to be specified in order to store the equation: LinReg(a+bx) L1, L2, Y1 (Y1 is found under VARS → Y-VARS → Function). Press ENTER.
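Off the calculator, the same fit can be sketched with SciPy, again using the hypothetical stand-in data since the original values did not survive:

```python
import numpy as np
from scipy import stats

# Hypothetical data standing in for the n = 11 biathletes.
x = np.array([7.5, 8.0, 8.5, 9.0, 9.5, 10.0, 10.5, 11.0, 11.5, 12.0, 12.5])
y = 110.0 - 4.5 * x + np.array([0.3, -0.2, 0.1, 0.4, -0.3, 0.0, 0.2, -0.4, 0.1, -0.1, 0.3])

# linregress returns slope b, intercept a, r, the two-sided p-value for
# H0: beta = 0, and the standard error of the slope.
fit = stats.linregress(x, y)
print(f"y-hat = {fit.intercept:.2f} + {fit.slope:.2f}x")
```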

The linear model shows that for every one-minute increase in treadmill time there is, on average, a decrease in ski race time equal to the slope from the calculator output. When the treadmill time is zero, the predicted ski race time equals the intercept, though that is an extrapolation well outside the data. Graphing the line over the scatterplot, the model looks good. Recall that whenever we perform linear regression we must check our results by making a residual plot.

To make a scatterplot of the residuals, set up the scatterplot as usual. To enter the residuals in the Ylist, press 2nd LIST and scroll to RESID under NAMES. Press GRAPH to see the image. Here we see that the residuals are fairly randomly scattered. This patternless residual plot supports our linear model for the data.
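The residual check can be sketched in Python as well, on the same hypothetical stand-in data:

```python
import numpy as np

x = np.array([7.5, 8.0, 8.5, 9.0, 9.5, 10.0, 10.5, 11.0, 11.5, 12.0, 12.5])
y = 110.0 - 4.5 * x + np.array([0.3, -0.2, 0.1, 0.4, -0.3, 0.0, 0.2, -0.4, 0.1, -0.1, 0.3])

# Fit by least squares and form the residuals y - y-hat.
b, a = np.polyfit(x, y, 1)          # slope, intercept
residuals = y - (a + b * x)

# Least-squares residuals always sum to zero; what we inspect on the plot is
# whether they scatter around zero with no pattern.
print(round(residuals.sum(), 10))
```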

A new concern for us in this test is that the residuals need to be normally distributed; this corresponds to the requirement that the response variable varies normally about the line. We have not seen this assumption in exactly this form before, but we have in the past needed to establish that data come from a normal distribution, and we follow the same approach here.

We make a normal probability plot of the residuals to check this. The normal probability plot shows a linear pattern, which is consistent with the residuals having a normal distribution. Another piece of information we need for the linear regression t-test is the assumption that the standard deviation of the response is the same for all values of x. To check this we reexamine the residual plot.
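SciPy can sketch the same normality check: `probplot` pairs each sorted residual with its theoretical normal quantile, and reports the correlation r of that pairing, which should be near 1 for roughly normal residuals. The data are again the hypothetical stand-ins.

```python
import numpy as np
from scipy import stats

x = np.array([7.5, 8.0, 8.5, 9.0, 9.5, 10.0, 10.5, 11.0, 11.5, 12.0, 12.5])
y = 110.0 - 4.5 * x + np.array([0.3, -0.2, 0.1, 0.4, -0.3, 0.0, 0.2, -0.4, 0.1, -0.1, 0.3])

b, a = np.polyfit(x, y, 1)
residuals = y - (a + b * x)

# A straight-line normal probability plot means (osm, osr) correlate strongly.
(osm, osr), (slope, intercept, r) = stats.probplot(residuals)
print(round(r, 3))
```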

If the residuals are scattered to about the same extent as we move from left to right, we can say that the equal-variance assumption is met. Some people call this visual inspection the "Does the plot thicken?" condition: do the residuals bunch closer together in part of the graph? In our example the spread seems the same throughout. We really have just one more assumption, which is that the individual ordered pairs are independent of one another. In practice this is difficult to verify fully, and we often move forward without knowing for certain that the observations are independent.

The best we can do is carefully examine the data and the residuals, looking for patterns we might have overlooked. If the data were collected over time, we might graph them as a function of time to see whether there is a trend that would represent a violation of the independence assumption.

Now we turn to the test itself. If we take repeated samples of data from a population and fit each sample with linear regression, we will likely get different equations each time. Due to sampling variability, our estimates of a and b in the equation ŷ = a + bx are just that: estimates. Since we calculate them from samples, they are statistics; they estimate values that are true for the population. Ultimately we seek the true regression line, written with the Greek letters α and β for the parameters.

The true regression line is μ_y = α + βx. Our significance test attempts to determine whether β is zero: if the slope is zero, the explanatory variable is useless as a predictor of the response variable. The null hypothesis is always H0: β = 0. The alternate hypothesis is always Ha: β ≠ 0.

For us to judge whether the variability we see is explainable by chance alone, we must have an idea of how much variability there is in this system. We calculate the standard error about the line, s = sqrt( Σ(y − ŷ)² / (n − 2) ), and use s to estimate σ, the standard deviation of the response about the true regression line.

Because this is a t-test, it has degrees of freedom; here df = n − 2, where n is the number of data points. We need one further concept, the standard error of the regression slope, SE_b = s / sqrt( Σ(x − x̄)² ). We are now ready to define our test statistic: t = b / SE_b, with n − 2 degrees of freedom.
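The whole chain of formulas can be sketched end to end in Python on the hypothetical stand-in data, and cross-checked against SciPy's packaged version of the same test:

```python
import math
import numpy as np
from scipy import stats

x = np.array([7.5, 8.0, 8.5, 9.0, 9.5, 10.0, 10.5, 11.0, 11.5, 12.0, 12.5])
y = 110.0 - 4.5 * x + np.array([0.3, -0.2, 0.1, 0.4, -0.3, 0.0, 0.2, -0.4, 0.1, -0.1, 0.3])
n = len(x)

# Least-squares slope and intercept.
b, a = np.polyfit(x, y, 1)

# Standard error about the line: s = sqrt(SSE / (n - 2)).
sse = float(np.sum((y - (a + b * x)) ** 2))
s = math.sqrt(sse / (n - 2))

# Standard error of the slope: SE_b = s / sqrt(Sxx).
se_b = s / math.sqrt(float(np.sum((x - x.mean()) ** 2)))

# Test statistic and two-sided p-value on n - 2 = 9 degrees of freedom.
t = b / se_b
p = 2 * stats.t.sf(abs(t), df=n - 2)

# Cross-check: linregress performs exactly this t-test on the slope.
fit = stats.linregress(x, y)
print(f"t = {t:.2f}, p = {p:.4g}")
```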

Let's give a quick example of how you should write this as a 7-step write-up. Step 1: define the variables, where x is treadmill time and y is ski time, and state the hypotheses. Step 2: make the scatterplot; this scatterplot shows a negative linear relationship between treadmill time and ski-race time.

This residual plot is patternless, which is consistent with our linear model. Further examination of the residual plot shows that the spread is about the same throughout. The roughly linear normal probability plot shows that the residuals appear to fit a normal model.

Step 3: Our data appear to be independent, since the treadmill measurements would be made independently of the ski race. df = n − 2 = 9. The results were found by running the linear regression t-test on the calculator: press STAT, arrow to TESTS, and choose LinRegTTest. Set the values as shown.

Steps 4–7: The test statistic is so extreme that the shaded tail area is not visible on the graph. Reject H0; a value this extreme would occur by chance alone less than 1% of the time. We have evidence that greater cardiovascular fitness, as measured by the treadmill test, corresponds to reduced race time in a 20-km ski race.

THE END