Download presentation
Presentation is loading. Please wait.
Published byBarry Damian Wood Modified over 9 years ago
1
Math 227 Elementary Statistics Math 227 Elementary Statistics Sullivan, 4 th ed.
2
Copyright Copyright of the definitions and examples is reserved to Pearson Education, Inc.. In order to use this PowerPoint presentation, the required textbook for the class is the Fundamentals of Statistics, Informed Decisions Using Data, Michael Sullivan, III, fourth edition.
3
CHAPTER 4 Describing the Relation between Two Variables 3
4
Ch 4.1 & 4.2 Two dimensions concept I. Scatter plot
5
linear regression quadratic regression power regression exponential regression perfect correlation positive correlation As x increases, y increases negative correlation As x increases, y decreases no correlation Linear Regression
6
Linear correlation coefficient The linear correlation coefficient or Pearson product moment correlation coefficient is a measure of the strength and direction of the linear relation between two quantitative variables. The Greek letter ρ (rho) represents the population correlation coefficient, and r represents the sample correlation coefficient. We present only the formula for the sample correlation coefficient. 6
7
linear regression quadratic regression power regression exponential regression perfect correlation positive correlation As x increases, y increases negative correlation As x increases, y decreases no correlation Linear Regression
8
Linear correlation coefficient The linear correlation coefficient or Pearson product moment correlation coefficient is a measure of the strength and direction of the linear relation between two quantitative variables. The Greek letter ρ (rho) represents the population correlation coefficient, and r represents the sample correlation coefficient. We present only the formula for the sample correlation coefficient. 8
9
Properties of the Linear Correlation Coefficient 1. The linear correlation coefficient is always between –1 and 1, inclusive. That is, –1 ≤ r ≤ 1. 2. If r = + 1, then a perfect positive linear relation exists between the two variables. 3. If r = –1, then a perfect negative linear relation exists between the two variables. 4. The closer r is to +1, the stronger is the evidence of positive association between the two variables. 5. The closer r is to –1, the stronger is the evidence of negative association between the two variables.
10
6.If r is close to 0, then little or no evidence exists of a linear relation between the two variables. So r close to 0 does not imply no relation, just no linear relation. 7.The linear correlation coefficient is a unitless measure of association. So the unit of measure for x and y plays no role in the interpretation of r. 8.The correlation coefficient is not resistant. Therefore, an observation that does not follow the overall pattern of the data could affect the value of the linear correlation coefficient.
12
EXAMPLE Determining the Linear Correlation Coefficient Determine the linear correlation coefficient of the drilling data.
15
Testing for a Linear Relation Step 1 Determine the absolute value of the correlation coefficient Step 2 Find the critical value in Table II from Appendix A for the given sample size Step 3 If the absolute value of the correlation coefficient is greater than the critical value, we say a linear relation exists between the two variables. Otherwise, no linear relation exists. 15
16
EXAMPLE Does a Linear Relation Exist? Step 1 The linear correlation coefficient between depth at which drilling begins and the time to drill 5 feet is 0.773 Step 2 Table II shows the critical value with n = 12 is 0.576 Step 3 Since |0.773|>0.576, we conclude a positive association exists between depth at which drilling begins and time to drill 5 feet.
17
4.2 Least-Squares Regression 17
18
Using the following sample data: Using (2, 5.7) and (6, 1.9): EXAMPLE Finding an Equation that Describes Linearly Relate Data (a) Find a linear equation that relates x (the explanatory variable) and y (the response variable) by selecting two points and finding the equation of the line containing the points.
19
(b) Graph the equation on the scatter diagram. (c) Use the equation to predict y if x = 3.
20
} (3, 5.2) residual = observed y – predicted y = 5.2 – 4.75 = 0.45 The difference between the observed value of y and the predicted value of y is the error, or residual. Using the line from the last example, and the predicted value at x = 3: residual = observed y – predicted y = 5.2 – 4.75 = 0.45
21
Least-Squares Regression Criterion The least-squares regression line is the line that minimizes the sum of the squared errors (or residuals). This line minimizes the sum of the squared vertical distance between the observed values of y and those predicted by the line, (“y-hat”). We represent this as “ minimize Σ residuals 2 ”.
22
The Least-Squares Regression Line The equation of the least-squares regression line is given by where and is the slope of the least-squares regression line is the y-intercept of the least- squares regression line
23
The Least-Squares Regression Line Note: is the sample mean and s x is the sample standard deviation of the explanatory variable x ; is the sample mean and s y is the sample standard deviation of the response variable y.
24
EXAMPLE Finding the Least-squares Regression Line Using the drilling data (a)Find the least-squares regression line. (b) Predict the drilling time if drilling starts at 130 feet. (c) Is the observed drilling time at 130 feet above, or below, average. (d) Draw the least-squares regression line on the scatter diagram of the data.
25
(a)We agree to round the estimates of the slope and intercept to four decimal places. (b) (c) The observed drilling time is 6.93 seconds. The predicted drilling time is 7.035 seconds. The drilling time of 6.93 seconds is below average.
26
(d)
27
Interpretation of Slope: The slope of the regression line is 0.0116. For each additional foot of depth we start drilling, the time to drill five feet increases by 0.0116 minutes, on average.
28
Interpretation of the y-Intercept: The y-intercept of the regression line is 5.5273. To interpret the y-intercept, we must first ask two questions: 1. Is 0 a reasonable value for the explanatory variable? 2. Do any observations near x = 0 exist in the data set? A value of 0 is reasonable for the drilling data (this indicates that drilling begins at the surface of Earth. The smallest observation in the data set is x = 35 feet, which is reasonably close to 0. So, interpretation of the y-intercept is reasonable. The time to drill five feet when we begin drilling at the surface of Earth is 5.5273 minutes.
29
If the least-squares regression line is used to make predictions based on values of the explanatory variable that are much larger or much smaller than the observed values, we say the researcher is working outside the scope of the model. Never use a least-squares regression line to make predictions outside the scope of the model because we can’t be sure the linear relation continues to exist.
30
Predictions When There is No Linear Relation: When the correlation coefficient indicates no linear relation between the explanatory and response variables, and the scatter diagram indicates no relation at all between the variables, then we use the mean value of the response variables, then we use the mean value of the response variable as the predicted value so that
31
Summary: 1. Use StatCrunch to plot a scatter plot 2. Use StatCrunch to calculate r 3. Determine whether there is a positive/negative linear correlation between X and Y. 4. If there is a linear correlation between X and Y, use StatCrunch to find the least squares regression line. Otherwise, do not find the least squares regression line.
32
When a value is assigned to X if there is no correlation between X and Y, use StatCrunch to find and the best predicted Y is for any X. 5.When a value is assigned to X if there is a correlation between X and Y, use the least squares regression line to find the best predicted Y.
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.