Published by Belinda Davidson. Modified over 9 years ago.
1
Statistics Review Chapter 10
2
Important Ideas
In this chapter, we have learned how to re-express data and why re-expression is needed.
3
Goals
To make the distribution of a variable more symmetric.
To make the spread of several groups more alike.
To make the form of a scatterplot more nearly linear.
To make the scatter in a scatterplot spread out evenly rather than following a fan shape.
4
The ladder of powers
An easier way to find out which power to use for a re-expression: 2, 1, 1/2, 0 (log), -1/2, -1
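Moving along the ladder can be sketched in a few lines of Python. The helper names `re_express` and `skewness` and the sample data are assumptions made for this illustration, not part of the chapter. The sketch applies each rung to a right-skewed data set and prints a simple skewness measure, so one can see which rung brings the distribution closest to symmetric.

```python
import math

# Rungs of the ladder as listed on the slide; 0 stands for the logarithm.
LADDER = [2, 1, 0.5, 0, -0.5, -1]

def re_express(values, power):
    """Re-express each value with the given ladder power (0 means log10)."""
    if power == 0:
        return [math.log10(v) for v in values]
    return [v ** power for v in values]

def skewness(values):
    """Moment-based skewness; a value near 0 suggests a symmetric distribution."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed data, for illustration only.
data = [1, 2, 2, 3, 4, 6, 9, 15, 40, 100]

for power in LADDER:
    print(power, round(skewness(re_express(data, power)), 2))
```

A histogram of each re-expressed variable would be the usual check; the skewness number is only a compact stand-in for that picture.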
5
Logarithms
When the ladder of powers does not work because the curvature is more stubborn and we are not satisfied with the residual plots, logarithms can help (only if none of the data values is zero or negative). Try taking logs of both x and y first, then re-express the data using some combination of x or log(x) against y or log(y).
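As a rough illustration of taking logs of both variables: if y follows a power law y = a * x^b, then log(y) = log(a) + b * log(x), which is a straight line in the logged variables. The sketch below uses made-up data and a hypothetical least-squares helper `fit_line`; it is not the chapter's own calculation.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares line; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def log_all(values):
    """Take log10 of every value; logs need strictly positive data."""
    if min(values) <= 0:
        raise ValueError("logarithms require all values to be positive")
    return [math.log10(v) for v in values]

# Hypothetical data generated from y = 3 * x^1.5, for illustration only.
xs = [1, 2, 4, 8, 16]
ys = [3 * x ** 1.5 for x in xs]

intercept, slope = fit_line(log_all(xs), log_all(ys))
print("estimated exponent b:", slope)
print("estimated multiplier a:", 10 ** intercept)
```

With real data the points will not fall exactly on a line, so the residuals of the logged fit still need to be plotted.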
6
Use a curve instead of a linear model
When a clearly curved pattern shows up, we can fit a “curve of best fit” instead of trying to find a way to straighten the plot. The calculations are more difficult, though.
7
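What can go wrong?
Do not expect your model to be perfect: “All models are wrong, but some are useful.”
Do not choose a model based on R^2 alone. Plot the residuals.
Watch out for scatterplots that turn around. Re-expression can straighten many bent relationships, but not ones that go up and then down, or down and then up, so do not try to re-express those.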
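The warning about R^2 can be made concrete with a small sketch. It fits a straight line to made-up, clearly curved data (y = x^2) and shows that R^2 is high even though the residuals follow an obvious bent pattern. The helper `fit_line` and the data are assumptions for illustration only.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical curved data: y = x^2 for x = 1..10.
xs = list(range(1, 11))
ys = [x ** 2 for x in xs]

intercept, slope = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

mean_y = sum(ys) / len(ys)
ss_total = sum((y - mean_y) ** 2 for y in ys)
ss_resid = sum(r ** 2 for r in residuals)
r_squared = 1 - ss_resid / ss_total

print("R^2:", round(r_squared, 3))
# Residuals start positive, dip negative in the middle, and end positive:
# a bent pattern that a high R^2 does not reveal.
print("residuals:", [round(r, 1) for r in residuals])
```

A residual plot with no pattern is the evidence that a model form is appropriate; R^2 only measures how much variation the fit accounts for.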
8
Watch out for negative data values. It is impossible to re-express negative values by any power on the ladder that is not a whole number. Add a small constant to make the values positive, if possible.
Do not stray too far from the ladder. Doing so can artificially inflate R^2.
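One way to picture the "add a small constant" advice is sketched below. The helper `shift_to_positive`, its `margin` parameter, and the data are hypothetical choices for this example; how large a constant is sensible depends on the data and is a judgment call.

```python
import math

def shift_to_positive(values, margin=1.0):
    """Add a constant so the smallest value is at least `margin`.

    Hypothetical helper; returns the shifted values and the constant used.
    """
    lowest = min(values)
    shift = margin - lowest if lowest < margin else 0.0
    return [v + shift for v in values], shift

# Hypothetical data containing a negative value and a zero.
data = [-3.0, 0.0, 5.0, 12.0]

# math.sqrt(-3.0) would raise ValueError, so shift before taking roots.
shifted, shift = shift_to_positive(data)
roots = [math.sqrt(v) for v in shifted]

print("constant added:", shift)
print("square roots of shifted data:", roots)
```

Any model fitted to the shifted data describes the shifted variable, so the constant has to be subtracted again when predictions are converted back.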
9
Terms
Re-express data: By using the logarithm, the square root, the reciprocal, and other functions, we make distributions more symmetric, group spreads more similar, and relationships straighter.
Ladder of Powers: A collection of re-expressions that helps us decide which power to start with.
10
Question #9
The table below shows stopping distances in feet for a car tested 3 times at each of 5 speeds. We hope to create a model that predicts stopping distance from the speed of the car.
SPEED (mph)    STOPPING DISTANCES (ft)
20             64, 62, 59
30             114, 118, 105
40             153, 171, 165
50             231, 203, 238
60             317, 321, 276
11
a) Explain why a linear model is not appropriate.
b) Re-express the data to straighten the scatterplot.
c) Create an appropriate model.
d) Estimate the stopping distance for a car traveling 55 mph.
e) Estimate the stopping distance for a car traveling 70 mph.
f) How much confidence do you place in these predictions? Why?
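One possible approach to parts b) through e) is sketched below, assuming the square root of stopping distance is the re-expression tried (the 1/2 rung of the ladder). That choice and the helper names `fit_line` and `predict_distance` are assumptions of this sketch, not a stated answer key; the data are taken from the Question #9 table.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares line; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Stopping distances (ft) for three trials at each speed (mph).
trials = {
    20: [64, 62, 59],
    30: [114, 118, 105],
    40: [153, 171, 165],
    50: [231, 203, 238],
    60: [317, 321, 276],
}

speeds = [s for s, dists in trials.items() for _ in dists]
root_dists = [math.sqrt(d) for dists in trials.values() for d in dists]

# Model: sqrt(distance) = intercept + slope * speed
intercept, slope = fit_line(speeds, root_dists)

def predict_distance(speed):
    """Predict stopping distance in feet by squaring the fitted value."""
    return (intercept + slope * speed) ** 2

print("sqrt(distance) =", round(intercept, 3), "+", round(slope, 3), "* speed")
print("55 mph:", round(predict_distance(55), 1), "ft")
print("70 mph:", round(predict_distance(70), 1), "ft (extrapolation beyond the data)")
```

The 55 mph estimate lies inside the tested range of speeds, while 70 mph is an extrapolation, which bears on part f).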
12
Question #11 Ball players have been signing ever larger contracts. The highest salaries (in millions of dollars per season) for some notable players are given in the following table.
13
Player    Year    Salary
NR        1980    1
GF        1982    2.04
KP        1990    3
JC        1990    4.7
RC        1991    5.3
KG        1996    8.5
AB        1997    11
PM        1998    12.5
MP        1999    12.5
MV        1999    13.3
KB        1999    15
CD        2001    17
AR        2001    25.2
14
a) Re-express the data to straighten the scatterplot.
b) Create an appropriate model for the trend in salary.
c) Predict a superstar salary for 2005.
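A sketch of one way to work this question, assuming the logarithm of salary is the re-expression tried and measuring time in years since 1980. Both choices, and the helper names `fit_line` and `predict_salary`, are assumptions of this example rather than a stated solution; the data come from the table for Question #11.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares line; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# (year, salary in millions of dollars per season) from the table.
records = [
    (1980, 1), (1982, 2.04), (1990, 3), (1990, 4.7), (1991, 5.3),
    (1996, 8.5), (1997, 11), (1998, 12.5), (1999, 12.5), (1999, 13.3),
    (1999, 15), (2001, 17), (2001, 25.2),
]

years_since_1980 = [year - 1980 for year, _ in records]
log_salaries = [math.log10(salary) for _, salary in records]

# Model: log10(salary) = intercept + slope * (year - 1980)
intercept, slope = fit_line(years_since_1980, log_salaries)

def predict_salary(year):
    """Predict salary in millions by undoing the log10 re-expression."""
    return 10 ** (intercept + slope * (year - 1980))

print("log10(salary) =", round(intercept, 3), "+", round(slope, 4), "* (year - 1980)")
print("2005 prediction (millions):", round(predict_salary(2005), 1))
```

The 2005 prediction extends past the last year in the table, so it should be read with the same caution as any extrapolation.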