Published by Diana Simpson. Modified over 9 years ago.
1
Chapter 10 Re-expressing Data: Get it Straight!!
* Straightening Relationships
* Goals of Re-Expression
* Ladder of Powers
2
Straightening Relationships
To use a linear model, the scatterplot must be straight enough. Check the scatterplot AND the residual plot. We can straighten (re-express) data so that a linear model can be used for scatterplots that do not satisfy the Straight Enough Condition.
3
MPG and Weight
4
A Hummer weighs about 6000 pounds. What is the predicted MPG?
5
MPG vs Gallons/100 Miles
Change 25 mpg into gallons/100 miles.
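A minimal sketch of this conversion follows. The function names are illustrative and not from the slides. The two units are reciprocals of each other scaled by 100, so the same division works in both directions.

```python
def mpg_to_gal_per_100mi(mpg):
    """Convert miles per gallon to gallons per 100 miles."""
    if mpg <= 0:
        raise ValueError("mpg must be positive")
    return 100.0 / mpg


def gal_per_100mi_to_mpg(gal_per_100mi):
    """Convert gallons per 100 miles back to miles per gallon."""
    if gal_per_100mi <= 0:
        raise ValueError("gal_per_100mi must be positive")
    return 100.0 / gal_per_100mi


# 25 mpg corresponds to 4 gallons per 100 miles.
print(mpg_to_gal_per_100mi(25))  # 4.0
```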
6
Scatterplot: gal/100 miles and weight
7
Revisit the Hummer
What is the predicted fuel efficiency for a Hummer (6000 lbs)? The new model predicts that a 6000 lb Hummer would get 9.7 gallons/100 miles. Convert that back into MPG.
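The back-conversion is a single division, because gallons per 100 miles is 100 divided by miles per gallon:

```latex
\text{MPG} = \frac{100}{\text{gallons per 100 miles}} = \frac{100}{9.7} \approx 10.3
```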
8
Not Sold?? You regularly use re-expression
What units do you use to talk about how fast you went on a bike? What units do you use to talk about how fast you run?
9
Goals of Re-Expressing
1) Make the distribution of a variable more symmetric.
Centers are easier to compare, and if the distribution is also unimodal, you could perhaps use the Normal model.
10
Goals
2) Make the spread of several groups more alike.
Groups with similar spreads are easier to compare, even though their centers may be different.
11
Goals
3) Make the form of a scatterplot more nearly linear.
Linear models are easier to describe.
12
Goals
4) Make the scatter in a scatterplot spread out evenly rather than following a fan shape.
Having an even scatter is a condition of many methods in statistics (we will see later).
13
Ladder of Powers
Use it to systematically re-express data.
The farther you move from 1 (the original data), the greater the effect on the data. Certain re-expressions work better for different types of data.
14
Ladder of Powers
Power | Name | Comment
2 | y^2 | unimodal distributions that are skewed to the left
1 | y | data that can be both positive and negative and continue without bound; less likely to need re-expression
1/2 | sqrt(y) | counted data
"0" | log(y) | measurements that can NOT be negative; values that grow by percentages (salaries, populations); if the data has zeros, add a small constant to each value
-1/2 | -1/sqrt(y) | uncommon; changing the sign to take the negative of the reciprocal square root preserves the direction
-1 | -1/y | ratios of two quantities (mpg); change the sign if you want to preserve the direction; if there are zeros, add a small constant to all values
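The ladder can be sketched as a single function. This is an illustration under stated assumptions, not the slides' own procedure: the name `reexpress` is hypothetical, base-10 logs are assumed for the "0" rung, and the sample values are made up.

```python
import math


def reexpress(values, power):
    """Re-express data using one rung of the Ladder of Powers.

    Power 0 stands for the logarithm (base 10 assumed here).
    Negative powers are negated so the direction of the data is preserved.
    """
    result = []
    for y in values:
        if power == 0:
            result.append(math.log10(y))  # "0" rung: log(y), requires y > 0
        elif power < 0:
            result.append(-(y ** power))  # -1 -> -1/y, -1/2 -> -1/sqrt(y)
        else:
            result.append(y ** power)     # 2 -> y^2, 1/2 -> sqrt(y)
    return result


sample = [10.0, 20.0, 25.0, 40.0]  # made-up positive values for illustration
for p in (2, 1, 0.5, 0, -0.5, -1):
    print(p, [round(v, 3) for v in reexpress(sample, p)])
```

Because the negative rungs are negated, every rung keeps the values in their original order, so a positive association stays positive after re-expression.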
15
Plan B: Attack of the Logs
Try taking the logs of BOTH the x values and the y values.
Model Name | x-axis | y-axis | Comment
Exponential | x | log(y) | “0” power from the ladder
Logarithmic | log(x) | y | wide range of x-values; scatterplot descending rapidly at the left and trailing off to the right
Power | log(x) | log(y) | when you are in between powers on the ladder
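If a straight line with intercept a and slope b is fitted to the re-expressed data, each model name corresponds to an equation in the original units. The symbols a and b are chosen here for illustration, and base-10 logs are assumed:

```latex
\begin{aligned}
\text{Exponential:}\quad & \log y = a + b\,x        &&\Longrightarrow\quad y = 10^{a}\,\bigl(10^{b}\bigr)^{x} \\
\text{Logarithmic:}\quad & y = a + b\,\log x \\
\text{Power:}\quad       & \log y = a + b\,\log x   &&\Longrightarrow\quad y = 10^{a}\,x^{b}
\end{aligned}
```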
16
Example
Let’s try to predict the shutter speed based on the f/stop of a camera’s lens. Enter the data:
Shutter Speed | 1/1000 | 1/500 | 1/250 | 1/125 | 1/60 | 1/30 | 1/15 | 1/8
f/stop | 2.8 | 4 | 5.6 | 8 | 11 | 16 | 22 | 32
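One way to work this example is sketched below. It assumes the power model (logs of both variables, base 10) is the intended re-expression. The helper names `least_squares` and `predict_shutter_speed` are illustrative. The fit uses ordinary least squares written out by hand, so only the standard library is needed.

```python
import math

# Data from the slide: shutter speed (in seconds) and f/stop.
shutter_speed = [1/1000, 1/500, 1/250, 1/125, 1/60, 1/30, 1/15, 1/8]
f_stop = [2.8, 4, 5.6, 8, 11, 16, 22, 32]


def least_squares(x, y):
    """Return (slope, intercept) of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Re-express BOTH variables with logs, then fit a line.
log_x = [math.log10(f) for f in f_stop]
log_y = [math.log10(s) for s in shutter_speed]
slope, intercept = least_squares(log_x, log_y)

# Residuals on the log scale; look for leftover curvature or patterns.
residuals = [ly - (intercept + slope * lx) for lx, ly in zip(log_x, log_y)]


def predict_shutter_speed(f):
    """Back-transform the fitted line to predict shutter speed in seconds."""
    return 10 ** (intercept + slope * math.log10(f))


print("slope:", round(slope, 3), "intercept:", round(intercept, 3))
print("residuals:", [round(r, 3) for r in residuals])
```

The fitted slope should come out close to 2, which is what you would expect if shutter speed is roughly proportional to the square of the f/stop. Check the residuals before trusting the model.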
17
What Can Go Wrong?
Don’t expect your model to be perfect.
Don’t choose a model based on R^2 alone; always check the residual plot.
Watch out for scatterplots that change direction.
Watch out for negative values.
Rescale years.
Don’t stray too far from the ladder.