Chapter 4: More on Two-Variable (Bivariate) Data
4.1 Transforming Relationships
Animal brain weight vs. body weight, outliers included: r = .86
Outliers dropped: r = .50
Logarithm vs. logarithm plot: r = .96. The vertical spread about the LSRL is similar everywhere, so the predictions of brain weight from body weight will be pretty precise (high r^2), but in LOG SCALE.
Working with a function of our original measurements can greatly simplify statistical analysis. Transforming- How?
Recall: in Chapter 1 we did linear transformations. We took a set of data and transformed it linearly (called SHIFTING), for example Celsius to Fahrenheit or meters to miles.
A linear transformation CANNOT make a curved relationship between 2 variables “straight.” Instead we resort to common nonlinear functions such as the logarithm and positive and negative powers. We can transform the explanatory variable, the response variable, or BOTH; whichever variable we transform, we will call it “t”.
Real-World Example: We measure the fuel consumption of a car in miles per gallon. Engineers measure it in gallons per mile (how many gallons of fuel the car needs to travel 1 mile). This is the reciprocal transformation: f(t) = 1/t. My car: 25 miles per gallon, so 1/25 = .04 gallons per mile.
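The reciprocal example can be checked with a few lines of Python. This is only a sketch: the function name `reciprocal` and the extra mileage figures are illustrative assumptions, not values from the slides.

```python
def reciprocal(t):
    """Reciprocal transformation f(t) = 1/t, defined for t != 0."""
    if t == 0:
        raise ValueError("the reciprocal is undefined at t = 0")
    return 1.0 / t

mpg = 25.0                      # miles per gallon, from the slide
gpm = reciprocal(mpg)           # gallons per mile
print(gpm)

# Hypothetical mileages, listed from lowest to highest mpg.
mpgs = [20.0, 25.0, 40.0]
gpms = [reciprocal(m) for m in mpgs]
print(gpms)                     # the order is reversed: highest mpg, lowest gpm
```

Because 1/t reverses the order of positive values, the car with the highest miles per gallon has the lowest gallons per mile.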
Monotonic Function A monotonic function f(t) moves in one direction as its argument “t” increases Monotonic Increasing Monotonic Decreasing
Monotonic Increasing: Positive “t” a + bt slope b>0
Monotonic Decreasing: Positive “t” a + bt slope b<0
Nonlinear monotonic transformations change data enough to change form or relations between 2 variables, yet preserve order and allow recovery of original data.
Strategy:
1. If the variable that you want to transform has values that are 0 or negative, first apply a linear transformation (add a constant) to make all values positive.
2. Choose a power or logarithmic transformation that approximately straightens the scatterplot.
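Step 1 of the strategy might look like the sketch below. The helper name `shift_to_positive`, the `margin` parameter, and the data values are all assumptions made for illustration; the slides do not say which constant to add.

```python
import math

def shift_to_positive(values, margin=1.0):
    """Add a constant so that the smallest value becomes `margin` (> 0).

    Values that are already all positive are returned unchanged.
    """
    smallest = min(values)
    if smallest > 0:
        return list(values)
    shift = margin - smallest
    return [v + shift for v in values]

data = [-3.0, 0.0, 2.0, 10.0]            # hypothetical measurements
positive = shift_to_positive(data)       # linear transformation: add a constant
logged = [math.log10(v) for v in positive]

print(positive)
print([round(v, 3) for v in logged])
```

Adding a constant is itself a linear transformation, so it preserves the order of the data and only makes the logarithm (or a power) defined everywhere.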
Ladder of Power Transformations: Power Function: t^p
Power Functions: Monotonic Power Functions. For t > 0: 1. Power functions with positive p are monotonic increasing. 2. Power functions with negative p are monotonic decreasing.
Monotonic decreasing functions are hard to interpret because they reverse the order of the original data points. We therefore want every t^p to be monotonic increasing, which we can achieve by applying a LINEAR TRANSFORMATION.
Table columns: original data (t), power function, linear transformation. At t = 0 the power function and its linear transformation are both undefined.
Graph labels: one curve is log t; another is a line.
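The slide's formula for the linear transformation is not shown. One common choice, assumed here rather than taken from the slides, is (t^p - 1)/p. The sketch below (with the hypothetical name `ladder`) shows that this rescaled power function is increasing in t for negative as well as positive p, is a straight line at p = 1, and approaches the natural logarithm as p approaches 0.

```python
import math

def ladder(t, p):
    """Linearly rescaled power function (t**p - 1) / p for t > 0.

    p == 0 is handled as the limiting case, the natural logarithm.
    """
    if t <= 0:
        raise ValueError("t must be positive")
    if p == 0:
        return math.log(t)
    return (t ** p - 1.0) / p

ts = [0.5, 1.0, 2.0, 4.0]
for p in (-1, 0, 0.5, 1, 2):
    # Every row increases from left to right, even for negative p.
    print(p, [round(ladder(t, p), 3) for t in ts])
```

Dividing by p is what flips the negative powers back into increasing functions, and subtracting 1 makes every curve pass through 0 at t = 1.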
Concavity of Power Functions:
p greater than 1:
- Pushes out the right tail and pulls in the left tail
- The effect gets stronger as the power p moves up away from 1
p less than 1:
- Pushes out the left tail and pulls in the right tail
- The effect gets stronger as the power p moves down away from 1
Country’s GDP vs. Life Expectancy
How do you know what transformation will make the scatterplot straight?
** DO NOT just push buttons!! **
We will develop methods of selection:
1. Logarithmic Transformation
2. Power Transformation
1. Logarithmic Transformations
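Exponential Growth: A variable grows… Linearly: when it adds a fixed increment in each equal time period. Exponentially: when it is multiplied by a fixed number in each equal time period.
The King’s Chess Board…
King’s Offer: 1,000,000 grains - 30 days
Wise Man: 1 grain on the first day, doubling each day for 30 days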
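A quick calculation shows why the wise man's request is the better deal. The sketch assumes he receives 1 grain on day 1 and twice the previous day's amount on each later day. The slide does not say whether the king's 1,000,000 grains are a total or a daily amount, so the comparison covers both readings.

```python
days = 30

# Grains received on each day: 1, 2, 4, ... (doubling daily).
grains_per_day = [2 ** d for d in range(days)]
total = sum(grains_per_day)

print("grains on day 30:", grains_per_day[-1])
print("total over 30 days:", total)

# Two possible readings of the king's offer (assumptions).
king_total = 1_000_000
king_daily = 1_000_000 * days
print(total > king_total, total > king_daily)
```

Linear growth adds the same amount each day, while doubling multiplies by the same factor each day, which is why the doubling total overtakes both readings of the king's offer.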
Cell Phone Growth
Suspect Exponential Growth… 1. Calculate the ratios of consecutive terms. IF they are approximately the same… continue.
Suspect Exponential Growth… 2. Apply a Transformation that: a. Transforms exponential growth into linear growth b. Transforms non-exponential growth into non-linear growth
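The ratio check in step 1 can be sketched as follows. Both data sets are hypothetical, and the `tolerance` used to decide "approximately the same" is an arbitrary assumption, not a rule from the slides.

```python
def consecutive_ratios(ys):
    """Ratio of each term to the term before it."""
    return [later / earlier for earlier, later in zip(ys, ys[1:])]

def roughly_constant(ratios, tolerance=0.05):
    """True if the ratios differ by no more than `tolerance` (an assumed cutoff)."""
    return max(ratios) - min(ratios) <= tolerance

linear = [10, 15, 20, 25, 30]                 # hypothetical: adds 5 each step
exponential = [10, 15, 22.5, 33.75, 50.625]   # hypothetical: times 1.5 each step

print(consecutive_ratios(linear), roughly_constant(consecutive_ratios(linear)))
print(consecutive_ratios(exponential), roughly_constant(consecutive_ratios(exponential)))
```

For the linear series the ratios drift steadily toward 1, while for the exponential series they stay at the common growth factor.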
Logarithm Review…
1. log(AB) = log A + log B
2. log(A/B) = log A - log B
3. log(X^p) = p log X
The Transformation… We hypothesize an exponential model of the form y = ab^x. To gain linearity, use the (x, log(y)) transformation. Form? –
When our data is growing exponentially… if we plot the log of y versus x, we should observe a straight line for the transformed data!
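To see the straightening effect numerically, the sketch below fits a least-squares line to (x, y) and to (x, log y). The data are hypothetical (they follow y = 3 · 2^x exactly), and `least_squares` is a small helper written for this example, not a function from the textbook or a calculator.

```python
import math

def least_squares(xs, ys):
    """Return (intercept, slope, r) for the least-squares line of ys on xs."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / math.sqrt(sxx * syy)
    return intercept, slope, r

# Hypothetical data that follow y = 3 * 2**x exactly (not from the slides).
xs = list(range(8))
ys = [3 * 2 ** x for x in xs]

_, _, r_raw = least_squares(xs, ys)
log_ys = [math.log10(y) for y in ys]
intercept, slope, r_log = least_squares(xs, log_ys)

print("r for (x, y):    ", round(r_raw, 4))
print("r for (x, log y):", round(r_log, 4))
print("log y =", round(intercept, 4), "+", round(slope, 4), "* x")
```

With y = ab^x, the rules of logarithms give log y = log a + (log b)x, so the fitted intercept estimates log a and the fitted slope estimates log b.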
LOG (Y) = (year) R-sq = 98.2%
Eliminate first 4 years & perform regression
LOG (New Y) = (New X) R-sq = 99.99%
Predictions in Exponential Growth Model: Regression is often used for predictions. In exponential growth, logarithms rather than actual values follow a linear pattern. To make a prediction from an exponential growth model we must thus “undo” the logarithmic transformation. The inverse operation of a logarithm is raising the base to a power (exponentiation).
LOG (New Y) = (New X) R-sq = 99.99% Predict the number of cell phone users in 2000.
If a variable grows exponentially… its logarithms grow linearly! In other words… if (x, y) is exponential, then (x, log(y)) is linear!
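Undoing the logarithm to get a prediction can be sketched like this. The intercept and slope are hypothetical placeholders, because the coefficients of the cell phone regression are not given above, and base-10 logarithms are assumed.

```python
# Hypothetical fitted line in the log scale: log10(y) = intercept + slope * x
intercept = 0.5
slope = 0.25

def predict(x):
    """Predict y by undoing the base-10 logarithm: y = 10 ** (intercept + slope * x)."""
    return 10 ** (intercept + slope * x)

# Equivalent exponential form y = a * b**x, with a = 10**intercept and b = 10**slope.
a = 10 ** intercept
b = 10 ** slope

x_new = 6
print(predict(x_new))
print(a * b ** x_new)
```

The two printed values agree up to rounding, which shows that a line fitted to (x, log y) is the same model as the exponential curve y = ab^x.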
Read and do Technology Toolbox- Page on your own!!
2. Power Transformations
Example: Pizza Shop: you order pizza by diameter: 10 inch, 12 inch, 14 inch. The amount you get depends on the area of the pizza. Area of a circle = pi times the square of the radius. This is a power law model.
Power Laws: We expect area to go up with the square of a dimension. We expect volume to go up with the cube of a dimension. Real examples: many characteristics of living things. Kleiber’s Law: the rate at which animals use energy goes up as the ¾ power of their body weight (works from bacteria to whales).
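A short calculation makes the pizza power law concrete. The function name `pizza_area` is an illustrative assumption, and the diameters are the three sizes from the slide.

```python
import math

def pizza_area(diameter):
    """Area of a circular pizza in square inches: pi * (diameter / 2) ** 2."""
    return math.pi * (diameter / 2) ** 2

for d in (10, 12, 14):
    print(d, "inch:", round(pizza_area(d), 1), "square inches")

# Area grows with the square of the diameter, not with the diameter itself.
print(round(pizza_area(14) / pizza_area(10), 2))
```

A 14-inch pizza is only 1.4 times as wide as a 10-inch pizza, but it has 1.4^2 = 1.96 times the area.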
Power Laws Become Linear Exponential growth becomes linear when we apply the logarithm to the response variable (y). Power Laws become linear when we apply the logarithm transformation to BOTH variables.
To Achieve Linearity…
1. The power law model is y = ax^p
2. Take the logarithm of both sides of the equation (this straightens the scatterplot)
3. The power p in the power law becomes the slope of the straight line that links log(y) to log(x)
4. Undo the transformation to make a prediction
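Written out, the steps above amount to the short derivation below. It assumes the power law has the form y = ax^p and uses base-10 logarithms; any other base works the same way.

```latex
\begin{align*}
y &= a\,x^{p} && \text{power law model} \\
\log y &= \log a + p \log x && \text{take logs of both sides: a line in } (\log x, \log y) \\
\hat{y} &= 10^{\log a + p \log x} = a\,x^{p} && \text{undo the logarithm to predict}
\end{align*}
```

The slope of the fitted line is the power p, and the intercept is log a.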
Fish Example… Read Example 4.9, page 216. Model: weight = a × length^3
log(weight) = log a + [3 × log(length)]. Yes, it appears very linear: perform LSRL on [log(length), log(weight)]
LSRL: log(weight)= log(length) r = .9926, r^2 = .9985
log(weight)= log(length) = log(length) This is the final power equation for the original data (note- look at p-value)!
Prediction… Why did we do this? weight = x length Predict the weight of a 36cm fish
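The whole power-law workflow (take logs of both variables, fit a line, undo the transformation) can be sketched as follows. The lengths and weights are hypothetical numbers generated from weight = 0.01 × length^3; they are not the fish data from Example 4.9, so the predicted value is purely illustrative.

```python
import math

def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line of ys on xs."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical data following an exact cubic power law (not the textbook's data).
lengths = [10.0, 15.0, 20.0, 25.0, 30.0, 40.0]      # cm
weights = [0.01 * length ** 3 for length in lengths]

# Fit a line to (log length, log weight); the slope estimates the power p.
log_a, p = fit_line([math.log10(v) for v in lengths],
                    [math.log10(w) for w in weights])
a = 10 ** log_a

def predict_weight(length):
    """Undo the log transformation: weight = a * length ** p."""
    return a * length ** p

print("estimated power p:", round(p, 4))
print("estimated a:", round(a, 4))
print("predicted weight at 36 cm:", round(predict_weight(36.0), 2))
```

With real measurements the fitted power would be close to 3 rather than exactly 3, and the prediction would carry the scatter visible in the log-log plot.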
Summary- Order of Checking…
1. Look to see if there is a ___________________; if so, use LSRL
2. If points are ____________, plot (x, log y) or (x, ln y) to gain linearity
3. If there is a _________ relationship (power model), plot (log x, log y)
4. If the scatterplot looks ________, plot (log x, y)