Reexpressing Data

Re-express data: is that cheating?
Not at all. Sometimes data that look linear at first are actually not linear at all. Check these conditions:
Straight Enough Condition: Does the scatterplot look straight?
Randomization Condition: Are the individuals a representative sample from the population?
Does the Plot Thicken? Condition: Does a scatterplot of the residuals against the predicted values have ANY pattern? It shouldn’t. Clusters indicate that the relationship probably isn’t linear. Boring is good.

Huh? The picture you see is a scatterplot of fuel efficiency (mpg) vs. weight (lbs) for late-model cars. It looks OK, and r² is .816 (sometimes written as 81.6%), so maybe it is OK. The second graph, which extrapolates the fitted line, suggests that a 6000 lb car would get 0 mpg. The H2 weighs 6400 lbs. Now, it doesn’t get good gas mileage, but it is better than 0. The third graph is the residual plot for fuel efficiency. See how it has a “bend” in it? That bend indicates that the original relationship is not well described by a linear model.
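A rough sketch of what that “bend” looks like in numbers. The data here are hypothetical, not the cars from the slide: the mpg values are generated from a made-up rule so that the curve is known in advance, and `fit_line` is a small helper written for this example.

```python
def fit_line(xs, ys):
    """Least squares line through (xs, ys); returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data: weight in lbs, and mpg built from an invented rule
# in which gallons per 100 miles rises linearly with weight.
weights = [2000, 2500, 3000, 3500, 4000, 4500]
mpg = [100 / (1 + 0.0012 * w) for w in weights]

intercept, slope = fit_line(weights, mpg)
residuals = [y - (intercept + slope * w) for w, y in zip(weights, mpg)]

# Weight at which the straight line predicts 0 mpg (an extrapolation).
zero_mpg_weight = -intercept / slope

for w, r in zip(weights, residuals):
    print(f"{w} lbs: residual {r:+.2f} mpg")
print(f"Line predicts 0 mpg at about {zero_mpg_weight:.0f} lbs")
```

The residuals come out positive at both ends and negative in the middle. That is the bend: a pattern, so not boring, so the line is not the right model.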

So what do we do? Re-expressing fuel efficiency as fuel consumption (gal/100 miles) and plotting it against weight may solve the problem. Where else do we re-express? If I ran 9 miles per hour on a mile run, is that fast? What if I did that on a 100 m dash?
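Both re-expressions here are reciprocals, which a few lines of arithmetic can show. This is only a sketch: the function names are made up for the example, and the 25 mpg input is an arbitrary value, not a number from the slides.

```python
METERS_PER_MILE = 1609.344

def gallons_per_100_miles(mpg):
    """Fuel consumption: the reciprocal of mpg, scaled by 100."""
    return 100 / mpg

def minutes_per_mile(mph):
    """Running pace: the reciprocal of speed, scaled to minutes."""
    return 60 / mph

def seconds_for_100m(mph):
    """Time to cover 100 m at a constant speed given in mph."""
    meters_per_second = mph * METERS_PER_MILE / 3600
    return 100 / meters_per_second

print(gallons_per_100_miles(25))       # arbitrary example car
print(round(minutes_per_mile(9), 2))   # the mile run at 9 mph
print(round(seconds_for_100m(9), 1))   # the 100 m dash at 9 mph
```

The same 9 mph reads very differently once it is expressed as a time for the distance, which is the point of the running example.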

Why re-express?
1. Make the distribution of a variable (histogram) more symmetric.
2. Make the spread of several groups (seen in side-by-side boxplots) more alike.
3. Make the form of a scatterplot more nearly linear.
4. Make the scatter in a scatterplot spread out more evenly.

Ladder of Powers
This is a list of ways to re-express data.
Power 2: Square the y. Try if unimodal and skewed to the left.
Power 1: Do nothing.
Power ½: Square root of y. For counted data, try this.
Power 0: Log of y. Measurements that cannot be negative and values that grow by percentage increases may benefit. If the data have zeroes, add a small constant to all values before finding the log.
Power −½: Reciprocal root of y. Not common, but sometimes useful.
Power −1: Reciprocal of y. This is like our running example.
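The ladder can be sketched as a single function, with power 0 standing in for the log. This is a minimal illustration, not a standard routine: `re_express` is a name chosen for the example, base-10 logs are an assumption, and the sample values are invented.

```python
import math

def re_express(values, power):
    """Apply one rung of the ladder of powers to a list of y values.

    power 0 means "take the log" (base 10 assumed here); any other
    power p means y ** p, so 0.5 is the square root and -1 the reciprocal.
    """
    if power <= 0 and min(values) <= 0:
        # Logs and reciprocals are undefined at zero; shift the data first.
        raise ValueError("add a small constant so every value is positive")
    if power == 0:
        return [math.log10(v) for v in values]
    return [v ** power for v in values]

y = [1, 4, 25, 100]  # invented values
for p in (2, 1, 0.5, 0, -0.5, -1):
    print(p, [round(v, 3) for v in re_express(y, p)])
```

Moving down the ladder pulls in the large values more and more strongly, so it is worth trying neighboring rungs and checking the plot after each one.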

Ladder of Powers Part 2
If nothing feels good, you can try one of these three ideas (as long as none of the data are negative or zero).
Exponential: x, log y.
Logarithmic: log x, y. A wide range of x values, or a rapid descent that then levels off, might benefit from doing this. Remember the discussion of CEOs’ salaries from earlier in the year?
Power: log x, log y. Always an option.
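As a sketch of the power option: take logs of both x and y, fit a straight line to the logged values, then undo the log to get predictions back in the original units. The data are invented (they follow y = 3x² exactly so the answer is known), base-10 logs are an assumption, and the helper names are made up for the example.

```python
import math

def fit_line(xs, ys):
    """Least squares line through (xs, ys); returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fit_power_model(xs, ys):
    """Fit log10(y) = intercept + slope * log10(x)."""
    log_x = [math.log10(x) for x in xs]
    log_y = [math.log10(y) for y in ys]
    return fit_line(log_x, log_y)

def predict_power(intercept, slope, x):
    """Reverse the re-expression: 10 raised to the predicted log10(y)."""
    return 10 ** (intercept + slope * math.log10(x))

xs = [1, 2, 3, 4, 6, 8]          # invented data
ys = [3 * x ** 2 for x in xs]    # exact power relationship

intercept, slope = fit_power_model(xs, ys)
print(round(slope, 3), round(10 ** intercept, 3))
print(round(predict_power(intercept, slope, 5), 2))
```

The exponential and logarithmic options follow the same pattern, logging only y or only x before the fit. Either way, remember to reverse the re-expression before reporting a prediction.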

This is not a cure-all!! Some data just won’t benefit. Don’t worry. Yes, some data fits curved models, but the calculations are pretty intense.

Example

Homework Take a worksheet with you and do the circled problems.