Chapter 4 More on Two-Variable Data. Four Corners Play a game of four corners, selecting the corner each time by rolling a die Collect the data in a table.

Slides:



Advertisements
Similar presentations
Copyright © 2010 Pearson Education, Inc. Slide A least squares regression line was fitted to the weights (in pounds) versus age (in months) of a.
Advertisements

 Objective: To determine whether or not a curved relationship can be salvaged and re-expressed into a linear relationship. If so, complete the re-expression.
Chapter 4 Review: More About Relationship Between Two Variables
4.1: Linearizing Data.
Chapter 10: Re-Expressing Data: Get it Straight
Copyright © 2010 Pearson Education, Inc. Slide
Inference for Regression
Chapter 10 Re-Expressing data: Get it Straight
Regression Analysis Module 3. Regression Regression is the attempt to explain the variation in a dependent variable using the variation in independent.
Chapter 8 Linear Regression © 2010 Pearson Education 1.
Scatter Diagrams and Linear Correlation
Chapter Four: More on Two- Variable Data 4.1: Transforming to Achieve Linearity 4.2: Relationships between Categorical Variables 4.3: Establishing Causation.
Lesson Quiz: Part I 1. Change 6 4 = 1296 to logarithmic form. log = 4 2. Change log 27 9 = to exponential form = log 100,000 4.
S ECTION 4.1 – T RANSFORMING R ELATIONSHIPS Linear regression using the LSRL is not the only model for describing data. Some data are not best described.
Gordon Stringer, UCCS1 Regression Analysis Gordon Stringer.
Correlation and Linear Regression
+ Hw: pg 788: 37, 39, 41, Chapter 12: More About Regression Section 12.2b Transforming using Logarithms.
More about Relationships Between Two Variables
Inference for regression - Simple linear regression
Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved Section 10-3 Regression.
Transforming to achieve linearity
Chapter 4: More on Two-Variable (Bivariate) Data.
Scatterplots, Associations, and Correlation
BPS - 3rd Ed. Chapter 211 Inference for Regression.
Transforming Relationships
Section Copyright © 2014, 2012, 2010 Pearson Education, Inc. Lecture Slides Elementary Statistics Twelfth Edition and the Triola Statistics Series.
Bivariate Data When two variables are measured on a single experimental unit, the resulting data are called bivariate data. You can describe each variable.
Chapter 10: Re-Expressing Data: Get it Straight AP Statistics.
Chapter 3 Section 3.1 Examining Relationships. Continue to ask the preliminary questions familiar from Chapter 1 and 2 What individuals do the data describe?
Summarizing Bivariate Data
Notes Bivariate Data Chapters Bivariate Data Explores relationships between two quantitative variables.
M25- Growth & Transformations 1  Department of ISM, University of Alabama, Lesson Objectives: Recognize exponential growth or decay. Use log(Y.
“Numbers are like people… torture them long enough and they’ll tell you anything...” Linear transformation.
Relationships If we are doing a study which involves more than one variable, how can we tell if there is a relationship between two (or more) of the.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide
Chapter 10 Re-expressing Data: Get It Straight!. Slide Straight to the Point We cannot use a linear model unless the relationship between the two.
4.1 Transforming Relationships. Transforming (reexpressing) -Applying a function such as the logarithmic or square root to a quantitative variable -Because.
Lecture 6 Re-expressing Data: It’s Easier Than You Think.
Copyright © 2010 Pearson Education, Inc. Slide A least squares regression line was fitted to the weights (in pounds) versus age (in months) of a.
1 Regression Analysis The contents in this chapter are from Chapters of the textbook. The cntry15.sav data will be used. The data collected 15 countries’
Chapter 8 Linear Regression HOW CAN A MODEL BE CREATED WHICH REPRESENTS THE LINEAR RELATIONSHIP BETWEEN TWO QUANTITATIVE VARIABLES?
Copyright © 2010 Pearson Education, Inc. Chapter 7 Scatterplots, Association, and Correlation.
Chapter 14: Inference for Regression. A brief review of chapter 4... (Regression Analysis: Exploring Association BetweenVariables )  Bi-variate data.
Fall Looking Back In Chapters 7 & 8, we worked with LINEAR REGRESSION We learned how to: Create a scatterplot Describe a scatterplot Determine the.
Section 4.1 Transforming Relationships AP Statistics.
Copyright © 2010 Pearson Education, Inc. Chapter 10 Re-expressing Data: Get it Straight!
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Chapter 10 Re-expressing Data: Get it Straight!
Chapter 10 Notes AP Statistics. Re-expressing Data We cannot use a linear model unless the relationship between the two variables is linear. If the relationship.
BPS - 5th Ed. Chapter 231 Inference for Regression.
Describing Relationships. Least-Squares Regression  A method for finding a line that summarizes the relationship between two variables Only in a specific.
Statistics 10 Re-Expressing Data Get it Straight.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Chapter 10 Re-expressing Data: Get it Straight!
HW 16 Key. 20:31 Wal-Mart. 20:31 a a.Scatterplot. Linear? No. It grows at a growing rate.
Chapter 13 Lesson 13.2a Simple Linear Regression and Correlation: Inferential Methods 13.2: Inferences About the Slope of the Population Regression Line.
Chapter 13 Lesson 13.2a Simple Linear Regression and Correlation: Inferential Methods 13.2: Inferences About the Slope of the Population Regression Line.
Chapter 4 Basic Estimation Techniques
Transforming Relationships
Re-expressing the Data: Get It Straight!
Bell Ringer Make a scatterplot for the following data.
Chapter 10 Re-Expressing data: Get it Straight
1) A residual: a) is the amount of variation explained by the LSRL of y on x b) is how much an observed y-value differs from a predicted y-value c) predicts.
Lecture Slides Elementary Statistics Thirteenth Edition
Re-expressing the Data: Get It Straight!
Re-expressing the Data: Get It Straight!
CHAPTER 29: Multiple Regression*
Transforming Relationships
CHAPTER 3 Describing Relationships
Chapters Important Concepts and Terms
More on Two-Variable Data
Presentation transcript:

Chapter 4 More on Two-Variable Data

Four Corners Play a game of four corners, selecting the corner each time by rolling a die Collect the data in a table listing the trial number and the number of people remaining in the game

Four Corners Trial number Students remaining

The Spread of Cancer Time (years) Number of cancer cells Page 194 in textbook

Sect 4.1 Transforming Relationships

Example 4.1 on page 195 The least squares regression line is not a good fit If we remove the elephant we dramatically drop our correlation coefficient (r= 0.86 becomes r=0.50)

Previous example with elephant removed We see a different trend The data curves to the right (is not linear)

Take the logarithm of both variables and the data becomes very linear The vertical spread about the LSRL is similar everywhere, so predictions of brain weight from body weight will be equally precise for any body weight (in the log scale)

Why transform? We need to have the data in a linear form so that we can find the LSRL and the r value When we have data that is not linear we must transform it so that it is as close to linear as possible Linear transformations cannot straighten a curved relationship between two variables ◦ Can change linear model kilometers to miles ◦ Can change linear model inches to feet

Applying a function such as a logarithm or square root to a quantitative transforming variable is called transforming or re- expressing the data Understanding how simple functions work helps us to choose and use transformations

First steps in transforming Transforming data amounts to changing the scale of measurement that was used when the data was collected Linear transformations cannot straighten a curved relationship between two variables To do that we resort to functions that are not linear Ex: logarithms, powers (+/-)

Monotonic Functions The different types of transformations (linear, +/- powers, and logarithms) are all monotonic A monotonic function f(t) moves in one direction as its argument t increases

Increasing/Decreasing Monotonic Functions A monotonic increasing function preserves the order of data If a>b, then f(a)>f(b) Its graph is increasing everywhere A monotonic decreasing function reverses the order of data If a>b, then f(a)<f(b) Its graph is decreasing everywhere Linear: a + bt (with slope +) Exponential (base >1) Log function Linear: a + bt (with slope -) Exponential (base <1) 1/t (for positive t)

A function can be monotonic over some range of t without being everywhere monotonic Ex: the square function t 2 is monotonic increasing for If the range of t includes both positive and negative values, the square is not monotonic It decreases as t increases for negative values of t and increases as t increases for positive values

Monotonic increasing Monotonic decreasing

Many variables take only 0 or positive values, so we are particularly interested in how functions behave for positive values of t

Linear vs. Exponential Growth Linear growth increases by a fixed amount in each equal time period adds a fixed increment in each equal time period Exponential growth increases by a fixed percentage of the previous total multiplied by a fixed number in each time period

To grasp the idea of multiplicative growth, consider a population of bacteria in which each bacterium splits into two each hour After one day there are 2 24 bacteria Rapid increase after a slow start

Exponential growth model y = a * b x a and b are constants Pattern follows a smooth curve

Nonlinear monotonic transformations Change data enough to alter the shape of distributions and the form of relations between two variables Are simple enough to preserve order and allow recovery of the original data

Strategy for transforming data: If the variable to be transformed takes values that are 0 or negative, first apply a linear transformation to make the values all positive Often we just add a constant to all the observations Then, choose a power or logarithmic transformation that simplifies the data Ex: one that approximately straightens a scatterplot

The logarithm transformation

Review of logarithms log b x = y remember: b y = x log (AB) = log A + log B log (A/B) = log A – log B log X p = p log X

Is the picture for cell phone growth (on page 205) exponential? We need an accurate way to check whether growth is exponential Can’t just “eye it”

Steps for the logarithm transformation First calculate the ratio of consecutive terms Don’t need to be exactly the same, but should be approximately the same Then apply a mathematical transformation that changes exponential growth into linear growth And patterns of growth that are not exponential into something other than linear

If we assume that it is exponential, then if we take the log of both sides we should get a straight line for our transformed data Remember, the exponential and logarithmic functions are inverses

Transforming cell phone growth

Transforming cell phone growth (continued) Log a and log b are constants, so the right side of the equation looks like the form for a straight line Plot the points in the form (x, log y) Table was shown previously Picture is on page 207 (next slide)

Plot appears slightly concave down, but more linear than original data

Compute LSRL to see results (careful what lists/data you use) Log y = x

Transforming cell phone growth (continued) r 2 is 0.982, so 98.2% of the variation in log y is explained by the least squares regression of log y on x Need to look at residual plot (next slide)

Ideally, should show random scatter about y = 0 reference line Curved pattern shows us that some improvement is still possible For now, we will accept the exponential model on the basis of the high r 2 value

Prediction in the exponential growth model Regression is often used for prediction In the case of exponential growth, the logarithms rather than the actual responses follow a linear pattern To do prediction, we need to “undo” the logarithm transformation to return to the original units of measurement The same idea works for any monotonic transformation There is always exactly one original value behind any transformed value, so we can always go back to our original scale

Making the best prediction… If you look at the residual plot on page 208, the last four points look very linear To try to predict the number of subscribers in 2000 we could discard the first four points because they are the oldest and furthest removed from 2000 This gives us a new LSRL: log NewY = NewX

Section 4.1 cont.

Power law models y = a x p, where a and p are numbers Power law models become linear when we apply the logarithm transformation to both variables Powers greater than 1 give graphs that bend upward. The sharpness of the bend increases as the power increases. Powers less than 1 but greater than 0 give graphs that bend downward. Powers less than 0 give graphs that decrease as x increases. Greater negative values of p result in graphs that decrease more quickly.

y = a x p log y = log a + p log x Taking the logarithm of both variables straightens the scatterplot of y against x The power p in the power law becomes the slope of the straight line that links log y to log x

Prediction in power law models If taking the logarithm of both variables makes a scatterplot linear, a power law is a reasonable model for the original data The slope is only an estimate of the p in an underlying power model The greater the scatter of the points in the scatterplot about the fitted line, the smaller our confidence that this estimate is accurate

Example 4.9 page 216

Don’t forget to perform an inverse transformation:

Power law modeling with the graphing calculator Enter data into L1 and L2 Show scatterplot L3 = log L1 L4 = log L2 Graph L3 and L4 with each other Verify it is linear Find LSRL for L3/L4 Check r 2 value Graph residuals Look for random scatter Transform back to power model Make prediction