Chapter 4: More About Relationships Between Two Variables
4.1 Transforming to Achieve Linearity
4.2 Relationship Between Categorical Variables
4.3 Establishing Causation
How do you determine if data is linear?
– Look at the graph (is it straight?)
– Look at the residual plot (are the points randomly scattered, with no pattern?)
– Look at the correlation coefficient, r (is it close to -1 or 1?)
If the answer to any of these questions is no, then a line is probably not a good fit and a curved function may be more appropriate.
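The three checks above can be sketched in a few lines of Python. This is only an illustration: the data set is hypothetical (it follows y = 2 · 3^x), and `linear_fit` is a helper name made up for this example, not something from the slides.

```python
import math

def linear_fit(xs, ys):
    """Least-squares line y = a + b*x. Returns (a, b, r)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                   # slope
    a = mean_y - b * mean_x         # intercept
    r = sxy / math.sqrt(sxx * syy)  # correlation coefficient
    return a, b, r

# Hypothetical curved data (assumption, not from the slides): y = 2 * 3**x
xs = [0, 1, 2, 3, 4, 5]
ys = [2 * 3 ** x for x in xs]

a, b, r = linear_fit(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

print("r =", round(r, 3))
print("residual signs:", ["+" if e > 0 else "-" for e in residuals])
```

For this data the residuals are positive at both ends and negative in the middle. That curved pattern is what signals that a line is a poor fit, even though r on its own is fairly high.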
Curved Functions Tested in AP Stats
What if the data is not linear?
1. Transform the data to determine whether Exponential or Power Regression is appropriate.
2. Run a Linear Regression on the transformed data.
3. Perform an inverse transformation to turn the equation into Exponential or Power.
Transform the Data to Determine if Exponential or Power Regression is Appropriate
1. Enter data into L1 and L2.
2. See that it is not linear (the scatterplot is curved and the residual plot shows a curved pattern).
3. Transform the data into logarithms.
– Enter L3 = log x and L4 = log y.
4. Look at (x, log y) and (log x, log y) for linearity.
– If (x, log y) is linear, use exponential regression.
– If (log x, log y) is linear, use power regression.
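The steps above can be sketched in Python as a rough check. The data is hypothetical (y = 2 · 3^x), and comparing the correlation of the two transformed data sets stands in for looking at the two scatterplots. That substitution is a simplification made for this example, so the plots should still be inspected.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data (assumption): L1 = x, L2 = y
xs = [1, 2, 3, 4, 5, 6]
ys = [2 * 3 ** x for x in xs]

# L3 = log x, L4 = log y (base-10 logs, like the calculator's LOG key)
log_x = [math.log10(x) for x in xs]
log_y = [math.log10(y) for y in ys]

r_exponential = correlation(xs, log_y)   # linearity of (x, log y)
r_power = correlation(log_x, log_y)      # linearity of (log x, log y)

suggested = "exponential" if abs(r_exponential) > abs(r_power) else "power"
print(round(r_exponential, 4), round(r_power, 4), suggested)
```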
Run a Linear Regression on the Transformed Data
Perform an Inverse Transformation to Turn the Equation into Exponential or Power
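As a sketch of the algebra behind this step, suppose the linear regression on the transformed data has intercept a and slope b, and that "log" means the base-10 logarithm (both are assumptions made here for illustration). Undoing the logarithm gives the two model forms:

```latex
\begin{align*}
\text{Exponential: } \log y = a + bx
  &\implies y = 10^{a + bx} = 10^{a}\,\bigl(10^{b}\bigr)^{x} \\
\text{Power: } \log y = a + b\log x
  &\implies y = 10^{a}\cdot 10^{b\log x} = 10^{a}\,x^{b}
\end{align*}
```

In the exponential case the base of the model is 10^b. In the power case the slope b becomes the exponent.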
Non-Linear Regression in the Calculator
By Hand vs. the Calculator
x = L1, y = L2, log x = L3, log y = L4
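The "by hand" route can be sketched in Python: fit a line to (x, log y), then back-transform the coefficients. The data is hypothetical (y = 2 · 3^x), and `least_squares` is a helper name invented for this example. A calculator's built-in exponential regression would be expected to give the same model, up to rounding.

```python
import math

def least_squares(xs, ys):
    """Least-squares line ys = a + b*xs. Returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Hypothetical data (assumption): x in L1, y in L2
xs = [1, 2, 3, 4, 5, 6]
ys = [2 * 3 ** x for x in xs]

# L4 = log y, then linear regression of L4 on L1
log_y = [math.log10(y) for y in ys]
a, b = least_squares(xs, log_y)

# Inverse transformation: log y = a + b*x  ->  y = (10**a) * (10**b)**x
A = 10 ** a
B = 10 ** b

def predict(x):
    return A * B ** x

print(f"y = {A:.3f} * {B:.3f}^x")
```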
Categorical Data in Two-Way Tables
Marginal distribution: the distribution of only one of the variables.
Find the marginal distribution of ice cream flavors.
            Chocolate  Vanilla  Strawberry
Freshmen
Sophomores
Juniors     25713
Seniors     10222
Categorical Data in Two-Way Tables
Conditional distribution: the distribution of one variable given a specific condition of the other variable.
Find the conditional distribution of grade level among those who prefer chocolate.
            Chocolate  Vanilla  Strawberry
Freshmen
Sophomores
Juniors     25713
Seniors     10222
1. What percent of students like strawberry?
2. What percent of seniors like vanilla?
3. What percent of chocolate lovers are juniors?
4. What percent of students are freshmen?
5. What percent of students are vanilla-loving seniors?
6. What percent of upperclassmen like chocolate?
            Chocolate  Vanilla  Strawberry
Freshmen
Sophomores
Juniors     25713
Seniors     10222
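Marginal and conditional distributions can be computed as in the sketch below. The counts are entirely hypothetical and are not the values from the table on the slides. They only show which totals go in the denominator for each kind of question.

```python
# Hypothetical two-way table (assumption, NOT the slide's data):
# rows = grade level, columns = preferred ice cream flavor
table = {
    "Freshmen":   {"Chocolate": 20, "Vanilla": 15, "Strawberry": 5},
    "Sophomores": {"Chocolate": 10, "Vanilla": 20, "Strawberry": 10},
    "Juniors":    {"Chocolate": 15, "Vanilla": 10, "Strawberry": 15},
    "Seniors":    {"Chocolate": 5,  "Vanilla": 15, "Strawberry": 20},
}

flavors = ["Chocolate", "Vanilla", "Strawberry"]
grand_total = sum(sum(row.values()) for row in table.values())

# Marginal distribution of flavor: column totals divided by the grand total
marginal_flavor = {
    f: sum(row[f] for row in table.values()) / grand_total for f in flavors
}

# Conditional distribution of grade level among chocolate lovers:
# each chocolate cell divided by the chocolate column total
chocolate_total = sum(row["Chocolate"] for row in table.values())
grade_given_chocolate = {
    grade: row["Chocolate"] / chocolate_total for grade, row in table.items()
}

# Joint proportion: one cell divided by the grand total
vanilla_seniors = table["Seniors"]["Vanilla"] / grand_total

print(marginal_flavor)
print(grade_given_chocolate)
print(vanilla_seniors)
```

The only thing that changes between question types is the denominator: the grand total for marginal and joint percents, a row or column total for conditional percents.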
Simpson’s Paradox
Suppose two people, Lisa and Bart, are editors for the St. Louis Post Dispatch. Answer the following questions given the data below (articles improved / articles edited):
What percentage of the articles she edited did Lisa improve in Week 1? _________ Bart? _________
Who improved a higher percentage of articles in Week 1? ______________________
What percentage of the articles she edited did Lisa improve in Week 2? _________ Bart? _________
Who improved a higher percentage of articles in Week 2? ______________________
What percentage of the articles she edited did Lisa improve in total? _________ Bart? _________
Who improved a higher percentage of articles in total? ______________________
        Week 1     Week 2     Total
Lisa    60 / 100   1 / 10     61 / 110
Bart    9 / 10     30 / 100   39 / 110
How can this be?
In the first week, Lisa improves 60 percent of the articles she edits while Bart improves 90 percent of the articles he edits. In the second week, Lisa improves just 10 percent of the articles she edits, while Bart improves 30 percent. Both times, Bart improved a much higher percentage of articles than Lisa, yet when the two weeks are combined, Lisa has improved a much higher percentage than Bart! The reversal is possible because the two editors handled very different numbers of articles each week, so the totals weight the two weeks differently.
        Week 1   Week 2   Total
Lisa    60.0%    10.0%    55.5%
Bart    90.0%    30.0%    35.5%
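The reversal can be checked numerically. The counts in the sketch below are an assumption: they are the (improved, edited) pairs implied by the percentages above together with a total of 110 articles per editor.

```python
# (articles improved, articles edited) per week.
# Assumed counts, inferred from the percentages and the 110-article totals.
counts = {
    "Lisa": {"Week 1": (60, 100), "Week 2": (1, 10)},
    "Bart": {"Week 1": (9, 10),   "Week 2": (30, 100)},
}

# Weekly improvement rate for each editor
weekly = {
    name: {week: improved / edited for week, (improved, edited) in weeks.items()}
    for name, weeks in counts.items()
}

# Overall rate: pool the counts first, then divide
# (this is NOT the average of the two weekly rates)
overall = {}
for name, weeks in counts.items():
    total_improved = sum(improved for improved, _ in weeks.values())
    total_edited = sum(edited for _, edited in weeks.values())
    overall[name] = total_improved / total_edited

for name in counts:
    rates = ", ".join(f"{week}: {rate:.1%}" for week, rate in weekly[name].items())
    print(f"{name}: {rates}, Total: {overall[name]:.1%}")
```

Lisa did most of her editing in Week 1, when both editors' rates were high. Bart did most of his in Week 2, when both rates were low. Pooling the counts therefore favors Lisa even though Bart has the higher rate in each week.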
Establishing Causation