Bivariate Data and Scatter Plots

Presentation transcript:

Slide 1: Bivariate Data and Scatter Plots (Section 3.1, Page 59)
Bivariate Data: The values of two different variables that are obtained from the same population element. While the variables may be either categorical or quantitative, we will focus on cases where they are both quantitative.
Can we predict values of one variable from values of the other variable?
Do the values of one variable cause the values of the other variable?

Slide 2: Scatter Plot Example, TI-83 (Section 3.1, Page 60)
Scatter plots always have an explanatory variable and a response variable. The choice is arbitrary. The explanatory variable is always plotted on the x-axis, and the response variable is always plotted on the y-axis.
STAT – EDIT – ENTER; enter x data in L1 and y data in L2.
2nd STAT PLOT – ENTER – 1: Plot 1
Highlight ON
Type: highlight the first icon
XList: 2nd L1
YList: 2nd L2
ZOOM 9: ZoomStat
TRACE; use the arrows to move to points and display values.

Slide 3: Linear Correlation (Section 3.1, Page 61)
Linear Correlation: A measure of the strength of a linear relationship between two variables. The closer the dots are to a straight line, the stronger the relationship.
If there is correlation, then we say the two variables are associated. Changes in the value of one variable are associated with changes in the value of the other variable.

Slide 4: Coefficient of Correlation, Measure of Strength (Section 3.2, Page 62)
Also known as the Pearson Correlation Coefficient, r.
r = −1: perfect straight line with negative slope
r = 0: no linear relationship at all
r = +1: perfect straight line with positive slope
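To make the three landmark values concrete, here is a minimal Python sketch that computes the Pearson coefficient from its standard definition (sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations). The function name `pearson_r` and the three tiny data sets are invented for illustration; they do not come from the slides.

```python
import math

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

# Illustrative data only (not from the slides).
xs = [1, 2, 3, 4, 5]
perfect_positive = pearson_r(xs, [2, 4, 6, 8, 10])   # points on a rising line
perfect_negative = pearson_r(xs, [10, 8, 6, 4, 2])   # points on a falling line
no_linear = pearson_r(xs, [4, 1, 0, 1, 4])           # y = (x - 3)^2, a curve

print(perfect_positive, perfect_negative, no_linear)
```

The third data set lies exactly on a parabola, yet its r is zero. This is a reminder that r measures only the linear part of a relationship.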

Slide 5: Problems (Page 71)

Slide 6: Correlation Coefficient, TI-83 Add-In Program (Section 3.2, Page 62)
Finding r.
STAT – EDIT – ENTER: enter data in L1 and L2
PRGM – CORRELTN
2nd L1 – Comma – 2nd L2
SCATTER PLOT? – 1=YES; (displays scatter plot)
ENTER; (displays: r = 0.8394)
This is a moderately strong positive relationship.

Slide 7: Association and Causality (Section 3.2, Page 63)
[Diagram labels: Shoe Size, Reading Scores, Grade Level; Elementary School Students]
Is this a reasonable association? Does giving students bigger shoes cause reading scores to improve? What explains this association?
Lurking Variable: A third variable, often unexpressed, that has an effect on either or both of the x and y variables, making it appear they are related.
Association alone can never establish causality!
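The shoe-size example can be imitated with a small Python sketch. The data below are entirely made up: shoe size and reading score are both constructed to rise with grade level, while being unrelated within any single grade. The helper `pearson_r` and every number are assumptions chosen for illustration, not values from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical students: (grade, shoe size, reading score).
# Grade level drives both variables; the offsets within a grade are
# chosen so that shoe size and score are uncorrelated inside each grade.
students = []
for grade in range(1, 6):
    for shoe_offset, score_offset in [(-1, 1), (0, -2), (1, 1)]:
        students.append((grade, 10 + grade + shoe_offset, 20 * grade + score_offset))

# Correlation across all grades pooled together.
overall_r = pearson_r([s[1] for s in students], [s[2] for s in students])

# Correlation within each grade, i.e. holding the lurking variable fixed.
within_r = {}
for grade in range(1, 6):
    group = [s for s in students if s[0] == grade]
    within_r[grade] = pearson_r([s[1] for s in group], [s[2] for s in group])

print(f"overall r = {overall_r:.3f}")
print("within-grade r:", within_r)
```

Pooled across grades, the correlation is strongly positive, but it vanishes once grade level is held fixed. That is the signature of a lurking variable in this constructed example.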

Slide 8: Problems (Page 71)

Slide 9: Problems (Page 72)

Slide 10: Problems (Page 72)

Slide 11: Linear Regression, Line of Best Fit (Section 3.3, Page 65)
If a straight-line model seems appropriate, the best-fit straight line is found by using the method of least squares. Suppose that ŷ = a + bx is the equation of a straight line, where ŷ (read "y-hat") represents the predicted value of y that corresponds to a particular value of x. The least squares criterion requires that we find the constants a and b such that Σ(y − ŷ)² is as small as possible.

Slide 12: Line of Best Fit (Section 3.3, Page 66)
The best line will be the one where the sum of the squares of the "misses" is at a minimum. Calculus procedures are used to find the coefficients a and b such that the line ŷ = a + bx has the smallest sum of squared misses. The resulting coefficients are b = r(s_y / s_x) and a = ȳ − b·x̄, where r is the correlation coefficient, s_y is the standard deviation of the y-values, s_x is the standard deviation of the x-values, and x̄ and ȳ are the means of the x- and y-values.
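The calculation behind the line of best fit can be sketched in a few lines of Python. This sketch assumes the standard least-squares results b = r·s_y/s_x and a = ȳ − b·x̄, which match the symbols the slide defines (r, s_y, s_x). The function name `least_squares_line` and the height/weight values are hypothetical; the data set used in the slides is not reproduced here.

```python
import statistics

def least_squares_line(xs, ys):
    """Return (a, b, r) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_x = statistics.stdev(xs)  # sample standard deviation of the x-values
    s_y = statistics.stdev(ys)  # sample standard deviation of the y-values
    # Pearson r from the sum of cross-products of deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / ((n - 1) * s_x * s_y)
    b = r * s_y / s_x          # slope
    a = mean_y - b * mean_x    # intercept: the line passes through (x-bar, y-bar)
    return a, b, r

# Hypothetical heights (inches) and weights (lbs); NOT the data from the slides.
heights = [61, 62, 64, 65, 66, 67, 69]
weights = [108, 115, 118, 130, 126, 140, 143]

a, b, r = least_squares_line(heights, weights)
print(f"y-hat = {a:.2f} + {b:.2f}x, r = {r:.4f}")
```

Because a = ȳ − b·x̄, the fitted line always passes through the point of means (x̄, ȳ), which is a quick sanity check on any hand calculation.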

Slide 13: Linear Regression, TI-83 Add-In Program (Section 3.3, Page 68)
a. For the above data, make a scatter plot, and comment on the suitability of the data for regression analysis.
STAT – EDIT; enter Height in L1 and Weight in L2.
PRGM – REGBASIC
X LIST = 2nd L1; Y LIST = 2nd L2
SCATTER PLOT: 1=YES
The pattern looks positive and linear, with no outliers that could cause problems.

Slide 14: Linear Regression, TI-83 Add-In Program (Section 3.3, Page 68)
b. Find the regression equation and r.
ENTER; the program pauses so the graph can be viewed, and pressing ENTER moves the program along. The equation has the form ŷ = a + bx. The coefficient of correlation is r = 0.7979, a relatively strong relationship.
c. Check the plot of the regression line versus the scatter plot.
ENTER – 1=YES

Slide 15: Linear Regression, TI-83 Add-In Program (Section 3.3, Page 68)
d. What is the value of the slope of the line, and what does it mean?
b is the slope of the line. It indicates the number of units of change in the y-value for every one-unit increase in the x-value. In this problem, for each one-inch increase in height, predicted weight increases by b lbs. Its units are lbs/inch.
e. What is the value of the intercept of the line, and what does it mean?
a is the y-intercept. It has no meaning in this problem: it would be the weight of a person of zero height.
f. What is the value of r²?
It is called the index of determination (also known as the coefficient of determination). It measures the strength of the model, 1 being perfect and 0 being useless. r² = 0.6367, so about 64% of the variation in weight is accounted for by the linear relationship with height, a relatively strong fit.
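The interpretations of the slope and of r² can be checked numerically. The sketch below fits a line to hypothetical height/weight data (not the slides' data), confirms that one extra inch changes the prediction by exactly b, and computes r² two ways: as the square of r, and as the fraction of variation in y that the line accounts for. The names `fit_line`, `sse`, and `sst` are illustrative choices.

```python
import math

def fit_line(xs, ys):
    """Return (a, b, r) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    b = sxy / sxx
    a = mean_y - b * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r

# Hypothetical heights (inches) and weights (lbs); NOT the data from the slides.
heights = [61, 62, 64, 65, 66, 67, 69]
weights = [108, 115, 118, 130, 126, 140, 143]

a, b, r = fit_line(heights, weights)

# Slope: the change in predicted weight for a one-inch increase in height.
one_inch_change = (a + b * 66) - (a + b * 65)

# r squared as "explained variation": 1 - (unexplained / total).
mean_w = sum(weights) / len(weights)
sse = sum((w - (a + b * h)) ** 2 for h, w in zip(heights, weights))  # squared misses
sst = sum((w - mean_w) ** 2 for w in weights)                        # total variation
r_squared = 1 - sse / sst

print(f"slope b = {b:.3f} lbs/inch, one-inch change = {one_inch_change:.3f}")
print(f"r^2 from r = {r ** 2:.4f}, r^2 from variation = {r_squared:.4f}")
```

The two r² values agree for any least-squares line with an intercept, which is why r² can be read as the proportion of variation explained by the model.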

Slide 16: Linear Regression, TI-83 Add-In Program (Section 3.3, Page 68)
g. Check the residual plot and explain what it means.
ENTER; 1 = YES
The horizontal line represents the regression line. For each actual value of x, the residual is the actual y-value minus the predicted y-value. The dots show the "misses," or residuals. If the residuals show some kind of pattern, the linear regression model is not appropriate for the data, and another model (for example, a quadratic) may be better. Since there is no pattern in this plot, the linear model is appropriate for this data.
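A short Python sketch shows what a "pattern" in the residuals looks like. It fits a straight line to data that actually follow a curve (y = x², chosen purely for illustration) and lists the residuals, actual minus predicted. The function names `fit_line` and `residuals` are assumptions for this sketch.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def residuals(xs, ys):
    """Residual for each point: actual y minus predicted y."""
    a, b = fit_line(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]

# Illustrative curved data: y = x^2, which a straight line cannot capture.
curved_x = [0, 1, 2, 3, 4]
curved_y = [x ** 2 for x in curved_x]

curved_res = residuals(curved_x, curved_y)
print(curved_res)  # [2.0, -1.0, -2.0, -1.0, 2.0]
```

The residuals are positive at both ends and negative in the middle. This U-shape is the kind of pattern that signals a linear model is missing curvature. Note also that the residuals sum to zero, as they always do for a least-squares line with an intercept.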

Slide 17: Linear Regression, TI-83 Add-In Program (Section 3.3, Page 68)
h. Use the model to predict the weight of a woman who is 65 inches tall.
PREDICTED Y: 1 = YES
X=65
Answer: the predicted weight, in lbs.
i. Use the model to predict the weight of a woman who is 77 inches tall.
ENTER: 1 = YES
X=77
Answer: the predicted weight, in lbs.
Notice that the range of the x-values is from 61 to 69 inches. 77 inches is too far above the actual values used to develop the model. While the result is mathematically correct, it is not valid in the context of the problem.
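The caution about predicting outside the observed x-range can be built into a prediction helper. The sketch below is one possible approach, not the calculator program's logic: it returns the predicted value together with a flag saying whether x lies inside the range of the data used to fit the line. The data are hypothetical (chosen to span 61 to 69 inches, like the range quoted above), and `fit_line` and `predict` are illustrative names.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def predict(a, b, x, x_min, x_max):
    """Return (y_hat, within_range); within_range is False when extrapolating."""
    y_hat = a + b * x
    within_range = x_min <= x <= x_max
    return y_hat, within_range

# Hypothetical heights (inches) and weights (lbs); NOT the data from the slides.
heights = [61, 62, 64, 65, 66, 67, 69]
weights = [108, 115, 118, 130, 126, 140, 143]

a, b = fit_line(heights, weights)
x_min, x_max = min(heights), max(heights)

for height in (65, 77):
    y_hat, ok = predict(a, b, height, x_min, x_max)
    note = "within data range" if ok else "extrapolation, treat with caution"
    print(f"height {height} in: predicted weight {y_hat:.1f} lbs ({note})")
```

A flag like this does not make an extrapolated value any more trustworthy; it only makes the limitation visible to whoever reads the result.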

Slide 18: Problems (Page 72)

Slide 19: Problems (Page 73)
a. Construct a scatter diagram.
b. Does the pattern appear linear?
c. Find the equation of best fit.
d. What is the value of r and what does it mean?
e. What is the slope? What are its units? Interpret its meaning.
f. What is the y-intercept value? What does it mean?
g. What does the residual plot show? What does it mean?
h. Estimate the stride rate for a speed of 19.2 ft/sec. Is the estimate reliable? Why?
i. Estimate the stride rate for a speed of 31 ft/sec. Is the estimate reliable? Why?

Slide 20: Problems (Page 73)
c. What is the value of r and what does it mean?
d. What is the slope? What are its units? Interpret its meaning.
e. What is the y-intercept value? What does it mean?
f. What does the residual plot show? What does it mean?
g. Estimate the number of intersections for a state with 450 miles. Is the estimate reliable? Why?
h. Estimate the number of intersections for a state with 950 miles. Is the estimate reliable? Why?

Slide 21: Problems (Page 73)
a. Construct a scatter diagram. What does it indicate to you?
b. Find the equation of best fit.
c. What is the value of r and what does it mean?
d. What is the slope? What are its units? Interpret its meaning.
e. What is the y-intercept value? What does it mean?
f. What does the residual plot show? What does it mean?
g. Estimate the price of an 8-year-old car. Is the estimate reliable? Why?
h. Estimate the price of a 22-year-old car. Is the estimate reliable? Why?