Scatterplot and trendline

Scatterplot: A scatterplot explores the relationship between two quantitative variables. Example:

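What can we tell from a scatterplot? The direction of the relationship (positive, negative, or no correlation). The strength of the relationship (as a rough guide, strong when |r| is about 0.8 or above, weak when |r| is about 0.5 or below). The form of the relationship (linear, quadratic, cubic, etc.).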

Some examples i: r = 0.5. Weak. Points are scattered around. Positive (upward trend). Hard to tell the form; roughly linear?

Some examples ii: r = 0.8. Strong. Points are compact. Positive. Clear linear pattern.

Some examples iii: r = 0.2. Very weak, almost no pattern. Points are all over the plot. Very hard to tell whether the relationship is positive or negative.

Some examples iii: r = 0. No pattern. Points fall everywhere in the plot. Cannot tell whether there is an upward or downward trend.

Some examples iv: r = [value not shown]. Strong, negative relationship (downward trend). Linear pattern.

Some examples v: r = [value not shown]. Not very different from plot iii.

What is r? r is called the correlation coefficient. There are many different ways of calculating r. The one we use most frequently is the Pearson product-moment correlation coefficient (or simply the Pearson correlation coefficient).

How to calculate r? Formula to be introduced later.

Other facts about r: It ranges from -1 to +1. The sign shows the direction of the correlation. The absolute value shows the strength of the correlation. Important: r measures only linear correlation.
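
The facts about r listed here can be checked numerically. The sketch below is only an illustration: the data are made up (they are not from the slides), and NumPy's `np.corrcoef` is used to obtain the Pearson coefficient rather than a hand-written formula.

```python
import numpy as np

# Hypothetical data, invented for illustration only
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_up = np.array([2.0, 4.0, 5.0, 4.0, 6.0])    # tends to rise with x
y_down = y_up[::-1]                            # same values, reversed: tends to fall

# np.corrcoef returns a 2x2 correlation matrix; entry [0, 1] is r between the inputs
r_up = np.corrcoef(x, y_up)[0, 1]
r_down = np.corrcoef(x, y_down)[0, 1]

# An exactly linear relationship gives |r| = 1 (up to floating-point rounding)
r_line = np.corrcoef(x, 2.0 * x + 1.0)[0, 1]

print("upward trend:   r =", round(r_up, 3))
print("downward trend: r =", round(r_down, 3))
print("exact line:     r =", round(r_line, 3))
```

Reversing the order of the y values flips the sign of r but not its absolute value, which matches the idea that the sign carries direction and the absolute value carries strength.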

Example: y = x^2. r is almost 0 (r = [value not shown]), yet there is clearly a quadratic relationship between x and y.
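
A minimal sketch of the y = x^2 example, assuming x values placed symmetrically around 0 (the slide's actual x values are not shown, so this choice is an assumption). With symmetric x the Pearson coefficient comes out essentially zero even though y is completely determined by x.

```python
import numpy as np

# Assumed x values: integers from -5 to 5, symmetric around 0
x = np.arange(-5, 6, dtype=float)
y = x ** 2                      # exact quadratic relationship, no noise

r = np.corrcoef(x, y)[0, 1]     # Pearson correlation between x and y
print("r =", r)
```

If the x values were not symmetric around 0 (for example, all positive), r would no longer be near zero, because over that range a parabola is reasonably well approximated by a rising line.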

How to use correlation: Make predictions. Given a value of x and a model of how x and y are correlated, we can predict the value of y. This is an example of model fitting in statistics.

Another classification of variables: By their role in the model, variables fall into two classes: (1) independent, also called explanatory or predictor (the x variable); (2) dependent, also called response (the y variable).

What a statistical model does: It gives us a measure of the relationship between two (or more) variables. It gives us a measure of how well the model performs, since we always have many models to choose from. It enables us to make predictions using the relationship identified in the model.

Graphical illustration of the model: trendline; r = 0.8; positive, strong, linear.

Regression: Regression is one way of fitting a statistical model. For the data above, the model is Y = b0 + b1*x + error, where b0 is called the intercept and b1 is called the regression coefficient (slope). The error term is a must-have part of any statistical model.

Numeric example: data X, data Y, and r [values not shown].

Results of a regression i: Intercept and slope [values not shown]. The line in the middle is called the trendline or regression line. The vertical distance between an individual point and the line is called the residual.

Results of a regression ii: X, Y, Y.hat, and Resid [values not shown]. Y.hat is the predicted value of Y given X and the fitted regression model. Residual = Y - Y.hat; this is the error in our model.

How do we get the regression model? We find the intercept and slope that satisfy the following conditions: the sum of all residuals is 0, and the sum of the squared residuals is minimized.
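
The two conditions can be sketched with the standard closed-form least-squares solution for one predictor. The data below are hypothetical (the slide's numbers are not shown); only the X range of 10 to 50 is borrowed from the slides. The names `b0`, `b1`, `y_hat`, and `resid` mirror the notation used above.

```python
import numpy as np

# Hypothetical data, NOT the values from the slides
x = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
y = np.array([12.0, 25.0, 29.0, 41.0, 48.0])

# Closed-form least-squares estimates for a single predictor
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)  # slope
b0 = y.mean() - b1 * x.mean()                                               # intercept

y_hat = b0 + b1 * x      # predicted values
resid = y - y_hat        # residuals = observed - predicted

def sse(intercept, slope):
    """Sum of squared residuals for a candidate line."""
    return float(np.sum((y - (intercept + slope * x)) ** 2))

print("intercept:", b0, "slope:", b1)
print("sum of residuals:", resid.sum())
print("SSE at the fit:", sse(b0, b1))
print("SSE with a nudged slope:", sse(b0, b1 + 0.05))
```

The residuals sum to zero up to rounding, and nudging the slope or intercept away from the fitted values makes the sum of squared residuals larger, which is what "minimized" means here.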

How do we measure how good this model is? One measure is called r-squared. For this model, r^2 is about 0.845. It means that, of all the variation observed in the variable Y, about 84.5% is explained by the predictor X. The rest is error.

How is r-squared related to our measure of correlation? Hint: it is called r-squared.

Yes, it is the squared value of the correlation between X and Y.
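
As a quick check of this identity, the sketch below computes r-squared two ways on hypothetical data (not the slide's data): as the share of variation in Y explained by the fitted line, and as the square of the Pearson correlation. `np.polyfit` with degree 1 is used here as one convenient way to get the least-squares line.

```python
import numpy as np

# Hypothetical data, NOT the values from the slides
x = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
y = np.array([12.0, 25.0, 29.0, 41.0, 48.0])

# Degree-1 polyfit returns [slope, intercept] of the least-squares line
b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x

ss_res = np.sum((y - y_hat) ** 2)       # variation left unexplained (error)
ss_tot = np.sum((y - y.mean()) ** 2)    # total variation in y
r_squared = 1.0 - ss_res / ss_tot       # share of variation explained by x

r = np.corrcoef(x, y)[0, 1]             # Pearson correlation

print("r-squared from the regression:", r_squared)
print("correlation squared:          ", r ** 2)
```

The two numbers agree up to rounding for a regression with a single predictor and an intercept.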

Some things to know: This relationship holds only for regression with one predictor. The trendline or regression model only works for X values within the range of our data, or not too far from it. In this case, our X values range from 10 to 50, so we can predict Y using X = 26 but not X = 126. Correlation does not imply causality. Example: children's shoe size vs. reading ability (both increase with age).
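
One way to make the range restriction concrete is to refuse predictions that fall too far outside the observed X values. This is only a sketch: the function name `predict_in_range`, the 10% `margin` standing in for "not too far", and the intercept and slope values are all assumptions chosen for illustration; only the X range of 10 to 50 comes from the slides.

```python
def predict_in_range(x_new, b0, b1, x_min, x_max, margin=0.1):
    """Predict y = b0 + b1 * x_new, but only near the observed x range.

    margin is the fraction of the x range allowed beyond each end
    (an assumed tolerance, not a rule from the slides).
    """
    span = x_max - x_min
    lower = x_min - margin * span
    upper = x_max + margin * span
    if not (lower <= x_new <= upper):
        raise ValueError(
            f"x = {x_new} is outside [{lower}, {upper}]; the fitted line is not trusted there"
        )
    return b0 + b1 * x_new

# Hypothetical fitted coefficients; the x range 10 to 50 is from the slides
B0, B1 = 4.6, 0.88

print(predict_in_range(26, B0, B1, x_min=10, x_max=50))   # inside the data range

try:
    predict_in_range(126, B0, B1, x_min=10, x_max=50)     # far outside the range
except ValueError as err:
    print("refused:", err)
```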