Working with relationships between two variables: “Donation” made to teacher & Stats Test Score.



Correlation & Regression. Univariate & Bivariate Statistics –U: frequency distribution, mean, mode, range, standard deviation –B: correlation between two variables. Correlation –linear pattern of relationship between one variable (x) and another variable (y) –an association between two variables –relative position on one variable corresponds to relative position on the other. Variables –X: an explanatory variable attempts to explain the observed outcomes in y –Y: a response variable measures an outcome of a study. Warning: –No proof of causality –Cannot assume x causes y

Scatterplot or Scatter Diagram a plot of paired data to determine or show a relationship between two variables

Graduating Seniors by State in 2005 The state of Louisiana The state of Rhode Island

AP Statistics, Section 3.1, Part 1. Figure 3.1 (Percent taking SAT vs. Score). Attributes of a good scatterplot –Consistent and uniform scale –Labels on both axes –Accurate placement of data –Data spread throughout the axes –Axis break lines if not starting at zero. To achieve this, try to draw your scatterplots on graph paper.

Graduating Seniors by State in 2005 States from NE, Mid-Atlantic and West States from Midwest, Mtn Central, and Southwest

Paired Data

Scatter Diagram

Linear Correlation The general trend of the points seems to follow a straight line segment.

Linear Correlation

Non-Linear Correlation

No Linear Correlation

High Linear Correlation Points lie close to a straight line.

High Linear Correlation

Moderate Linear Correlation

Low Linear Correlation

Perfect Linear Correlation

Questions Arising Can we find a relationship between x and y? How strong is the relationship?

When there appears to be a linear relationship between x and y: attempt to “fit” a line to the scatter diagram.

When using x values to predict y values: Call x the explanatory variable Call y the response variable

Scatterplot! No Correlation –Random or circular assortment of dots. Positive Correlation –ellipse leaning to right –GPA and SAT –Smoking and Lung Damage –Number of Whoppers eaten and Mr. Flynn’s weight. Negative Correlation –ellipse leaning to left –Depression & Self-esteem –Studying & test errors –Vampire friends & Werewolf boyfriends

Interpreting Scatterplots. Pattern/Shape: linear, parabola, bell shaped –Deviations from pattern: Are there areas where the data conform less to the pattern? –Form: Are there clusters of data? –Special data: Are there any influential points? –Is a transformation of data necessary? Trend/Direction: positive, negative, or WTF? –As x increases, what happens to y? Strength/Association: weak, moderate, strong –If a line were drawn through the data, how close would the points be to the line? –Is there a small or large amount of variability within the y values?

Pearson’s Correlation Coefficient “r” indicates… –strength of relationship (strong, weak, or none) –the variation of the points around the model (linear) –direction of relationship: positive (direct) – variables move in same direction; negative (inverse) – variables move in opposite directions. r ranges in value from –1.0 (strong negative) through 0 (no relationship) to +1.0 (strong positive). Try quick estimates –Next slide and strange quiz

Practice with Scatterplots r =.__ __

A relationship between the correlation coefficient, r, and the slope, b, of the least squares line: $b = r \cdot \frac{s_y}{s_x}$, where $s_x$ and $s_y$ are the standard deviations of x and y.

Linear correlation coefficient: –1 ≤ r ≤ +1

Calculating the Correlation Coefficient, r

Paired Data

Scatter Diagram

Find the Least Squares Line

Finding the slope

Finding the y-intercept

The equation of the least squares line is: y = a + bx
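The slope and intercept formulas did not survive in this transcript. As a sketch, the standard least squares formulas, written in the SS notation the later slides use, are given below; that the original slides used exactly this form is an assumption.

```latex
b = \frac{SS_{xy}}{SS_x}, \qquad a = \bar{y} - b\,\bar{x}

SS_{xy} = \sum xy - \frac{\left(\sum x\right)\left(\sum y\right)}{n}, \qquad
SS_x = \sum x^2 - \frac{\left(\sum x\right)^2}{n}
```

Here $\bar{x}$ and $\bar{y}$ are the means of the x and y values and n is the number of data pairs.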

To Compute r: Complete a table with columns listing x, y, x², y², xy. Compute SS_xy, SS_x, and SS_y. Use the formula: $r = \frac{SS_{xy}}{\sqrt{SS_x \, SS_y}}$
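The table method above can be sketched in a few lines of Python. The data are hypothetical (not from the slides), and the function names are made up for illustration.

```python
import math

def sums_of_squares(xs, ys):
    """Return (SS_xy, SS_x, SS_y) from the column totals of the table."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)              # x^2 column total
    sum_y2 = sum(y * y for y in ys)              # y^2 column total
    sum_xy = sum(x * y for x, y in zip(xs, ys))  # xy column total
    ss_xy = sum_xy - sum_x * sum_y / n
    ss_x = sum_x2 - sum_x ** 2 / n
    ss_y = sum_y2 - sum_y ** 2 / n
    return ss_xy, ss_x, ss_y

def correlation(xs, ys):
    """Pearson's r computed as SS_xy / sqrt(SS_x * SS_y)."""
    ss_xy, ss_x, ss_y = sums_of_squares(xs, ys)
    return ss_xy / math.sqrt(ss_x * ss_y)

# Hypothetical paired data, chosen only to keep the arithmetic small.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation(xs, ys)
print(round(r, 4))
```

For these five points the column totals give SS_xy = 6, SS_x = 10, and SS_y = 6, so r = 6 / √60.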

Find the Correlation Coefficient

Calculations:

The Correlation Coefficient: r ≈ 0.98

AP Statistics, Section 3.2, Part 1. Calculating Correlation: The calculation of correlation is based on the mean and standard deviation. Remember that neither the mean nor the standard deviation is a resistant measure.

Calculating Correlation: What do the contents of the parentheses look like? They are the formula for calculating z-values. What happens when the values are both from the lower half of the population? Both z-values are negative. Their product is positive. From the upper half? Both z-values are positive. Their product is positive.

Calculating Correlation: What happens when one value is from the lower half of the population but the other value is from the upper half? One z-value is positive and the other is negative. Their product is negative.
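The z-value reasoning above can be checked numerically. This is a minimal sketch assuming the usual form r = (1 / (n − 1)) · Σ z_x · z_y with sample standard deviations; the data and function names are hypothetical.

```python
import math

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def z_scores(values):
    """Standardize each value: (value - mean) / standard deviation."""
    m, s = mean(values), sample_sd(values)
    return [(v - m) / s for v in values]

def z_products(xs, ys):
    """Product of the paired z-values for each data point."""
    return [zx * zy for zx, zy in zip(z_scores(xs), z_scores(ys))]

def correlation_from_z(xs, ys):
    """r as the sum of z-value products divided by n - 1."""
    return sum(z_products(xs, ys)) / (len(xs) - 1)

# Hypothetical paired data (not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
products = z_products(xs, ys)
r_z = correlation_from_z(xs, ys)
print([round(p, 3) for p in products], round(r_z, 4))
```

The first point (1, 2) is below both means, so both z-values are negative and the product is positive; the last point (5, 5) is above both means and also contributes a positive product.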

Using the TI-83/84 to calculate r. With Diagnostics ON: Run LinReg(a+bx) [STAT>CALC>option 8] with the explanatory variable as the first list and the response variable as the second list. The results are the slope and vertical intercept of the regression equation (more on that later) and the values of r and r². (For more on r², check the next handout ;)

Predictive Potential. Coefficient of Determination –r² –Amount of variance accounted for in y by x (on a scale from 0% to 100%) –Percentage increase in accuracy you gain by using the regression line to make predictions –Without correlation, you can only guess the mean of y –[Used with regression] Understanding r-squared activity
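One way to see "variance accounted for" is to compare the prediction error of the regression line with the error from always guessing the mean of y. The sketch below assumes the common definition r² = 1 − SSE/SST for a least squares line; the data and names are hypothetical.

```python
def fit_line(xs, ys):
    """Least squares intercept a and slope b for y' = a + b*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b = ss_xy / ss_x
    a = mean_y - b * mean_x
    return a, b

def r_squared(xs, ys):
    """Share of the variation in y accounted for by the line."""
    a, b = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    # Error left over when predicting with the line.
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # Error when the only guess available is the mean of y.
    sst = sum((y - mean_y) ** 2 for y in ys)
    return 1 - sse / sst

# Hypothetical paired data (not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r2 = r_squared(xs, ys)
print(round(r2, 4))
```

With these points SSE = 2.4 and SST = 6, so the line accounts for 60% of the variation in y.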

Limitations of Correlation. Linearity: –can’t describe (accurately) non-linear relationships –e.g., flavor and % eaten, thickness and strength. Truncation of range: –underestimates strength of relationship if you can’t see the full range of x values. No proof of causation: –third variable problem: a third variable could be causing change in both variables –directionality: can’t be sure which way causality “flows”. “We don’t get it” – what does it have to do with that Line? That is for another session…

Regression. Regression: Correlation + Prediction –predicting y based on x –e.g., predicting throwing points (y) based on distance from target (x). Regression equation –formula that specifies a line –y’ = a + bx –plug in an x value (distance from target) and predict y (points) –note: y = actual value of a score, y’ = predicted value. Data Handout –Test takers, planets, darts

AP Statistics, Section 3.3, Part 1. The Least-Squares Regression: Finds the best-fit line by minimizing the total area of the squares formed by the differences between the real data and the values predicted by the model.

The Least-Squares Regression: Statisticians use a slightly different version of “slope-intercept” form: ŷ = a + bx. The slope is the product of the r value and the standard deviation ratio: b = r(s_y / s_x). The y-intercept is found using the average x and average y: a = ȳ − b·x̄.
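As a sketch of the slope and intercept rules on this slide, the code below takes the slope as r times the ratio of standard deviations (s_y over s_x) and the intercept as mean y minus slope times mean x. The data and function names are hypothetical.

```python
import math

def mean(values):
    return sum(values) / len(values)

def sample_sd(values):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def correlation(xs, ys):
    """Pearson's r from deviations about the means."""
    mx, my = mean(xs), mean(ys)
    ss_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return ss_xy / math.sqrt(ss_x * ss_y)

def least_squares_line(xs, ys):
    """Return (a, b) using b = r * s_y / s_x and a = mean_y - b * mean_x."""
    b = correlation(xs, ys) * sample_sd(ys) / sample_sd(xs)
    a = mean(ys) - b * mean(xs)
    return a, b

# Hypothetical paired data (not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = least_squares_line(xs, ys)
print(round(a, 3), round(b, 3))
```

Because the intercept is built from the two means, the fitted line always passes through the point (mean x, mean y).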

Regression Graphic – Regression Line if x=18 then… y’=47 if x=24 then… y’=20
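The two predictions quoted on this graphic are enough to recover a line, assuming both lie exactly on the plotted regression line and were not rounded. The sketch below does that and then predicts for another x; the function names are hypothetical, and the recovered coefficients are an inference, not values stated in the slides.

```python
def line_through(p1, p2):
    """Intercept a and slope b of the line through two (x, y') points."""
    (x1, y1), (x2, y2) = p1, p2
    b = (y2 - y1) / (x2 - x1)   # rise over run
    a = y1 - b * x1             # solve y1 = a + b*x1 for a
    return a, b

def predict(a, b, x):
    """Predicted value y' = a + b*x."""
    return a + b * x

# Points read from the graphic: x=18 gives y'=47, x=24 gives y'=20.
# Assumption: both points lie exactly on the regression line.
a, b = line_through((18, 47), (24, 20))
print(a, b, predict(a, b, 20))
```

Under that assumption the slope is negative, which matches the graphic: larger x goes with smaller predicted y.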

Predicting Model: To put the regression line on the graph, use Statistics: EQ: RegEQ from the VARS menu to paste the regression equation into Y1. Then you can use Trace, Table, or Y1 to find response values that correspond to particular explanatory values.

Regression Equation y’ = a + bx –y’ = predicted value of y –b = slope of the line –x = value of x that you plug in –a = y-intercept (where line crosses y axis). In the dart throwing case: –y’ = a + b(x) –So if the distance is 20 feet, plug in x = 20: y’ = a + b(20). See STAT – CALC – LinReg: a + bx

Drawing a Regression Line by Hand. Four steps: 1. Use the y-intercept (if possible; does it have meaning? Interval vs. ratio scale). 2. Plot the average point (mean x, mean y). 3. Pick a large value for x (so it falls on the right end of the graph), plug it into the equation, then plot the resulting point. 4. Connect the three points with a straight line!
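The four steps can be sketched as a small helper that produces the three points to plot and confirms they fall on one straight line. The coefficients and means below are hypothetical values from a made-up data set, not from the slides.

```python
def points_for_drawing(a, b, mean_x, mean_y, large_x):
    """Three points to plot: y-intercept, average point, right-end point."""
    return [(0, a), (mean_x, mean_y), (large_x, a + b * large_x)]

def collinear(p, q, r, tol=1e-9):
    """True if the three points lie on one straight line (cross product near 0)."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return abs(cross) <= tol

# Hypothetical least squares line y' = 2.2 + 0.6x with means (3, 4).
pts = points_for_drawing(a=2.2, b=0.6, mean_x=3, mean_y=4, large_x=5)
print(pts, collinear(*pts))
```

If the three points do not line up, the usual culprit is an arithmetic slip in the slope or intercept, since a least squares line always passes through the average point.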

Residuals: It is important to note that the observed values almost never match the predicted values exactly. The difference between the observed value and the predicted value has a special name: residual. Observed value: y. Predicted value: ŷ. Residual: y − ŷ.
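A minimal sketch of the residual calculation (observed minus predicted) for a least squares line fitted to hypothetical data; the function names are made up for illustration.

```python
def fit_line(xs, ys):
    """Least squares intercept a and slope b for y-hat = a + b*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b = ss_xy / ss_x
    return mean_y - b * mean_x, b

def residuals(xs, ys):
    """Residual for each point: observed y minus predicted y-hat."""
    a, b = fit_line(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]

# Hypothetical paired data (not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
res = residuals(xs, ys)
print([round(e, 2) for e in res])
```

Positive residuals mark points above the line and negative residuals mark points below it; for a least squares line the residuals sum to zero up to rounding.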

Residual Plots: You can plot the residuals to see whether there are any trends in the quality of the predictive model. Try looking in the List menu for “RESID”.

Residual Plots: This residual plot shows no pattern. The residuals are scattered about equally throughout. This suggests that the original relationship is linear.

“Pattern” = Not Linear. “Well Distributed” = Linear.

Predictive Ability Mantra!! –As variability decreases, prediction accuracy __________ –If we can account for variance, we can make better predictions. As r increases in strength (toward –1 or +1): –r² increases: “variance accounted for” increases and prediction accuracy increases –prediction error decreases (distance between y’ and y) –Sy decreases: the standard error of the residuals measures the overall amount of prediction error –It can be thought of like this …
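The "overall amount of prediction error" can be sketched as the standard error of the residuals. The version below assumes the common convention of dividing the sum of squared residuals by n − 2; whether the slides' Sy uses exactly this divisor is an assumption, and the data and names are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Least squares intercept a and slope b for y' = a + b*x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b = ss_xy / ss_x
    return mean_y - b * mean_x, b

def standard_error_of_residuals(xs, ys):
    """sqrt(SSE / (n - 2)): typical size of a prediction error."""
    a, b = fit_line(xs, ys)
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sse / (len(xs) - 2))

# Hypothetical paired data (not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
s_e = standard_error_of_residuals(xs, ys)
print(round(s_e, 4))

# A perfectly linear data set leaves no prediction error at all.
s_perfect = standard_error_of_residuals([1, 2, 3, 4], [3, 5, 7, 9])
print(s_perfect)
```

The second data set has r = 1, so its standard error is zero, which is the extreme case of the mantra: stronger correlation, smaller prediction error.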

Thanks – Peace ! We like big r’s and we cannot lie!!! You other brothers can’t deny!!! Check out those residuals son and plot em with your TI-84 on Cause if they don’t look all scattered and patterned then your least squared line is shattered Then I only want that - if your scale and r squared is fat So kick out those nasty outliers When your correlation factor is on BABY GOT STATS!