1 Association  Variables –Response – an outcome variable whose values exhibit variability. –Explanatory – a variable that we use to try to explain the.

Presentation transcript:

1 Association  Variables –Response – an outcome variable whose values exhibit variability. –Explanatory – a variable that we use to try to explain the variability in the response.

2 Association  There is an association between two variables if values of one variable are more likely to occur with certain values of a second variable.

3 Picturing Association  Two Categorical (Qualitative). –Cross-tabs table, mosaic plot.  Two Numerical (Quantitative). –Scatter diagram.

4 Categorical Data  Who? –Students in a statistics class at Penn State University.  What? –“With whom is it easiest to make friends?” Opposite sex, same sex, no difference. –Gender. Male, female.

5 Cross-tabs Table  With whom is it easiest to make friends? Rows: Female, Male, Total. Columns: Same Sex, Opposite Sex, No Diff, Total.

6 Bar Graph With whom is it easiest to make friends?

7 Percentages  With whom is it easiest to make friends? Each cell shows Count and Row %. Rows: Female, Male, Total. Columns: Same Sex, Opposite Sex, No Diff, Total.

8 Mosaic Plot

9 Interpretation  More than 50% of males say no difference, while fewer than 50% of females do.  Females are about twice as likely as males to say opposite sex.  Males are about twice as likely as females to say same sex.
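
Row percentages like those compared above can be computed from a cross-tabs table as in the following minimal sketch. The counts are hypothetical, chosen only to be consistent with the interpretation on the slide; they are not the class survey's actual counts, and the name `row_percentages` is an assumption.

```python
# Hypothetical counts for illustration only (not the actual survey data).
counts = {
    "Female": {"Same Sex": 12, "Opposite Sex": 60, "No Diff": 48},
    "Male": {"Same Sex": 16, "Opposite Sex": 20, "No Diff": 44},
}


def row_percentages(table):
    """Convert each row of counts to percentages of that row's total."""
    result = {}
    for row_label, row in table.items():
        total = sum(row.values())
        result[row_label] = {col: 100.0 * n / total for col, n in row.items()}
    return result


row_pct = row_percentages(counts)
for label, row in row_pct.items():
    cells = ", ".join(f"{col}: {pct:.1f}%" for col, pct in row.items())
    print(f"{label}: {cells}")
```

Comparing percentages within rows, rather than raw counts, is what makes the two groups comparable when they have different totals.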

10 Scatter Plot  Statistics is about … variation.  Recognize, quantify and try to explain variation.  Variation in two quantitative variables is displayed in a scatter plot.

11 Scatter Plot  Numerical variable on the vertical axis, y, is the response variable.  Numerical variable on the horizontal axis, x, is the explanatory variable.

12 Scatter Plot  Example: Body mass (kg) and Bite force (N) for Canidae. –y, Response: Bite force (N) –x, Explanatory: Body mass (kg) –Cases: 28 species of Canidae.

14 Positive Association  Positive Association –Above average values of Bite force are associated with above average values of Body mass. –Below average values of Bite force are associated with below average values of Body mass.

15 Scatter Plot  Example: Outside temperature and amount of natural gas used. –Response: Natural gas used (1000 ft³). –Explanatory: Outside temperature (°C). –Cases: 26 days.

17 Negative Association –Above average values of gas are associated with below average temperatures. –Below average values of gas are associated with above average temperatures.

18 Association  Positive –As x goes up, y tends to go up.  Negative –As x goes up, y tends to go down.

19 Correlation  Linear Association –How closely do the points on the scatter plot represent a straight line? –The correlation coefficient gives the direction of and quantifies the strength of the linear association between two quantitative variables.

20 Correlation  Standardize y: z_y = (y − ȳ)/s_y  Standardize x: z_x = (x − x̄)/s_x

22 Correlation Coefficient
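
For reference, one common way to write the correlation coefficient uses the standardized values of x and y (each value minus its mean, divided by its standard deviation). The notation z_x, z_y, and n for the number of cases is assumed here, not taken from the slide.

```latex
r = \frac{\sum z_x \, z_y}{n - 1}
```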

23 Correlation Coefficient  Body mass and Bite force  r = 0.9807

24 Correlation Coefficient  There is a very strong positive correlation, linear association, between the body mass and bite force for the various species of Canidae.

25 JMP  Analyze – Multivariate methods – Multivariate  Y, Columns – Body mass – BF ca (Bite force at the canine)

27 Correlation Properties  The sign of r indicates the direction of the association.  The value of r is always between –1 and +1.  Correlation has no units.  Correlation is not affected by changes of center or scale.
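
The last two properties can be checked numerically. The sketch below uses hypothetical body mass and bite force values (not the Canidae data) and an assumed helper named `correlation`; it converts kilograms to pounds (a change of scale) and subtracts a constant from the forces (a change of center).

```python
import math


def correlation(xs, ys):
    """Correlation coefficient as the sum of z-score products over n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return sum(
        ((x - mean_x) / s_x) * ((y - mean_y) / s_y) for x, y in zip(xs, ys)
    ) / (n - 1)


# Hypothetical values for illustration only.
mass_kg = [5.0, 8.0, 12.0, 20.0, 30.0]
force_n = [90.0, 120.0, 170.0, 260.0, 350.0]

r_original = correlation(mass_kg, force_n)

mass_lb = [2.20462 * m for m in mass_kg]      # change of scale
force_shifted = [f - 100.0 for f in force_n]  # change of center
r_transformed = correlation(mass_lb, force_shifted)

print(f"r before: {r_original:.6f}, r after: {r_transformed:.6f}")
```

Because standardizing removes both the mean and the units, the two values of r agree up to floating-point rounding.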

28 Algebra Review  The equation of a straight line  y = mx + b – m is the slope – the change in y over the change in x – or rise over run. – b is the y-intercept – the value where the line cuts the y axis.

30 Review  y = 3x + 2 –At x = 0, y = 2 (the y-intercept). –At x = 3, y = 11. –The change in y (+9) divided by the change in x (+3) gives the slope, 3.

31 Linear Regression  Example: Body mass (kg) and Bite force (N) for Canidae. –y, Response: Bite force (N) –x, Explanatory: Body mass (kg) –Cases: 28 species of Canidae.

32 Correlation Coefficient  Body mass and Bite force  r = 0.9807

33 Correlation Coefficient  There is a strong correlation, linear association, between the body mass and bite force for the various species of Canidae.

34 Linear Model  The linear model is the equation of a straight line through the data.  A point on the straight line through the data gives a predicted value of y, denoted ŷ.

35 Residual  The difference between the observed value of y and the predicted value of y, ŷ, is called the residual.  Residual = y − ŷ

37 Line of “Best Fit”  There are lots of straight lines that go through the data.  The line of “best fit” is the line for which the sum of squared residuals is the smallest – the least squares line.

38 Line of “Best Fit”  Some residuals are positive and some are negative, but they sum to zero.  The line passes through the point (x̄, ȳ).
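
These properties of the least squares line can be verified with a short sketch. The data are hypothetical (not the Canidae measurements), and the function names `least_squares` and `sum_squared_residuals` are assumptions made for this example.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_residuals(xs, ys, intercept, slope):
    """Sum of squared differences between observed and predicted y."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical values for illustration only.
mass_kg = [5.0, 8.0, 12.0, 20.0, 30.0]
force_n = [90.0, 120.0, 170.0, 260.0, 350.0]

b0, b1 = least_squares(mass_kg, force_n)
residuals = [y - (b0 + b1 * x) for x, y in zip(mass_kg, force_n)]

mean_x = sum(mass_kg) / len(mass_kg)
mean_y = sum(force_n) / len(force_n)

best = sum_squared_residuals(mass_kg, force_n, b0, b1)
# Any other line, such as this arbitrarily nudged one, has a larger sum.
nudged = sum_squared_residuals(mass_kg, force_n, b0 + 5.0, b1 - 0.5)

print(f"sum of residuals: {sum(residuals):.2e}")
print(f"prediction at mean x: {b0 + b1 * mean_x:.3f}, mean y: {mean_y:.3f}")
print(f"least squares SSE: {best:.2f}, nudged line SSE: {nudged:.2f}")
```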

39 Line of “Best Fit” Least squares slope: intercept:
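
For reference, the least squares estimates are commonly written as follows. The notation is assumed rather than taken from the slide: b_1 for the slope, b_0 for the intercept, r for the correlation coefficient, and s_x, s_y for the standard deviations of x and y.

```latex
b_1 = r \, \frac{s_y}{s_x}, \qquad
b_0 = \bar{y} - b_1 \bar{x}, \qquad
\hat{y} = b_0 + b_1 x
```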

40 Least Squares Estimates  Body mass, x; Bite force, y

41 Least Squares Estimates

42 Interpretation  Slope – for a 1 kg increase in body mass, the bite force increases, on average, N.  Intercept – there is not a reasonable interpretation of the intercept in this context because one wouldn’t see a Canidae with a body mass of 0 kg.

44 Prediction  Least squares line

45 Residual  Body mass, x = 25 kg  Bite force, y = N  Predicted, ŷ = N  Residual, y − ŷ = − = −14.6 N

46 Residuals  Residuals help us see if the linear model makes sense.  Plot residuals versus the explanatory variable. –If the plot is a random scatter of points, then the linear model is the best we can do.

48 Interpretation of the Plot  The residuals are scattered randomly. This indicates that the linear model is an appropriate model for the relationship between body mass and bite force for Canidae.
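
For contrast, the sketch below shows what residuals look like when a straight line is fitted to a relationship that is actually curved. The data are made up (y = x²) and the helper name `least_squares` is an assumption; the point is only that the residuals follow a pattern instead of scattering randomly.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up curved data: y is exactly x squared.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [x ** 2 for x in xs]

intercept, slope = least_squares(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Residuals listed against x: positive at both ends, negative in the middle.
for x, res in zip(xs, residuals):
    print(f"x = {x:.0f}, residual = {res:+.1f}")
```

A U-shaped run of residuals like this one suggests that a straight line is not an appropriate model, even though the residuals still sum to zero.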

49 r² or R²  The square of the correlation coefficient gives the proportion of the variation in y that is accounted for, or explained, by the linear relationship with x.

50 Body mass and Bite force  r = 0.9807  r² = (0.9807)² = 0.9618 or 96.2%  96.2% of the variation in bite force can be explained by the linear relationship with body mass.
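
The "variation explained" reading of r² can be checked directly: for a least squares line, one minus the ratio of residual variation to total variation in y equals the squared correlation. The sketch below uses hypothetical data and an assumed function name, `r_and_r_squared`; only the value r = 0.9807 comes from the slide.

```python
import math


def r_and_r_squared(xs, ys):
    """Return r and 1 - SSE/SST for the least squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)  # total variation in y (SST)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation left over after fitting the line (SSE).
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return r, 1.0 - sse / syy


# Hypothetical values for illustration only.
mass_kg = [5.0, 8.0, 12.0, 20.0, 30.0]
force_n = [90.0, 120.0, 170.0, 260.0, 350.0]

r, explained = r_and_r_squared(mass_kg, force_n)
print(f"r squared: {r ** 2:.6f}, 1 - SSE/SST: {explained:.6f}")

# The arithmetic on the slide, using the reported r = 0.9807.
slide_r_squared = round(0.9807 ** 2, 4)
print(f"0.9807 squared, rounded: {slide_r_squared}")
```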

51 Regression Conditions  Quantitative variables – both variables should be quantitative.  Linear model – does the scatter diagram show a reasonably straight line?  Outliers – watch out for outliers as they can be very influential.

52 Regression Cautions  Beware of extraordinary points.  Don’t extrapolate beyond the data.  Don’t infer x causes y just because there is a good linear model relating the two variables.

53 Extraordinary Points

54 Don’t Extrapolate  Explanatory (x) – Average outdoor temperature (°C).  Response (y) – Amount of natural gas used (1000 cu ft).

55 Don’t Extrapolate

56 Don’t Extrapolate  Explanatory (x = 20) – Average outdoor temperature (°C).  Response (y) – Amount of natural gas used (1000 cu ft).
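
One way to make this caution concrete is to refuse predictions outside the range of observed x values. The sketch below is illustrative only: the intercept, slope, and temperature range are hypothetical numbers, not the fitted values from the natural gas data, and `predict_gas_use` is an assumed name.

```python
def predict_gas_use(temp_c, intercept, slope, temp_min, temp_max):
    """Predict gas use from a fitted line, refusing to extrapolate."""
    if not (temp_min <= temp_c <= temp_max):
        raise ValueError(
            f"{temp_c} °C is outside the observed range "
            f"[{temp_min}, {temp_max}] °C"
        )
    return intercept + slope * temp_c


# Hypothetical fitted line and observed range, for illustration only.
INTERCEPT = 10.0  # predicted gas use (1000 cu ft) at 0 °C
SLOPE = -0.5      # change in gas use per 1 °C increase
TEMP_MIN, TEMP_MAX = -5.0, 15.0

inside = predict_gas_use(10.0, INTERCEPT, SLOPE, TEMP_MIN, TEMP_MAX)
print(f"prediction at 10 °C: {inside}")

# Blindly extending the same line to 25 °C gives a negative amount of gas.
naive_at_25 = INTERCEPT + SLOPE * 25.0
print(f"naive prediction at 25 °C: {naive_at_25}")

try:
    predict_gas_use(20.0, INTERCEPT, SLOPE, TEMP_MIN, TEMP_MAX)
except ValueError as err:
    print(f"refused: {err}")
```

With these made-up numbers the line predicts negative gas use at warm temperatures, which is impossible; the linear pattern need not continue beyond the data.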

57 Correlation ≠ Causation  Don’t confuse correlation with causation. –There is a strong positive correlation between the number of crimes committed in communities and the number of 2nd graders in those communities.  Beware of lurking variables.
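
A small simulation can show how a lurking variable produces such a correlation. In this sketch, community population (an assumed lurking variable) drives both the number of crimes and the number of 2nd graders; all rates and noise levels are invented for illustration and do not describe real communities.

```python
import math
import random


def correlation(xs, ys):
    """Pearson correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


random.seed(1)  # fixed seed so the run is repeatable

# Invented model: both counts depend on population, not on each other.
populations = [random.uniform(5_000, 100_000) for _ in range(200)]
crimes = [0.05 * p + random.gauss(0, 200) for p in populations]
second_graders = [0.01 * p + random.gauss(0, 50) for p in populations]

r_counts = correlation(crimes, second_graders)

# Dividing by population removes the shared dependence on community size.
crime_rate = [c / p for c, p in zip(crimes, populations)]
grader_rate = [g / p for g, p in zip(second_graders, populations)]
r_rates = correlation(crime_rate, grader_rate)

print(f"correlation of counts: {r_counts:.3f}")
print(f"correlation of per-person rates: {r_rates:.3f}")
```

The counts are strongly correlated only because larger communities have more of everything; once community size is accounted for, the association largely disappears.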