Chapter 4 Correlation.

Presentation transcript:

Chapter 4 Correlation

Suppose we found the age and weight of a sample of 10 adults. Create a scatterplot of the data below. Is there any relationship between the age and weight of these adults?
Age  24   30   41   28   50   46   49   35   20   39
Wt   256  124  320  185  158  129  103  196  110  130

Suppose we found the height and weight of a sample of 10 adults. Create a scatterplot of the data below. Is there any relationship between the height and weight of these adults? Is it positive or negative? Weak or strong?
Ht   74   65   77   72   68   60   62   73   61   64
Wt   256  124  320  185  158  129  103  196  110  130

The closer the points in a scatterplot are to a straight line, the stronger the relationship. The farther they are from a straight line, the weaker the relationship.

Identify each pair as having a positive association, a negative association, or no association.
Heights of mothers & heights of their adult daughters: +
Age of a car in years and its current value: -
Weight of a person and calories consumed: +
Height of a person and the person’s birth month: NO
Number of hours spent in safety training and the number of accidents that occur: -

Correlation Coefficient (r): a quantitative assessment of the strength & direction of the linear relationship between bivariate, quantitative data. Use the following sentence to describe r: There is a (direction: positive/negative), (strength: strong/moderate/weak), linear association between x and y.

Correlation Coefficient (r): Pearson’s sample correlation is the one used most. Parameter: ρ (rho). Statistic: r.

Calculate r. Interpret r in context.
Speed Limit (mph)             55  50  45  40  30  20
Avg. # of accidents (weekly)  28  25  21  17  11  6
There is a strong, positive, linear relationship between speed limit and average number of accidents per week.
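The slide asks for r on a calculator; the same calculation can be sketched in code. This is a minimal illustration in plain Python, using the standard Pearson formula. The helper name pearson_r is a hypothetical name introduced here, not something from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson's sample correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

speed_limit = [55, 50, 45, 40, 30, 20]   # mph
accidents = [28, 25, 21, 17, 11, 6]      # average per week

r = pearson_r(speed_limit, accidents)
print(round(r, 3))  # about 0.996: strong, positive, linear
```

A value this close to 1 matches the slide's interpretation of a strong, positive, linear relationship.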

Properties of r (correlation coefficient): legitimate values of r lie in the interval [-1, 1]. [Diagram: number line from -1 to 1 with regions labeled strong, moderate, weak, and no correlation]

The value of r does not depend on the unit of measurement for either variable.
x (in mm)  12  15  21  32  26  19  24
y          4   7   10  14  9   8   12
Find r. Change to cm & find r. The correlations are the same.

The value of r does not depend on which of the two variables is labeled x.
x  12  15  21  32  26  19  24
y  4   7   10  14  9   8   12
Switch x & y & find r. The correlations are the same.
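Both properties (unit independence and symmetry in x and y) can be checked numerically. The sketch below assumes plain Python; pearson_r is a hypothetical helper, and the x values in mm are taken from the unit-of-measurement slide.

```python
import math

def pearson_r(xs, ys):
    """Pearson's sample correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x_mm = [12, 15, 21, 32, 26, 19, 24]
y = [4, 7, 10, 14, 9, 8, 12]
x_cm = [value / 10 for value in x_mm]   # same lengths, different unit

r_mm = pearson_r(x_mm, y)
r_cm = pearson_r(x_cm, y)        # rescaling x leaves r unchanged
r_swapped = pearson_r(y, x_mm)   # swapping the roles of x and y leaves r unchanged

print(r_mm, r_cm, r_swapped)
```

The three values agree up to floating-point rounding, because the scale factor cancels between numerator and denominator and the formula is symmetric in x and y.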

The value of r is non-resistant: outliers affect the correlation coefficient.
x  12  15  21  32  26  19  24
y  4   7   10  14  9   8   22
Find r.
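To see how much a single outlier moves r, the sketch below compares the data from the earlier slides (last y value 12) with this slide's data (last y value 22). It assumes plain Python; pearson_r is a hypothetical helper name.

```python
import math

def pearson_r(xs, ys):
    """Pearson's sample correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [12, 15, 21, 32, 26, 19, 24]
y_original = [4, 7, 10, 14, 9, 8, 12]
y_outlier = [4, 7, 10, 14, 9, 8, 22]   # only the last value changed

r_original = pearson_r(x, y_original)
r_outlier = pearson_r(x, y_outlier)

print(round(r_original, 2), round(r_outlier, 2))  # roughly 0.92 versus 0.63
```

Changing one of seven points is enough to turn a strong correlation into a moderate one, which is what non-resistant means here.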

The value of r is a measure of the extent to which x & y are linearly related. Find the correlation for these points:
x  -3  -1  1  3  5  7   9
y  40  20  8  4  8  20  40
Sketch the scatterplot. r = 0, but there is a definite relationship!
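A short check of this example, as a sketch in plain Python (pearson_r is a hypothetical helper name): the points follow a symmetric curve, so the positive and negative cross-products cancel.

```python
import math

def pearson_r(xs, ys):
    """Pearson's sample correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [-3, -1, 1, 3, 5, 7, 9]
y = [40, 20, 8, 4, 8, 20, 40]   # symmetric about x = 3, clearly curved

r = pearson_r(x, y)
print(r)  # 0.0: no linear association, despite the obvious pattern
```

The result says only that there is no linear relationship; the scatterplot is still needed to see the curved one.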

Association vs. Causation: In a famous example of a correlation study, the following results were obtained.
Year   Number of Methodist Ministers   Cuban Rum Imported to Boston
       in New England                  (in barrels)
1860   63                              8376
1865   48                              6406
1870   53                              7005
1875   64                              8486
1880   72                              9595
1885   80                              10,643
1890   85                              11,265
1895   76                              10,071
1900   80                              10,547
1905   83                              11,008
1910   105                             13,885
1915   140                             18,559
1920   175                             23,024
1925   183                             24,185
1930   192                             25,434
1935   221                             29,238
1940   262                             34,705

Minister data: r = .9999. So does an increase in ministers cause an increase in consumption of rum?
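The correlation for the minister and rum data can be recomputed from the table. This is a sketch in plain Python with a hypothetical helper pearson_r; it only measures association and says nothing about cause.

```python
import math

def pearson_r(xs, ys):
    """Pearson's sample correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Rows for 1860, 1865, ..., 1940 from the table.
ministers = [63, 48, 53, 64, 72, 80, 85, 76, 80, 83,
             105, 140, 175, 183, 192, 221, 262]
rum_barrels = [8376, 6406, 7005, 8486, 9595, 10643, 11265, 10071, 10547,
               11008, 13885, 18559, 23024, 24185, 25434, 29238, 34705]

r = pearson_r(ministers, rum_barrels)
print(r)  # very close to 1
```

A plausible reading is that both series grew along with a third variable such as population over the same years, so a near-perfect r here does not show that one causes the other.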

Correlation does not imply causation

LSRL Section 4.2

Bivariate data: the x-variable is the independent or explanatory variable; the y-variable is the dependent or response variable. Use x to predict y.

ŷ = a + bx
ŷ (y-hat) means the predicted y. Be sure to put the hat on the y.
b is the slope: the approximate amount by which y increases when x increases by 1 unit.
a is the y-intercept: the approximate height of the line when x = 0. In some situations, the y-intercept has no meaning.

Least Squares Regression Line (LSRL): the line that gives the best fit to the data set. It is the line that minimizes the sum of the squares of the deviations from the line.

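Points: (0,0), (3,10), (6,2). Try the line ŷ = .5x + 4 and find the deviation y - ŷ at each point:
(0,0): ŷ = .5(0) + 4 = 4, so 0 - 4 = -4
(3,10): ŷ = .5(3) + 4 = 5.5, so 10 - 5.5 = 4.5
(6,2): ŷ = .5(6) + 4 = 7, so 2 - 7 = -5
Sum of the squares = 61.25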

Use a calculator to find the line of best fit for (0,0), (3,10), (6,2). Find y - ŷ for each point:
(0,0): -3
(3,10): 6
(6,2): -3
Sum of the squares = 54
The line that minimizes the sum of the squares of the deviations from the line is the LSRL. What is the sum of the deviations from the line? Will it always be zero?
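The two sums of squares (61.25 for the trial line, 54 for the line of best fit) can be reproduced with a short sketch. It assumes plain Python and the usual least-squares formulas for slope and intercept; the function names are hypothetical.

```python
points = [(0, 0), (3, 10), (6, 2)]

def residuals(a, b, pts):
    """Deviations y - y_hat from the line y_hat = a + b*x."""
    return [y - (a + b * x) for x, y in pts]

def sum_of_squares(a, b, pts):
    return sum(e ** 2 for e in residuals(a, b, pts))

def lsrl(pts):
    """Least-squares intercept a and slope b for y_hat = a + b*x."""
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

trial_ss = sum_of_squares(4, 0.5, points)       # trial line y_hat = .5x + 4
a_fit, b_fit = lsrl(points)                     # a = 3, b = 1/3 for these points
fit_ss = sum_of_squares(a_fit, b_fit, points)
fit_residual_sum = sum(residuals(a_fit, b_fit, points))

print(trial_ss, fit_ss, fit_residual_sum)
```

For the LSRL the deviations sum to zero (up to rounding), while the trial line's deviations sum to -4.5. A zero sum alone does not identify the LSRL, though, which is why the squares are minimized instead.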

Interpretations Slope: For each unit increase in x, there is an approximate increase/decrease of b in y. Correlation coefficient: There is a direction, strength, linear association between x and y.

The ages (in months) and heights (in inches) of seven children are given.
x  16  24  42  60  75  102  120
y  24  30  35  40  48  56   60
Find the LSRL. Interpret the slope and correlation coefficient in the context of the problem.

Correlation coefficient: There is a strong, positive, linear association between the age and height of children. Slope: For an increase in age of one month, there is an approximate increase of .34 inches in heights of children.
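The slope of .34 and the strong positive correlation can be checked with a short sketch. It assumes plain Python and the standard least-squares formulas; fit_lsrl is a hypothetical helper name.

```python
import math

ages = [16, 24, 42, 60, 75, 102, 120]    # months
heights = [24, 30, 35, 40, 48, 56, 60]   # inches

def fit_lsrl(xs, ys):
    """Return intercept a, slope b, and correlation r for y_hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    b = sxy / sxx
    a = mean_y - b * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return a, b, r

a, b, r = fit_lsrl(ages, heights)
print(round(a, 2), round(b, 3), round(r, 3))  # about 20.4, 0.342, 0.994
```

The slope rounds to the .34 inches per month quoted in the interpretation.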

The ages (in months) and heights (in inches) of seven children are given.
x  16  24  42  60  75  102  120
y  24  30  35  40  48  56   60
Predict the height of a child who is 4.5 years old. Predict the height of someone who is 20 years old.

Extrapolation: The LSRL should not be used to predict y for values of x outside the data set. It is unknown whether the pattern observed in the scatterplot continues outside this range.
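The two predictions show the danger concretely. The sketch below assumes plain Python; fit_lsrl and predict are hypothetical helpers, and ages are converted to months to match the data.

```python
ages = [16, 24, 42, 60, 75, 102, 120]    # months
heights = [24, 30, 35, 40, 48, 56, 60]   # inches

def fit_lsrl(xs, ys):
    """Least-squares intercept a and slope b for y_hat = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    return mean_y - b * mean_x, b

a, b = fit_lsrl(ages, heights)

def predict(age_months):
    """Return the predicted height and whether the age lies inside the data range."""
    inside = min(ages) <= age_months <= max(ages)
    return a + b * age_months, inside

child_height, child_ok = predict(4.5 * 12)   # 54 months: inside 16..120
adult_height, adult_ok = predict(20 * 12)    # 240 months: extrapolation

print(round(child_height, 1), child_ok)   # about 38.9 inches, True
print(round(adult_height, 1), adult_ok)   # over 100 inches, False
```

The prediction for the 20-year-old is more than eight feet, which is what happens when a line fitted to children's growth is extended far past the ages observed.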

The ages (in months) and heights (in inches) of seven children are given. Can the LSRL for these data be used to estimate the age of a child who is 50 inches tall? For these data, the LSRL is the best equation to predict y from x. Calculate: LinReg L2,L1. Do you get the same LSRL? No. To predict x from y, statisticians use the equation from LinReg L2,L1 instead.
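The difference between solving the original LSRL for x and regressing x on y can be sketched as follows. This assumes plain Python; fit_lsrl is a hypothetical helper, and swapping its arguments plays the role of LinReg L2,L1.

```python
ages = [16, 24, 42, 60, 75, 102, 120]    # months (x)
heights = [24, 30, 35, 40, 48, 56, 60]   # inches (y)

def fit_lsrl(xs, ys):
    """Least-squares intercept and slope for predicting ys from xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Height predicted from age (y on x), then solved backwards for age.
a_yx, b_yx = fit_lsrl(ages, heights)
age_by_inverting = (50 - a_yx) / b_yx

# Age predicted from height (x on y): a different least-squares line.
a_xy, b_xy = fit_lsrl(heights, ages)
age_by_regression = a_xy + b_xy * 50

print(round(age_by_inverting, 1), round(age_by_regression, 1))
```

The two estimates differ (about 86.5 versus 86.2 months) because each line minimizes squared deviations in a different direction. They would coincide only if r were exactly 1 or -1.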

The ages (in months) and heights (in inches) of seven children are given.
x  16  24  42  60  75  102  120
y  24  30  35  40  48  56   60
Calculate x̄ & ȳ. Plot the point (x̄, ȳ) on the LSRL. Will this point always be on the LSRL?
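A numerical check of the question, as a sketch in plain Python. To avoid assuming the answer, the slope and intercept are computed here from raw sums (the normal equations) rather than from the means. The variable names are hypothetical.

```python
import math

ages = [16, 24, 42, 60, 75, 102, 120]    # months
heights = [24, 30, 35, 40, 48, 56, 60]   # inches

n = len(ages)
sum_x = sum(ages)
sum_y = sum(heights)
sum_xy = sum(x * y for x, y in zip(ages, heights))
sum_xx = sum(x * x for x in ages)

# Least-squares solution written in terms of raw sums.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
a = (sum_y - b * sum_x) / n

mean_x = sum_x / n
mean_y = sum_y / n
height_at_mean_age = a + b * mean_x

print(mean_x, mean_y, height_at_mean_age)
```

The predicted height at the mean age equals the mean height. This holds for any data set, since the intercept formula rearranges to ȳ = a + b·x̄.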

The correlation coefficient and the LSRL are both non-resistant measures.

Formulas – on chart

The following statistics are found for the variables posted speed limit and the average number of accidents. Find the LSRL & predict the number of accidents for a posted speed limit of 50 mph.
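The transcript does not reproduce the statistics for this slide. As a sketch only, the code below assumes they summarize the speed limit table from the earlier slide, and that the formula chart gives the usual relations b = r(s_y / s_x) and a = ȳ - b·x̄. All variable names are hypothetical.

```python
import math
import statistics

# Assumption: the summary statistics come from the earlier speed limit table.
speed_limit = [55, 50, 45, 40, 30, 20]   # mph
accidents = [28, 25, 21, 17, 11, 6]      # average per week

mean_x = statistics.mean(speed_limit)
mean_y = statistics.mean(accidents)
s_x = statistics.stdev(speed_limit)      # sample standard deviations
s_y = statistics.stdev(accidents)

sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(speed_limit, accidents))
r = sxy / ((len(speed_limit) - 1) * s_x * s_y)

# LSRL from summary statistics only.
b = r * s_y / s_x
a = mean_y - b * mean_x

predicted_at_50 = a + b * 50
print(round(a, 2), round(b, 3), round(predicted_at_50, 1))
```

Under that assumption the line is roughly ŷ = -7.65 + .641x, predicting about 24.4 accidents per week at 50 mph, a speed limit inside the range of the data.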