1
Warm Up Scatter Plot Activity
2
Bivariate Data – Scatter Plots and Correlation Coefficient
3
Objective
Construct scatter plots and find the correlation coefficient using the formula.
4
Relevance
To be able to graphically represent two quantitative variables and analyze the strength of the relationship between them.
5
2 Quantitative Variables……
We represent 2 quantitative variables by using a scatter plot.
Scatter Plot – a plot of ordered pairs (x, y) of bivariate data on a coordinate axis system. It is a visual or pictorial way to describe the nature of the relationship between 2 variables.
6
Input and Output Variables……
X:
a. Input Variable
b. Independent Variable
c. Controlled Variable
Y:
a. Output Variable
b. Dependent Variable
c. Results from the Controlled Variable
7
Example
When dealing with height and weight, which variable would you use as the input variable and why?
Answer: Height would be used as the input variable because weight is often predicted from a person’s height. Normal acceptable weight ranges are based on a person’s height!
8
Constructing a scatter plot
Do a scatter plot of the following data:
Age (independent)   Blood Pressure (dependent)
43                  128
48                  120
56                  135
61                  143
67                  141
70                  152
9
What do we look for?
A. Is it a positive correlation, negative correlation, or no correlation?
B. Is it a strong or weak correlation?
C. What is the shape of the graph?
10
Answer using GDC
Age   Blood Pressure
43    128
48    120
56    135
61    143
67    141
70    152
11
Notice
Notice the following:
A. Strong Positive – as x increases, y also increases.
B. Linear – it is a graph of a line.
12
Example 2 – NO GDC
# of Absences (independent)   Final Grade (dependent)
6                             82
2                             86
15                            43
9                             74
12                            58
5                             90
8                             78
13
Example 2 using GDC
# of Absences (independent)   Final Grade (dependent)
6                             82
2                             86
15                            43
9                             74
12                            58
5                             90
8                             78
14
Notice
Notice the following:
A. Strong Negative – as x increases, y decreases.
B. Linear – it’s the graph of a line.
15
Example 3 – NO GDC
Independent variable: Hrs. of Exercise
Dependent variable: Amt of Milk
Hrs. of Exercise / Amt of Milk: 3 48 8 2 32 5 64 10 56 72 1
16
Example 3 using GDC
Independent variable: Hrs. of Exercise
Dependent variable: Amt of Milk
Hrs. of Exercise / Amt of Milk: 3 48 8 2 32 5 64 10 56 72 1
17
Notice
Notice: There seems to be no correlation between the hours of exercise a person performs and the amount of milk they drink.
18
Steps for Scatter Plot using GDC
Put x’s in L1 and y’s in L2.
Press “2nd y=”.
Set the scatter plot to look like the screen to the right.
Press zoom 9, or set your own window and then press graph.
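The same steps can be mirrored off the calculator. What follows is only a rough, text-mode sketch in Python using the age and blood pressure data from the earlier slides. The helper name `ascii_scatter` and the grid size are illustrative assumptions, and the list names `L1` and `L2` simply echo the calculator lists.

```python
# Text-mode scatter plot sketch (hypothetical helper, not a calculator feature).
L1 = [43, 48, 56, 61, 67, 70]        # x values: age
L2 = [128, 120, 135, 143, 141, 152]  # y values: blood pressure


def ascii_scatter(xs, ys, width=40, height=12):
    """Return text rows with '*' marking each (x, y) pair.

    Assumes xs and ys each contain at least two distinct values.
    """
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column/row index inside the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # row 0 is the top of the plot
    return ["".join(r) for r in grid]


rows = ascii_scatter(L1, L2)
print("\n".join(rows))
```

The window is fitted to the smallest and largest data values, which is roughly what an automatic statistics zoom does. The points should climb from lower left to upper right.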
19
Linear Correlation
20
Correlation
Definition – a statistical method used to determine whether a relationship exists between variables.
3 Types of Correlation:
A. Positive
B. Negative
C. No Correlation
21
Positive Correlation: as x increases, y increases, or as x decreases, y decreases.
Negative Correlation: as x increases, y decreases.
No Correlation: there is no relationship between the variables.
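One way to check direction numerically is to look at the sign of the sum of (x - mean of x)(y - mean of y). The sketch below assumes that approach. The function name `correlation_direction` is made up for illustration, and the sign says nothing about how strong the relationship is.

```python
def correlation_direction(xs, ys):
    """Classify direction from the sign of sum((x - x_mean) * (y - y_mean))."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    total = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "none"


# Data from the scatter plot examples on the earlier slides.
age = [43, 48, 56, 61, 67, 70]
blood_pressure = [128, 120, 135, 143, 141, 152]
absences = [6, 2, 15, 9, 12, 5, 8]
final_grade = [82, 86, 43, 74, 58, 90, 78]

print(correlation_direction(age, blood_pressure))    # positive
print(correlation_direction(absences, final_grade))  # negative
```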
22
Linear Correlation Analysis
Primary Purpose: to measure the strength of the relationship between the variables. *This is a test question!!!!
23
Coefficient of Linear Correlation
The numerical measure of the strength and the direction of the linear relationship between 2 variables. This number is called the correlation coefficient. The symbol used to represent the correlation coefficient is “r.”
24
The range of “r” values
The range of the correlation coefficient is -1 to +1. The closer r is to 0, the weaker the correlation; the closer r is to -1 or +1, the stronger the correlation.
25
Range
-1: Strong Negative
 0: No Linear Relationship
+1: Strong Positive
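The scale can be turned into a small lookup. The cutoffs below (0.8, 0.5, 0.2) are assumptions chosen only so that the labels agree with the examples in this presentation (r = 0.61 moderate, r = 0.897 strong, r = .067 none). The slides do not state cutoffs, and textbooks differ. The function name `describe_r` is also illustrative.

```python
def describe_r(r, strong=0.8, moderate=0.5, weak=0.2):
    """Label r by strength and direction using assumed cutoffs."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    if size < weak:
        return "no linear correlation"
    if size >= strong:
        strength = "strong"
    elif size >= moderate:
        strength = "moderate"
    else:
        strength = "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"


print(describe_r(0.61))
print(describe_r(0.897))
print(describe_r(0.067))
```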
26
Computational Formula using z-scores of x and y
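The formula itself is not reproduced in this text. A form consistent with the calculator lists used in the examples (z-scores built from the sample standard deviation, multiplied pairwise, then summed) is r = sum of (z of x times z of y) divided by (n - 1). The sketch below assumes that form. The function name `r_from_z_scores` is hypothetical, and the data are the age and blood pressure values from the scatter plot slides.

```python
from statistics import mean, stdev


def r_from_z_scores(xs, ys):
    """Correlation coefficient as sum(z_x * z_y) / (n - 1)."""
    n = len(xs)
    x_bar, s_x = mean(xs), stdev(xs)  # stdev is the sample (n - 1) form
    y_bar, s_y = mean(ys), stdev(ys)
    z_x = [(x - x_bar) / s_x for x in xs]
    z_y = [(y - y_bar) / s_y for y in ys]
    return sum(a * b for a, b in zip(z_x, z_y)) / (n - 1)


age = [43, 48, 56, 61, 67, 70]
blood_pressure = [128, 120, 135, 143, 141, 152]
r = r_from_z_scores(age, blood_pressure)
print(round(r, 3))  # 0.897
```

If the population standard deviation were used for the z-scores instead, the divisor would be n rather than n - 1. The two versions give the same r.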
27
Example 1
Find the correlation coefficient (r) of the following example. Use the lists in the calculator.
x   y
2   80
5   80
1   70
4   90
2   60
28
Find mean and standard deviation first
Since you will be using a formula that uses z-scores, you will need to know the mean and standard deviation of the x and y values.
Put x’s in L1.
Put y’s in L2.
Run stat calc one var stats L1 – write down mean & st. dev.
Run stat calc one var stats L2 – write down mean & st. dev.
Better Option: 2nd Stat Math Mean (L?), then store it!
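For comparison, the same two numbers can be found in Python. This sketch assumes the x list is 2, 5, 1, 4, 2, which is the reading of the example's table that reproduces the mean 2.8 and standard deviation 1.643167673 used in the list formula. It also assumes the sample standard deviation is the one intended, since that is the version that matches.

```python
from statistics import mean, stdev

L1 = [2, 5, 1, 4, 2]  # assumed x list (see the note above)

x_mean = mean(L1)   # arithmetic mean
x_sd = stdev(L1)    # sample standard deviation (divides by n - 1)

print(x_mean, round(x_sd, 9))
```

Keeping the unrounded values in variables plays the same role as storing them on the calculator: nothing is rounded until the final answer.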
29
Shown on GDC – Write Down
x values: y values:
30
Calculator Lists
Set Formula:
L3 = (L1-2.8)/1.643167673
L5 = L3 x L4
L1 (x)   L2 (y)   L3 (z of x)   L4 (z of y)   L5 (z of x times z of y)
2        80
5        80       1.3389
1        70       -1.095
4        90       0.7303        1.2279
2        60                     -1.403
31
Calculate “r”
From the lists….. n = 5
32
What does that mean?
Since r = 0.61, the correlation is a moderate positive correlation. Do we want to make predictions from this? It depends on how precise the answer needs to be.
33
Example 2
Find the correlation coefficient (r) for the following data.
Do you remember what we found from the scatter plot?
Age   Blood Pressure
43    128
48    120
56    135
61    143
67    141
70    152
34
Let’s do this one together
Remember to use your lists in the calculator.
Don’t round numbers until your final answer.
Find the mean and st. dev. for x and y.
Explain what you found.
35
X Values: Y Values:
36
List values you should have
n=6
L1 (Age)   L2 (Blood Pressure)   L3 (z of x)   L4 (z of y)   L5 (z of x times z of y)
43         128                   -1.368                      1.0205
48         120                                 -1.448        1.2978
56         135
61         143
67         141
70         152                   1.1796        1.36          1.6042
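To check list values like these without the calculator, the three computed columns can be rebuilt in Python. This is a sketch only: it assumes L3 and L4 are z-scores based on the sample standard deviation and L5 is their product, which is the setup that reproduces the values shown for this example.

```python
from statistics import mean, stdev

L1 = [43, 48, 56, 61, 67, 70]        # age
L2 = [128, 120, 135, 143, 141, 152]  # blood pressure

x_bar, s_x = mean(L1), stdev(L1)
y_bar, s_y = mean(L2), stdev(L2)

L3 = [(x - x_bar) / s_x for x in L1]      # z-scores of x
L4 = [(y - y_bar) / s_y for y in L2]      # z-scores of y
L5 = [zx * zy for zx, zy in zip(L3, L4)]  # z of x times z of y

for row in zip(L1, L2, L3, L4, L5):
    print("{:>3} {:>4} {:>8.4f} {:>8.4f} {:>8.4f}".format(*row))
```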
37
Compute “r”
38
Describe it
Since r = 0.897, there is a strong positive correlation.
39
Example 3……
Find the correlation coefficient for the following data.
Do you remember what we found from the scatter plot?
# of Absences   Final Grade
6               82
2               86
15              43
9               74
12              58
5               90
8               78
40
X Values: Y Values:
41
List Values using GDC
n=7
L1   L2   L3        L4        L5
6    82   -0.4898   0.53626   -0.2626
2    86   -1.404    0.7746    -1.088
15   43   1.5673    -1.788    -2.802
9    74
12   58
5    90             1.0129
8    78
42
Compute “r”
43
Describe it Since r = Strong Negative Correlation
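As a cross-check on this description, r for the absences data can be recomputed from raw sums. The sketch below uses the raw-sums form of the correlation coefficient, which is algebraically equivalent to the z-score form but is not the method shown in this presentation. The function name `r_from_sums` is illustrative.

```python
from math import sqrt

absences = [6, 2, 15, 9, 12, 5, 8]
final_grade = [82, 86, 43, 74, 58, 90, 78]


def r_from_sums(xs, ys):
    """Correlation coefficient from n, sum x, sum y, sum xy, sum x^2, sum y^2."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = r_from_sums(absences, final_grade)
print(round(r, 3))
```

The result is close to -1, which agrees with the strong negative correlation seen in the scatter plot.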
44
Example 4
Find the correlation coefficient of the following data.
Do you remember what we found from the scatter plot?
Hrs of Exercise / Amt of Milk: 3 48 8 2 32 5 64 10 56 72 1
45
x Values: y Values:
46
List Values using GDC
n=9
Columns: Hrs of Exercise, Amt of Milk, L3, L4, L5
3 48 8 -1.206 -1.476 1.7804 2 32 -0.603 5 64 1.0205 10 1.206 -1.387 -1.673 56 1.8091 1.2008 72 1.3771 1
47
Compute “r”
48
Describe It
Since r = .067, no correlation exists.
49
What is r²?
It is the coefficient of determination.
It is the percentage of the total variation in y which can be explained by the relationship between x and y. A way to think of it: The value tells you how much your ability to predict is improved by using the regression line compared with NOT using the regression line.
50
For Example
If r² = 0.89, it means that 89% of the variation in y can be explained by the relationship between x and y. It is a good fit.
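The coefficient of determination is the square of r, so it can be computed directly from any correlation coefficient. The sketch below applies it to the r = 0.897 found for the age and blood pressure data. The function name `coefficient_of_determination` is illustrative.

```python
def coefficient_of_determination(r):
    """Square of the correlation coefficient (fraction of variation explained)."""
    return r * r


r = 0.897  # age vs. blood pressure, from the earlier example
r_squared = coefficient_of_determination(r)
print(f"{r_squared:.1%} of the variation in y is explained")
```

Because r is squared, the sign is lost: r = 0.897 and r = -0.897 explain the same share of the variation.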
51
Assignment Worksheet