M248: Analyzing data Block D UNIT D3 Related variables.


1 M248: Analyzing data Block D UNIT D3 Related variables

2 UNIT D3: Related variables
Block D UNIT D3: Related variables
Contents:
Introduction
Section 1: Correlation
Section 2: Measures of correlation
Terms to know and use
Unit D3 Exercises

3 Introduction
This unit is based on two points:
Correlation between two variables and the strength of this correlation.
Studying contingency tables.

4 Section 1: Correlation Two random variables are said to be related (or correlated or associated) if knowing the value of one variable tells you something about the value of the other. Alternatively, we say there is a relationship between the two variables. The first thing to do when investigating a possible relationship between two variables is to produce a scatterplot of the data.

5 Section 1: Correlation Section 1.1: Are the variables related?
The two variables are said to be positively related if the pattern of the points in the scatterplot slopes upwards from left to right. Example:

6 Section 1: Correlation The two variables are said to be negatively related if the pattern of the points in the scatterplot slopes downwards from left to right. Example:

7 Section 1: Correlation Sometimes a relationship between two variables is more complicated, and the variables cannot be classified as either positively or negatively related. Example: [scatterplot, annotated "Pattern seems to be like"] Read Examples 1.1, 1.2, 1.3 and 1.4. Solve Activity 1.1.

8 Section 1: Correlation Section 1.2: Correlation and Causation
Causation and correlation are not equivalent. Causation means that the value of one variable is caused by the value of the other, while correlation means only that there is a relationship between the two variables; a correlation on its own does not show that one variable causes the other. Check Activity 1.2, page 91.

9 Section 2: Measures of correlation
There are two measures of correlation: the Pearson and the Spearman correlation coefficients. A correlation coefficient is a number between -1 and +1; the closer it is to either of those limits, the stronger the relationship between the two variables. Correlation coefficients that measure how well a straight line can explain the relationship between two variables are called linear correlation coefficients.

10 Section 2: Measures of correlation
Section 2.1: The Pearson correlation coefficient If two variables are positively related, that is, they tend to increase together and decrease together, then the correlation coefficient will be positive. If they are negatively related, it will be negative. A correlation coefficient of 0 implies that there is no systematic linear relationship between the two variables.
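As a small illustration of these three cases, the sketch below computes the Pearson correlation for three made-up data sets (they are not taken from the unit), using the standard sum-of-deviations formula. The function name `pearson_r` is hypothetical. The third data set is clearly related (y is the square of x), yet its coefficient is 0 because the relationship is not linear.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation from sums of products and squares of deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


# Illustrative data only (assumed, not from the unit).
xs = [-3, -2, -1, 0, 1, 2, 3]
ys_up = [2 * x + 1 for x in xs]   # slopes upwards: positively related
ys_down = [-x for x in xs]        # slopes downwards: negatively related
ys_curve = [x ** 2 for x in xs]   # related, but not linearly

r_up = pearson_r(xs, ys_up)
r_down = pearson_r(xs, ys_down)
r_curve = pearson_r(xs, ys_curve)

print(r_up > 0, r_down < 0, abs(r_curve) < 1e-12)
```

This is why a scatterplot should be inspected first: a coefficient near 0 rules out a linear pattern, not every kind of relationship.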

11 Section 2: Measures of correlation
The Pearson correlation coefficient is a measure of how well a straight line can explain the relationship between two variables. It is only appropriate to use this coefficient if the scatterplot shows a roughly linear pattern. Check the example on page 94. The covariance is a number that reflects the degree to which two random variables vary together. The covariance is defined by:
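The formula itself did not survive in this transcript. Assuming the usual sample covariance for data $(x_1, y_1), \ldots, (x_n, y_n)$ with sample means $\bar{x}$ and $\bar{y}$, and assuming the divisor $n - 1$ (the slide may have used a different but equivalent notation), it would read:

```latex
\operatorname{Cov}(x, y) = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
```

For random variables $X$ and $Y$ with means $\mu_X$ and $\mu_Y$, the corresponding population quantity is $E[(X - \mu_X)(Y - \mu_Y)]$. A positive covariance indicates that the variables tend to be above or below their means together; a negative covariance indicates the opposite.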

12 Section 2: Measures of correlation
For data with sums of squares of deviations $S_{xx}$ and $S_{yy}$ and sum of products of deviations $S_{xy}$, the Pearson correlation coefficient r is defined by $r = S_{xy} / \sqrt{S_{xx} S_{yy}}$. The word ‘coefficient’ is sometimes omitted, and r is referred to simply as the Pearson correlation. Note: When there are outliers in the data, it is sometimes useful to omit those points before calculating the correlation coefficient. Read Example 2.1. Solve Activities 2.2, 2.3 and 2.4.
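The calculation from sums of squares of deviations can be sketched in Python as follows. The data values are invented for illustration, the function names are hypothetical, and the covariance check assumes the divisor n - 1.

```python
import math


def deviation_sums(xs, ys):
    """Return (S_xx, S_yy, S_xy): sums of squares and products of deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return s_xx, s_yy, s_xy


def pearson_r(xs, ys):
    """Pearson correlation: S_xy / sqrt(S_xx * S_yy)."""
    s_xx, s_yy, s_xy = deviation_sums(xs, ys)
    return s_xy / math.sqrt(s_xx * s_yy)


# Illustrative data only (assumed, not from the unit).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)  # S_xy = 6, S_xx = 10, S_yy = 6, so r = 6 / sqrt(60)

# The same value via the covariance and the two standard deviations,
# all computed with the divisor n - 1 (an assumption of this sketch).
n = len(xs)
s_xx, s_yy, s_xy = deviation_sums(xs, ys)
cov = s_xy / (n - 1)
sd_x = math.sqrt(s_xx / (n - 1))
sd_y = math.sqrt(s_yy / (n - 1))
r_from_cov = cov / (sd_x * sd_y)

print(round(r, 4), round(r_from_cov, 4))
```

The divisor cancels in the ratio, which is why r can be written either with sums of deviations or with the covariance divided by the product of the standard deviations.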

13 Section 2: Measures of correlation
Section 2.2: The Spearman rank correlation coefficient The Spearman rank correlation coefficient, denoted by rs, is obtained by replacing the original data by their ranks and then calculating the Pearson correlation coefficient of the ranks. The value of rs is a measure of the linearity of the relationship between the ranks. Solve Activity 2.5, page 101.
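The two steps (rank the data, then apply the Pearson formula to the ranks) can be sketched as below. The data are invented, the function names are hypothetical, and giving tied values the average of their ranks is an assumption of this sketch, since the slide does not say how ties are handled.

```python
import math


def average_ranks(values):
    """Rank from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions in the block
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def pearson_r(xs, ys):
    """Pearson correlation from sums of products and squares of deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


def spearman_rs(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Illustrative data only (assumed, not from the unit).
hours = [2, 4, 4, 7, 9]
score = [50, 58, 65, 64, 90]

rank_hours = average_ranks(hours)  # the two 4s share rank 2.5
rs = spearman_rs(hours, score)
print(rank_hours, round(rs, 4))
```

Because only the ranks enter the calculation, rs is unaffected by how far apart the original values are, which makes it less sensitive to outliers than r.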

14 Section 2: Measures of correlation
If two variables have a monotonic increasing relationship, that is, one variable always increases as the other increases (an exact positive relationship that may be curved rather than straight), then the value of rs is equal to +1. Similarly, a data set has a Spearman rank correlation coefficient of -1 if the two variables have a monotonic decreasing relationship.
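A short check of this property, using invented data with curved but monotonic patterns. The sketch uses the shortcut formula rs = 1 - 6 Σd² / (n(n² - 1)), where d is the difference between paired ranks; this formula is not given on the slide and is valid only when there are no tied values. The function names are hypothetical.

```python
def simple_ranks(values):
    """Rank from 1 (smallest); assumes all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def spearman_no_ties(xs, ys):
    """Spearman rs via the rank-difference shortcut (no ties allowed)."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2
                    for rx, ry in zip(simple_ranks(xs), simple_ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Illustrative data only (assumed, not from the unit).
xs = [0.5, 1.0, 2.0, 3.5, 5.0]
ys_up = [x ** 3 for x in xs]    # curved, but always increasing
ys_down = [1 / x for x in xs]   # curved, but always decreasing

rs_up = spearman_no_ties(xs, ys_up)
rs_down = spearman_no_ties(xs, ys_down)
print(rs_up, rs_down)
```

In the increasing case the two rank lists are identical, so every rank difference is 0; in the decreasing case one rank list is the exact reverse of the other.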

15 Section 2: Measures of correlation
Section 2.3: Testing for association The sampling distribution of the Pearson correlation: Under the null hypothesis that there is no association between two variables, the sampling distribution of the Pearson correlation R is such that $R \sqrt{(n-2)/(1-R^2)} \sim t(n-2)$, where n is the sample size. Read Example 2.5. Solve Activity 2.6.
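A minimal sketch of the test statistic for this test, assuming the standard result that, under the null hypothesis of no association (and for roughly bivariate normal data), R√((n - 2)/(1 - R²)) follows a t-distribution with n - 2 degrees of freedom. The values r = 0.6 and n = 18 are invented, and the function name is hypothetical.

```python
import math


def correlation_t_statistic(r, n):
    """Test statistic r * sqrt((n - 2) / (1 - r^2)), referred to t(n - 2)."""
    if n < 3:
        raise ValueError("need at least 3 data pairs")
    if abs(r) >= 1:
        raise ValueError("statistic is undefined when |r| = 1")
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Illustrative values only (assumed, not from the unit).
r, n = 0.6, 18
t = correlation_t_statistic(r, n)  # 0.6 * sqrt(16 / 0.64) = 0.6 * 5

# From standard tables, the 0.975-quantile of t(16) is about 2.120, so a
# two-sided test at the 5% level would reject the null hypothesis here.
critical_value = 2.120
print(round(t, 3), abs(t) > critical_value)
```

The same correlation is stronger evidence in a larger sample: the statistic grows with √(n - 2) for a fixed value of r.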

16 Section 2: Measures of correlation
The approximate sampling distribution of the Spearman correlation: For large samples, under the null hypothesis of no association, the sampling distribution of $R_S$ is such that $R_S \sqrt{(n-2)/(1-R_S^2)} \approx t(n-2)$, where n is the sample size. Solve Activity 2.7. Solve Exercises 2.1 and 2.2.

17 Section 2: Measures of correlation
Section 2.4: Correlation using MINITAB Refer to Chapter 7 of Computer Book D for the work in this subsection.

18 Terms to know and use
Related variable, Correlation, Association, Causation, Positively related, Negatively related, Monotonic relationship, Correlation coefficient, Pearson correlation coefficient, Spearman rank correlation coefficient, Monotonic increasing, Monotonic decreasing

19 Unit D3 Exercises M248 Exercise Booklet
Solve the following exercises: Exercise 66 (page 20), Exercise 67 (page 20), Exercise 68 (page 20).

