Correlations & Regression Modelling



Correlation. Correlation is a statistical technique that shows whether, and how strongly, pairs of variables are related (for example, height and weight). Like all statistical techniques, correlation is only appropriate for certain kinds of data. It belongs to the branch of statistics that looks at the relationship between two data sets.

Correlation. The Pearson correlation coefficient r specifically addresses linear relationships. It ranges from -1 to 1. The closer r is to 1 or -1, the more closely the two variables are linearly related. If r is close to 0, there is little or no linear relationship between the variables (a nonlinear relationship may still exist). If r is positive, then as one variable gets larger, the other tends to get larger too. If r is negative, then as one gets larger, the other tends to get smaller (inverse correlation).

$$ r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} $$
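The range and sign rules above can be checked with a small R sketch. The vectors are made-up illustrative values, not data from the presentation. The last case is included as an assumption-labelled example of a strong but nonlinear relationship for which Pearson's r is still 0.

```r
# Illustrative values only (not from the slides).
x <- -5:5

r_pos <- cor(x, 2 * x + 1)   # perfectly linear, increasing
r_neg <- cor(x, -3 * x + 4)  # perfectly linear, decreasing
r_sq  <- cor(x, x^2)         # y depends on x, but not linearly

print(c(positive = r_pos, negative = r_neg, quadratic = r_sq))
```

The quadratic case is why a value of r near 0 should be read as "no linear relationship" rather than "no relationship at all".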

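Correlation. A correlation report can also show a second result for each test: its statistical significance. The reported significance (p-value) tells you how likely it is that a correlation at least this strong would arise by chance, in the form of random sampling error, if the variables were in fact uncorrelated. The p-value is compared with a chosen significance level α. It should not be confused with r², the squared correlation, which measures the share of variability explained.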
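As a rough sketch of such a report, R's `cor.test()` returns both the coefficient and its p-value. The data are the tree stump and beetle larvae counts used in the example of this presentation. The threshold `alpha = 0.05` is an assumption chosen for illustration and does not come from the slides.

```r
# Data from the tree stump example in this presentation.
x <- c(2, 2, 1, 3, 4, 1, 5, 3, 1, 2)
y <- c(10, 30, 12, 24, 40, 11, 56, 40, 8, 14)

alpha <- 0.05  # assumed significance threshold, for illustration only

res <- cor.test(x, y, method = "pearson")
r_hat   <- unname(res$estimate)  # sample correlation coefficient
p_value <- res$p.value           # two-sided p-value for H0: true correlation is 0

print(r_hat)
print(p_value)
print(p_value < alpha)           # TRUE means "significant at level alpha"
```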

Correlation. A common error is to assume that a correlation means that a change in one variable causes a change in the other. Correlation does not imply causation.

Correlation “Correlation does not imply causation” Source: https://en.wikipedia.org/wiki/Correlation_and_dependence

Example. Correlation between Tree Stumps and Beetle Larvae. Is there a linear relationship between the number of tree stumps left behind by beavers and the number of beetle larvae? Researchers laid out 10 circular plots, each 4 meters in diameter, in an area where beavers were cutting down cottonwood trees. The number of stumps and the number of clusters of beetle larvae were recorded in each plot, with the following results.

Stumps (x)   Beetle Larvae (y)
    2              10
    2              30
    1              12
    3              24
    4              40
    1              11
    5              56
    3              40
    1               8
    2              14

Example. Correlation between Tree Stumps and Beetle Larvae.

Stumps (x)   Beetle Larvae (y)    x²      y²     x*y
    2              10              4     100      20
    2              30              4     900      60
    1              12              1     144      12
    3              24              9     576      72
    4              40             16    1600     160
    1              11              1     121      11
    5              56             25    3136     280
    3              40              9    1600     120
    1               8              1      64       8
    2              14              4     196      28
Total: 24         245             74    8437     771
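The column totals in the table can be reproduced, and turned into r, with a short R sketch. The computational form of the formula used here (built only from the totals) is not shown in the slides. It is algebraically equivalent to the definition of Pearson's r and is used as an assumption about how the table is meant to be applied.

```r
x <- c(2, 2, 1, 3, 4, 1, 5, 3, 1, 2)
y <- c(10, 30, 12, 24, 40, 11, 56, 40, 8, 14)
n <- length(x)

# Same columns as the table: x, y, x^2, y^2, x*y.
tab <- data.frame(x = x, y = y, x2 = x^2, y2 = y^2, xy = x * y)
totals <- colSums(tab)
print(totals)

# Computational form of Pearson's r, using only the column totals.
num <- n * totals[["xy"]] - totals[["x"]] * totals[["y"]]
den <- sqrt((n * totals[["x2"]] - totals[["x"]]^2) *
            (n * totals[["y2"]] - totals[["y"]]^2))
r <- num / den
print(round(r, 2))
```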

Example. Correlation between Tree Stumps and Beetle Larvae.

$$ r = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} $$

r = 0.92 and r² = (0.92)² = 0.84. The variability in the number of tree stumps explains about 84% of the variability in the number of clusters of beetle larvae.
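The "84% of the variability" reading can be tied to regression with a brief sketch. For a simple linear regression with one predictor, the R-squared reported by `lm()` equals the squared Pearson correlation. Fitting a regression is not part of the original slide, so it is added here only as an illustration.

```r
x <- c(2, 2, 1, 3, 4, 1, 5, 3, 1, 2)           # tree stumps
y <- c(10, 30, 12, 24, 40, 11, 56, 40, 8, 14)  # clusters of beetle larvae

r <- cor(x, y)

# Simple linear regression of larvae on stumps (illustrative addition).
fit <- lm(y ~ x)
r_squared <- summary(fit)$r.squared

print(round(c(r = r, r_squared = r_squared), 2))
```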

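Example. Correlation between Tree Stumps and Beetle Larvae. Solution in R:

    # Stumps (x) and clusters of beetle larvae (y) for the 10 plots
    x <- c(2, 2, 1, 3, 4, 1, 5, 3, 1, 2)
    y <- c(10, 30, 12, 24, 40, 11, 56, 40, 8, 14)
    # Pearson correlation coefficient
    a <- cor(x, y, method = "pearson")
    a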

Correlation Interpretation and Covariance Matrix.
1. Calculation of Covariance Matrix
2. Calculation of Covariance
3. Calculation of Correlation

Covariance Matrix

Covariance Matrix Jochen Triesch, UC San Diego, http://cogsci.ucsd.edu/~triesch

Covariance Matrix (example). We want to calculate and interpret the covariance and correlation between the height and the weight of a group of people:

               P1    P2    P3
Height (cm)   180   156   170
Weight (kg)    86    54    70

1. If we treat height and weight as independent samples, we can calculate the variance of each variable separately.

2. If we extend the idea of variance to two dimensions, we obtain a 2x2 matrix that holds all the variance and covariance information about the random vector.

Covariance Matrix.

$$ \Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{21} & \sigma_2^2 \end{pmatrix} $$

The variances are located on the main diagonal of the matrix. The elements off the main diagonal are called covariances.

Covariance Matrix The covariance matrix is the basis for all later considerations concerning accuracy. The covariance matrix is always symmetric.
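The height and weight example can be worked through in R as a sketch. The sample divisor n - 1 is an assumption: the slides do not show the variance formula, but this divisor matches R's `var()` and `cov()` and the division by 9 for n = 10 used in the R example of this presentation.

```r
height <- c(180, 156, 170)  # cm, persons P1..P3
weight <- c(86, 54, 70)     # kg, persons P1..P3

# Step 1: variance of each variable on its own (divisor n - 1).
var_height <- var(height)
var_weight <- var(weight)
print(c(height = var_height, weight = var_weight))

# Step 2: 2x2 covariance matrix, variances on the diagonal,
# covariances off the diagonal.
S <- cov(cbind(height, weight))
print(S)

# The matrix equals its own transpose.
print(isSymmetric(S))
```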

Covariance The covariance measures the linear relationship between two variables.
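For reference, the sample covariance of n paired observations can be written as below. The divisor n - 1 is an assumption, chosen to match the division by 9 for n = 10 in the R example of this presentation.

$$ \operatorname{cov}(x, y) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y}) $$

A positive value means the two variables tend to lie on the same side of their means together; a negative value means they tend to lie on opposite sides.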

Correlation A correlation coefficient measures the degree to which two variables tend to change at the same time. The coefficient describes both the strength and the direction of the relationship.

Correlation The correlation coefficient depends on the covariance. The correlation coefficient is equal to the covariance divided by the product of the standard deviations of the variables. Therefore, a positive covariance will always produce a positive correlation and a negative covariance will always generate a negative correlation.
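A minimal R sketch of this relationship, using the height and weight values from the covariance matrix example. Negating the weights is a made-up manipulation, added only to produce a negative covariance for illustration.

```r
height <- c(180, 156, 170)  # cm
weight <- c(86, 54, 70)     # kg

# Correlation as covariance divided by the product of standard deviations.
r_manual <- cov(height, weight) / (sd(height) * sd(weight))
print(c(manual = r_manual, builtin = cor(height, weight)))

# Negating one variable flips the sign of the covariance,
# and the correlation changes sign with it.
neg_cov <- cov(height, -weight)
neg_r   <- cor(height, -weight)
print(c(covariance = neg_cov, correlation = neg_r))
```

Because standard deviations are never negative, the sign of r is always the sign of the covariance.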

Example. x = [2, 2, 1, 3, 4, 1, 5, 3, 1, 2], y = [10, 30, 12, 24, 40, 11, 56, 40, 8, 14], n = 10.
1. Calculate the covariance matrix
2. Calculate the sample covariance
3. Calculate the correlation coefficient

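R:

    # Row 1: stumps (x), row 2: beetle larvae (y), n = 10
    mdata <- matrix(c(2, 2, 1, 3, 4, 1, 5, 3, 1, 2,
                      10, 30, 12, 24, 40, 11, 56, 40, 8, 14),
                    nrow = 2, ncol = 10, byrow = TRUE)
    c1 <- mdata[1, ]
    c2 <- mdata[2, ]
    A <- c1 - mean(c1)   # centred x
    B <- c2 - mean(c2)   # centred y
    N <- rbind(A, B)     # 2 x 10 matrix of centred data
    # 1. Covariance matrix (2 x 2), divisor n - 1 = 9
    COV <- (N %*% t(N)) / 9
    # 2. Sample covariance, by hand and with cov()
    cov2 <- (t(A) %*% B) / 9
    cov(c1, c2)
    # 3. Correlation coefficient, with cor() and by hand
    cor(c1, c2)
    cor2 <- cov2 / (sqrt(var(c1)) * sqrt(var(c2)))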

Exercise 1. Use the built-in R data set “trees”. We will look at whether the volume, height and girth of trees are correlated. Plot the data first. Does the data appear to show a correlation? Which relationship appears to be the strongest? Manually calculate the Pearson correlation coefficient for one of these relationships. Use R to check the correlation between all pairs of variables.
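One possible starting point for the exercise is sketched below; it is not an official solution. The choice of Girth and Volume for the manual calculation is arbitrary, and the plot is only drawn in an interactive session.

```r
data(trees)  # built-in data set with columns Girth, Height and Volume

# Scatter-plot matrix of all pairs of variables.
if (interactive()) pairs(trees)

# Manual Pearson coefficient for one pair (Girth vs. Volume, arbitrary choice).
g <- trees$Girth
v <- trees$Volume
r_manual <- sum((g - mean(g)) * (v - mean(v))) /
  sqrt(sum((g - mean(g))^2) * sum((v - mean(v))^2))

# Correlation matrix for all three variables, to check the manual value.
r_all <- cor(trees)
print(round(r_all, 3))
print(round(r_manual, 3))
```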

Reference material https://en.wikipedia.org/wiki/Linear_model https://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient