Regression and Correlation


1 Regression and Correlation
GTECH 201 Lecture 18

2 ANOVA: Analysis of Variance
Continuation of the matched-pair difference-of-means tests, but now for three or more samples
We still check whether the samples come from one or from several distinct populations
Variance is a descriptive parameter
ANOVA compares group means and checks whether they differ sufficiently to reject H0

3 ANOVA H0 and HA
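Stated in symbols, the hypotheses of a one-way ANOVA are conventionally written as below. The notation is an assumption here rather than something taken from the slide: k is the number of groups and \mu_i is the population mean of group i.

```latex
H_0:\ \mu_1 = \mu_2 = \dots = \mu_k
\qquad
H_A:\ \text{not all } \mu_i \text{ are equal}
```

Rejecting H_0 says only that at least one group mean differs; it does not say which one.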

4 ANOVA Test Statistic
F = MSB / MSW
MSB = between-group mean squares
MSW = within-group mean squares
Between-group variability is calculated in three steps:
1. Calculate the overall mean as a weighted average of the sample means
2. Calculate the between-group sum of squares
3. Calculate the between-group mean squares (MSB)

5 Between-group Variability
Total or overall mean
Between-group sum of squares
Between-group mean squares
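In their usual textbook form, the three between-group quantities can be written as follows. The symbols are assumptions introduced for illustration: k groups, n_i observations and sample mean \bar{X}_i in group i, and N observations in total.

```latex
\bar{X}_T = \frac{\sum_{i=1}^{k} n_i \bar{X}_i}{N}
\qquad
SS_B = \sum_{i=1}^{k} n_i \left( \bar{X}_i - \bar{X}_T \right)^2
\qquad
MS_B = \frac{SS_B}{k - 1}
```

The divisor k - 1 is the between-group degrees of freedom.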

6 Within-group Variability
Within-group sum of squares
Within-group mean squares
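The between-group and within-group calculations can be sketched together in a few lines of Python. This is a minimal illustration, not the lecture's own worked example: the function name `one_way_anova` and the sample values are hypothetical, and the within-group sum of squares is computed directly from the raw observations.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (MSB, MSW, F) for a list of samples, each a list of numbers."""
    k = len(groups)                          # number of groups
    n_total = sum(len(g) for g in groups)    # total number of observations
    group_means = [mean(g) for g in groups]

    # Overall mean as the weighted average of the sample means
    overall_mean = sum(len(g) * m for g, m in zip(groups, group_means)) / n_total

    # Between-group sum of squares and mean squares (k - 1 degrees of freedom)
    ssb = sum(len(g) * (m - overall_mean) ** 2 for g, m in zip(groups, group_means))
    msb = ssb / (k - 1)

    # Within-group sum of squares and mean squares (N - k degrees of freedom)
    ssw = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    msw = ssw / (n_total - k)

    return msb, msw, msb / msw

# Made-up data for three groups, purely for illustration
msb, msw, f = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(msb, msw, f)
```

The resulting F value would then be compared with the critical F for k - 1 and N - k degrees of freedom at the chosen significance level.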

7 Kruskal-Wallis Test
Nonparametric equivalent of ANOVA
Extension of the Wilcoxon rank sum W test to three or more samples
Average rank of sample i is $R_i / n_i$
Then the Kruskal-Wallis H test statistic is
$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)$
with $N = n_1 + n_2 + \dots + n_k$ = total number of observations, and $R_i$ = sum of ranks in sample i
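A rough Python sketch of the H statistic follows. It assumes the standard formula H = 12 / (N(N+1)) * sum(R_i^2 / n_i) - 3(N+1), ranks the pooled observations with tied values sharing their average rank, and applies no tie correction. The function names and the sample data are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to N; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1            # average 1-based rank of positions i..j
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H for a list of samples (no tie correction)."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total = 0.0
    start = 0
    for g in groups:
        r_i = sum(ranks[start:start + len(g)])   # sum of ranks in sample i
        total += r_i ** 2 / len(g)
        start += len(g)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

# Made-up data, purely for illustration
print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
```

For samples that are not very small, H is usually compared with a chi-square distribution with k - 1 degrees of freedom.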

8 ANOVA Example
House prices by neighborhood, in thousands of dollars
Neighborhoods: A, B, C, D (of the data table, only the value 196 survives)

9 ANOVA Example, continued
Sample statistics table: columns n, X̄ (mean), s (standard deviation); rows A, B, C, D, Total
Now fill in the six steps of the ANOVA calculation

10 The Six Steps

11 Correlation
Co-relatedness between two or more variables
As the values of one variable go up, those of the other change proportionally
Two-step approach:
Graphically: scatterplot
Numerically: correlation coefficients

12 Is There a Correlation?

13 Scatterplots
Exploratory analysis

14 Pearson’s Correlation Index
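Based on the concept of covariance
$S_{xy} = \sum xy$ = covariation between X and Y
$x = X - \bar{X}$ = deviation of X from its mean
$y = Y - \bar{Y}$ = deviation of Y from its mean
Pearson's correlation coefficient: $r = \frac{\sum xy}{\sqrt{\sum x^2 \sum y^2}}$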
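As a small illustration, Pearson's r can be computed from the deviations of each variable from its mean: the covariation divided by the square root of the product of the two variations. The function name `pearson_r` and the sample values are made up for this sketch.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]            # deviations of X from its mean
    dy = [y - mean_y for y in ys]            # deviations of Y from its mean
    s_xy = sum(a * b for a, b in zip(dx, dy))  # covariation between X and Y
    s_xx = sum(a * a for a in dx)              # variation of X
    s_yy = sum(b * b for b in dy)              # variation of Y
    return s_xy / sqrt(s_xx * s_yy)

# Made-up data, purely for illustration
print(pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]))
```

A perfectly linear increasing relationship gives r = 1, a perfectly linear decreasing one gives r = -1.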

15 Sample and Population
r is the sample correlation coefficient
Applying the t distribution, we can infer the correlation for the whole population
Test statistic for Pearson's r
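The commonly used test statistic for Pearson's r is t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom, under the null hypothesis that the population correlation is zero. The sketch below assumes that form; the function name and the input values are hypothetical.

```python
from math import sqrt

def t_statistic_for_r(r, n):
    """t statistic for testing whether a sample correlation r (n pairs) differs from 0."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

# Illustrative values: r = 0.6 observed on n = 18 pairs
print(t_statistic_for_r(0.6, 18))
```

The result is compared with the critical t value for n - 2 degrees of freedom.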

16 Correlation Example Lake effect snow

17 Spearman’s Rank Correlation
Non-parametric alternative to Pearson
Logic similar to Kruskal and Wilcoxon
Spearman’s rank correlation coefficient
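A minimal sketch of Spearman's coefficient, assuming the familiar shortcut formula r_s = 1 - 6 * sum(d^2) / (n(n^2 - 1)), where d is the difference between the two ranks of each observation. The shortcut holds only when there are no tied values, which this example assumes; the function names and data are hypothetical.

```python
def simple_ranks(values):
    """Ranks 1..n for a list with no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rs(xs, ys):
    """Spearman's rank correlation via the rank-difference formula (no ties)."""
    n = len(xs)
    rank_x = simple_ranks(xs)
    rank_y = simple_ranks(ys)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Made-up data, purely for illustration
print(spearman_rs([10, 20, 30, 40, 50], [1, 3, 2, 5, 4]))
```

With ties present, one option is to assign average ranks and apply Pearson's formula to the ranks instead.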

18 Regression
In correlation we observe degrees of association but no causal or functional relationship
In regression analysis, we distinguish an independent from a dependent variable
Many forms of functional relationships:
bivariate
linear
multivariate
non-linear (curvilinear)

19 Graphical Representation
In correlation analysis, either variable could be depicted on either axis
In regression analysis, the independent variable is always on the X axis
The bivariate relationship is described by a best-fitting line through the scatterplot

20 Least-Square Regression
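Objective: minimize the sum of squared vertical deviations of the observed Y values from the regression line, $\sum (Y - \hat{Y})^2$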

21 Regression Equation
Y = a + bX
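For the bivariate linear case, the least-squares slope and intercept have a closed form: b is the covariation of X and Y divided by the variation of X, and a is chosen so that the line passes through the point of means. The sketch below assumes that standard result; the function name and the data are made up for illustration.

```python
def least_squares_fit(xs, ys):
    """Return (a, b) of the least-squares line Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))  # covariation
    s_xx = sum((x - mean_x) ** 2 for x in xs)                        # variation of X
    b = s_xy / s_xx              # slope
    a = mean_y - b * mean_x      # intercept: line passes through (mean_x, mean_y)
    return a, b

# Made-up data, purely for illustration
a, b = least_squares_fit([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(a, b)
```

A prediction for a new X value is then simply a + b * X.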

22 Strength of Relationship
How much is explained by the regression equation?

23 Coefficient of Determination
Total variation of Y (all the bucket water): $\sum y^2 = \sum y_e^2 + \sum y_u^2$
Large ‘Y’ = dependent variable
Small ‘y’ = deviation of each value of Y from its mean
e = explained; u = unexplained

24 Explained Variation
Ratio of the square of the covariation between X and Y to the variation in X:
$\sum y_e^2 = \frac{S_{xy}^2}{S_x^2}$
where $S_{xy}$ = covariation between X and Y and $S_x^2$ = total variation of X
Coefficient of determination: $r^2 = \frac{\sum y_e^2}{\sum y^2}$
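The ratio described on this slide can be checked numerically. The sketch below computes the explained variation as the squared covariation divided by the variation of X, then divides it by the total variation of Y to obtain the coefficient of determination. The function name and the data are hypothetical.

```python
def coefficient_of_determination(xs, ys):
    """Return (explained variation, total variation, r squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))  # covariation
    s_xx = sum((x - mean_x) ** 2 for x in xs)                        # variation of X
    total = sum((y - mean_y) ** 2 for y in ys)                       # total variation of Y
    explained = s_xy ** 2 / s_xx                                     # explained variation
    return explained, total, explained / total

# Made-up data, purely for illustration
explained, total, r2 = coefficient_of_determination([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(explained, total, r2)
```

The unexplained variation is the difference between the total and the explained variation.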

25 Error Analysis
r² tells us what percentage of the variation is accounted for by the independent variable
This then allows us to infer the standard error of our estimate, which tells us, on average, how far off our prediction would be in measurement units
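One common definition of the standard error of the estimate is the square root of the unexplained variation divided by n - 2. The divisor is an assumption here, since the slide's own formula is not preserved and some texts divide by n instead. The function name and the data are hypothetical.

```python
from math import sqrt

def standard_error_of_estimate(xs, ys):
    """Standard error of the estimate for the least-squares line, using n - 2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx                  # least-squares slope
    a = mean_y - b * mean_x          # least-squares intercept
    # Unexplained variation: squared residuals about the fitted line
    unexplained = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return sqrt(unexplained / (n - 2))

# Made-up data, purely for illustration
print(standard_error_of_estimate([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]))
```

The result is expressed in the units of Y, which is what makes it readable as a typical prediction error.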

