Regression & Correlation. Review: Types of Variables & Steps in Analysis.


1 Regression & Correlation

2 Review: Types of Variables & Steps in Analysis

3 Types of Variables
  Nominal / Categorical: each value is a distinct category [gender, blood type, city]
  Scale / Interval: linear measure, same interval between each value [age, weight, IQ, GPA, SAT, income]
  Ordinal: ranking, unequal intervals between values [Likert scale, preference ranking]

4 Variables & Statistical Tests

  Variable Type         Example                 Common Stat Method
  Nominal by nominal    Blood type by gender    Chi-square
  Scale by nominal      GPA by gender           T-test
                        GPA by major            Analysis of Variance
  Scale by scale        Weight by height        Regression
                        GPA by SAT              Correlation

5 Evaluating a hypothesis
  Step 1: What is the relationship in the sample?
  Step 2: How confidently can one generalize from the sample to the universe from which it comes?

6 Evaluating a hypothesis

                                Relationship in Sample            Statistical Significance
  2 ordinal vars.               Cross-tab / contingency table     “p value” from Chi Square
  Scale dep. & 2-cat indep.     Means for each category           “p value” from t-test
  Scale dep. & 3+ cat indep.    Means for each category           “p value” from ANOVA
  2 scale vars.                 Regression line, Correlation      “p value” from reg or correlation
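The two tables above (slides 4 and 6) amount to a lookup from variable types to a method. A minimal sketch of that lookup follows; the function name `common_method` and its `categories` parameter are hypothetical, introduced only for illustration, and the mapping covers only the combinations the slides list.

```python
def common_method(dependent: str, independent: str, categories: int = 2) -> str:
    """Look up the method the slides pair with two variable types.

    `categories` is the number of categories of a nominal independent
    variable; it only matters for the scale-by-nominal case.
    """
    pair = (dependent, independent)
    if pair == ("nominal", "nominal"):
        return "Chi-square"
    if pair == ("scale", "nominal"):
        # Two groups: compare two means. Three or more: compare several means.
        return "T-test" if categories == 2 else "Analysis of Variance"
    if pair == ("scale", "scale"):
        return "Regression / Correlation"
    raise ValueError(f"no method listed on the slides for {dependent} by {independent}")


# Examples taken from slide 4.
print(common_method("nominal", "nominal"))             # blood type by gender
print(common_method("scale", "nominal", categories=2)) # GPA by gender
print(common_method("scale", "nominal", categories=5)) # GPA by major (5 is an assumed count)
print(common_method("scale", "scale"))                 # weight by height
```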

7 Relationships between Scale Variables Regression Correlation

8 Regression
  Amount that a dependent variable increases (or decreases) for each unit increase in an independent variable.
  Expressed as the equation for a line, y = mx + b: the “regression line”
  Interpret by the slope of the line: m
  (Or: interpret by “odds ratio” in “logistic regression”)
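To make the slope m and intercept b concrete, here is a minimal sketch of a least-squares fit for one independent variable. The data points are made up for illustration and are not from the class survey; `least_squares_line` is a hypothetical helper name.

```python
def least_squares_line(xs, ys):
    """Return (m, b) for the least-squares line y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    b = mean_y - m * mean_x
    return m, b


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
m, b = least_squares_line(xs, ys)
print(f"y = {m:.1f}x + {b:.1f}")
```

Here m is read as "y rises by m for each one-unit increase in x", which is the interpretation the slide describes.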

9 Correlation
  Strength of association of scale measures
  r ranges from -1 through 0 to +1
    +1 perfect positive correlation
    -1 perfect negative correlation
     0 no correlation
  Interpret r in terms of variance
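The range of r can be illustrated with a small sketch of the Pearson correlation coefficient. The data sets below are invented so that each lands on one of the cases the slide names; `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
r_positive = pearson_r(xs, [2, 4, 6, 8, 10])   # points on a rising line
r_negative = pearson_r(xs, [10, 8, 6, 4, 2])   # points on a falling line
r_none = pearson_r(xs, [3, 1, 2, 1, 3])        # no linear trend
r_partial = pearson_r(xs, [2, 4, 5, 4, 5])     # rising, but with scatter

for label, r in [("positive", r_positive), ("negative", r_negative),
                 ("none", r_none), ("partial", r_partial)]:
    print(f"{label:9s} r = {r:+.2f}  r^2 = {r * r:.2f}")
```

Squaring r gives the share of variance "explained", which is how the later slides interpret it.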

10 Mean & Variance

11 Survey of Class n = 42 Height Mother’s height Mother’s education SAT Estimate IQ Well-being (7 pt. Likert) Weight Father’s education Family income G.P.A. Health (7 pt. Likert)

12 Frequency Table for: HEIGHT

  Value    Frequency    Percent    Valid Percent    Cum Percent
  59.00        1          2.4          2.4              2.4
  61.00        2          4.8          4.8              7.1
  62.00        3          7.1          7.1             14.3
  63.00        3          7.1          7.1             21.4
  65.00        5         11.9         11.9             33.3
  66.00        3          7.1          7.1             40.5
  67.00        4          9.5          9.5             50.0
  68.00        5         11.9         11.9             61.9
  69.00        1          2.4          2.4             64.3
  70.00        6         14.3         14.3             78.6
  71.00        1          2.4          2.4             81.0
  72.00        4          9.5          9.5             90.5
  73.00        3          7.1          7.1             97.6
  74.00        1          2.4          2.4            100.0
             -------    -------      -------
  Total       42        100.0        100.0

  Valid cases 42    Missing cases 0

13 Frequency Table for: HEIGHT (same table as slide 12), followed by Descriptive Statistics for: HEIGHT

  Variable    Mean     Std Dev    Variance    Range    Minimum    Maximum    Valid N
  HEIGHT      67.33    3.87       14.96       15.00    59.00      74.00      42

  [callout: mean]

14 Variance
  Variance = $s^2 = \dfrac{\sum (x_i - \text{Mean})^2}{N}$
  Standard Deviation = $s = \sqrt{\text{variance}}$
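As a worked check, the mean and variance can be recomputed from the HEIGHT frequency table. This is a sketch using only the values and counts shown on the slides; the variable names are my own. One thing it brings out: the printed variance of 14.96 appears to correspond to dividing by N - 1 (the usual sample variance), while dividing by N as in the formula above gives about 14.60.

```python
import math

# Value -> frequency, copied from the HEIGHT frequency table.
height_counts = {
    59: 1, 61: 2, 62: 3, 63: 3, 65: 5, 66: 3, 67: 4,
    68: 5, 69: 1, 70: 6, 71: 1, 72: 4, 73: 3, 74: 1,
}

n = sum(height_counts.values())
mean = sum(value * count for value, count in height_counts.items()) / n

# Sum of squared distances from the mean, weighted by frequency.
sum_sq = sum(count * (value - mean) ** 2 for value, count in height_counts.items())

variance_n = sum_sq / n               # divisor N, as in the slide's formula
variance_n_minus_1 = sum_sq / (n - 1) # divisor N - 1, the sample variance
std_dev = math.sqrt(variance_n_minus_1)

print(f"N = {n}, mean = {mean:.2f}")
print(f"variance (divide by N)     = {variance_n:.2f}")
print(f"variance (divide by N - 1) = {variance_n_minus_1:.2f}")
print(f"std dev  (from N - 1)      = {std_dev:.2f}")
```

With 42 cases the two divisors differ only slightly, so either reading supports the slides' argument.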

15 Frequency Table for: WEIGHT

  Value     Frequency    Percent    Valid Percent    Cum Percent
  115.00        1          2.4          2.4              2.4
  120.00        1          2.4          2.4              4.8
  124.00        1          2.4          2.4              7.1
  125.00        4          9.5          9.5             16.7
  128.00        1          2.4          2.4             19.0
  130.00        6         14.3         14.3             33.3
  135.00        4          9.5          9.5             42.9
  136.00        1          2.4          2.4             45.2
  140.00        3          7.1          7.1             52.4
  145.00        2          4.8          4.8             57.1
  150.00        3          7.1          7.1             64.3
  155.00        2          4.8          4.8             69.0
  160.00        6         14.3         14.3             83.3
  165.00        2          4.8          4.8             88.1
  170.00        1          2.4          2.4             90.5
  185.00        1          2.4          2.4             92.9
  190.00        2          4.8          4.8             97.6
  210.00        1          2.4          2.4            100.0
              -------    -------      -------
  Total        42        100.0        100.0

  Valid cases 42    Missing cases 0

  Descriptive Statistics for: WEIGHT

  Variable    Mean      Std Dev    Variance    Range    Minimum    Maximum    Valid N
  WEIGHT      146.38    21.30      453.80      95.00    115.00     210.00     42

  [callout: mean]

16 Relationship of weight & height: Regression Analysis

17

18 “Least Squares” Regression Line
  Dependent = ( B ) ( Independent ) + constant
  weight = ( B ) ( height ) + constant

19 Regression line

20 Regression: WEIGHT on HEIGHT

  Multiple R             .59254
  R Square               .35110
  Adjusted R Square      .33488
  Standard Error       17.37332

  Analysis of Variance
                 DF    Sum of Squares    Mean Square
  Regression      1        6532.61322     6532.61322
  Residual       40       12073.29154      301.83229

  F = 21.64319    Signif F = .0000

  ------------------ Variables in the Equation ------------------
  Variable               B         SE B        Beta         T    Sig T
  HEIGHT          3.263587      .701511     .592541     4.652    .0000
  (Constant)    -73.367236    47.311093                -1.551

  [ Equation: Weight = 3.3 ( height ) - 73 ]
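Several figures in this printout can be derived from the others, which helps in reading it. The sketch below recomputes R Square, Adjusted R Square, Standard Error, F and T from the sums of squares, degrees of freedom and coefficient shown above, using standard textbook relationships; the variable names are my own and n = 42 is taken from the survey slide.

```python
import math

# Figures copied from the printout.
ss_regression = 6532.61322
ss_residual = 12073.29154
df_regression = 1
df_residual = 40
b_height = 3.263587
se_b_height = 0.701511
n = 42  # class survey size

ss_total = ss_regression + ss_residual

# R Square: share of the total sum of squares accounted for by the line.
r_square = ss_regression / ss_total
multiple_r = math.sqrt(r_square)

# Adjusted R Square with one independent variable.
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / (n - 1 - 1)

# Mean squares, F ratio and the standard error of the estimate.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_ratio = ms_regression / ms_residual
standard_error = math.sqrt(ms_residual)

# T for the slope is the coefficient divided by its standard error.
t_height = b_height / se_b_height

print(f"Multiple R        = {multiple_r:.5f}")
print(f"R Square          = {r_square:.5f}")
print(f"Adjusted R Square = {adjusted_r_square:.5f}")
print(f"Standard Error    = {standard_error:.5f}")
print(f"F                 = {f_ratio:.5f}")
print(f"T (HEIGHT)        = {t_height:.3f}")
```

With a single independent variable, F equals T squared, so the two significance columns carry the same information.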

21 Regression line W = 3.3 H - 73
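The regression line can be used to predict weight from height. The following sketch compares the rounded equation on the slide with the unrounded coefficients from the regression printout; `predict_weight` is a hypothetical helper, and the height plugged in is the sample mean reported earlier (67.33).

```python
# Unrounded coefficients from the regression printout.
B_HEIGHT = 3.263587
CONSTANT = -73.367236


def predict_weight(height, slope=B_HEIGHT, constant=CONSTANT):
    """Predicted weight for a given height: weight = slope * height + constant."""
    return slope * height + constant


mean_height = 67.33

# A least-squares line passes through the point of means, so the unrounded
# coefficients should return roughly the mean weight (146.38).
exact = predict_weight(mean_height)

# The slide's rounded equation, W = 3.3 H - 73, drifts a few units.
rounded = predict_weight(mean_height, slope=3.3, constant=-73)

print(f"unrounded coefficients: {exact:.2f}")
print(f"rounded coefficients:   {rounded:.2f}")
```

The rounded form is fine for describing the relationship, but predictions are better made with the full coefficients.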

22 Strength of Relationship “Goodness of Fit”: Correlation How well does the regression line “fit” the data?

23

24 Frequency Table for: WEIGHT (repeated from slide 15)

  Descriptive Statistics for: WEIGHT

  Variable    Mean      Std Dev    Variance    Range    Minimum    Maximum    Valid N
  WEIGHT      146.38    21.30      453.80      95.00    115.00     210.00     42

  [callout: mean]

25 Variance = 454

26 [Plot labels: Regression line; mean]

27 Correlation: “Goodness of Fit”
  Variance (average of squared distances from the mean) = 454
  “Least squares” (average of squared distances from the regression line) = 295
  454 - 295 = 159
  159 / 454 = .35
  Variance is reduced 35% by calculating from the regression line
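The arithmetic on this slide is a proportional reduction in variance. A minimal sketch, using only the two rounded figures the slide gives (the variable names are my own):

```python
# Rounded figures from the slide.
variance_about_mean = 454   # average squared distance from the mean weight
variance_about_line = 295   # average squared distance from the regression line

# How much smaller the spread is when measured from the line instead of the mean.
reduction = variance_about_mean - variance_about_line
proportion_reduced = reduction / variance_about_mean

print(f"reduction  = {reduction}")
print(f"proportion = {proportion_reduced:.2f}")
```

The result agrees, to two decimals, with the R Square of .35110 reported in the regression printout.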

28 $r^2$ = % of variance in WEIGHT “explained” by HEIGHT
  Correlation coefficient = r

29 Correlation: HEIGHT with WEIGHT

              HEIGHT      WEIGHT
  HEIGHT      1.0000       .5925
              (  42)      (  42)
              P= .        P= .000

  WEIGHT       .5925      1.0000
              (  42)      (  42)
              P= .000     P= .

30 r = .59   $r^2$ = .35
  HEIGHT “explains” 35% of variance in WEIGHT
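The link between r, the variance explained, and the significance test can be sketched in a few lines. The code below squares the correlation from the HEIGHT with WEIGHT table and converts it to a t statistic with the standard formula t = r * sqrt(n - 2) / sqrt(1 - r^2); the variable names are my own.

```python
import math

r = 0.5925  # correlation of HEIGHT with WEIGHT, from the correlation table
n = 42      # number of cases

# Share of variance in WEIGHT "explained" by HEIGHT.
r_squared = r ** 2

# t statistic for testing whether the correlation differs from zero,
# with n - 2 degrees of freedom.
t_statistic = r * math.sqrt(n - 2) / math.sqrt(1 - r_squared)

print(f"r^2 = {r_squared:.2f}")
print(f"t   = {t_statistic:.2f} on {n - 2} degrees of freedom")
```

The t value comes out essentially equal to the T of 4.652 reported for HEIGHT in the regression printout, which is why the regression and the correlation share the same p value.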

31 Sentence & G.P.A.
  Regression: form of relationship
  Correlation: strength of relationship
  p value: statistical significance

32 G. P. A.

33 Length of Sentence

34 Scatterplot: Sentence on G.P.A.

35 Regression Coefficients Sentence = -3.5 G.P.A. + 18

36 Sent = -3.5 GPA + 18 “Least Squares” Regression Line

37 Correlation: Sentence & G.P.A.

38 Interpreting Correlations
  r = -.22   p = .31   $r^2$ = .05
  G.P.A. “explains” 5% of the variance in length of sentence
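The three readings from the Sentence and G.P.A. example (direction, strength, significance) can be pulled together in a short sketch. The helper `interpret_correlation` is hypothetical, and the .05 significance cutoff is a conventional assumption; the slides do not state a threshold.

```python
def interpret_correlation(r, p, alpha=0.05):
    """Summarize a correlation: direction, variance explained, significance.

    `alpha` is an assumed conventional cutoff, not a value from the slides.
    """
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return {
        "direction": direction,
        "variance_explained": r ** 2,
        "significant": p < alpha,
    }


# Values from the slide: r = -.22, p = .31.
summary = interpret_correlation(r=-0.22, p=0.31)
print(f"direction:          {summary['direction']}")
print(f"variance explained: {summary['variance_explained']:.2f}")
print(f"significant at .05: {summary['significant']}")
```

Under that assumed cutoff, the relationship in the sample is negative and weak, and the p value does not support generalizing it beyond the sample.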

