Practice You collect data from 53 females and find the correlation between candy and depression is Determine if this value is significantly different than zero. You collect data from 53 males and find the correlation between candy and depression is Determine if this value is significantly different than zero.
Practice You collect data from 53 females and find the correlation between candy and depression is –t obs = 3.12 –t crit = 2.00 You collect data from 53 males and find the correlation between candy and depression is –t obs = 4.12 –t crit = 2.00
Practice You collect data from 53 females and find the correlation between candy and depression is You collect data from 53 males and find the correlation between candy and depression is Is the effect of candy significantly different for males and females?
Hypothesis
–H1: the two correlations are different
–H0: the two correlations are not different
Testing Differences Between Correlations
–The two correlations must be independent (computed on separate samples) for this test to work
When the population value of r is not zero, the distribution of r values gets skewed
–Easy to fix! Use Fisher’s r′ transformation
–Page 746
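As a rough illustration of the transformation named above, the sketch below applies the standard formula r′ = 0.5 ln((1 + r) / (1 − r)). The function name `fisher_r_to_z` is made up for this example; the two r values are the ones that appear in the practice answer later in these slides.

```python
import math


def fisher_r_to_z(r: float) -> float:
    """Fisher's transformation: r' = 0.5 * ln((1 + r) / (1 - r))."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


# r = .68 and r = .51 come from the practice answer in the slides.
for r in (0.0, 0.51, 0.68):
    print(f"r = {r:.2f} -> r' = {fisher_r_to_z(r):.3f}")
```

The transformed values are approximately normally distributed, which is what makes a Z test on their difference reasonable.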
Testing Differences Between Correlations
–z = (r′1 - r′2) / √(1/(N1 - 3) + 1/(N2 - 3))
Note: what would the z value be if there were no difference between these two values (i.e., H0 was true)?
Testing Differences
Z =
–What is the probability of obtaining a Z score of this size or greater if the difference between these two r values were zero?
–p = .267
–If p < .025, reject H0 and accept H1
–If p ≥ .025, fail to reject H0
–The two correlations are not significantly different from each other!
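A minimal sketch of this test, assuming the usual Z statistic for two independent correlations (difference of Fisher-transformed values divided by √(1/(N1 − 3) + 1/(N2 − 3))). The function names are invented for the example, and the correlations .40 and .50 are illustrative assumptions, since the slide's own r values did not survive; only N = 53 per group comes from the slides.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative probability, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def compare_independent_correlations(r1, n1, r2, n2):
    """Return (z, one-tailed p) for the difference between two independent r values."""
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (math.atanh(r1) - math.atanh(r2)) / standard_error  # atanh is Fisher's r'
    p_one_tailed = 1.0 - normal_cdf(abs(z))
    return z, p_one_tailed


# Assumed example values: r = .40 and r = .50 are hypothetical; N = 53 per group.
z, p = compare_independent_correlations(0.40, 53, 0.50, 53)
decision = "reject H0" if p < 0.025 else "fail to reject H0"
print(f"z = {z:.3f}, p = {p:.3f}: {decision}")
```

Comparing the one-tailed probability with .025 is equivalent to a two-tailed test at an alpha of .05.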
Remember this: Statistics Needed
–Need to find the best place to draw the regression line on a scatter plot
–Need to quantify the cluster of scores around this regression line (i.e., the correlation coefficient)
Regression allows us to predict!
Straight Line
Y = mX + b
–Y and X are variables representing scores
–m = slope of the line (constant)
–b = intercept of the line with the Y axis (constant)
Excel Example
That’s nice, but... how do you figure out the best values to use for m and b? First, let’s move into the language of regression
Regression Equation
Y = a + bX
–Y = value predicted from a particular X value
–a = point at which the regression line intersects the Y axis
–b = slope of the regression line
–X = X value for which you wish to predict a Y value
Practice
Y = -7 + 2X
–What are the slope and the Y-intercept?
–Determine the value of Y for each X: X = 1, X = 3, X = 5, X = 10
Answers
–Slope = 2; Y-intercept = -7
–Y = -5, Y = -1, Y = 3, Y = 13
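The practice answers can be checked with a few lines of code. This is only a sketch: the intercept of -7 and slope of 2 are inferred from the four listed Y answers (they are the only line passing through all of them), and `predict` is a name made up for the example.

```python
def predict(a: float, b: float, x: float) -> float:
    """Regression equation Y = a + bX."""
    return a + b * x


a, b = -7, 2  # intercept and slope inferred from the listed answers
xs = [1, 3, 5, 10]
ys = [predict(a, b, x) for x in xs]
for x, y in zip(xs, ys):
    print(f"X = {x}: Y = {y}")
```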
Finding a and b
–Uses the least squares method
–Minimizes error
–Error = Y - Ŷ
–∑(Y - Ŷ)² is minimized
Scatter plot with the error of each point marked: Error = 1, Error = -1, Error = .5, Error = -.5, Error = 0
–Error = Y - Ŷ
–∑(Y - Ŷ)² is minimized
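To make "minimized" concrete, the sketch below fits a line by least squares and then shows that nudging the intercept or slope only increases the sum of squared errors. The five data points are hypothetical (they are not from the slides), and both function names are invented for this example.

```python
def fit_least_squares(xs, ys):
    """Return (a, b) for the line Y = a + bX that minimizes squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    b = cov_xy / var_x
    a = mean_y - b * mean_x
    return a, b


def sum_squared_error(xs, ys, a, b):
    """Sum of (Y - predicted Y) squared over all points."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

a, b = fit_least_squares(xs, ys)
best = sum_squared_error(xs, ys, a, b)
nudges = [(0.5, 0.0), (-0.5, 0.0), (0.0, 0.1), (0.0, -0.1)]
nudged = [sum_squared_error(xs, ys, a + da, b + db) for da, db in nudges]
print(f"a = {a:.2f}, b = {b:.2f}, squared error = {best:.2f}")
print("squared error after nudging the line:", [round(v, 2) for v in nudged])
```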
Finding a and b
–Ingredients: COVxy, S²x, mean of Y and mean of X
Regression
Regression Ingredients
–Mean Y = 4.6
–Mean X = 3
–COVxy = 3.75
–S²x = 2.50
–b = COVxy / S²x = 3.75 / 2.50 = 1.5
–a = Mean Y - b(Mean X) = 4.6 - 1.5(3) = .1
Regression Equation
Y = a + bX
–Equation for predicting smiling from talking: Y = .1 + 1.5(X)
Regression Equation
Y = .1 + 1.5(X)
–How many times would a person likely smile if they talked 15 times?
–22.6 = .1 + 1.5(15)
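The arithmetic behind this example can be sketched as follows, assuming the usual formulas b = COVxy / S²x and a = mean Y − b(mean X). The function name is made up for the example; the four ingredient values and the prediction of 22.6 are taken from the slides.

```python
def regression_from_ingredients(mean_x, mean_y, cov_xy, var_x):
    """Return (a, b) from the four ingredients listed on the slide."""
    b = cov_xy / var_x          # slope
    a = mean_y - b * mean_x     # intercept
    return a, b


a, b = regression_from_ingredients(mean_x=3, mean_y=4.6, cov_xy=3.75, var_x=2.50)
predicted_smiles = a + b * 15  # a person who talked 15 times
print(f"Y = {a:.1f} + {b:.1f}(X); predicted smiles for X = 15: {predicted_smiles:.1f}")
```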
Y = .1 + (1.5)X
–X = 1; Y = 1.6
–X = 5; Y = 7.6
–Mean Y = 14.50; Sy = 4.43
–Mean X = 6.00; Sx = 2.16
–Quantify the relationship with a correlation and draw a regression line that predicts aggression.
∑XY = 326; ∑Y = 58; ∑X = 24; N = 4
COV = (∑XY - (∑X)(∑Y)/N) / (N - 1) = (326 - (24)(58)/4) / 3 = -7.33
–Sy = 4.43
–Sx = 2.16
–r = COV / (Sx Sy) = -7.33 / ((2.16)(4.43)) = -.77
Regression Ingredients
–Mean Y = 14.5
–Mean X = 6
–COVxy = -7.33
–S²x = 4.67
–b = COVxy / S²x = -7.33 / 4.67 = -1.57
–a = Mean Y - b(Mean X) = 14.5 - (-1.57)(6) = 23.92
Regression Equation
Y = a + bX
–Y = 23.92 + (-1.57)X
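The aggression example can be reproduced from the sums given on the earlier slide. This is a sketch that assumes the computational formula COV = (∑XY − ∑X∑Y/N) / (N − 1); the function name is invented, and the inputs (∑XY = 326, ∑X = 24, ∑Y = 58, N = 4, S²x = 4.67, Sx = 2.16, Sy = 4.43, means 6 and 14.5) are the slide values.

```python
def covariance_from_sums(sum_xy, sum_x, sum_y, n):
    """Sample covariance from raw sums: (sum XY - sum X * sum Y / N) / (N - 1)."""
    return (sum_xy - sum_x * sum_y / n) / (n - 1)


cov_xy = covariance_from_sums(sum_xy=326, sum_x=24, sum_y=58, n=4)
r = cov_xy / (2.16 * 4.43)   # correlation = COV / (Sx * Sy)
b = cov_xy / 4.67            # slope = COV / variance of X
a = 14.5 - b * 6             # intercept = mean Y - b * mean X

print(f"COV = {cov_xy:.2f}, r = {r:.2f}")
print(f"Y = {a:.2f} + ({b:.2f})X")
```

The slope comes out at -1.57, which agrees with the value shown in the regression equation.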
Hypothesis Testing
Have learned
–How to calculate r as an estimate of the relationship between two variables
–How to calculate b as a measure of the rate of change of Y as a function of X
Next, determine if these values are significantly different from 0
Testing b
–The significance tests for r and b are equivalent
–If X and Y are related (r), then it must be true that Y varies with X (b)
–It is important to learn the b significance test because it is needed for multiple regression
Steps for testing b value
1) State the hypothesis
2) Find t-critical
3) Calculate b value
4) Calculate t-observed
5) Decision
6) Put answer into words
Practice
–You are interested in whether candy consumption significantly alters a person’s depression.
–Create a graph showing the relationship between candy consumption and depression (note: you must figure out which is X and which is Y)
Practice
          Candy  Depression
Charlie   5      55
Augustus  7      43
Veruca    4      59
Mike      3      108
Violet    4      65
Step 1
–H1: b is not equal to 0
–H0: b is equal to 0
Step 2
–Calculate df = N - 2
–df = 3
–Page 747
–First column is df
–Look at an alpha of .05 with two tails
–t crit = 3.182 and -3.182
Step 3
          Candy  Depression
Charlie   5      55
Augustus  7      43
Veruca    4      59
Mike      3      108
Violet    4      65
–COV = -30.5
–N = 5
–r = -.81
–Sy = 24.82
–Sx = 1.52
Step 3
–COV = -30.5
–N = 5
–r = -.81
–Sx = 1.52
–Sy = 24.82
–b = COV / S²x
–Y = a + b(X)
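The Step 3 quantities can be recomputed directly from the five rows of the practice table. The sketch below treats candy as X and depression as Y (an assumption, though it matches Sx = 1.52 on the slide) and uses the sample (N − 1) formulas; `summarize` is a name made up for the example. Because it works from the raw scores, the slope and intercept it prints may differ slightly from values computed by hand with rounded ingredients.

```python
import math


def summarize(xs, ys):
    """Return covariance, standard deviations, r, slope b, and intercept a."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    r = cov_xy / (s_x * s_y)
    b = cov_xy / s_x ** 2
    a = mean_y - b * mean_x
    return {"cov": cov_xy, "s_x": s_x, "s_y": s_y, "r": r, "b": b, "a": a}


candy = [5, 7, 4, 3, 4]            # X: Charlie, Augustus, Veruca, Mike, Violet
depression = [55, 43, 59, 108, 65]  # Y

stats = summarize(candy, depression)
for name, value in stats.items():
    print(f"{name} = {value:.2f}")
```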
Step 4
–Calculate t-observed: t = b / Sb
–b = slope
–Sb = standard error of slope
Step 4
–Sb = Syx / (Sx √(N - 1))
–Syx = standard error of estimate
–Sx = standard deviation of X
Step 4
–Syx = Sy √((1 - r²)(N - 1) / (N - 2))
–Sy = standard deviation of Y
–r = correlation between X and Y
Step 4
–Note: same value as t-observed for r
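The equivalence noted here can be checked numerically. The sketch below assumes the standard formulas Syx = Sy √((1 − r²)(N − 1)/(N − 2)), Sb = Syx / (Sx √(N − 1)), t = b / Sb for the slope, and t = r √(N − 2) / √(1 − r²) for the correlation. Function names are invented, the data are the candy and depression scores from the practice table, and 3.182 is the two-tailed critical t for df = 3 at an alpha of .05.

```python
import math


def t_for_slope(b, s_x, s_y, r, n):
    """t-observed for the slope: b divided by its standard error."""
    s_yx = s_y * math.sqrt((1 - r ** 2) * (n - 1) / (n - 2))  # standard error of estimate
    s_b = s_yx / (s_x * math.sqrt(n - 1))                     # standard error of slope
    return b / s_b


def t_for_r(r, n):
    """t-observed for the correlation coefficient."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


candy = [5, 7, 4, 3, 4]
depression = [55, 43, 59, 108, 65]
n = len(candy)
mean_x = sum(candy) / n
mean_y = sum(depression) / n
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(candy, depression)) / (n - 1)
s_x = math.sqrt(sum((x - mean_x) ** 2 for x in candy) / (n - 1))
s_y = math.sqrt(sum((y - mean_y) ** 2 for y in depression) / (n - 1))
r = cov_xy / (s_x * s_y)
b = cov_xy / s_x ** 2

t_b = t_for_slope(b, s_x, s_y, r, n)
t_r = t_for_r(r, n)
t_crit = 3.182  # df = 3, alpha = .05, two-tailed
decision = "reject H0" if abs(t_b) > t_crit else "fail to reject H0"
print(f"t for b = {t_b:.3f}, t for r = {t_r:.3f}: {decision}")
```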
Step 5
If t obs falls in the critical region:
–Reject H0, and accept H1
If t obs does not fall in the critical region:
–Fail to reject H0
t distribution
–t crit = -3.182
–t crit = 3.182
Practice
Page
The regression equation for faculty shows that the best estimate of starting salary is $15,000 (intercept), and for every additional year of service the salary increases on average by $900 (slope). For administrative staff, the best estimate of starting salary is $10,000 (intercept), and for every additional year the salary increases on average by $1,500 (slope). The two salaries will be equal at 8.33 years of service.
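The 8.33-year figure comes from setting the two regression equations equal and solving for X. A minimal sketch, using the intercepts and slopes quoted above (the function name is made up for the example):

```python
def years_until_equal(a1, b1, a2, b2):
    """Solve a1 + b1*x = a2 + b2*x for x (the lines must have different slopes)."""
    if b1 == b2:
        raise ValueError("parallel lines never cross")
    return (a2 - a1) / (b1 - b2)


# Faculty: 15,000 + 900 per year; administrative staff: 10,000 + 1,500 per year.
years = years_until_equal(15000, 900, 10000, 1500)
salary_at_crossing = 15000 + 900 * years
print(f"Salaries are equal after {years:.2f} years, at ${salary_at_crossing:,.0f}")
```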
Practice Page
–r = .68, r′ = .829
–r = .51, r′ = .563
–Z = .797
–p = .2119
–Correlations are not different from each other
Discuss 9.38
SPSS Problem #3
Due March 15th
Page 287
–9.2
–9.3
–9.10 and create a graph by hand