Published by Sherman Darrell Taylor. Modified over 9 years ago.
1
Psyc 235: Introduction to Statistics DON’T FORGET TO SIGN IN FOR CREDIT! http://www.psych.uiuc.edu/~jrfinley/p235/
2
End of Term Business
3rd Graded Assessment next M & W
Same procedure as the last two
If you want to sign up for a specific time, please email one of us.
Extra Credit Final Requirements
84 hours on ALEKS by 11:59 pm April 30th
You can check how many in-class hours you have during the assessment
You must schedule an extra credit final (no walk-ins) by 11:59 May 4th (email is fine)
Assessment can be scheduled anytime 9-5 on May 8th or May 9th
3
Questions?
You master      You receive
98% or more     A+
92% – 97%       A
84% – 91%       B
75% – 83%       C
65% – 74%       D
< 65%           F
4
Studying Relationships between Variables
When considering relationships:
An independent variable (X) usually has many quantitative levels
We want to show that the dependent variable (Y) is some function of the independent variable
Correlation & Regression
5
Correlation & Regression
What’s the difference?
Usually, 2 observations for each of N subjects
Consider these 2 examples:
Running speed in a maze (Y) & number of trials to reach criterion (X)
Running speed (Y) & the number of food pellets per reinforcement (X)
What’s the difference here?
In both, Y is a random variable. In one, X is fixed; in the other, X is random.
6
Difference between correlation & regression
When X is a fixed variable: regression
When X is a random variable: correlation
Note: in practice they are very similar, and are often differentiated by the purpose of the research
Goal is to predict Y from X: regression
Goal is to know the degree of relationship: correlation
7
Last Week: Scatter Plots X-axis: predictor variable Y-axis: criterion variable
8
Consider Data: What do you see? Unexpected? As we continue, think about why we might see this pattern What is this line? Regression Line: Y predicted on X Given a value of X, our prediction of what Y is likely to be
9
Correlation
Correlation is a measure of the degree to which the data cluster around that line. If all the data fall on a line with positive slope, r = +1.
10
Correlation Coefficient: r
sign: direction of relationship
magnitude (number): strength of relationship
-1 ≤ r ≤ 1
r = 0 is no linear relationship
r = -1 is “perfect” negative correlation
r = 1 is “perfect” positive correlation
Notes:
Symmetric measure (you can exchange X and Y and get the same value)
Measures linear relationship only
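The two notes on this slide can be checked numerically. The sketch below is illustrative only: `pearson_r` is a hypothetical helper written with the usual sample formula (covariance divided by the product of the standard deviations), and both small data sets are made up for the demonstration.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson r: covariance over the product of standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (s_x * s_y)

# Made-up data: swapping the roles of the two variables leaves r unchanged.
a = [1, 2, 3, 4, 5]
b = [2, 1, 4, 3, 6]
r_ab = pearson_r(a, b)
r_ba = pearson_r(b, a)

# Made-up data: Y depends perfectly on X (Y = X^2), but not linearly, so r is 0.
x = [-2, -1, 0, 1, 2]
y = [v ** 2 for v in x]
r_curve = pearson_r(x, y)

print(r_ab, r_ba, r_curve)
```

An r of 0 in the second case does not mean X and Y are unrelated, only that there is no linear trend.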
11
Example for discussion: Howell (1988) investigated the relationship between stress and health.
Measures:
Stress: a scale to measure frequency, perceived importance, and desirability of life events
Health: Hopkins Symptom Checklist for 57 psychological symptoms
12
What do we always do first in a study? Look at the raw data! What do you notice? unimodal positively skewed few outliers good variability
13
Covariance
The correlation coefficient is based on covariance
Covariance: a number that reflects the degree to which two variables vary together
$$\mathrm{Cov}_{XY} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{N - 1}$$
14
Think about it… If there were a positive correlation between these two variables, what would we expect to find?
$\sum X = 2297$, $\sum X^2 = 67489$, $\bar{X} = 21.467$, $s_X = 13.096$
$\sum Y = 9705$, $\sum Y^2 = 923787$, $\bar{Y} = 90.701$, $s_Y = 20.266$
$\sum XY = 222576$, $N = 107$
15
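Think about it…
$\sum X = 2297$, $\sum X^2 = 67489$, $\bar{X} = 21.467$, $s_X = 13.096$
$\sum Y = 9705$, $\sum Y^2 = 923787$, $\bar{Y} = 90.701$, $s_Y = 20.266$
$\sum XY = 222576$, $N = 107$
$$\mathrm{Cov}_{XY} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{N - 1} \quad\text{or}\quad \mathrm{Cov}_{XY} = \frac{\sum XY - \dfrac{\sum X \sum Y}{N}}{N - 1}$$
$\mathrm{Cov}_{XY} = 134.301$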
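The computational formula can be checked against the summary sums. This is a minimal sketch: `covariance_from_sums` is a hypothetical helper name, and it assumes $\sum XY = 222576$, the value that reproduces the reported covariance of 134.301.

```python
# Summary sums for the stress (X) and symptom (Y) data.
# Assumption: sum_xy = 222576, the value consistent with Cov = 134.301.
n = 107
sum_x = 2297
sum_y = 9705
sum_xy = 222576

def covariance_from_sums(sum_xy, sum_x, sum_y, n):
    """Computational formula: (sum XY - (sum X)(sum Y)/N) / (N - 1)."""
    return (sum_xy - sum_x * sum_y / n) / (n - 1)

cov_xy = covariance_from_sums(sum_xy, sum_x, sum_y, n)
print(round(cov_xy, 3))  # 134.301
```

A positive covariance means that above-average stress scores tend to go with above-average symptom scores.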
16
Pearson Correlation Coefficient (r)
The way we were talking about the covariance suggests that it might be a measure of the relationship between 2 variables
But: covariance is also a function of the standard deviations of X and Y
To resolve:
$$r = \frac{\mathrm{Cov}_{XY}}{s_X s_Y}$$
$$r = \frac{134.301}{(13.1)(20.3)} = .506$$
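As a quick arithmetic check, r can be recomputed from the covariance and the two standard deviations. The variable names below are illustrative, and the inputs are the rounded values reported on the slides.

```python
# Rounded summary values reported for the stress data.
cov_xy = 134.301
s_x = 13.096
s_y = 20.266

# Dividing by both standard deviations removes the units of X and Y,
# which is what keeps r between -1 and 1.
r = cov_xy / (s_x * s_y)
print(round(r, 3))  # 0.506
```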
17
Last week you learned: Formula: alt. formula (ALEKS): So this is the same formula for r, just in a slightly different form. r is the covariance, adjusted by the standard deviations of X and Y. Also note: this r is not an unbiased estimate of the correlation coefficient in the population, so there is an adjusted formula for small sample sizes (not common).
18
Looking at the data…
Equation for a straight line: $\hat{Y} = bX + a$
$\hat{Y}$ = predicted Y value
b = slope
a = intercept
X = value of the predictor variable
Our task is to solve for a & b to find the best-fitting linear function.
A logical way to do this is in terms of errors of prediction…
So, how far apart are the predicted Ys ($\hat{Y}$) and the actual Ys?
19
Optimal values of a & b
$$a = \bar{Y} - b\bar{X}$$
$$b = \frac{\mathrm{Cov}_{XY}}{s_X^2}$$
In our stress data set: b = .7831, a = 73.891
$\hat{Y} = .78X + 73.9$
So, we could find a predicted value of Y for any X… And then find the difference between predicted and actual.
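The slope and intercept can be reproduced from the summary statistics. This is a sketch under stated assumptions: `predict` is a hypothetical helper, and the stress score of 10 is an arbitrary illustrative value.

```python
# Rounded summary values reported for the stress data.
cov_xy = 134.301
s_x = 13.096
mean_x = 21.467
mean_y = 90.701

# Least-squares slope and intercept.
b = cov_xy / s_x ** 2
a = mean_y - b * mean_x

def predict(x):
    """Predicted Y (y hat) for a given X on the fitted line."""
    return b * x + a

print(round(b, 4), round(a, 3))   # 0.7831 73.891
print(round(predict(10), 1))      # 81.7 for an illustrative stress score of 10
```

Because $a = \bar{Y} - b\bar{X}$, the fitted line always passes through the point $(\bar{X}, \bar{Y})$.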
20
[Figure: X and Y axes with the estimated regression line] Find $(Y - \hat{Y})^2$
21
[Figure: X and Y axes with the estimated regression line] $\hat{Y}_i$ = “y hat”: predicted value of Y for $X_i$
22
ALEKS: $\hat{Y} = b_0 + b_1 X$, with $b_1$ (slope) and $b_0$ (Y intercept) computed using the correlation coefficient
23
Standard Error of Estimate
So if we wanted to predict a value of Y from X, we could just plug it into the formula.
But! We would expect our predicted Y to vary some from the actual data
So we calculate the likely error:
$$s^2_{Y \cdot X} = \frac{\sum (Y - \hat{Y})^2}{N - 2} = \frac{SS_{residual}}{N - 2}$$
24
Data Example
$\sum (Y - \hat{Y}) = 0$
$\sum (Y - \hat{Y})^2 = 32{,}388.049$
$$s^2_{Y \cdot X} = \frac{32388.049}{N - 2} = \frac{32388.049}{105} = 308.458$$
$s_{Y \cdot X} = 17.56$
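The arithmetic on this slide can be sketched as follows, using the residual sum of squares reported for the stress data; the variable names are illustrative.

```python
import math

# Residual sum of squares and sample size reported for the stress data.
ss_residual = 32388.049
n = 107

# Two degrees of freedom are used up estimating the slope and the intercept.
variance_of_estimate = ss_residual / (n - 2)
s_yx = math.sqrt(variance_of_estimate)

print(round(s_yx, 2))  # 17.56
```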
25
Confidence Limits on Y
Although the standard error of estimate is useful as an overall measure of error, it is not a good estimate of the error in any single prediction
If we want to predict Y from X for a NEW member of the population, the error is given by…
26
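CI on Y
$$s'_{Y \cdot X} = s_{Y \cdot X}\sqrt{1 + \frac{1}{N} + \frac{(X_i - \bar{X})^2}{(N - 1)s_X^2}}$$
And then we form the Confidence Interval as we have in the past:
$$CI(Y) = \hat{Y} \pm (t_{\alpha/2})(s'_{Y \cdot X})$$
In our data: $s'_{Y \cdot X} = 17.7$
$CI(Y) = 81.7 \pm 1.98(17.7)$, with $t_{\alpha/2} \approx 1.98$ for df = 105
$46.6 < Y < 116.9$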
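A minimal sketch of this interval is given below. It rests on two assumptions that are not stated on the slide: the new stress score is taken to be X = 10 (which gives a predicted Y of about 81.7), and the two-tailed 95% critical value for df = 105 is taken to be 1.983. `prediction_interval` is a hypothetical helper name.

```python
import math

def prediction_interval(x_new, b, a, s_yx, mean_x, s_x, n, t_crit):
    """Return (s_prime, lower, upper) for predicting Y for one new case at x_new."""
    y_hat = b * x_new + a
    s_prime = s_yx * math.sqrt(
        1 + 1 / n + (x_new - mean_x) ** 2 / ((n - 1) * s_x ** 2)
    )
    margin = t_crit * s_prime
    return s_prime, y_hat - margin, y_hat + margin

# Values from the stress data; x_new = 10 and t_crit = 1.983 are assumptions.
s_prime, lower, upper = prediction_interval(
    x_new=10, b=0.7831, a=73.891, s_yx=17.56,
    mean_x=21.467, s_x=13.096, n=107, t_crit=1.983,
)
print(round(s_prime, 1), round(lower, 1), round(upper, 1))
```

The interval widens as the new X moves away from the mean of X, because the $(X_i - \bar{X})^2$ term grows.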
27
Hypothesis Testing We know how to calculate r and b (the slope) We may want to test the null hypothesis that the corresponding population parameters are 0.
28
Testing r
We can show that when ρ (rho) = 0, r will be approximately normally distributed around 0.
$$t = \frac{r\sqrt{N - 2}}{\sqrt{1 - r^2}} \quad\text{with } df = N - 2$$
So, in the stress example: r = .506 and N = 107
t = 6.011 and we reject $H_0: \rho = 0$
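The t statistic for r can be reproduced directly from the formula. This is a sketch with a hypothetical helper name, `t_for_r`.

```python
import math

def t_for_r(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

t_r = t_for_r(0.506, 107)
print(round(t_r, 3))  # 6.011
```

With df = 105, a t this large is far beyond the usual two-tailed critical value of about 1.98, which is why the null hypothesis is rejected.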
29
Testing b
If you think about it, testing b is very similar to testing r… Right? (The slope is nonzero if X & Y are related.)
b* = underlying population value of b
Standard error approximated by:
$$s_b = \frac{s_{Y \cdot X}}{s_X\sqrt{N - 1}}$$
30
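Testing b
$$t = \frac{b - b^*}{s_b} = \frac{b}{\dfrac{s_{Y \cdot X}}{s_X\sqrt{N - 1}}} = \frac{(b)(s_X)\sqrt{N - 1}}{s_{Y \cdot X}} \quad\text{(under } H_0: b^* = 0\text{)}$$
In our data: $t = (.7831)(13.096)(\sqrt{106})/17.56 \approx 6.01$
This is the same t value as for r (up to rounding)!!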
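The agreement between the two tests can be checked numerically. The sketch below uses the rounded values from the slides, so the two t statistics match only up to rounding; the variable names are illustrative.

```python
import math

# Rounded values reported for the stress data.
b = 0.7831
s_x = 13.096
s_yx = 17.56
r = 0.506
n = 107

# Standard error of the slope, then t for H0: b* = 0.
s_b = s_yx / (s_x * math.sqrt(n - 1))
t_b = b / s_b

# t for H0: rho = 0, for comparison.
t_r = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

print(round(t_b, 2), round(t_r, 2))  # 6.01 6.01
```

The two tests are algebraically equivalent, which is why they give the same answer about whether X and Y are linearly related.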
31
CI on b
$$CI(b^*) = b \pm (t_{\alpha/2})\,\frac{s_{Y \cdot X}}{s_X\sqrt{N - 1}}$$
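Applied to the stress data, the interval can be sketched as follows. The critical value 1.983 (two-tailed 95%, df = 105) is an assumption not given on the slide, and the variable names are illustrative.

```python
import math

# Rounded values reported for the stress data.
b = 0.7831
s_x = 13.096
s_yx = 17.56
n = 107
t_crit = 1.983  # assumed two-tailed 95% critical value for df = 105

s_b = s_yx / (s_x * math.sqrt(n - 1))
lower = b - t_crit * s_b
upper = b + t_crit * s_b

print(round(lower, 3), round(upper, 3))
```

Under these assumptions the interval excludes 0, which is consistent with rejecting the null hypothesis that the population slope is 0.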
32
ICES Forms