Reasoning in Psychology Using Statistics 2017
Announcements
- Exam 2 is in lecture and lab on Wednesday.
- Be prepared to do calculations (including square roots) on a calculator.
Cautions with Correlations
Mathematical cautions
- Different scales: convert to z-scores
- Restriction of range (e.g., age & height)
- Outliers (especially in small samples)
Interpretive caution
- Causal claims
Pearson’s r, z transformation
- Change all scores to z-scores (convert both X and Y).
- Both variables are then on the same scale.
- The correlation stays the same.
- What happens to the means?
[Figure: scatterplot of X and Y redrawn on z-score axes (zx, zy)]
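The claims on this slide can be checked numerically. The following is a minimal sketch, assuming the hours-of-sleep and GPA scores from the survey example in these slides and sample (n - 1) standard deviations; the function names are illustrative, not part of the lecture.

```python
from math import sqrt

# Hours of sleep and GPA for the 10 survey respondents in the lecture example.
sleep = [7, 6, 7, 8, 8, 7, 9, 5, 9, 6]
gpa = [2.4, 3.9, 3.5, 2.8, 3.0, 2.1, 3.9, 2.9, 3.6, 2.7]


def mean(scores):
    return sum(scores) / len(scores)


def sample_sd(scores):
    # Sample standard deviation: divide the sum of squares by n - 1.
    m = mean(scores)
    ss = sum((x - m) ** 2 for x in scores)
    return sqrt(ss / (len(scores) - 1))


def z_scores(scores):
    m, sd = mean(scores), sample_sd(scores)
    return [(x - m) / sd for x in scores]


def pearson_r(xs, ys):
    mx, my = mean(xs), mean(ys)
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ssx = sum((x - mx) ** 2 for x in xs)
    ssy = sum((y - my) ** 2 for y in ys)
    return sp / sqrt(ssx * ssy)


zx, zy = z_scores(sleep), z_scores(gpa)
r_raw = pearson_r(sleep, gpa)
r_z = pearson_r(zx, zy)

print(f"r on raw scores: {r_raw:.3f}")
print(f"r on z-scores:   {r_z:.3f}")
print(f"mean of zx: {mean(zx):.3f}, mean of zy: {mean(zy):.3f}")
```

The two r values agree, and both z-score means come out at zero (up to floating-point rounding).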
Restriction of range
- Across the total data, SAT and GPA are positively correlated: r > 0.
- What is the correlation between SAT and GPA among only those who were admitted and studied (400 < SAT < 700)? Within that restricted range, r = 0.
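A rough illustration of the effect, using invented numbers rather than the data plotted on the slide: SAT scores on an even grid, with GPA rising linearly plus a fixed up/down wobble standing in for noise. In the slide's plot the restricted correlation drops to 0; in this made-up data set it only shrinks, but the direction of the effect is the same.

```python
from math import sqrt


def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ssx = sum((x - mx) ** 2 for x in xs)
    ssy = sum((y - my) ** 2 for y in ys)
    return sp / sqrt(ssx * ssy)


# Invented data, not from the lecture: SAT from 200 to 800 in steps of 10,
# GPA = 1.0 + 0.003 * SAT, alternately nudged up and down by 0.3.
sat = list(range(200, 801, 10))
gpa = [1.0 + 0.003 * s + (0.3 if i % 2 == 0 else -0.3) for i, s in enumerate(sat)]

# Keep only the cases inside the restricted range named on the slide.
restricted = [(s, g) for s, g in zip(sat, gpa) if 400 < s < 700]
sat_r = [s for s, _ in restricted]
gpa_r = [g for _, g in restricted]

r_full = pearson_r(sat, gpa)
r_restricted = pearson_r(sat_r, gpa_r)

print(f"full range (n = {len(sat)}): r = {r_full:.2f}")
print(f"400 < SAT < 700 (n = {len(sat_r)}): r = {r_restricted:.2f}")
```

The relationship between SAT and GPA is identical in both subsets; only the spread of SAT changes, and that alone lowers r.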
Outliers
- One extreme score can change the correlation (especially in a small sample).
- On the left, 5 observations: high X is associated with high Y, so predictability is good.
- On the right, the same 5 observations plus 1 other: high X is associated with high or low Y, so predictability is poor.
Causal claims
We’d like to say: X causes Y.
To be able to do this:
- The causal variable must come first.
- There must be co-variation between the two variables.
- We need to eliminate plausible alternative explanations.
Directionality problem (temporal precedence):
- Do happy people sleep well?
- Or does sleeping well make you happy?
Third variable problem:
- Do happy people sleep well?
- Or does sleeping well make you happy?
- Or does something else make people both happy and sleep well? Possible other variables:
  - Regular exercise
  - Minimal use of drugs & alcohol
  - Being a conscientious person
  - Being in a good relationship
Coincidence (random co-occurrence):
- r = 0.52 correlation between the number of Republicans in the US Senate and the number of sunspots (from Fun with correlations; see also Spurious correlations).
- Correlation is not causation blog posts: Internet’s favorite phrase; Why we keep saying it.
Review for Exam 2: Descriptive statistics
Statistical procedures to help organize, summarize & simplify large sets of data.
One variable (frequency distribution)
- Display results in a frequency distribution table & histogram (or bar chart if the variable is categorical).
- Make a deviations table to get measures of central tendency (mode, median, mean) & variability (range, standard deviation, variance).
Two variables (bivariate distribution)
- Display results: make a scatterplot.
- Make a bivariate deviations table or z-table to get Pearson’s r.
Z-scores & normal distribution
Example
Are hours sleeping related to GPA? You conduct a survey.
- Your sample of 10 gives these results for average hours per night sleeping: 7, 6, 7, 8, 8, 7, 9, 5, 9, 6
- You also have respondents give their overall GPA: 2.4, 3.9, 3.5, 2.8, 3.0, 2.1, 3.9, 2.9, 3.6, 2.7
We will focus on the sleep results first and then on both variables together.
- What kind of scales are they?
- To find the standard deviation, will we use the formula for a population or a sample?
Step 1: Frequency distribution & histogram
Hrs. sleep, n = 10: 7, 6, 7, 8, 8, 7, 9, 5, 9, 6
p = f/n; % = 100 × p; cf = cumulative frequency (number of scores at or below X); c% = cumulative percent.

    X    f     p     %    cf    c%
    9    2    .2    20    10   100
    8    2    .2    20     8    80
    7    3    .3    30     6    60
    6    2    .2    20     3    30
    5    1    .1    10     1    10
    ∑   10   1.0   100

The first two columns (X and f) supply the X and Y axes of the histogram.
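The columns of the frequency distribution table can also be produced programmatically. This is a minimal sketch using the hours-of-sleep scores from the example; the dictionary keys (`pct`, `c_pct`) are illustrative names for the % and c% columns.

```python
from collections import Counter

sleep = [7, 6, 7, 8, 8, 7, 9, 5, 9, 6]
n = len(sleep)
counts = Counter(sleep)

# Accumulate frequencies from the lowest score up, so that cf counts
# every score at or below X.
rows = []
running = 0
for x in sorted(counts):
    f = counts[x]
    running += f
    rows.append({"X": x, "f": f, "p": f / n, "pct": 100 * f / n,
                 "cf": running, "c_pct": 100 * running / n})

# Print the highest score first, matching the layout of the table.
print(f"{'X':>3} {'f':>3} {'p':>5} {'%':>5} {'cf':>4} {'c%':>5}")
for row in reversed(rows):
    print(f"{row['X']:>3} {row['f']:>3} {row['p']:>5.1f} {row['pct']:>5.0f} "
          f"{row['cf']:>4} {row['c_pct']:>5.0f}")
```

Reading across the row for X = 7, for instance, 60% of the sample slept 7 hours or fewer.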
[Figure: histogram of hours of sleep; x-axis: SCORE (5 to 9), y-axis: FREQUENCY]
A weighted mean
Suppose that you combine two groups together. How do you compute the new group mean?
- Weight each group’s mean by its size: new mean = (n1 × M1 + n2 × M2) / (n1 + n2).
[Figure: scores of 110 and 140 in Group 1 and Group 2, combined into a New Group]
Be careful computing the mean of a frequency distribution: remember that there are groups here, because each X value stands for f scores.
- For the hours-of-sleep table (X = 9, 8, 7, 6, 5), the mean is ∑(f × X) / ∑f, not the mean of the five X values.
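A short sketch of both situations, using the hours-of-sleep frequencies from the example. The split of the ten scores into groups of 4 and 6 is hypothetical, chosen only to show that the group sizes matter; `combined_mean` is an illustrative helper name.

```python
# Frequency table for hours of sleep: score X -> frequency f.
freq = {9: 2, 8: 2, 7: 3, 6: 2, 5: 1}

n = sum(freq.values())
weighted_mean = sum(x * f for x, f in freq.items()) / n

# The tempting mistake: averaging the five distinct X values as if each
# occurred exactly once.
naive_mean = sum(freq) / len(freq)


def combined_mean(n1, m1, n2, m2):
    """Mean of two pooled groups, weighting each group mean by its size."""
    return (n1 * m1 + n2 * m2) / (n1 + n2)


# Hypothetical split of the same 10 scores into groups of 4 and 6.
group1 = [7, 6, 7, 8]
group2 = [8, 7, 9, 5, 9, 6]
m1 = sum(group1) / len(group1)
m2 = sum(group2) / len(group2)
pooled = combined_mean(len(group1), m1, len(group2), m2)
unweighted = (m1 + m2) / 2

print(f"weighted mean from table: {weighted_mean:.2f}")
print(f"mean of the X column only: {naive_mean:.2f}")
print(f"pooled mean of two groups: {pooled:.2f}")
print(f"simple average of the two group means: {unweighted:.2f}")
```

The weighted versions recover the sample mean of 7.2; the unweighted shortcuts do not.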
Characteristics of a mean & standard deviation
The mean
- Change/add/delete a given score, then the mean will change.
- Add/subtract a constant to each score, then the mean will change by adding (subtracting) that constant.
- Multiply (or divide) each score by a constant, then the mean will change by being multiplied (or divided) by that constant.
The standard deviation
- Change/add/delete a given score, then the standard deviation will change.
- Add/subtract a constant to each score, then the standard deviation will NOT change.
- Multiply (or divide) each score by a constant, then the standard deviation will change by being multiplied (or divided) by that constant.
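These properties are easy to verify on the hours-of-sleep scores. In this sketch the constants (adding 2, multiplying by 60 to turn hours into minutes) are arbitrary choices for illustration, and `stdev` from Python's `statistics` module is the sample (n - 1) standard deviation.

```python
from statistics import mean, stdev

sleep = [7, 6, 7, 8, 8, 7, 9, 5, 9, 6]

shifted = [x + 2 for x in sleep]   # add a constant to each score
scaled = [x * 60 for x in sleep]   # multiply each score by a constant

for label, scores in [("original", sleep), ("plus 2", shifted), ("times 60", scaled)]:
    print(f"{label:>9}: mean = {mean(scores):.2f}, SD = {stdev(scores):.2f}")
```

Adding 2 moves the mean up by 2 and leaves the SD alone; multiplying by 60 multiplies both the mean and the SD by 60.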
Step 2: Deviations table
Hrs. sleep, n = 10. Create the table, sorted in descending order.

    X     X - M   (X - M)²
    9      1.8      3.24
    9      1.8      3.24
    8       .8       .64
    8       .8       .64
    7      -.2       .04
    7      -.2       .04
    7      -.2       .04
    6     -1.2      1.44
    6     -1.2      1.44
    5     -2.2      4.84
    ∑ 72   0.0     15.6 = SS

Example row: 9 - 7.2 = 1.8, and 1.8² = 3.24.
- Mode = 7
- Median = 7
- Mean = (∑X)/n = 72/10 = 7.2
- Range = 5 to 9
- SD for sample = √(SS/(n - 1)) = √(15.6/9) = √1.73 = 1.32
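The same deviations table can be rebuilt in a few lines. This is a sketch under the slide's assumption that the 10 respondents are a sample, so the sum of squares is divided by n - 1; the population version is shown only for contrast.

```python
from math import sqrt
from statistics import median, mode

# Hours of sleep, sorted in descending order as in the table.
sleep = sorted([7, 6, 7, 8, 8, 7, 9, 5, 9, 6], reverse=True)
n = len(sleep)
m = sum(sleep) / n

deviations = [x - m for x in sleep]
squared = [d ** 2 for d in deviations]
ss = sum(squared)  # sum of squared deviations (SS)

print("  X   X-M  (X-M)^2")
for x, d, sq in zip(sleep, deviations, squared):
    print(f"{x:>3} {d:>5.1f} {sq:>8.2f}")

sd_sample = sqrt(ss / (n - 1))   # sample formula, used on the slide
sd_population = sqrt(ss / n)     # population formula, for comparison

print(f"mode = {mode(sleep)}, median = {median(sleep)}, mean = {m}")
print(f"SS = {ss:.1f}, sample SD = {sd_sample:.2f}, population SD = {sd_population:.2f}")
```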
Step 3: Scatterplot

    Person   Hrs.   GPA
    A         7     2.4
    B         6     3.9
    C         7     3.5
    D         8     2.8
    E         8     3.0
    F         7     2.1
    G         9     3.9
    H         5     2.9
    I         9     3.6
    J         6     2.7

[Figure: scatterplot of GPA (y-axis, 1.0 to 4.0) against hours of sleep (x-axis, 5 to 9), points labeled A to J]
What does the shape of the envelope indicate about the correlation? Low positive correlation.
Step 3: Scatterplot, effect of outlier
- Add person K, with 5 hours of sleep and a GPA of 1.0, to persons A to J. What does the shape of the envelope indicate about the correlation? Moderate positive correlation.
- Instead, add person K with 9 hours of sleep and a GPA of 1.0. What does the shape of the envelope indicate about the correlation? Low negative correlation.
[Figure: the scatterplot of GPA against hours of sleep, redrawn with K at each position]
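The shift caused by a single added point can be checked by computing r for each version of the data. A minimal sketch, with `pearson_r` as an illustrative helper and the (hours, GPA) pairs taken from the table of persons A to J:

```python
from math import sqrt


def pearson_r(pairs):
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sp = sum((x - mx) * (y - my) for x, y in pairs)
    ssx = sum((x - mx) ** 2 for x, _ in pairs)
    ssy = sum((y - my) ** 2 for _, y in pairs)
    return sp / sqrt(ssx * ssy)


# (hours of sleep, GPA) for persons A to J.
base = [(7, 2.4), (6, 3.9), (7, 3.5), (8, 2.8), (8, 3.0),
        (7, 2.1), (9, 3.9), (5, 2.9), (9, 3.6), (6, 2.7)]

r_base = pearson_r(base)
r_k_low = pearson_r(base + [(5, 1.0)])    # K: little sleep, low GPA
r_k_high = pearson_r(base + [(9, 1.0)])   # K: lots of sleep, low GPA

print(f"A to J only:        r = {r_base:.2f}")
print(f"plus K at (5, 1.0): r = {r_k_low:.2f}")
print(f"plus K at (9, 1.0): r = {r_k_high:.2f}")
```

With only 11 observations, one point is enough to move r from low positive to moderate positive or to slightly negative, matching the descriptions on the slides.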
Step 4: Bivariate deviations table
n = 10. Note the signs of the products! Will r be + or -?

      X   X - Mx  (X - Mx)²     Y   Y - My  (Y - My)²  (X - Mx)(Y - My)
      9     1.8     3.24      3.9    0.82     0.67          1.476
      9     1.8     3.24      3.6    0.52     0.27          0.936
      8     0.8     0.64      3.0   -0.08     0.01         -0.064
      8     0.8     0.64      2.8   -0.28     0.08         -0.224
      7    -0.2     0.04      3.5    0.42     0.18         -0.084
      7    -0.2     0.04      2.4   -0.68     0.46          0.136
      7    -0.2     0.04      2.1   -0.98     0.96          0.196
      6    -1.2     1.44      3.9    0.82     0.67         -0.984
      6    -1.2     1.44      2.7   -0.38     0.14          0.456
      5    -2.2     4.84      2.9   -0.18     0.03          0.396
    Sum    72  0.0  15.6 = SSX     30.8        3.47 = SSY   2.24 = SP
    Mean   7.2                     3.08
Pearson’s r & summary statistics
r = SP / √(SSX × SSY), where SP is the sum of the XY co-deviations and SSX and SSY are the sums of the squared X deviations and squared Y deviations.
r = 2.24 / √(15.6 × 3.47) = 2.24 / √54.132 = 2.24 / 7.357 = .304
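The three summary quantities and r can be recomputed directly from the raw scores. This sketch keeps full precision, so SSY comes out slightly above the 3.47 obtained by summing rounded table entries; r still rounds to .304.

```python
from math import sqrt

sleep = [7, 6, 7, 8, 8, 7, 9, 5, 9, 6]
gpa = [2.4, 3.9, 3.5, 2.8, 3.0, 2.1, 3.9, 2.9, 3.6, 2.7]

n = len(sleep)
mx = sum(sleep) / n
my = sum(gpa) / n

dev_x = [x - mx for x in sleep]
dev_y = [y - my for y in gpa]

ssx = sum(d ** 2 for d in dev_x)                  # sum of squared X deviations
ssy = sum(d ** 2 for d in dev_y)                  # sum of squared Y deviations
sp = sum(dx * dy for dx, dy in zip(dev_x, dev_y)) # sum of XY co-deviations

r = sp / sqrt(ssx * ssy)

print(f"SSX = {ssx:.3f}, SSY = {ssy:.3f}, SP = {sp:.3f}")
print(f"r = {r:.3f}")
```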
An example
SRA (Scientific Reasoning Assessment) (fictional). Based on normative data: Normal, μ = 50.0, σ = 10.0
Preparing for your analyses
- Write down what you know.
- Make a sketch of the distribution (make a note: population or sample).
- Determine the shape.
- What is the best measure of center?
- What is the best measure of variability?
- Mark the mean (center) and standard deviation on your sketch.
[Figure: sketch of the normal distribution with μ at the center and 40 and 60 marked]
z-scores & Normal Distribution
SRA (Scientific Reasoning Assessment) (fictional). Based on normative data: Normal distr., μ = 50.0, σ = 10.0
Question 1: If George got a 35 on the SRA, what is his percentile rank?
- Since this is a normal distribution, we can use the Unit Normal Table to infer the percentile.
- z = (35 - 50)/10 = -1.5; the Unit Normal Table gives a proportion of 0.0668 in the tail below that z.
- That’s 6.68% at or below this score (definition of percentile).
[Figure: normal curve centered on μ, with 40 and 60 marked at z = -1.0 and z = 1.0]
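The table lookup can be cross-checked with Python's standard library, which provides the normal cumulative distribution directly. A minimal sketch, assuming the SRA norms given above:

```python
from statistics import NormalDist

# Normative SRA distribution from the slide.
sra = NormalDist(mu=50.0, sigma=10.0)

george = 35
z = (george - sra.mean) / sra.stdev   # distance from the mean in SD units
percentile = sra.cdf(george)          # proportion at or below George's score

print(f"z = {z:.1f}")
print(f"proportion at or below {george}: {percentile:.4f}")
```

The computed proportion agrees with the 0.0668 read from the Unit Normal Table.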
Question 2: What proportion of people get between a 40 and 60 on the SRA?
- Since this is a normal distribution, we can use the Unit Normal Table.
- 40 and 60 are at z = -1.0 and z = 1.0; the table gives 0.1587 in each tail.
- That’s about 32% outside these two scores.
- That leaves 68% between these two scores.
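The same answer follows from subtracting two cumulative proportions. A short sketch, again assuming the SRA norms (μ = 50.0, σ = 10.0):

```python
from statistics import NormalDist

sra = NormalDist(mu=50.0, sigma=10.0)

below_40 = sra.cdf(40)        # lower tail, z = -1.0
above_60 = 1 - sra.cdf(60)    # upper tail, z = +1.0
between = sra.cdf(60) - sra.cdf(40)

print(f"below 40: {below_40:.4f}")
print(f"above 60: {above_60:.4f}")
print(f"between 40 and 60: {between:.4f}")
```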
Question 3a: Suppose that Chandra took a different reasoning assessment (the RSE; based on normative data: Normal distr., μ = 100, σ = 15). She received a 130 on the RSE. Assuming that the two assessments are highly positively correlated, what is the equivalent score on the SRA?
- Transformation (for RSE): convert the RSE score to a z-score, z = (X - μ)/σ.
- Transformation (for SRA): convert that z-score to an SRA score, X = μ + zσ.
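Worked through in code, the two-step transformation looks like this. The helper names `to_z` and `from_z` are illustrative; the norms are the ones stated for the RSE and the SRA.

```python
def to_z(score, mu, sigma):
    """Express a raw score as a z-score within its own distribution."""
    return (score - mu) / sigma


def from_z(z, mu, sigma):
    """Convert a z-score back to a raw score in a target distribution."""
    return mu + z * sigma


# Step 1 (for RSE): Chandra's 130 on a scale with mu = 100, sigma = 15.
z = to_z(130, mu=100, sigma=15)

# Step 2 (for SRA): the same z on a scale with mu = 50.0, sigma = 10.0.
sra_equivalent = from_z(z, mu=50.0, sigma=10.0)

print(f"z on the RSE: {z}")
print(f"equivalent SRA score: {sra_equivalent}")
```

Chandra is 2 standard deviations above the RSE mean, which corresponds to 70 on the SRA.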
Wrap up
- In lab: continue to review, including SPSS.
- Questions?