Teaching Statistics in Psychology. UTOPPS, Fall 2004
AP Psychology “Suggestions”: Descriptive Statistics (measures of central tendency, variability, correlation); Experimental Design (appropriate sampling, control)
Measures of Central Tendency: Mean (the true average); Median (the true middle of the sorted data); Mode (the most frequent data value)
How these measures relate: In a normal distribution, mean = median = mode. In a nonnormal distribution (one that is skewed), the mean is affected by the unusual values and is pulled toward the skew, that is, toward the long tail.
Examples
Which measure do you use? If the distribution is close to normal, the three measures are approximately equal and it makes little difference. If the distribution is skewed, the median is the best measure of center because it is “resistant” to influence from the outliers that cause the distribution to be skewed.
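The resistance of the median can be seen in a minimal sketch. The numbers below are hypothetical, chosen only for illustration: a symmetric data set, and the same set with one unusually large value added.

```python
from statistics import mean, median

# Hypothetical data for illustration only (not from the slides).
symmetric = [2, 3, 4, 5, 6, 7, 8]
skewed = symmetric + [60]  # same data plus one unusually large value

symmetric_mean = mean(symmetric)      # 5
symmetric_median = median(symmetric)  # 5
skewed_mean = mean(skewed)            # 11.875, pulled toward the outlier
skewed_median = median(skewed)        # 5.5, barely moves

print("symmetric:", symmetric_mean, symmetric_median)
print("skewed:   ", skewed_mean, skewed_median)
```

One outlier more than doubles the mean but shifts the median by only half a point.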
Variability: Variation occurs; there may be a center to the data, but each individual value will vary around that center. Range of the data (max - min). Quartiles (divide the data into quarters). Interquartile Range (Q3 – Q1). Standard Deviation: roughly the average distance that a data point lies away from the mean. As a rough estimate, figure out the distance that each data value lies from the mean, and take the average. This is not an exact calculation, but it gives a rough idea.
Practicing Calculations: 1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 16. Median (easiest): 5.5 (the average of 5 and 6). Mean: add ‘em all up (72) and divide by 12, which gives 6. Mode: 1 (occurs most frequently).
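These three answers can be checked with a short sketch using Python's standard `statistics` module. The variable names are illustrative only.

```python
from statistics import mean, median, mode

# The practice data set from the slide.
data = [1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 16]

data_median = median(data)  # average of the two middle values, 5 and 6
data_mean = mean(data)      # 72 / 12
data_mode = mode(data)      # the value that occurs most often

print(data_median, data_mean, data_mode)  # 5.5 6 1
```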
More Practice: 1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 16. Range: 16 – 1 = 15. Quartiles: the medians of the lower and upper halves, so Q1 = 2.5 and Q3 = 8.5. Interquartile range: Q3 – Q1 = 6.
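The "medians of the halves" method can be sketched as follows. It is written out by hand because library quartile functions (for example `statistics.quantiles`) may use a different interpolation convention and give slightly different values. The variable names are illustrative.

```python
from statistics import median

# The practice data set from the slide, sorted.
data = sorted([1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 16])

n = len(data)
half = n // 2
lower_half = data[:half]
# With an odd number of values, leave the middle value out of both halves.
upper_half = data[half:] if n % 2 == 0 else data[half + 1:]

data_range = max(data) - min(data)  # 16 - 1
q1 = median(lower_half)             # median of 1, 1, 2, 3, 4, 5
q3 = median(upper_half)             # median of 6, 7, 8, 9, 10, 16
iqr = q3 - q1

print(data_range, q1, q3, iqr)  # 15 2.5 8.5 6.0
```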
Standard Deviation: 1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 16. The formula for standard deviation looks complicated, but we will use a step-by-step procedure to work out this value. The mean value is 6. The differences between each data point and 6 are as follows: -5, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 10. Square those differences, add them all up, divide by 12, and then take the square root. (25+25+16+9+4+1+0+1+4+9+16+100)/12 = 210/12 = 17.5. The square root of 17.5 is about 4.18.
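The same procedure can be written as a short sketch. It divides by the number of values (12), as the slide does, which is the population standard deviation. The sample version divides by n - 1 instead and would give a slightly larger value. The variable names are illustrative.

```python
from math import sqrt
from statistics import pstdev

# The practice data set from the slide.
data = [1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 16]

mean_value = sum(data) / len(data)           # 6.0
deviations = [x - mean_value for x in data]  # -5, -5, -4, ..., 4, 10
squared = [d ** 2 for d in deviations]       # 25, 25, 16, ..., 16, 100
variance = sum(squared) / len(data)          # 210 / 12 = 17.5
std_dev = sqrt(variance)                     # about 4.18

print(round(std_dev, 2))
# Cross-check against the standard library's population standard deviation.
print(round(pstdev(data), 2))
```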
When to use Standard Deviation: Standard deviation (like the mean) is affected by unusual values, so it should not be used unless the distribution is close to normal. If the distribution is normal, standard deviations are an excellent way to see the variability within the data.
Empirical Rule
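For a normal distribution, the empirical rule says that roughly 68%, 95%, and 99.7% of values fall within 1, 2, and 3 standard deviations of the mean. The sketch below checks these percentages with `statistics.NormalDist` (Python 3.8 or later), assuming the mean of 100 and standard deviation of 15 from the intelligence test example in these slides. The variable names are illustrative.

```python
from statistics import NormalDist

# Assumed values: mean 100, standard deviation 15 (the intelligence test example).
mu, sigma = 100, 15
scores = NormalDist(mu=mu, sigma=sigma)

shares = {}
for k in (1, 2, 3):
    low, high = mu - k * sigma, mu + k * sigma
    # Share of the distribution between low and high.
    shares[k] = scores.cdf(high) - scores.cdf(low)
    print(f"within {k} SD ({low} to {high}): {shares[k]:.1%}")
```

So about 68% of scores lie between 85 and 115, about 95% between 70 and 130, and about 99.7% between 55 and 145.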
How to interpret and use standard deviations: A result more than three standard deviations away from the mean is unusual, but not impossible. If you see a result like this, we take it to mean that something unusual is occurring. We sometimes use this as evidence against the current mean. We sometimes use a p-value to indicate how unusual the result is. Something is considered unusual if the p-value is small (p < .05). Look back at the empirical rule slide…
Z-scores: Measure the number of standard deviations that a data point lies from the mean. If z is negative, the data point is below the mean. If z is positive, the data point is above the mean. z = (data point – mean) / st. dev. If z is more than 2.5 in either direction (above 2.5 or below -2.5), the data point is considered unusual in most settings.
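A minimal sketch of the z-score formula follows. The helper names `z_score` and `is_unusual` are hypothetical, and the example scores assume the mean of 100 and standard deviation of 15 from the intelligence test example in these slides.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / std_dev


def is_unusual(z, cutoff=2.5):
    """Apply the slide's rule of thumb: unusual if more than 2.5 SDs away."""
    return abs(z) > cutoff


# Assumed distribution: mean 100, standard deviation 15.
z_high = z_score(145, 100, 15)  # 3.0, above the mean
z_low = z_score(85, 100, 15)    # -1.0, below the mean

print(z_high, is_unusual(z_high))  # 3.0 True
print(z_low, is_unusual(z_low))    # -1.0 False
```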
AP Psych Exam 2003: Statistics are often used to describe and interpret the results of intelligence testing. Describe the three measures of central tendency. Describe a skewed distribution. Relate the three measures of central tendency to a normal distribution. Relate the three measures of central tendency to a positively skewed distribution.
An intelligence test for which the scores are normally distributed has a mean of 100 and a standard deviation of 15. Use this information to describe how the scores are distributed. In two normal distributions, the means are 100 for group I and 115 for group II. Can an individual in group I have a higher score than the mean score for group II? Explain.
Correlation between two quantitative variables: Correlation is not causation. Correlation measures the strength and direction of a linear relationship. Correlation coefficient: -1 ≤ r ≤ 1. An r near -1 shows a very strong negative relationship. An r near 1 shows a very strong positive relationship. An r of 0 shows no linear correlation between the variables.
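The correlation coefficient can be sketched with the standard Pearson formula. The helper name `pearson_r` and all three data sets below are hypothetical, chosen only to show the extremes r = 1, r = -1, and r = 0.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data for illustration only.
r_positive = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])  # perfect positive line
r_negative = pearson_r([1, 2, 3, 4, 5], [10, 8, 6, 4, 2])  # perfect negative line
r_curved = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])   # U-shaped pattern

print(r_positive, r_negative, r_curved)
```

The U-shaped data gives r = 0 even though y depends entirely on x, which is why r = 0 only rules out a linear relationship.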
Correlation between categorical variables Difficult to assess Comparing percentages can be misleading Bar graphs can be misleading