Fear Free Stats! A quick introduction, discussion and conclusion of what you need to know about Statistics to be successful on the AP Psychology Exam. This material was originally taken and modified from a TOPSS unit lesson plan. Join TOPSS today!
Intro to STATS Statistics (Stats) can be used as a tool to help demystify research data. Examples: election polls, market research, exercise regimes, surveys, etc.
Definition of Statistics A means of organizing and analyzing data (numbers) systematically so that they have meaning.
Types Descriptive Stats- Organize data so that we can communicate about that data Inferential Stats- Answers the question, “What can we infer about the population from data gathered from the sample?” Generalizability
Measurement Scales Nominal Scale Ordinal Scale Interval Scale Ratio Scale
Looking at data in a meaningful way Frequency distribution- an organized list that enables us to see clusters or patterns in data. Example: 91 92 87 99 83 84 82 93 89 91 85 94 91 98 90
Score  Frequency
99     1
98     1
97     0
96     0
95     0
94     1
93     1
92     1
91     3
90     1
89     1
88     0
87     1
86     0
85     1
84     1
83     1
82     1
N=15
Grouped Frequency of same scores
Interval  Frequency
95-99     2
90-94     7
85-89     3
80-84     3
N=15
The width of the intervals in grouped frequency tables must be equal. There should be no overlap.
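The two tables above can be rebuilt with a short Python sketch. The function name grouped_frequency and its parameters are illustrative choices, not part of the original lesson; the scores are the fifteen listed in the example.

```python
from collections import Counter

scores = [91, 92, 87, 99, 83, 84, 82, 93, 89, 91, 85, 94, 91, 98, 90]

# Ungrouped frequency distribution: how many times each score occurs.
frequency = Counter(scores)

def grouped_frequency(values, low, high, width):
    """Count values in equal-width, non-overlapping intervals from low to high."""
    groups = {}
    for start in range(low, high + 1, width):
        end = start + width - 1
        groups[(start, end)] = sum(1 for v in values if start <= v <= end)
    return groups

grouped = grouped_frequency(scores, low=80, high=99, width=5)

# Print the grouped table from the highest interval down, then the total N.
for (start, end), count in sorted(grouped.items(), reverse=True):
    print(f"{start}-{end}: {count}")
print("N =", sum(grouped.values()))
```

Because every interval has the same width and none overlap, each score is counted exactly once and the counts add up to N.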
The Challenger Disaster Intro 30 sec 7 mins
Misuse of Stats The decision to launch the Challenger was in part based on the correlational analysis of failure rates and temperature. You can look at the actual data available to the experts who decided to launch the shuttle and decide if you would have actually launched the shuttle.
The Data: Table A
Temp  # of failures
53    2
57    1
58    1
63    1
70    2
75    2
The Data: Table B
Temp  # of failures
53    2
57    1
63    1
66    0
67    0
68    0
69    0
70    2
72    0
73    0
75    0
76    0
79    0
81    0
Just a moment for Discussion What other factors impacted the decision of the company to allow for the launch of the Challenger? To the teacher: a brief research of the issue can be done to expand this topic. Application of critical thinking skills is an important marker for success on the AP Exam.
Moving on to Graphs These allow us to quickly summarize the data collected. At a glance we can get some level of meaning from the numbers. Examples:
Pie Charts A circle within which all of the data points or numbers are contained in the form of percentages
Bar Graphs A common method for representing nominal data where the height of the bars indicates percentage or frequency of each category
Frequency Polygons A line graph that has the same vertical and horizontal labels as the histogram. Each score’s frequency of occurrence is marked with a point on the graph, and all of the points are then connected with a line.
The Frequency Polygon Useful in showing the asymmetry in distribution of ordinal, interval and ratio data. This asymmetry is referred to as SKEW.
Positive and Negative SKEW If there is a clustering of data on the high end, then the skew is NEGATIVE, because skew is always named for the “tail” of the graph, the end with a low frequency of occurrence (here, the low scores). A POSITIVE skew is indicated by a high frequency of low-end data points with a few data points at the high end.
The Tail Tells the Tale The line of the frequency polygon “tails off” to include these low frequency ends or SKEWNESS
Line Graphs Indicate change that occurs during an experiment. Show the change in relationship between IV and DV. The DV is always on the vertical axis (Y) and the IV on the horizontal axis (X).
Graphs don’t lie But different representations will provide a different visual that can be deceptive. Dice and distribution
Descriptive Statistics Measures of central tendency- these numbers attempt to describe the “typical” or “average” score in a distribution. What are the measures of central tendency?
Mode The most frequently occurring score in a set of scores. When two different scores occur most frequently, it is referred to as a bimodal distribution. Example?
Median The score that falls in the middle when the scores are ranked in ascending or descending order. This is the best indicator of central tendency when there is a skew because the median is unaffected by extreme scores. If N is odd, the median is the middle score in the set; if N is even, the median is midway between the two middle scores.
Mean The mathematical average of a set of scores The mean is always pulled in the direction of extreme scores (pulled toward the skew) of the distribution. Examples?
Examples SAMPLE TEMPERATURES
Week One: 71 74 76 79 98
Week Two: 70 74 76 77 78
CALCULATE: MEAN OF WEEK ONE, MEAN OF WEEK TWO, MEDIAN OF WEEK ONE, MEDIAN OF WEEK TWO, MODE OF WEEK ONE, MODE OF WEEK TWO
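To check hand calculations for the sample temperatures, the three measures can be computed with Python's standard library. This is only a sketch; the helper name modes is an illustrative choice, and it returns an empty list when no score repeats.

```python
from collections import Counter
from statistics import mean, median

week_one = [71, 74, 76, 79, 98]
week_two = [70, 74, 76, 77, 78]

def modes(values):
    """Return the most frequent score(s), or an empty list if no score repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []
    return sorted(v for v, c in counts.items() if c == top)

for label, temps in (("Week One", week_one), ("Week Two", week_two)):
    print(label, "mean:", mean(temps), "median:", median(temps), "modes:", modes(temps))

# With an even N, the median falls midway between the two middle scores.
print("median of 70 74 76 77:", median([70, 74, 76, 77]))
```

The single extreme score in Week One (98) pulls the mean above the median, while the two medians stay the same. Neither week has a mode because no temperature repeats.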
MEASURE OF CENTRAL TENDENCY CAN BE MISLEADING Suppose your mother wants you to attend a family reunion on Sunday. Everyone in the family protests! Your mother attempts to separately convince each family member that it will not be so bad.
Mom’s story Mom tells your younger sister that the “average” age of the gathering is 10 years old. She tells you the “average” age is 18. She tells dad that the “average” age is 36. Now each family member feels better about spending the day at the family reunion. Did Mom lie?
The attendees
Years old  Name/relation
3          Cousin Susie
7          Cousin Sammy
10         Twin Shanda
10         Twin Wanda
15         Cousin Marty
17         Cousin Juan
18         Cousin Pat
44         Aunt Harriet
49         Uncle Stewart
58         Aunt Rose
59         Uncle Don
82         Grandma Faye
96         Great Aunt Lucille
Answer me this What is the median? What is the mode? What is the mean? Did Mom “lie”?
What is the median? 18 What is the mode? 10 What is the mean? 36 Did Mom “lie”? Not really. . .
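Mom's three "averages" can be verified with a few lines of Python. The sketch assumes the twins, Shanda and Wanda, are both 10, which is the only reading of the attendee list that gives thirteen ages and matches the answers above.

```python
from statistics import mean, median, mode

# Thirteen attendees; the twins are assumed to share the age 10.
ages = [3, 7, 10, 10, 15, 17, 18, 44, 49, 58, 59, 82, 96]

print("mode:", mode(ages))      # the "average" Mom gave your younger sister
print("median:", median(ages))  # the "average" Mom gave you
print("mean:", mean(ages))      # the "average" Mom gave Dad
```

Each number is a legitimate measure of central tendency, which is why Mom did not really lie: she simply chose a different measure for each listener.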
Measures of Variability Measures of variability indicate how much spread or variability there is in a distribution. If you collected the ages of all students in the 11th grade, there would be little variability. If you collected the shoe sizes of all students in the 11th grade, there would be greater variability.
Range The range is the difference between the lowest and highest score in the data set. The range of scores can be significantly increased with a single outlying score.
EXAMPLE
Class One: 94, 92, 85, 81, 80, 73, 62 Range = 32
Class Two: 85, 83, 82, 81, 80, 79, 77 Range = 8
Variance (SD²) This is a measure of how different the scores are from each other. The difference between the scores is measured by the distance of each score from the mean of all the scores. FORMULA: Variance = Standard Deviation squared (SD²)
Standard Deviation This measure of variability is also based on how different scores are from each other. There are computer programs and calculators that compute it. FORMULA: The Standard Deviation is the square root of the variance.
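As a rough illustration, the sketch below computes the range, variance and standard deviation for the Class One and Class Two scores from the range example. It assumes the population formulas (dividing by N), since the slides do not say whether the sample version (dividing by N - 1) is intended; the function name describe is an illustrative choice.

```python
from statistics import mean, pstdev, pvariance

class_one = [94, 92, 85, 81, 80, 73, 62]
class_two = [85, 83, 82, 81, 80, 79, 77]

def describe(scores):
    """Summarize a set of scores with its mean and three measures of variability."""
    return {
        "mean": mean(scores),
        "range": max(scores) - min(scores),
        # Population variance: average squared distance of each score from the mean.
        "variance": pvariance(scores),
        # Standard deviation: square root of the variance.
        "sd": pstdev(scores),
    }

for label, scores in (("Class One", class_one), ("Class Two", class_two)):
    print(label, describe(scores))
```

Both classes have the same mean, so only the measures of variability show that Class One's scores are far more spread out than Class Two's.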
Normal Distribution The normal curve is a theoretical or hypothetical frequency curve. Most frequency curves are not symmetrical (remember skew). Normal distribution is displayed on a graph with a “bell” shaped curve.
Bell Curve
The percentages (%) under the normal curve must be memorized.
Figure 1.10 The normal curve Myers: Psychology, Ninth Edition Copyright © 2010 by Worth Publishers
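For reference, the share of scores falling within 1, 2 and 3 standard deviations of the mean on a normal curve (roughly 68%, 95% and 99.7%) can be computed rather than looked up. This sketch uses the standard error-function identity for the normal distribution; the function name share_within is an illustrative choice, and the figure in your textbook may round the percentages slightly differently.

```python
from math import erf, sqrt

def share_within(k):
    """Proportion of a normal distribution lying within k standard deviations of the mean."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {share_within(k):.1%}")
```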
Correlations Correlation describes the relationship between two variables How is studying related to grades? How is playing video games related to grades?
Positive Correlation Indicates a direct relationship between variables Variables move in the same direction An increase of one variable is accompanied by an increase in another variable A decrease in one variable is accompanied by a decrease in another variable Example
Negative Correlation Indicates an inverse relationship between variables An increase in one variable is accompanied by a decrease in another variable, or vice versa.
Correlation coefficients Correlations are measured with numbers ranging from -1.0 to +1.0. These numbers are called correlation coefficients.
As the correlation coefficient moves closer to +1.0, it shows an increasingly strong positive correlation. As the correlation coefficient moves closer to -1.0, it shows an increasingly strong negative correlation. A zero indicates that no correlation exists between the variables. +1.0 and -1.0 indicate a perfect correlation.
Which is a stronger correlation? The absolute value of the number indicates the strength of the correlation.
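A quick way to practice this rule is to compare coefficients by absolute value. The sketch below uses two made-up coefficients, -0.80 and +0.60, purely for illustration; the function name stronger is hypothetical.

```python
def stronger(r1, r2):
    """Return whichever coefficient shows the stronger relationship (larger absolute value)."""
    for r in (r1, r2):
        if not -1.0 <= r <= 1.0:
            raise ValueError("correlation coefficients range from -1.0 to +1.0")
    return r1 if abs(r1) >= abs(r2) else r2

# The sign gives the direction of the relationship; the absolute value gives its strength.
print(stronger(-0.80, 0.60))
```

Here -0.80 is the stronger correlation even though it is negative, because 0.80 is larger than 0.60.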
BUT. . . Correlation does not imply causation!
Correlational Studies An often-used research design. There may be no IV and DV; the two variables may simply be called variable one and variable two. Examples?
Scatter plots A visual representation of correlations The x variable is on the horizontal axis and the y variable is on the vertical axis
Back to the Challenger Disaster Plot the data from Table A and from Table B to create a scatterplot of each.
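Alongside the hand-drawn scatterplots, the correlation coefficient for each table can be computed. The sketch below uses the standard Pearson formula and the temperature and failure values exactly as they appear in Table A and Table B of these slides; the function name pearson_r is an illustrative choice.

```python
from math import sqrt

# (temperature, number of failures) pairs as given in the slide tables.
table_a = [(53, 2), (57, 1), (58, 1), (63, 1), (70, 2), (75, 2)]
table_b = [(53, 2), (57, 1), (63, 1), (66, 0), (67, 0), (68, 0), (69, 0),
           (70, 2), (72, 0), (73, 0), (75, 0), (76, 0), (79, 0), (81, 0)]

def pearson_r(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sum_xx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sum_yy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sum_xy / sqrt(sum_xx * sum_yy)

r_a = pearson_r(table_a)
r_b = pearson_r(table_b)
print(f"Table A: r = {r_a:.2f}")
print(f"Table B: r = {r_b:.2f}")
```

With only the launches that had failures (Table A), the coefficient is positive and fairly weak. Once the launches with zero failures are included (Table B), the coefficient turns negative: colder temperatures go with more failures.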
Inferential Statistics Help us determine if one variable has an effect on another variable. They help us determine if the difference between variables is significant enough to infer (for credit on an AP Exam, you cannot use the term to define the term) that the difference was due to the variables, rather than chance.
Making Inferences: When is an Observed Difference Reliable? Representative samples are better than biased samples. Less variable observations are more reliable than more variable ones. More cases are better than fewer cases. OBJECTIVE 19| Identify three principles for making generalizations from samples.
Making Inferences: When is a Difference Significant? When sample averages are reliable and the difference between them is relatively large, we say the difference has statistical significance. Psychologists typically judge this against an alpha level set at 5 percent. OBJECTIVE 20| Explain how psychologists decide whether differences are meaningful.
Statistical Significance Are the results of research strong enough to indicate a relationship (correlation)? Would you publish the results? An arbitrary criterion has been established as .05 (5%). Researchers commonly use two inferential tests to measure significance: the t-test and ANOVA.
Type I and Type II Errors Statistical tests make use of data from samples. These results are then generalized to the general population. How can we know that the generalization reflects the correct conclusion? Ironically, the possibility of a research error is what makes the research scientific in the first place. If a hypothesis cannot be falsified (e.g. the hypothesis has circular logic), it is not testable, and thus not scientific, by definition.
Type I and Type II Errors Type I Error (false positive): Deciding that one variable has an effect on (or a relationship to) another variable when it doesn’t. Incorrectly rejecting the null hypothesis and accepting the hypothesis. The p-level (alpha) gives us the odds of making this kind of error.
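A small simulation can make the link between the .05 criterion and Type I errors concrete. In the sketch below, both groups are always drawn from the same population, so any "significant" difference is a false positive. Everything here is an assumption for illustration: the population mean of 100, the standard deviation of 15, the sample size of 30, and the use of a simple z-test with a known standard deviation in place of the t-test named in these slides.

```python
import random
from statistics import NormalDist, mean

ALPHA = 0.05
POP_MEAN, POP_SD, N = 100, 15, 30  # hypothetical population and sample size

def two_tailed_p(sample_a, sample_b):
    """z-test p-value for a difference in means, treating POP_SD as known."""
    standard_error = POP_SD * (2 / N) ** 0.5
    z = (mean(sample_a) - mean(sample_b)) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

def type_one_rate(trials=2000, seed=0):
    """Share of trials that look significant even though no real effect exists."""
    rng = random.Random(seed)
    false_positives = 0
    for _ in range(trials):
        # Both samples come from the same population, so the null hypothesis is true.
        a = [rng.gauss(POP_MEAN, POP_SD) for _ in range(N)]
        b = [rng.gauss(POP_MEAN, POP_SD) for _ in range(N)]
        if two_tailed_p(a, b) < ALPHA:
            false_positives += 1
    return false_positives / trials

rate = type_one_rate()
print(f"false positive rate: {rate:.3f}")
```

The printed rate should land close to the alpha level, which is the sense in which the criterion sets the odds of a Type I error when there is truly no effect.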
Prediction and Type I and Type II Errors
                                                  Variable has an effect          Variable does not have an effect
Deciding that a variable has an effect            Correct (positive)              Type I Error (false positive)
Deciding that a variable does not have an effect  Type II Error (false negative)  Correct (negative)
Type II Error (false negative): Deciding that one variable does not have an effect on (or a relationship to) another variable when it does. Incorrectly accepting the null hypothesis and rejecting the hypothesis. There is no easy way to estimate the odds of this kind of error.
Replication This is one of the reasons experiments must be replicated. This is how science regulates, and minimizes, the potential for Type I and Type II errors. Replication is often not possible in medical diagnosis, so Type I and II errors are a factor. In the legal system, fingerprinting and DNA have a possibility of false positives, leading to false convictions.
Are you free of fear? Statistics is an important aspect of research design in psychology. In college you will take an entire course in the Statistics of psychology. If you have a grasp of what was presented today, you will be successful on the AP Exam.
Fun with STATS Dice and distribution M and M sampler