Outline
1. Histograms and boxplots
2. Mean and standard deviation
3. Proportions and bar charts
4. Sampling and allocation
5. Inference and confidence intervals
6. t tests and alternatives
7. ANOVA
8. Regression and correlation
9. More ANOVA and regression
10. Categorical data analysis
Histograms and Boxplots
Learning outcomes
Statisticalese
Making histograms
 - deciding type and bin width
 - the macro/micro distinction in graphing
Making boxplots
 - ranking and ordering data
 - learning the 5-point summary
Statisticalese
I will probably have a bagel today. In Statisticalese: the probability of having a bagel is > 50%.
It takes about 20 minutes to cook rice. In Statisticalese: the central tendency (more on what this means throughout the course) for cooking rice is 20 minutes.
Statisticalese takes English phrases that include numerical information and uncertainty and translates them (often making them more precise).
Today's data set: DNA exonerations
Hundreds of people who were found guilty of crimes, spent time in prison, and were later exonerated by DNA evidence.
caseno_i  firstn_i  lastn_i    state_i      year1_i  year2_i  time_i
1         Gary      Dotson     Illinois
          David     Vasquez    Virginia
          Edward    Green      DC
:         :         :          :            :        :        :
162       Leo       Waters     N. Carolina
          George    Rodriquez  Texas

This is what a data file looks like in most statistics packages.
The focus is on the time_i variable, the years in prison. The subscript i shows that the values vary from case to case.
Frequency Table
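A frequency table counts how often each value occurs. The sketch below is only an illustration: the `time_years` values are invented and are not taken from the exoneration data, and the function name `frequency_table` is hypothetical.

```python
from collections import Counter

# Hypothetical years-in-prison values, invented for illustration only.
# They are NOT the values from the DNA exoneration data set.
time_years = [3, 5, 5, 8, 10, 10, 10, 12, 15, 19]


def frequency_table(values):
    """Return (value, frequency) pairs, sorted by value."""
    counts = Counter(values)
    return sorted(counts.items())


table = frequency_table(time_years)

print("Years  Frequency")
for value, count in table:
    print(f"{value:>5}  {count:>9}")
```

The frequencies should add up to the number of cases, which is a quick check that no one was dropped.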
Histogram: With dots Years in prison Frequency
Stem and leaf diagram
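A stem and leaf diagram splits each value into a stem (here the tens digit) and a leaf (the units digit). A minimal sketch, assuming non-negative whole numbers; the data are invented for illustration and the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

# Hypothetical years-in-prison values, invented for illustration only.
time_years = [3, 5, 5, 8, 10, 10, 10, 12, 15, 19]


def stem_and_leaf(values):
    """Group sorted values by tens digit (stem); keep units digits (leaves)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return dict(stems)


diagram = stem_and_leaf(time_years)

for stem, leaves in sorted(diagram.items()):
    # Each row shows one stem followed by all of its leaves.
    print(f"{stem} | {''.join(str(leaf) for leaf in leaves)}")
```

Turned on its side, each row of leaves looks like a histogram bar with a bin width of 10.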
Deciding bin width
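The same data can look very different with narrow bins and with wide bins. The sketch below counts values per bin for two widths; the data are invented for illustration, and `bin_counts` and its `start` parameter are hypothetical names.

```python
from collections import Counter

# Hypothetical years-in-prison values, invented for illustration only.
time_years = [3, 5, 5, 8, 10, 10, 10, 12, 15, 19]


def bin_counts(values, width, start=0):
    """Count values in bins [left, left + width), keyed by the left edge.

    Empty bins are left out of the result.
    """
    counts = Counter((v - start) // width for v in values)
    return {start + k * width: counts[k] for k in sorted(counts)}


narrow = bin_counts(time_years, width=2)   # many bins: shows fine detail
wide = bin_counts(time_years, width=10)    # few bins: shows overall shape

for left, n in narrow.items():
    print(f"{left:>2}-{left + 2:<3}{'*' * n}")
```

Narrow bins show detail but can look ragged; wide bins are smoother but can hide features such as a cluster at one value.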
Name histogram
5 point summary
[Figure: the values, sorted and ranked, with arrows marking the minimum, first quartile, median, third quartile, and maximum]
Median when n is even: the mid-rank
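The median sits at rank (n + 1) / 2 of the sorted values. When n is even this mid-rank falls between two values, and the median is their average. The sketch below uses that rule and then builds a 5 point summary. Textbooks and software differ on how to compute quartiles; taking the median of each half of the sorted data is an assumption made here, not necessarily the course's rule. The data and function names are invented for illustration.

```python
# Hypothetical values, invented for illustration only (n = 8, which is even).
values = [9, 2, 14, 5, 18, 4, 11, 7]


def median(sorted_values):
    """Median by mid-rank: average the values on either side of rank (n + 1) / 2."""
    n = len(sorted_values)
    mid_rank = (n + 1) / 2                        # n = 8 gives rank 4.5
    lower = sorted_values[int(mid_rank) - 1]      # value at the rank just below (or at) the mid-rank
    upper = sorted_values[int(mid_rank + 0.5) - 1]  # value at the rank just above (or at) the mid-rank
    return (lower + upper) / 2


def five_point_summary(data):
    """Return (minimum, first quartile, median, third quartile, maximum).

    Assumption: quartiles are the medians of the lower and upper halves,
    leaving out the middle value when n is odd. Needs at least 2 values.
    """
    s = sorted(data)
    half = len(s) // 2
    return (s[0], median(s[:half]), median(s), median(s[len(s) - half:]), s[-1])


summary = five_point_summary(values)
print(summary)
```

With these 8 values the mid-rank is 4.5, so the median is the average of the 4th and 5th sorted values, 7 and 9.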
Boxplots (Box and Whiskers)
Comparing histograms and boxplots
Summary
Statisticalese. A language for numbers and chance.
Histograms. Decide bin width.
Boxplot. Shows outliers well.
Graphs. Make clear. Avoid adding frills.