Welcome to Week 04 Tues MAT135 Statistics http://media.dcnews.ro/image/201109/w670/statistics.jpg
Review
Descriptive Statistics Descriptive statistics – describe our sample – we’ll use this to make inferences about the population
Descriptive Statistics: graphs; n, max, min; each observation; frequencies; mean, median, mode; range, variance, standard deviation, quartiles, IQR
Statistics vs Parameters (statistic vs. parameter): n vs. N, x̄ vs. μ, s² vs. σ², s vs. σ
Questions?
Exploring Data We are using the descriptive statistics to summarize our sample (and, hopefully, our population) in just a few numbers
Exploring Data The “five-number summary” is: the min, Q1, the median, Q3, and the max
Exploring Data We know how to get all of these using our calculators!
Boxplots There is a graph statisticians use to show this summary: the box plot
Boxplots The boxplot (a.k.a. box and whisker diagram) is a standardized way of displaying the distribution of data based on the five number summary: minimum, first quartile, median, third quartile, and maximum
Boxplots
BOXPLOTS IN-CLASS PROBLEM Daily high temperatures Feb 2008 for Fairbanks, Alaska: 14, 12, 17, 25, 10, -1, -8, -15, -7, 0, 5, 14, 18, 14, 16, 8, -15, -13, -17, -12, 0, 1, 9, 12, 14, 7, 6, 8 Create a Boxplot
BOXPLOTS IN-CLASS PROBLEM 1 What do we need for a Boxplot?
BOXPLOTS IN-CLASS PROBLEM 2 Daily high temperatures Feb 2008 for Fairbanks, Alaska: 14, 12, 17, 25, 10, -1, -8, -15, -7, 0, 5, 14, 18, 14, 16, 8, -15, -13, -17, -12, 0, 1, 9, 12, 14, 7, 6, 8 Find the 5-number summary
BOXPLOTS IN-CLASS PROBLEM 2 Min = Q1 = Median = Q3 = Max =
BOXPLOTS IN-CLASS PROBLEM 2 Min = -17 Q1 = -4 Median = 7.5 Q3 = 14 Max = 25 Notice they’re all in order at the bottom of your list! YAY!
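If you want to check your calculator, here is a minimal Python sketch of the same five-number summary. It assumes the quartile rule that reproduces the values above: Q1 and Q3 are the medians of the lower and upper halves of the sorted data (the middle value is left out of both halves when n is odd). Other software may use a different quartile rule and give slightly different Q1 and Q3. The function name five_number_summary is made up for this example.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    if len(data) < 2:
        raise ValueError("need at least two observations")
    values = sorted(data)
    half = len(values) // 2
    lower = values[:half]    # lower half (middle value excluded when n is odd)
    upper = values[-half:]   # upper half (middle value excluded when n is odd)
    return (values[0], median(lower), median(values), median(upper), values[-1])

# Daily high temperatures, Feb 2008, Fairbanks, Alaska (from the slide)
fairbanks = [14, 12, 17, 25, 10, -1, -8, -15, -7, 0, 5, 14, 18, 14,
             16, 8, -15, -13, -17, -12, 0, 1, 9, 12, 14, 7, 6, 8]

print(five_number_summary(fairbanks))  # (-17, -4.0, 7.5, 14.0, 25)
```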
BOXPLOTS IN-CLASS PROBLEM 3 Min = -17 Q1 = -4 Median = 7.5 Q3 = 14 Max = 25 Now for the box! (number line from -24 to 24, marked every 4)
BOXPLOTS IN-CLASS PROBLEM 3 Min!
BOXPLOTS IN-CLASS PROBLEM 3 Q1!
BOXPLOTS IN-CLASS PROBLEM 3 Median!
BOXPLOTS IN-CLASS PROBLEM 3 Q3!
BOXPLOTS IN-CLASS PROBLEM 3 Max!
BOXPLOTS IN-CLASS PROBLEM 3 Box!
BOXPLOTS IN-CLASS PROBLEM 3 Whiskers!
Questions?
Outliers Because the min and max may be outliers, a variation on the boxplot includes “fences” to show where most of the data occurs
Outliers Lower fence: Q1 - 1.5 * IQR Upper fence: Q3 + 1.5 * IQR
OUTLIERS IN-CLASS PROBLEM 4 Min = -17 Q1 = -4 Median = 7.5 Q3 = 14 Max = 25 What is the IQR? (number line from -24 to 24, marked every 4)
OUTLIERS IN-CLASS PROBLEM 4 IQR = 14 - (-4) = 18 What is the lower fence?
OUTLIERS IN-CLASS PROBLEM 5 Lower fence = Q1 - 1.5 * IQR = -4 - 1.5(18) = -31
OUTLIERS IN-CLASS PROBLEM 5 Lower fence = -31 What is the upper fence?
OUTLIERS IN-CLASS PROBLEM 6 Upper fence = Q3 + 1.5 * IQR = 14 + 1.5(18) = 41
OUTLIERS IN-CLASS PROBLEM 6 Lower fence = -31 Upper fence = 41 So, do we have any outliers?
OUTLIERS IN-CLASS PROBLEM 7 Max and Min are inside the fence!
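The fence arithmetic can be sketched in a few lines of Python. This is only an illustration: the function names fences and outliers are invented here, and Q1 and Q3 are typed in from the five-number summary rather than recomputed.

```python
def fences(q1, q3, k=1.5):
    """Return (lower fence, upper fence) = (Q1 - k*IQR, Q3 + k*IQR)."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def outliers(data, q1, q3):
    """Return the observations that fall outside the fences."""
    low, high = fences(q1, q3)
    return [x for x in data if x < low or x > high]

# Daily high temperatures, Feb 2008, Fairbanks, Alaska (from the slide)
fairbanks = [14, 12, 17, 25, 10, -1, -8, -15, -7, 0, 5, 14, 18, 14,
             16, 8, -15, -13, -17, -12, 0, 1, 9, 12, 14, 7, 6, 8]

print(fences(q1=-4, q3=14))            # (-31.0, 41.0)
print(outliers(fairbanks, -4, 14))     # [] because -17 and 25 are inside the fences
```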
Outliers How outliers are shown in a boxplot
Types of Boxplots
Questions?
Boxplots Boxplots are typically used to compare different groups
Boxplots Data Summary Table from a Ball-bouncing Experiment
Ball type: Super Ball, Wiffle, Golf, Splash, SpongyBall
Minimum: 66, 38, 70, 7, 44
Q1: 71, 45, 75, 14, 58
Median: 76, 48, 78, 16.5, 60
Q3: 50, 80, 23, 62
Maximum: 91, 90, 28, 67
Boxplots
Boxplots
BOXPLOTS IN-CLASS PROBLEM 8 What differences do you see?
Boxplots Unfortunately it is almost impossible to get a true boxplot using Excel (there are several YouTube videos showing how to get one… but they are all wrong…)
Questions?
Exploring Data There actually IS a useful graph you can get out of Excel that includes both an average and a measure of dispersion
Exploring Data I use the Hi/Low/Close graph
Exploring Data
BOXPLOTS IN-CLASS PROBLEM 9 What does this graph show?
BOXPLOTS IN-CLASS PROBLEM 10 What does this graph show?
Questions?
Normal Probability The most popular continuous distribution in statistics is the NORMAL DISTRIBUTION
Empirical Rule Two descriptive statistics completely define the shape of a normal distribution: the mean µ and the standard deviation σ
Empirical Rule Suppose we have a normal distribution with µ = 12 and σ = 2
Empirical Rule If µ = 12, the curve is centered at 12
Empirical Rule If µ = 12 and σ = 2, the axis is marked at 6, 8, 10, 12, 14, 16, 18 (from µ - 3σ to µ + 3σ)
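The axis labels are just the mean plus or minus whole numbers of standard deviations. A tiny Python sketch, using the µ = 12 and σ = 2 from the slide (the variable names are arbitrary):

```python
mu, sigma = 12, 2

# Tick marks at mu - 3*sigma, ..., mu, ..., mu + 3*sigma
ticks = [mu + k * sigma for k in range(-3, 4)]
print(ticks)  # [6, 8, 10, 12, 14, 16, 18]
```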
Empirical Rule More sneaky stuff about the normal distribution: about 68% of the data is within 1 SD of the mean, about 95% is within 2 SD, and about 99.7% is within 3 SD
Empirical Rule So now you can calculate even more percentages!
EMPIRICAL RULE IN-CLASS PROBLEM 11 What % of the data is between the mean and +1 SD?
EMPIRICAL RULE IN-CLASS PROBLEM 12 What % is between the mean and -1 SD?
EMPIRICAL RULE IN-CLASS PROBLEM 13 What % of the data is between +1 SD and +2 SD?
EMPIRICAL RULE IN-CLASS PROBLEM 14 What % is between -1 SD and -2 SD?
EMPIRICAL RULE IN-CLASS PROBLEM 15 What % of the data is between +2 SD and +3 SD?
EMPIRICAL RULE IN-CLASS PROBLEM 16 What % is between -2 SD and -3 SD?
EMPIRICAL RULE IN-CLASS PROBLEM 17 What % of the data is above +3 SD?
EMPIRICAL RULE IN-CLASS PROBLEM 18 What % of the data is below -3 SD?
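One way to check your answers to Problems 11 to 18 is the short Python sketch below. It assumes the usual empirical-rule figures (about 68%, 95%, and 99.7% of the data within 1, 2, and 3 SD of the mean), which are approximations, and it relies on the normal curve being symmetric, so each band on the minus side matches the band on the plus side. The names within, band, and tail are invented for this example.

```python
# Percent of data within 1, 2, and 3 SD of the mean (empirical rule, approximate)
within = {1: 68.0, 2: 95.0, 3: 99.7}

def band(k):
    """Percent of data between k-1 and k SD on ONE side of the mean."""
    inner = within[k - 1] if k > 1 else 0.0
    return (within[k] - inner) / 2

def tail():
    """Percent of data beyond 3 SD on ONE side of the mean."""
    return (100.0 - within[3]) / 2

for k in (1, 2, 3):
    print(f"{k - 1} to {k} SD from the mean: {band(k):.2f}%")
print(f"beyond 3 SD: {tail():.2f}%")
```

The printed values are 34.00%, 13.50%, 2.35%, and 0.15%; the same numbers apply below the mean.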
Questions?
z-scores For the standard normal distribution, µ = 0 and σ = 1 (axis marked from -3 to 3)
z-scores The standard normal is also called “z”
z-scores z = (x - µ)/σ
EMPIRICAL RULE IN-CLASS PROBLEM 19 A dataset has a normal distribution with μ = 45 and σ = 13. Find the z-score for a value of 65:
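A quick Python check of this kind of problem, applying z = (x - µ)/σ directly. The function name z_score is just a label chosen for this sketch.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

z = z_score(65, mu=45, sigma=13)
print(round(z, 2))  # 1.54
```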
z-scores With a bit of algebra, we can use z = (x - µ)/σ to solve for x given a z-score
z-scores z = (x - µ)/σ x =
EMPIRICAL RULE IN-CLASS PROBLEM 20 A data point from a normal distribution with μ = 45 and σ = 13 has a z-score of 2.3. What is the data value?
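Going the other way takes two algebra steps: multiply both sides of z = (x - µ)/σ by σ, then add µ, which gives x = µ + zσ. A minimal Python sketch of that inverse follows; the function name value_from_z is made up for illustration.

```python
def value_from_z(z, mu, sigma):
    """Invert z = (x - mu) / sigma to recover the data value x."""
    return mu + z * sigma

x = value_from_z(2.3, mu=45, sigma=13)
print(round(x, 1))  # 74.9
```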
In-class Project Turn in your homework! Don’t forget your homework due next class! See you Thursday!