Statistics Fractiles https://www.123rf.com/photo_6622261_statistics-and-analysis-of-data-as-background.html
Cumulative Frequencies An ogive typically forms an “s” shape http://everythingmaths.co.za/maths/grade-11/11-statistics/tikzpictures/4d8229e8dd2b71fe85520a97acdab63a.png
Questions?
Fractiles Another way of describing frequency data; a measure of position; based on the ogive (cumulative frequency) or ordered data
Fractiles How to do it: (1) find n; (2) order the data; (3) divide the data into the number of pieces you want, each with an equal number of members
Fractiles quartile - four pieces; percentile - 100 pieces
FRACTILES IN-CLASS PROBLEM 17 88 33 30 11 41 46 62 5 78 31 54 Step 1: Find n!
FRACTILES IN-CLASS PROBLEM 17 88 33 30 11 41 46 62 5 78 31 54 n = 12 What’s next?
FRACTILES IN-CLASS PROBLEM 5 11 17 30 31 33 41 46 54 62 78 88 Order the data! What if you split it into equal halves? How many observations would be in each half?
FRACTILES IN-CLASS PROBLEM 5 11 17 30 31 33 41 46 54 62 78 88 Poof! 6 observations in each half! The split point is the 50th percentile or the “median”
FRACTILES IN-CLASS PROBLEM 5 11 17 30 31 33 41 46 54 62 78 88 The 50th percentile or the “median” = (33+41)/2 = 37
FRACTILES IN-CLASS PROBLEM 5 11 17 30 31 33 41 46 54 62 78 88 What if you wanted quartiles? How many observations would be in each quartile? Where would the splits be?
FRACTILES IN-CLASS PROBLEM 5 11 17 30 31 33 41 46 54 62 78 88 Poof! 3 observations in each quartile!
FRACTILES IN-CLASS PROBLEM 5 11 17 30 31 33 41 46 54 62 78 88 1st quartile = (30+17)/2 = 23.5; 3rd quartile = (62+54)/2 = 58
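The splitting procedure from the in-class problem can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `fractile_cuts` is hypothetical, the sketch requires n to be a multiple of the number of pieces, and it places each split halfway between the two neighboring observations, as the slides do for the median and the quartiles.

```python
def fractile_cuts(data, pieces):
    """Split points that divide the ordered data into equal-sized pieces."""
    ordered = sorted(data)          # order the data
    n = len(ordered)                # find n
    if n % pieces != 0:
        # Assumption of this sketch: every piece has the same number of members.
        raise ValueError("n must be a multiple of the number of pieces")
    size = n // pieces
    # Each cut sits halfway between the last value of one piece
    # and the first value of the next piece.
    return [(ordered[i * size - 1] + ordered[i * size]) / 2
            for i in range(1, pieces)]


data = [17, 88, 33, 30, 11, 41, 46, 62, 5, 78, 31, 54]
median = fractile_cuts(data, 2)
quartiles = fractile_cuts(data, 4)
print(median)     # [37.0]
print(quartiles)  # [23.5, 37.0, 58.0]
```

With 2 pieces the single cut is the median; with 4 pieces the three cuts are the 1st quartile, the median, and the 3rd quartile.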
Fractiles Quartiles and percentiles are common; others not so much. The median is also common, but it is called “the median” rather than “the 50th percentile” or “2nd quartile”
Questions?
Variability Another measure of variability:
Variability Interquartile range (IQR): IQR = 3rd quartile – 1st quartile
Variability The interquartile range is in the same units as the original data (like the range and standard deviation “s”)
FRACTILES IN-CLASS PROBLEM 5 11 17 30 31 33 41 46 54 62 78 88 What is the IQR for our data?
FRACTILES IN-CLASS PROBLEM 5 11 17 30 31 33 41 46 54 62 78 88 1st quartile = (30+17)/2 = 23.5; 3rd quartile = (62+54)/2 = 58 So the IQR is…
FRACTILES IN-CLASS PROBLEM 5 11 17 30 31 33 41 46 54 62 78 88 1st quartile = (30+17)/2 = 23.5; 3rd quartile = (62+54)/2 = 58 IQR = 58 - 23.5 = 34.5
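As a rough check of the IQR arithmetic, here is a minimal Python sketch. It assumes the quartiles are taken as the medians of the lower and upper halves of the ordered data, which reproduces the values on the slides; the function name `iqr` is hypothetical.

```python
from statistics import median


def iqr(data):
    """Interquartile range, with quartiles taken as medians of the two halves."""
    ordered = sorted(data)
    n = len(ordered)
    lower_half = ordered[: n // 2]        # observations below the median
    upper_half = ordered[(n + 1) // 2:]   # observations above the median
    q1 = median(lower_half)
    q3 = median(upper_half)
    return q3 - q1


data = [17, 88, 33, 30, 11, 41, 46, 62, 5, 78, 31, 54]
result = iqr(data)
print(result)  # 34.5
```

Library routines such as Python's `statistics.quantiles` interpolate between observations, so they may report slightly different quartiles, and hence a different IQR, for the same data.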
Questions?
Exploring Data We are using the descriptive statistics to summarize our sample (and, hopefully, our population) in just a few numbers
Exploring Data The “five-number summary” is: the min, Q1, the median, Q3, the max
Boxplots There is a graph statisticians use to show this summary: the box plot
Boxplots The boxplot (a.k.a. box and whisker diagram) is a standardized way of displaying the distribution of data based on the five number summary: minimum, first quartile, median, third quartile, and maximum
Boxplots
BOXPLOTS IN-CLASS PROBLEM Daily high temperatures Feb 2008 for Fairbanks, Alaska: 14, 12, 17, 25, 10, -1, -8, -15, -7, 0, 5, 14, 18, 14, 16, 8, -15, -13, -17, -12, 0, 1, 9, 12, 14, 7, 6, 8 Create a Boxplot
BOXPLOTS IN-CLASS PROBLEM What do we need for a Boxplot?
BOXPLOTS IN-CLASS PROBLEM Daily high temperatures Feb 2008 for Fairbanks, Alaska: 14, 12, 17, 25, 10, -1, -8, -15, -7, 0, 5, 14, 18, 14, 16, 8, -15, -13, -17, -12, 0, 1, 9, 12, 14, 7, 6, 8 Find the 5-number summary
BOXPLOTS IN-CLASS PROBLEM Min = Q1 = Median = Q3 = Max =
BOXPLOTS IN-CLASS PROBLEM Min = -17 Q1 = -4 Median = 7.5 Q3 = 14 Max = 25 Notice they’re all in order at the bottom of your list! YAY!
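The five-number summary for the Fairbanks temperatures can be reproduced with a short Python sketch. It assumes Q1 and Q3 are the medians of the lower and upper halves of the ordered data (for an odd n, this sketch leaves the middle observation out of both halves); the function name `five_number_summary` is hypothetical.

```python
from statistics import median


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    ordered = sorted(data)
    n = len(ordered)
    lower_half = ordered[: n // 2]
    upper_half = ordered[(n + 1) // 2:]
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])


temps = [14, 12, 17, 25, 10, -1, -8, -15, -7, 0, 5, 14, 18, 14,
         16, 8, -15, -13, -17, -12, 0, 1, 9, 12, 14, 7, 6, 8]
summary = five_number_summary(temps)
print(summary)  # (-17, -4.0, 7.5, 14.0, 25)
```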
BOXPLOTS IN-CLASS PROBLEM Min = -17 Q1 = -4 Median = 7.5 Q3 = 14 Max = 25 Now for the box! [number line with ticks from -24 to 24 in steps of 4]
BOXPLOTS IN-CLASS PROBLEM Min = -17 Q1 = -4 Median = 7.5 Q3 = 14 Max = 25 Min!
BOXPLOTS IN-CLASS PROBLEM Min = -17 Q1 = -4 Median = 7.5 Q3 = 14 Max = 25 Q1!
BOXPLOTS IN-CLASS PROBLEM Min = -17 Q1 = -4 Median = 7.5 Q3 = 14 Max = 25 Median!
BOXPLOTS IN-CLASS PROBLEM Min = -17 Q1 = -4 Median = 7.5 Q3 = 14 Max = 25 Q3!
BOXPLOTS IN-CLASS PROBLEM Min = -17 Q1 = -4 Median = 7.5 Q3 = 14 Max = 25 Max!
BOXPLOTS IN-CLASS PROBLEM Min = -17 Q1 = -4 Median = 7.5 Q3 = 14 Max = 25 Box!
BOXPLOTS IN-CLASS PROBLEM Min = -17 Q1 = -4 Median = 7.5 Q3 = 14 Max = 25 Whiskers!
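The drawing steps (mark the five numbers, draw the box from Q1 to Q3, then add the whiskers) can be imitated with a rough text-only sketch in Python. Everything here is an assumption made for illustration: the function name `ascii_boxplot`, the marker characters, one character per degree, and an axis running from -24 to 28 so that the max of 25 fits.

```python
def ascii_boxplot(minimum, q1, median, q3, maximum, axis_lo=-24, axis_hi=28):
    """Return a one-line text boxplot, one character per axis unit."""
    def col(value):
        # Column of a value on the axis (fractions are truncated).
        return int(value - axis_lo)

    row = [" "] * (axis_hi - axis_lo + 1)
    # Whiskers: from the min up to Q1, and from Q3 up to the max.
    for c in range(col(minimum), col(q1)):
        row[c] = "-"
    for c in range(col(q3) + 1, col(maximum) + 1):
        row[c] = "-"
    # Box: from Q1 to Q3.
    for c in range(col(q1), col(q3) + 1):
        row[c] = "="
    # Markers: whisker ends, box edges, and the median inside the box.
    row[col(minimum)] = "|"
    row[col(maximum)] = "|"
    row[col(q1)] = "["
    row[col(q3)] = "]"
    row[col(median)] = "M"
    return "".join(row)


plot = ascii_boxplot(-17, -4, 7.5, 14, 25)
print(plot)
```

The printed line shows `|` at the min and max, `[` and `]` at Q1 and Q3, and `M` at the median.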
Questions?
Outliers Because the min and max may be outliers, a variation on the boxplot includes “fences” to show where most of the data occurs
Outliers Lower fence: Q1 - 1.5 * IQR Upper fence: Q3 + 1.5 * IQR
OUTLIERS IN-CLASS PROBLEM Min = -17 Q1 = -4 Median = 7.5 Q3 = 14 Max = 25 What is the IQR? [number line with ticks from -24 to 24 in steps of 4]
OUTLIERS IN-CLASS PROBLEM Min = -17 Q1 = -4 Median = 7.5 Q3 = 14 Max = 25 IQR = 14 - (-4) = 18 What is the lower fence?
OUTLIERS IN-CLASS PROBLEM Min = -17 Q1 = -4 Median = 7.5 Q3 = 14 Max = 25 IQR = 14 - (-4) = 18 Lower fence = Q1 - 1.5 * IQR = -4 - 1.5(18) = -31
OUTLIERS IN-CLASS PROBLEM Min = -17 Q1 = -4 Median = 7.5 Q3 = 14 Max = 25 IQR = 14 - (-4) = 18 Lower fence = -31 What is the upper fence?
OUTLIERS IN-CLASS PROBLEM Min = -17 Q1 = -4 Median = 7.5 Q3 = 14 Max = 25 IQR = 14 - (-4) = 18 Lower fence = -31 Upper fence = Q3 + 1.5 * IQR = 14 + 1.5(18) = 41
OUTLIERS IN-CLASS PROBLEM Min = -17 Q1 = -4 Median = 7.5 Q3 = 14 Max = 25 IQR = 14 - (-4) = 18 Lower fence = -31 Upper fence = 41 So, do we have any outliers?
OUTLIERS IN-CLASS PROBLEM Min = -17 Q1 = -4 Median = 7.5 Q3 = 14 Max = 25 IQR = 14 - (-4) = 18 Lower fence = -31 Upper fence = 41 Max and Min are inside the fence!
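The fence calculation and the outlier check can be sketched in Python as follows. The function name `fences` and the parameter `k` are hypothetical; Q1 = -4 and Q3 = 14 are taken from the worked problem rather than recomputed.

```python
def fences(q1, q3, k=1.5):
    """Lower and upper fences: Q1 - k * IQR and Q3 + k * IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


temps = [14, 12, 17, 25, 10, -1, -8, -15, -7, 0, 5, 14, 18, 14,
         16, 8, -15, -13, -17, -12, 0, 1, 9, 12, 14, 7, 6, 8]

lower, upper = fences(q1=-4, q3=14)
# An observation counts as an outlier if it falls outside the fences.
outliers = [t for t in temps if t < lower or t > upper]
print(lower, upper)  # -31.0 41.0
print(outliers)      # []
```

The empty list agrees with the slide: both the min (-17) and the max (25) fall inside the fences.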
Outliers How outliers are shown in a boxplot
Types of Boxplots
Questions?
Boxplots Boxplots are typically used to compare different groups
Boxplots Data Summary Table from a Ball-bouncing Experiment
Super Ball, Wiffle, Golf, Splash, SpongyBall
Minimum: 66, 38, 70, 7, 44
Q1: 71, 45, 75, 14, 58
Median: 76, 48, 78, 16.5, 60
Q3: 50, 80, 23, 62
Maximum: 91, 90, 28, 67
Boxplots
Boxplots
BOXPLOTS IN-CLASS PROBLEM What differences?
Boxplots Unfortunately it is almost impossible to get a true boxplot using Excel (there are several YouTube videos showing how to get one… but they are all wrong…)
Questions?