JV Stats HW & Test #2: Outlier Formulas, Box Plots, Histograms, Describing a Distribution (C.U.S.S.), Standard Deviation and Variance
Create a box plot of this data. We need the 5-number summary, which consists of the min, Q1, median, Q3, and max.
Finding Q1 and Q3 the long way
Outlier formulas
Outliers
Boxplot of greed data
What if there was an outlier?
How many text messages? Get an exact count of the number of text messages SENT & RECEIVED since 8:00 AM today. (This is not a trap, I promise.)
Text messages: create a stemplot; find the 5-number summary; find the mean; create a boxplot (make sure to check for outliers!); describe the distribution.
Per 2 text messages: summary of stats, 5-number summary
Per 2 boxplot. 5-number summary: 0, 0, 5.5, 20, 96. The values 57 and 96 are outliers.
Per 2: describe the distribution (C.U.S.S.). The median number of total text messages for our class was 5.5, while the mean was 13.3. Two students had very large text message totals of 57 and 96, which were outliers. Because of these large totals, our distribution was strongly skewed to the right. The number of text messages ranged from 0 to 96.
Per 3 text messages: summary of stats, 5-number summary
Per 3 boxplot. 5-number summary: 0, 0, 8, 29, 117. The values 82, 94, and 117 are outliers.
Per 3: describe the distribution (C.U.S.S.). The median number of total text messages for our class was 8, while the mean was 20.8. Three students had very large text message totals of 82, 94, and 117, which were outliers. Because of these large totals, our distribution was strongly skewed to the right. The number of text messages ranged from 0 to 117.
Histograms, Per 2: stemplot of Per 2 Unit 1 test scores
Histograms, Per 2: histogram
Histograms, Per 2: calculating numerical summaries
Histograms, Per 2: describe the distribution (C.U.S.S.). Our distribution of test scores is skewed to the left, and there appear to be no outliers. The median test score fell in the 80% to 89% range. The mean will be somewhat lower than the median due to the skew to the left. Scores for this test ranged from the 40s up to around 100%.
Histograms, Per 3: calculating numerical summaries
Histograms, Per 3: describe the distribution (C.U.S.S.). Our distribution of the Unit 1 test scores is skewed to the left, and there appear to be no outliers. The median test score fell between 90% and 94% inclusive. Due to the skew to the left, our mean score is less than the median. Scores ranged from the low 60s up to the high 90s.
Histogram (bars centered on the numbers), Per 2
Test #1: split stems were used in these stemplots (Per 2 and Per 4)
Period 2 test scores: find the median, also called Q2. Median = 69.
Period 2 test scores: find Q1, which is the median of the values below the median (Q1 = 59). Find Q3, which is the median of the values above the median (Q3 = 79).
Period 2 test scores: give the 5-number summary, which is min, Q1, Q2, Q3, max: 41, 59, 69, 79, 97. IQR is the interquartile range: IQR = Q3 – Q1 = 20.
Period 2 test scores: checking for outliers. We have formulas to check high and low numbers. To check for a low outlier: Q1 – 1.5(IQR); any number smaller is an outlier. To check for a high outlier: Q3 + 1.5(IQR); any number larger is an outlier.
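The two outlier formulas can be sketched in a few lines of Python. This is only an illustration and not part of the original slides; the function names outlier_fences and find_outliers are made up for this example. It uses the Period 2 summary (Q1 = 59, Q3 = 79, min 41, max 97).

```python
def outlier_fences(q1, q3):
    """Return (low_fence, high_fence) from the 1.5 * IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, q1, q3):
    """Return the values that fall below the low fence or above the high fence."""
    low, high = outlier_fences(q1, q3)
    return [x for x in values if x < low or x > high]


# Period 2: Q1 = 59, Q3 = 79, so IQR = 20
low, high = outlier_fences(59, 79)
print(low, high)  # 29.0 109.0

# Only the min and max need checking: if they are inside the fences,
# every other score is too.
print(find_outliers([41, 97], 59, 79))  # []
```

Since 41 is not below 29 and 97 is not above 109, Period 2 has no outliers.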
Create a box plot. Label the outliers, min, Q1, Q2, Q3, and max.
Create a box plot of your test scores. Start with a number line below your box plot. Make your increments consistent. Draw the box accurately.
Period 4 test scores: find the median, also called Q2. Median = 69.
Period 4 test scores: find Q1, which is the median of the values below the median (Q1 = 59). Find Q3, which is the median of the values above the median (Q3 = 83).
Period 4 test scores: give the 5-number summary, which is min, Q1, Q2, Q3, max: 41, 59, 69, 83, 97. IQR is the interquartile range: IQR = Q3 – Q1 = 24.
Period 4 test scores: checking for outliers. We have formulas to check high and low numbers. To check for a low outlier: Q1 – 1.5(IQR); any number smaller is an outlier. To check for a high outlier: Q3 + 1.5(IQR); any number larger is an outlier.
Create a box plot. Label the outliers, min, Q1, Q2, Q3, and max.
Create a box plot of your test scores. Start with a number line below your box plot. Make your increments consistent. Draw the box accurately.
Back-to-back stemplot
C.U.S.S.
This acronym is used to compare two or more distributions (graphs).
C: center. Give the mean and median.
U: unusual features. Are there outliers?
S: shape. Skewed left, skewed right, fairly symmetric, etc.
S: spread. Give the range (max – min).
Use the C.U.S.S. acronym to compare two or more distributions
Period 2 has a mean of 69.5 and Period 4 has a mean of 69.7. Both classes have a median of 69. There are no outliers for either class. Both distributions are fairly symmetric. Both classes had a min score of 41 and a max score of 97.
Find the 5-number summary and check for outliers: {3, 5, 2, 6, 5, 1, 9, 7, 4, 2, 3, 23}. If you need to, put them in order: 1, 2, 2, 3, 3, 4, 5, 5, 6, 7, 9, 23. 5-number summary: {1, 2.5, 4.5, 6.5, 23}.
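As a check on the exercise above, here is a short Python sketch. It is not from the original slides. It assumes the same quartile method the slides use (Q1 is the median of the lower half, Q3 is the median of the upper half, and the middle value is left out of both halves when the count is odd). The function name five_number_summary is made up for this example. Calculators and software sometimes use other quartile conventions and can give slightly different Q1 and Q3.

```python
from statistics import median


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method.

    Assumes at least 2 values.
    """
    values = sorted(data)
    n = len(values)
    half = n // 2
    lower = values[:half]             # values below the median position
    upper = values[half + (n % 2):]   # skip the middle value when n is odd
    return (values[0], median(lower), median(values), median(upper), values[-1])


data = [3, 5, 2, 6, 5, 1, 9, 7, 4, 2, 3, 23]
summary = five_number_summary(data)
print(summary)  # (1, 2.5, 4.5, 6.5, 23)

# Outlier check with the 1.5 * IQR rule
low_value, q1, med, q3, high_value = summary
iqr = q3 - q1                         # 4.0
low_fence = q1 - 1.5 * iqr            # -3.5
high_fence = q3 + 1.5 * iqr           # 12.5
outliers = [x for x in data if x < low_fence or x > high_fence]
print(outliers)  # [23]
```

The value 23 is above the high fence of 12.5, so it is a high outlier. Nothing falls below the low fence of -3.5.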
Mystery box plot: here is the 5-number summary of a distribution of a set of 12 numbers: {1, 7, 9, 13, 22}. a) Is it possible that there is no number 9 in the set of numbers? Explain. Yes. Since there is an even number of numbers in the set, there is no exact middle number; the median is the average of the two middle numbers, so it does not have to be in the set. Example (a set of 10 numbers with the same 5-number summary): {1, 2, 7, 8, 8, 10, 10, 13, 19, 22}.
Mystery box plot: here is the 5-number summary of a distribution of a set of 23 numbers: {1, 7, 9, 13, 22}. a) Is it possible that there is no number 9 in the set of numbers? Explain. No. Since there is an odd number of numbers in the set, the median must be a number in the set.
Create your set of data Create a set of data with 20 numbers. Make sure there are 2 low outliers and 2 high outliers. Show your outlier formulas to prove that your set meets these requirements. Create a box plot of your data.
Quartiles: a box plot is built around Q1, Q2, and Q3. These are called quartiles because they split the box plot into 4 parts (parts 1, 2, 3, and 4, separated by Q1, Q2, and Q3). Each of the 4 parts contains about 25% of the data.
Standard Deviation & Variance: consider the following set of numbers: {1, 1, 2, 2, 3, 4, 5, 6, 7, 9}. Find the mean. Find the distance each number is from the mean, square all those distances, and add them up. Divide that sum by (n-1); this is the variance. Take the square root of the variance. Now you have the standard deviation.
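The steps above can be followed one at a time in Python. This sketch is not part of the original slides; the variable names are chosen only for illustration. The last line compares the hand calculation with Python's built-in statistics.stdev, which also divides by (n-1).

```python
from math import sqrt
from statistics import mean, stdev

data = [1, 1, 2, 2, 3, 4, 5, 6, 7, 9]

m = mean(data)                                     # step 1: the mean (40 / 10)
squared_distances = [(x - m) ** 2 for x in data]   # step 2: square each distance
total = sum(squared_distances)                     # ...and add them up
var = total / (len(data) - 1)                      # step 3: divide by (n - 1)
sd = sqrt(var)                                     # step 4: square root

print(m, total)                       # 4 66
print(round(var, 2), round(sd, 2))    # 7.33 2.71
print(abs(sd - stdev(data)) < 1e-9)   # True: matches the built-in result
```

So for this set the variance is 66/9, about 7.33, and the standard deviation is about 2.71.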
Let's add the number 13 to the set. Now the mean is not a whole number, which makes the process a little more difficult.
What is standard deviation? It is roughly the typical (average) distance of the numbers in the set from the mean. It is a measure of spread. A lower standard deviation means that the numbers in the set are grouped closely around the mean. A higher standard deviation means that the numbers in the set are widely spread and may include outliers.
Find the standard deviation: 1, 7, 2, 10, 4, 7
Find the standard deviations (round means to whole numbers)
The Game of Greed
Find the 5-number summary: {0, 16, 32, 53, 73}. Check for outliers: Q1 – 1.5(IQR) = 16 – 1.5(37) = -39.5 and Q3 + 1.5(IQR) = 53 + 1.5(37) = 108.5. There are NO OUTLIERS. Draw the box plot. Describe the distribution (C.U.S.S.): the mean is 34.3 and the median is 32. There are no outliers. The distribution is fairly symmetric, and the data range from 0 to 73.
Histograms: how many times did you go out to eat over the weekend? Before we create our histogram, we need to make a frequency table.
Create the histogram
Calculate the mean of a histogram
Describe the distribution (C.U.S.S.)
Different types of histograms: bars can be centered on the numbers, or bars can cover a given range. Your data will help you decide which type is better.
Outliers affect the range, mean, variance, and standard deviation. An outlier pulls the mean up or down and increases the range, variance, and standard deviation. The median and IQR are resistant: outliers have little or no effect on them.
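A quick way to see this is to take a data set, turn its largest value into an outlier, and compare the statistics. The Python sketch below is not from the original slides. It reuses the set {1, 1, 2, 2, 3, 4, 5, 6, 7, 9} and, as a hypothetical change, replaces the 9 with 90. The function name summarize is made up for this example, and the quartiles use the median-of-halves method.

```python
from statistics import mean, median, stdev


def summarize(data):
    """Return the range, mean, standard deviation, median, and IQR of data."""
    values = sorted(data)
    n = len(values)
    half = n // 2
    q1 = median(values[:half])
    q3 = median(values[half + (n % 2):])   # skip the middle value when n is odd
    return {
        "range": values[-1] - values[0],
        "mean": mean(values),
        "sd": round(stdev(values), 2),
        "median": median(values),
        "iqr": q3 - q1,
    }


original = [1, 1, 2, 2, 3, 4, 5, 6, 7, 9]
with_outlier = [1, 1, 2, 2, 3, 4, 5, 6, 7, 90]   # hypothetical: 9 replaced by 90

before = summarize(original)
after = summarize(with_outlier)
print(before)
print(after)

# The median and IQR are the same in both sets;
# the range, mean, and standard deviation all get larger.
```

In both sets the median is 3.5 and the IQR is 4, while the range jumps from 8 to 89 and the mean from 4 to 12.1.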
Coin in a cup activity: you are given 3 minutes for your strong hand and 3 minutes for your weak hand. How many quarters can you bounce into a cup? Is your strong hand better than your weak hand? To reduce bias, each student flipped a coin to determine whether to use the strong or weak hand for the first 3 minutes. This is done because students may learn how to bounce their coins and get better the second time around.
Coin in a cup activity, Per 2
Compare the strong and weak hands (C.U.S.S.): the median of the strong hand is 3.5 and the median of the weak hand is 2. The mean of the strong hand is 4.1 and the mean of the weak hand is 3.5. There are no outliers. Both distributions are skewed to the right. The strong hand ranges from 0 to 12 and the weak hand ranges from 0 to 10.
Find the standard deviation: 1, 4, 6, 3, 7, 8, 1, 8, 7
Check for outliers: 1, 25, 23, 38, 45, 45, 48, 53, 67, 72, 76
Create a boxplot: 1, 25, 23, 38, 45, 45, 48, 53, 67, 72, 76
Find the mean and median of a histogram
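One way to do this when the bars are centered on the numbers is to read the frequency table off the histogram and work from the counts. The Python sketch below is not from the original slides. The frequency table in it is hypothetical (made up for illustration, not class data), and the function names histogram_mean and histogram_median are also made up.

```python
def histogram_mean(freq):
    """Mean from a frequency table: freq maps each value to its count."""
    n = sum(freq.values())
    return sum(value * count for value, count in freq.items()) / n


def histogram_median(freq):
    """Median from a frequency table, found by counting in to the middle."""
    n = sum(freq.values())

    def value_at(position):
        # Value sitting at a 1-based position when the data are in order
        running = 0
        for value in sorted(freq):
            running += freq[value]
            if running >= position:
                return value

    if n % 2 == 1:
        return value_at((n + 1) // 2)
    # Even count: average the two middle values
    return (value_at(n // 2) + value_at(n // 2 + 1)) / 2


# HYPOTHETICAL table: times eaten out -> number of students
freq = {0: 4, 1: 6, 2: 5, 3: 3, 4: 2}

print(histogram_mean(freq))    # 1.65  (33 total meals / 20 students)
print(histogram_median(freq))  # 1.5   (10th value is 1, 11th value is 2)
```

For a histogram whose bars cover ranges instead of single numbers, only the bar containing the median can be identified this way, and the mean can only be estimated.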
USE THESE TO CHECK FOR OUTLIERS
Texting challenge: BOYS VS GIRLS. Pair up with a person of the opposite sex. One person will text and the other person will time them using a stopwatch, then switch.
Mama always said, “Life is like a box of chocolates, you never know what you’re gonna get”
2014 Per 2: The girls were much faster at texting. They had a median texting time of 31 seconds, while the boys' median was 50 seconds. Mean times were 31.3 seconds for girls and 49.5 seconds for boys. The girls' mean and median were very close together, which fits their fairly symmetric box plot, while the boys' box plot was arguably symmetric but could also be considered slightly skewed right. Even though the boys had some very slow texting times, there were no outliers for boys or girls. Boys had a much larger range, with texting times from 23.4 seconds up to 99 seconds. Girls' times were much more consistent, ranging from 15 seconds to 46 seconds.
Times (seconds) to text phrase, Per 2 2014. Boys: (25, 31.5, 40, 45, 120). Girls: (22, 29, 34, 48, 78).
Describe the distribution (C.U.S.S.). Also use this to compare multiple distributions.
Describe the distribution (C.U.S.S.), Per 2
Times (seconds) to text phrase, Per 4. Boys: (31, 40, 42, 53, 53). Girls: (25, 34, 40.5, 46.5, 66).
Exit light, enter night, take my hand, we're off to never-never land.
2015 Per 3, girls vs. boys: The girls were slightly faster at texting than the boys. They had a median texting time of 21.5 seconds, while the boys' median was 24.8 seconds. Mean times were 25.6 seconds for girls and 27.5 seconds for boys. For both boys and girls, the mean time was slower than the median time, which fits boxplots that were slightly right skewed. The girls' slowest texting time of 58 seconds and the boys' slowest time of 61.5 seconds were considered outliers. The boys and the girls had similar ranges of texting times, with the girls from 12 seconds up to 58 seconds and the boys from 15.2 seconds up to 61.5 seconds.