PROBABILITY & STATISTICS Monday, September 10th, 2012
7 Billion_ Are You Typical_ -- National Geographic Magazine.mp4
THE MEAN To find the mean, add the values from the data set and divide by the number of observations. LET’S TRY IT OUT… – Find the mean of the following data: {2,5,7,9,6,8,5,5,3,0,2}
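The rule on this slide can be sketched in a few lines of Python. This is only an illustration of "add the values and divide by the number of observations"; the function name `mean` and the variable `data` are chosen here and are not part of the slides.

```python
def mean(values):
    """Add the values and divide by the number of observations."""
    return sum(values) / len(values)

# Data set from the slide
data = [2, 5, 7, 9, 6, 8, 5, 5, 3, 0, 2]

# The values add up to 52 and there are 11 observations
print(round(mean(data), 2))  # 52 / 11, about 4.73
```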
YOU TRY… What is the MEAN for the home-run data?
HOW CAN WE USE MEAN? What useful information does it provide? We just found that the mean number of home runs hit by Barry Bonds is ____. Could we compare this mean to another baseball athlete’s mean?
BONDS vs. AARON Below is Hank Aaron’s home-run data during all his years of baseball: {13, 27, 26, 44, 30, 39, 40, 34, 45, 44, 24, 32, 44, 39, 29, 44, 38, 47, 34, 40, 20} FIND THE MEAN
BONDS vs. AARON BONDS hit a mean of ____ home runs per season during his career. AARON hit a mean of 34.9 home runs per season during his career. WHAT DOES THE MEAN TELL US?
WAIT…What about those outliers? What if we left the 73 out? What would our new MEAN be? WHAT DOES THIS TELL US?
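One way to explore this question is to compute the mean with and without the 73. The sketch below assumes the Bonds data is the 16-season list given on the median slide and uses Aaron's data from the BONDS vs. AARON slide; the variable names are chosen here for illustration.

```python
# Barry Bonds, first 16 seasons (from the median slide)
bonds = [33, 37, 19, 16, 25, 33, 42, 49, 37, 34, 25, 73, 46, 40, 34, 24]
# Hank Aaron (from the BONDS vs. AARON slide)
aaron = [13, 27, 26, 44, 30, 39, 40, 34, 45, 44, 24, 32, 44, 39, 29, 44,
         38, 47, 34, 40, 20]

def mean(values):
    return sum(values) / len(values)

# Drop the single 73-home-run season and recompute
bonds_without_73 = [x for x in bonds if x != 73]

mean_bonds = mean(bonds)                  # 567 / 16
mean_bonds_trimmed = mean(bonds_without_73)  # 494 / 15
mean_aaron = mean(aaron)                  # 733 / 21

print(round(mean_bonds, 1), round(mean_bonds_trimmed, 1), round(mean_aaron, 1))
```

With the 73 included, Bonds's mean sits slightly above Aaron's; with it removed, Bonds's mean drops below Aaron's. A single extreme season is enough to change the comparison.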
CAUTIONARY NOTE: MEANS The MEAN is sensitive to the influence of EXTREME observations – EXTREME observations can be outliers OR values in the LONG TAIL of a heavily skewed distribution. The mean is NOT a resistant measure because it cannot resist the influence of extreme observations.
THE MEDIAN The MEDIAN is the formal version of midpoint. MEDIAN: the midpoint of a distribution, such that half of the observations are smaller and half are larger.
THE MEDIAN HOW DO WE FIND THE MEDIAN? – Arrange all numbers in order of size, from smallest to largest – If the number of observations is odd, the median is the center observation in the list – If the number of observations is even, the median is the mean of the two center observations in the ordered list
LET’S TRY… FIND THE MEDIAN FOR THE DATA SET Home runs hit by Barry Bonds in his first 16 seasons: {33, 37, 19, 16, 25, 33, 42, 49, 37, 34, 25, 73, 46, 40, 34, 24} HOW DO WE FIND THE MEDIAN? – Arrange all numbers in order of size, from smallest to largest – If the number of observations is odd, the median is the center observation in the list – If the number of observations is even, the median is the mean of the two center observations in the ordered list
YOU TRY… FIND THE MEDIAN FOR THE DATA SET Home runs hit by Hank Aaron: {13, 20, 24, 26, 27, 29, 30, 32, 34, 34, 38, 39, 39, 40, 40, 44, 44, 44, 44, 45, 47}
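The three steps for finding the median can be sketched as a small function. This is a minimal illustration, not a library routine; the function name `median` is chosen here. The Bonds data has an even number of observations and the Aaron data has an odd number, so the two lists exercise both cases.

```python
def median(values):
    # Step 1: arrange the observations from smallest to largest
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    # Step 2: odd count, take the center observation
    if n % 2 == 1:
        return ordered[mid]
    # Step 3: even count, take the mean of the two center observations
    return (ordered[mid - 1] + ordered[mid]) / 2

bonds = [33, 37, 19, 16, 25, 33, 42, 49, 37, 34, 25, 73, 46, 40, 34, 24]
aaron = [13, 20, 24, 26, 27, 29, 30, 32, 34, 34, 38, 39, 39, 40, 40, 44,
         44, 44, 44, 45, 47]

print(median(bonds))  # 16 observations: mean of the 8th and 9th, both 34
print(median(aaron))  # 21 observations: the 11th value, 38
```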
COMPARING MEAN AND MEDIAN The mean is NOT resistant – High and low values affect it The median IS resistant
COMPARING MEAN AND MEDIAN {2,3,4,5,6,7,8,7,6,5,4,3,2} What is the mean? What is the median? What can we conclude?
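A quick check of this data set, using Python's standard `statistics` module, is sketched below. The variable names are chosen here for illustration.

```python
from statistics import mean, median

data = [2, 3, 4, 5, 6, 7, 8, 7, 6, 5, 4, 3, 2]

data_mean = mean(data)      # 62 / 13
data_median = median(data)  # 7th of the 13 ordered values

print(round(data_mean, 2), data_median)
```

The values rise and fall roughly symmetrically, so the mean (about 4.77) and the median (5) land close together.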
WHAT ABOUT SKEWED DATA? What if data is skewed right? What if data is skewed left?
TRY IT OUT Joey’s first 14 quiz grades in a marking period were: {86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93} Calculate the mean
TRY IT OUT…continued What if Joey has an unexcused absence for the fifteenth quiz and he receives a score of 0? Determine his final quiz average. What property of the mean does this situation illustrate? What was the effect of the zero on Joey’s quiz average?
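The effect of the zero can be checked with a short sketch. It also computes the median before and after, which the slide does not ask for, to contrast a resistant measure with a non-resistant one. Variable names are chosen here for illustration.

```python
from statistics import mean, median

quizzes = [86, 84, 91, 75, 78, 80, 74, 87, 76, 96, 82, 90, 98, 93]
with_zero = quizzes + [0]  # fifteenth quiz scored as 0

mean_before = mean(quizzes)      # 1190 / 14
mean_after = mean(with_zero)     # 1190 / 15
median_before = median(quizzes)  # mean of 84 and 86
median_after = median(with_zero) # 8th of 15 ordered values

print(mean_before, round(mean_after, 1))
print(median_before, median_after)
```

One zero pulls the mean down from 85 to about 79.3, while the median only moves from 85 to 84.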
DO MEAN & MEDIAN TELL THE WHOLE STORY? Example: the Census Bureau reports that in 2000 the median income of American households was $41,345. – What does this mean? Example: calculating the mean concentration of active ingredients for a drug. WHAT’S MISSING?
DON’T WE NEED TO KNOW THE SPREAD OF THE DATA? Range: the difference between the largest and smallest values in a data set. BUT…does range tell us everything?
QUARTILES Quartiles allow us to look at the spread of the middle of the data. HOW TO CALCULATE QUARTILES: 1. Arrange observations in increasing order and locate the median (M) of the data. 2. The first quartile (Q1) is the median of the observations to the left of the median. 3. The third quartile (Q3) is the median of the observations to the right of the median.
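The three steps above can be sketched as follows. Statistical software uses several slightly different quartile conventions, so this function follows the slide's rule by hand rather than relying on a library quartile routine. The function name `quartiles` is chosen here, and Hank Aaron's data is used as the example.

```python
from statistics import median

def quartiles(values):
    # Step 1: order the observations and locate the median M
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    # Observations to the left and right of the median.
    # With an odd count, the median itself belongs to neither half.
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    # Steps 2 and 3: Q1 and Q3 are the medians of the two halves
    return median(lower), median(ordered), median(upper)

aaron = [13, 20, 24, 26, 27, 29, 30, 32, 34, 34, 38, 39, 39, 40, 40, 44,
         44, 44, 44, 45, 47]

q1, m, q3 = quartiles(aaron)
print(q1, m, q3)  # quartiles 28, 38 and 44 (some may print as floats)
```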
QUARTILES Are quartiles RESISTANT? Q1 M Q3
GIVE IT A TRY! Using Hank Aaron’s data: {13, 20, 24, 26, 27, 29, 30, 32, 34, 34, 38, 39, 39, 40, 40, 44, 44, 44, 44, 45, 47}
INTERQUARTILE RANGE What do you think it means?
INTERQUARTILE RANGE Interquartile Range: the distance between the first and third quartiles – IQR = Q3 - Q1 How is IQR useful? – If an observation falls between Q1 and Q3, you know it’s neither unusually high nor unusually low. – It can help us identify suspected outliers
USING IQR TO FIND OUTLIERS Before, we used our judgment to identify outliers. THE 1.5 X IQR CRITERION: – If an observation falls more than 1.5 X IQR above the third quartile or below the first quartile, it is a suspected outlier.
IS IT AN OUTLIER? Q1 M Q3 LET’S SUSPECT THAT BARRY BONDS’S 73 HOME RUN SEASON IS AN OUTLIER. USE THE 1.5 X IQR CRITERION
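A minimal sketch of the 1.5 X IQR check for Bonds's data follows. It assumes the 16-season list from the median slide and uses the quartile rule from the QUARTILES slide; the names `lower_fence` and `upper_fence` are chosen here for the two cutoffs.

```python
from statistics import median

bonds = [33, 37, 19, 16, 25, 33, 42, 49, 37, 34, 25, 73, 46, 40, 34, 24]

ordered = sorted(bonds)
half = len(ordered) // 2  # 16 observations, so the halves split evenly

q1 = median(ordered[:half])  # median of the lower 8 values: 25
q3 = median(ordered[half:])  # median of the upper 8 values: 41
iqr = q3 - q1                # 16

# Cutoffs 1.5 X IQR beyond the quartiles
lower_fence = q1 - 1.5 * iqr  # 25 - 24 = 1
upper_fence = q3 + 1.5 * iqr  # 41 + 24 = 65

outliers = [x for x in bonds if x < lower_fence or x > upper_fence]
print(outliers)  # [73]
```

Since 73 is more than 65, the 73-home-run season is flagged by the criterion; no season falls below the lower cutoff.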
Auction Action Nina and her mum helped organize the auction fundraiser to help pay for the school ski trip. Seven local shops each donated a gift certificate to be redeemed for merchandise. The gift certificates will be auctioned off at the big event.
Auction Action The Athlete's Alley, Bay State Basement, CD Corner, Don's Deli, Elk's Software, Fancy Footwear Shop, and the Gorgeous Candy Boutique have all participated. Your task is to use the information on the next slide to work out the amount donated by each of these shops.
Auction Action The gift certificates are each in multiples of $5. There is a $100 range in the value of the gift certificates, which start at $25. The mean value of all seven gift certificates is $80, and the median and the mode are both $70. The certificate from The Athlete's Alley is worth the most, and the one from Gorgeous Candy is worth the least. The total value of the gift certificates from CD Corner, Don's Deli, and the Fancy Footwear Shop is $270, but the Fancy Footwear certificate is worth $50 more than the one from Don's Deli. The CD Corner gift certificate is equivalent to the mean for this group of 3.
Auction Action - Answers The Athlete's Alley: $125; Bay State Basement: $70; CD Corner: $90; Don's Deli: $65; Elk's Software: $70; Fancy Footwear Shop: $115; Gorgeous Candy Boutique: $25
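The answers can be checked against each clue with a short script. This is a verification sketch only, not a solver; the dictionary keys and check labels are chosen here for illustration.

```python
from statistics import mean, median, mode

answers = {
    "The Athlete's Alley": 125,
    "Bay State Basement": 70,
    "CD Corner": 90,
    "Don's Deli": 65,
    "Elk's Software": 70,
    "Fancy Footwear Shop": 115,
    "Gorgeous Candy Boutique": 25,
}

values = list(answers.values())
cd = answers["CD Corner"]
dons = answers["Don's Deli"]
fancy = answers["Fancy Footwear Shop"]

# One entry per clue from the puzzle slide
checks = {
    "multiples of $5": all(v % 5 == 0 for v in values),
    "start at $25 with a $100 range": min(values) == 25 and max(values) - min(values) == 100,
    "mean is $80": mean(values) == 80,
    "median is $70": median(values) == 70,
    "mode is $70": mode(values) == 70,
    "Athlete's Alley is worth the most": answers["The Athlete's Alley"] == max(values),
    "Gorgeous Candy is worth the least": answers["Gorgeous Candy Boutique"] == min(values),
    "CD + Don's + Fancy total $270": cd + dons + fancy == 270,
    "Fancy is $50 more than Don's": fancy == dons + 50,
    "CD equals the mean of the group of 3": cd == (cd + dons + fancy) / 3,
}

all_ok = all(checks.values())
for clue, ok in checks.items():
    print(clue, "OK" if ok else "FAILED")
```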