Measures of Central Tendency “Where is the Middle?” To accompany Hawkes Lesson 3.1 Original content by D.R.S. 12/3/2018
The Mean The street name for “mean” is “average”, but be professional and call it “the mean”. In words: “Add up all the numbers. Divide by how many numbers there are in the list.”
Formula for calculating the Mean In symbols: \( \mu = \frac{\sum x_i}{N} \) or \( \bar{x} = \frac{\sum x_i}{n} \). The \( x_i \) are the individual data values. There are \( N \) data values in the population, or, if a sample, little \( n \) data values. \( i \) goes from 1, 2, 3, …, \( N \) or \( n \). The Greek letter Σ, capital sigma, means “sum”.
Formula for calculating the Mean In symbols: \( \mu = \frac{\sum x_i}{N} \) or \( \bar{x} = \frac{\sum x_i}{n} \). Rounding: one more decimal place than the data’s precision. The mean of { 15, 15, 17 } is 15.7. The mean of { 15.0, 15.0, 17.0 } is 15.67.
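The "add them up, divide by how many" recipe and the rounding rule can be sketched in a few lines of Python. This is only an illustration; the function name `mean` is chosen here and is not part of the lesson.

```python
def mean(values):
    """Add up all the numbers, then divide by how many there are."""
    return sum(values) / len(values)

# Whole-number data: report one decimal place.
print(round(mean([15, 15, 17]), 1))        # 15.7

# Data recorded to one decimal place: report two.
print(round(mean([15.0, 15.0, 17.0]), 2))  # 15.67
```

The same function serves for a population mean or a sample mean; only the name we give the result changes.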
Which name to use? \( \mu \) or \( \bar{x} \)? \( \mu \) is the Greek letter mu, pronounced “mew”. Use \( \mu \) if your calculation is the mean of the entire population being studied. \( \bar{x} \) is pronounced “x bar”. Use \( \bar{x} \) if your calculation is the mean of just a sample of the entire population.
Example – The Mean Data: Fernwood home sales at 3 common data.pptx Slide #2 The mean selling price of homes in Fernwood in the given month is $__________.
The Median Arrange the values in order from lowest to highest. Take the “middle” number. That’s the median. Example: We rolled five dice and recorded the totals. Our results: 10, 12, 17, 18, 18, 21, 23 Count: n = 7, an odd number of numbers. Who’s in the middle? The median is _____.
The Median of an even # of #s Same experiment, but this time we got these results: 10, 12, 17, 17, 18, 19, 20, 22 Count: n = 8, an even number of numbers. There are two middle numbers: 17 and 18. In this case, take the midpoint of the two middle numbers: (17 + 18) ÷ 2 = ______.
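Here is a minimal Python sketch of the median rule for both the odd and the even case, using the two dice data sets from the slides. The function name `median` and the variable names are chosen here for illustration. Expected output is left out on purpose so the blanks remain an exercise; run it to check your answers.

```python
def median(values):
    """Sort the values, then take the middle one (or the midpoint of the middle two)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: exactly one value sits in the middle.
        return ordered[mid]
    # Even count: take the midpoint of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_rolls = [10, 12, 17, 18, 18, 21, 23]        # n = 7
even_rolls = [10, 12, 17, 17, 18, 19, 20, 22]   # n = 8

print(median(odd_rolls))
print(median(even_rolls))
```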
Example – The Median Fernwood example: 3 common data.pptx Slide #2
The Mode What value occurs most frequently? Rolling one die over and over and recording the results. Data: 1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6, 6 The mode is _____.
The Modes, plural What value occurs most frequently? Rolling one die over and over and recording the results. Data: 1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 6 The modes are _____ and _____.
The Mode – another special case What value occurs most frequently? Prices of ten cars at Miracle Motors: $3,199 $4,999 $8,999 $9,100 $9,399 $10,599 $12,399 $15,999 $17,999 $32,999 Everybody is tied with one occurrence. “There is No Mode.”
More about No Mode What value occurs most frequently? Record the color of each traffic signal we encounter on our travels: R, R, G, G, G, R. Red and Green are tied with three occurrences each. “There is No Mode.” But if R, R, G, Y, G, G, R, then we have two modes, Red and Green.
How many modes? 1, 2, many, or 0 How many modes does the data set have? We say that the data set is ___________. 1 mode: UNIMODAL. 2 modes: BIMODAL. More than 2 modes: MULTIMODAL. No mode: “There is no mode”.
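The mode rules, including the "no mode" convention when every distinct value is tied, can be sketched as follows. The function names `modes` and `describe` are made up for this illustration. Treating a data set with only one distinct value as having that value for its mode is an assumption the slides do not address.

```python
from collections import Counter

def modes(values):
    """Return the list of modes, or [] when the slides would say 'There is no mode'."""
    counts = Counter(values)
    top = max(counts.values())
    winners = [v for v, c in counts.items() if c == top]
    # Lesson convention: if ALL distinct values tie for most frequent, there is no mode.
    # (Assumption: this only applies when there is more than one distinct value.)
    if len(counts) > 1 and len(winners) == len(counts):
        return []
    return winners

def describe(values):
    """Name the data set by how many modes it has."""
    k = len(modes(values))
    if k == 0:
        return "no mode"
    if k == 1:
        return "unimodal"
    if k == 2:
        return "bimodal"
    return "multimodal"

print(describe([1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 6, 6]))
print(describe(["R", "R", "G", "G", "G", "R"]))            # all values tied
print(modes(["R", "R", "G", "Y", "G", "G", "R"]))          # tie among only some values
```

Python's built-in `statistics.multimode` returns every value when all are tied, so the lesson's "no mode" convention is coded explicitly here.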
Fernwood, continued Data: 3 common data.pptx Slide #2 The mean home price is $___________. The median home price is $__________. The mode of the home price is $_________. The range of home prices is the highest minus the lowest = $________. The midrange is \( \frac{\text{lowest} + \text{highest}}{2} \) = $_________, halfway between the extreme values.
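Range and midrange need only the two extreme values. The sketch below uses hypothetical prices invented for illustration, not the Fernwood data, which lives in the slide file.

```python
# Hypothetical selling prices (NOT the Fernwood data).
prices = [180_000, 210_000, 225_000, 240_000, 950_000]

lowest, highest = min(prices), max(prices)

price_range = highest - lowest       # highest minus lowest
midrange = (lowest + highest) / 2    # halfway between the extremes

print(price_range)  # 770000
print(midrange)     # 565000.0
```

Notice how far the midrange sits from most of these made-up prices: one extreme value moves it a lot.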
“a Statistic” vs. “a Parameter” If the mean or median or mode or anything else was calculated from an entire population’s data – then it’s called a parameter. If the mean or median or mode or anything else was calculated from just a sample – then it’s called a statistic.
The lingo of Samples, Populations For a SAMPLE: We calculate STATISTICS. There are \( n \) items. The mean is denoted as \( \bar{x} \). For a POPULATION: We calculate PARAMETERS. There are \( N \) items. The mean is denoted as \( \mu \).
Mean for a Frequency Distribution Previously: a list of numbers Now: a list of numbers and counts of how many times each number occurs Same formula but with some adaptations
Example of a Frequency Distribution Bowling League – bowlers’ averages, Slide #3 in 3 common data.pptx or Excel in 3 common data.xlsx There are “classes”, ranges of scores. There are counts of how many individuals belong to that class. What about the boundaries?
Example – The Mean of a Frequency Distribution What is the mean league average? \( \frac{\sum x_i}{n} \) becomes \( \frac{\sum f_i \cdot x_i}{\sum f_i} \). The data values are denoted as \( x_i \) for \( i = 1, 2, 3, \ldots \) Value \( x_i \) occurs \( f_i \) times.
Example – The Mean of a Frequency Distribution What is the mean league average? \( \frac{\sum x_i}{n} \) becomes \( \frac{\sum f_i \cdot x_i}{\sum f_i} \). The numerator contains a shortcut for repeatedly adding the same number, such as a 150 score times how many individuals have it. The denominator adds up all the frequencies to get the total number of items in the set.
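To see that the frequency formula really is the same mean with a multiplication shortcut, here is a small Python sketch. The values and frequencies are hypothetical and are not the bowling league data from the slides.

```python
# Hypothetical data values and how many times each occurs (NOT the bowling data).
values = [100, 125, 150, 175]
freqs = [2, 5, 8, 5]

# Numerator: each value times its frequency, added up.
numerator = sum(f * x for x, f in zip(values, freqs))
# Denominator: total number of items in the set.
denominator = sum(freqs)
freq_mean = numerator / denominator

# Cross-check: write every value out f times and use the plain mean formula.
expanded = [x for x, f in zip(values, freqs) for _ in range(f)]
assert freq_mean == sum(expanded) / len(expanded)

print(freq_mean)  # 145.0
```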
Example – Mean for Frequency Dist. PPT – Slide #4 in 3 common data.pptx or Excel “Bowling mean” sheet in the workbook 3 common data.xlsx
Example – Mode for a Frequency Distribution Frequency means “How many times did the value occur?” So it’s easy – just look for the class which has the highest frequency. The mode is the VALUE associated with that highest frequency, not the count itself. You could say “the modal class is 90 and below” Or you could say “the mode is 85”.
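A short Python sketch of "look for the class with the highest frequency". The class labels and counts below are hypothetical, chosen only so that the top class matches the wording on this slide; they are not the actual bowling league table.

```python
# Hypothetical classes and frequencies (NOT the actual bowling league table).
distribution = {
    "90 and below": 9,
    "91-120": 6,
    "121-150": 4,
    "151 and above": 2,
}

# The modal class is the class NAME with the highest count, not the count itself.
modal_class = max(distribution, key=distribution.get)

print(modal_class)                # 90 and below
print(distribution[modal_class])  # 9 is the frequency, not the mode
```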
Example – Mode for a frequency distribution Refer again to the bowling league: Slide #3 in 3 common data.pptx “The Modal Class is ________________” (not a number, but a class name).
The Weighted Mean The classic example is GPA, Grade Point Average. The data values are A=4, B=3, C=2, D=1, F=0 The weights are the courses’ credit values
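A GPA is a weighted mean: each grade's points are multiplied by the course's credits, and the total is divided by the total credits. The sketch below uses a hypothetical transcript invented for illustration; the names `POINTS` and `gpa` are not from the lesson.

```python
POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}

def gpa(courses):
    """courses is a list of (letter_grade, credits) pairs."""
    total_points = sum(POINTS[grade] * credits for grade, credits in courses)
    total_credits = sum(credits for _, credits in courses)
    return total_points / total_credits

# Hypothetical transcript: (grade, credits).
transcript = [("B", 3), ("A", 3), ("C", 4), ("B", 1)]

print(round(gpa(transcript), 2))  # 2.91

# Ignoring the credits gives a different (unweighted) answer.
unweighted = sum(POINTS[g] for g, _ in transcript) / len(transcript)
print(unweighted)                 # 3.0
```

The 4-credit C counts more heavily than the 1-credit B, which is why the weighted result is lower than the plain mean of the grade points.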
Weighted Mean Example Original data: Slide #5 in 3 common data.pptx Computation: Slide #6 in 3 common data.pptx Excel spreadsheet for experimentation: Sheet “GPA experiment” in workbook 3 common data.xlsx
Weighted Mean Example What effect did the 4-credit course have? Notice that the convention for GPA is two decimal places; the usual rule of one more decimal place than the data would give 3.1.
TI-84 Put the data values into a TI-84 LIST. STAT menu, CALC submenu, 1-Var Stats Ln (if the list name is omitted, it uses L1). \( \bar{x} \), n, and Med are among the values we get. minX and maxX help you find the range and midrange. The mode must still be done by hand.
TI-84 Fernwood example
Excel Fernwood example
Which measure “best” describes the data? Consider the Fernwood data again. Which one number “best” communicates the housing market? Mean? Median? Mode? Midrange? Experiment with the Excel file to explore more
Which measure “best” describes the “middle” of the data? MODE: If the data is qualitative, use the Mode. Have to use Mode for categorical, non-numerical data! Can’t take the “average” of non-numerical, non-rankable data. MEAN: If the data is quantitative, Mean is Nice. But if there are OUTLIERS, they can distort the mean. Or if the data is greatly SKEWED, the mean might be distorted. MEDIAN: Sometimes Median is better. The Median is more resistant to distortion from outliers. Credit for idea: Hawkes Learning Systems online instruction page
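To see an outlier distort the mean while the median stays put, here is a small Python experiment using the standard `statistics` module. The prices are hypothetical, invented for illustration.

```python
import statistics

# Hypothetical home prices.
typical = [200_000, 210_000, 220_000, 230_000, 240_000]
# Same list, but the most expensive home is replaced by a mansion.
with_outlier = typical[:-1] + [2_400_000]

mean_before = statistics.mean(typical)
median_before = statistics.median(typical)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print(mean_before, median_before)  # both are 220000
print(mean_after, median_after)    # mean jumps to 652000, median is still 220000
```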
Trimmed Mean Sometimes used in cases like Fernwood home prices. We throw out some of the extreme values on either end. Similar effect to the median: it insulates the result from extreme values on the ends. But it keeps the satisfaction of having a mean that lets all the middle numbers have their say.
Trimmed Mean Example “the 20% trimmed mean” We have 15 values in the data set. 20% of 15 = 3: ignore the 3 lowest and the 3 highest.
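The trimming recipe can be sketched in Python as below. It follows this slide's reading of "20%" as the share dropped from each end. The function name `trimmed_mean` and the 15 data values are hypothetical, and rounding the number of dropped values down is an assumption for cases where the percentage does not come out even.

```python
def trimmed_mean(values, percent):
    """Drop `percent` percent of the values from EACH end, then average the rest."""
    ordered = sorted(values)
    k = len(ordered) * percent // 100   # how many to drop per end (rounded down)
    if 2 * k >= len(ordered):
        raise ValueError("trimming would remove every value")
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Hypothetical data set with 15 values and a few extremes on both ends.
data = [1, 2, 3, 10, 11, 12, 13, 14, 15, 16, 17, 18, 40, 50, 60]

print(sum(data) / len(data))   # 18.8  plain mean, pulled up by 40, 50, 60
print(trimmed_mean(data, 20))  # 14.0  mean of the middle 9 values
```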
TI-84 Frequency Distribution You have to use TWO lists: one list for the data values, another list for the frequencies. Then 1-Var Stats L1,L2. Tell it the data value list first, frequency list second.
Frequency Distribution TI-84 for the bowling league
Weighted Mean Each value has a “weight”, because some values are “more important” than others. Those values are “weighted more heavily”, and other values are “less important”.
Your GPA A classic Weighted Mean situation. Columns: Course | Grade (Points) | Credits. Rows: British Literature I | B (3) | 3; U.S. History | A (4); Physics | 4; French II | C (2); Ice Fishing | 1
TI-84 Weighted Mean You have to use TWO lists: first list for the data values (the grades’ values), second list for the weights (how many credits). Then 1-Var Stats L1,L2. Careful! It is easy to mix these up, especially with GPA problems, since both lists are values in the same 0-4 range. (Example: an “A” in 1-credit P.E. would be treated as a “D” in a 4-credit course!)
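Why the list order matters can be shown outside the calculator too. In this Python sketch the function `weighted_mean` stands in for what the two-list calculation does; the grades and credits are hypothetical.

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical: an A (4 points) in a 1-credit course, a C (2 points) in a 3-credit course.
grades = [4, 2]
credits = [1, 3]

correct = weighted_mean(grades, credits)   # (4*1 + 2*3) / (1 + 3)
swapped = weighted_mean(credits, grades)   # (1*4 + 3*2) / (4 + 2)

print(correct)            # 2.5
print(round(swapped, 2))  # 1.67, the wrong GPA
```

The products are the same either way, so the numerator does not change; the mistake shows up in the denominator, which adds up grade points instead of credits.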
TI-84 GPA Example