Applying the Normal Distribution: Z-Scores Chapter 3.5 – Tools for Analyzing Data Mathematics of Data Management (Nelson) MDM 4U
Comparing Data
Consider the following two students:
Student 1: MDM 4U, Mr. Lieff, Semester 1, Mark = 84%
Student 2: MDM 4U, Mr. Lieff, Semester 2, Mark = 83%
Can we compare the two students fairly when the mark distributions are different?
Mark Distributions for Each Class
[Figure: histograms of the mark distributions for Semester 1 and Semester 2]
Comparing Distributions
It is difficult to compare two distributions when they have different characteristics. For example, the two histograms have different means and standard deviations. z-scores allow us to make the comparison.
The Standard Normal Distribution
A distribution with a mean of zero and a standard deviation of one: X~N(0,1²)
Each element of any normal distribution can be translated to the same place on a Standard Normal Distribution using the z-score of the element.
The z-score is the number of standard deviations the piece of data is below or above the mean: z = (x – mean) / standard deviation
If the z-score is positive, the data lies above the mean; if negative, below.
Standardizing
The process of reducing a normal distribution to the standard normal distribution N(0,1²) is called standardizing. Remember that a standardized normal distribution has a mean of 0 and a standard deviation of 1.
Example 1
For the distribution X~N(10,2²), determine the number of standard deviations each value lies above or below the mean:
a. x = 7: z = (7 – 10)/2 = –1.5, so 7 is 1.5 standard deviations below the mean
b. x = 18.5: z = (18.5 – 10)/2 = 4.25, so 18.5 is 4.25 standard deviations above the mean (anything beyond 3 standard deviations from the mean is an outlier)
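The calculation in this example can be sketched in Python. This is a minimal illustration rather than part of the original slides; the function name `z_score` is an assumption made for the example.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


# X ~ N(10, 2^2), the distribution used in Example 1
for x in (7, 18.5):
    z = z_score(x, mean=10, sd=2)
    side = "above" if z > 0 else "below"
    print(f"x = {x}: z = {z} ({abs(z)} standard deviations {side} the mean)")
```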
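Example continued…
[Figure: normal curve with 34%, 13.5% and 2.35% of the data in the first, second and third standard-deviation bands on each side of the mean; 95% of the data lies within 2 standard deviations and 99.7% within 3]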
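The percentages on this slide are the rounded values of the 68-95-99.7 rule (for instance, 2.35% is half of 99.7% minus 95%). As a rough check, the sketch below computes the corresponding areas with Python's standard-library `statistics.NormalDist`; the helper names `share_within` and `share_in_band` are assumptions for this example, and the computed values differ slightly from the rounded ones on the slide.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Fraction of a normal population within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


def share_in_band(lower_z, upper_z):
    """Fraction of a normal population with a z-score between lower_z and upper_z."""
    return standard_normal.cdf(upper_z) - standard_normal.cdf(lower_z)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.2%}")

for lower_z in (0, 1, 2):
    print(f"band {lower_z} to {lower_z + 1} SD: {share_in_band(lower_z, lower_z + 1):.2%}")
```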
Standard Deviation
A recent math quiz offered the following data: mean = 68.0, standard deviation = 10.9
The z-scores offer a way to compare scores among members of the class, find out how many had a mark greater than yours, indicate position in the class, etc.
Example 2: Suppose your mark was 64
Compare your mark to the rest of the class.
z = (64 – 68.0)/10.9 = –0.37
Using the z-score table on page 398, we get 0.3557, or 35.6%.
So 35.6% of the class has a mark less than or equal to yours.
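A z-score table lists values of the standard normal cumulative distribution. The sketch below reproduces the lookup with `math.erf` from the Python standard library; the function name `phi` is an assumption for this example, and z is rounded to two decimals only to mimic a printed table.

```python
import math


def phi(z):
    """Standard normal cumulative distribution: fraction of data at or below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mean, sd = 68.0, 10.9            # quiz statistics from the slides
z = round((64 - mean) / sd, 2)   # two decimals, as a printed z-table requires
share_below = phi(z)
print(f"z = {z}, share at or below = {share_below:.1%}")
```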
Example 3: Percentiles
The kth percentile is the data value that is greater than k% of the population.
If another student has a mark of 75, what percentile is this student in?
z = (75 – 68.0)/10.9 = 0.64
From the table on page 398 we get 0.7389, or 73.9%, so the student is in the 74th percentile: their mark is greater than 74% of the others.
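As a cross-check on the table lookup, a percentile can be computed directly from the mark, without rounding z first. This is a minimal sketch assuming the quiz marks are modelled as N(68.0, 10.9²); the names `quiz` and `percentile_rank` are assumptions for the example.

```python
from statistics import NormalDist

# Quiz marks modelled as a normal distribution with the mean and SD from the slides
quiz = NormalDist(mu=68.0, sigma=10.9)


def percentile_rank(mark):
    """Percentage of the modelled population with a mark at or below `mark`."""
    return 100 * quiz.cdf(mark)


print(f"A mark of 75 is at roughly the {round(percentile_rank(75))}th percentile")
```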
Example 4: Ranges
Now find the percent of data between a mark of 60 and 80.
For 60: z = (60 – 68)/10.9 = –0.73, which gives 23.3%
For 80: z = (80 – 68)/10.9 = 1.10, which gives 86.4%
86.4% – 23.3% = 63.1%
So 63.1% of the class is between a mark of 60 and 80.
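The subtraction of two cumulative percentages can be sketched as follows, again assuming the marks are modelled as N(68.0, 10.9²). The helper name `share_between` is an assumption for this example. Because z is not rounded to two decimals here, the result can differ from the table-based answer by a fraction of a percentage point.

```python
from statistics import NormalDist

quiz = NormalDist(mu=68.0, sigma=10.9)


def share_between(low, high, dist=quiz):
    """Fraction of the modelled population with a value between low and high."""
    return dist.cdf(high) - dist.cdf(low)


print(f"Share of marks between 60 and 80: {share_between(60, 80):.1%}")
```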
Back to the two students...
[Figure: z-score calculations for Student 1 and Student 2]
Student 2 has the lower mark, but a higher z-score!
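The comparison can be sketched as below. The two marks (84 and 83) come from the slides, but the class means and standard deviations did not survive in this text, so the values used here are purely hypothetical placeholders chosen to illustrate how a lower mark can carry a higher z-score.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


# Marks are from the slides. The class means and standard deviations are
# HYPOTHETICAL placeholders, not the real statistics for either semester.
student_1 = z_score(84, mean=74.0, sd=8.0)   # Semester 1
student_2 = z_score(83, mean=70.0, sd=8.0)   # Semester 2

print(f"Student 1: z = {student_1}")
print(f"Student 2: z = {student_2}")
print("Student 2 ranks higher within their class:", student_2 > student_1)
```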
Exercises read through the examples on pages try page 186 #2-5, 7, 8, 10
Mathematical Indices Chapter 3.6 – Tools for Analyzing Data Mathematics of Data Management (Nelson) MDM 4U
What is an Index?
An index is an arbitrarily defined number that provides a measure of scale. Indices are used to indicate a value so that we can make comparisons, but they do not represent an actual measurement or quantity.
1) BMI – Body Mass Index
A mathematical formula created to determine whether a person's mass puts them at risk for health problems.
BMI = m / h², where m = mass (kg), h = height (m)
Standard / Metric BMI Calculator
Underweight: Below 18.5
Normal: 18.5 to 24.9
Overweight: 25.0 to 29.9
Obese: 30.0 and Above
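A minimal sketch of the BMI calculation, using the standard definition of mass in kilograms divided by the square of height in metres. The function names and the sample person are assumptions for illustration. The cutoffs of 18.5 and 30.0 appear on the slide; the 25.0 boundary between Normal and Overweight is the commonly used cutoff and is assumed here.

```python
def bmi(mass_kg, height_m):
    """Body Mass Index: mass in kilograms divided by height in metres squared."""
    return mass_kg / height_m ** 2


def bmi_category(value):
    """Classify a BMI value; the 25.0 cutoff is the commonly used one (assumed)."""
    if value < 18.5:
        return "Underweight"
    if value < 25.0:
        return "Normal"
    if value < 30.0:
        return "Overweight"
    return "Obese"


# Hypothetical person: 70 kg, 1.75 m tall
value = bmi(70, 1.75)
print(f"BMI = {value:.1f} ({bmi_category(value)})")
```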
2) Slugging Percentage
Baseball is one of the most statistically analyzed sports in the world. A number of indices are used to measure the value of a player.
Batting Average (AVG) measures how often a player gets a hit (hits / at bats).
Slugging percentage (SLG) also takes into account the number of bases that a player earns (total bases / at bats).
SLG = TB / AB, where TB = 1B + 2B*2 + 3B*3 + HR*4 and 1B = singles, 2B = doubles, 3B = triples, HR = home runs, AB = at bats
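Slugging Percentage Example
e.g. DH Frank Thomas, Toronto Blue Jays
Statistics: 466 AB, 126 H, 11 2B, 0 3B, 39 HR
Because H counts every hit once (H = 1B + 2B + 3B + HR), total bases can be written as TB = H + 2B + 2*3B + 3*HR
SLG = (H + 2B + 2*3B + 3*HR) / AB = (126 + 11 + 2*0 + 3*39) / 466 = 254 / 466 = 0.545 (3 decimal places)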
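The slugging percentage calculation can be sketched in Python as follows. The function names are assumptions for this example; singles are derived from hits because the slide lists total hits rather than singles.

```python
def total_bases(hits, doubles, triples, home_runs):
    """Total bases: 1 per single, 2 per double, 3 per triple, 4 per home run."""
    singles = hits - doubles - triples - home_runs
    return singles + 2 * doubles + 3 * triples + 4 * home_runs


def slugging(at_bats, hits, doubles, triples, home_runs):
    """Slugging percentage: total bases divided by at bats."""
    return total_bases(hits, doubles, triples, home_runs) / at_bats


# Statistics from the slide: 466 AB, 126 H, 11 2B, 0 3B, 39 HR
slg = slugging(at_bats=466, hits=126, doubles=11, triples=0, home_runs=39)
print(f"SLG = {slg:.3f}")
```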
Moving Average
Used when time-series data show a great deal of fluctuation (e.g. the long-term trend of a stock).
Takes the average of the previous n values.
e.g. 5-Day Moving Average: it cannot be calculated until the 5th day; the value for Day 5 is the average of Days 1-5; the value for Day 6 is the average of Days 2-6.
e.g. Look up a stock symbol on a stock-charting website, then click Charts > Technical chart > n-Day Moving Average
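A minimal sketch of an n-day moving average in Python. The function name `moving_average` is an assumption, and the closing prices are made-up values used only to show the windowing.

```python
def moving_average(values, n):
    """Return the n-value moving averages.

    The first result is the average of values 1..n, the second of values
    2..n+1, and so on, so the result has len(values) - n + 1 entries
    (none if there are fewer than n values).
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return [sum(values[i - n:i]) / n for i in range(n, len(values) + 1)]


# Hypothetical closing prices for Days 1-7 (illustrative values only)
closing_prices = [10.0, 10.5, 9.8, 10.2, 10.6, 11.0, 10.4]

for day, average in enumerate(moving_average(closing_prices, 5), start=5):
    print(f"Day {day}: 5-day moving average = {average:.2f}")
```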
Exercises read pp a (odd), 2-3 ac, 4 (alt: calculate SLG for 3 players on your favourite team for 2007), 8, 9, 11
References
Halls, S. (2004). Body Mass Index Calculator. Retrieved October 12, 2004 from
Wikipedia (2004). Online Encyclopedia. Retrieved September 1, 2004 from