MATH 1107 Elementary Statistics Lecture 3 Describing and Exploring Data – Central Tendency, Variation and Relative Standing.


1 MATH 1107 Elementary Statistics Lecture 3 Describing and Exploring Data – Central Tendency, Variation and Relative Standing

2 Describing and Exploring Data Commonly, data is described in these terms:
– Central Tendency
– Variation
– Relative Standing
– Outliers

3 Describing data – central tendency The “center” of a data set can be described using two different measures: 1. Mean – the “average” 2. Median – the midpoint

4 Describing data – central tendency The “Mean” or average is the most common measurement of central tendency.
Sample mean: $\bar{x} = \frac{\sum x}{n}$, where $\bar{x}$ is pronounced ‘x-bar’ and denotes the mean of a set of sample values.
Population mean: $\mu = \frac{\sum x}{N}$, where $\mu$ is pronounced ‘mu’ and denotes the mean of all values in a population.

5 Describing data – central tendency The “Median” is the midpoint of all values arranged in descending order.
– If the number of values is odd, the median is the number located in the exact middle of the list.
– If the number of values is even, the median is found by computing the mean of the two middle numbers.

6 Describing data – central tendency Example of Mean Versus Median - Salaries in a small law firm:
45,500 500,000 48,200 62,500
53,400 21,500 62,350 63,500
68,500 52,500 55,400 53,400
72,000 74,600 69,500 38,500
What is the Mean? What is the Median? Which measurement is more appropriate?

7 Describing data – central tendency Where would you expect the “central tendency” to occur?

8 Describing data – central tendency Salaries in a small law firm: Mean = 82,366 Median = 58,875 Which one is “right”? Where else might this scenario occur?
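The comparison can be checked with a short Python sketch using only the standard library. It assumes the 16 salaries exactly as they appear in this transcript; the variable names are illustrative. With those values the median comes out to 58,875, matching the slide, while the mean comes out to 83,834.38 rather than the 82,366 shown, so one of the listed salaries may differ from the original slide.

```python
from statistics import mean, median

# Salaries as listed on the earlier slide (16 values, as transcribed).
salaries = [
    45_500, 500_000, 48_200, 62_500, 53_400, 21_500, 62_350, 63_500,
    68_500, 52_500, 55_400, 53_400, 72_000, 74_600, 69_500, 38_500,
]

salary_mean = mean(salaries)
salary_median = median(salaries)

# Dropping the single 500,000 salary shows how strongly one outlier pulls the mean.
without_outlier = [s for s in salaries if s != 500_000]
mean_without_outlier = mean(without_outlier)

print(f"mean   = {salary_mean:,.2f}")
print(f"median = {salary_median:,.2f}")
print(f"mean without the 500,000 salary = {mean_without_outlier:,.2f}")
```

The median barely reacts to the one very large salary, which is why it is usually the better summary for skewed data such as incomes.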

9 Describing data – central tendency The “Mode” is the most commonly occurring value.
– May or may not exist
– May be bimodal or “multi-modal”
– The only measure of central tendency that can be used with NOMINAL data

10 Describing data – central tendency
1. 5.40 1.10 0.42 0.73 0.48 1.10 – Mode is 1.10
2. 27 27 27 55 55 55 88 88 99 – Bimodal: 27 & 55
3. 1 2 3 6 7 8 9 10 – No Mode
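A minimal Python sketch of the three cases follows. The helper `modes` is a hypothetical name, and treating "every value occurs once" as "no mode" is the convention assumed here; the second data set is taken to contain 27 three times, which is what makes it bimodal.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s); an empty list means no mode."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:  # every value occurs exactly once: treat as "no mode"
        return []
    return sorted(v for v, c in counts.items() if c == top)

a = modes([5.40, 1.10, 0.42, 0.73, 0.48, 1.10])
b = modes([27, 27, 27, 55, 55, 55, 88, 88, 99])
c = modes([1, 2, 3, 6, 7, 8, 9, 10])

print(a, b, c)
```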

11 Describing data – central tendency You can calculate (estimate) the mean from a frequency distribution:
$\bar{x} = \frac{\sum (f \cdot x)}{\sum f}$
Where: $x$ = class midpoint, $f$ = frequency, $\sum f = n$

12 Describing data – central tendency Salaries in a small law firm: Estimated Mean = [(25,000*1)+(35,000*1)+(45,000*2)…(200,000*1)] / 16, or $149,285
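The weighted-midpoint calculation can be sketched in Python as below. The frequency table is hypothetical and chosen only to illustrate the formula; it is not the law firm's table, whose middle classes are not given in this transcript.

```python
# Hypothetical frequency table: (class midpoint x, frequency f).
# These values are illustrative only, not the law-firm data.
table = [(25_000, 1), (35_000, 2), (45_000, 4), (55_000, 2), (65_000, 1)]

n = sum(f for _, f in table)                      # sum of f equals n
estimated_mean = sum(x * f for x, f in table) / n  # sum(f * x) / sum(f)

print(f"n = {n}, estimated mean = {estimated_mean:,.0f}")
```

Because every observation in a class is replaced by the class midpoint, the result is an estimate and will generally differ from the mean of the raw values.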

13 Describing data - Skewness Symmetric – bell-shaped data. Skewed – has a left or right tail.

14 Lecture 3 Continued Describing and Exploring Data – Variation

15 Describing data - Variation Assume you are presented with two lottery opportunities with the following outcomes: A: 0, 0, 0, 1000, 1000, 1000 B: 0, 500, 500, 500, 500, 1000 What are the respective Means? Medians? Modes? Which would you rather play? Why?
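One way to check the answers is a short Python sketch with the standard `statistics` module (`multimode` requires Python 3.8 or later). The variable names are illustrative.

```python
from statistics import mean, median, multimode

lottery_a = [0, 0, 0, 1000, 1000, 1000]
lottery_b = [0, 500, 500, 500, 500, 1000]

summary = {}
for name, outcomes in (("A", lottery_a), ("B", lottery_b)):
    # multimode returns every value tied for the highest frequency
    summary[name] = (mean(outcomes), median(outcomes), multimode(outcomes))
    print(name, summary[name])
```

Both lotteries share the same mean and median, so measures of center alone cannot distinguish them; the difference lies in how spread out the outcomes are.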

16 Describing data - Variation Standard Deviation is a method for describing the “spread” or variation in a data set. Specifically, std measures how far the observations typically lie from the mean:
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$

17 Describing data - Variation Population std is calculated similarly, using different symbols:
$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$

18 Describing data - Variation The Variance of a data set is another measure of dispersion and is simply the square of the std. Notation: $s^2$ = sample variance, $\sigma^2$ = population variance.
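As a sketch, the sample and population versions can be compared in Python on the two lottery outcome lists from the start of this section. `stdev` and `variance` divide by n - 1, while `pstdev` and `pvariance` divide by N; the variable names are illustrative.

```python
from statistics import stdev, pstdev, variance, pvariance

lottery_a = [0, 0, 0, 1000, 1000, 1000]
lottery_b = [0, 500, 500, 500, 500, 1000]

results = {}
for name, outcomes in (("A", lottery_a), ("B", lottery_b)):
    results[name] = {
        "s": stdev(outcomes),         # sample std, divides by n - 1
        "s2": variance(outcomes),     # sample variance
        "sigma": pstdev(outcomes),    # population std, divides by N
        "sigma2": pvariance(outcomes),
    }
    print(name, {k: round(v, 2) for k, v in results[name].items()})
```

Lottery A has the larger standard deviation under either definition, which quantifies the extra risk that the identical means and medians hide.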

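19 Describing data - Variation Many data sets are approximately normally distributed. For such data it is important to understand the empirical rule: about 68% of values fall within 1 standard deviation of the mean, about 95% within 2, and about 99.7% within 3.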

20 Describing data - Variation Height generally follows a normal distribution.
1. If the average height of men is 69” with a std of 2.8”, what percentage of men are between 66.2” and 71.8”?
2. If the average height of women is 64” with a std of 2.5”, what percentage of women are between 59” and 69”?
3. The average height for the Hawks is 78”. How many standard deviations is the average Hawk from the male mean?
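The three questions can be worked through with `statistics.NormalDist` (Python 3.8 or later). This is a sketch that assumes heights are exactly normal with the stated means and standard deviations; note that the first range is the mean plus or minus one std and the second is the mean plus or minus two.

```python
from statistics import NormalDist

men = NormalDist(mu=69, sigma=2.8)
women = NormalDist(mu=64, sigma=2.5)

# 1. Share of men between 66.2" and 71.8" (mean +/- 1 std)
men_share = men.cdf(71.8) - men.cdf(66.2)

# 2. Share of women between 59" and 69" (mean +/- 2 std)
women_share = women.cdf(69) - women.cdf(59)

# 3. Standard deviations between the average Hawk (78") and the male mean
hawks_z = (78 - 69) / 2.8

print(f"men within 1 std:   {men_share:.1%}")
print(f"women within 2 std: {women_share:.1%}")
print(f"Hawks z-score:      {hawks_z:.2f}")
```

The first two results are roughly 68% and 95%, and the average Hawk is a little over 3 standard deviations above the male mean.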

21 Describing data - Variation Chebyshev’s Theorem The proportion (or fraction) of any set of data lying within K standard deviations of the mean is always at least $1 - \frac{1}{K^2}$, where K is any positive number greater than 1.
– For K = 2, at least 3/4 (or 75%) of all values lie within 2 standard deviations of the mean
– For K = 3, at least 8/9 (or 89%) of all values lie within 3 standard deviations of the mean
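The bound is easy to tabulate. The sketch below uses a hypothetical helper, `chebyshev_lower_bound`, that simply evaluates 1 - 1/K^2 and rejects values of K that are not greater than 1.

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of any data set within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k**2

for k in (1.5, 2, 3, 4):
    print(f"K = {k}: at least {chebyshev_lower_bound(k):.1%}")
```

Unlike the percentages quoted for normally distributed data, these are guaranteed minimums that hold for any distribution, so they are deliberately conservative.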

22 Lecture 3 Continued Describing and Exploring Data – Relative Standing

23 Describing data – Relative Standing The pth “percentile” is the value x below which p% of the data set falls; (100 - p)% of the data set falls above x. For example, would you rather score in the 25th percentile or the 80th percentile on a test? What if there are 75 people that take a test and 50 score below you? What percentile are you in?
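The test-score question reduces to one line of arithmetic. The sketch below assumes the simple definition of percentile as the percentage of scores strictly below yours; `percentile_of` is a hypothetical helper name.

```python
def percentile_of(num_below, total):
    """Percentage of the data set that falls below a given value."""
    return 100 * num_below / total

p = percentile_of(50, 75)
print(f"{p:.1f}% of test takers scored below you, about the {round(p)}th percentile")
```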

24 Describing data – Relative Standing In order to assess a particular value, x, in terms of its standard deviations from the mean, regardless of units, we utilize a standardized Z-Score.
Sample: $z = \frac{x - \bar{x}}{s}$
Population: $z = \frac{x - \mu}{\sigma}$

25 Describing data – Relative Standing Application of Z-Score: My 20 month old daughter is 30” tall. My friend’s 20 month old son is 31” tall. Although my daughter is slightly shorter, is she shorter relative to her gender peers than he is relative to his? 20 month old boys average 34”, with std of 2.5”. 20 month old girls average 32”, with std of 2.5”.
Cate: $z = \frac{30 - 32}{2.5} = -0.8$
Joseph: $z = \frac{31 - 34}{2.5} = -1.2$

26 Describing data – Relative Standing At Z-scores of -0.8 and -1.2, respectively, what percentile are Cate and Joseph in for the heights of their respective genders?
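A sketch of the calculation in Python follows. It assumes toddler heights are normally distributed within each gender, so that a z-score can be converted to a percentile with the standard normal CDF; `z_score` is a hypothetical helper name.

```python
from statistics import NormalDist

def z_score(x, mean, std):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std

cate_z = z_score(30, 32, 2.5)    # girls: mean 32", std 2.5"
joseph_z = z_score(31, 34, 2.5)  # boys:  mean 34", std 2.5"

standard_normal = NormalDist()   # mean 0, std 1
cate_pct = 100 * standard_normal.cdf(cate_z)
joseph_pct = 100 * standard_normal.cdf(joseph_z)

print(f"Cate:   z = {cate_z:.1f}, about the {cate_pct:.0f}th percentile")
print(f"Joseph: z = {joseph_z:.1f}, about the {joseph_pct:.0f}th percentile")
```

Under that assumption Cate falls near the 21st percentile for girls and Joseph near the 12th percentile for boys, so Joseph is the shorter of the two relative to his peers even though he is taller in inches.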

