Chapter 3: Descriptive Measures STP 226: Elements of Statistics Jenifer Boshes Arizona State University.

1 Chapter 3: Descriptive Measures STP 226: Elements of Statistics Jenifer Boshes Arizona State University

2 3.1: Measures of Center

3 Mean The mean (average) of a data set is the sum of the observations divided by the number of observations.

4 Example 1: The following data set is comprised of a set of homework grades. Find the mean homework grade. 93 87 90 90 82 85 88 90 93 83 90 Interpret: Example 2: The following data set is comprised of the lengths of a rare orchid (in inches). Find the mean orchid length. 13 18 14.5 14 15 14 Interpret:
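As a minimal sketch of the definition of the mean, the two data sets from Examples 1 and 2 can be run through a small helper. The function name `mean` and the variable names are illustrative choices, not part of the slides.

```python
def mean(values):
    """Return the sum of the observations divided by the number of observations."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Data copied from Examples 1 and 2.
homework = [93, 87, 90, 90, 82, 85, 88, 90, 93, 83, 90]
orchids = [13, 18, 14.5, 14, 15, 14]

print(round(mean(homework), 2))  # 88.27 (971 / 11)
print(round(mean(orchids), 2))   # 14.75 (88.5 / 6), in inches
```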

5 Median To find the median of a data set: Arrange the data in increasing order. If the number of observations is odd, then the median is the observation exactly in the middle. If the number of observations is even, the median is the mean of the two middle observations in the ordered list. Example 3: Find the median homework score. 93 87 90 90 82 85 88 90 93 83 90 Example 4: Find the median orchid length. 13 18 14.5 14 15 14 Interpret:
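The median procedure above can be sketched in a few lines. This is one possible implementation under the slide's odd/even rule; the function name `median` is an illustrative choice.

```python
def median(values):
    """Median by the slide's rule: sort, then take the middle observation
    (odd count) or the mean of the two middle observations (even count)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Data copied from Examples 3 and 4.
homework = [93, 87, 90, 90, 82, 85, 88, 90, 93, 83, 90]
orchids = [13, 18, 14.5, 14, 15, 14]

print(median(homework))  # 90 (11 observations, the 6th in sorted order)
print(median(orchids))   # 14.25 (mean of 14 and 14.5)
```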

6 Mode The mode of a data set is the value that occurs with the greatest frequency. First, find the frequency of each value in the data set. If no value occurs more than once, there is no mode. Otherwise, any value that occurs with the greatest frequency is a mode. Example 5: Find the mode homework score. 93 87 90 90 82 85 88 90 93 83 90 Example 6: Find the mode orchid length. 13 18 14.5 14 15 14 Interpret:
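The mode rule can be sketched with a frequency count. The helper below is an illustration only; it returns a list because a data set may have several modes, and an empty list when no value repeats.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the greatest frequency,
    or an empty list when no value occurs more than once."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # no value repeats, so there is no mode
    return sorted(v for v, c in counts.items() if c == highest)


# Data copied from Examples 5 and 6.
homework = [93, 87, 90, 90, 82, 85, 88, 90, 93, 83, 90]
orchids = [13, 18, 14.5, 14, 15, 14]

print(modes(homework))  # [90]  (90 occurs four times)
print(modes(orchids))   # [14]  (14 occurs twice)
```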

7 Example 7: Find the mean, median, and mode of each of the data sets. Data Set I: 45907888 68168649 88868276 Data Set II: 619982728096 787766

8 Skewed vs. Symmetric (a) Right skewed: The mean is to the right of the median. (b) Symmetric: The mean is equal to the median. (c) Left skewed: The mean is to the left of the median.

9 When to use each… Median: Use the median when your data set has very extreme values. A resistant (or robust) measure is not sensitive to the influence of a few extreme observations. Mode: Use the mode when you have qualitative data.

10 Sample Mean

11 Example 8: The exam scores for a student are: 61, 97, 78, 86, and 73. (a) Use mathematical notation to represent the individual exam scores. (b) Use summation notation to express the sum of the five exam scores. (c) Find the sample mean x̄ for the exam data.
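One way to write out Example 8, assuming the usual textbook notation in which x_i is the i-th observation, n is the sample size, and x̄ is the sample mean (this notation is an assumption, since the formula itself is not shown in the transcript):

```latex
\begin{align*}
x_1 &= 61,\quad x_2 = 97,\quad x_3 = 78,\quad x_4 = 86,\quad x_5 = 73 \\
\sum_{i=1}^{5} x_i &= 61 + 97 + 78 + 86 + 73 = 395 \\
\bar{x} &= \frac{\sum_{i=1}^{n} x_i}{n} = \frac{395}{5} = 79
\end{align*}
```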

12 3.2: Measures of Variation

13 Example 1: The exam scores for student A are: 100, 100, 90, 90, and 70. The exam scores for student B are: 90, 88, 88, 93, and 91. Compare the means and medians. Who is the better student? Who is more consistent?

14 Range
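The transcript does not show the formula for the range, so the sketch below assumes the standard definition, Range = Max – Min, and applies it to the two students from Example 1.

```python
def data_range(values):
    """Range under the standard definition: largest minus smallest observation."""
    if not values:
        raise ValueError("the range of an empty data set is undefined")
    return max(values) - min(values)


# Exam scores copied from Example 1 of this section.
student_a = [100, 100, 90, 90, 70]
student_b = [90, 88, 88, 93, 91]

print(data_range(student_a))  # 30
print(data_range(student_b))  # 5
```

Both students have the same mean and median (90), so the range is the first measure here that separates them.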

15 Standard Deviation The standard deviation measures variation by indicating, on average, how far the observations are from the mean.

16 Sample Standard Deviation 1. For each observation, calculate the deviation from the mean. 2. Square each deviation. 3. Add up the squares. 4. Divide by n – 1. 5. Take the square root.
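The five steps can be followed literally in code. This is a minimal sketch with illustrative names, applied to the two students' exam scores from Example 1; rounding is deferred to the final print.

```python
import math


def sample_std(values):
    """Sample standard deviation, following the five steps on the slide."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean = sum(values) / n
    deviations = [x - mean for x in values]   # step 1: deviations from the mean
    squares = [d ** 2 for d in deviations]    # step 2: square each deviation
    total = sum(squares)                      # step 3: add up the squares
    variance = total / (n - 1)                # step 4: divide by n - 1
    return math.sqrt(variance)                # step 5: take the square root


student_a = [100, 100, 90, 90, 70]
student_b = [90, 88, 88, 93, 91]

print(round(sample_std(student_a), 2))  # 12.25 (square root of 600 / 4)
print(round(sample_std(student_b), 2))  # 2.12  (square root of 18 / 4)
```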

17 Example 2: Find the standard deviation for student A: 100, 100, 90, 90, and 70. 1. For each observation, calculate the deviation from the mean. 2. Square each deviation. 3. Add up the squares. 4. Divide by n – 1. 5. Take the square root.

18 Example 3: Find the standard deviation for student B: 90, 88, 88, 93, and 91. 1. For each observation, calculate the deviation from the mean. 2. Square each deviation. 3. Add up the squares. 4. Divide by n – 1. 5. Take the square root. What can we say about the relative performance of students A and B?

19 Comments on Standard Deviation s² is called the sample variance. The units of s² are the square of the original units. The units of s are the same as the original units. s is ALWAYS ≥ 0. Why? s measures how far the observations deviate from the mean, and as the square root of a sum of squares it cannot be negative. Do not perform any rounding until the computation is complete; otherwise, substantial roundoff error can result. Almost all the observations in any data set lie within three standard deviations to either side of the mean. This follows from Chebyshev’s Rule, which guarantees that at least 8/9 (about 89%) of the observations lie within three standard deviations of the mean.

20 Example 4: How many observations for student B are within one standard deviation of the mean? How many observations for student B are within two standard deviations of the mean? How many observations for student B are within three standard deviations of the mean?
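A quick way to check these counts is to test each observation against the interval mean ± k·s. The sketch below uses Python's standard `statistics` module for the mean and the sample standard deviation; the helper name `count_within` is illustrative.

```python
import statistics


def count_within(values, k):
    """Count observations lying within k sample standard deviations of the mean."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    return sum(1 for x in values if abs(x - mean) <= k * s)


student_b = [90, 88, 88, 93, 91]

for k in (1, 2, 3):
    print(k, count_within(student_b, k))
# 1 4   (93 is 3 points from the mean, more than s, which is about 2.12)
# 2 5
# 3 5
```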

21 3.3: The Five-Number Summary; Boxplots

22 Robustness Recall: What does it mean for a statistic to be robust? Name a statistic that is not robust. Name a statistic that is robust.

23 Quartiles Quartiles divide a data set into quarters. Q1, Q2, and Q3 are the three quartiles. The second quartile (Q2) is the median of the entire data set. The first quartile (Q1) is the median of the portion of the data set that lies at or below Q2. The third quartile (Q3) is the median of the portion of the data set that lies at or above Q2.
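A minimal sketch of this quartile definition follows. It assumes "at or below Q2" is read by position in the ordered list, so the middle observation belongs to both halves when the number of observations is odd; other textbooks and software use different conventions and can give different quartiles. The exam scores from Example 8 of Section 3.1 serve as sample data.

```python
def _median(ordered):
    """Median of an already sorted, non-empty list."""
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, Q2, Q3) using medians of the lower and upper halves."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("quartiles of an empty data set are undefined")
    half = (n + 1) // 2            # includes the middle observation when n is odd
    lower = ordered[:half]         # observations at or below Q2 (by position)
    upper = ordered[n - half:]     # observations at or above Q2 (by position)
    return _median(lower), _median(ordered), _median(upper)


exam_scores = [61, 97, 78, 86, 73]
print(quartiles(exam_scores))  # (73, 78, 86)
```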

24 Example 1: Fifteen people were asked how many baseball games they had attended the previous season. Find the quartiles. 1225861 04219170 6314223134 1. Order the data. 2. Find the median of the data set. This is Q2. 3. Find the median of the data that lies at or below the median of the entire data set. This is Q1. 4. Find the median of the data that lies at or above the median of the entire data set. This is Q3.

25 Interquartile Range (IQR) The IQR is the difference between the first and third quartiles; that is, IQR = Q3 – Q1. It is the preferred measure of variation when the median is used as the measure of center. Like the median, the IQR is a resistant or robust measure.

26 Example 2: What is the IQR for the baseball data? Interpret:

27 Five-Number Summary Min, Q1, Q2, Q3, Max
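The five-number summary can be assembled as below. This sketch assumes the position-based reading of the quartile definition (the middle observation counts in both halves when the number of observations is odd) and uses student B's exam scores from Section 3.2 as sample data; the function and key names are illustrative.

```python
def _median(ordered):
    """Median of an already sorted, non-empty list."""
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2


def five_number_summary(values):
    """Return Min, Q1, Q2, Q3, and Max as a dictionary."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("an empty data set has no five-number summary")
    half = (n + 1) // 2
    return {
        "Min": ordered[0],
        "Q1": _median(ordered[:half]),
        "Q2": _median(ordered),
        "Q3": _median(ordered[n - half:]),
        "Max": ordered[-1],
    }


student_b = [90, 88, 88, 93, 91]
summary = five_number_summary(student_b)
print(summary)                        # {'Min': 88, 'Q1': 88, 'Q2': 90, 'Q3': 91, 'Max': 93}
print(summary["Q3"] - summary["Q1"])  # IQR = 3
```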

28 Example 3: Find the five-number summary for the baseball data.

29 Outliers Outliers are observations that fall well outside the overall pattern of the data. They may result from a recording error, obtaining an observation from a different population, or an unusual extreme value.

30 Lower and Upper Limits Lower limit: Q1 – 1.5 · IQR. Upper limit: Q3 + 1.5 · IQR. Observations that lie outside the upper and lower limits – either below the lower limit or above the upper limit – are potential outliers.
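The limits translate directly into code once Q1 and Q3 are known. The sketch below takes the quartiles as inputs; the data set and its quartiles (Q1 = 4.5, Q3 = 7.5) are hypothetical values made up for illustration, not taken from the slides.

```python
def outlier_limits(q1, q3):
    """Return (lower limit, upper limit) = (Q1 - 1.5*IQR, Q3 + 1.5*IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def potential_outliers(values, q1, q3):
    """Return the observations below the lower limit or above the upper limit."""
    lower, upper = outlier_limits(q1, q3)
    return [x for x in values if x < lower or x > upper]


# Hypothetical data: Q2 = 6, Q1 = median of 2, 4, 5, 6 = 4.5, Q3 = median of 6, 7, 8, 30 = 7.5.
data = [2, 4, 5, 6, 7, 8, 30]

print(outlier_limits(4.5, 7.5))            # (0.0, 12.0)
print(potential_outliers(data, 4.5, 7.5))  # [30]
```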

31 Example 4: For the baseball data: (a) Obtain the lower and upper limits. (b) Determine the potential outliers, if any. (c) Construct a modified boxplot. Adjacent values of a set are the most extreme observations that are not potential outliers. 1225861 04219170 6314223134

32 Steps for Constructing a Modified Boxplot

33 Steps for Constructing a Boxplot

34 Boxplots Boxplots are useful for comparing two or more data sets. Notice how box width and whisker length relate to skewness and symmetry.

35 Bibliography Some of the textbook images embedded in the slides were taken from: Elementary Statistics, Sixth Edition; by Weiss; Addison Wesley Publishing Company Copyright © 2005, Pearson Education, Inc.

