Lesson 1: Summarizing and Interpreting Data

Presentation transcript:

Introduction Our daily lives often involve a great deal of data, or numbers in context. It is important to understand how data is found, what it means, and how the information is used. The focus of this lesson is on how to calculate and understand statistics—the numbers that summarize, describe, or represent sets of data. 1.1.1: Describing Data Sets

Key Concepts Data can be described, summarized, and graphed in a variety of ways. We can represent a data set using a measure of center. Measures of Center A measure of center is a single number used to represent the middle value, expected value, or most typical value of a data set. Two commonly used measures of center are the median and the mean. 1.1.1: Describing Data Sets

Key Concepts, continued The median is the middle-most value of a data set; 50% of the data is less than this value, and 50% is greater than it. To find the median, arrange the data values from least to greatest. The median is the middle value in an ordered data set if the number of data values is odd. If the data set contains an even number of values, the median is the average of the two middle numbers. 1.1.1: Describing Data Sets

Key Concepts, continued The mean is found by adding the values in a data set and then dividing the sum by the number of values in the data set. It is also considered the average of all the values in a data set. The mean can be found using the formula $\bar{x} = \frac{\sum x_i}{n}$, where $\bar{x}$ (pronounced “x bar”) represents the mean and n is the number of data values.

Key Concepts, continued $\Sigma$ is the uppercase Greek letter sigma, and is used to represent a sum. So, $\sum x_i$ represents the sum of the n data values in the data set: $\sum x_i = x_1 + x_2 + \cdots + x_n$.
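The two measures of center described above can be sketched in a few lines of Python. This is only an illustration: the function names `mean_of` and `median_of` and the sample values are assumptions made for the example, not part of the lesson.

```python
def mean_of(values):
    """Add the values, then divide the sum by the number of values."""
    return sum(values) / len(values)


def median_of(values):
    """Middle value of the ordered data; average the two middle values when n is even."""
    ordered = sorted(values)          # the data must be ordered first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: a single middle value exists
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


sample = [8, 3, 6, 5, 8]              # illustrative values
print(mean_of(sample))                # 6.0
print(median_of(sample))              # 6
```

Sorting inside `median_of` guards against the common mistake of picking the middle value of unordered data.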

Key Concepts, continued The Five-Number Summary The five-number summary of a data set consists of the following key numbers: the minimum, the first quartile (Q1), the median (Q2), the third quartile (Q3), and the maximum. The minimum is the smallest value in the data set and the maximum is the largest value in the data set. The median, also known as the second quartile, is represented by Q2. 1.1.1: Describing Data Sets

Key Concepts, continued When the data values are ordered from least to greatest, the first quartile, Q1, is the value that identifies the lower 25% of the data. It is also the median of the lower half of the data set; 75% of all data is greater than this value. The third quartile, Q3, is the value that identifies the upper 25% of the data. It is also the median of the upper half of the data set; 75% of all data is less than this value. 1.1.1: Describing Data Sets
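A minimal sketch of the five-number summary, assuming the "median of each half" rule described above (the middle value is left out of both halves when the number of values is odd). The function names are hypothetical, and statistical software often uses other quartile conventions, so results from a library may differ slightly. The data are the quiz times from Guided Practice Example 1.

```python
def median_of_sorted(ordered):
    """Median of a list that is already in order."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves rule."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]            # values to the left of the median
    upper = ordered[half + n % 2:]    # values to the right; skips the middle value when n is odd
    return (
        ordered[0],
        median_of_sorted(lower),
        median_of_sorted(ordered),
        median_of_sorted(upper),
        ordered[-1],
    )


quiz_minutes = [9, 13, 10, 10, 2, 11, 2, 11, 11, 12]
print(five_number_summary(quiz_minutes))   # (2, 9, 10.5, 11, 13)
```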

Key Concepts, continued Measures of Spread or Variability A measure of spread is a number used to describe how far apart certain key values are from each other, or how far a typical value is from the mean of a data set. Measures of spread are also known as measures of variability. The most common measures of spread are the range, interquartile range, and standard deviation. The range is the difference from the minimum to the maximum in a data set; that is, range = maximum – minimum. The range describes the spread of the entire data set. 1.1.1: Describing Data Sets

Key Concepts, continued The interquartile range, IQR, is the difference from the first quartile to the third quartile: IQR = Q3 – Q1. The interquartile range describes the spread of the middle “half” of the data set. Note: In some cases, the data values between Q1 and Q3 do not form exactly half the data set. But data sets often have many values, and in those cases the middle “half” is very close to half, so the distinction is not important. For example, if a data set has 1,001 values, then the middle “half” has 501 values, which is approximately 50.05% of the data set.

Key Concepts, continued The mean absolute deviation, MAD, is the average absolute value of the difference between each data point in a data set and the mean. It is found by summing the absolute value of each difference (or deviation from the mean), then dividing the sum by the total number of data points. The formula for mean absolute deviation is $\text{MAD} = \frac{\sum |x_i - \bar{x}|}{n}$, where $\bar{x}$ is the mean and n is the number of data values.

Key Concepts, continued Shown in expanded form, the formula looks like this: $\text{MAD} = \frac{|x_1 - \bar{x}| + |x_2 - \bar{x}| + \cdots + |x_n - \bar{x}|}{n}$. Consider this data set: 3, 5, 6, 8, 8. The mean is 6: $\bar{x} = \frac{3 + 5 + 6 + 8 + 8}{5} = \frac{30}{5} = 6$. Use the mean to find the mean absolute deviation by substituting each of the values in the data set for $x_i$ and 6 for $\bar{x}$, as shown on the next slide.

Key Concepts, continued $\text{MAD} = \frac{|3 - 6| + |5 - 6| + |6 - 6| + |8 - 6| + |8 - 6|}{5} = \frac{3 + 1 + 0 + 2 + 2}{5} = \frac{8}{5} = 1.6$. The mean absolute deviation is 1.6.
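The mean absolute deviation for the data set 3, 5, 6, 8, 8 can be checked with a short Python sketch. The function name `mean_absolute_deviation` is an assumption made for this illustration.

```python
def mean_absolute_deviation(values):
    """Average distance of the data values from their mean."""
    n = len(values)
    mean = sum(values) / n
    # Sum the absolute value of each deviation, then divide by n.
    return sum(abs(x - mean) for x in values) / n


print(mean_absolute_deviation([3, 5, 6, 8, 8]))   # 1.6
```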

Key Concepts, continued The lowercase Greek letter sigma, σ, is used in two measures of spread, or variability: variance and standard deviation. The variance, σ², is a measure of spread, or variability; it is the average of the squares of the deviations of all the data values in a data set from the mean. The variance is found using the formula $\sigma^2 = \frac{\sum (x_i - \bar{x})^2}{n}$, where $\bar{x}$ is the mean and n is the number of data values.

Key Concepts, continued Shown in expanded form, the formula looks like this: $\sigma^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n}$. Consider the same data set as before: 3, 5, 6, 8, 8, with a mean of 6. Find the variance by substituting each of the values in the data set for $x_i$ and 6 for $\bar{x}$, as shown on the next slide.

Key Concepts, continued $\sigma^2 = \frac{(3 - 6)^2 + (5 - 6)^2 + (6 - 6)^2 + (8 - 6)^2 + (8 - 6)^2}{5} = \frac{9 + 1 + 0 + 4 + 4}{5} = \frac{18}{5} = 3.6$. The variance is 3.6.

Key Concepts, continued The standard deviation, σ, is another measure of spread, or variability; it is the square root of the variance, that is, the square root of the average squared difference from the mean. It is denoted by the lowercase Greek letter sigma, σ. The standard deviation is found using the formula $\sigma = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n}}$, where $x_i$ is a data point, $\bar{x}$ is the mean, and n is the number of data values.

Key Concepts, continued Shown in expanded form, the formula looks like this: $\sigma = \sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n}}$. Consider the same data set as earlier: 3, 5, 6, 8, 8. The variance, found previously, is 3.6. Take the square root of the variance to find the standard deviation: $\sigma = \sqrt{3.6}$, so σ ≈ 1.897.
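The variance and standard deviation of the data set 3, 5, 6, 8, 8 can be reproduced with Python's standard `statistics` module. The lesson divides by n, which corresponds to the population functions `pvariance` and `pstdev`; the sample versions `variance` and `stdev` divide by n – 1 and give different results. The variable names are assumptions for this sketch.

```python
import statistics

data = [3, 5, 6, 8, 8]

variance = statistics.pvariance(data)   # population variance: sum of squared deviations / n
std_dev = statistics.pstdev(data)       # square root of the population variance

print(variance)                         # 3.6
print(round(std_dev, 3))                # 1.897

# For comparison only: the sample variance divides by n - 1 instead of n.
print(statistics.variance(data))        # 4.5
```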

Key Concepts, continued The standard deviation describes how much the data values vary, or deviate, from the mean. That is, it describes the deviation of a typical data value from the mean. When the mean is used as the measure of center, the standard deviation should be used as a measure of spread. 1.1.1: Describing Data Sets

Key Concepts, continued Outliers and Extreme Values An outlier is a data value that is much less or much greater than most of the values in the data set. A data value is an outlier if it is less than Q1 – 1.5(IQR) or if it is greater than Q3 + 1.5(IQR). An extreme value is a data value that seems to be much less or much greater than most of the other data values. Note: All outliers are extreme values, but not all extreme values are outliers. 1.1.1: Describing Data Sets
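The 1.5(IQR) rule can be sketched as follows. The function names `outlier_fences` and `find_outliers`, and the sample values with their quartiles, are assumptions made for illustration; the quartiles are passed in rather than computed so that the sketch does not depend on a particular quartile convention.

```python
def outlier_fences(q1, q3):
    """Return the lower and upper limits Q1 - 1.5(IQR) and Q3 + 1.5(IQR)."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, q1, q3):
    """List every value that lies strictly outside the limits."""
    low, high = outlier_fences(q1, q3)
    return [x for x in values if x < low or x > high]


# Illustrative data; Q1 = 10.5 and Q3 = 13.5 are the medians of the lower and upper halves.
values = [1, 10, 11, 12, 12, 13, 14, 30]
print(outlier_fences(10.5, 13.5))          # (6.0, 18.0)
print(find_outliers(values, 10.5, 13.5))   # [1, 30]
```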

Key Concepts, continued The term “extreme value” is less precise than the term “outlier” because there is no rule for identifying extreme values; they are a matter of opinion. Nevertheless, extreme values can affect the choices of measures of center and spread. Extreme values that are not outliers are those values that fall within the limits discussed previously for outliers. When there are no outliers or other extreme data values, the mean is generally a better measure of center than the median. 1.1.1: Describing Data Sets

Key Concepts, continued When there is an outlier, or in some cases one or more other extreme values, the median is generally a better measure of center than the mean. 1.1.1: Describing Data Sets

Key Concepts, continued Box Plots and Dot Plots A box plot is a graph that shows the five-number summary of a data set. 1.1.1: Describing Data Sets

Key Concepts, continued The vertical line segment inside the box in a box plot represents the median (Q2). The length of the box in a box plot is the interquartile range (IQR). A dot plot is a graph that uses dots to show the number of times each value in a data set appears in that data set. 1.1.1: Describing Data Sets

Key Concepts, continued The mean is the balance point on the dot plot of any data set; that is, if the dots were weights on a scale, the mean would be the point at which the scale would be balanced, or level. A data distribution is an arrangement of data values. When the data values are displayed in a dot plot, the distribution might have a shape that can be named. Two shapes of particular interest are symmetric and skewed. 1.1.1: Describing Data Sets

Key Concepts, continued In a symmetric distribution, a line can be drawn so that the left and right sides are mirror images of each other, as shown. In a skewed distribution, most of the data values are concentrated on one side of the median. 1.1.1: Describing Data Sets

Key Concepts, continued A distribution in which there is a “tail” of isolated, spread-out data points to the right of the median is called skewed to the right. (“Tail” describes the visual appearance of the data points.) Data that is skewed to the right is also called positively skewed. 1.1.1: Describing Data Sets

Key Concepts, continued A distribution is skewed to the right if most of the data values are concentrated on the left. That is, many of the values are clustered on the left side of the distribution, and few values are on the right side (creating the “tail”). There may be one or more outliers or other extreme values on the right. 1.1.1: Describing Data Sets

Key Concepts, continued A distribution in which there is a tail to the left of the median is called skewed to the left. Data that is skewed to the left is also called negatively skewed. 1.1.1: Describing Data Sets

Key Concepts, continued A distribution is skewed to the left if most of the data values are concentrated on the right. That is, many of the values are clustered on the right side of the distribution, and few values are on the left side (creating the “tail”). There may be one or more outliers or other extreme values on the left. 1.1.1: Describing Data Sets

Key Concepts, continued Representing a Given Data Set Accurately It is not always obvious how to choose the most appropriate measures of center and spread as well as the most appropriate graph for a data set. Furthermore, it is not always clear that one particular choice is better than another. Use the table on the next slide to help guide your decisions. 1.1.1: Describing Data Sets

Key Concepts, continued Selecting Appropriate Measures of Center and Spread and Appropriate Graphs

| | If there is an outlier, use: | If there is no outlier, use: |
|---|---|---|
| Measure of center | Median (Q2) | Mean |
| Rough measure of spread | Range | Range |
| Additional measure of spread | Interquartile range (IQR) | Standard deviation (σ)* |
| Graph | Box plot (The median is the vertical segment inside the box.) | Dot plot (The mean is the balance point.) |

*Mean absolute deviation (MAD) and variance (σ²) may be used sometimes as well.

Common Errors/Misconceptions
• confusing the terms mean and median, and how to calculate each measure
• confusing the terms mean absolute deviation, variance, and standard deviation, and how to calculate each measure
• forgetting to order the data values from least to greatest before calculating the median, first and third quartiles, and interquartile range

Common Errors/Misconceptions, continued
• choosing the data value whose position number is $\frac{n}{2}$ as the median when there are n data values and n is even; for example, choosing the fifth data value as the median when there are ten data values
• forgetting that when the median is used as the measure of center, the interquartile range should be used as a measure of spread
• confusing the terms skewed to the left and skewed to the right

Guided Practice Example 1 The following data set shows the numbers of minutes it took 10 chemistry students to complete a quiz: 9 13 10 10 2 11 2 11 11 12 Describe the data set, using appropriate measures of center and spread. Identify any outliers or other extreme values and describe their effects. 1.1.1: Describing Data Sets

Guided Practice: Example 1, continued Make a plan. The choice of spread depends on the choice of center. The choice of center depends on whether there are any outliers. To identify outliers, you need the interquartile range. To find the interquartile range, you need to first find the quartiles Q1 and Q3. So, begin by finding the five-number summary of the data set. 1.1.1: Describing Data Sets

Guided Practice: Example 1, continued Find the five-number summary. The five-number summary includes the minimum value, the first quartile (Q1), the second quartile (Q2) or median, the third quartile (Q3), and the maximum value. Begin by ordering the data values from least to greatest. 2 2 9 10 10 11 11 11 12 13 The minimum is 2 and the maximum is 13. 1.1.1: Describing Data Sets

Guided Practice: Example 1, continued The median, Q2, is the average of the two middle values because the number of values, 10, is even. The two middle values are 10 and 11, so add and divide by 2 to find the median: $Q_2 = \frac{10 + 11}{2} = 10.5$. The median is 10.5. There are 5 data values on either side of 10.5; since each half contains an odd number of data values, we can find Q1 and Q3 without averaging values.

Guided Practice: Example 1, continued The first quartile, Q1, is the middle value of the lower half (the data values to the left of the median): 9. The third quartile, Q3, is the middle value of the upper half (the data values to the right of the median): 11. The five-number summary is: minimum = 2, Q1 = 9, Q2 = 10.5, Q3 = 11, maximum = 13.

Guided Practice: Example 1, continued Find the interquartile range (IQR). The interquartile range is the difference between Q3 (11) and Q1 (9). IQR = Q3 – Q1 IQR = (11) – (9) IQR = 2 The interquartile range is 2. 1.1.1: Describing Data Sets

Guided Practice: Example 1, continued Identify any outliers. A data value is an outlier if it is less than Q1 – 1.5(IQR) or greater than Q3 + 1.5(IQR). Calculate Q1 – 1.5(IQR) for Q1 = 9 and IQR = 2. Q1 – 1.5(IQR) = (9) – 1.5(2) Q1 – 1.5(IQR) = 9 – 3 Q1 – 1.5(IQR) = 6 The data values 2 and 2 are outliers because 2 < 6. 1.1.1: Describing Data Sets

Guided Practice: Example 1, continued Calculate Q3 + 1.5(IQR) for Q3 = 11 and IQR = 2. Q3 + 1.5(IQR) = (11) + 1.5(2) Q3 + 1.5(IQR) = 11 + 3 Q3 + 1.5(IQR) = 14 There are no data values greater than 14. The only outliers are 2 and 2. 1.1.1: Describing Data Sets

Guided Practice: Example 1, continued Choose an appropriate measure of center for the data. The median, 10.5, is an appropriate measure of center because there are two extreme values, 2 and 2, that are also outliers of the data set. 1.1.1: Describing Data Sets

Guided Practice: Example 1, continued Choose an appropriate measure of spread for the data. The range is useful for any data set, but it is only a rough measure because it does not give any information about data values between the minimum and the maximum. Because the median has been chosen as the more appropriate measure of center, the additional measure of spread should be the interquartile range. 1.1.1: Describing Data Sets

Guided Practice: Example 1, continued Draw a box plot and a dot plot to display the data set. Use the five-number summary to create the box plot. 1.1.1: Describing Data Sets

Guided Practice: Example 1, continued Create the dot plot by marking occurrences of each data set value on a number line that has the same increments as your box plot. 1.1.1: Describing Data Sets
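Since the plots themselves are not reproduced in this transcript, a rough text version of the dot plot can be generated with a short Python sketch. It assumes whole-number data and uses one column per integer between the minimum and the maximum; the function name `text_dot_plot` is hypothetical.

```python
from collections import Counter


def text_dot_plot(values):
    """Build a plain-text dot plot: one '*' per occurrence, stacked above a number line."""
    counts = Counter(values)                  # how many times each value appears
    lo, hi = min(values), max(values)
    axis = range(lo, hi + 1)
    tallest = max(counts.values())
    rows = []
    for level in range(tallest, 0, -1):       # draw the top row of dots first
        rows.append(" ".join(" *" if counts[v] >= level else "  " for v in axis))
    rows.append(" ".join(f"{v:2d}" for v in axis))   # number line labels
    return "\n".join(rows)


quiz_minutes = [9, 13, 10, 10, 2, 11, 2, 11, 11, 12]
print(text_dot_plot(quiz_minutes))
```

The printed plot shows the two dots at 2 sitting far to the left of the cluster from 9 to 13.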

Guided Practice: Example 1, continued Use the plots to describe the data set. The distribution is skewed to the left because there are two values that are on the left, relatively far from the rest of the data, which is concentrated at the right. The median, Q2 = 10.5, represents the data set. The median is represented by the vertical line segment inside the box of the box plot. The interquartile range, 2, is the difference between the upper quartile (Q3), which is 11, and the lower quartile (Q1), which is 9. 1.1.1: Describing Data Sets

Guided Practice: Example 1, continued The data values 2 and 2 are extreme values in this data set; their effect is to make the mean too low to be an accurate measure of center. The extreme data values 2 and 2 can be called outliers because they are less than Q1 – 1.5(IQR). On a box plot, outliers are data values that are outside the box by a distance of more than 1.5 times the interquartile range; that is, outside the box by a distance of more than 1.5 times the length of the box. Looking at the box plot, it appears that the distance between 2 and the left side of the box is more than twice the length of the box itself.
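The effect of the two outliers on the mean can be seen by computing the mean and median with and without them. This comparison is an illustration added here, not a step from the lesson; it uses Python's standard `statistics` module, and the variable names are assumptions.

```python
import statistics

quiz_minutes = [9, 13, 10, 10, 2, 11, 2, 11, 11, 12]
without_outliers = [t for t in quiz_minutes if t != 2]   # drop the two outliers

print(statistics.mean(quiz_minutes))         # 9.1
print(statistics.median(quiz_minutes))       # 10.5
print(statistics.mean(without_outliers))     # 10.875
print(statistics.median(without_outliers))   # 11.0
```

Removing the outliers moves the mean by 1.775 minutes but the median by only 0.5, which is why the median is the more stable measure of center for this data set.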

Guided Practice: Example 1, continued http://www.walch.com/ei/00617 1.1.1: Describing Data Sets

Guided Practice Example 2 Eight friends are discussing their part-time jobs. They worked the following numbers of hours last week: 8 6 8 4 8 14 10 14 Describe the data set, using appropriate measures of center and spread. Identify any outliers or other extreme values and describe their effects. 1.1.1: Describing Data Sets

Guided Practice: Example 2, continued Make a plan. The choice of spread depends on the choice of center. The choice of center depends on whether there are any outliers. To identify outliers, you need the interquartile range. To find the interquartile range, you need to first find the quartiles Q1 and Q3. So, begin by finding the five-number summary of the data set. 1.1.1: Describing Data Sets

Guided Practice: Example 2, continued Find the five-number summary. Order the data values from least to greatest. 4 6 8 8 8 10 14 14 The minimum is 4 and the maximum is 14. The median is the average of the two middle values, because the number of data values is even: $Q_2 = \frac{8 + 8}{2} = 8$.

Guided Practice: Example 2, continued The median, 8, lies between the fourth and fifth ordered values (both equal to 8), so it splits the data set into two halves, each with an even number of data values. We will need to average values to find Q1 and Q3. Q1 is the average of the two middle values of the lower half of the data set (the data to the left of the median): $Q_1 = \frac{6 + 8}{2} = 7$.

Guided Practice: Example 2, continued Q3 is the average of the two middle values of the upper half of the data set (the data to the right of the median): $Q_3 = \frac{10 + 14}{2} = 12$. The five-number summary is: minimum = 4, Q1 = 7, Q2 = 8, Q3 = 12, maximum = 14.

Guided Practice: Example 2, continued Find the interquartile range (IQR). The interquartile range is the difference between Q3 (12) and Q1 (7). IQR = Q3 – Q1 IQR = (12) – (7) IQR = 5 1.1.1: Describing Data Sets

Guided Practice: Example 2, continued Identify any outliers. A data value is an outlier if it is less than Q1 – 1.5(IQR) or greater than Q3 + 1.5(IQR). Calculate Q1 – 1.5(IQR) for Q1 = 7 and IQR = 5. Q1 – 1.5(IQR) = (7) – 1.5(5) Q1 – 1.5(IQR) = 7 – 7.5 Q1 – 1.5(IQR) = –0.5 There are no data values less than –0.5. 1.1.1: Describing Data Sets

Guided Practice: Example 2, continued Calculate Q3 + 1.5(IQR) for Q3 = 12 and IQR = 5. Q3 + 1.5(IQR) = (12) + 1.5(5) Q3 + 1.5(IQR) = 12 + 7.5 Q3 + 1.5(IQR) = 19.5 There are no data values greater than 19.5. There are no outliers. 1.1.1: Describing Data Sets

Guided Practice: Example 2, continued Choose an appropriate measure of center. There are no outliers; therefore, look at the ordered list of data values and decide whether there are any values that seem to be extreme, even if they do not qualify as outliers. Do this by informally comparing the differences between consecutive values. 1.1.1: Describing Data Sets

Guided Practice: Example 2, continued Ordered data values: 4, 6, 8, 8, 8, 10, 14, 14 There are no large differences between consecutive data values, so there do not seem to be any extreme values. The mean is an appropriate measure of center because there are no outliers or other extreme values. 1.1.1: Describing Data Sets

Guided Practice: Example 2, continued Find the mean, $\bar{x}$. The mean is the average of all the data values. Formula for calculating the mean: $\bar{x} = \frac{\sum x_i}{n}$, where $\sum x_i$ is the sum of the n data values. Substitute values from the data set for $x_1$, $x_2$, and so on. There are 8 data values, so n = 8: $\bar{x} = \frac{4 + 6 + 8 + 8 + 8 + 10 + 14 + 14}{8}$.

Guided Practice: Example 2, continued Simplify: $\bar{x} = \frac{72}{8} = 9$. The mean is 9.

Guided Practice: Example 2, continued Choose appropriate measures of spread. Because the mean has been chosen as the measure of center, appropriate measures of spread are the range, mean absolute deviation (MAD), variance (σ²), and standard deviation (σ).

Guided Practice: Example 2, continued Find the range. The range is the difference between the maximum and minimum. In this data set, the maximum is 14 and the minimum is 4. range = maximum – minimum range = (14) – (4) range = 10 The range is 10. 1.1.1: Describing Data Sets

Guided Practice: Example 2, continued Calculate the mean absolute deviation, the variance, and the standard deviation for individual data values. For each value, find its deviation from the mean, then take the absolute value of the deviation, and then square the deviation. Organize the data values and results in a table, as shown on the next slide. 1.1.1: Describing Data Sets

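Guided Practice: Example 2, continued

| Data value $x_i$ | Mean $\bar{x}$ | Deviation from mean $x_i - \bar{x}$ | Absolute deviation | Deviation squared |
|---|---|---|---|---|
| 4 | 9 | –5 | 5 | 25 |
| 6 | 9 | –3 | 3 | 9 |
| 8 | 9 | –1 | 1 | 1 |
| 8 | 9 | –1 | 1 | 1 |
| 8 | 9 | –1 | 1 | 1 |
| 10 | 9 | 1 | 1 | 1 |
| 14 | 9 | 5 | 5 | 25 |
| 14 | 9 | 5 | 5 | 25 |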

Guided Practice: Example 2, continued Find the mean absolute deviation (MAD), the variance, and the standard deviation for the data set. Find the sum in each of the last two columns of the table from the previous step. 1.1.1: Describing Data Sets

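Guided Practice: Example 2, continued

| Data value $x_i$ | Mean $\bar{x}$ | Deviation from mean $x_i - \bar{x}$ | Absolute deviation | Deviation squared |
|---|---|---|---|---|
| 4 | 9 | –5 | 5 | 25 |
| 6 | 9 | –3 | 3 | 9 |
| 8 | 9 | –1 | 1 | 1 |
| 8 | 9 | –1 | 1 | 1 |
| 8 | 9 | –1 | 1 | 1 |
| 10 | 9 | 1 | 1 | 1 |
| 14 | 9 | 5 | 5 | 25 |
| 14 | 9 | 5 | 5 | 25 |
| Sum | | | 22 | 88 |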

Guided Practice: Example 2, continued The sum of the absolute deviations for the individual data values is 22. The sum of the squares of the deviations is 88. The mean absolute deviation is the average of the sum of the absolute deviations, as shown on the next slide. 1.1.1: Describing Data Sets

Guided Practice: Example 2, continued Formula for mean absolute deviation: $\text{MAD} = \frac{\sum |x_i - \bar{x}|}{n}$. Substitute 22 for $\sum |x_i - \bar{x}|$, the sum of the absolute deviations, and 8 for n, the number of data values: $\text{MAD} = \frac{22}{8}$. Simplify: MAD = 2.75. The mean absolute deviation is 2.75.

Guided Practice: Example 2, continued The variance is the average of the squares of the deviations. Formula for variance: $\sigma^2 = \frac{\sum (x_i - \bar{x})^2}{n}$. Substitute 88 for $\sum (x_i - \bar{x})^2$, the sum of the squares of the deviations, and 8 for n, the number of data values: $\sigma^2 = \frac{88}{8}$. Simplify: σ² = 11. The variance is 11.

Guided Practice: Example 2, continued The standard deviation is the square root of the variance. Formula for standard deviation: $\sigma = \sqrt{\sigma^2}$. Substitute 11 for the variance, σ²: $\sigma = \sqrt{11}$. Simplify: σ ≈ 3.32. The standard deviation is approximately 3.32.
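The table-based calculation for Example 2 can be retraced step by step in Python. This is a sketch that mirrors the columns of the table (deviation, absolute deviation, deviation squared); the variable names are assumptions chosen for readability.

```python
import math

hours = [8, 6, 8, 4, 8, 14, 10, 14]
n = len(hours)
mean = sum(hours) / n                          # 72 / 8 = 9.0

deviations = [x - mean for x in hours]         # "Deviation from mean" column
abs_sum = sum(abs(d) for d in deviations)      # sum of the "Absolute deviation" column
sq_sum = sum(d * d for d in deviations)        # sum of the "Deviation squared" column

mad = abs_sum / n                              # mean absolute deviation
variance = sq_sum / n                          # variance (divides by n, as in the lesson)
std_dev = math.sqrt(variance)                  # standard deviation

print(abs_sum, sq_sum)                         # 22.0 88.0
print(mad, variance, round(std_dev, 2))        # 2.75 11.0 3.32
```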

Guided Practice: Example 2, continued Draw a box plot. Use the five-number summary to create the box plot. 1.1.1: Describing Data Sets

Guided Practice: Example 2, continued Draw a dot plot. Create the dot plot by marking occurrences of each data set value on a number line that has the same increments as your box plot. 1.1.1: Describing Data Sets

Guided Practice: Example 2, continued Use the plots to describe the data set. The distribution is neither significantly skewed nor symmetric, though it is nearly symmetric about the value 8. The mean, $\bar{x} = 9$, and median, Q2 = 8, are both reasonable choices as appropriate measures of center. But the mean is a slightly better choice because it is the balance point of the entire data set, and the data set has no outliers or other extreme values.


Guided Practice: Example 2, continued The range, 10, describes the spread of the entire data set, from minimum to maximum. The standard deviation, σ ≈ 3.32, describes the difference, or deviation, between a typical data value and the mean. (The mean absolute deviation, MAD = 2.75, and the variance, σ² = 11, are associated with the standard deviation.) There are no extreme values or outliers.

Guided Practice: Example 2, continued http://www.walch.com/ei/00618 1.1.1: Describing Data Sets