What are the effects of outliers on statistical data?

Slides:



Advertisements
Similar presentations
Measures of Central Tendency and Variation 11-5
Advertisements

Unit 1.1 Investigating Data 1. Frequency and Histograms CCSS: S.ID.1 Represent data with plots on the real number line (dot plots, histograms, and box.
Ch 11 – Probability & Statistics
Unit 4: Describing Data.
1 Distribution Summaries Measures of central tendency Mean Median Mode Measures of spread Range Standard Deviation Interquartile Range (IQR)
Unit 4 – Probability and Statistics
Warm-Up Exercises 1.Write the numbers in order from least to greatest. 82, 45, 98, 87, 82, The heights in inches of the basketball players in order.
Statistics: Use Graphs to Show Data Box Plots.
Quartiles & Extremes (displayed in a Box-and-Whisker Plot) Lower Extreme Lower Quartile Median Upper Quartile Upper Extreme Back.
Warm Up Simplify each expression. – 53
CONFIDENTIAL 1 Grade 8 Algebra1 Data Distributions.
3. Use the data below to make a stem-and-leaf plot.
Objectives Vocabulary
Warm-Up 1.The probability distribution for the number of games played in each World Series for the years 1923–2004 is given below. Find the expected number.
Presentation 1-1 Honors Advanced Algebra Morton, Spring 2014.
Chapter 3 Descriptive Statistics: Numerical Methods Copyright © 2014 by The McGraw-Hill Companies, Inc. All rights reserved.McGraw-Hill/Irwin.
McGraw-Hill/IrwinCopyright © 2009 by The McGraw-Hill Companies, Inc. All Rights Reserved. Chapter 3 Descriptive Statistics: Numerical Methods.
Objectives Describe the central tendency of a data set.
Objectives Create and interpret box-and-whisker plots.
Warm Up Problem of the Day Lesson Presentation Lesson Quizzes.
6-9 Data Distributions Objective Create and interpret box-and-whisker plots.
Table of Contents 1. Standard Deviation
McGraw-Hill/Irwin Copyright © 2010 by The McGraw-Hill Companies, Inc. All rights reserved. Chapter 3 Descriptive Statistics: Numerical Methods.
Holt McDougal Algebra 2 Measures of Central Tendency and Variation Check It Out! Example 3 Make a box-and-whisker plot of the data. Find the interquartile.
Holt McDougal Algebra 2 Measures of Central Tendency and Variation Measures of Central Tendency and Variation Holt Algebra 2Holt McDougal Algebra.
Objectives Vocabulary Describe the central tendency of a data set.
7.7 Statistics and Statistical Graphs. Learning Targets  Students should be able to… Use measures of central tendency and measures of dispersion to describe.
Warm-Up Define mean, median, mode, and range in your own words. Be ready to discuss.
Quantitative data. mean median mode range  average add all of the numbers and divide by the number of numbers you have  the middle number when the numbers.
Box and Whisker Plots Measures of Central Tendency.
6-8 Measures of Central Tendency and Variation Objective Find measures of central tendency and measures of variation for statistical data. Examine the.
Warm Up Simplify each expression
Chapter 3 Averages and Variation Understanding Basic Statistics Fifth Edition By Brase and Brase Prepared by Jon Booze.
Lesson 25 Finding measures of central tendency and dispersion.
Unit 4: Probability Day 4: Measures of Central Tendency and Box and Whisker Plots.
Statistics and Data Analysis
Holt McDougal Algebra Data Distributions Warm Up Identify the least and greatest value in each set Use the data below to make a stem-and-
Splash Screen. Then/Now You have already found measures of central tendency. (Lesson 13–2) Find measures of variation. Display data in a box-and-whisker.
Holt McDougal Algebra Measures of Central Tendency and Variation Work through the notes Then complete the class work next to this document on the.
Holt McDougal Algebra Measures of Central Tendency and Variation Recall that the mean, median, and mode are measures of central tendency—values.
Probability & Statistics Box Plots. Describing Distributions Numerically Five Number Summary and Box Plots (Box & Whisker Plots )
Statistics Unit Test Review Chapters 11 & /11-2 Mean(average): the sum of the data divided by the number of pieces of data Median: the value appearing.
Warm Up What is the mean, median, mode and outlier of the following data: 16, 19, 21, 18, 18, 54, 20, 22, 23, 17.
Measures of Central Tendency (0-12) Objective: Calculate measures of central tendency, variation, and position of a set of data.
Holt McDougal Algebra 1 Data Distributions Holt Algebra 1 Warm Up Warm Up Lesson Presentation Lesson Presentation Lesson Quiz Lesson Quiz Holt McDougal.
Measures of Central Tendency and Variation
6.1 - Measures of Central Tendency and Variation
6.1 - Measures of Central Tendency and Variation
10-3 Data Distributions Warm Up Lesson Presentation Lesson Quiz
6.1 - Measures of Central Tendency and Variation
Please copy your homework into your assignment book
Notes 13.2 Measures of Center & Spread
Statistics Unit Test Review
10-3 Data Distributions Warm Up Lesson Presentation Lesson Quiz
8.1 - Measures of Central Tendency and Variation
10-3 Data Distributions Warm Up Lesson Presentation Lesson Quiz
6.1 - Measures of Central Tendency and Variation
Lesson 10-3 Data Distributions
Vocabulary box-and-whisker plot lower quartile upper quartile
Lesson 1: Summarizing and Interpreting Data
The absolute value of each deviation.
Measures of Central Tendency
Constructing Box Plots
Measures of Central Tendency and Variation 8-1
10-3 Data Distributions Warm Up Lesson Presentation Lesson Quiz
. . Box and Whisker Measures of Variation Measures of Variation 8 12
Box and Whisker Plots.
The data sets {19, 20, 21} and {0, 20, 40} have the same mean and median, but the sets are very different. The way that data are spread out from the mean.
Warm-Up Define mean, median, mode, and range in your own words. Be ready to discuss.
Warm up Honors Algebra 2 3/14/19
Presentation transcript:

What are the effects of outliers on statistical data? Essential Questions How do we find measures of central tendency and measures of variation for statistical data? What are the effects of outliers on statistical data?

Example 1: Finding Measures of Central Tendency Find the mean, median, and mode of the data. deer at a feeder each hour: 3, 0, 2, 0, 1, 2, 4 Mean: Median: 0, 0, 1, 2, 2, 3, 4 Mode: The most common results are 0 and 2.

Example 2: Finding Measures of Central Tendency Find the mean, median, and mode of the data set. {6, 9, 3, 8} Mean: Median: 3, 6, 8, 9 Mode: None.

Example 3: Finding Measures of Central Tendency Find the mean, median, and mode of the data set. {2, 5, 6, 2, 6} Mean: Median: 2, 2, 5, 6, 6 Mode: 2 and 6.

Example 4: Finding Weighted Average A weighted average is a mean calculated by using frequencies of data values. Suppose that 30 movies are rated as follows: weighted average of stars =

For numerical data, the weighted average of all of those outcomes is called the expected value for that experiment. The probability distribution for an experiment is the function that pairs each outcome with its probability.

Example 5: Finding Expected Value The probability distribution of successful free throws for a practice set is given below. Find the expected number of successes for one set. Use the weighted average. Simplify. The expected number of successful free throws is 2.05.

Example 6: Finding Expected Value The probability distribution of the number of accidents in a week at an intersection, based on past data, is given below. Find the expected number of accidents for one week. Use the weighted average. expected value = 0(0.75) + 1(0.15) + 2(0.08) + 3(0.02) = 0.37 Simplify. The expected number of accidents is 0.37.

Lesson 1.1 Practice A

What are the effects of outliers on statistical data? Essential Questions How do we find measures of central tendency and measures of variation for statistical data? What are the effects of outliers on statistical data?

A box-and-whisker plot shows the spread of a data set A box-and-whisker plot shows the spread of a data set. It displays 5 key points: the minimum and maximum values, the median, and the first and third quartiles.

The quartiles are the medians of the lower and upper halves of the data set. If there are an odd number of data values, do not include the median in either half. The interquartile range, or IQR, is the difference between the 1st and 3rd quartiles, or Q3 – Q1. It represents the middle 50% of the data.

Example 7: Making a Box-and-Whisker Plot and Finding the Interquartile Range Make a box-and-whisker plot of the data. Find the interquartile range. {6, 8, 7, 5, 10, 6, 9, 8, 4} Step 1 Order the data from least to greatest. Step 2 Find the minimum, maximum, median, and quartiles. Minimum Median Maximum 4, 5, 6, 6, 7, 8, 8, 9, 10 First quartile 5.5 Third quartile 8.5

Step 3 Draw a box-and-whisker plot. Draw a number line, and plot a point above each of the five values. Then draw a box from the first quartile to the third quartile with a line segment through the median. Draw whiskers from the box to the minimum and maximum. IRQ = 8.5 – 5.5 = 3 The interquartile range is 3, the length of the box in the diagram.

Example 8: Making a Box-and-Whisker Plot Make a box-and-whisker plot of the data. Find the interquartile range. {13, 14, 18, 13, 12, 17, 15, 12, 13, 19, 11, 14, 14, 18, 22, 23} Step 1 Order the data from least to greatest. Step 2 Find the minimum, maximum, median, and quartiles. Minimum Median Maximum 11, 12, 12, 13, 13, 13, 14, 14, 14, 15, 17, 18, 18, 19, 22, 23 First quartile 13 Third quartile 18

Step 3 Draw a box-and-whisker plot. IQR = 18 – 13 = 5 The interquartile range is 5, the length of the box in the diagram.

The data sets {19, 20, 21} and {0, 20, 40} have the same mean and median, but the sets are very different. The way that data are spread out from the mean or median is important in the study of statistics. A measure of variation is a value that describes the spread of a data set. The most commonly used measures of variation are the range, the interquartile range, the variance, and the standard deviation. The variance, denoted by σ2, is the average of the squared differences from the mean. Standard deviation, denoted by σ, is the square root of the variance and is one of the most common and useful measures of variation.

Low standard deviations indicate data that are clustered near the measures of central tendency, whereas high standard deviations indicate data that are spread out from the center. The symbol commonly used to represent the mean is x, or “x bar.” The symbol for standard deviation is the lowercase Greek letter sigma, . Reading Math

Example 9: Finding the Mean and Standard Deviation Find the mean and standard deviation for the data set of the number of people getting on and off a bus for several stops. {6, 8, 7, 5, 10, 6, 9, 8, 4} List in order. 4, 5, 6, 6, 7, 8, 8, 9, 10 Step 1 Find the mean.

Example 9 Continued Step 2 Find the difference between the mean and each data value, and square it. Data value x 4 5 6 7 8 9 10 Step 3 Find the variance. Find the average of the last row of the table. Step 4 Find the standard deviation. The standard deviation is the square root of the variance. The mean is 7 people, and the standard deviation is about 1.83 people.

Example 10: Finding the Mean and Standard Deviation Find the mean and standard deviation for the data set of the number of elevator stops for several rides. {0, 3, 1, 1, 0, 5, 1, 0, 3, 0} List in order. 0, 0, 0, 0, 1, 1, 1, 3, 3, 5 Step 1 Find the mean.

Example 10 Continued Step 2 Find the difference between the mean and each data value, and square it. Data value x 1 3 5 Find the average of the last row of the table. Step 3 Find the variance. Step 4 Find the standard deviation. The standard deviation is the square root of the variance. The mean is 1.4 stops, and the standard deviation is about 1.6 stops.

An outlier is an extreme value that is much less than or much greater than the other data values. Outliers have a strong effect on the mean and standard deviation. If an outlier is the result of measurement error or represents data from the wrong population, it is usually removed. There are different ways to determine whether a value is an outlier. One is to look for data values that are more than 3 standard deviations from the mean.

Example 11: Examining Outliers Find the mean and the standard deviation for the heights of 15 cans. Identify any outliers. Can Heights (mm) 92.8 92.9 92.7 92.1

Example 11 Continued Identify the outliers. Look for the data values that are more than 3 standard deviations away from the mean in either direction. Three standard deviations is about 3(0.195) = 0.585. Values less than 92.185 and greater than 93.355 are outliers, so 92.1 is an outlier.

Example 12: Examining Outliers In the 2003-2004 American League Championship Series, the New York Yankees scored the following numbers of runs against the Boston Red Sox: 2, 6, 4, 2, 4, 6, 6, 10, 3, 19, 4, 4, 2, 3. Identify the outlier. Identify the outliers. Three standard deviations is about 3(4.3) = 12.9. Mean Values less than –7.5 and greater than 18.3 are outliers, so 19 is an outlier.

Lesson 1.1 Practice B