Boxplots.

Why use boxplots?
- ease of construction
- convenient handling of outliers
- construction is not subjective (unlike histograms, where the choice of intervals is)
- used with medium or large data sets (n > 10)
- useful for comparative displays

Disadvantages of boxplots
- do not retain the individual observations
- should not be used with small data sets (n < 10)

How to construct
- find the five-number summary: Min, Q1, Med, Q3, Max
- draw a box from Q1 to Q3
- draw a line inside the box at the median
- extend whiskers to the min & max
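The first construction step, finding the five-number summary, can be sketched in Python. This is a minimal sketch, not the method of any particular calculator or textbook: the function name `five_number_summary` is hypothetical, and it assumes the quartiles are the medians of the lower and upper halves of the sorted data, with the overall median left out of both halves when n is odd. Other software may use a different quartile convention and give slightly different values.

```python
from statistics import median

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) for a list of numbers.

    Assumption: Q1 and Q3 are the medians of the lower and upper halves
    of the sorted data; the overall median is excluded from both halves
    when the number of observations is odd.
    """
    values = sorted(data)
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = values[:half]
    # Skip the middle value when n is odd so it belongs to neither half.
    upper = values[half:] if n % 2 == 0 else values[half + 1:]
    return (values[0], median(lower), median(values), median(upper), values[-1])

# Made-up data, for illustration only.
print(five_number_summary([7, 1, 3, 5, 2, 6, 4]))
```

The box is then drawn from the second value to the fourth, with a line at the third.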

ALWAYS use modified boxplots in this class!!!
- display outliers
- fences mark off mild & extreme outliers
- whiskers extend to the largest (smallest) data value inside the fence

Interquartile range (IQR): the range (length) of the box, IQR = Q3 - Q1
Inner fence: from Q1 - 1.5(IQR) to Q3 + 1.5(IQR)
Any observation outside this fence is an outlier! Put a dot for each outlier.

Modified boxplot: draw each "whisker" from its quartile to the most extreme observation that is still within the fence!
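The inner-fence rule and the whisker rule can be sketched together in Python. The function name `whiskers_and_outliers` and the sample numbers are hypothetical; the quartiles are passed in so the sketch does not depend on any one way of computing them. It assumes a value lying exactly on a fence counts as inside, since only observations outside the fence are outliers.

```python
def whiskers_and_outliers(data, q1, q3):
    """Apply the 1.5 * IQR rule for a modified boxplot.

    Returns the inner fences, the whisker endpoints (the most extreme
    observations still inside the fences), and the sorted outliers.
    """
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Values exactly on a fence are treated as inside (an assumption).
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    outliers = sorted(x for x in data if x < lower_fence or x > upper_fence)
    return {
        "fences": (lower_fence, upper_fence),
        "whiskers": (min(inside), max(inside)),
        "outliers": outliers,
    }

# Made-up data with Q1 = 2 and Q3 = 6, for illustration only.
result = whiskers_and_outliers([1, 2, 3, 4, 5, 6, 20], q1=2, q3=6)
print(result)
```

Here the upper whisker stops at 6, the largest value inside the fence, and 20 gets a dot of its own.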

Outer fence: from Q1 - 3(IQR) to Q3 + 3(IQR)
Any observation between the inner and outer fences is considered a mild outlier. Any observation outside the outer fence is an extreme outlier!
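The mild versus extreme distinction amounts to checking the outer fence first, then the inner fence. A minimal Python sketch follows; the function name `classify_outlier` and the sample values are hypothetical, and a value lying exactly on a fence is assumed to fall on the inside of that fence.

```python
def classify_outlier(x, q1, q3):
    """Label one observation as 'extreme', 'mild', or 'not an outlier'."""
    iqr = q3 - q1
    # Outside the outer fence (3 * IQR beyond a quartile): extreme outlier.
    if x < q1 - 3 * iqr or x > q3 + 3 * iqr:
        return "extreme"
    # Outside the inner fence (1.5 * IQR) but within the outer one: mild.
    if x < q1 - 1.5 * iqr or x > q3 + 1.5 * iqr:
        return "mild"
    return "not an outlier"

# With Q1 = 2 and Q3 = 6 the inner fences are -4 and 12, the outer -10 and 18.
for value in (5, 13, 20):
    print(value, classify_outlier(value, q1=2, q3=6))
```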

For the AP Exam, you just need to find outliers; you DO NOT need to identify them as mild or extreme. Therefore, you only need to use the 1.5(IQR) fences.

A report from the U.S. Department of Justice gave the following percent increases in federal prison populations in 20 northeastern & mid-western states in 1999.
5.9 1.3 5.0 5.9 4.5 5.6 4.1 6.3 4.8 6.9 4.5 3.5 7.2 6.4 5.5 5.3 8.0 4.4 7.2 3.2
Use the calculator to create a modified boxplot. Describe the distribution.
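As a cross-check on the calculator, the numbers behind the modified boxplot for these 20 values can be worked out in Python. This is a sketch under one assumption: the quartiles are taken as the medians of the lower and upper halves of the sorted data. A calculator or package that uses another quartile convention may report slightly different values.

```python
from statistics import median

# Percent increases for the 20 states, as listed in the exercise.
increases = [5.9, 1.3, 5.0, 5.9, 4.5, 5.6, 4.1, 6.3, 4.8, 6.9,
             4.5, 3.5, 7.2, 6.4, 5.5, 5.3, 8.0, 4.4, 7.2, 3.2]

values = sorted(increases)
half = len(values) // 2            # n = 20 is even, so the halves split cleanly
q1 = median(values[:half])         # median of the lower 10 values
med = median(values)
q3 = median(values[half:])         # median of the upper 10 values
iqr = q3 - q1

lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in values if x < lower_fence or x > upper_fence]
inside = [x for x in values if lower_fence <= x <= upper_fence]

print("five-number summary:", values[0], q1, med, q3, values[-1])
print("inner fences:", lower_fence, upper_fence)
print("whiskers end at:", inside[0], "and", inside[-1])
print("outliers:", outliers)
```

Under this convention Q1 is about 4.45, the median about 5.4, and Q3 about 6.35, so the fences sit near 1.6 and 9.2. The value 1.3 is the only outlier, and the whiskers run to 3.2 and 8.0.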

Evidence suggests that a high indoor radon concentration might be linked to the development of childhood cancers. The data that follow are the radon concentrations in two different samples of houses. The first sample consisted of houses in which a child was diagnosed with cancer. Houses in the second sample had no recorded cases of childhood cancer. (see data on note page) Create parallel boxplots. Compare the distributions.

[Figure: parallel boxplots of radon concentration (axis marked at 100 and 200) for the Cancer and No Cancer groups]
The median radon concentration for the no cancer group is lower than the median for the cancer group.
The range of the cancer group is larger than the range for the no cancer group.
Both distributions are skewed right.
The cancer group has outliers at 39, 45, 57, and 210. The no cancer group has outliers at 55 and 85.