Understanding and Comparing Distributions

Presentation transcript:

Chapter 5: Understanding and Comparing Distributions

The Big Picture Below is a histogram of the Average Wind Speed at Hopkins Forest in Western Massachusetts, for every day in 1989.

The Big Picture (cont.) For this distribution: the high value may be an outlier; the median daily wind speed is 1.90 mph; the IQR is 1.78 mph.

The Five-Number Summary of a distribution reports its median, quartiles, and minimum and maximum. Example: The five-number summary for the daily wind speed (in mph) is:
Max 8.67
Q3 2.93
Median 1.90
Q1 1.15
Min 0.20

Daily Wind Speed: Making Boxplots A boxplot is a graphical display of the five-number summary. Boxplots are particularly useful when comparing groups.

Constructing Boxplots Five-number summary: 0.20, 1.15, 1.90, 2.93, 8.67. Draw a single vertical axis spanning the range of the data. Draw short horizontal lines at the lower and upper quartiles and at the median. Then connect them with vertical lines to form a box.

Constructing Boxplots (cont.) Five-number summary: 0.20, 1.15, 1.90, 2.93, 8.67. Sketch “fences” around the main part of the data. The upper fence is 1.5 IQRs above the upper quartile. The lower fence is 1.5 IQRs below the lower quartile. Note: the fences only help with constructing the boxplot and should not appear in the final display.
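The fence rule can be checked with a short calculation. The sketch below is only an illustration, assuming the wind-speed quartiles quoted in the five-number summary (Q1 = 1.15, Q3 = 2.93, Max = 8.67); the variable names are made up for this example.

```python
# Quartiles and maximum from the wind-speed five-number summary (mph).
q1, q3 = 1.15, 2.93
maximum = 8.67

iqr = q3 - q1                   # interquartile range
lower_fence = q1 - 1.5 * iqr    # 1.5 IQRs below the lower quartile
upper_fence = q3 + 1.5 * iqr    # 1.5 IQRs above the upper quartile

print(f"IQR         = {iqr:.2f} mph")
print(f"Lower fence = {lower_fence:.2f} mph")
print(f"Upper fence = {upper_fence:.2f} mph")
print("Maximum lies beyond the upper fence:", maximum > upper_fence)
```

With these numbers the upper fence is about 5.60 mph, so the maximum of 8.67 mph falls outside it. The lower fence is negative, so no wind speed can fall below it.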

Constructing Boxplots (cont.) Use the fences to grow “whiskers.” Draw lines from the ends of the box up and down to the most extreme data values found within the fences. If a data value falls outside one of the fences, we do not connect it with a whisker.

Constructing Boxplots (cont.) Add the outliers by displaying any data values beyond the fences with special symbols. We often use a different symbol for “far outliers” that are farther than 3 IQRs from the quartiles.
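One way to express the outlier rule in code is sketched below. The function name `classify` and the test value 6.00 mph are hypothetical, chosen only for illustration; the quartiles are the wind-speed values from the five-number summary.

```python
def classify(value, q1, q3):
    """Label a value using the 1.5-IQR fences and the 3-IQR 'far outlier' limits."""
    iqr = q3 - q1
    if value < q1 - 3 * iqr or value > q3 + 3 * iqr:
        return "far outlier"
    if value < q1 - 1.5 * iqr or value > q3 + 1.5 * iqr:
        return "outlier"
    return "within fences"

q1, q3 = 1.15, 2.93                 # wind-speed quartiles (mph)
for speed in (1.90, 6.00, 8.67):    # 6.00 is a made-up value for illustration
    print(f"{speed:.2f} mph: {classify(speed, q1, q3)}")
```

Under these assumptions the maximum wind speed, 8.67 mph, is more than 3 IQRs above Q3, so it would be drawn with the "far outlier" symbol.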

Wind Speed: Making Boxplots (cont.) Let us compare the histogram and boxplot for daily wind speeds:

Comparing Groups It is always more interesting to compare groups. With histograms, note the shapes, centers, and spreads of the two distributions. What does this graphical display tell you?

Comparing Groups (cont) Boxplots hide the details while displaying the overall summary information. We often plot them side by side for groups or categories we wish to compare.

What About Outliers? If there are any clear outliers and you are reporting the mean and standard deviation, report them both with the outliers present and with the outliers removed. Note: The median and IQR are not likely to be affected by the outliers.
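The difference in sensitivity can be seen on a small data set. The numbers below are invented for illustration, and the `iqr` helper is an assumption of this sketch: it finds quartiles by splitting the sorted data in half and, for an odd count, leaves the median out of both halves, which is only one of several conventions.

```python
from statistics import mean, median, stdev

def iqr(values):
    """IQR from the medians of the lower and upper halves of the sorted data."""
    data = sorted(values)
    half = len(data) // 2
    return median(data[len(data) - half:]) - median(data[:half])

with_outlier = [2, 3, 3, 4, 4, 5, 5, 6, 6, 30]   # made-up data; 30 is the outlier
without_outlier = [x for x in with_outlier if x != 30]

for label, data in (("with outlier", with_outlier),
                    ("outlier removed", without_outlier)):
    print(f"{label:16s} mean={mean(data):.2f} sd={stdev(data):.2f} "
          f"median={median(data)} IQR={iqr(data)}")
```

In this example the mean and standard deviation change substantially when the outlier is dropped, while the median and IQR each move by only half a unit.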

Timeplots: Order, Please! For some data sets, we are interested in how the data behave over time. In these cases, we construct timeplots of the data.

Re-expressing Skewed Data to Improve Symmetry One way to make a skewed distribution more symmetric is to re-express or transform the data: apply a simple function (e.g., a logarithmic function).

Re-expressing Skewed Data to Improve Symmetry (cont.) A logarithmic function was applied to each of the observations of the data displayed in the previous slide. Note the change in skewness from the raw data (previous slide) to the re-expressed data (left).
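The effect of a log re-expression can be sketched numerically. The data below are invented and are not the data from the slides; base-10 logs and the moment-based `skewness` helper are assumptions made for this illustration.

```python
import math
from statistics import mean

def skewness(values):
    """Moment-based skewness: near 0 for symmetric data, positive for right skew."""
    m = mean(values)
    n = len(values)
    m2 = sum((x - m) ** 2 for x in values) / n
    m3 = sum((x - m) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

raw = [1, 2, 2, 3, 4, 5, 8, 12, 20, 50, 120, 400]   # made-up, right-skewed
logged = [math.log10(x) for x in raw]               # re-expressed data

skew_raw = skewness(raw)
skew_log = skewness(logged)
print(f"skewness of raw data:    {skew_raw:.2f}")
print(f"skewness of logged data: {skew_log:.2f}")
```

For this made-up sample the logged values are still somewhat right-skewed, but much less so than the raw values.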

What Can Go Wrong? Avoid inconsistent scales. Beware of outliers. Be careful when comparing groups with very different spreads.

What have we learned? We’ve learned the value of comparing data groups and looking for patterns among groups and over time. We’ve seen that boxplots are very effective for comparing groups graphically. We’ve experienced the value of identifying and investigating outliers.

Practice Exercise - Chapter 5 A survey conducted in a college intro stats class during Autumn 2003 asked students about the number of credit hours they were taking that quarter. The numbers of credit hours for a random sample of 16 students are: 10 10 12 14 15 15 15 15 17 17 19 20 20 20 20 22

Practice Exercise - Chapter 5 (cont) a. Find the five-number summary for the data above. b. Find the IQR for the data. c. From parts (a) and (b), are there any outliers in the data? d. Create a boxplot of these data.

Practice Exercise - Chapter 5 (cont) 10 10 12 14 15 15 15 15 17 17 19 20 20 20 20 22 a. Find the five-number summary: Min = 10, Max = 22, Median = (15 + 17)/2 = 16

Practice Exercise - Chapter 5 (cont) To find the quartiles, divide the data into two equal halves. 1st: 10 10 12 14 15 15 15 15. 2nd: 17 17 19 20 20 20 20 22. To find Q1, take the median of the first half: (14 + 15)/2 → Q1 = 14.5. To find Q3, take the median of the second half: (20 + 20)/2 → Q3 = 20.

Practice Exercise - Chapter 5 (cont) a. Five-number summary: Min = 10, Q1 = 14.5, Median = 16, Q3 = 20, Max = 22

Practice Exercise - Chapter 5 (cont) b. Find the IQR of the data. IQR = Q3 - Q1 = 20 - 14.5 = 5.5

Practice Exercise - Chapter 5 (cont) c. From parts (a) and (b), are there any outliers in the data? To determine whether there are outliers, we need to calculate the values of the fences. Lower fence = Q1 - 1.5 x IQR = 14.5 - 1.5 x 5.5 = 6.25

Practice Exercise - Chapter 5 (cont) Upper fence = Q3 + 1.5 x IQR = 20 + 1.5 x 5.5 = 28.25. Are there any observations outside the fences? None of the observations lie outside the fences, hence there are no outliers in the data.

Practice Exercise - Chapter 5 (cont) d. Create a boxplot of these data. Min = 10, Q1 = 14.5, Median = 16, Q3 = 20, Max = 22, Lower fence = 6.25, Upper fence = 28.25
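The arithmetic for parts (a) through (c) can be checked with a short script. This is a minimal sketch that follows the exercise's own rule of splitting the 16 sorted values into two halves of 8; the variable names are made up for this example.

```python
from statistics import median

credit_hours = [10, 10, 12, 14, 15, 15, 15, 15,
                17, 17, 19, 20, 20, 20, 20, 22]

data = sorted(credit_hours)
half = len(data) // 2               # 16 values, so two halves of 8
q1 = median(data[:half])            # median of the lower half
q3 = median(data[half:])            # median of the upper half
five_number = (data[0], q1, median(data), q3, data[-1])

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("Five-number summary:", five_number)
print("IQR:", iqr)
print("Fences:", lower_fence, upper_fence)
print("Outliers:", outliers)
```

The script gives an IQR of 5.5 and fences at 14.5 - 8.25 = 6.25 and 20 + 8.25 = 28.25. Every observation lies between 10 and 22, so the list of outliers is empty and the whiskers of the boxplot run to the minimum and maximum.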