Chapter 4 Understanding and Comparing Distributions.


1 Chapter 4 Understanding and Comparing Distributions

2 Objectives Construct side-by-side histograms or boxplots on comparable scales to compare the distributions of two groups. Compare the distributions of two or more groups by comparing their shapes, centers, spreads, and unusual features.

3 4.1 Comparing Groups with Histograms

4 Wind Speeds in the Hopkins Memorial Forest
Typical speed < 1 mph. A small number of high-wind days. One very windy day > 6 mph. IQR ~ 1.82 mph. May be interesting to compare winter (Oct. – March) with summer (April – Sept.).

5 Comparing Seasons In investigating the wind patterns in the Hopkins Memorial Forest, we can compare winter and summer months. Summer is unimodal and skewed right. Winter is less skewed and nearly uniform.
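Comparing two histograms only works when both use the same bins. The sketch below shows one way to count two groups against shared bin edges and print them as text histograms. The wind speeds are made up for illustration (they are not the Hopkins Memorial Forest data), and the function name `shared_bin_counts` is hypothetical.

```python
import math


def shared_bin_counts(groups, bin_width):
    """Count values per bin, using the same bin edges for every group."""
    all_values = [v for values in groups.values() for v in values]
    # Start the first bin at a multiple of bin_width at or below the minimum.
    start = math.floor(min(all_values) / bin_width) * bin_width
    # Enough bins to cover the combined range of all groups.
    n_bins = int((max(all_values) - start) // bin_width) + 1
    counts = {}
    for name, values in groups.items():
        row = [0] * n_bins
        for v in values:
            row[int((v - start) // bin_width)] += 1
        counts[name] = row
    return start, counts


# Made-up wind speeds in mph, for illustration only.
wind = {
    "summer": [0.2, 0.4, 0.5, 0.7, 0.9, 1.1, 1.6, 3.2],
    "winter": [0.3, 0.9, 1.4, 1.8, 2.2, 2.7, 3.5, 4.1],
}

bin_width = 1.0
start, counts = shared_bin_counts(wind, bin_width)
for name, row in counts.items():
    print(name)
    for i, count in enumerate(row):
        low = start + i * bin_width
        print(f"  {low:.0f}-{low + bin_width:.0f} mph | " + "*" * count)
```

Because every group is counted against the same edges, the rows line up bin for bin, so differences in shape and spread are not an artifact of the scale.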

6 Comparing Seasons (Continued)
Typical summer wind < 1 mph, a few days above 3 mph. Winter wind often < 3 mph, more spread out. Always relatively calm in the summer, but winter has windier days.

7 Comparing Seasons (Continued)
Winter is substantially windier than summer. Both the standard deviation and the IQR show that winter wind speeds are more variable than summer wind speeds.
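As a rough sketch of how the two spread measures can be computed for each group, the example below uses Python's `statistics` module. The data are made up for illustration, `spread_summary` is a hypothetical helper, and the quartiles use the module's "inclusive" method, which is only one of several quartile conventions, so other software may report slightly different IQRs.

```python
import statistics


def spread_summary(values):
    """Return the sample standard deviation and the IQR of a list of numbers."""
    # quantiles(n=4) returns the three quartile cut points Q1, Q2, Q3.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"sd": statistics.stdev(values), "iqr": q3 - q1}


# Made-up wind speeds in mph, for illustration only.
summer = [0.2, 0.4, 0.5, 0.7, 0.9, 1.1, 1.6, 3.2]
winter = [0.3, 0.9, 1.4, 1.8, 2.2, 2.7, 3.5, 4.1]

for name, values in [("summer", summer), ("winter", winter)]:
    s = spread_summary(values)
    print(f"{name}: sd = {s['sd']:.2f} mph, IQR = {s['iqr']:.2f} mph")
```

When both measures point the same way, as they do for these made-up values, the conclusion about variability does not hinge on one or two extreme days.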

8 Comparing Stem-and-Leaf
A back-to-back stem-and-leaf diagram compares nest egg indices (savings and investments). Northeast and Midwest generally have bigger nest egg indices than the South and West. Back-to-back charts are best for comparisons.
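A back-to-back stem-and-leaf display shares one column of stems, with one group's leaves running left and the other's running right. The following is a minimal sketch, assuming non-negative whole-number data with the tens digit as the stem and the ones digit as the leaf. The values are invented and are not the nest egg indices from the slide, and `back_to_back_stem_leaf` is a hypothetical name.

```python
def back_to_back_stem_leaf(left, right):
    """Build the rows of a back-to-back stem-and-leaf display as strings."""
    combined = left + right
    stems = range(min(combined) // 10, max(combined) // 10 + 1)
    rows = []
    for stem in stems:
        # Left leaves are written in descending order so they grow outward
        # from the stem column, mirroring the right side.
        left_leaves = sorted((v % 10 for v in left if v // 10 == stem), reverse=True)
        right_leaves = sorted(v % 10 for v in right if v // 10 == stem)
        rows.append(
            ("".join(map(str, left_leaves)), stem, "".join(map(str, right_leaves)))
        )
    width = max(len(leaves) for leaves, _, _ in rows)
    return [f"{l:>{width}} | {s} | {r}" for l, s, r in rows]


# Invented values for two groups, for illustration only.
group_a = [72, 75, 81, 84, 88, 93]
group_b = [64, 68, 71, 77, 79, 85]

for line in back_to_back_stem_leaf(group_a, group_b):
    print(line)
```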

9 4.2 Comparing Groups with Boxplots

10 Using Boxplots for Comparisons
Are some months windier than others? Compare April and July. Notice many outliers over the year with this view.

11 Wooden Vs. Steel Which type of roller coaster is faster: steel or wooden? Steel roller coasters are generally faster. Similar IQRs, but note the difference in the ranges. One superfast steel roller coaster, but no exceptionally fast wooden roller coasters.

12 Please, No Cold Coffee! We want to compare how well 4 different coffee cups keep coffee hot. Measure the temperature 30 minutes after being poured for each of the four types. Repeat the experiment 8 times. Think → Plan: Compare the data sets for the four types. Variables: Quantitative – Temperature change of coffee.

13 Show → Mechanics Present the 5-number summaries of each cup type. Also, find the IQRs. Construct four boxplots, one for each cup type. Boxplots effectively compare the distributions.
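The mechanics step can be sketched in a few lines of Python. This is an illustration under stated assumptions: the cup labels and temperature changes below are invented (8 trials each, matching the design described, but not the actual measurements), `five_number_summary` is a hypothetical helper, and the quartiles follow the `statistics` module's "inclusive" method, which may differ slightly from a textbook's hand-calculation rule.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


# Invented temperature losses in degrees, 8 trials per cup type.
# The labels "cup A" and "cup B" are placeholders, not the brands in the study.
trials = {
    "cup A": [1, 2, 2, 2, 3, 3, 4, 6],
    "cup B": [9, 11, 12, 14, 14, 15, 16, 17],
}

for cup, losses in trials.items():
    low, q1, median, q3, high = five_number_summary(losses)
    print(f"{cup}: min={low} Q1={q1} median={median} Q3={q3} max={high} IQR={q3 - q1}")
```

Each tuple holds exactly the five values a boxplot draws, so the same summaries can feed a side-by-side boxplot for the four cup types.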

14 Tell → Conclusion The individual cup types are slightly skewed left.
Nissan is best for keeping the coffee hot, typically losing only 2°. SIGG is the worst, typically losing 14°. Over 75% of the Nissan cups showed less heat loss than any of the other cup types.

15 4.3 Outliers

16 How to Approach Outliers
Check to see if there may have been an error in the data collection or data input. If the reported heights of students include a student who is 170 inches tall (14 feet), maybe that student was measured in centimeters. Check to see if there was an extraordinary outcome. The median number of daily customers at the Punxsutawney, PA, gift store may be 42 with an IQR of , but on February 2, there were 831 customers.

17 Common Errors Causing an Outlier
Transposing the digits. A respondent not understanding the survey question. Misreading results. Confusion about units. Cheating.

18 The Outliers Can be the Most Interesting Data Values
Income Data: The CEO. Student Height: The basketball team’s center. Snowfall: The great blizzard of ’98. Exam Score: The curve breaker. Milk Purchased: Octomom! Always comment on the outliers.

19 What Can Go Wrong? Avoid inconsistent scales.
Don’t try to compare one thing measured in feet to another measured in meters. Label clearly. Variables should be identified and axes labeled. Beware of outliers! If the outliers are errors, remove them. Otherwise, consider presenting with and without the outliers.
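One way to present results with and without outliers is sketched below. It flags points using the common boxplot convention of fences at 1.5 IQRs beyond the quartiles; that rule is an assumption here, since the slides do not state which rule they use. The daily customer counts are invented, apart from the 831 figure borrowed from the gift store example, and the function names are hypothetical.

```python
import statistics


def iqr_fences(values, k=1.5):
    """Return (lower, upper) fences at k IQRs beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def summaries_with_and_without(values):
    """Summarize the data twice: all points, and with flagged outliers set aside."""
    low, high = iqr_fences(values)
    outliers = [v for v in values if v < low or v > high]
    kept = [v for v in values if low <= v <= high]

    def describe(data):
        return {"mean": statistics.mean(data), "median": statistics.median(data)}

    return {"outliers": outliers, "with": describe(values), "without": describe(kept)}


# Invented daily customer counts; 831 echoes the February 2 example.
customers = [35, 38, 40, 41, 42, 43, 45, 47, 50, 831]

report = summaries_with_and_without(customers)
print("flagged:", report["outliers"])
print("with outliers:   ", report["with"])
print("without outliers:", report["without"])
```

Flagging a point is only the first step: whether it is dropped as an error or reported as a remarkable value still depends on investigating where it came from.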

20 What’s Wrong With This? Horizontal scales different 1965 to 1999
Vertical axis not labeled. Is it $ or rank? Makes it look like the rank has gotten worse, but a lower rank is better. Being number 1 is the best.

21 What Have We Learned? Choose the right tool.
Use histograms to compare two or three groups. Use boxplots to compare many groups. Treat outliers with attention and care. Outliers may be local or global, especially in a time series. Investigate whether the outliers are errors or remarkable.

