Chapter 5 Understanding and Comparing Distributions Math2200.

Presentation transcript:

Chapter 5 Understanding and Comparing Distributions Math2200

Example: The Hopkins Memorial Forest
A 2500-acre reserve in Massachusetts, New York, and Vermont
Managed by the Williams College Center for Environmental Studies (CES)
Average wind speed for every day in 1989
–Important for monitoring storms

Data table columns: Avg Wind, Day of Year, Month

Five-number summary
Max: 8.670
Q3: (value missing in source)
Median: 1.900
Q1: (value missing in source)
Min: 0.200
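A minimal sketch of how a five-number summary can be computed. The `wind` values are hypothetical stand-ins, not the Hopkins Forest measurements, and the function name `five_number_summary` is introduced here for illustration. Quartile conventions differ between textbooks and calculators, so the Q1 and Q3 from this sketch may not match a TI-83 exactly.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    data = sorted(values)
    # "inclusive" is one of several quartile conventions (an assumption here);
    # other conventions can give slightly different Q1 and Q3.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]

# Hypothetical daily average wind speeds (not the real 1989 data).
wind = [0.2, 0.9, 1.1, 1.5, 1.9, 2.3, 2.8, 3.4, 8.67]

for label, value in zip(["Min", "Q1", "Median", "Q3", "Max"],
                        five_number_summary(wind)):
    print(f"{label:>6}: {value:.3f}")
```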

Boxplot Invented by John W. Tukey

Constructing Boxplots
1. Draw a single vertical axis spanning the range of the data. Draw short horizontal lines at the lower and upper quartiles and at the median. Then connect them with vertical lines to form a box.

Constructing Boxplots (cont.)
2. Erect “fences” around the main part of the data.
–The upper fence is 1.5 IQRs above the upper quartile.
–The lower fence is 1.5 IQRs below the lower quartile.
–Note: the fences only help with constructing the boxplot and should not appear in the final display.

Constructing Boxplots (cont.)
3. Use the fences to grow “whiskers.”
–Draw lines from the ends of the box up and down to the most extreme data values found within the fences.
–If a data value falls outside one of the fences, we do not connect it with a whisker.

Constructing Boxplots (cont.)
4. Add the outliers by displaying any data values beyond the fences with special symbols.
–We often (not always) use a different symbol for “far outliers” that are farther than 3 IQRs from the quartiles.

How to make a boxplot?
Draw a single vertical axis spanning the extent of the data
Draw short horizontal lines at Q1, the median, and Q3. Then connect them to make a box.
Draw ‘fences’
–Upper fence = Q3 + 1.5 * IQR
–Lower fence = Q1 - 1.5 * IQR
Grow ‘whiskers’
Add outliers
TI-83 can make boxplots
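The construction steps can be sketched in code as follows. This is only an illustration under stated assumptions: the data are hypothetical, `boxplot_stats` is a name made up for this sketch, and the quartiles use Python's "inclusive" convention, which may differ from the one a TI-83 uses. It computes the numbers a boxplot is drawn from rather than drawing the plot itself.

```python
import statistics

def boxplot_stats(values, k=1.5, far_k=3.0):
    """Compute fences, whisker ends, and outliers for a boxplot."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1

    # Fences sit k IQRs beyond the quartiles; they are not drawn.
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr

    # Whiskers reach the most extreme data values inside the fences.
    inside = [x for x in data if lower_fence <= x <= upper_fence]

    # Anything beyond a fence is plotted individually as an outlier.
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    # "Far outliers" lie more than far_k IQRs from the quartiles.
    far = [x for x in outliers
           if x < q1 - far_k * iqr or x > q3 + far_k * iqr]

    return {
        "q1": q1, "median": median, "q3": q3, "iqr": iqr,
        "lower_fence": lower_fence, "upper_fence": upper_fence,
        "whisker_low": inside[0], "whisker_high": inside[-1],
        "outliers": outliers, "far_outliers": far,
    }

# Hypothetical daily average wind speeds (not the real 1989 data).
wind = [0.2, 0.9, 1.1, 1.5, 1.9, 2.3, 2.8, 3.4, 8.67]
stats = boxplot_stats(wind)
for name, value in stats.items():
    print(name, value)
```

In this made-up sample the upper whisker stops at 3.4, and 8.67 is shown separately because it lies beyond the upper fence.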

Comparing groups
Relationship between a quantitative variable and a categorical variable
–A categorical variable defines groups
Is it windier in the winter or summer?
–A binary categorical variable
  Spring/Summer: April – September
  Fall/Winter: October – March
–A quantitative variable: average wind speed

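Comparison: Spring/Summer vs. Fall/Winter
Shape
–Mode: unimodal
–Symmetry: skewed to the right (Spring/Summer); less skewed (Fall/Winter)
–Outliers: no (Spring/Summer); yes (Fall/Winter)
Center: mean, median (values missing in source)
Spread: StdDev, IQR (values missing in source)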
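A sketch of how such a comparison could be computed, assuming the data arrive as (month, average wind speed) pairs. The records below are invented for illustration, and `season` and `summarize_by_season` are hypothetical helper names. The month split follows the slide: April through September is Spring/Summer, October through March is Fall/Winter.

```python
import statistics

def season(month):
    """Map a month number (1-12) to the binary season used in the slides."""
    return "Spring/Summer" if 4 <= month <= 9 else "Fall/Winter"

def summarize_by_season(records):
    """Group (month, speed) pairs by season and summarize each group."""
    groups = {}
    for month, speed in records:
        groups.setdefault(season(month), []).append(speed)

    summary = {}
    for name, speeds in groups.items():
        # "inclusive" quartiles are an assumption; conventions vary.
        q1, median, q3 = statistics.quantiles(speeds, n=4, method="inclusive")
        summary[name] = {
            "mean": statistics.mean(speeds),
            "median": median,
            "stdev": statistics.stdev(speeds),
            "iqr": q3 - q1,
        }
    return summary

# Hypothetical (month, average wind speed) records, one per month.
records = [(1, 3.1), (2, 2.6), (3, 4.0), (4, 1.8), (5, 1.2), (6, 0.9),
           (7, 0.6), (8, 0.8), (9, 1.1), (10, 2.2), (11, 5.5), (12, 2.9)]

for name, stats in summarize_by_season(records).items():
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

Grouping by the month number itself instead of by `season(month)` would give the month-by-month comparison.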

Are some months windier than others?

Summary
Average wind speed is lower and less variable in the summer, especially July
Average wind speed is higher and more variable in the winter
The highest wind speed occurs in November
More outliers than when plotting for the entire year

Outliers
Some outliers are obviously errors
–Misplacing the decimal point
–Digits transposed
–Digits repeated or omitted
–Units may be wrong
–Incorrectly copied
What to do with outliers?
–If there are any clear outliers and you are reporting the mean and standard deviation, report them with the outliers present and with the outliers removed. The differences may be quite revealing.
–Note: The median and IQR are not likely to be affected by the outliers.
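The advice to report summaries with and without outliers can be illustrated with a short sketch. The data are hypothetical, the helper names are invented here, and outliers are identified with the 1.5 IQR fence rule from the boxplot slides, which is one reasonable choice rather than the only one.

```python
import statistics

def quartiles(values):
    # "inclusive" quartiles are an assumption; conventions vary.
    return statistics.quantiles(values, n=4, method="inclusive")

def summarize(values):
    """Mean and SD (outlier-sensitive) plus median and IQR (resistant)."""
    q1, median, q3 = quartiles(values)
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "median": median,
        "iqr": q3 - q1,
    }

def drop_outliers(values, k=1.5):
    """Keep only the values that fall inside the k-IQR fences."""
    q1, _, q3 = quartiles(values)
    iqr = q3 - q1
    return [x for x in values if q1 - k * iqr <= x <= q3 + k * iqr]

# Hypothetical wind speeds with one unusually windy day.
wind = [0.2, 0.9, 1.1, 1.5, 1.9, 2.3, 2.8, 3.4, 8.67]

with_outliers = summarize(wind)
without_outliers = summarize(drop_outliers(wind))
for key in with_outliers:
    print(f"{key:>6}: {with_outliers[key]:.3f} with, "
          f"{without_outliers[key]:.3f} without")
```

In this made-up sample the mean and standard deviation move much more than the median and IQR when the single high value is removed.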

Timeplots For some data sets, we are interested in how the data behave over time. In these cases, we construct timeplots of the data.

Timeplots

Re-expressing Skewed Data to Improve Symmetry
When data are skewed, it is hard to simply summarize with a center and spread.
Can we transform the data to be more symmetric?
[Figure: histogram of the annual compensation to CEOs of the Fortune 500 companies in 2005]

Re-expressing Skewed Data to Improve Symmetry (cont.)
One way to make a skewed distribution more symmetric is to re-express or transform the data by applying a simple function (e.g., logarithmic function or square root).
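A small sketch of the effect of a log re-expression. The values are invented, not the CEO compensation data, and the moment-based skewness coefficient computed by the hypothetical helper `skewness` is a measure introduced here for illustration; the slides judge symmetry from the histogram instead.

```python
import math

def skewness(values):
    """Moment-based skewness: zero for symmetric data, positive for right skew."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed values (a few very large ones).
raw = [1, 2, 2, 3, 4, 6, 10, 25, 80, 300]
logged = [math.log10(x) for x in raw]  # log requires positive values

print(f"skewness before: {skewness(raw):.2f}")
print(f"skewness after log10: {skewness(logged):.2f}")
```

For this sample the skewness drops noticeably after the log, although it does not reach zero. A square root is a milder re-expression that could be tried the same way.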

Re-expressing to equalize spread across groups

After log transformation
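As a deliberately idealized illustration of why a log can equalize spread, suppose one group's values are exactly ten times another's. Everything in this sketch is hypothetical (the data, the group names, and the helper `iqr`); real groups will only approximately follow this pattern.

```python
import math
import statistics

def iqr(values):
    # "inclusive" quartiles are an assumption; conventions vary.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

# Hypothetical groups: group_b is group_a scaled up by a factor of 10.
group_a = [1, 2, 4, 8, 16]
group_b = [10 * x for x in group_a]

log_a = [math.log10(x) for x in group_a]
log_b = [math.log10(x) for x in group_b]

print("IQR before:", iqr(group_a), iqr(group_b))
print("IQR after log10:", round(iqr(log_a), 3), round(iqr(log_b), 3))
```

Taking logs turns the multiplicative difference between the groups into a constant shift, so the two boxes end up the same height and only their locations differ.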

What Can Go Wrong?
Avoid inconsistent scales, either within the display or when comparing two displays.
Label clearly so a reader knows what the plot displays.
Beware of outliers.
Be careful when comparing groups that have very different spreads.

What Can Go Wrong? (cont.)