Chapter 4: Displaying Quantitative Data

Histograms. Bins are equal-width “piles” that we use to divide up quantitative data. The bins and the counts in each bin give the distribution of the quantitative variable.
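As a rough illustration of how bins and counts are formed, here is a minimal Python sketch. The function name `histogram_counts`, the starting edge, and the price-change values are all made up for this example; they are not the Enron data.

```python
import math

def histogram_counts(values, bin_width, start):
    """Count the values falling in each equal-width bin.

    Bin i covers the interval [start + i*bin_width, start + (i+1)*bin_width).
    Empty bins are kept so gaps in the data stay visible.
    """
    n_bins = math.floor((max(values) - start) / bin_width) + 1
    counts = [0] * n_bins
    for v in values:
        counts[math.floor((v - start) / bin_width)] += 1
    return counts

# Hypothetical monthly price changes in dollars (illustration only)
changes = [-4.5, -1.2, 0.3, 1.8, 2.4, 2.9, 3.1, 5.6, -0.7, 0.9, 1.1, 7.2]
counts = histogram_counts(changes, bin_width=2.0, start=-6.0)

# Print a sideways text histogram, one row per bin
for i, n in enumerate(counts):
    lo = -6.0 + i * 2.0
    print(f"[{lo:5.1f}, {lo + 2.0:5.1f}) {'#' * n}")
```

Each value lands in exactly one bin, so the counts always add up to the number of cases.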

Enron Corporation Problem: Month-to-Month Stock Price Change. Enron Corporation was one of the world’s biggest energy supply corporations, dominating the energy trading business. Enron stock had sold for about $5 a share. In 2000, Enron stock closed at a 52-week high of $89.75. Less than a year later it hit a low of $0.25. Were there hints of trouble that might have been seen?

[Table: Monthly Price Change, with columns Jan, Feb, Mar, Apr, May, June, July, Aug, Sept, Oct, Nov, Dec]

Histogram of Enron Data

Relative Frequency Histogram. Replaces the counts on the vertical axis with the percentages of the total number of cases falling in each bin.
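The conversion from counts to percentages can be sketched in a few lines of Python. The bin counts below are hypothetical and the function name `relative_frequencies` is just a label chosen for this example.

```python
def relative_frequencies(counts):
    """Convert bin counts to percentages of the total number of cases."""
    total = sum(counts)
    return [100 * c / total for c in counts]

# Hypothetical bin counts (illustration only)
bin_counts = [1, 0, 2, 4, 3, 1, 1]
percents = relative_frequencies(bin_counts)

for c, p in zip(bin_counts, percents):
    print(f"count {c:2d} -> {p:5.1f}%")
```

The shape of the histogram does not change; only the labels on the vertical axis do.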

Stem-and-Leaf Plot. [Figure: stem-and-leaf display of Pulse Rate; 8|8 means 88 beats/min.] A stem-and-leaf plot contains all the information found in a histogram. When drawn carefully, it satisfies the area principle and shows the distribution. It preserves the individual data values. When turned on its side, it looks roughly like the histogram of the data.
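One possible way to build a stem-and-leaf display for whole-number data is sketched below in Python. The pulse rates are invented for illustration, and `stem_and_leaf` is a hypothetical helper that uses the last digit as the leaf.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return {stem: [leaves]} for whole numbers, using the last digit as the leaf."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical pulse rates in beats/min (illustration only)
pulse = [88, 64, 72, 76, 80, 68, 72, 84, 60, 72, 76, 64]
plot = stem_and_leaf(pulse)

# Print every stem in range, even an empty one, so gaps remain visible
for stem in range(min(plot), max(plot) + 1):
    leaves = "".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem} | {leaves}")
```

Because every leaf is an actual digit from the data, the original values can be read back from the display.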

How Many Bins? [Two displays of Pulse Rate; 8|8 means 88 beats/min.] Too few? Too many? It’s a judgment call. Use enough bins to show meaningful detail, but not so many that the data are too spread out.
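To see why the number of bins is a judgment call, the sketch below counts the same hypothetical pulse rates with three different bin widths. The helper `bin_counts` is an assumption made for this example; it lists only non-empty bins.

```python
from collections import Counter

def bin_counts(values, width):
    """Map each non-empty bin's lower edge to the count of values in [edge, edge + width)."""
    return dict(sorted(Counter((v // width) * width for v in values).items()))

# Hypothetical pulse rates in beats/min (illustration only)
pulse = [88, 64, 72, 76, 80, 68, 72, 84, 60, 72, 76, 64]

for width in (20, 5, 1):
    print(f"width {width:2d}: {bin_counts(pulse, width)}")
```

With a width of 20 almost everything falls into one pile and the shape is hidden; with a width of 1 nearly every value gets its own bin; a width of 5 sits in between.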

Dotplots. A simple display. Places a dot along an axis for each case in the data. Can be plotted horizontally with the counts on the vertical axis, or vertically with the counts on the horizontal axis. See graph on page 49.
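A text version of a vertical dotplot can be sketched as follows. The data values and the helper name `dotplot_lines` are hypothetical, chosen only to show one dot per case.

```python
from collections import Counter

def dotplot_lines(values):
    """Build one text line per whole-number value, with one 'o' per case."""
    counts = Counter(values)
    return [f"{x:>3} {'o' * counts[x]}" for x in range(min(values), max(values) + 1)]

# Hypothetical whole-number data, such as goals scored per game (illustration only)
goals = [0, 1, 1, 2, 2, 2, 3, 5]
lines = dotplot_lines(goals)
print("\n".join(lines))
```

Values with no cases still get a line, so a gap in the data shows up as an empty row.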

Quantitative Data Condition. The data are values of a quantitative variable whose units are known. This condition must be met in order to create a stem-and-leaf plot, histogram, or dotplot.

S.O.C.S. How we describe a distribution: Shape, Outliers, Center, Spread.

Shape. Describe any modes. Describe any symmetry. Describe any tails.

Shape. How many “humps”? Humps are called modes.

Multimodal

Uniform

Symmetry

Tails. The (usually) thinner ends of a distribution are called the tails. The right tail is longer, so the data is skewed to the right.

The left tail is longer, so the data is skewed to the left.

Symmetric, Skewed Right, or Skewed Left? Neither tail is longer, and the data appears to be symmetric.

Gaps. Help us see multiple modes. May help us notice when the data may have come from different sources or contain more than one group.

Outliers? Any data value that appears not to “belong” with the rest of the distribution. Always describe outliers in tentative terms (for example, “appears to be an outlier”). NEVER just “throw away” an outlier; it can be extremely important in context! Look for gaps in the data, which is usually where you find the outliers.

But How Do I Know It’s an Outlier? Shape, gaps, and even outliers are judgment calls at this point. There are generally accepted “tests” for outliers that statisticians have derived (we’ll see them shortly), and some graphs have clear skew, but there is some room for interpretation. Trust your eyes and what you “see” in the data!

Center. It could be the “mean” or the “median.” The center gives an easy description of a “typical” value and a concise summary of the whole batch of numbers. When a histogram is unimodal and symmetric, it’s easy to eyeball a rough estimate of the center. It is not so clear for other histograms (skewed, multimodal, etc.). In fact, for a multimodal histogram the center may be meaningless, because the histogram could be showing different sets of data.
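The difference between the two candidate centers can be checked with Python’s standard `statistics` module. Both data sets below are made up for illustration.

```python
from statistics import mean, median

# Hypothetical data sets (illustration only)
symmetric = [2, 3, 4, 5, 6, 7, 8]
skewed_right = [2, 3, 3, 4, 4, 5, 30]

for name, data in (("symmetric", symmetric), ("skewed right", skewed_right)):
    print(f"{name:12s} mean = {mean(data):.2f}, median = {median(data)}")
```

For the symmetric batch the mean and median agree, so either one describes a “typical” value. For the skewed batch the long right tail pulls the mean above the median, which is one reason the center is harder to pin down for skewed data.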

Spread. Variation matters. Are the data values tightly clustered around the center? Or is the data widely spread out?
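The slide does not name a specific measure of spread, so the sketch below assumes two common ones, the range and the standard deviation, and applies them to two made-up data sets that share the same center.

```python
from statistics import mean, stdev

def spread_summary(values):
    """Summarize spread with the range and the sample standard deviation (assumed measures)."""
    return {"range": max(values) - min(values), "stdev": stdev(values)}

# Hypothetical data sets with the same mean (illustration only)
tight = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

for name, data in (("tight", tight), ("wide", wide)):
    s = spread_summary(data)
    print(f"{name:5s} mean = {mean(data)}, range = {s['range']}, stdev = {s['stdev']:.2f}")
```

Both batches are centered at 50, yet they tell very different stories, which is why a description of center alone is incomplete.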

Just Checking. It’s often a good idea to think about what the distribution of the data set might look like before we collect the data. What do you think the distribution of the following data sets will look like? Be sure to think in terms of SOCS!

Just Checking. The grades of those who studied and the grades of those who did not study.

Monthly temperatures in Durham, NC.

The body masses of 1000 people of various ages.

“Think, Show, Tell” Example. Let’s go back to the Kentucky Derby example from Chapter 2. We’re going to focus on the distribution of race durations. Think: What do we want to find out? Identify the variables and report the W’s. We want to see the distribution of race times of the Kentucky Derby. We have the data from races between 1875 and […]. Be sure to check the appropriate condition!

Kentucky Derby Revisited. Show: When the data is quantitative, we almost always want to make a histogram with computer software or a graphing calculator. Ask yourself: is the histogram close to what we expected?

Kentucky Derby Revisited. Tell: Describe the distribution using SOCS. The main body of the distribution is bimodal and fairly symmetric, with most of the data clustered between 115 and 140 seconds. There appears to be an upper outlier, indicating that one race was run much slower than the others. It appears as though this data set combines two different data sets: one from when the race was 1.5 miles and another from when the race was shortened to 1.25 miles. Because of this, the center of the entire data set would probably be of little importance to us. The spread of the data appears to be small within each mode. This may suggest that pace did not change drastically over the 100+ years the race was tracked.

Comparing Infant Death Rates. In 2001 the infant death rate in the U.S. was 6.8 deaths per 1000 live births. How does the rate differ from region to region? The Kaiser Family Foundation collected data from all 50 states and the District of Columbia, allowing us to compare the infant death rates in the Northeast and Midwest to those in the South and West.

Comparing Infant Death Rates. Think: The W’s and how, but now let’s put the information into sentences. We want to compare infant death rates for regions of the United States. We have the 2001 rates for each state and the District of Columbia.

Comparing Infant Death Rates. Show: The rates are quantitative, so a stem-and-leaf display is appropriate. [Figure: back-to-back stem-and-leaf display, Infant Death Rates (by state), 2001, South and West versus Northeast and Midwest; 3|8 means 3.8 deaths per 1000 live births.]

Comparing Infant Death Rates. Tell: In general, infant death rates appear to have been somewhat higher for states in the South and West than in the Northeast and Midwest. The distribution is roughly symmetric, but may be slightly skewed to the right for the South and West. Nationally, most states had rates below 9. Infant death rates were more consistent in the Northeast and Midwest; no states were above 9, but one state had an unusually low 3.8 infant deaths per 1000 live births.

When Order Matters: Timeplots. Do we want to see the data in a specific order? Are we looking for patterns over time? Timeplots use the x-axis for time (year, month, day, hour, etc.) and the y-axis to plot the data points. The points are often connected because time is continuous.

Back to Enron. What does the timeplot show that the histogram can’t? [Figure: timeplot of Monthly Change in Stock Price ($).]
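One way to think about this question is sketched below with two made-up series that contain exactly the same values in a different order. The numbers are hypothetical, not Enron prices, and `month_to_month_changes` is a helper named for this example.

```python
from collections import Counter

def month_to_month_changes(series):
    """Return the change from each time point to the next."""
    return [later - earlier for earlier, later in zip(series, series[1:])]

# Two hypothetical series with the same values in a different order
steady_decline = [6, 5, 4, 3, 2, 1]
bouncing = [1, 6, 2, 5, 3, 4]

# A histogram only sees which values occurred, so the two look identical
same_histogram = Counter(steady_decline) == Counter(bouncing)
print("Same histogram:", same_histogram)

# Reading the values in time order reveals very different behavior
print("Steady decline:", month_to_month_changes(steady_decline))
print("Bouncing:      ", month_to_month_changes(bouncing))
```

A histogram discards the order of the cases, so a steady slide and an erratic bounce can produce the same picture. Only a display that keeps time on the x-axis can tell them apart.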

Re-expressing Skewed Data. One way to make a skewed distribution more symmetric is to re-express or transform the data by applying a simple function. Often, we will transform data using logarithms to create a more symmetric distribution. Don’t worry! We won’t be doing it by hand.
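A small Python sketch of a logarithmic re-expression follows. The data are invented, and the helper `center_gap` is an informal check devised for this example (the distance between mean and median as a fraction of the range), not a standard statistic.

```python
import math
from statistics import mean, median

def center_gap(values):
    """Informal skew check: (mean - median) as a fraction of the range."""
    return (mean(values) - median(values)) / (max(values) - min(values))

# Hypothetical right-skewed data (illustration only)
skewed = [1, 2, 4, 8, 16, 32, 64, 128, 1024]
logged = [math.log10(x) for x in skewed]

print(f"original: mean = {mean(skewed):.1f}, median = {median(skewed)}")
print(f"log10:    mean = {mean(logged):.2f}, median = {median(logged):.2f}")
print(f"gap before = {center_gap(skewed):.3f}, after = {center_gap(logged):.3f}")
```

After taking logarithms the mean and median sit much closer together relative to the range, which suggests the long right tail has been pulled in. Logarithms require positive data values.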

What Can Go Wrong? Don’t make a histogram of a categorical variable. Don’t look for shape, center, and spread of a bar chart. Don’t use bars in every display; save them for histograms and bar charts. Choose a bin width appropriate to the data. Avoid inconsistent scales. Label clearly.