Quantitative Data Who? Cans of cola. What? Weight (g) of contents.

Slides:



Advertisements
Similar presentations
Describing Quantitative Variables
Advertisements

DESCRIBING DISTRIBUTION NUMERICALLY
Chapter 2 Exploring Data with Graphs and Numerical Summaries
Unit 1.1 Investigating Data 1. Frequency and Histograms CCSS: S.ID.1 Represent data with plots on the real number line (dot plots, histograms, and box.
Understanding and Comparing Distributions 30 min.
Business Statistics: A Decision-Making Approach, 7e © 2008 Prentice-Hall, Inc. Chap 3-1 Business Statistics: A Decision-Making Approach 7 th Edition Chapter.
Business Statistics: A Decision-Making Approach, 7e © 2008 Prentice-Hall, Inc. Chap 3-1 Business Statistics: A Decision-Making Approach 7 th Edition Chapter.
Business Statistics: A Decision-Making Approach, 7e © 2008 Prentice-Hall, Inc. Chap 3-1 Business Statistics: A Decision-Making Approach 7 th Edition Chapter.
Exploratory Data Analysis. Computing Science, University of Aberdeen2 Introduction Applying data mining (InfoVis as well) techniques requires gaining.
What is Statistics? Statistics is the science of collecting, analyzing, and drawing conclusions from data –Descriptive Statistics Organizing and summarizing.
Warm Up Find the mean, median, mode, range, and outliers of the following data. 11, 7, 2, 7, 6, 12, 9, 10, 8, 6, 4, 8, 8, 7, 4, 7, 8, 8, 6, 5, 9 How does.
1 Further Maths Chapter 2 Summarising Numerical Data.
Displaying Quantitative Data Graphically and Describing It Numerically AP Statistics Chapters 4 & 5.
Essential Statistics Chapter 11 Picturing Distributions with Graphs.
MATH 2311 Section 1.5. Graphs and Describing Distributions Lets start with an example: Height measurements for a group of people were taken. The results.
Module 8 Test Review. Find the following from the set of data: 6, 23, 8, 14, 21, 7, 16, 8  Five Number Summary: Answer: Min 6, Lower Quartile 7.5, Median.
Chapter 5 Describing Distributions Numerically Describing a Quantitative Variable using Percentiles Percentile –A given percent of the observations are.
Copyright © 2009 Pearson Education, Inc. Slide 4- 1 Practice – Ch4 #26: A meteorologist preparing a talk about global warming compiled a list of weekly.
Probability & Statistics Box Plots. Describing Distributions Numerically Five Number Summary and Box Plots (Box & Whisker Plots )
Unit 3 Guided Notes. Box and Whiskers 5 Number Summary Provides a numerical Summary of a set of data The first quartile (Q 1 ) is the median of the data.
Describing Data Week 1 The W’s (Where do the Numbers come from?) Who: Who was measured? By Whom: Who did the measuring What: What was measured? Where:
Analyzing Data Week 1. Types of Graphs Histogram Must be Quantitative Data (measurements) Make “bins”, no overlaps, no gaps. Sort data into the bins.
ALL ABOUT THAT DATA UNIT 6 DATA. LAST PAGE OF BOOK: MEAN MEDIAN MODE RANGE FOLDABLE Mean.
AP Statistics. Chapter 1 Think – Where are you going, and why? Show – Calculate and display. Tell – What have you learned? Without this step, you’re never.
All About that Data Unit 6 Data.
UNIT ONE REVIEW Exploring Data.
Exploratory Data Analysis
Quantitative Data Continued
All About that Data Unit 6 Data.
Objective: Given a data set, compute measures of center and spread.
Unit 6 Day 2 Vocabulary and Graphs Review
MATH 2311 Section 1.5.
Laugh, and the world laughs with you. Weep and you weep alone
Jeopardy Final Jeopardy Chapter 1 Chapter 2 Chapter 3 Chapter 4
The histograms represent the distribution of five different data sets, each containing 28 integers from 1 through 7. The horizontal and vertical scales.
Chapter 3 Describing Data Using Numerical Measures
Analyze Data: IQR and Outliers
CHAPTER 1: Picturing Distributions with Graphs
DAY 3 Sections 1.2 and 1.3.
Topic 5: Exploring Quantitative data
CHAPTER 1 Exploring Data
Describing Distributions of Data
Friday Lesson Do Over Statistics is all about data. There is a story to be uncovered behind the data--a story with characters, plots and problems. The.
Displaying Distributions with Graphs
Displaying and Summarizing Quantitative Data
POPULATION VS. SAMPLE Population: a collection of ALL outcomes, responses, measurements or counts that are of interest. Sample: a subset of a population.
10.5 Organizing & Displaying Date
Introduction to Summary Statistics
Displaying and Summarizing Quantitative Data
pencil, red pen, highlighter, GP notebook, graphing calculator
Chapter 1 Warm Up .
Basic Practice of Statistics - 3rd Edition
Organizing, Summarizing, &Describing Data UNIT SELF-TEST QUESTIONS
Chapter 1: Exploring Data
Basic Practice of Statistics - 3rd Edition
Chapter 1: Exploring Data
Measures of Position Section 3.3.
Day 52 – Box-and-Whisker.
Honors Statistics Review Chapters 4 - 5
CHAPTER 1 Exploring Data
Chapter 1: Exploring Data
Lesson – Teacher Notes Standard:
Chapter 1: Exploring Data
pencil, red pen, highlighter, GP notebook, graphing calculator
Chapter 1: Exploring Data
Chapter 1: Exploring Data
Analyze Data: IQR and Outliers
Chapter 1: Exploring Data
MATH 2311 Section 1.5.
Chapter 1: Exploring Data
Presentation transcript:

Quantitative Data Who? Cans of cola. What? Weight (g) of contents. 368, 351, 355, 367, 352, 369, 370, 369 370, 355, 354, 357, 366, 353, 373, 365 355, 356, 362, 354, 353, 378, 368, 349

Weight of Contents Note that a histogram is a special kind of bar chart where the bar heights are the frequency (number of items, in this example cans) with weights that fall between the values given on the horizontal number line.

Weight of Contents If we choose to split the stem on the stem plot and have bins of width 5 g instead of 10 g we get a much different picture of the distribution of the data. We have a bi-modal shape. This is often an indication that there are two distinct phenomena involved.

Weight of Contents Who? What? Cans of cola. Weight of contents (g) Type of cola (Regular or Diet) In fact there are two different kinds of cola, regular and diet. There are twelve cans of each.

Weight of Contents Regular Cola 36 2 36* 5678899 003 37* 8 Diet Cola 34 34* 9 35 123344 35* 55567 Above are stem plots for the two types of cola. With two types of cola, we would like to compare the two types.

Comparing Distributions How do the distributions compare in terms of Shape? Center? Spread?

Comparing Groups Regular Diet Min: 362 g QL: 366.5 g Med: 368.5 g QU: 370 g Max: 378 g Diet Min: 349 g QL: 352.5 g Med: 354 g QU: 355 g Max: 357 g Here are the five number summaries for each group.

Comparing Groups From the box plots, Regular cans’ contents tend to weigh more than contents of Diet cola. Note that there is a potential outlier for the Regular cans’, the weight of 378 g is separated from the rest of the box. Where are the boundaries? Upper quartile + 1.5*IQR = 370 + 1.5*3.5 = 375.25. Note that 378 is above that upper boundary and so is highlighted. The whisker on the box extends out to the highest value below 375.25 (373 g). The lower boundary for regular is Lower quartile – 1.5*IQR = 366.5 – 5.25 = 361.25. The minimum value (362) is above this boundary and so is the end of the lower whisker. For Diet cola, no points fall outside the boundaries.

Comparing Groups Regular Diet Med: 368.5 g Mean: 368.8 g Range: 16 g IQR: 3.5 g Std dev: 4.03 g Diet Med: 354 g Mean: 353.7 g Range: 8 g IQR: 2.5 g Std dev: 2.23 g From the numerical summaries. On average, Regular cola contents has a higher weight than Diet cola. The various measures of spread give a somewhat mixed message. The sample range for Regular cola is twice that of Diet. The sample standard deviation for Regular cola is almost twice that of Diet. However, the IQR for Regular is not that much greater than that for Diet. What is happening? The one potential outlier for Regular (378) is affecting the Range and the Standard deviation but is not influencing the IQR that much. If you go back to the box plots, the widths of the boxes are fairly similar (the IQRs are similar) however the one large Regular value makes the Range and Standard deviation larger.

JMP The data table is arranged so that rows are cases (Who?) and columns are variables (What?). Before entering data into JMP answer the questions Who? and What?

JMP – Data Table Weight Type of Cola 1 368 R 2 367 11 378 12 13 351 D 14 355 23 353 24 349 There are 24 cases (cans of cola) and so there are 24 rows of data. There are two variables for each can; weight and type. Weight is a numerical (quantitative) variable (JMP puts numbers on the right of columns). Type is a categorical (character/nominal in JMP speak) variable (JMP puts characters on the left of columns).

JMP – Analyze Analyze – Distribution Y, Columns: Weight

JMP – Output Distribution Weight Stack Display Options: Horizontal Layout Histogram Options: Count Axis

JMP – Output JMP will automatically select the bins. You can change these by Right click on Weight axis; Axis Settings Minimum: 340 Maximum: 380 Increment: 10

Note that JMP computes quartiles a little differently. This is ok Note that JMP computes quartiles a little differently. This is ok. If you are asked to use JMP or given JMP output, use the values that JMP gives you. Rounding summaries. Round summaries to one more decimal place than what the original data have. For the standard deviation, round to two more decimal places than the original data. Round final answers only. If you wanted to compute the sample standard deviation by hand (why I have no idea) you should not round any intermediate steps in the calculation to avoid round off error.

JMP – Analyze Analyze – Distribution Y, Columns: Weight By: Type of Cola

JMP – Output Distribution Weight Uniform Scaling Stack Display Options: Horizontal Layout Histogram Options: Count Axis

Note that JMP computes quartiles a little differently. This is ok Note that JMP computes quartiles a little differently. This is ok. If you are asked to use JMP or given JMP output, use the values that JMP gives you. Rounding summaries. Round summaries to one more decimal place than what the original data have. For the standard deviation, round to two more decimal places than the original data. Round final answers only. If you wanted to compute the sample standard deviation by hand (why I have no idea) you should not round any intermediate steps in the calculation to avoid round off error.

JMP – Analyze Analyze – Fit Y by X Note: Y is numerical/continuous Y, Response: Weight X, Factor: Type of Cola Note: Y is numerical/continuous X is character/nominal

JMP – Output One way analysis of Weight by Type of Cola Display Options – Box Plots, Mean Lines, Grand Mean Highlight (click on, hold down shift if more than one) potential outliers Means and Std Dev