Quantitative Data Who? Cans of cola. What? Weight (g) of contents.

Presentation transcript:

Quantitative Data
Who? Cans of cola. What? Weight (g) of contents.
368, 351, 355, 367, 352, 369, 370, 369,
370, 355, 354, 357, 366, 353, 373, 365,
355, 356, 362, 354, 353, 378, 368, 349

Weight of Contents
Note that a histogram is a special kind of bar chart: the height of each bar is the frequency (the number of items, in this example cans) whose weights fall between the values given on the horizontal number line.
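
The bar heights can be checked by counting the cans in each bin. The sketch below is only an illustration: the bin start of 340 g and the width of 10 g are taken from the axis settings mentioned later in these slides, while the choice that each bin includes its left edge and excludes its right edge is an assumption, and `bin_counts` is a hypothetical helper name.

```python
from collections import Counter

# Weights (g) of the 24 cans, as listed on the first slide.
weights = [368, 351, 355, 367, 352, 369, 370, 369,
           370, 355, 354, 357, 366, 353, 373, 365,
           355, 356, 362, 354, 353, 378, 368, 349]

def bin_counts(values, start, width):
    """Count values per bin [left, left + width), keyed by the bin's left edge.

    Assumption: bins include their left edge and exclude their right edge.
    """
    counts = Counter((v - start) // width for v in values)
    return {start + k * width: counts[k] for k in sorted(counts)}

counts_10 = bin_counts(weights, start=340, width=10)
for left, n in counts_10.items():
    # One '#' per can gives a rough sideways histogram.
    print(f"{left}-{left + 10}: {'#' * n} ({n})")
```

Each printed row corresponds to one bar of the histogram, and the counts add up to the 24 cans.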

Weight of Contents
If we split the stems on the stem plot, so that the bins have width 5 g instead of 10 g, we get a much different picture of the distribution of the data. The shape is now bimodal. This is often an indication that there are two distinct phenomena involved.
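
A split-stem plot of all 24 weights can be sketched in a few lines. This is a minimal illustration, not the software used for the slides: `split_stem_plot` is a hypothetical helper, it assumes leaves 0-4 go on the plain stem and leaves 5-9 on the starred stem, and it skips stems that have no leaves.

```python
from collections import defaultdict

# Weights (g) of the 24 cans, as listed on the first slide.
weights = [368, 351, 355, 367, 352, 369, 370, 369,
           370, 355, 354, 357, 366, 353, 373, 365,
           355, 356, 362, 354, 353, 378, 368, 349]

def split_stem_plot(values):
    """Return the lines of a stem plot with each stem split into two halves.

    The stem is the tens part (e.g. 36 for 362) and the leaf is the units digit.
    Assumption: leaves 0-4 sit on the plain stem, leaves 5-9 on the starred stem.
    """
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows[(stem, leaf >= 5)].append(str(leaf))
    lines = []
    for stem, upper_half in sorted(rows):
        label = f"{stem}*" if upper_half else f"{stem}"
        lines.append(f"{label:<4}| {''.join(rows[(stem, upper_half)])}")
    return lines

lines = split_stem_plot(weights)
print("\n".join(lines))
```

The long rows around the 35 stems and around 36* are the two clusters that give the bimodal shape.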

Weight of Contents
Who? Cans of cola. What? Weight of contents (g) and type of cola (Regular or Diet).
In fact, there are two different kinds of cola, regular and diet. There are twelve cans of each.

Weight of Contents
Regular Cola
36  | 2
36* | 5678899
37  | 003
37* | 8
Diet Cola
34  |
34* | 9
35  | 123344
35* | 55567
Above are stem plots for the two types of cola. With two types of cola, we would like to compare them.

Comparing Distributions
How do the distributions compare in terms of shape, center, and spread?

Comparing Groups
        Regular    Diet
Min:    362 g      349 g
QL:     366.5 g    352.5 g
Med:    368.5 g    354 g
QU:     370 g      355 g
Max:    378 g      357 g
Here are the five-number summaries for each group.
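
The five-number summaries can be reproduced with a short sketch. It assumes the quartiles are taken as the medians of the lower and upper halves of the sorted data, a rule that happens to give the values shown here; `five_number_summary` is a hypothetical helper name, and the two lists are read off the stem plots.

```python
from statistics import median

# Weights (g) for each type of cola, read off the stem plots.
regular = [362, 365, 366, 367, 368, 368, 369, 369, 370, 370, 373, 378]
diet = [349, 351, 352, 353, 353, 354, 354, 355, 355, 355, 356, 357]

def five_number_summary(values):
    """Return (min, lower quartile, median, upper quartile, max).

    Assumption: each quartile is the median of one half of the sorted data;
    for an odd number of values the middle value belongs to neither half.
    """
    data = sorted(values)
    half = len(data) // 2
    lower_half = data[:half]
    upper_half = data[len(data) - half:]
    return (data[0], median(lower_half), median(data),
            median(upper_half), data[-1])

print("Regular:", five_number_summary(regular))
print("Diet:   ", five_number_summary(diet))
```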

Comparing Groups
From the box plots, the contents of Regular cans tend to weigh more than the contents of Diet cans. Note that there is a potential outlier among the Regular cans: the weight of 378 g is separated from the rest of the box plot.
Where are the boundaries? The upper boundary for Regular is Upper quartile + 1.5*IQR = 370 + 1.5*3.5 = 375.25. Note that 378 is above that upper boundary and so is highlighted. The whisker on the box extends out to the highest value below 375.25 (373 g). The lower boundary for Regular is Lower quartile - 1.5*IQR = 366.5 - 5.25 = 361.25. The minimum value (362) is above this boundary and so is the end of the lower whisker.
For Diet cola, no points fall outside the boundaries.
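
The boundary arithmetic can be written out as a small sketch. The quartiles are passed in from the five-number summaries above rather than recomputed; `fences` and `whiskers_and_outliers` are hypothetical helper names, and the lists are read off the stem plots.

```python
# Weights (g) for each type of cola, read off the stem plots.
regular = [362, 365, 366, 367, 368, 368, 369, 369, 370, 370, 373, 378]
diet = [349, 351, 352, 353, 353, 354, 354, 355, 355, 355, 356, 357]

def fences(q_lower, q_upper):
    """Return the (lower, upper) boundaries: quartile -/+ 1.5 * IQR."""
    iqr = q_upper - q_lower
    return q_lower - 1.5 * iqr, q_upper + 1.5 * iqr

def whiskers_and_outliers(values, q_lower, q_upper):
    """Return (lower whisker end, upper whisker end, potential outliers)."""
    low, high = fences(q_lower, q_upper)
    inside = [v for v in values if low <= v <= high]
    outliers = sorted(v for v in values if v < low or v > high)
    # Whiskers stop at the most extreme values still inside the boundaries.
    return min(inside), max(inside), outliers

print(fences(366.5, 370))                          # boundaries for Regular
print(whiskers_and_outliers(regular, 366.5, 370))  # Regular whiskers, outliers
print(whiskers_and_outliers(diet, 352.5, 355))     # Diet whiskers, outliers
```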

Comparing Groups
           Regular    Diet
Med:       368.5 g    354 g
Mean:      368.8 g    353.7 g
Range:     16 g       8 g
IQR:       3.5 g      2.5 g
Std dev:   4.03 g     2.23 g
From the numerical summaries, Regular cola contents weigh more, on average, than Diet cola contents. The various measures of spread give a somewhat mixed message. The sample range for Regular cola is twice that of Diet. The sample standard deviation for Regular cola is almost twice that of Diet. However, the IQR for Regular is not that much greater than that for Diet. What is happening? The one potential outlier for Regular (378) is affecting the range and the standard deviation but is not influencing the IQR that much. If you go back to the box plots, the widths of the boxes are fairly similar (the IQRs are similar); however, the one large Regular value makes the range and standard deviation larger.
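
One way to see the pull of the single large value is to recompute the range and standard deviation with 378 g set aside. The sketch below is an illustration only and not part of the original slides: `spread` is a hypothetical helper, the lists are read off the stem plots, and setting a value aside here is a diagnostic, not a recommendation to delete data.

```python
from statistics import mean, stdev

# Weights (g) for each type of cola, read off the stem plots.
regular = [362, 365, 366, 367, 368, 368, 369, 369, 370, 370, 373, 378]
diet = [349, 351, 352, 353, 353, 354, 354, 355, 355, 355, 356, 357]

def spread(values):
    """Return (mean, range, sample standard deviation) of the weights."""
    return mean(values), max(values) - min(values), stdev(values)

for name, cans in (("Regular", regular), ("Diet", diet)):
    m, r, s = spread(cans)
    print(f"{name}: mean={m:.1f} g, range={r} g, std dev={s:.2f} g")

# Illustration only: set aside the potential outlier and recompute.
regular_trimmed = [w for w in regular if w != 378]
_, trimmed_range, trimmed_sd = spread(regular_trimmed)
print(f"Regular without 378: range={trimmed_range} g, "
      f"std dev={trimmed_sd:.2f} g")
```

With 378 g set aside, both the range and the standard deviation for Regular come out noticeably smaller, which is the effect described above.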

JMP
The data table is arranged so that rows are cases (Who?) and columns are variables (What?). Before entering data into JMP, answer the questions Who? and What?

JMP – Data Table
Row   Weight   Type of Cola
1     368      R
2     367
11    378
12
13    351      D
14    355
23    353
24    349
There are 24 cases (cans of cola) and so there are 24 rows of data. There are two variables for each can: weight and type. Weight is a numerical (quantitative) variable (JMP puts numbers on the right of columns). Type is a categorical (character/nominal in JMP speak) variable (JMP puts characters on the left of columns).

JMP – Analyze
Analyze – Distribution
Y, Columns: Weight

JMP – Output
Distribution
Weight
Stack
Display Options: Horizontal Layout
Histogram Options: Count Axis

JMP – Output
JMP will automatically select the bins. You can change these by right-clicking on the Weight axis and choosing Axis Settings:
Minimum: 340
Maximum: 380
Increment: 10

Note that JMP computes quartiles a little differently. This is OK. If you are asked to use JMP or are given JMP output, use the values that JMP gives you.
Rounding summaries. Round summaries to one more decimal place than the original data have. For the standard deviation, round to two more decimal places than the original data. Round final answers only. If you wanted to compute the sample standard deviation by hand (why, I have no idea), you should not round any intermediate steps in the calculation, to avoid round-off error.
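
To see how much quartiles can move with the computing rule, here is a hedged sketch comparing the median-of-halves rule with the two interpolation methods in Python's `statistics.quantiles`. It makes no claim about which rule JMP applies; it only shows that reasonable conventions can disagree slightly on the same data. The last lines apply the rounding guidance to the Regular cans, rounding once at the end. `median_of_halves` is a hypothetical helper name.

```python
from statistics import mean, median, quantiles, stdev

# Weights (g) of the Regular cans, read off the stem plot.
regular = [362, 365, 366, 367, 368, 368, 369, 369, 370, 370, 373, 378]

def median_of_halves(values):
    """Return (lower, upper) quartiles as medians of the two halves."""
    data = sorted(values)
    half = len(data) // 2
    return median(data[:half]), median(data[len(data) - half:])

hand_q1, hand_q3 = median_of_halves(regular)
excl_q1, _, excl_q3 = quantiles(regular, n=4, method="exclusive")
incl_q1, _, incl_q3 = quantiles(regular, n=4, method="inclusive")

print("median of halves:", hand_q1, hand_q3)
print("exclusive method:", excl_q1, excl_q3)
print("inclusive method:", incl_q1, incl_q3)

# Rounding: the data are whole grams, so report the mean to one decimal
# place and the standard deviation to two, rounding only the final value.
mean_rounded = round(mean(regular), 1)
sd_rounded = round(stdev(regular), 2)
print(mean_rounded, sd_rounded)
```

For these data the three rules agree on the upper quartile and differ only slightly on the lower quartile, which is the kind of small discrepancy the note above is warning about.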

JMP – Analyze
Analyze – Distribution
Y, Columns: Weight
By: Type of Cola

JMP – Output
Distribution
Weight
Uniform Scaling
Stack
Display Options: Horizontal Layout
Histogram Options: Count Axis

JMP – Analyze
Analyze – Fit Y by X
Y, Response: Weight
X, Factor: Type of Cola
Note: Y is numerical/continuous; X is character/nominal.

JMP – Output
One way analysis of Weight by Type of Cola
Display Options – Box Plots, Mean Lines, Grand Mean
Highlight potential outliers (click on each; hold down Shift if there is more than one)
Means and Std Dev