Choosing the “Best Average”

The shape of your data and the existence of any outliers may help you choose the best average.

This distribution of stolen bases is skewed right, with a median of 5, as noted on the histogram. It does not seem plausible that the balancing point (mean) is also 5. Because the distribution is stretched out to the right, the mean must be greater than 5. Think of all the extreme values that will pull the mean up.
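To see the pull of the right tail in numbers, here is a minimal sketch in Python. The stolen-base counts below are hypothetical values chosen to be skewed right with a median of 5; they are not the data behind the histogram on the slide.

```python
from statistics import mean, median

# Hypothetical right-skewed data (NOT the slide's actual stolen-base data):
# most values are small, with a few large values stretching the right tail.
stolen_bases = [0, 1, 2, 3, 4, 5, 5, 6, 8, 12, 20, 35, 50]

center_median = median(stolen_bases)  # the middle value of the sorted data
center_mean = mean(stolen_bases)      # the balancing point

print("median:", center_median)
print("mean:  ", round(center_mean, 1))
print("mean is greater than median:", center_mean > center_median)
```

The few large values in the right tail raise the mean well above the median, which is the pattern described above for a right-skewed distribution.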

Being able to identify the shape and center of a distribution is a great start. However, two distributions can have the same shape and center but still look quite different. Here are the dotplots that show 100 performances by two different bowlers. Both distributions are unimodal and symmetric, with centers around 150. However, it is important to compare the spread (variability) of the distributions.

Measuring Variability In sports, it is important to measure variability because it shows the consistency of an athlete or team. For example, if the distribution of an athlete’s performances has little variability, it means that he or she is very consistent. There are several ways to measure the variability of a distribution. We will focus on the range and the interquartile range.

Range The range of a distribution is the distance between the minimum value and the maximum value. Examples: The range of AL runs = 901 - 646 = 255 runs. The range of NL runs = 855 - 637 = 218 runs. We have some evidence that there is less variability in the NL distribution. Range can be a bit deceptive if there is an unusually high or unusually low value in a distribution. For this reason, we often use a second measure of variability called the interquartile range.
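The two range calculations can be checked with a short Python sketch. The lists are the 2008 AL and NL runs-scored values given in the calculator section of this presentation; the function name `data_range` is just an illustrative choice.

```python
# 2008 runs scored, as listed in the calculator section of this presentation
al_runs = [782, 845, 811, 805, 821, 691, 765, 829, 789, 646, 671, 774, 901, 714]
nl_runs = [720, 753, 855, 704, 747, 770, 712, 700, 750, 799, 799, 735, 637, 640, 779, 641]

def data_range(values):
    """Distance between the maximum value and the minimum value."""
    return max(values) - min(values)

print("AL range:", data_range(al_runs))  # 901 - 646 = 255
print("NL range:", data_range(nl_runs))  # 855 - 637 = 218
```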

Interquartile Range The interquartile range (IQR) is a single number that measures the range of the middle half of the distribution. It ignores the values in the lowest quarter and the values in the highest quarter of the distribution.

In order to calculate the IQR, we first have to find the quartiles of the distribution, which are the values that divide the distribution into four groups of roughly the same size.

The IQR for the NL distribution is smaller than the IQR for the AL distribution. Therefore, we have evidence that there is less variability in the NL distribution.
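Here is a sketch of the quartile and IQR calculation for the run data. It assumes the median-of-halves convention (Q1 is the median of the lower half of the sorted data and Q3 is the median of the upper half, leaving out the overall median when the number of values is odd). Other software may use a slightly different quartile rule, so small differences are possible. The function names are illustrative.

```python
from statistics import median

al_runs = [782, 845, 811, 805, 821, 691, 765, 829, 789, 646, 671, 774, 901, 714]
nl_runs = [720, 753, 855, 704, 747, 770, 712, 700, 750, 799, 799, 735, 637, 640, 779, 641]

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves convention (assumed)."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]   # lowest half of the data
    upper = data[-half:]  # highest half (skips the middle value if n is odd)
    return median(lower), median(data), median(upper)

def iqr(values):
    """Interquartile range: Q3 - Q1, the range of the middle half."""
    q1, _, q3 = quartiles(values)
    return q3 - q1

print("AL quartiles:", quartiles(al_runs), "IQR:", iqr(al_runs))
print("NL quartiles:", quartiles(nl_runs), "IQR:", iqr(nl_runs))
```

Under this convention the NL IQR comes out smaller than the AL IQR, which agrees with the comparison above.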

Unusually large or small values can have a big impact on measures like the mean and range. Suppose we were going to calculate the mean salary and the range of salaries for students in this classroom. Let’s say Adam Sandler finds out he is one class short of graduating high school, and that class happens to be Algebra I. He moves to LaGrange and transfers into this class. What effect would his salary have on the mean? On the range? What type of effect would it have on the median? On the IQR?

Outliers An outlier is any value that falls outside the pattern of the rest of the data (an unusually high or unusually low value in a distribution). Outliers can have a big effect on summary statistics, such as the mean and range.

Here are Tennessee Titans running back Chris Johnson’s yards for each rush during a game against the Houston Texans on September 20, 2009. Do there appear to be any outliers? The mean is brought up greatly by the two outliers. However, the median is relatively unaffected.

A measure of center or spread is resistant if it is not strongly influenced by outliers. Median and interquartile range are resistant to outliers. Mean and range are not resistant to outliers.
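The classroom salary question can be explored with a small Python sketch. All of the salary figures are made up for illustration (including the very large one), and the IQR uses the median-of-halves convention as an assumption.

```python
from statistics import mean, median

def iqr(values):
    """Q3 - Q1 using the median-of-halves convention (assumed)."""
    data = sorted(values)
    half = len(data) // 2
    return median(data[-half:]) - median(data[:half])

def summarize(values):
    """Two measures of center and two measures of spread."""
    return {
        "mean": mean(values),
        "median": median(values),
        "range": max(values) - min(values),
        "iqr": iqr(values),
    }

# Hypothetical yearly earnings for eight students (invented numbers)
class_salaries = [0, 0, 2000, 3000, 4000, 5000, 6000, 8000]
# The same class after one very highly paid student joins (invented number)
with_celebrity = class_salaries + [20_000_000]

before = summarize(class_salaries)
after = summarize(with_celebrity)
for name in before:
    print(f"{name:>6}: {before[name]:>12,.0f} -> {after[name]:>12,.0f}")
```

In this made-up example the mean and range jump by millions, while the median and IQR move only a little. That is what it means for the median and IQR to be resistant.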

The rule of thumb is that an observation is an outlier if it lies more than 1.5 IQRs below the first quartile or more than 1.5 IQRs above the third quartile. Let’s practice using Chris Johnson’s 16 rushing attempts from the September 20, 2009 game against the Texans.
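The 1.5 × IQR rule can be sketched in Python as follows. The rushing values below are hypothetical stand-ins (16 invented attempts, not Chris Johnson's actual yardage), and the quartiles use the median-of-halves convention as an assumption. The AL and NL lists are the 2008 run data from the calculator section of this presentation.

```python
from statistics import median

def find_outliers(values):
    """Return the values lying more than 1.5 IQRs below Q1 or above Q3."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])   # median of the lower half (assumed convention)
    q3 = median(data[-half:])  # median of the upper half
    step = 1.5 * (q3 - q1)
    low_fence, high_fence = q1 - step, q3 + step
    return [x for x in data if x < low_fence or x > high_fence]

# Hypothetical yards per rush for 16 attempts (NOT the real game data)
rush_yards = [-1, 0, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 7, 8, 40, 60]
print("rushing outliers:", find_outliers(rush_yards))

# The 2008 run data from this presentation
al_runs = [782, 845, 811, 805, 821, 691, 765, 829, 789, 646, 671, 774, 901, 714]
nl_runs = [720, 753, 855, 704, 747, 770, 712, 700, 750, 799, 799, 735, 637, 640, 779, 641]
print("AL outliers:", find_outliers(al_runs))
print("NL outliers:", find_outliers(nl_runs))
```

For the invented rushing data, Q1 = 2 and Q3 = 6.5, so the fences are -4.75 and 13.25 and only the two long runs are flagged. Neither run distribution has a value outside its fences.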

Comparing Distributions When asked to compare two distributions, you must address four points: the shape, the outliers, the center, and the spread. Think of the acronym SOCS to help you remember what to address.

The shape of a distribution may be difficult to determine from a boxplot. Try comparing the distance from the median to the minimum and maximum values to determine if a distribution is skewed or roughly symmetric. You will not be able to tell if a distribution is unimodal from looking at a boxplot.

Here are boxplots for the number of runs scored in the AL and in the NL during 2008. (Note: the plots are on the same scale for comparison purposes.) Let’s compare using our four points.

Shape The AL distribution is skewed slightly left (the left half of the distribution appears more spread out). The NL distribution is approximately symmetric.

Outliers Neither distribution contains an outlier.

Center The median for the AL distribution is higher than the median for the NL distribution, so teams typically score more runs in the AL.

Spread The AL distribution is slightly more spread out because it has both a larger range and larger IQR. This indicates there is more variability among AL teams and more consistency among NL teams.

Using the TI-84 to Make Graphs and Calculate Summary Statistics As fun as it is to calculate everything by hand, the TI-84 calculator can do many of our calculations for us. The calculator can create boxplots and histograms, and it can calculate summary statistics.

Boxplot Let’s use our 2008 run data. Here are the numbers:
AL runs scored: 782, 845, 811, 805, 821, 691, 765, 829, 789, 646, 671, 774, 901, 714
NL runs scored: 720, 753, 855, 704, 747, 770, 712, 700, 750, 799, 799, 735, 637, 640, 779, 641

The first thing we have to do is store this data as a list. Press STAT and choose the first option, EDIT. Enter the 14 AL data values in L1 and the 16 NL values in L2.

Now we are going to set up the boxplot. Exit back into the home screen. Then press STAT PLOT (2nd and Y=). Choose Plot1. Then, turn Plot1 on. Scroll to Type and choose the boxplot icon (with outliers). It is the first option in the second row. Enter L1 for Xlist. Enter 1 for Freq. Choose a mark for outliers.

Now we will display the graph. Press ZOOM. Then select option 9: ZOOMSTAT. Press ENTER. Press TRACE and scroll around to see different statistics for the distribution.

To see the boxplot for the NL distribution at the same time: Go back into STAT PLOT and turn on Plot2. Repeat the steps, but enter L2 for Xlist. To do this, scroll down to Xlist. Then press 2nd-2 (you will see the L2 button on top of the number 2).

Histogram Note: We can only view one histogram at a time. Start by pressing STAT PLOT. We want to turn on Plot 1. Make sure no other plots are turned on. Once in Plot1, change Type to Histogram. Enter L1 for Xlist. Keep Freq at 1.

To display the graph, press ZOOM and select option 9: ZOOMSTAT. Press TRACE to see the class boundaries and frequencies.

To change the boundaries, press WINDOW. Xmin defines where the first class begins and Xscl defines the class width. Xmax, Ymin, and Ymax define how big the window will be. To have classes of size 50 starting at 600, set Xmin to 600 and Xscl to 50.
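To check what the histogram should show for classes of size 50 starting at 600, here is a sketch that counts the AL values in each class. It assumes each class includes its lower boundary but not its upper boundary, and that the start and width are whole numbers. The function name `class_frequencies` is illustrative.

```python
al_runs = [782, 845, 811, 805, 821, 691, 765, 829, 789, 646, 671, 774, 901, 714]

def class_frequencies(values, start, width):
    """Count the values in classes [start, start + width), [start + width, ...).

    Returns a dict mapping each class's lower boundary to its frequency.
    Assumes integer start and width, and no value below start.
    """
    if not values:
        return {}
    counts = {}
    for v in values:
        if v < start:
            raise ValueError(f"{v} is below the first class boundary {start}")
        lower = start + ((v - start) // width) * width  # lower boundary of v's class
        counts[lower] = counts.get(lower, 0) + 1
    # Include empty classes so gaps in the histogram are visible
    last = max(counts)
    return {lo: counts.get(lo, 0) for lo in range(start, last + width, width)}

for lower, freq in class_frequencies(al_runs, start=600, width=50).items():
    print(f"{lower}-{lower + 50}: {freq}")
```

With these settings the AL data has its tallest class at 800 to 850 and an empty class at 850 to 900.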

Calculating Summary Statistics Make sure your run data is still stored in lists. Press STAT, scroll to the CALC menu, and choose the first option, 1:1-Var Stats. Next, press 2nd-1 to indicate you want the statistics for L1. Then press ENTER.

Here is the information given. Scroll down for additional information.

To get the data for the NL distribution, repeat the process using L2.
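For comparison with the calculator, here is a rough Python sketch that computes a similar set of one-variable summary statistics for both lists. The dictionary keys are labels chosen for this example, and the quartiles use the median-of-halves convention as an assumption, so treat the output as a cross-check rather than an exact copy of the calculator screen.

```python
from statistics import mean, median, pstdev, stdev

al_runs = [782, 845, 811, 805, 821, 691, 765, 829, 789, 646, 671, 774, 901, 714]
nl_runs = [720, 753, 855, 704, 747, 770, 712, 700, 750, 799, 799, 735, 637, 640, 779, 641]

def one_var_stats(values):
    """Summary statistics for one quantitative variable (labels are illustrative)."""
    data = sorted(values)
    half = len(data) // 2
    return {
        "n": len(data),
        "mean": mean(data),
        "sum": sum(data),
        "sample_sd": stdev(data),       # divides by n - 1
        "population_sd": pstdev(data),  # divides by n
        "min": data[0],
        "Q1": median(data[:half]),      # median of the lower half (assumed convention)
        "median": median(data),
        "Q3": median(data[-half:]),     # median of the upper half
        "max": data[-1],
    }

stats_al = one_var_stats(al_runs)
stats_nl = one_var_stats(nl_runs)
for key in stats_al:
    print(f"{key:>13}: AL = {stats_al[key]:.2f}   NL = {stats_nl[key]:.2f}")
```

The last five entries for each league form the five-number summary used to draw the boxplots.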

One iPad app that can calculate summary statistics for us is called Bstatistics Lite. Download it!!! When entering data, observations have to be separated with commas.