1.2 Describing Distributions with Numbers

1.2 Describing Distributions with Numbers AP Statistics

Ten exam scores: 75 76 82 93 45 68 74 82 91 98
Create a stem plot and describe the data.

Exam Scores
4 | 5
5 |
6 | 8
7 | 4 5 6
8 | 2 2
9 | 1 3 8
Key: 4|5 = 45

S (shape): skewed left
O (outliers): low outlier at 45
C (center): 79
S (spread): range 45-98, spread of 53

The distribution of exam scores is skewed left. There appears to be a low outlier at 45. The center of the data is around 79, with a range from 45 through 98. The data contains no gaps or clusters.

Section 1.1 dealt with the graphical approach to data analysis, through which we gain information about the shape of a distribution. The first step in data analysis is to always look at the data. This section deals with the numerical approach to data analysis, providing information about center and spread.

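Measures of CENTER
Mean ($\bar{x}$) = “average value”: $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum x_i$
Example: Mean of test scores: $\bar{x} = \frac{75 + 76 + 82 + 93 + 45 + 68 + 74 + 82 + 91 + 98}{10} = \frac{784}{10} = 78.4$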

About the mean…
The data contains an outlier (45), probably due to a lack of studying. Recalculate excluding that score: $\bar{x} = \frac{739}{9} \approx 82.1$
The mean is therefore sensitive to the influence of a few extreme observations and is NOT a resistant measure.
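
The effect of the outlier on the mean can be checked with a short sketch. It assumes the ten exam scores listed in the standard deviation example; the variable names are illustrative only.

```python
from statistics import mean

# The ten exam scores used throughout this section.
scores = [75, 76, 82, 93, 45, 68, 74, 82, 91, 98]

# Mean of all ten scores.
mean_all = mean(scores)

# Mean after dropping the low outlier (45).
mean_without_outlier = mean([x for x in scores if x != 45])

print(round(mean_all, 1))              # 78.4
print(round(mean_without_outlier, 1))  # 82.1
```

Removing a single score moves the mean by almost four points, which is what "not resistant" means in practice.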

Median (M) = “middle value”
1. Order observations from smallest to largest.
2. If n is odd, the median is in the position $\frac{n+1}{2}$ (location! NOT value!). If n is even, the median is the average of the two middle values.
Example: (Test Scores) 45 68 74 75 76 82 82 91 93 98
n = 10 is even, so $M = \frac{76 + 82}{2} = 79$
The median is resistant to outliers, but the mean is NOT.
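
The two-step rule for the median can be sketched as a small function. This is a minimal illustration, not the calculator's routine; the function name `median` and the altered data set are assumptions made for the example.

```python
def median(values):
    """Median by the slide's rule: sort, then take the middle value
    (n odd) or the average of the two middle values (n even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

scores = [75, 76, 82, 93, 45, 68, 74, 82, 91, 98]
print(median(scores))  # 79.0

# Resistance: making the outlier far more extreme leaves the median unchanged.
altered = [0 if x == 45 else x for x in scores]
print(median(altered))  # 79.0
```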

One reason for choosing a particular measure of center and/or spread over another has to do with the statistic’s resistance to outliers and skewness. The relation of the mean to the median gives information about shape.
If $\bar{x} < M$: skewed left
If $\bar{x} \approx M$: approx. symmetric
If $\bar{x} > M$: skewed right

To use your calculator:
1. Enter data in L1.
2. LIST > MATH > mean(L1)
3. LIST > MATH > median(L1)

Measures of SPREAD The two distributions have the same mean and median, but clearly are different. How? SPREAD!

3 Measures of Spread:
1. Range: difference between largest and smallest value.
2. Percentiles
The median is the 50th percentile. The use of percentiles is very important when the median is the measure of center.
Quartiles
1st quartile (Q1) = 25th percentile = median of values below Median
2nd quartile (M) = 50th percentile = Median
3rd quartile (Q3) = 75th percentile = median of values above Median
*** median NOT included in “count” ***

5 # Summary: [n; min, Q1, M, Q3, max]
Example: Babe Ruth’s annual home runs (note already sorted)
22 25 34 35 41 41 46 46 46 47 49 54 54 59 60
n = 15, min = 22, max = 60, M = 46
Q1 = M(22, 25, …, 46) = 35
Q3 = M(46, 47, …, 60) = 54
5 # Summary: [15; 22, 35, 46, 54, 60]
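
The quartile rule above (median of each half, with the overall median left out of the count when n is odd) can be sketched as follows. The function names are illustrative, and the data list assumes a first value of 22, which the slide gives as the minimum with n = 15.

```python
def median(values):
    """Middle value of the sorted data, or the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def five_number_summary(values):
    """Return (min, Q1, M, Q3, max); the median itself is excluded
    from both halves when n is odd."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

# Babe Ruth's annual home runs (assumed to include the minimum, 22).
ruth_home_runs = [22, 25, 34, 35, 41, 41, 46, 46, 46, 47, 49, 54, 54, 59, 60]
print(five_number_summary(ruth_home_runs))  # (22, 35, 46, 54, 60)
```

Software packages may use other quartile conventions, so their Q1 and Q3 can differ slightly from the values this rule gives.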

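Box Plots
Use the 5 # summary.
“Ends” of the rectangle are at Q1 and Q3, with a line inside the rectangle at M.
“Whiskers” extend from the ends of the rectangle to the min. and max.
Example: (Babe Ruth’s Homeruns) rectangle from 35 to 54 with a line at 46; whiskers out to 22 and 60.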

Modified Box Plots: used to explicitly determine outliers
Range = Max – Min
Interquartile Range (IQR) = Q3 – Q1
“Fence”:
Lower: Q1 – 1.5*(IQR)
Upper: Q3 + 1.5*(IQR)
Observations outside the fence are outliers; plot them individually.
Whiskers extend to the smallest/largest observations that are not outliers.

Example: (test scores) Calculate 5 # summary, IQR, fence
5 # Summary: [10; 45, 74, 79, 91, 98]
IQR = 91 – 74 = 17
Q1 – 1.5*(IQR) = 74 – 1.5*(17) = 48.5 → 45 is an outlier!
Q3 + 1.5*(IQR) = 91 + 1.5*(17) = 116.5 → No upper outliers.
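
The fence calculation for the test scores can be sketched in a few lines. The quartiles Q1 = 74 and Q3 = 91 are taken from the 5 # summary above; the function name `outlier_fences` is illustrative only.

```python
def outlier_fences(q1, q3):
    """Lower and upper fences: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

scores = [75, 76, 82, 93, 45, 68, 74, 82, 91, 98]
lower, upper = outlier_fences(74, 91)
print(lower, upper)  # 48.5 116.5

# Observations outside the fences are outliers.
outliers = [x for x in scores if x < lower or x > upper]
print(outliers)  # [45]

# Whiskers of the modified box plot stop at the most extreme non-outliers.
inside = [x for x in scores if lower <= x <= upper]
whiskers = (min(inside), max(inside))
print(whiskers)  # (68, 98)
```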

3. Standard Deviation
Variance ($s^2$): the average of the squares of the deviations of the observations from their mean, $s^2 = \frac{1}{n-1}\sum (x_i - \bar{x})^2$ (the “average” divides by n – 1, not n). Problem: squaring units!
Standard Deviation ($s$): measure of spread with original units of data; square root of variance, $s = \sqrt{s^2}$: “the square root of the average squared deviation from the mean.”

About Standard Deviation…
The deviations display the spread of the values about their mean. Some of these deviations will be positive and some negative. In fact, the sum of the deviations of the observations from their mean will always be zero. Squaring the deviations makes them all nonnegative. The variance is the average squared deviation.
$s^2$ and $s$ large → observations widely spread about the mean
$s^2$ and $s$ small → observations close to the mean

Properties:
s measures spread about the mean, i.e. only use s when the mean is the measure of center.
s ≥ 0. s = 0 only when there is no spread/variability, i.e. when all observations have the same value.
s is NOT resistant. Outliers can make the std. dev. very large.

Example: Find Std. Dev. of the ten exam scores: 75 76 82 93 45 68 74 82 91 98
s = 15.06
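
The value s = 15.06 can be reproduced step by step. This sketch assumes the sample form of the variance (dividing by n – 1), which is the version that matches the slide's answer; the variable names are illustrative.

```python
import math
from statistics import stdev

scores = [75, 76, 82, 93, 45, 68, 74, 82, 91, 98]
n = len(scores)

# Deviations of each observation from the mean.
x_bar = sum(scores) / n
deviations = [x - x_bar for x in scores]

# The deviations sum to zero (up to floating-point rounding).
print(abs(sum(deviations)) < 1e-9)  # True

# Sample variance: sum of squared deviations divided by n - 1.
variance = sum(d ** 2 for d in deviations) / (n - 1)

# Standard deviation: square root of the variance, back in the original units.
s = math.sqrt(variance)
print(round(s, 2))  # 15.06

# Cross-check against the standard library's sample standard deviation.
print(abs(s - stdev(scores)) < 1e-9)  # True
```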

Choosing Measures of Center and Spread
Skewed dist’n or dist’n with strong outliers → median and 5 # Summary
Reasonably symmetric → $\bar{x}$ and s