1.2 Describing Distributions with Numbers


1 1.2 Describing Distributions with Numbers
AP Statistics

2 Ten exam scores: Create a stem plot and describe the data.
Exam Scores
4 | 5
5 |
6 | 8
7 | 4 5 6
8 | 2 2
9 | 1 3 8
Key: 4|5 = 45
S: skewed left
O: low outlier at 45
C: 79
S: range 45-98, spread of 53
The distribution of exam scores is skewed left. There appears to be a low outlier at 45. The center of the data is around 79 with a range from 45 through 98. The data contains no gaps or clusters.

3 Section 1.1 dealt with the graphical approach to data analysis, through which we gain information about the shape of a distribution. The first step in data analysis is to always look at the data. This section deals with the numerical approach to data analysis, providing information about center and spread.

4 Measures of CENTER Mean ($\bar{x}$) = “average value”: $\bar{x} = \frac{1}{n}\sum x_i$
Example: Mean of test scores: $\bar{x} = 784/10 = 78.4$

5 About the mean… The data contains an outlier (45), probably due to a lack of studying. Recalculate excluding that score: $\bar{x} = 739/9 \approx 82.1$. Therefore, the mean is sensitive to the influence of a few extreme observations and is NOT a resistant measure.
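
As a quick check of the mean's sensitivity, the Python sketch below recomputes the mean of the ten exam scores with and without the low score of 45. The variable names are illustrative only, and the scores are the ten values listed on the median slide.

```python
from statistics import mean

# The ten exam scores used throughout these slides.
scores = [45, 68, 74, 75, 76, 82, 82, 91, 93, 98]

# Mean of all ten scores: 784 / 10 = 78.4
mean_all = mean(scores)

# Mean after dropping the low outlier (45): 739 / 9
mean_without_outlier = mean([x for x in scores if x != 45])

print(f"mean with outlier:    {mean_all:.1f}")
print(f"mean without outlier: {mean_without_outlier:.1f}")
print(f"shift from one score: {mean_without_outlier - mean_all:.1f}")
```

A single observation moves the mean by several points, which is what "not resistant" means in practice.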

6 Median (M) = “middle value”
1. Order observations from smallest to largest: 45 68 74 75 76 82 82 91 93 98. 2. If n is odd, the median is in the (n + 1)/2 position (location! NOT value!). If n is even, the median is the average of the two middle values. Example: (Test Scores) n = 10 is even, so M = (76 + 82)/2 = 79. The median is resistant to outliers, but the mean is NOT.
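
The two-step rule above can be sketched in Python as follows. The function name `median_by_position` is made up for this illustration; Python's standard library also offers `statistics.median`, which gives the same result.

```python
def median_by_position(values):
    """Median found by locating the middle position of the sorted data."""
    ordered = sorted(values)          # step 1: order smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Odd n: position (n + 1) / 2 in 1-based counting is index n // 2.
        return ordered[middle]
    # Even n: average the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2


scores = [45, 68, 74, 75, 76, 82, 82, 91, 93, 98]
print(median_by_position(scores))    # 79.0

# Resistance check: make the low score far more extreme.
extreme = [0 if x == 45 else x for x in scores]
print(median_by_position(extreme))   # 79.0, the median does not move
```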

7 One reason for choosing a particular measure of center and/or spread over another has to do with the statistic’s resistance to outliers and skewness. The relation of mean/median gives information about shape.
If $\bar{x} \approx M$: approx. symmetric. If $\bar{x} < M$: skewed left. If $\bar{x} > M$: skewed right.

8 To use your calculator:
Enter data in L1. LIST > MATH > mean(L1). LIST > MATH > median(L1).

9 Measures of SPREAD The two distributions have the same mean and median, but clearly are different. How? SPREAD!

10 3 Measures of Spread:
1. Range – difference between largest and smallest value.
2. Percentiles. The median is the 50th percentile. The use of percentiles is very important when the median is the measure of center.
Quartiles:
1st quartile (Q1) = 25th percentile = median of values below Median
2nd quartile (M) = 50th percentile = Median
3rd quartile (Q3) = 75th percentile = median of values above Median
*** median NOT included in “count” ***
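
The quartile rule above (the median itself is left out of both halves when n is odd) can be sketched in Python. The helper `five_number_summary` is a hypothetical name, it assumes at least two observations, and other software may use slightly different quartile conventions.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, M, Q3, max), excluding the median from each half."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                  # values below the median position
    if n % 2 == 1:
        upper = ordered[half + 1:]          # odd n: skip the median itself
    else:
        upper = ordered[half:]              # even n: the data split cleanly
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])


scores = [45, 68, 74, 75, 76, 82, 82, 91, 93, 98]
print(five_number_summary(scores))   # (45, 74, 79.0, 91, 98)
```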

11 5 # Summary: [n; min, Q1, M, Q3, max.]
Example: Babe Ruth’s annual homeruns (note already sorted) n = 15, min = 22, max = 60, M = 46, Q1 = M(22, 25, …, 46) = 35, Q3 = M(46, 47, …, 60) = 54. 5 # Summary: [15; 22, 35, 46, 54, 60]

12 Box Plots Utilizes 5 # summary
“ends” of rectangle are at Q1 and Q3 with center at M. “whiskers” extend from ends of rectangle to min. and max. Example: (Babe Ruth’s Homeruns)

13 Modified Box Plots—used to explicitly determine outliers
Range = Max – Min. Inter Quartile Range (IQR) = Q3 – Q1. “Fence”: lower = Q1 – 1.5*(IQR), upper = Q3 + 1.5*(IQR). Observations outside the fence are outliers; plot them individually. Whiskers extend to the smallest/largest observations which are not outliers.

14 Example: (test scores) Calculate 5 # summary; IQR, fence. 5 # Summary: [10; 45, 74, 79, 91, 98]. IQR = 91 – 74 = 17. Lower fence: Q1 – 1.5*(IQR) = 74 – 1.5*(17) = 48.5, so 45 is an outlier! Upper fence: Q3 + 1.5*(IQR) = 91 + 1.5*(17) = 116.5, so there are no upper outliers.
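
The fence calculation for the test scores can be checked with a few lines of Python. This is only a sketch: Q1 and Q3 are taken from the five-number summary on the slide rather than recomputed, and the variable names are illustrative.

```python
scores = [45, 68, 74, 75, 76, 82, 82, 91, 93, 98]

# Quartiles taken from the 5 # summary [10; 45, 74, 79, 91, 98].
q1, q3 = 74, 91
iqr = q3 - q1                          # 17

lower_fence = q1 - 1.5 * iqr           # 48.5
upper_fence = q3 + 1.5 * iqr           # 116.5

# Observations outside the fences are flagged as outliers.
outliers = [x for x in scores if x < lower_fence or x > upper_fence]

# In a modified box plot, whiskers stop at the most extreme non-outliers.
inside = [x for x in scores if lower_fence <= x <= upper_fence]
whisker_low, whisker_high = min(inside), max(inside)

print(lower_fence, upper_fence)        # 48.5 116.5
print(outliers)                        # [45]
print(whisker_low, whisker_high)       # 68 98
```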

15 3. Standard Deviation. Variance ($s^2$) – the average of the squares of the deviations of the observations from their mean: $s^2 = \frac{1}{n-1}\sum (x_i - \bar{x})^2$. Problem: squared units! Standard Deviation ($s = \sqrt{s^2}$) – measure of spread with original units of data; square root of variance: “the square root of the average squared deviation from the mean.”

16 About Standard Deviation…
The deviations display the spread of the values about their mean. Some of these deviations will be positive and some negative. In fact, the sum of the deviations of the observations from their mean will always be zero. Squaring deviations makes them all positive. The variance is the averaged squared deviation. $s^2$ and s large → observations widely spread about the mean. $s^2$ and s small → observations close to the mean.
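
To make these statements concrete, the Python sketch below works through the deviations for the ten exam scores by hand. It assumes the usual sample convention of dividing by n − 1, which is the divisor that reproduces the s = 15.06 reported for these scores; the variable names are illustrative.

```python
scores = [45, 68, 74, 75, 76, 82, 82, 91, 93, 98]
n = len(scores)
x_bar = sum(scores) / n                       # 78.4

# Deviations from the mean: some negative, some positive.
deviations = [x - x_bar for x in scores]

# They cancel out (up to floating-point rounding error).
print(abs(sum(deviations)) < 1e-9)            # True

# Squaring makes every term non-negative, so nothing cancels.
squared = [d ** 2 for d in deviations]

# Sample variance: "average" squared deviation, with divisor n - 1 (assumed).
variance = sum(squared) / (n - 1)
print(f"{variance:.2f}")                      # 226.93
```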

17 Properties: s measures spread about the mean, i.e. only use it when the mean is the measure of center. s ≥ 0, and s = 0 only when there is no spread/variability, i.e. when all observations have the same value. NOT resistant: outliers can make the std. dev. very large.

18 Example: Find Std. Dev. of the ten exam scores:
s = 15.06
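
The same value can be obtained with Python's standard library, as a rough check. `statistics.stdev` computes the sample standard deviation (divisor n − 1); the comparison without the low score of 45 is added here only to illustrate that s is not resistant.

```python
from statistics import stdev

scores = [45, 68, 74, 75, 76, 82, 82, 91, 93, 98]

# Sample standard deviation of all ten scores.
s = stdev(scores)
print(f"{s:.2f}")                     # 15.06

# Dropping the single low outlier shrinks s noticeably.
s_without_outlier = stdev([x for x in scores if x != 45])
print(f"{s_without_outlier:.2f}")
```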

19 Choosing Measures of Center and Spread
Skewed Dist’n or Dist’n with strong outliers → median and 5 # Summary. Reasonably Symmetric → $\bar{x}$ and s.

