Lesson 1 - 2 Describing Distributions with Numbers adapted from Mr. Molesky’s Statmonkey website.


1 Lesson 1 - 2 Describing Distributions with Numbers adapted from Mr. Molesky’s Statmonkey website

2 Measures of Spread Variability is the key to Statistics. Without variability, there would be no need for the subject. When describing data, never rely on center alone. Measures of Spread: –Range {rarely used... why?} –Quartiles and InterQuartile Range {IQR = Q3 - Q1} –Variance and Standard Deviation {var and s_x} Like Measures of Center, you must choose the most appropriate measure of spread.

3 Standard Deviation Another common measure of spread is the Standard Deviation: a measure of the “average” deviation of all observations from the mean. To calculate the Standard Deviation: 1. Calculate the mean. 2. Determine each observation’s deviation (x - x̄). 3. Square each deviation. 4. “Average” the squared deviations by dividing the total squared deviation by (n - 1). This quantity is the Variance. 5. Take the square root of the Variance to get the Standard Deviation.

4 Standard Deviation Properties –s measures spread about the mean and should be used only when the mean is used as the measure of center. –s = 0 only when there is no spread/variability. This happens only when all observations have the same value. Otherwise, s > 0. –As the observations become more spread out about their mean, s gets larger. –s, like the mean x̄, is not resistant. A few outliers can make s very large.

5 Standard Deviation Variance: var = Σ(x - x̄)² / (n - 1) Standard Deviation: s = √var Example 1.16 (p.85): Metabolic Rates 1792, 1666, 1362, 1614, 1460, 1867, 1439

6 Standard Deviation Metabolic Rates: 1792, 1666, 1362, 1614, 1460, 1867, 1439 (mean = 1600)
x         x - x̄     (x - x̄)²
1792       192       36864
1666        66        4356
1362      -238       56644
1614        14         196
1460      -140       19600
1867       267       71289
1439      -161       25921
Totals:      0      214870
Total Squared Deviation = 214870. Variance: var = 214870/6 = 35811.67. Standard Deviation: s = √35811.67 = 189.24 cal. What does this value, s, mean?
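The worked table above can be reproduced step by step. The following is a minimal Python sketch, not part of the original slides; the variable names are illustrative, and the data are the seven metabolic rates from the example.

```python
import math

# Metabolic rates (cal) from Example 1.16 on the slide
rates = [1792, 1666, 1362, 1614, 1460, 1867, 1439]

n = len(rates)
mean = sum(rates) / n                                 # step 1: the mean
deviations = [x - mean for x in rates]                # step 2: each deviation x - xbar
squared = [d ** 2 for d in deviations]                # step 3: square each deviation
total_squared_deviation = sum(squared)
variance = total_squared_deviation / (n - 1)          # step 4: divide by n - 1
std_dev = math.sqrt(variance)                         # step 5: square root

print(f"mean = {mean:.0f}")
print(f"total squared deviation = {total_squared_deviation:.0f}")
print(f"variance = {variance:.2f}")
print(f"standard deviation = {std_dev:.2f}")
```

Note that the deviations always sum to zero, which is why they are squared before being "averaged".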

7 Example 1 Which of the following measures of spread are resistant? 1. Range 2. Variance 3. Standard Deviation Answer: none of them; all three are Not Resistant.

8 Example 2 Given the following set of data: 70, 56, 48, 48, 53, 52, 66, 48, 36, 49, 28, 35, 58, 62, 45, 60, 38, 73, 45, 51, 56, 51, 46, 39, 56, 32, 44, 60, 51, 44, 63, 50, 46, 69, 53, 70, 33, 54, 55, 52 What is the range? What is the variance? What is the standard deviation? Answers: Range = 73 - 28 = 45. Variance = 117.958. Standard Deviation = 10.861.
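For a data set of this size, a library routine is less error-prone than a hand calculation. As a hedged sketch (not from the original slides), Python's standard `statistics` module provides `variance` and `stdev`, both of which use the sample (n - 1) divisor:

```python
import statistics

# The 40 observations from Example 2
data = [70, 56, 48, 48, 53, 52, 66, 48, 36, 49,
        28, 35, 58, 62, 45, 60, 38, 73, 45, 51,
        56, 51, 46, 39, 56, 32, 44, 60, 51, 44,
        63, 50, 46, 69, 53, 70, 33, 54, 55, 52]

data_range = max(data) - min(data)        # range = max - min
variance = statistics.variance(data)      # sample variance (divides by n - 1)
std_dev = statistics.stdev(data)          # sample standard deviation

print(f"range = {data_range}")
print(f"variance = {variance:.3f}")
print(f"standard deviation = {std_dev:.3f}")
```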

9 Quartiles Quartiles Q1 and Q3 represent the 25th and 75th percentiles. To find them, order the data from min to max. Determine the median (average the two middle values if necessary). The first quartile is the middle of the ‘bottom half’. The third quartile is the middle of the ‘top half’. [Slide examples, values as extracted: 19 22 23 / 26 / 27 28 29 30 31 32, labeled med, Q1 = 23, Q3 = 29.5; and 45 68 74 75 76 82 / 91 93 98, labeled med = 79, Q1, Q3.]
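The "middle of each half" rule can be sketched in Python as follows. This is an illustration, not part of the original slides: the function name `quartiles` and the toy data are hypothetical, and the sketch assumes that when the number of observations is odd, the median itself is left out of both halves (the slide does not say which convention it uses).

```python
import statistics

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves rule.

    Assumption: for an odd number of observations, the median is
    excluded from both the bottom half and the top half.
    Requires at least two observations.
    """
    ordered = sorted(values)              # order data from min to max
    n = len(ordered)
    half = n // 2
    bottom = ordered[:half]               # the 'bottom half'
    top = ordered[half + n % 2:]          # the 'top half'
    return (statistics.median(bottom),
            statistics.median(ordered),
            statistics.median(top))

# Toy data, made up for illustration only
even_case = quartiles([7, 1, 5, 3, 9, 11, 13, 15])
odd_case = quartiles([1, 3, 5, 7, 9])
print(even_case)
print(odd_case)
```

Other software may use a different quartile convention, so results can differ slightly from a calculator or spreadsheet on small data sets.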

10 Using the TI-83 Enter the test data into list L1 –STAT, EDIT, enter data into L1 Calculate the 5 Number Summary –Hit STAT, go over to CALC, select 1-Var Stats, and hit 2nd 1 (L1) Use 2nd Y= (STAT PLOT) to graph the box plot –Turn Plot1 ON –Select BOX PLOT (4th option, first in second row) –Xlist: L1 –Freq: 1 –Hit ZOOM 9:ZoomStat to graph the box plot Copy the graph with appropriate labels and titles

11 5-Number Summary, Boxplots The 5 Number Summary (MIN, Q1, MED, Q3, MAX) provides a reasonably complete description of the center and spread of a distribution. We can visualize the 5 Number Summary with a boxplot. [Boxplot of Quiz Scores, axis 45 to 100: min = 45, Q1 = 74, med = 79, Q3 = 91, max = 98.] Outlier?

12 Determining Outliers InterQuartile Range “IQR”: Distance between Q1 and Q3. A resistant measure of spread; it only measures the middle 50% of the data. IQR = Q3 - Q1 {width of the “box” in a boxplot} “1.5 IQR Rule”: If an observation falls more than 1.5 IQRs above Q3 or below Q1, it is an outlier. Why 1.5? According to John Tukey, 1 IQR seemed like too little and 2 IQRs seemed like too much...

13 Outliers: 1.5 IQR Rule To determine outliers: 1. Find the 5 Number Summary 2. Determine the IQR 3. Compute 1.5 × IQR 4. Set up “fences” A. Lower Fence: Q1 - (1.5∙IQR) B. Upper Fence: Q3 + (1.5∙IQR) 5. Observations “outside” the fences are outliers.
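The fence steps above can be sketched in a few lines of Python. This is an illustration rather than part of the original slides; the helper names `fences` and `outliers` are hypothetical. It is applied to the Quiz Scores 5 Number Summary from slide 11 (Q1 = 74, Q3 = 91).

```python
def fences(q1, q3, k=1.5):
    """Return (lower fence, upper fence) for the k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def outliers(values, q1, q3):
    """Return the observations that fall outside the 1.5 IQR fences."""
    lower, upper = fences(q1, q3)
    return [x for x in values if x < lower or x > upper]

# Quiz Scores 5 Number Summary from slide 11
quiz_summary = [45, 74, 79, 91, 98]
quiz_fences = fences(74, 91)
quiz_outliers = outliers(quiz_summary, 74, 91)

print(quiz_fences)
print(quiz_outliers)
```

With these values the lower fence is 48.5, so the minimum score of 45 is flagged, which answers the "Outlier?" question on slide 11.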

14 Outlier Example [Boxplot of Spending ($), axis 0 to 100; all data on pg 48, #1.6.] IQR = 45.72 - 19.06 = 26.66. 1.5 IQR = 1.5(26.66) = 39.99. Upper fence: 45.72 + 39.99 = 85.71. Lower fence: 19.06 - 39.99 = -20.93. Observations outside the fences are outliers.

15 Example 4 Consumer Reports did a study of ice cream bars (sigh, only vanilla flavored) in their August 1989 issue. Twenty-seven bars having a taste-test rating of at least “fair” were listed, and calories per bar was included. Calories vary quite a bit partly because bars are not of uniform size. Just how many calories should an ice cream bar contain? Construct a boxplot for the data below.
342 377 319 353 295 234 294 286
377 182 310 439 111 201 182 197
209 147 190 151 131 151

16 Example 4 - Answer Q1 = 182, Q2 = 221.5, Q3 = 319, Min = 111, Max = 439, Range = 328, IQR = 137, UF = 524.5, LF = -23.5. [Boxplot of Calories, axis 100 to 500.]

17 Example 5 The weights of 20 randomly selected juniors at MSHS are recorded below: a) Construct a boxplot of the data b) Determine if there are any mild or extreme outliers c) Comment on the distribution
121 126 130 132 143 137 141 144 148 205
125 128 131 133 135 139 141 147 153 213

18 Example 5 - Answer Q1 = 130.5, Q2 = 138, Q3 = 145.5, Min = 121, Max = 213, Range = 92, IQR = 15, UF = 168, LF = 108, Mean = 143.6, StDev = 23.91. [Boxplot of Weight (lbs), axis 100 to 220; the two points marked * are Extreme Outliers (> 3 IQR from Q3).] Shape: somewhat symmetric. Outliers: 2 extreme outliers. Center: Median = 138. Spread: IQR = 15.
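Part (b) can be checked with a short Python sketch. This is not part of the original slides; the function name `classify` is hypothetical. It assumes the usual convention that a mild outlier lies more than 1.5 IQRs but at most 3 IQRs beyond a quartile, while an extreme outlier lies more than 3 IQRs beyond it (the slide states only the 3 IQR threshold).

```python
import statistics

# Weights (lbs) of the 20 juniors from Example 5
weights = [121, 126, 130, 132, 143, 137, 141, 144, 148, 205,
           125, 128, 131, 133, 135, 139, 141, 147, 153, 213]

ordered = sorted(weights)
half = len(ordered) // 2                   # n = 20 is even, so the halves split cleanly
q1 = statistics.median(ordered[:half])     # middle of the bottom half
q3 = statistics.median(ordered[half:])     # middle of the top half
iqr = q3 - q1

def classify(x):
    """Label an observation by its distance beyond the nearer quartile."""
    distance = max(q1 - x, x - q3)         # positive only when x is outside the box
    if distance > 3 * iqr:
        return "extreme"
    if distance > 1.5 * iqr:
        return "mild"
    return "not an outlier"

flagged = {x: classify(x) for x in ordered if classify(x) != "not an outlier"}
print(q1, q3, iqr)
print(flagged)
```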

19 Linear Transformations –Variables can be measured in different units (feet vs meters, pounds vs kilograms, etc). –When converting units, the measures of center and spread will change. –Linear Transformations (x_new = a + bx) do not change the overall shape of a distribution. –Multiplying each observation by b (with b > 0) multiplies both the measure of center and the measure of spread by b. –Adding a to each observation adds a to the measure of center, but does not affect spread. –If the distribution was symmetric, its transformation is symmetric. If the distribution was skewed, its transformation maintains the same skewness.

20 Transformation Example 6 Using the data from Example 5 –a) Change the weight from pounds to kilograms and add 2 kg (for a special band uniform) –b) Get summary statistics and compare with Example 5 –c) Draw a box plot
121 126 130 132 143 137 141 144 148 205
125 128 131 133 135 139 141 147 153 213

21 Example 6 - Answer Convert pounds to kg (× 0.4536) and add 2.
lb: 121    126    130    132    143    137    141    144    148    205
kg: 56.88  59.15  60.97  61.88  66.87  64.14  65.96  67.32  69.13  94.99
lb: 125    128    131    133    135    139    141    147    153    213
kg: 58.70  60.06  61.42  62.33  63.24  65.05  65.96  68.68  71.40  98.62
Q1 = 61.19, Q2 = 64.60, Q3 = 68.00, Min = 56.89, Max = 98.62, Range = 41.73, IQR = 6.81, UF = 78.22, LF = 50.98, Mean = 67.14 (143.6 × 0.4536 + 2), StDev = 10.84 (23.91 × 0.4536).
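The effect of the transformation on the mean and standard deviation can be confirmed numerically. A minimal Python sketch follows (not from the original slides), using a = 2 and b = 0.4536 as in the answer above; the variable names are illustrative.

```python
import statistics

# Weights (lbs) from Example 5
weights_lb = [121, 126, 130, 132, 143, 137, 141, 144, 148, 205,
              125, 128, 131, 133, 135, 139, 141, 147, 153, 213]

a, b = 2, 0.4536                                   # x_new = a + b * x
weights_kg = [a + b * x for x in weights_lb]

mean_lb = statistics.mean(weights_lb)
sd_lb = statistics.stdev(weights_lb)
mean_kg = statistics.mean(weights_kg)
sd_kg = statistics.stdev(weights_kg)

# The mean is both scaled by b and shifted by a; the spread is only scaled by b.
print(f"mean:  {mean_kg:.2f} vs a + b * mean = {a + b * mean_lb:.2f}")
print(f"stdev: {sd_kg:.2f} vs b * stdev    = {b * sd_lb:.2f}")
```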

22 Example 6 – Answer cont Transformation follows what we expect: –Multiplying each observation by b multiplies both the measure of center and the measure of spread by b. –Adding a to each observation adds a to the measure of center, but does not affect spread. –If the distribution was symmetric, its transformation is symmetric. If the distribution was skewed, its transformation maintains the same skewness. [Boxplot of Weight (in Kg), axis 45 to 105; the two points marked * are Extreme Outliers (> 3 IQR from Q3).]

23 Day 2 Summary and Homework Summary –Sample variance is found by dividing by (n – 1) so that it is an unbiased estimator of the population variance (the adjustment is needed because we estimate the population mean, μ, with the sample mean, x̄) –The larger the standard deviation, the more dispersion the distribution has –Boxplots can be used to check for outliers and to assess the shape of a distribution –Use comparative boxplots to compare two datasets –Identifying a distribution from boxplots or histograms is subjective! Homework –pg 82: prob 33; pg 89: probs 40, 41; pg 97: probs 45, 46
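The first summary point, that dividing by (n – 1) gives an unbiased estimator, can be illustrated by listing every possible sample from a tiny population. The sketch below is not part of the original slides; it assumes a hypothetical population (the faces of a fair die) and samples of size 3 drawn with replacement, and it uses exact fractions so that no rounding is involved.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population for illustration: the faces of a fair die
population = [1, 2, 3, 4, 5, 6]
N = len(population)
mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N     # population variance

def sum_sq_dev(sample):
    """Total squared deviation of a sample about its own mean."""
    xbar = Fraction(sum(sample), len(sample))
    return sum((x - xbar) ** 2 for x in sample)

n = 3
# Every equally likely sample of size n, drawn with replacement
samples = list(product(population, repeat=n))

avg_div_n_minus_1 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)
avg_div_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)

print("population variance:     ", sigma2)
print("average when dividing by n - 1:", avg_div_n_minus_1)
print("average when dividing by n:    ", avg_div_n)
```

Averaged over all samples, the (n – 1) version matches the population variance exactly, while dividing by n comes out too small by a factor of (n – 1)/n.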

