Presentation is loading. Please wait.

Presentation is loading. Please wait.

Chapter 5 Understanding and Comparing Distributions.

Similar presentations


Presentation on theme: "Chapter 5 Understanding and Comparing Distributions."— Presentation transcript:

1 Chapter 5 Understanding and Comparing Distributions

2 Performance Scales: 1.Know the statistics appropriate to the shape of the data distribution. 2.Know the statistics appropriate to the shape of the data distribution to compare center (median, mean) and spread (interquartile range, standard deviation) of two or more different data sets. 3.Use statistics appropriate to the shape of the data distribution to compare center (median, mean) and spread (interquartile range, standard deviation) of two or more different data sets.MAFS.912.S-ID.1.2. 4.Adapts and applies the use of statistics appropriate to the shape of the data distribution to compare center (median, mean) and spread (interquartile range, standard deviation) of two or more different data sets in different and more complex ways.

3 Learning Goals 1.Know how to construction and analysis boxplots. 2.Be able to calculate outliers and discuss how they deviate from the overall pattern of the data. 3.Know how to compare the distribution of two or more groups by comparing their shapes, centers, and spreads. 4.Be able to display data over time: Time Plots. 5.Understand how re-expressing data can improve the symmetry of the data.

4 Learning Goal 1 Know how to construction and analysis boxplots.

5 Learning Goal 1: The Five-Number Summary The five-number summary of a distribution reports its median, quartiles, and extremes (maximum and minimum). Example: The five-number summary for the daily wind speed is: Min0.20 Q1Q1 1.15 Median1.90 Q3Q3 2.93 Max8.67

6 Learning Goal 1: The Five-Number Summary Consists of the minimum value, Q 1, the median, Q 3, and the maximum value, listed in that order. Offers a reasonably complete description of the center and spread. Calculate on the TI-84 using 1-Var Stats. Used to construct the Boxplot. Example: Five-Number Summary –1:20, 27, 34, 50, 86 –2:5, 10, 18.5, 29, 33

7 Learning Goal 1: Boxplot A boxplot is a graphical display of the five-number summary. Boxplots are useful when comparing groups. Boxplots are particularly good at pointing out outliers.

8 Learning Goal 1: Boxplot Boxplot: A Graphical display of data using 5-number summary: Minimum -- Q 1 -- Median -- Q 3 -- Maximum Example: 25% 25%

9 Learning Goal 1: Boxplots A graph of the Five-Number Summary. Can be drawn either horizontally or vertically. The box represents the IQR (middle 50%) of the data. Show less detail than histograms or stemplots, they are best used for side-by-side comparison of more than one distribution.

10 Learning Goal 1: Constructing Boxplots 1.Put the data in increasing numerical order (if it isn’t already). 2.Find the median of your set of data. 3.Consider only the values to the left of the median, the lower half of the data, and find the median of the lower half, the 1 st quartile Q 1. 4.Consider only the values to the right of the median, the upper half of the data, and find the median of the upper half, the 3 rd quartile Q 3. 5.Determine the lowest and highest values of the data. 6.Draw a scale on a number line and plot the lowest value, lower quartile, median, upper quartile, and the highest value on the number line. 7.Put a line through the lower quartile, median, and upper quartile. Then put a box around those lines. 8.Lastly draw a line from your extreme values to the box.

11 Learning Goal 1: Constructing Boxplots - Example 1.Put the date in increasing numerical order (if it isn’t already). Can use the TI-84, STATS/SortA(. Example: 100, 27, 34, 54, 59, 18, 52, 61, 78, 68, 82, 87, 85, 93, 91. Ascending order… 18, 27, 34, 52, 54, 59, 61, 68, 78, 82, 85, 87, 91, 93, 100.

12 Learning Goal 1: Constructing Boxplots - Example 2.Find the median of the data *Remember the median is the value exactly in the middle of an ordered set of numbers* 18, 27, 34, 52, 54, 59, 61, 68, 78, 82, 85, 87, 91, 93, 100.

13 Learning Goal 1: Constructing Boxplots - Example 3.Next, consider only the values to the left of the median, the lower half of the data. 18, 27, 34, 52, 54, 59, 61, 68, 78, 82, 85, 87, 91, 93, 100. Then, find the median of those numbers, the 1 st quartile Q 1. 18, 27, 34, 52, 54, 59, 61

14 Learning Goal 1: Constructing Boxplots - Example 4.Next, consider only the values to the right of the median, the upper half of the data. 18, 27, 34, 52, 54, 59, 61, 68, 78, 82, 85, 87, 91, 93, 100. Then find the median of those numbers, the 3 rd quartile Q 3. 78, 82, 85, 87, 91, 93, 100.

15 Learning Goal 1: Constructing Boxplots - Example 5.Now indicate your lowest and highest values 18, 27, 34, 52, 54, 59, 61, 68, 78, 82, 85, 87, 91, 93, 100.

16 Learning Goal 1: Constructing Boxplots - Example 6.Now we are ready to begin to draw the graph. 18, 27, 34, 52, 54, 59, 61, 68, 78, 82, 85, 87, 91, 93, 100. Plot the lowest value, lower quartile Q 1, median, upper quartile Q 3, and the highest value on a number line with an appropriate scale.

17 7.Put a line through the Lower Quartile Q 1, Median, and Upper Quartile Q 3. Then Put a box around those lines. Learning Goal 1: Constructing Boxplots - Example

18 8.Lastly draw a line from your extreme values to the box. There is your Boxplot. Don’t forget the label the horizontal scale and the graph. Also, label the scale and title the graph.

19 Learning Goal 1: Constructing Boxplots – Your Turn Given the commuting times (in minutes) of 20 randomly selected New York workers below, construct a boxplot. 103052540201015302015208515651560 4045

20 + Consider our NY travel times data. Construct a boxplot. M = 22.5 Q 3 = 42.5Q 1 = 15Min=5 103052540201015302015208515651560 4045 510 15 20 2530 40 4560 6585 Max=85 Learning Goal 1: Constructing Boxplots – Solution Don’t forget to title the graph.

21 Learning Goal 1: Shape from a Boxplot Box Plots do not display the shape of the distribution as clearly as histograms, but may be useful in determining the shape. 21 Skewed Left Skewed Right

22 Learning Goal 1: Shape from a Boxplot – Looking at the Median If the median is close to the center of the box, the distribution of the data values will be approximately symmetrical. If the median is to the left of the center of the box, the distribution of the data values will be Skewed Right. If the median is to the right of the center of the box, the distribution of the data values will be Skewed Left. Skewed Right Skewed Left

23 Learning Goal 1: Shape from a Boxplot– Looking at the Whiskers If the whiskers are approximately the same length, the distribution of the data values will be approximately symmetrical. If the right whisker is longer than the left whisker, the distribution of the data values will be Skewed Right. If the left whisker is longer than the right whisker, the distribution of the data values will be Skewed Left. Skewed Right Skewed Left

24 Learning Goal 1: Distribution Shape and Boxplot Right-SkewedLeft-SkewedSymmetric Q1Q2Q3Q1Q2Q3 Q1Q2Q3

25 Learning Goal 1: Boxplots - Problem Below you have a boxplot for the tar content of 25 different cigarettes. What is a plausible set of values for the five-number summary? a)Min = 13, Q1 = 10, Median = 12.6, Q3 = 14, Max = 15 b)Min = 1, Q1 = 8.5, Median = 12.6, Q3 = 15, Max = 17 c)Min = 1, Q1 = 8.5, Median = 11.5, Q3 = 13, Max = 15 d)Min = 8.5, Q1 = 10, Median = 11.5, Q3 = 15, Max = 17

26 Learning Goal 1: Boxplots - Problem (answer) Below you have a boxplot for the tar content of 25 different cigarettes. What is a plausible set of values for the five-number summary? a)Min = 13, Q1 = 10, Median = 12.6, Q3 = 14, Max = 15 b)Min = 1, Q1 = 8.5, Median = 12.6, Q3 = 15, Max = 17 c)Min = 1, Q1 = 8.5, Median = 11.5, Q3 = 13, Max = 15 d)Min = 8.5, Q1 = 10, Median = 11.5, Q3 = 15, Max = 17

27 Learning Goal 1: Boxplots - Problem The shape of the boxplot below can be described as: a)Bi-modal b)Left-skewed c)Right-skewed d)Symmetric e)Uniform

28 Learning Goal 1: Boxplots - Problem (answer) The shape of the boxplot below can be described as: a)Bi-modal b)Left-skewed c)Right-skewed d)Symmetric e)Uniform

29 Learning Goal 1: Boxplots - Problem What is the approximate range of the Male Wrist Girth dataset shown below? a)14.5 to 19.5 b)16.5 to 17 c)16.5 to 18 d)17 to 19.5 e)14.5 to 16.5 and 18 to 19.5

30 Learning Goal 1: Boxplots - Problem (answer) What is the approximate range of the Male Wrist Girth dataset shown below? a)14.5 to 19.5 b)16.5 to 17 c)16.5 to 18 d)17 to 19.5 e)14.5 to 16.5 and 18 to 19.5

31 Learning Goal 1: Boxplots - Problem What is the approximate interquartile range of the Male Wrist Girth dataset shown below? a)14.5 to 19.5 b)16.5 to 17 c)16.5 to 18 d)17 to 19.5 e)14.5 to 16.5 and 18 to 19.5

32 Learning Goal 1: Boxplots - Problem (answer) What is the approximate interquartile range of the Male Wrist Girth dataset shown below? a)14.5 to 19.5 b)16.5 to 17 c)16.5 to 18 d)17 to 19.5 e)14.5 to 16.5 and 18 to 19.5

33 Learning Goal 2 Be able to calculate outliers and discuss how they deviate from the overall pattern of the data.

34 Learning Goal 2: Outliers Recall that an outlier is an extremely small or extremely large data value when compared with the rest of the data values. What should we do about outliers? –Try to understand them in the context of the data. –Causes Data error Special nature to the data

35 Learning Goal 2: Outliers If there are any clear outliers and you are reporting the mean and standard deviation, report them with the outliers present and with the outliers removed. The differences may be quite revealing. Note: The median and IQR are not likely to be affected by the outliers. The following procedure allows us to check whether a data value can be considered as an outlier.

36 Learning Goal 2: Outliers IQR is used to determine if extreme values are actually outliers An observation is an outlier if it falls more than 1.5 times IQR below Q 1 or above Q 3. To test for outliers 1.Construct an upper and lower fence 2.Upper Fence = Q 3 + (1.5)IQR 3.Lower Fence = Q 1 – (1.5)IQR 4.If an observation falls outside the fences (ie. Greater than the upper fence or less than the lower fence) than it is an outlier. Indicate outliers as individual points or asterisk on a boxplot.

37 Learning Goal 2: Outliers – Example (odd set of data) Data: 20, 25, 25, 27, 28, 31, 33, 34, 36, 37, 44, 50, 59, 85, 86 Find Q1, M, Q3, IQR and any outliers. –Sort data Q 1 Q 3 20 25 25 27 28 31 33 34 36 37 44 50 59 85 86 lower half median upper half –IQR = 50 – 27 = 23 –Upper Fence = Q3 + (1.5)IQR = 50 + 34.5 = 84.5 –Lower Fence = Q1 – (1.5)IQR = 27 – 34.5 = -7.5 –Outliers 85 and 86 (greater than the upper fence)

38 Learning Goal 2: Outliers – Example (even set of data) Data 5, 7, 10, 14, 18, 19, 25, 29, 31, 33 Find Q 1, M, Q 3, IQR and outliers. Q 1 Q 3 5 7 10 14 18 19 25 29 31 33 lower half upper half median –IQR = 29 – 10 = 19 –Upper Fence = 29 + (1.5)IQR = 29 + 28.5 = 57.5 –Lower Fence = 10 – (1.5)IQR = 10 – 28.5 = -18.5 –No Outliers

39 Learning Goal 2: Outliers – Your Turn The data below represent the 20 countries with the largest number of total Olympic medals, including the United States, which had 101 medals for the 1996 Atlanta games. Determine whether the number of medals won by the United States is an outlier relative to the numbers for the other countries. Data values – 63, 65, 50, 37, 35, 41, 25, 23, 27, 21, 17, 17, 20, 19, 22, 15, 15, 15, 15, 101.

40 Learning Goal 2: Outliers – Solution The IQR = 39 – 17 = 22. Lower Fence, Q 1 – 1.5  IQR = 17 – (1.5  22) = -16. Upper Fence, Q 3 + 1.5  IQR = 39 + (1.5  22) = 72. Since, 101 > 72, the value of 101 is an outlier relative to the rest of the values in the data set. That is, the number of medals won by the United States is an outlier relative to the numbers won by the other 19 countries for the 1996 Atlanta Olympic Games.

41 Learning Goal 2: Outliers – Solution Pictorial representation for the OUTLIER of the Number of Olympic Medals Won by the United States in 1996 Atlanta Games. Lower Fence -16 Lower Fence -16 Upper Fence +72 Upper Fence +72 101 OUTLIER

42 Learning Goal 2: Outliers – Modified Boxplot Plots outliers as isolated points, where regular boxplots conceal outliers. From now on when we say “boxplot”, we mean “modified boxplot”. The modified boxplot is more useful than the boxplot. Constructing a Modified Boxplot. 1.Same as a boxplot with the exception of the “whiskers”. 2.Extend the “left whisker” to the minimum value if there are no outliers or to the last data value greater than or equal to the lower fence if there are outliers. 3.Extend the “right whisker” to the maximum value if there are no outliers or to the last data value less than or equal to the upper fence if there are outliers. 4.Outliers (either low or high) are then represented by a dot or an asterisk.

43 Learning Goal 2: Constructing Modified Boxplots – Example (vertical) 1.Draw a single vertical axis spanning the range of the data. Label and Scale. 2.Draw short horizontal lines at the lower and upper quartiles and at the median. 3.Then connect them with vertical lines to form a box.

44 Learning Goal 2: Constructing Modified Boxplots – Example (vertical) 4.Erect “fences” around the main part of the data. –The upper fence is 1.5 IQRs above the upper quartile. –The lower fence is 1.5 IQRs below the lower quartile. –Note: the fences only help with constructing the boxplot and should not appear in the final display.

45 Learning Goal 2: Constructing Modified Boxplots – Example (vertical) 5.Use the fences to grow “whiskers.” –Draw lines from the ends of the box up and down to the most extreme data values found within the fences. –If a data value falls outside one of the fences, we do not connect it with a whisker.

46 Learning Goal 2: Constructing Modified Boxplots – Example (vertical) 6.Add the outliers by displaying any data values beyond the fences with special symbols. –We often use a different symbol for “far outliers” that are farther than 3 IQRs from the quartiles.

47 Learning Goal 2: Modified Boxplots – Example Median Q1Q1 Q3Q3 Lines (“whiskers”) extend from each quartile to the most extreme value that is not an outlier Outliers

48 Learning Goal 2: Modified Boxplots – Problem Which boxplot goes with the histogram of waiting times for the bus? (a)(b)(c) The data do not show any low outliers.

49 Learning Goal 2: Constructing Modified Boxplots – Your Turn Given the commuting times (in minutes) of 20 randomly selected New York workers below, construct a boxplot (same data as earlier). 103052540201015302015208515651560 4045

50 + Consider our NY travel times data. Construct a modified boxplot. M = 22.5 Q 3 = 42.5Q 1 = 15Min=5 103052540201015302015208515651560 4045 510 15 20 2530 40 4560 6585 Max=85 Recall, this is an outlier by the 1.5 x IQR rule Learning Goal 2: Constructing Modified Boxplots – Solution

51 Learning Goal 2: More Outliers Far Outlier – Data values farther than 3 IQRs from the quartiles. Boxplot OutliersFar Outlier

52 Learning Goal 2: Boxplots – TI-84 Use the AL 2008 run data to construct a boxplot. AL runs scored: 782845811805821691765829789646671774 901714

53 Learning Goal 2: Boxplots – TI-84 Store this data as a list. Press STAT and choose the first option EDIT Enter the 14 AL data values in L1.

54 Learning Goal 2: Boxplots – TI-84 Now set up the boxplot. Exit back into the home screen. Then press STAT PLOT (2 nd and y= ). Choose Plot1. Then, turn Plot1 on. Scroll to Type and choose the boxplot icon (with outliers). It is the first option in the second row. Enter L1 for Xlist. Enter 1 for Freq. Choose a mark for outliers.

55 Learning Goal 2: Boxplots – TI-84 Now display the graph. Press ZOOM. Then select option 9: ZOOMSTAT. Press enter. Press TRACE and scroll around to see different statistics for the distribution.

56 Learning Goal 2: Boxplots – TI-84 STAT, EDIT, (enter data). STAT PLOT –Two boxplots (1 st modified boxplot, 2 nd regular boxplot) ZOOM, #9:ZoomStat Use trace key to indicate values and find outliers.

57 Learning Goal 2: Boxplots TI-84 - Your Turn Data: 20, 25, 25, 27, 28, 31, 33, 34, 36, 37, 44, 50, 59, 85, 86 Use TI-83/84

58 Learning Goal 3 Know how to compare the distribution of two or more groups by comparing their shapes, centers, and spreads.

59 Learning Goal 3: Comparing Distributions We can answer much more interesting questions about variables when we compare distributions for different groups. Some of the most interesting statistics questions involve comparing two or more groups. Always discuss shape, center, spread, and possible outliers whenever you compare distributions of a quantitative variable.

60 Learning Goal 3: Comparing Distributions When asked to compare two distributions, you must address four points: –The shape –The outliers –The center –The spread Think of the acronym SOCS to help you remember what to address.

61 Learning Goal 3: Describing and Comparing Distributions Key Words: Comparing the Distribution of a Quantitative Variable. 2 graphs SRoughly symmetric (slightly/heavily) Right- skewed (slightly/heavily) Left- skewed unimodal, bell- shaped, bimodal, gap between ? and ?. CTypical value (midpoint, median) About the same, greater than, less than SData varies from ? to ?varies more than, varies less than OPossible outlier at, possible outlier between ? No outlier.

62 Learning Goal 3: Comparing Distributions Compare the histogram and boxplot for daily wind speeds: How does each display represent the distribution? The shape of a distribution is not always evident in a boxplot. Boxplots are particularly good at pointing out outliers.

63 Learning Goal 3: Comparing Distributions - Example It is almost always more interesting to compare groups. With histograms, note the shapes, centers, and spreads of the two distributions. When using histograms to compare data sets make sure to use the same scale for both sets of data. What does this graphical display tell you?

64 Learning Goal 3: Comparing Distributions – Example Solution The shapes, centers, and spreads of these two distributions are strikingly different. During spring and summer (histogram on the left), the distribution is skewed to the right. A typical day has an average wind speed of only 1 to 2 mph, with a range between.5 and 4 mph. In the colder months (histogram on the right), the shape is less strongly skewed and more spread out. The typical wind speed is higher, between 1 and 6 mph, and days with average wind speeds above 3 mph are not unusual. There is also an outlier at about 9 mph.

65 Learning Goal 3: Comparing Distributions - Example Boxplots offer an ideal balance of information and simplicity, hiding the details while displaying the overall summary information, when comparing different groups. We often plot them side by side for groups or categories we wish to compare. What do these boxplots tell you?

66 Learning Goal 3: Comparing Distributions – Example Solution By placing the boxplots side by side, we can easily see which groups have higher medians, which have the greater IQRs, where the middle 50% of the data is located in each group, and which have the greater overall range When the boxes are placed in order, we can get a general idea of patterns in both the centers and the spreads. Equally important, we can see past any outliers in making these comparisons because they’ve been displayed separately.

67 Learning Goal 3: Comparing Distributions - Example Here are boxplots for the number of runs scored in the AL and in the NL during 2008. (Note: the plots are on the same scale for comparison purposes.) Let’s compare using our four points (SOCS).

68 Learning Goal 3: Comparing Distributions – Example Solution Shape The AL distribution is skewed slightly left (the left half of the distribution appears more spread out). The NL distribution is approximately symmetric.

69 Learning Goal 3: Comparing Distributions – Example Solution Outliers Neither distribution contains an outlier.

70 Learning Goal 3: Comparing Distributions – Example Solution Center Typically, teams score more runs in the AL, about 785, because the median for the AL distribution is higher than the median for the NL distribution, about 740.

71 Learning Goal 3: Comparing Distributions – Example Solution Spread The AL distribution is slightly more spread out because it has both a larger range, 250 vs. 225, and larger IQR, around 100 vs. 70. This indicates there is more variability among AL teams and more consistency among NL teams.

72 Learning Goal 3: Comparing Distributions – Your Turn These data represent the responses of 20 female and 20 male Statistics students to the question, “How many pairs of shoes do you have?” Construct a stemplot and compare the two distributions. 5026 31571924222338 13501334233049131551 1476512388710 1145227510357 Females Males

73 Key: 4|9 represents a student who reported having 49 pairs of shoes. Males 0 4 0 555677778 1 0000124 1 2 3 3 58 4 5 Females 333 95 4332 66 410 8 9 100 7 Both males and females have distributions that are skewed to the right, though the distribution for the males is more heavily skewed. The typical number of pairs of shoes owned by males 9 is less than the typical number of pairs of shoes owned by females 26. The number of shoes owned by females varies (from 13 to 57) more than the number of shoes owned by males (from 4 to 38). The male distribution has three possible outliers at 22, 35 and 38. The females do not have any likely outliers. Learning Goal 3: Comparing Distributions - Solution

74 Learning Goal 3: Comparing Distributions - Problem Look at the following side-by-side boxplots and compare the female and male shoulder girth. a)Females have a typically smaller shoulder girth than males. b)Females have a typically larger shoulder girth than males. c)Females and males have about the same shoulder girths.

75 Learning Goal 3: Comparing Distributions – Problem (answer) Look at the following side-by-side boxplots and compare the female and male shoulder girth. a)Females have a typically smaller shoulder girth than males. b)Females have a typically larger shoulder girth than males. c)Females and males have about the same shoulder girths.

76 Learning Goal 3: Comparing Distributions - Problem Look at the following side-by-side boxplots and compare the female and male thigh girth. a)Females have a typically smaller thigh girth than males. b)Females have a typically larger thigh girth than males. c)Females and males have about the same thigh girth.

77 Learning Goal 3: Comparing Distributions – Problem (answer) Look at the following side-by-side boxplots and compare the female and male thigh girth. a)Females have a typically smaller thigh girth than males. b)Females have a typically larger thigh girth than males. c)Females and males have about the same thigh girth.

78 Learning Goal 3: Comparing Distributions - Problem Compare the centers of Distribution A (Female Shoulder Girth) and Distribution B (Male Shoulder Girth) shown below. a)The center of Distribution A is greater than the center of Distribution B. b)The center of Distribution A is less than the center of Distribution B. c)The center of Distribution A is equal to the center of Distribution B.

79 Learning Goal 3: Comparing Distributions – Problem (answer) Compare the centers of Distribution A (Female Shoulder Girth) and Distribution B (Male Shoulder Girth) shown below. a)The center of Distribution A is greater than the center of Distribution B. b)The center of Distribution A is less than the center of Distribution B. c)The center of Distribution A is equal to the center of Distribution B.

80 Learning Goal 3: Comparing Distributions - Problem Compare the spreads of Distribution A (Female Shoulder Girth) and Distribution B (Male Shoulder Girth) shown below. a)The spread of Distribution A is greater than the spread of Distribution B. b)The spread of Distribution A is less than the spread of Distribution B. c)The spread of Distribution A is equal to the spread of Distribution B.

81 Learning Goal 3: Comparing Distributions – Problem (answer) Compare the spreads of Distribution A (Female Shoulder Girth) and Distribution B (Male Shoulder Girth) shown below. a)The spread of Distribution A is greater than the spread of Distribution B. b)The spread of Distribution A is less than the spread of Distribution B. c)The spread of Distribution A is equal to the spread of Distribution B.

82 Learning Goal 4 Be able to display data over time: Time Plots.

83 Learning Goal 4: Time Plots Used for displaying a time series, a data set collected over time. Plots each observation on the vertical scale against the time it was measured on the horizontal scale. Points are usually connected. Common patterns in the data over time, known as trends, should be noted.

84 Learning Goal 4: Time Plots - Example A Time Plot from 1995 – 2001 of the number of people worldwide who use the Internet.

85 Learning Goal 4: Constructing Time Plots To make a Time Plot – 1.Put time on the horizontal scale. 2.Put the variable being measured on the vertical scale. 3.Connect the data points by lines. Interpreting Time Plots – look for 1.An overall pattern – TREND (upward or downward) 2.Outliers – strong deviations from the overall pattern or trend.

86 Learning Goal 4: Time Plots - Patterns We describe time series by looking for an overall pattern and for striking deviations from that pattern. In a time series: A trend is a rise or fall that persists over time, despite small irregularities. A pattern that repeats itself at regular intervals of time is called seasonal variation.

87 Learning Goal 4: Describing Time Plots This time plot shows a regular pattern of yearly variations. These are seasonal variations in fresh orange pricing most likely due to similar seasonal variations in the production of fresh oranges. There is also an overall upward trend in pricing over time. It could simply be reflecting inflation trends or a more fundamental change in this industry. Retail price of fresh oranges over time

88 A picture is worth a thousand words, BUT There is nothing like hard numbers.  Look at the scales. Scales Matter How you stretch the axes and choose your scales can give a different impression. Learning Goal 4: Describing Time Plots

89 May contain outliers, deviations from the pattern. Try to explain. Interpolation – to predict a value from the pattern between known values. (within the data) Extrapolation – to predict a value by extending the pattern into the future (outside the data). Dangerous, no guarantee the pattern continues.

90 Learning Goal 4: Time Plots - Problem Which of these plots is a time plot? a)Plot A b)Plot B c)both d)neither

91 Learning Goal 4: Time Plots - Problem (answer) Which of these plots is a time plot? a)Plot A b)Plot B c)both d)neither

92 Learning Goal 4: Time Plots - Problem What type of trend does the following time plot show? a)Downward b)Upward c)No trend

93 Learning Goal 4: Time Plots - Problem (answer) What type of trend does the following time plot show? a)Downward b)Upward c)No trend

94 Learning Goal 4: Time Plots - Problem What would be the correct interpretation of the following graph? a)There is an upward trend in the data. b)There is no trend in the data. c)There is a downward trend in the data. d)The data show an existence of seasonal variation.

95 Learning Goal 4: Time Plots - Problem (answer) What would be the correct interpretation of the following graph? a)There is an upward trend in the data. b)There is no trend in the data. c)There is a downward trend in the data. d)The data show an existence of seasonal variation.

96 Learning Goal 5 Understand how re-expressing data can improve the symmetry of the data.

97 Learning Goal 5: Re-expressing Skewed Data to Improve Symmetry When the data are skewed it can be hard to summarize them simply with a center and spread, and hard to decide whether the most extreme values are outliers or just part of a stretched out tail. How can we say anything useful about such data? The secret is to re-express the data by applying a simple function (logarithms, square roots, and reciprocals) to each value.

98 Learning Goal 5: Re-expressing Skewed Data to Improve Symmetry One way to make a skewed distribution more symmetric is to re-express or transform the data by applying a simple function (e.g., logarithmic function). Note the change in skewness from the raw data (top) to the transformed data (right):

99 Assignment Chapter 5 Notes Worksheet Exercises pg. 95 – 103: #5, 9, 13, 15, 21, 23, 27, 37, 39 Read Ch-6, pg. 104 - 128


Download ppt "Chapter 5 Understanding and Comparing Distributions."

Similar presentations


Ads by Google