Published by Paul Boyd. Modified over 8 years ago.
1
Fundamentals of Business Statistics
Chapter 2 Descriptive Statistics: Tabular and Graphical Presentations
2
上海金融学院 (Shanghai Finance University)
Chapter 2 Descriptive Statistics
Statistics in Practice
Supermarkets have become most people's first choice for shopping. Since China joined the WTO, large foreign retailers have entered China and opened chain stores in several cities, which will undoubtedly intensify competition in retailing. In 2003, a newly opened supermarket, adapting to this new environment, invested not only in its facilities but also in its service quality. To understand what customers wanted, the supermarket surveyed a random sample of 100 customers about its service quality. One of the questions was: "What do you think of the service quality in this supermarket? Please mark one option: A. very good B. good C. average D. bad E. very bad." Customers who completed the questionnaire received a 95% discount for their cooperation. Table A below is the original record of the questionnaires.
3
The supermarket trained its employees to improve service quality in response to the customer survey. To find out whether sales rose after the improvement, the supermarket recorded its daily sales for the third fiscal quarter of 2004; Table B shows the result.
Table A: The original record of the customers' survey
4
To analyze the data above, what would you do?
Table B: The original data of daily sales
5
Chapter Goals
After completing this chapter, you should be able to:
- Summarize Qualitative Data
  - Frequency Distribution
  - Relative Frequency Distribution
  - Percent Frequency Distribution
  - Bar Graph
  - Pie Chart
- Summarize Quantitative Data
  - Frequency Distribution
  - Relative Frequency and Percent Frequency Distributions
  - Dot Plot
  - Histogram
  - Cumulative Distributions
  - Ogive
6
Chapter Goals
After completing this chapter, you should be able to:
- Carry Out Exploratory Data Analysis
  - Stem-and-Leaf Display
- Construct Crosstabulations and Scatter Diagrams
  - Crosstabulations
  - Scatter Diagrams
7
Section 2.1 Summarizing Qualitative Data
- Frequency Distribution
- Relative Frequency Distribution
- Percent Frequency Distribution
- Bar Graph
- Pie Chart
8
Frequency Distribution
- A frequency distribution is a tabular summary of data showing the frequency (or number) of items in each of several nonoverlapping classes.
- The objective is to provide insights about the data that cannot be quickly obtained by looking only at the original data.
9
Example: Marada Inn
Guests staying at Marada Inn were asked to rate the quality of their accommodations as being excellent, above average, average, below average, or poor. The ratings provided by a sample of 20 guests are:
Below Average, Above Average, Average, Above Average, Average, Above Average, Average, Above Average, Below Average, Poor, Excellent, Above Average, Average, Above Average, Below Average, Poor, Above Average, Average
10
Frequency Distribution
Rating          Frequency
Poor            2
Below Average   3
Average         5
Above Average   9
Excellent       1
Total           20
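As a rough illustration, the tally behind this table could be sketched in Python as follows. The `ratings` list is an assumption: it is rebuilt from the class counts in the table rather than copied from the original survey sheet, and the variable names are only illustrative.

```python
from collections import Counter

# Ratings rebuilt from the class counts on the slide above; the order of the
# individual responses does not matter for a frequency distribution.
ratings = (["Poor"] * 2 + ["Below Average"] * 3 + ["Average"] * 5
           + ["Above Average"] * 9 + ["Excellent"] * 1)

# Class order from worst to best, as on the slide.
classes = ["Poor", "Below Average", "Average", "Above Average", "Excellent"]

counts = Counter(ratings)                      # tally each rating
frequency = {c: counts[c] for c in classes}    # keep the slide's class order

for rating, freq in frequency.items():
    print(f"{rating:<15}{freq:>3}")
print(f"{'Total':<15}{sum(frequency.values()):>3}")
```

Keeping an explicit `classes` list preserves the natural order of the rating scale, which a plain tally would not guarantee.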
11
Relative Frequency Distribution
- The relative frequency of a class is the fraction or proportion of the total number of data items belonging to the class.
- A relative frequency distribution is a tabular summary of a set of data showing the relative frequency for each class.
12
Percent Frequency Distribution
- The percent frequency of a class is the relative frequency multiplied by 100.
- A percent frequency distribution is a tabular summary of a set of data showing the percent frequency for each class.
13
Relative Frequency and Percent Frequency Distributions
Rating          Relative Frequency   Percent Frequency
Poor            .10                  10
Below Average   .15                  15
Average         .25                  25
Above Average   .45                  45
Excellent       .05                  5
Total           1.00                 100
For example, the relative frequency of Excellent is 1/20 = .05, and the percent frequency of Poor is .10(100) = 10.
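The two columns above follow directly from the frequencies. A minimal sketch, assuming the Marada Inn counts from the frequency distribution (variable names are illustrative only):

```python
# Frequencies from the Marada Inn frequency distribution.
frequency = {"Poor": 2, "Below Average": 3, "Average": 5,
             "Above Average": 9, "Excellent": 1}

n = sum(frequency.values())                              # 20 guests in total

relative = {c: f / n for c, f in frequency.items()}      # fraction of the total
percent = {c: 100 * f / n for c, f in frequency.items()} # relative frequency x 100

print(f"{'Rating':<15}{'Rel. Freq.':>11}{'Pct. Freq.':>11}")
for c in frequency:
    print(f"{c:<15}{relative[c]:>11.2f}{percent[c]:>11.0f}")
```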
14
Bar Graph
- A bar graph is a graphical device for depicting qualitative data.
- On one axis (usually the horizontal axis), we specify the labels that are used for each of the classes.
- A frequency, relative frequency, or percent frequency scale can be used for the other axis (usually the vertical axis).
- Using a bar of fixed width drawn above each class label, we extend the height appropriately.
- The bars are separated to emphasize the fact that each class is a separate category.
15
Bar Graph
[Figure: bar graph, "Marada Inn Quality Ratings". Horizontal axis: Rating (Poor, Below Average, Average, Above Average, Excellent). Vertical axis: Frequency, 1 to 10.]
16
Pie Chart
- The pie chart is a commonly used graphical device for presenting relative frequency distributions for qualitative data.
- First draw a circle; then use the relative frequencies to subdivide the circle into sectors that correspond to the relative frequency for each class.
- Since there are 360 degrees in a circle, a class with a relative frequency of .25 would consume .25(360) = 90 degrees of the circle.
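The sector rule can be checked with a few lines of Python. This is only a sketch; it assumes the Marada Inn relative frequencies used elsewhere in this section.

```python
# Relative frequencies from the Marada Inn example.
relative = {"Poor": 0.10, "Below Average": 0.15, "Average": 0.25,
            "Above Average": 0.45, "Excellent": 0.05}

# Each class gets a share of the 360 degrees equal to its relative frequency.
angles = {c: r * 360 for c, r in relative.items()}

for c, a in angles.items():
    print(f"{c:<15}{a:>6.0f} degrees")
```

The angles add up to 360 degrees because the relative frequencies add up to 1.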
17
Pie Chart
[Figure: pie chart, "Marada Inn Quality Ratings". Sectors: Poor 10%, Below Average 15%, Average 25%, Above Average 45%, Excellent 5%.]
18
Example: Marada Inn
Insights Gained from the Preceding Pie Chart
- One-half of the customers surveyed gave Marada a quality rating of "above average" or "excellent" (looking at the left side of the pie). This might please the manager.
- For each customer who gave an "excellent" rating, there were two customers who gave a "poor" rating (looking at the top of the pie). This should displease the manager.
19
Section 2.2 Summarizing Quantitative Data
- Frequency Distribution
- Relative Frequency and Percent Frequency Distributions
- Dot Plot
- Histogram
- Cumulative Distributions
- Ogive
20
Example: Hudson Auto Repair
The manager of Hudson Auto would like to have a better understanding of the cost of parts used in the engine tune-ups performed in the shop. She examines 50 customer invoices for tune-ups. The costs of parts, rounded to the nearest dollar, are listed on the next slide.
21
Example: Hudson Auto Repair
Sample of Parts Cost for 50 Tune-ups
[Table of the 50 parts costs]
22
Frequency Distribution
Guidelines for Selecting Number of Classes
- Use between 5 and 20 classes.
- Data sets with a larger number of elements usually require a larger number of classes.
- Smaller data sets usually require fewer classes.
23
Frequency Distribution
Guidelines for Selecting Width of Classes
- Use classes of equal width.
- Approximate Class Width = (Largest Data Value - Smallest Data Value) / Number of Classes
24
Frequency Distribution
For Hudson Auto Repair, if we choose six classes:
Approximate Class Width = (109 - 52)/6 = 9.5 ≈ 10
Parts Cost ($)   Frequency
50-59            2
60-69            13
70-79            16
80-89            7
90-99            7
100-109          5
Total            50
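The class-width rule and the resulting counts can be reproduced with a short sketch. The `costs` list is read off the stem-and-leaf display in Section 2.3, since the raw data table itself is not reproduced here; rounding the width up with `math.ceil` and starting the first class at 50 are assumptions chosen to match the classes on the slide.

```python
import math

# The 50 parts costs, read off the stem-and-leaf display in Section 2.3.
costs = [52, 57,
         62, 62, 62, 62, 65, 66, 67, 68, 68, 68, 69, 69, 69,
         71, 71, 72, 72, 73, 74, 74, 75, 75, 75, 76, 77, 78, 79, 79, 79,
         80, 80, 82, 83, 85, 88, 89,
         91, 93, 97, 97, 97, 98, 99,
         101, 104, 105, 105, 109]

num_classes = 6
approx_width = (max(costs) - min(costs)) / num_classes   # (109 - 52) / 6 = 9.5
width = math.ceil(approx_width)                          # rounded up to 10

first_lower_limit = 50   # assumed starting point, matching the slide's classes
frequency = {}
for k in range(num_classes):
    low = first_lower_limit + k * width
    high = low + width - 1
    # Count the costs that fall inside the class limits (both ends included).
    frequency[f"{low}-{high}"] = sum(low <= c <= high for c in costs)

for label, freq in frequency.items():
    print(f"{label:<9}{freq:>3}")
```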
25
Relative Frequency and Percent Frequency Distributions
Parts Cost ($)   Relative Frequency   Percent Frequency
50-59            .04                  4
60-69            .26                  26
70-79            .32                  32
80-89            .14                  14
90-99            .14                  14
100-109          .10                  10
Total            1.00                 100
For example, for the 50-59 class the relative frequency is 2/50 = .04 and the percent frequency is .04(100) = 4.
26
Relative Frequency and Percent Frequency Distributions
Insights Gained from the Percent Frequency Distribution
- Only 4% of the parts costs are in the $50-59 class.
- 30% of the parts costs are under $70.
- The greatest percentage (32% or almost one-third) of the parts costs are in the $70-79 class.
- 10% of the parts costs are $100 or more.
27
Dot Plot
- One of the simplest graphical summaries of data is a dot plot.
- A horizontal axis shows the range of data values.
- Then each data value is represented by a dot placed above the axis.
28
Dot Plot
[Figure: dot plot, "Tune-up Parts Cost". Horizontal axis: Cost ($), 50 to 110.]
29
Histogram
- Another common graphical presentation of quantitative data is a histogram.
- The variable of interest is placed on the horizontal axis.
- A rectangle is drawn above each class interval with its height corresponding to the interval's frequency, relative frequency, or percent frequency.
- Unlike a bar graph, a histogram has no natural separation between rectangles of adjacent classes.
30
Histogram
[Figure: histogram, "Tune-up Parts Cost". Horizontal axis: Parts Cost ($), classes 50-59, 60-69, 70-79, 80-89, 90-99, 100-110. Vertical axis: Frequency, 2 to 18.]
31
Histogram
Symmetric
- The left tail is the mirror image of the right tail.
- Examples: heights and weights of people
[Figure: symmetric histogram. Vertical axis: Relative Frequency, 0 to .35.]
32
Histogram
Moderately Skewed Left
- A longer tail to the left
- Example: exam scores
[Figure: moderately left-skewed histogram. Vertical axis: Relative Frequency, 0 to .35.]
33
Histogram
Moderately Skewed Right
- A longer tail to the right
- Example: housing values
[Figure: moderately right-skewed histogram. Vertical axis: Relative Frequency, 0 to .35.]
34
Histogram
Highly Skewed Right
- A very long tail to the right
- Example: executive salaries
[Figure: highly right-skewed histogram. Vertical axis: Relative Frequency, 0 to .35.]
35
Cumulative Distributions
- A cumulative frequency distribution shows the number of items with values less than or equal to the upper limit of each class.
- A cumulative relative frequency distribution shows the proportion of items with values less than or equal to the upper limit of each class.
- A cumulative percent frequency distribution shows the percentage of items with values less than or equal to the upper limit of each class.
36
Cumulative Distributions
Hudson Auto Repair
Cost ($)   Cumulative Frequency   Cumulative Relative Frequency   Cumulative Percent Frequency
≤ 59       2                      .04                             4
≤ 69       15                     .30                             30
≤ 79       31                     .62                             62
≤ 89       38                     .76                             76
≤ 99       45                     .90                             90
≤ 109      50                     1.00                            100
For example, for the ≤ 69 row: cumulative frequency 2 + 13 = 15, cumulative relative frequency 15/50 = .30, cumulative percent frequency .30(100) = 30.
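The three cumulative columns are running totals of the class frequencies. A minimal sketch using the Hudson Auto Repair class frequencies (variable names are illustrative):

```python
from itertools import accumulate

# Upper class limits and class frequencies from the Hudson Auto Repair table.
upper_limits = [59, 69, 79, 89, 99, 109]
frequency = [2, 13, 16, 7, 7, 5]

n = sum(frequency)                                  # 50 tune-ups
cum_freq = list(accumulate(frequency))              # running total of frequencies
cum_rel = [cf / n for cf in cum_freq]               # proportion at or below each limit
cum_pct = [100 * cf / n for cf in cum_freq]         # percentage at or below each limit

for u, cf, cr, cp in zip(upper_limits, cum_freq, cum_rel, cum_pct):
    print(f"<= {u:<4}{cf:>4}{cr:>7.2f}{cp:>6.0f}")
```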
37
Ogive
- An ogive is a graph of a cumulative distribution.
- The data values are shown on the horizontal axis.
- Shown on the vertical axis are the:
  - cumulative frequencies, or
  - cumulative relative frequencies, or
  - cumulative percent frequencies
- The frequency (one of the above) of each class is plotted as a point.
- The plotted points are connected by straight lines.
38
Ogive
Hudson Auto Repair
- Because the class limits for the parts-cost data are 50-59, 60-69, and so on, there appear to be one-unit gaps from 59 to 60, 69 to 70, and so on.
- These gaps are eliminated by plotting points halfway between the class limits.
- Thus, 59.5 is used for the 50-59 class, 69.5 is used for the 60-69 class, and so on.
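The halfway rule gives the coordinates of the ogive directly. The sketch below assumes the cumulative percent frequencies from the Hudson Auto Repair table; starting the curve at (49.5, 0) is an added assumption, a common convention so that the line begins at zero.

```python
# Upper class limits and cumulative percent frequencies for Hudson Auto Repair.
upper_limits = [59, 69, 79, 89, 99, 109]
cum_pct = [4, 30, 62, 76, 90, 100]

# Plot each point halfway between a class's upper limit and the next
# class's lower limit, e.g. 59.5 for the 50-59 class.
points = [(u + 0.5, p) for u, p in zip(upper_limits, cum_pct)]

# Assumed starting point: no costs fall below the first class.
points = [(49.5, 0)] + points

for x, y in points:
    print(f"({x}, {y})")
```

Connecting these points with straight lines gives the ogive; the point (89.5, 76), for instance, says that 76% of the parts costs are $89 or less.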
39
Ogive with Cumulative Percent Frequencies
[Figure: ogive, "Tune-up Parts Cost". Horizontal axis: Parts Cost ($), 50 to 110. Vertical axis: Cumulative Percent Frequency, 20 to 100. The point (89.5, 76) is labeled.]
40
Section 2.3 Exploratory Data Analysis
- Stem-and-Leaf Display
41
Exploratory Data Analysis
- The techniques of exploratory data analysis consist of simple arithmetic and easy-to-draw pictures that can be used to summarize data quickly.
- One such technique is the stem-and-leaf display.
 5 | 2 7
 6 | 2 2 2 2 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3 5 8 9
 9 | 1 3 7 7 7 8 9
10 | 1 4 5 5 9
(stems to the left of the vertical line, leaves to the right)
42
Stem-and-Leaf Display
- A stem-and-leaf display shows both the rank order and shape of the distribution of the data.
- It is similar to a histogram on its side, but it has the advantage of showing the actual data values.
- The first digits of each data item are arranged to the left of a vertical line.
- To the right of the vertical line we record the last digit for each item in rank order.
- Each line in the display is referred to as a stem.
- Each digit on a stem is a leaf.
 5 | 2 7
 6 | 2 2 2 2 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3 5 8 9
 9 | 1 3 7 7 7 8 9
10 | 1 4 5 5 9
[Figure: the display shown with a Frequency axis, 0 to 21.]
43
Example: Hudson Auto Repair
The manager of Hudson Auto would like to have a better understanding of the cost of parts used in the engine tune-ups performed in the shop. She examines 50 customer invoices for tune-ups. The costs of parts, rounded to the nearest dollar, are listed on the next slide.
44
Example: Hudson Auto Repair
Sample of Parts Cost for 50 Tune-ups
 5 | 2 7
 6 | 2 2 2 2 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3 5 8 9
 9 | 1 3 7 7 7 8 9
10 | 1 4 5 5 9
In the first line, 5 is a stem and each digit to its right (for example, 2) is a leaf.
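One way to build such a display programmatically is sketched below. It assumes whole-number data with a leaf unit of 1; the `costs` list is read off the display above, and the function name `stem_and_leaf` is hypothetical.

```python
# The 50 parts costs, read off the stem-and-leaf display above.
costs = [52, 57,
         62, 62, 62, 62, 65, 66, 67, 68, 68, 68, 69, 69, 69,
         71, 71, 72, 72, 73, 74, 74, 75, 75, 75, 76, 77, 78, 79, 79, 79,
         80, 80, 82, 83, 85, 88, 89,
         91, 93, 97, 97, 97, 98, 99,
         101, 104, 105, 105, 109]

def stem_and_leaf(values):
    """Group whole numbers by leading digit(s) (stem) and last digit (leaf)."""
    rows = {}
    for v in sorted(values):          # sorting puts the leaves in rank order
        stem, leaf = divmod(v, 10)    # e.g. 109 -> stem 10, leaf 9
        rows.setdefault(stem, []).append(leaf)
    return rows

rows = stem_and_leaf(costs)
lines = [f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}"
         for stem, leaves in rows.items()]
print("\n".join(lines))
```

The length of each row equals the class frequency, which is why the display resembles a histogram turned on its side.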
45
Stretched Stem-and-Leaf Display
- If we believe the original stem-and-leaf display has condensed the data too much, we can stretch the display by using two stems for each leading digit(s).
- Whenever a stem value is stated twice, the first value corresponds to leaf values of 0-4, and the second value corresponds to leaf values of 5-9.
46
Example: Hudson Auto Repair
Sample of Parts Cost for 50 Tune-ups
 5 | 2
 5 | 7
 6 | 2 2 2 2
 6 | 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4
 7 | 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3
 8 | 5 8 9
 9 | 1 3
 9 | 7 7 7 8 9
10 | 1 4
10 | 5 5 9
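The stretched display only changes the grouping key: each stem is split into a lower half (leaves 0-4) and an upper half (leaves 5-9). A sketch under the same assumptions as before (whole-number data, leaf unit 1, hypothetical function name):

```python
# The 50 parts costs, read off the stretched display above.
costs = [52, 57,
         62, 62, 62, 62, 65, 66, 67, 68, 68, 68, 69, 69, 69,
         71, 71, 72, 72, 73, 74, 74, 75, 75, 75, 76, 77, 78, 79, 79, 79,
         80, 80, 82, 83, 85, 88, 89,
         91, 93, 97, 97, 97, 98, 99,
         101, 104, 105, 105, 109]

def stretched_stem_and_leaf(values):
    """Like a stem-and-leaf display, but with two rows per stem."""
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        half = 0 if leaf <= 4 else 1   # 0: leaves 0-4, 1: leaves 5-9
        rows.setdefault((stem, half), []).append(leaf)
    return rows

rows = stretched_stem_and_leaf(costs)
for (stem, _half), leaves in rows.items():
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}")
```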
47
Stem-and-Leaf Display
Leaf Units
- A single digit is used to define each leaf.
- In the preceding example, the leaf unit was 1.
- Leaf units may be 100, 10, 1, 0.1, and so on.
- Where the leaf unit is not shown, it is assumed to equal 1.
48
Example: Leaf Unit = 0.1
If we have data with values such as
8.6  11.7  9.4  9.1  10.2  11.0  8.8
a stem-and-leaf display of these data will be
Leaf Unit = 0.1
 8 | 6 8
 9 | 1 4
10 | 2
11 | 0 7
49
Example: Leaf Unit = 10
If we have data with values such as
1806  1717  1974  1791  1682  1910  1838
a stem-and-leaf display of these data will be
Leaf Unit = 10
16 | 8
17 | 1 9
18 | 0 3
19 | 1 7
The 82 in 1682 is rounded down to 80 and is represented as an 8.
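With a leaf unit of 10, the last digit of each value is dropped before the stem and leaf are separated. The sketch below handles only this whole-number case; the function name is hypothetical.

```python
values = [1806, 1717, 1974, 1791, 1682, 1910, 1838]
leaf_unit = 10

def stem_and_leaf_with_unit(values, leaf_unit):
    """Stem-and-leaf rows for whole numbers and a whole-number leaf unit."""
    rows = {}
    for v in sorted(values):
        units = v // leaf_unit          # round down: 1682 -> 168 tens
        stem, leaf = divmod(units, 10)  # 168 -> stem 16, leaf 8
        rows.setdefault(stem, []).append(leaf)
    return rows

rows = stem_and_leaf_with_unit(values, leaf_unit)
print(f"Leaf Unit = {leaf_unit}")
for stem, leaves in rows.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Integer division is used on purpose: it rounds down exactly as described on the slide and avoids floating-point surprises. A leaf unit of 0.1 would need the data scaled to whole numbers first.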
50
Section 2.4 Crosstabulations and Scatter Diagrams
- Crosstabulations
- Scatter Diagrams
51
Crosstabulations and Scatter Diagrams
- Thus far we have focused on methods that are used to summarize the data for one variable at a time.
- Often a manager is interested in tabular and graphical methods that will help understand the relationship between two variables.
- Crosstabulation and a scatter diagram are two methods for summarizing the data for two (or more) variables simultaneously.
52
Crosstabulation
- A crosstabulation is a tabular summary of data for two variables.
- Crosstabulation can be used when:
  - one variable is qualitative and the other is quantitative,
  - both variables are qualitative, or
  - both variables are quantitative.
- The left and top margin labels define the classes for the two variables.
53
Crosstabulation
Example: Finger Lakes Homes
The number of Finger Lakes homes sold for each style and price for the past two years is shown below.
                        Home Style
Price Range   Colonial   Log   Split   A-Frame   Total
≤ $99,000     18         6     19      12        55
> $99,000     12         14    16      3         45
Total         30         20    35      15        100
Price range is the quantitative variable; home style is the qualitative variable.
54
Crosstabulation
Insights Gained from Preceding Crosstabulation
- The greatest number of homes in the sample (19) are a split-level style and priced at less than or equal to $99,000.
- Only three homes in the sample are an A-Frame style and priced at more than $99,000.
55
Crosstabulation
                        Home Style
Price Range   Colonial   Log   Split   A-Frame   Total
≤ $99,000     18         6     19      12        55
> $99,000     12         14    16      3         45
Total         30         20    35      15        100
The Total column is the frequency distribution for the price variable; the Total row is the frequency distribution for the home style variable.
56
Crosstabulation: Row or Column Percentages
Converting the entries in the table into row percentages or column percentages can provide additional insight about the relationship between the two variables.
57
Crosstabulation: Row Percentages
                        Home Style
Price Range   Colonial   Log     Split   A-Frame   Total
≤ $99,000     32.73      10.91   34.55   21.82     100
> $99,000     26.67      31.11   35.56   6.67      100
Note: row totals are actually 100.01 due to rounding.
For example: (Colonial and > $99K)/(All > $99K) x 100 = (12/45) x 100 = 26.67
58
Crosstabulation: Column Percentages
                        Home Style
Price Range   Colonial   Log     Split   A-Frame
≤ $99,000     60.00      30.00   54.29   80.00
> $99,000     40.00      70.00   45.71   20.00
Total         100        100     100     100
For example: (Colonial and > $99K)/(All Colonial) x 100 = (12/30) x 100 = 40.00
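Row and column percentages both come from the same table of counts; only the divisor changes. A minimal sketch using the Finger Lakes Homes counts (the dictionary layout and names are illustrative choices, not part of the original example):

```python
styles = ["Colonial", "Log", "Split", "A-Frame"]

# Counts of homes by price range (rows) and home style (columns).
counts = {
    "<= $99,000": [18, 6, 19, 12],
    "> $99,000":  [12, 14, 16, 3],
}

# Row percentages: divide each entry by its row total.
row_pct = {price: [round(100 * c / sum(row), 2) for c in row]
           for price, row in counts.items()}

# Column percentages: divide each entry by its column total.
col_totals = [sum(col) for col in zip(*counts.values())]
col_pct = {price: [round(100 * c / t, 2) for c, t in zip(row, col_totals)]
           for price, row in counts.items()}

print("Row percentages:   ", row_pct)
print("Column percentages:", col_pct)
```

Row percentages answer "of the homes in this price range, what share is each style?", while column percentages answer "of the homes of this style, what share falls in each price range?".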
59
Crosstabulation: Simpson's Paradox
Data in two or more crosstabulations are often aggregated to produce a summary crosstabulation.
Judge Luckett
Verdict    Common Pleas   Municipal Court   Total
Upheld     29 (91%)       100 (85%)         129
Reversed   3 (9%)         18 (15%)          21
Total      32 (100%)      118 (100%)        150
Judge Kendall
Verdict    Common Pleas   Municipal Court   Total
Upheld     90 (90%)       20 (80%)          110
Reversed   10 (10%)       5 (20%)           15
Total      100 (100%)     25 (100%)         125
60
Crosstabulation: Simpson's Paradox
                 Judge
Verdict    Luckett      Kendall      Total
Upheld     129 (86%)    110 (88%)    239
Reversed   21 (14%)     15 (12%)     36
Total      150 (100%)   125 (100%)   275
Judge Luckett
Verdict    Common Pleas   Municipal Court   Total
Upheld     29 (91%)       100 (85%)         129
Reversed   3 (9%)         18 (15%)          21
Total      32 (100%)      118 (100%)        150
Judge Kendall
Verdict    Common Pleas   Municipal Court   Total
Upheld     90 (90%)       20 (80%)          110
Reversed   10 (10%)       5 (20%)           15
Total      100 (100%)     25 (100%)         125
Who is better?
61
Crosstabulation: Simpson's Paradox
- We must be careful in drawing conclusions about the relationship between the two variables in the aggregated crosstabulation.
- Simpson's Paradox: In some cases the conclusions based upon an aggregated crosstabulation can be completely reversed if we look at the unaggregated data.
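The reversal can be verified numerically. The sketch below uses the upheld and reversed counts for the two judges from the preceding slides; the data layout and the helper name `pct_upheld` are illustrative.

```python
# (upheld, reversed) counts per judge and court, from the preceding slides.
cases = {
    "Luckett": {"Common Pleas": (29, 3), "Municipal": (100, 18)},
    "Kendall": {"Common Pleas": (90, 10), "Municipal": (20, 5)},
}

def pct_upheld(upheld, reversed_count):
    """Percentage of verdicts upheld."""
    return 100 * upheld / (upheld + reversed_count)

# Unaggregated view: percent upheld within each court.
by_court = {judge: {court: pct_upheld(*ur) for court, ur in courts.items()}
            for judge, courts in cases.items()}

# Aggregated view: pool both courts for each judge.
overall = {judge: pct_upheld(sum(u for u, _ in courts.values()),
                             sum(r for _, r in courts.values()))
           for judge, courts in cases.items()}

for judge in cases:
    detail = ", ".join(f"{c}: {p:.0f}%" for c, p in by_court[judge].items())
    print(f"{judge}: {detail}; overall: {overall[judge]:.0f}%")
```

Luckett has the higher upheld rate in each court, yet Kendall has the higher rate overall, because most of Kendall's cases come from the court where verdicts are upheld more often.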
62
Scatter Diagram and Trendline
- A scatter diagram is a graphical presentation of the relationship between two quantitative variables.
- One variable is shown on the horizontal axis and the other variable is shown on the vertical axis.
- The general pattern of the plotted points suggests the overall relationship between the variables.
- A trendline is an approximation of the relationship.
63
Scatter Diagram
Example: Panthers Football Team
The Panthers football team is interested in investigating the relationship, if any, between interceptions and points scored.
x = Number of Interceptions   y = Number of Points Scored
1                             14
3                             24
2                             18
1                             17
3                             30
64
Scatter Diagram
[Figure: scatter diagram of the Panthers data. Horizontal axis: x, Number of Interceptions, 0 to 4. Vertical axis: y, Number of Points Scored, 0 to 35.]
65
Example: Panthers Football Team
Insights Gained from the Preceding Scatter Diagram
- The scatter diagram indicates a positive relationship between the number of interceptions and the number of points scored.
- Higher points scored are associated with a higher number of interceptions.
- The relationship is not perfect; not all plotted points in the scatter diagram lie on a straight line.
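The slides do not say how a trendline is fitted. As one possible approach, the sketch below fits an ordinary least squares line to the five Panthers data points; least squares is an assumption here, chosen because it is the usual default, and the variable names are illustrative.

```python
# Panthers data: interceptions (x) and points scored (y).
x = [1, 3, 2, 1, 3]
y = [14, 24, 18, 17, 30]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares slope and intercept (assumed fitting method).
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

print(f"trendline: y = {intercept:.2f} + {slope:.2f}x")
```

A positive slope agrees with the positive relationship seen in the scatter diagram.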
66
Scatter Diagram
A Positive Relationship
[Figure: scatter diagram with points rising from left to right. Axes: x (horizontal), y (vertical).]
67
Scatter Diagram
A Negative Relationship
[Figure: scatter diagram with points falling from left to right. Axes: x (horizontal), y (vertical).]
68
Scatter Diagram
No Apparent Relationship
[Figure: scatter diagram with no visible pattern. Axes: x (horizontal), y (vertical).]
69
Chapter Summary
Tabular and Graphical Procedures
Qualitative Data
- Tabular Methods: Frequency Distribution; Rel. Freq. Dist.; Percent Freq. Distribution; Crosstabulation
- Graphical Methods: Bar Graph; Pie Chart
Quantitative Data
- Tabular Methods: Frequency Distribution; Rel. Freq. Dist.; Cum. Freq. Dist.; Cum. Rel. Freq. Distribution; Stem-and-Leaf Display; Crosstabulation
- Graphical Methods: Dot Plot; Histogram; Ogive; Scatter Diagram
70
Homework
Chapter 2 Exercises
Page 41 19