Published by Matthew Potter. Modified over 9 years ago.
1
DEPARTMENT OF QUANTITATIVE METHODS & INFORMATION SYSTEMS Introduction to Business Statistics QM 120 Chapter 2
2
Summation notation
The Greek letter Σ (capital sigma) is used to designate summation. Suppose a sample consists of five books whose prices are $75, $80, $35, $97, and $88. We denote the variable "price of a book" by x. The prices of the five books can be written as follows:
Price of the first book = x1 = $75
Price of the second book = x2 = $80
Price of the third book = x3 = $35
Price of the fourth book = x4 = $97
Price of the fifth book = x5 = $88
The subscript of x denotes the number of the book.
3
Summation notation
In the previous notation, x represents the price and the subscript denotes a particular book. To add the prices of all five books, we have
x1 + x2 + x3 + x4 + x5 = 75 + 80 + 35 + 97 + 88 = $375
Or, using the summation notation Σ (sigma), which denotes the sum of all values, we can write the sum as
Σx = x1 + x2 + x3 + x4 + x5 = $375
More formally, the summation notation for 5 values is
Σ_{i=1}^{5} x_i = x1 + x2 + x3 + x4 + x5
4
Summation notation
Example: Suppose the ages of four managers are 35, 47, 28, and 60 years. Find
a) Σx
b) Σ(x - 6)
c) (Σx)²
d) Σx²
Solution:
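The worked solution did not survive in this transcript. As a minimal sketch, the four sums can be checked in Python; the variable names below are illustrative and not part of the original slides.

```python
# Ages of the four managers from the example.
ages = [35, 47, 28, 60]

sum_x = sum(ages)                            # a) sum of x
sum_x_minus_6 = sum(x - 6 for x in ages)     # b) sum of (x - 6)
square_of_sum = sum(ages) ** 2               # c) (sum of x) squared
sum_of_squares = sum(x ** 2 for x in ages)   # d) sum of x squared

print(sum_x, sum_x_minus_6, square_of_sum, sum_of_squares)
```

Note that c) and d) differ: squaring the total is not the same as adding the squares.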
5
Summation notation
6
Raw data
Data recorded in the sequence in which they are collected, before they are processed or ranked, are called raw data.
Example: Ages (in years) of 20 students selected from a university are reported in the order in which they were collected. The data values are recorded in the following table.
Table 2.1: Ages of 20 Students
23 24 22 30 19 20 18 24 19 21
25 24 18 20 19 23 25 22 21 20
7
Raw data
The same students were asked about their status. The responses of the sample are recorded in the following table.
Table 2.2: Status of 20 Students
S S J S F J F S F J
S S F J F J S J J F
An ungrouped data set contains information on each member of a sample or population individually.
8
Organizing and graphing qualitative data
Frequency distributions: A frequency distribution exhibits how the frequencies are distributed over various categories. It is a list or a table containing the values of a variable (or a set of ranges within which the data fall) and the corresponding frequencies with which each value occurs (or the frequencies with which data fall within each range).
A frequency distribution is a way to summarize data. The distribution condenses the raw data into a more useful form and allows for a quick visual interpretation of the data.
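As an illustration, the status responses in Table 2.2 can be condensed into a frequency distribution. This is a sketch using Python's standard `collections.Counter`; the variable names are illustrative.

```python
from collections import Counter

# Status responses of the 20 students (Table 2.2), in the order recorded.
status = list("SSJSFJFSFJ" "SSFJFJSJJF")

# Map each category to the number of students who gave that response.
frequency = Counter(status)

for category, count in sorted(frequency.items()):
    print(category, count)
print("Sum =", sum(frequency.values()))
```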
9
Organizing and graphing qualitative data
Frequency distributions: A sample of 100 students at a university were asked what they intend to do after graduation. Forty-four said they wanted to work for private companies, 16 said they wanted to work for the federal government, 23 wanted to work for state governments, and 17 intended to start their own businesses.
Table 2.3: Type of Employment Students Intended to Engage
Type of Employment      Number of students
Private companies       44
Federal government      16
State government        23
Own business            17
                        Sum = 100
In this table, "Type of Employment" is the variable, each row is a category, each count is a frequency, and the right-hand column is the frequency column.
10
Organizing and graphing qualitative data
A frequency distribution for qualitative data lists all categories and the number of elements that belong to each of the categories.
Relative Frequency and Percentage Distributions:
Relative frequency of a category = (Frequency of that category) / (Sum of all frequencies)
Percentage = (Relative frequency) × 100
11
Organizing and graphing qualitative data
Example: Determine the relative frequency and percentage distribution for the data given in Table 2.3.
Table 2.4: Relative Frequency and Percentage Distribution of Type of Employment
Type of Employment      Frequency    Relative Frequency    Percentage
Private companies       44           44/100 = .44          .44 (100) = 44
Federal government      16           16/100 = .16          .16 (100) = 16
State government        23           23/100 = .23          .23 (100) = 23
Own business            17           17/100 = .17          .17 (100) = 17
                        Sum = 100    Sum = 1.0             Sum = 100
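The same calculation can be sketched in a few lines of Python, using the frequencies from Table 2.3. The dictionary names are illustrative.

```python
# Frequencies from Table 2.3.
frequency = {
    "Private companies": 44,
    "Federal government": 16,
    "State government": 23,
    "Own business": 17,
}

total = sum(frequency.values())  # sum of all frequencies

# Relative frequency = frequency / total; percentage = relative frequency * 100.
relative = {name: f / total for name, f in frequency.items()}
percentage = {name: 100 * f / total for name, f in frequency.items()}

for name in frequency:
    print(f"{name:20s} {frequency[name]:4d} {relative[name]:.2f} {percentage[name]:.0f}")
```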
12
Organizing and graphing qualitative data
Graphical presentation of qualitative data
A bar graph (or bar chart) is a graph made of bars whose heights represent the frequencies of the respective categories.
13
Organizing and graphing qualitative data
A pie chart is a circle divided into portions that represent the relative frequencies or percentages of a population or a sample belonging to different categories.
Pie Chart for Table 2.3
Table 2.5: Calculating Angle Size for the Pie Chart
Type of Employment      Relative Frequency    Angle Size
Private companies       .44                   .44 (360) = 158.4
Federal government      .16                   .16 (360) = 57.6
State government        .23                   .23 (360) = 82.8
Own business            .17                   .17 (360) = 61.2
                        Sum = 1.0             Sum = 360
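A quick check of the angle sizes, as a sketch: each angle is the relative frequency times 360 degrees. Rounding to one decimal place is an assumption made here only to tidy floating-point output.

```python
# Relative frequencies from Table 2.5.
relative_frequency = {
    "Private companies": 0.44,
    "Federal government": 0.16,
    "State government": 0.23,
    "Own business": 0.17,
}

# Angle of each slice in degrees, rounded to one decimal place.
angles = {name: round(360 * r, 1) for name, r in relative_frequency.items()}

for name, angle in angles.items():
    print(f"{name:20s} {angle}")
print("Sum =", round(sum(angles.values()), 1))
```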
14
Organizing and graphing quantitative data
Frequency distributions
Table 2.6: Weekly Earnings of 100 Employees of a Company
Weekly Earnings (dollars)    Number of employees (f)
401 to 600                   9
601 to 800                   22
801 to 1000                  39
1001 to 1200                 15
1201 to 1400                 9
1401 to 1600                 6
In this table, "Weekly Earnings" is the variable, "401 to 600" is the first class, 9 is the frequency of the first class, and the f column is the frequency column. For the sixth class, 1401 is the lower limit and 1600 is the upper limit.
15
Organizing and graphing quantitative data
A frequency distribution for quantitative data lists all the classes and the number of values that belong to each class. Data presented in the form of a frequency distribution are called grouped data.
The difference between the two boundaries of a class gives the class width or class size.
Class width = Lower limit of a class - Lower limit of the previous class
The class midpoint or mark is obtained by dividing the sum of the two limits of a class by two.
Class midpoint = (Lower limit + Upper limit) / 2
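As a sketch, the class width and midpoint rules can be applied to the classes of Table 2.6. The list and variable names are illustrative.

```python
# Class limits (lower, upper) from Table 2.6, weekly earnings in dollars.
classes = [(401, 600), (601, 800), (801, 1000),
           (1001, 1200), (1201, 1400), (1401, 1600)]

# Class width: lower limit of a class minus the lower limit of the previous class.
widths = [current[0] - previous[0]
          for previous, current in zip(classes, classes[1:])]

# Class midpoint: sum of the two limits divided by two.
midpoints = [(lower + upper) / 2 for lower, upper in classes]

print(widths)
print(midpoints)
```

Every class here has the same width of 200, and the first midpoint is 500.5.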
16
Organizing and graphing quantitative data
Constructing frequency distribution tables
We need to make three important decisions:
Number of classes: from 5 to 20, with more classes as the size of the data set increases. A rule of thumb is c = 1 + 3.3 log n.
Class width: approximately (Largest value - Smallest value) / Number of classes.
Lower limit of the first class (the starting point): any convenient number that is equal to or less than the smallest value in the data set.
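The first two decisions can be sketched as small helper functions. Two assumptions are made here: "log" is taken to be the base-10 logarithm, and the class width is the range divided by the number of classes, rounded up to a whole number. The function names are hypothetical. The sample inputs (30 values ranging from 124 to 230, grouped into 5 classes) come from the home runs data used later in the chapter.

```python
import math

def suggested_number_of_classes(n):
    """Rule of thumb c = 1 + 3.3 log n, with log assumed to be base 10."""
    return 1 + 3.3 * math.log10(n)

def approximate_class_width(smallest, largest, number_of_classes):
    """Range divided by the number of classes, rounded up (an assumption)."""
    return math.ceil((largest - smallest) / number_of_classes)

c = suggested_number_of_classes(30)            # a guide, not a fixed rule
width = approximate_class_width(124, 230, 5)   # (230 - 124) / 5 = 21.2

print(round(c, 2), width)
```

The rule of thumb only suggests a value; the analyst still picks a convenient whole number of classes.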
17
Organizing and graphing quantitative data
Example: Construct a frequency distribution table for the following data.
Table 2.7: Home Runs Hit by Major League Baseball Teams During the 2002 Season
Team                 Home Runs    Team                Home Runs
Anaheim              152          Milwaukee           139
Arizona              165          Minnesota           167
Atlanta              164          Montreal            162
Baltimore            165          New York Mets       160
Boston               177          New York Yankees    223
Chicago Cubs         200          Oakland             205
Chicago White Sox    217          Philadelphia        165
Cincinnati           169          Pittsburgh          142
Cleveland            192          St. Louis           175
Colorado             152          San Diego           136
Detroit              124          San Francisco       198
Florida              146          Seattle             152
Houston              167          Tampa Bay           133
Kansas City          140          Texas               230
Los Angeles          155          Toronto             187
18
Organizing and graphing quantitative data
Solution:
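The solution table is not preserved in this transcript. The sketch below rebuilds a frequency distribution from Table 2.7, using the five classes that appear in the cumulative frequency table later in the chapter (124 - 145 through 212 - 233, width 22). Variable names are illustrative.

```python
# Home runs from Table 2.7 (left column first, then right column).
home_runs = [
    152, 165, 164, 165, 177, 200, 217, 169, 192, 152, 124, 146, 167, 140, 155,
    139, 167, 162, 160, 223, 205, 165, 142, 175, 136, 198, 152, 133, 230, 187,
]

start, width, number_of_classes = 124, 22, 5

# Build inclusive class limits: 124-145, 146-167, ..., 212-233.
classes = [(start + i * width, start + (i + 1) * width - 1)
           for i in range(number_of_classes)]

# Count the values falling in each class, then convert to relative frequencies.
frequency = [sum(lower <= x <= upper for x in home_runs)
             for lower, upper in classes]
relative_frequency = [f / len(home_runs) for f in frequency]

for (lower, upper), f, r in zip(classes, frequency, relative_frequency):
    print(f"{lower} - {upper}: f = {f:2d}, relative frequency = {r:.3f}")
```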
19
Organizing and graphing quantitative data
Relative frequency and percentage distributions
Relative frequencies and percentages are computed in the same way as for qualitative data.
Calculating relative frequency and percentage:
Relative frequency of a class = (Frequency of that class) / (Sum of all frequencies)
Percentage = (Relative frequency) × 100
20
Organizing and graphing quantitative data
Example: Calculate the relative frequencies and percentages for the home runs table.
Solution:
21
Organizing and graphing quantitative data
Example: Construct a frequency distribution table for the following data and calculate the relative frequencies and percentages.
Table 2.8: Revenues (in millions of dollars) for the 2001 Season for 15 Teams in the NFL
Team                    Revenue    Team                   Revenue
Miami Dolphins          145        San Diego Chargers     131
New England Patriots    136        Cincinnati Bengals     130
New York Jets           131        Cleveland Browns       158
Buffalo Bills           131        Pittsburgh Steelers    142
Indianapolis Colts      127        Denver Broncos         159
Tennessee Titans        141        Kansas City Chiefs     138
Jacksonville Jaguars    137        Oakland Raiders        132
Baltimore Ravens        148
22
Organizing and graphing quantitative data
Solution:
23
Organizing and graphing quantitative data
Graphing grouped data
Grouped (quantitative) data can be displayed in a:
Histogram
Polygon
Pie chart
A histogram is a graph in which classes are marked on the horizontal axis and the frequencies, relative frequencies, or percentages are marked on the vertical axis, where they are represented by the heights of the bars. In a histogram, the bars are drawn adjacent to each other.
24
Organizing and graphing quantitative data
[Figures: Frequency histogram; Relative frequency histogram. Callout: Truncation]
25
Organizing and graphing quantitative data
A graph formed by joining the midpoints of the tops of successive bars in a histogram with straight lines is called a polygon.
[Figures: Polygon; Frequency distribution curve]
26
Organizing and graphing quantitative data
How many class intervals?
Many (narrow class intervals): may yield a very jagged distribution with gaps from empty classes; can give a poor indication of how frequency varies across classes.
Few (wide class intervals): may compress variation too much and yield a blocky distribution; can obscure important patterns of variation.
(X-axis labels are upper class endpoints.)
27
Organizing and graphing quantitative data
More on classes and frequency distributions
The "less than" method is more commonly used for continuous data.
Example: The following data give the average travel time from home to work (in minutes) for the 50 states. The data are based on a sample survey of 700,000 households conducted by the Census Bureau. Construct a frequency distribution table and calculate the relative frequencies and percentages for all classes.
26.7 24.3 22.5 23.5 23.4 26.7 19.8 23.7 18.2 22.4
26.1 29.2 21.2 22.5      17.7 17.6 21.7 27.0 19.7
31.2 19.9 28.6 22.3      16.1 16.0 23.2 21.9 21.6
23.6 22.7 15.6 21.9      23.8 21.4 19.6 22.1 15.4
24.2 22.7 22.6 20.8 17.1 20.1 25.5 24.9 25.4 21.2
28
Organizing and graphing quantitative data
Solution:
29
Organizing and graphing quantitative data
Example: A manufacturer of insulation randomly selects 20 winter days and records the daily high temperature:
24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27
Sort the raw data from low to high: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
Find the range: 58 - 12 = 46
Select the number of classes: 5 (usually between 5 and 20)
Compute the class width: 10 (46/5 = 9.2, then round up)
Count the number of values in each class
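The steps above can be sketched as follows. The starting point of 10 (a convenient number below the smallest value, 12) and the "lower limit up to, but not including, upper limit" class convention are assumptions, since the slide's class table did not survive in this transcript.

```python
import math

temperatures = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
                32, 13, 12, 38, 41, 43, 44, 27, 53, 27]

data_range = max(temperatures) - min(temperatures)   # 58 - 12 = 46
number_of_classes = 5
width = math.ceil(data_range / number_of_classes)    # 9.2 rounded up to 10
start = 10                                           # assumed starting point

# Count values in each class [lower, upper), e.g. "10 but less than 20".
frequency = {}
for i in range(number_of_classes):
    lower = start + i * width
    upper = lower + width
    frequency[(lower, upper)] = sum(lower <= t < upper for t in temperatures)

for (lower, upper), f in frequency.items():
    print(f"{lower} but less than {upper}: {f} ({f / len(temperatures):.2f})")
```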
30
Organizing and graphing quantitative data
Data from low to high: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
[Table: Relative Frequency]
31
Organizing and graphing quantitative data
[Figure: histogram with class midpoints on the horizontal axis. No gaps between bars, since the data are continuous.]
32
Organizing and graphing quantitative data
The single-valued classes technique is very useful when the data set assumes only a few distinct values.
Example: The administration in a large city wanted to know the distribution of vehicles owned by households in that city. A sample of 40 randomly selected households from this city produced the following data on the number of vehicles owned:
1 1 2 1 1 0 2 1 1 5
4 3 2 1 5 2 0 3 3 1
1 1 1 2 2 1 2 2 1 2
3 1 4 1 1 2 1 1 2 4
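With single-valued classes, each distinct value is its own class, so the frequency table is a simple count. A minimal sketch using `collections.Counter` (variable names are illustrative):

```python
from collections import Counter

# Number of vehicles owned by each of the 40 sampled households.
vehicles = [int(d) for d in "1121102115" "4321520331" "1112212212" "3141121124"]

# Each distinct value (0 to 5 vehicles) forms its own class.
frequency = Counter(vehicles)

for value in sorted(frequency):
    print(value, frequency[value])
```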
33
Organizing and graphing quantitative data
Data grouped into single-valued classes can also be displayed in a bar graph.
34
Organizing and graphing quantitative data
Example: An advertiser asks 200 customers how many days per week they read the daily newspaper.
Number of days read    Frequency
0                      44
1                      24
2                      18
3                      16
4                      20
5                      22
6                      26
7                      30
Total                  200
35
Organizing and graphing quantitative data
Relative frequency: What proportion is in each category?
36
Cumulative frequency distributions
Suppose we want to know how many teams hit a total of 189 or fewer home runs during 2002.
A cumulative frequency distribution gives the total number of values that fall below the upper boundary of each class. Each class has the same lower limit but a different upper limit.
Calculating cumulative relative frequency and cumulative percentage:
Cumulative relative frequency = (Cumulative frequency of a class) / (Total observations in the data set)
Cumulative percentage = (Cumulative relative frequency) × 100
37
Cumulative frequency distributions
Example: Using the same data as the home runs example, find the cumulative frequency, cumulative relative frequency, and cumulative percentage distributions.
Home Runs    f
124 - 145    6
146 - 167    13
168 - 189    4
190 - 211    4
212 - 233    3
Total        30
Table 2.8: Cumulative frequency, cumulative relative frequency, and cumulative percentage distribution for home runs
Class limits    Cumulative Frequency       Cumulative Relative Frequency    Cumulative Percentage
124 - 145       6                          6/30 = .200                      20.0
124 - 167       6 + 13 = 19                19/30 = .633                     63.3
124 - 189       6 + 13 + 4 = 23            23/30 = .767                     76.7
124 - 211       6 + 13 + 4 + 4 = 27        27/30 = .900                     90.0
124 - 233       6 + 13 + 4 + 4 + 3 = 30    30/30 = 1.000                    100.0
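The running totals in this table can be reproduced with `itertools.accumulate` from the standard library. This is a sketch; the variable names are illustrative.

```python
from itertools import accumulate

# Class frequencies for the home runs data (124-145, 146-167, ..., 212-233).
frequency = [6, 13, 4, 4, 3]

# Running total of the frequencies gives the cumulative frequencies.
cumulative_frequency = list(accumulate(frequency))
total = cumulative_frequency[-1]

cumulative_relative = [c / total for c in cumulative_frequency]
cumulative_percentage = [round(100 * r, 1) for r in cumulative_relative]

print(cumulative_frequency)
print([round(r, 3) for r in cumulative_relative])
print(cumulative_percentage)
```

The third cumulative frequency answers the opening question: 23 teams hit 189 or fewer home runs.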
38
Cumulative frequency distributions
An ogive is a curve drawn for the cumulative frequency distribution by joining, with straight lines, the dots marked above the upper boundaries of classes at heights equal to the cumulative frequencies of the respective classes.
39
Shapes of histograms
A histogram can take any shape, but the most common shapes are:
Symmetric: identical on both sides of its central point
Skewed: the tail on one side is longer than the tail on the other side
40
Shapes of histograms
Uniform or rectangular: the same frequency occurs for all classes
41
Dot plot
One of the simplest methods for graphing and understanding quantitative data is to create a dot plot. A horizontal axis shows the range of values for the observations. Each data point is represented by a dot placed above the axis.
Dot plots can help us detect outliers (also called extreme values) in a data set. Outliers are values that are extremely large or extremely small with respect to the rest of the data values.
42
Dot plot
Example: The following table lists the number of runs batted in (RBIs) during the 2004 Major League Baseball playoffs by members of the Boston Red Sox team with at least one at-bat. Create a dot plot for these data.
43
Dot plot
Solution:
44
Stem-and-leaf display
The stem-and-leaf display is another technique for presenting quantitative data. Its advantage over a frequency distribution is that we do not lose information on individual observations. The idea originated with the statistician John Tukey around 1970.
In a stem-and-leaf display of quantitative data, each value is divided into two portions: a stem and a leaf. The leaves for each stem are shown separately in a display.
45
Stem-and-leaf display
How do I construct a stem-and-leaf plot?
Divide each measurement into two parts: the stem (leading) and the leaf (trailing).
List the stems in a column, with a vertical line to their right.
For each measurement, record the leaf portion in the same row as its corresponding stem.
Order the leaves from the lowest to the highest in each stem.
Provide a key to your stem and leaf coding so that the reader can recreate the actual measurements if necessary.
46
Stem-and-leaf display
Example: The following are the scores of 30 college students on a statistics test. Construct a stem-and-leaf display.
95 93 87 71 79 65 96 80 52 75
92 50 68 79 86 76 61 81 72 69
98 57 92 72 87 71 64 77 84 83
Solution:
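The slide's solution figure is not preserved here. As a sketch, the construction steps can be coded directly for these two-digit scores, taking the tens digit as the stem and the units digit as the leaf. The function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

scores = [95, 93, 87, 71, 79, 65, 96, 80, 52, 75,
          92, 50, 68, 79, 86, 76, 61, 81, 72, 69,
          98, 57, 92, 72, 87, 71, 64, 77, 84, 83]

def stem_and_leaf(values):
    """Split each value into a tens stem and a units leaf, leaves sorted."""
    display = defaultdict(list)
    for value in values:
        stem, leaf = divmod(value, 10)
        display[stem].append(leaf)
    return {stem: sorted(leaves) for stem, leaves in sorted(display.items())}

display = stem_and_leaf(scores)
for stem, leaves in display.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
print("Key: 5 | 0 means 50")
```

Unlike a frequency table, every original score can be read back from the display.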
47
Stem-and-leaf display
Example: Construct a stem-and-leaf display for the following data, which give the GPAs of 30 students in CBA.
2.0 3.1 3.2 1.5 1.9
2.3 2.6 3.1 2.5 3.3
2.9 3.0 3.7 2.5 2.4
2.7 2.5 2.4 3.0 3.4
2.6 3.8 2.5 2.7 2.9
2.7 2.8 2.2 2.7 2.1
48
Cross-tabulation and scatter diagrams
A cross-tabulation (often called a cross tab) is a tabular summary of data for two variables. Cross-tabulations are usually presented as a contingency table in a matrix format.
Whereas a frequency distribution provides the distribution of one variable, a contingency table describes the distribution of two or more variables simultaneously.
Each cell shows the number of respondents that gave a specific combination of responses; that is, each cell contains a single cross-tabulation.
It can be used with any level of data (What are they?)
49
Cross-tabulation and scatter diagrams
In a survey of quality rating and meal price conducted by a consumer restaurant review agency, the following table was produced:
Restaurant    Quality rating    Meal Price
1             Good              18
2             Very Good         22
3             Good              28
4             Excellent         38
5             Very Good         33
6             Good              28
...
50
Cross-tabulation and scatter diagrams
Quality rating is a qualitative variable with the rating categories good, very good, and excellent.
Meal price is a quantitative variable (from 10 to 49).
                  Meal price
Quality rating    10 - 19    20 - 29    30 - 39    40 - 49    Total
Good              42         40         2          0          84
Very Good         34         64         46         6          150
Excellent         2          14         28         22         66
Total             78         118        76         28         300
51
Cross-tabulation and scatter diagrams
Dividing the totals in the right margin of the cross tab by the grand total provides the relative and percentage frequency distributions for the quality rating variable. Try it for the meal price (column totals).
Quality rating    Relative frequency    Percentage frequency
Good              0.28                  28
Very Good         0.50                  50
Excellent         0.22                  22
Total             1.00                  100
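For the suggested exercise, a sketch of the same calculation on the column totals, using the cell counts from the quality rating by meal price cross tab. The variable names are illustrative.

```python
# Cell counts: rows are quality ratings, columns are meal price classes.
price_classes = ["10 - 19", "20 - 29", "30 - 39", "40 - 49"]
crosstab = {
    "Good":      [42, 40, 2, 0],
    "Very Good": [34, 64, 46, 6],
    "Excellent": [2, 14, 28, 22],
}

grand_total = sum(sum(row) for row in crosstab.values())

# Column totals: add the counts down each meal price column.
column_totals = [sum(column) for column in zip(*crosstab.values())]
relative = [t / grand_total for t in column_totals]

for label, t, r in zip(price_classes, column_totals, relative):
    print(f"{label}: total = {t}, relative = {r:.2f}, percentage = {100 * r:.0f}")
```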
52
Cross-tabulation and scatter diagrams
Example: The following data are for 30 observations involving two qualitative variables, x (A, B, and C) and y (1 and 2).
a) Construct a cross-tabulation for the data.
b) Calculate the row percentages.
Observation    x    y    Observation    x    y    Observation    x    y
1              A    1    11             A    1    21             C    2
2              B    1    12             B    1    22             B    1
3              B    1    13             C    2    23             C    2
4              C    2    14             C    2    24             A    1
5              B    1    15             C    2    25             B    1
6              C    2    16             B    2    26             C    2
7              B    1    17             C    1    27             C    2
8              C    2    18             B    1    28             A    1
9              A    1    19             C    1    29             B    1
10             B    1    20             B    1    30             B    2
53
Cross-tabulation and scatter diagrams
Solution:
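The solution table is not preserved in this transcript. A sketch of both parts, counting (x, y) pairs with `collections.Counter`, is shown below; the variable names are illustrative.

```python
from collections import Counter

# x and y values for observations 1 to 30, in order.
x = list("ABBCBCBCAB" "ABCCCBCBCB" "CBCABCCABB")
y = [1, 1, 1, 2, 1, 2, 1, 2, 1, 1,
     1, 1, 2, 2, 2, 2, 1, 1, 1, 1,
     2, 1, 2, 1, 1, 2, 2, 1, 1, 2]

# a) Cross-tabulation: count each (x, y) combination.
counts = Counter(zip(x, y))
crosstab = {category: [counts[(category, 1)], counts[(category, 2)]]
            for category in "ABC"}

# b) Row percentages: each cell divided by its row total, times 100.
row_percentages = {category: [round(100 * cell / sum(row), 1) for cell in row]
                   for category, row in crosstab.items()}

print("x   y=1  y=2  total")
for category, row in crosstab.items():
    print(category, row[0], row[1], sum(row), row_percentages[category])
```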
54
Cross-tabulation and scatter diagrams
A scatter diagram is a graphical presentation of the relationship between two quantitative variables.
A trend line is a line that provides an approximation of that relationship.
55
Cross-tabulation and scatter diagrams
Consider the advertising/sales relationship given in the following table:
Week    Number of commercials (x)    Sales in $ (y)
1       2                            50
2       5                            57
3       1                            41
4       3                            54
5       4
6       1                            38
7       5                            63
8       3                            48
9       4                            59
10      2                            46
56
Cross-tabulation and scatter diagrams
The following figure shows the relationship between sales and commercials to be positive (How?)
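One way to see the positive relationship without the figure is to fit a trend line and check the sign of its slope. The sketch below uses an ordinary least-squares line, which is an assumption: the slides do not say how the trend line was drawn. Week 5 is left out because its sales value is not legible in this transcript.

```python
# (commercials, sales) pairs for weeks 1-4 and 6-10.
data = [(2, 50), (5, 57), (1, 41), (3, 54),
        (1, 38), (5, 63), (3, 48), (4, 59), (2, 46)]

n = len(data)
mean_x = sum(x for x, _ in data) / n
mean_y = sum(y for _, y in data) / n

# Least-squares slope and intercept of the trend line y = intercept + slope * x.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
sxx = sum((x - mean_x) ** 2 for x, _ in data)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print("positive relationship" if slope > 0 else "no positive relationship")
```

A positive slope means that weeks with more commercials tend to have higher sales, which is what an upward-sloping trend line in the scatter diagram shows.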
57
Example
Every spring semester, the School of Business coordinates a luncheon with local business leaders for graduating seniors, their families, and their friends. Corporate sponsorship pays for the lunches of each of the seniors, but students have to purchase tickets to cover the cost of lunches served to guests they bring with them. The following histogram represents the attendance at the senior luncheon, where X is the number of guests each student invited to the luncheon and f is the number of students in each category.
1) Referring to the bar graph, how many graduating seniors attended the luncheon?
A) 275  B) 388  C) 152  D) 4
2) Referring to the bar graph, if all the tickets purchased were used, how many guests attended the luncheon?
A) 388  B) 4  C) 275  D) 152