2- 1 Chapter Two: Describing Data: Frequency Distribution & Graphic Presentation. McGraw-Hill/Irwin © 2006 The McGraw-Hill Companies, Inc., All Rights Reserved.
2- 2 ‘Organizing’ data – Remember COPAID? Frequency Distribution A Frequency Distribution is a grouping of data into mutually exclusive categories showing the number of observations in each class. Each class has an interval of $3K
2- 3 5 Steps to constructing a frequency distribution: (1) Decide on the number of classes; (2) Determine the class interval; (3) Set the individual class limits; (4) Tally the number of items in each class; (5) Count the number of items in each class.
2- 4 Example RAW DATA - 80 data points Let us ‘organize’ this raw data into different price ranges.
2- 5 Example 1 continued Step One: Decide on the number of classes using the formula 2^k > n (the "2 to the k" rule), where k = number of classes and n = number of observations. There are 80 observations, so n = 80. 2^6 = 64 < 80 and 2^7 = 128 > 80. Therefore, we should have at least 7 classes, i.e., k = 7.
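The "2 to the k" rule is easy to check in a few lines of Python. This is a minimal sketch; the function name `min_classes` is an illustrative assumption and does not come from the slides.

```python
def min_classes(n: int) -> int:
    """Return the smallest k such that 2**k > n (the '2 to the k' rule)."""
    if n < 1:
        raise ValueError("n must be a positive number of observations")
    k = 1
    while 2 ** k <= n:
        k += 1
    return k


# 80 observations, as in Example 1: 2**6 = 64 is too small, 2**7 = 128 works.
print(min_classes(80))  # 7
```

The rule gives a lower bound on the number of classes; more classes may still be chosen if that suits the data better.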
2- 6 Example 1 continued Step Two: Determine the class interval or width using the formula i ≥ (H – L) / k, where H = highest value, L = lowest value, and k = number of classes. Here (H – L) / k = $2911. Round up for an interval of $3000. Set the lower limit of the first class at $[…]. Guideline: Make the lower limit of the first class a multiple of the class interval.
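Step Two can be sketched in Python as follows. The highest and lowest values used here are hypothetical (the slide's own H and L are not available), and the helper names `class_interval` and `first_lower_limit` are assumptions made for illustration.

```python
import math


def class_interval(high: float, low: float, k: int, round_to: int = 1000) -> int:
    """Smallest multiple of round_to that is >= (high - low) / k."""
    raw_width = (high - low) / k
    return math.ceil(raw_width / round_to) * round_to


def first_lower_limit(low: float, width: int) -> int:
    """Largest multiple of the class width that does not exceed the lowest value."""
    return math.floor(low / width) * width


# Hypothetical extremes, NOT the slide's data: H = 36,200 and L = 15,800, k = 7.
width = class_interval(36_200, 15_800, 7)
print(width)                             # 3000
print(first_lower_limit(15_800, width))  # 15000
```

Rounding the width up (never down) ensures that k classes of that width cover the whole range from L to H.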
2- 7 Example 1 continued Step Three: Set the individual class limits.
2- 8 Step Four: Tally the number of items in each class.
2- 9 Step Five (last step!): Count the number of items in each class. The above table is called a Frequency Distribution (one way of ‘organizing’ initial raw data – again remember COPAID). You can also present it in a graph – we will see that later.
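Steps Three to Five (set the class limits, tally, count) can be sketched together in Python. The prices below are hypothetical and are not the 80 data points from the slides; the function name `frequency_distribution` and the lower limit of 15,000 are also assumptions. Each class includes its lower limit and excludes its upper limit, so the classes are mutually exclusive.

```python
def frequency_distribution(values, lower, width, k):
    """Count values in k classes [lower + j*width, lower + (j+1)*width)."""
    counts = [0] * k
    for v in values:
        j = int((v - lower) // width)  # index of the class containing v
        if not 0 <= j < k:
            raise ValueError(f"{v} falls outside the class limits")
        counts[j] += 1
    return counts


# Hypothetical prices in dollars (illustration only).
prices = [15_500, 17_900, 18_000, 19_200, 20_300, 22_750, 24_000]
counts = frequency_distribution(prices, lower=15_000, width=3_000, k=4)
for j, c in enumerate(counts):
    lo = 15_000 + j * 3_000
    print(f"{lo} up to {lo + 3_000}: {c}")
```

A value that sits exactly on a boundary, such as 18,000, is counted in the class that starts at that boundary.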
2- 10 Relative Frequency Distribution: A Relative Frequency Distribution shows the fraction of observations in each class, which makes classes easier to compare. You can also express relative frequencies as percentages. For the above data, they are 10%, 28.75%, 21.25%, …
2- 11 Try out problem #6 on page 31
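A relative frequency is a class count divided by the total number of observations. The sketch below uses only the first three class counts, which are inferred from the percentages quoted above with n = 80 (10% of 80 is 8, 28.75% is 23, 21.25% is 17); the remaining classes are not listed on the slide, and the function name is an assumption.

```python
def relative_frequencies(counts, total=None):
    """Divide each class count by the total number of observations."""
    total = sum(counts) if total is None else total
    return [c / total for c in counts]


# First three class counts implied by the quoted percentages with n = 80.
rel = relative_frequencies([8, 23, 17], total=80)
print([f"{r:.2%}" for r in rel])  # ['10.00%', '28.75%', '21.25%']
```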
2- 12 Visuals used in Statistics: Histograms, Frequency Polygons, Line & Bar graphs, Pie charts, Scatter diagrams, Contingency tables, Pareto charts
2- 13 Histogram: A graph with X-axis: Classes and Y-axis: Frequency. A histogram gives the frequency distribution of data.
2- 14 Graphic Presentation of a Frequency Distribution: Frequency Polygon. A Frequency Polygon is a line graph connecting the points formed by the class midpoints and the class frequencies.
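The points of a frequency polygon are (class midpoint, class frequency) pairs. A minimal sketch, assuming equal-width classes; the lower limit, the frequencies, and the function name `class_midpoints` are hypothetical.

```python
def class_midpoints(lower, width, k):
    """Midpoint of each of k equal-width classes starting at lower."""
    return [lower + width * j + width / 2 for j in range(k)]


# Hypothetical classes of width 3,000 starting at 15,000, with made-up counts.
midpoints = class_midpoints(15_000, 3_000, 4)
frequencies = [2, 3, 1, 1]
points = list(zip(midpoints, frequencies))
print(points)
```

Plotting these points and joining them with straight lines gives the polygon.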
2- 16 Frequency polygons allow comparison of two or more frequency distributions. Use percentage frequencies if the actual frequencies vary widely.
2- 17 Cumulative Frequency Distribution: A Cumulative Frequency Distribution is used to determine how many or what proportion of the data values are below or above a certain value. Find the price below which 25 vehicles were sold. Find the price below which half the cars were sold. Find the percentage of vehicles sold priced below $28500.
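Cumulative frequencies are running totals of the class counts. The sketch below uses hypothetical counts, not the vehicle data behind the questions above, and the function name is an assumption.

```python
from itertools import accumulate


def cumulative_frequencies(counts):
    """Running total of class counts, from the first class to the last."""
    return list(accumulate(counts))


counts = [2, 3, 1, 1]  # hypothetical class frequencies
cum = cumulative_frequencies(counts)
cum_pct = [100 * c / cum[-1] for c in cum]  # cumulative percentages
print(cum)  # [2, 5, 6, 7]
```

Reading the table: each cumulative value is the number of observations that fall below the upper limit of that class.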
2- 18 Practice time! (Problem #14 Page 41)
2- 19 Line graphs are typically used to show the change or trend of a variable over time.
2- 20 Example 3 continued Another example of Line Graph
2- 21 Bar Chart: A Bar Chart is useful to show data of any level of measurement.
2- 22 Pie Chart: A Pie Chart is useful for displaying a relative frequency distribution. A circle is divided proportionally to the relative frequency. Along with the % of each slice, you can also show the actual values.
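Dividing the circle proportionally means that each slice gets 360 degrees times its relative frequency. A minimal sketch with hypothetical counts; the function name `slice_angles` is an assumption.

```python
def slice_angles(counts):
    """Angle in degrees of each pie slice: 360 * (count / total)."""
    total = sum(counts)
    return [360 * c / total for c in counts]


# Hypothetical category counts (illustration only).
angles = slice_angles([10, 30, 20])
print(angles)  # [60.0, 180.0, 120.0]
```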
2- 23 Scatter diagram: A graph showing the relationship between two variables. Variables must be at least interval scaled. The relationship can be positive (direct) or negative (inverse). Example: Stock prices on twelve days and the overall market index on each day are given as follows:
[Figure: Scatter Diagram; axes: Price and Index (000s)]
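One way to tell whether a relationship is positive (direct) or negative (inverse) is the sign of the sample covariance. This is a sketch of that idea rather than a method stated on the slides; the five data pairs are hypothetical (the twelve days of data are not reproduced here), and the function name is an assumption.

```python
def relationship_direction(xs, ys):
    """Classify a relationship by the sign of the sum of cross-deviations."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if cross > 0:
        return "positive (direct)"
    if cross < 0:
        return "negative (inverse)"
    return "no linear relationship"


# Hypothetical index values (000s) and stock prices for five days.
index = [7.0, 7.2, 7.5, 7.9, 8.1]
price = [40, 42, 43, 47, 48]
print(relationship_direction(index, price))  # positive (direct)
```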
2- 25 Contingency table: A contingency table is a cross tabulation of two variables. Contingency tables are used when one or both variables are nominal or ordinal in scale.
2- 26 Example Contingency Table. Nominal: columns Male, Female; rows Smokers (100, 60), Non-smokers. Ordinal: columns Good, Bad, Ugly; rows Dumbo (12574), As Good as It Gets.
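A cross tabulation counts how often each combination of two categories occurs. The sketch below builds one with `collections.Counter`; the observations are hypothetical and do not reproduce the counts in the slide's table, and the function name is an assumption.

```python
from collections import Counter


def contingency_table(pairs):
    """Count occurrences of each (row category, column category) pair."""
    return Counter(pairs)


# Hypothetical observations (illustration only).
observations = [
    ("Smoker", "Male"), ("Smoker", "Female"), ("Non-smoker", "Male"),
    ("Smoker", "Male"), ("Non-smoker", "Female"),
]
table = contingency_table(observations)
for row in ("Smoker", "Non-smoker"):
    print(row, [table[(row, col)] for col in ("Male", "Female")])
```

A `Counter` returns 0 for combinations that never occur, so empty cells need no special handling.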
2- 27 Pareto Chart: a type of histogram with the bars arranged from tallest to shortest; used in process improvement projects. E.g., if you record the reasons for a machine breaking down, you might find some problems to be more common than others. If you record the frequency of each of these reasons, you will notice that a small number of reasons account for most of the breakdowns. This is the Pareto Principle, or the 80-20 rule: in general, about 80% of the problems result from about 20% of the causes.
2- 28 Pareto Chart: Problems of a photocopier machine
Problem                   Number of Defects
paper particle buildup    74
excessive temperature     38
worn roller               5
defective paper           10
guides misaligned         26
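The ordering behind a Pareto chart can be sketched in Python using the photocopier defect counts above: sort the causes from most to least frequent and track the cumulative percentage. The function name `pareto_table` is an assumption.

```python
def pareto_table(counts):
    """Rows of (cause, count, cumulative %) sorted from most to least frequent."""
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    total = sum(counts.values())
    running = 0
    rows = []
    for cause, n in ordered:
        running += n
        rows.append((cause, n, round(100 * running / total, 1)))
    return rows


# Defect counts from the photocopier example.
defects = {
    "paper particle buildup": 74,
    "excessive temperature": 38,
    "worn roller": 5,
    "defective paper": 10,
    "guides misaligned": 26,
}
for row in pareto_table(defects):
    print(row)
```

With these counts the two most frequent causes account for about 73% of the 153 defects, which is the kind of concentration the Pareto Principle describes.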
2- 29 Pareto Chart. Columns: Number of Defects, Repair Cost ($), Total Cost. Rows: paper particle buildup, excessive temperature, worn roller, defective paper, guides misaligned.