15.1 Histograms & Frequency Distributions
Ranking: a simple way to organize a set of data; list the numbers from lowest to highest or from highest to lowest.
Frequency: if a value repeats, it is convenient to list the number of times it occurs.

Ex 1) A supermarket manager studies the amount of time customers stand in line before being checked out by tracking one customer each half hour. The times, in minutes, that the first 20 customers wait are 3, 2, 5, 2, 0, 1, 2, 4, 6, 4, 4, 8, 3, 0, 2, 1, 6, 3, 3, 1. Organize this into a set of ranked data indicating frequency.

Minutes Waiting   Frequency
0                 2
1                 3
2                 4
3                 4
4                 3
5                 1
6                 2
7                 0
8                 1

*Note: 7 is included even though no one waited 7 minutes.
*Note: the frequencies add to 20.
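The ranked frequency table in Ex 1 can also be tallied mechanically. The following is a minimal Python sketch, not part of the original lesson; the names wait_times and ranked_frequencies are made up for illustration, and the function assumes whole-number data.

```python
from collections import Counter

# Waiting times in minutes for the first 20 customers (Ex 1).
wait_times = [3, 2, 5, 2, 0, 1, 2, 4, 6, 4, 4, 8, 3, 0, 2, 1, 6, 3, 3, 1]

def ranked_frequencies(values):
    """Return (value, frequency) pairs ranked from lowest to highest.

    Every whole number between the minimum and maximum is listed,
    so a value that never occurs (such as 7) appears with frequency 0.
    """
    counts = Counter(values)  # a missing value counts as 0
    return [(v, counts[v]) for v in range(min(values), max(values) + 1)]

table = ranked_frequencies(wait_times)

print("Minutes Waiting   Frequency")
for minutes, freq in table:
    print(f"{minutes:<17} {freq}")
print("Total:", sum(freq for _, freq in table))  # Total: 20
```

Ranking from highest to lowest would only require reversing the list that the function returns.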
Large sets of data can be grouped into classes.
Grouping sacrifices some specifics but makes large data sets more manageable.
The resulting table is called a frequency distribution.

Ex 2) A careers publication surveyed 23 companies and asked for the average starting salaries of jobs offered in 1989. The information is summarized in the frequency distribution.

Starting Salaries     Frequency
$19,000 – 22,999      3
$23,000 – 26,999      1
$27,000 – 30,999      5
$31,000 – 34,999      5
$35,000 – 38,999      5
$39,000 – 42,999      4

How many of the starting salaries were:
less than $23,000?   3
less than $31,000?   3 + 1 + 5 = 9
equal to $31,000?    Cannot tell (grouping hides the individual values within a class)
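The three questions in Ex 2 show what a grouped table can and cannot answer. As a rough Python sketch (the names salary_classes and count_below are hypothetical, and salaries are assumed to be whole dollars so that adjacent class limits leave no gap), a count is only available when the threshold lines up with a class boundary:

```python
# Frequency distribution from Ex 2: (lower limit, upper limit, frequency).
salary_classes = [
    (19_000, 22_999, 3),
    (23_000, 26_999, 1),
    (27_000, 30_999, 5),
    (31_000, 34_999, 5),
    (35_000, 38_999, 5),
    (39_000, 42_999, 4),
]

def count_below(classes, threshold):
    """Count the values below threshold using only the grouped table.

    Returns None when the threshold falls inside a class, because the
    table does not say how the values are spread within that class.
    """
    total = 0
    for lower, upper, freq in classes:
        if upper < threshold:
            total += freq          # the whole class lies below the threshold
        elif lower < threshold:
            return None            # the threshold splits this class
    return total

print(count_below(salary_classes, 23_000))  # 3
print(count_below(salary_classes, 31_000))  # 9
print(count_below(salary_classes, 25_000))  # None (cannot tell)
```

The None result has the same cause as the "Cannot tell" answer for salaries equal to $31,000: the individual values inside a class are not recorded.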
Some notes about classes:
- Classes should cover equal ranges of values
- All data must fall into one of the classes
- Classes may not overlap
- Try to use between 6 and 15 classes
- Cannot tell specifics such as lowest, highest, etc.
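The first four notes can be checked mechanically. The sketch below is only an illustration under stated assumptions: check_classes and salary_limits are made-up names, class limits are whole numbers with both ends included, and the classes are listed from lowest to highest. It is tried on the Ex 2 salary classes with no raw data, since Ex 2 does not list the individual salaries.

```python
def check_classes(classes, data):
    """Return a list of problems found in a proposed set of classes.

    classes: (lower, upper) whole-number limits, sorted by lower limit.
    data: the raw values the classes are meant to hold.
    """
    problems = []
    # Equal ranges: every class should span the same number of values.
    widths = {upper - lower + 1 for lower, upper in classes}
    if len(widths) > 1:
        problems.append("classes do not cover equal ranges")
    # No overlap: each class must start after the previous one ends.
    for (_, prev_upper), (next_lower, _) in zip(classes, classes[1:]):
        if next_lower <= prev_upper:
            problems.append("classes overlap")
            break
    # Coverage: every data value must fall into some class.
    if any(not any(lo <= x <= hi for lo, hi in classes) for x in data):
        problems.append("some data falls outside every class")
    # Rule of thumb from the notes: between 6 and 15 classes.
    if not 6 <= len(classes) <= 15:
        problems.append("number of classes is outside 6 to 15")
    return problems

# Class limits taken from the Ex 2 table.
salary_limits = [(19_000, 22_999), (23_000, 26_999), (27_000, 30_999),
                 (31_000, 34_999), (35_000, 38_999), (39_000, 42_999)]
print(check_classes(salary_limits, data=[]))    # []

# A deliberately bad, made-up set of classes.
print(check_classes([(0, 9), (5, 14)], data=[20]))
```

The last note cannot be checked this way: once data is grouped, specifics such as the lowest or highest value are no longer in the table.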
Ex 3) A health professional studying the effects of smoking on blood pressure collected the following readings of systolic blood pressure from a control group of 26 people: 150, 121, 134, 129, 165, 148, 125, 130, 182, 164, 142, 110, 177, 139, 188, 151, 190, 205, 128, 160, 125, 178, 162, 149, 156, 137. Construct a frequency distribution.

The data ranges from 110 to 205.
Let’s do 7 classes, each covering 15 possible readings.

Systolic Blood Pressure   Frequency
105 – 119                 1
120 – 134                 7
135 – 149                 5
150 – 164                 6
165 – 179                 3
180 – 194                 3
195 – 209                 1
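The tally in Ex 3 can be reproduced with a short Python sketch. The names readings and frequency_distribution are made up for illustration, and the function assumes whole-number readings; the starting value 105, the class width 15, and the 7 classes are the choices made in the example.

```python
# Systolic blood pressure readings for the 26 people in Ex 3.
readings = [150, 121, 134, 129, 165, 148, 125, 130, 182, 164, 142, 110, 177,
            139, 188, 151, 190, 205, 128, 160, 125, 178, 162, 149, 156, 137]

def frequency_distribution(data, start, width, n_classes):
    """Group whole-number data into n_classes equal classes.

    Class i covers start + i*width through start + (i+1)*width - 1.
    Returns a list of ((lower, upper), frequency) pairs.
    """
    classes = [(start + i * width, start + (i + 1) * width - 1)
               for i in range(n_classes)]
    counts = [0] * n_classes
    for x in data:
        index = (x - start) // width      # which class x belongs to
        if not 0 <= index < n_classes:
            raise ValueError(f"{x} falls outside every class")
        counts[index] += 1
    return list(zip(classes, counts))

for (lower, upper), freq in frequency_distribution(readings, 105, 15, 7):
    print(f"{lower} - {upper}   {freq}")
```

Raising an error for a value outside every class enforces the rule that all data must fall into one of the classes.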
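Ex 4) Other ways to represent the same data:

Histogram
- Bar graph, no spaces between bars
- Classes on horizontal axis
- Frequency on vertical axis

[Figure: histogram of the Ex 3 blood pressure distribution. Vertical axis: frequency, 1 to 8. Horizontal axis: classes 105–119, 120–134, 135–149, 150–164, 165–179, 180–194, 195–209.]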
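To see the shape of the histogram without a plotting library, the Ex 3 distribution can be drawn with text characters. This is only a rough Python sketch: text_histogram and distribution are made-up names, and the layout assumes frequencies below 10 so that the axis labels line up.

```python
# Frequency distribution from Ex 3: (lower limit, upper limit, frequency).
distribution = [
    (105, 119, 1), (120, 134, 7), (135, 149, 5), (150, 164, 6),
    (165, 179, 3), (180, 194, 3), (195, 209, 1),
]

def text_histogram(dist, top, width=4):
    """Return the lines of a vertical text histogram.

    Frequency runs up the vertical axis from 1 to top, classes run
    along the horizontal axis, and the bars touch (no spaces).
    """
    lines = []
    for level in range(top, 0, -1):
        # A class gets a filled cell on this row if its bar reaches the level.
        bars = "".join(("#" if freq >= level else " ") * width
                       for _, _, freq in dist)
        lines.append(f"{level} |{bars}".rstrip())
    lines.append("  +" + "-" * (width * len(dist)))
    # Class limits under the axis: lower limits, then upper limits.
    lines.append("   " + "".join(f"{lo}-".ljust(width) for lo, _, _ in dist).rstrip())
    lines.append("   " + "".join(f"{hi}".ljust(width) for _, hi, _ in dist).rstrip())
    return lines

print("\n".join(text_histogram(distribution, top=8)))
```

Each bar is 4 characters wide, so the lower and upper class limits fit directly beneath it.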
Ex 4) Other ways to represent the same data (continued):

Frequency Polygon
- Line graph, no spaces
- Midpoint of each class on horizontal axis
- Frequency on vertical axis
- Makes a polygon by “tying” it down on each side (by plotting the midpoints of the classes immediately below and above the distribution)

[Figure: frequency polygon of the Ex 3 blood pressure distribution. Vertical axis: frequency, 1 to 8. Horizontal axis: class midpoints 97, 112, 127, 142, 157, 172, 187, 202, 217.]
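The points of the frequency polygon follow directly from the Ex 3 classes. The Python sketch below (polygon_points and distribution are made-up names; equal class widths are assumed) computes each class midpoint and adds the two tie-down points with frequency 0.

```python
# Frequency distribution from Ex 3: (lower limit, upper limit, frequency).
distribution = [
    (105, 119, 1), (120, 134, 7), (135, 149, 5), (150, 164, 6),
    (165, 179, 3), (180, 194, 3), (195, 209, 1),
]

def polygon_points(dist):
    """Return the (midpoint, frequency) points of a frequency polygon.

    The polygon is tied down with a zero-frequency point at the
    midpoint of the class just below and just above the distribution.
    """
    width = dist[1][0] - dist[0][0]   # distance between class starts
    points = [((lo + hi) / 2, freq) for lo, hi, freq in dist]
    first_mid = points[0][0]
    last_mid = points[-1][0]
    return [(first_mid - width, 0)] + points + [(last_mid + width, 0)]

for midpoint, freq in polygon_points(distribution):
    print(midpoint, freq)
```

For this data the midpoints run from 112 to 202 in steps of 15, and the tie-down points fall at 97 and 217, matching the horizontal axis of the figure.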
If we want to organize data without losing detail, we can use a stem-and-leaf plot. Record the stem once and then its associated leaves.

Ex 5) An independent marketing firm gathered data on the ages of people who watched the pilot of a new TV show. The following data represent the ages of 30 viewers. Use a stem-and-leaf plot to display the data.

22 26 37 64 18 10 12 55 32 45
49 50 27 68 59 71 43 17 15 70
29 61 73 67 65 20 62 48 50 70

Stem | Leaf
1    | 0 2 5 7 8
2    | 0 2 6 7 9
3    | 2 7
4    | 3 5 8 9
5    | 0 0 5 9
6    | 1 2 4 5 7 8
7    | 0 0 1 3

Key: 2 | 6 represents 26

Note: If leaves are missing, you still include the stem! For example, if this data didn’t have 32 or 37, you would still write 3 for the stem but leave the leaf side blank, like this:

3    |
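A stem-and-leaf plot can be built by splitting each value into its tens digit (stem) and ones digit (leaf). The Python sketch below is an illustration only: stem_and_leaf and ages are made-up names, the ages are the 30 values that appear in the plot for Ex 5, and the function assumes non-negative whole numbers.

```python
# The 30 ages shown in the Ex 5 stem-and-leaf plot.
ages = [10, 12, 15, 17, 18, 20, 22, 26, 27, 29, 32, 37, 43, 45, 48,
        49, 50, 50, 55, 59, 61, 62, 64, 65, 67, 68, 70, 70, 71, 73]

def stem_and_leaf(values):
    """Map each stem (tens digit) to its sorted leaves (ones digits).

    Every stem between the smallest and largest is kept, even when it
    has no leaves, so gaps in the data stay visible.
    """
    stems = {s: [] for s in range(min(values) // 10, max(values) // 10 + 1)}
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return stems

for stem, leaves in stem_and_leaf(ages).items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")

# A stem with no leaves is still listed:
print(stem_and_leaf([12, 15, 31]))  # {1: [2, 5], 2: [], 3: [1]}
```

Because every leaf is kept, the original values can be read back from the plot, which is what "without losing detail" means here.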