Basics of histograms and frequency tables
Overview
  What are frequency tables and histograms?
  Discrete data
  Continuous data
  Why histograms are different from bar charts
Frequency tables versus histograms
  Frequency table: A table with counts of the number of times each unique value appears in the dataset.
  Histogram: A graphical display of a frequency table, usually using bars to present counts.
  Note: Frequency tables are sometimes called histograms.
An example of a frequency table
  Frequency table: A table with counts of the number of times each unique value appears in the dataset.
  FREQUENCY TABLE EXAMPLE: A researcher asks 100 people a survey question and wants to analyze the responses. The question has 5 possible answers: a, b, c, d, and e.
  Data: a, b, a, a, a, c, d, e, …
    Value   Count
    a       40
    b       20
    c       15
    d       30
    e       5
  Notes:
    The values don’t have to be numerical.
    Here the possible values are discrete, that is, they are easy to enumerate or list.
    The total of the Count column is the number of data values.
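The counting step can be sketched in Python as follows. This is a minimal illustration, not the researcher's actual procedure: the list `responses` holds only the eight answers visible in the example, and the variable names are assumptions.

```python
from collections import Counter

# Hypothetical sample: only the eight responses shown in the example.
responses = ["a", "b", "a", "a", "a", "c", "d", "e"]

# Counter tallies how many times each unique value appears.
counts = Counter(responses)

print("Value  Count")
for value in sorted(counts):
    print(f"{value:<6} {counts[value]}")

# The total of the counts equals the number of data values.
print("Total:", sum(counts.values()))
```

The values are strings here, which shows that a frequency table does not need numerical data.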
Histogram for survey responses
  Histogram: A graphical display of a frequency table, usually using bars to present counts.
  FREQUENCY TABLE
    Value   Count
    a       40
    b       20
    c       15
    d       30
    e       5
  Note: Don’t use a bar chart directly on the data. Compute the counts first!
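As a rough sketch of drawing bars from counts that have already been computed, the snippet below renders the survey frequency table as text. The function name `render_bars` and the `scale` parameter are illustrative assumptions; a plotting library would normally do this job.

```python
# Counts copied from the survey frequency table.
frequency_table = {"a": 40, "b": 20, "c": 15, "d": 30, "e": 5}

def render_bars(table, scale=5):
    """Return one text bar per value; each '#' stands for `scale` counts."""
    lines = []
    for value, count in table.items():
        bar = "#" * (count // scale)
        lines.append(f"{value} | {bar} {count}")
    return lines

for line in render_bars(frequency_table):
    print(line)
```

The input to the drawing step is the table of counts, not the raw list of responses.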
Continuous data
  Continuous data: Takes on any value within a range.
  EXAMPLE: A researcher measures the height in cm of 100 corn plants in a particular field a certain number of days after planting in order to predict crop yields.
  Input: 3.01, 5.2, 3.02, 1.101, …
  Issues:
    There are many more values…
    What are the unique values?
    Is 3.02 really different from 3.01?
  Solution: Bin the data
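The issue with unique values can be seen in a small sketch. It uses only the four measurements visible in the example, and rounding down to whole centimetres is just one crude, assumed way of grouping nearby values.

```python
from collections import Counter

# Only the four measurements visible in the example.
heights_cm = [3.01, 5.2, 3.02, 1.101]

# Counting exact values: every measurement is its own "unique value".
exact_counts = Counter(heights_cm)
print(exact_counts)

# Grouping by whole centimetres treats 3.01 and 3.02 as the same bin.
grouped_counts = Counter(int(h) for h in heights_cm)
print(grouped_counts)
```

With exact values every count is 1, so the table carries no useful summary; after grouping, the two measurements near 3 cm are counted together.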
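How to “bin” continuous data
  Binning the data means dividing up the data range into subintervals and counting how many data values fall in each subinterval.
  Input: 3.01, 5.2, 3.02, 1.101, …
  Min: 0   Max: 8   Range: 8 – 0 = 8
  Bins 1 to 8 each cover one unit of the values from 0 to 8.
  FREQUENCY TABLE
    Range   Bin   Count
    [0,1]   1     30
    [1,2]   2     20
    [2,3]   3     15
    [3,4]   4
    [4,5]   5
    [5,6]   6
    [6,7]   7
    [7,8]   8
  Note: A value that falls exactly on a shared endpoint, such as 1, must be counted in only one bin. A common convention is to include the left endpoint and exclude the right, except in the last bin.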
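The binning step can be sketched as follows, assuming equal-width bins and the half-open boundary convention (left endpoint included, right endpoint excluded, last bin closed). The function name `bin_counts` and its parameters are hypothetical, and the sample holds only the four visible inputs.

```python
def bin_counts(values, minimum, maximum, num_bins):
    """Count how many values fall in each of num_bins equal-width bins.

    Assumed convention: each bin includes its left endpoint and excludes
    its right endpoint, except the last bin, which includes the maximum.
    """
    width = (maximum - minimum) / num_bins
    counts = [0] * num_bins
    for v in values:
        if not (minimum <= v <= maximum):
            continue  # skip values outside the data range
        # Clamp so that v == maximum lands in the last bin.
        index = min(int((v - minimum) // width), num_bins - 1)
        counts[index] += 1
    return counts

sample = [3.01, 5.2, 3.02, 1.101]  # the four visible inputs only
counts = bin_counts(sample, minimum=0, maximum=8, num_bins=8)

print("Range  Bin  Count")
for i, c in enumerate(counts, start=1):
    print(f"[{i - 1},{i}]  {i}    {c}")
```

Both 3.01 and 3.02 land in bin 4, which answers the earlier question of whether they should be treated as different values.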
Displaying the histogram
  Input: 3.01, 5.2, 3.02, 1.101, …
  FREQUENCY TABLE
    Range   Bin   Count
    [0,1]   1     30
    [1,2]   2     20
    [2,3]   3     15
    [3,4]   4
    [4,5]   5
    [5,6]   6
    [6,7]   7
    [7,8]   8
Histograms versus bar charts
  Input: 3.01, 5.2, 3.02, 1.101, …
  [Figure panels: Histogram; Bar chart]