Graphical Displays of Information Chapter 3.1 – Tools for Analyzing Data Mathematics of Data Management (Nelson) MDM 4U
Histograms Show: continuous data grouped in class intervals; how data is spread over a range. Bin width = width of each bar. Different bin widths produce differently shaped distributions. Bin widths should be equal. Usually 5-6 bins.
Histogram Example: These histograms represent the same data. One shows much less of the structure of the data because it uses too few bins (bin width too large). Too many bins (bin width too small) is also a problem.
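To see how the number of bins changes the picture, here is a minimal Python sketch. The `scores` list and the `bin_counts` helper are made up for illustration and are not from the textbook. The same data is counted with 2, 5 and 15 equal-width bins.

```python
def bin_counts(data, num_bins):
    """Count how many values fall into each of num_bins equal-width bins."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for x in data:
        # The maximum value would index one past the end, so clamp it
        # into the last bin.
        i = min(int((x - lo) / width), num_bins - 1)
        counts[i] += 1
    return counts

# Hypothetical data set (an assumption for this example): 20 test scores.
scores = [52, 55, 58, 61, 63, 64, 66, 67, 68, 70,
          71, 72, 73, 75, 77, 79, 82, 85, 90, 97]

for num_bins in (2, 5, 15):
    print(num_bins, "bins:", bin_counts(scores, num_bins))
```

With 2 bins almost all structure is hidden, with 5 bins a mound shape appears, and with 15 bins many bars hold only one or two values, so the overall shape is hard to read.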
Histogram Applet – Old Faithful m.html
Bin Width Calculation
Bin width = (range) ÷ (number of intervals), where range = (max) – (min)
Number of intervals is usually 5-6
Bins should not overlap
wrong: 0-10, 10-20, 20-30, 30-40, etc.
Discrete
correct: 0-10, 11-20, 21-30, 31-40, etc.
Continuous
correct: 0-9.9, etc.
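The calculation and the no-overlap rule can be sketched in a few lines of Python. The function names `bin_width` and `has_overlap` and the sample minimum and maximum are assumptions chosen for this example. The two interval lists are the "wrong" and discrete "correct" examples from the slide.

```python
def bin_width(minimum, maximum, num_intervals):
    """Bin width = range / number of intervals, where range = max - min."""
    return (maximum - minimum) / num_intervals

def has_overlap(intervals):
    """Return True if any interval shares a value with the one after it."""
    ordered = sorted(intervals)
    for (_, upper), (next_lower, _) in zip(ordered, ordered[1:]):
        if upper >= next_lower:
            return True
    return False

# Hypothetical data range: smallest value 12, largest value 47, 5 intervals.
print(bin_width(12, 47, 5))  # 35 / 5 = 7.0

wrong = [(0, 10), (10, 20), (20, 30), (30, 40)]
correct = [(0, 10), (11, 20), (21, 30), (31, 40)]
print(has_overlap(wrong))    # True: 10, 20 and 30 each sit in two bins
print(has_overlap(correct))  # False
```

In the "wrong" list a data value of exactly 10 could be counted in either 0-10 or 10-20, which is why the endpoints must not repeat.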
Mound-shaped distribution The middle interval(s) have the greatest frequency (i.e. the tallest bars) The bars get shorter as you move out to the edges. E.g. roll 2 dice 75 times
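The dice example can be tried without real dice. The following is a small simulation sketch that uses only Python's standard library. The helper name `roll_two_dice` and the seed value are assumptions for this example, and the exact bar heights will differ from run to run when the seed is changed.

```python
import random
from collections import Counter

def roll_two_dice(num_rolls, seed=None):
    """Roll two dice num_rolls times and count how often each sum occurs."""
    rng = random.Random(seed)
    return Counter(rng.randint(1, 6) + rng.randint(1, 6)
                   for _ in range(num_rolls))

counts = roll_two_dice(75, seed=1)

# Print a sideways text histogram: one '#' per occurrence of each sum.
for total in range(2, 13):
    print(f"{total:2d} {'#' * counts[total]}")
```

Sums near 7 can be made in more ways than sums near 2 or 12, so the middle bars tend to be the tallest.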
U-shaped distribution Lowest frequency in the centre, higher towards the outside E.g. height of a combined grade 1 and 6 class
Uniform distribution All bars are approximately the same height e.g. roll a die 50 times
Symmetric distribution A distribution that is the same on either side of the centre U-Shaped, Uniform and Mound-shaped Distributions are symmetric
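One rough way to test for symmetry is to compare the bar heights read left to right with the same heights read right to left. The sketch below does this in Python. The function `is_symmetric`, its `tolerance` parameter, and the bar heights are all made up for illustration. Real data is rarely perfectly symmetric, so the tolerance allows for small differences.

```python
def is_symmetric(bar_heights, tolerance=0):
    """Return True if the bars mirror each other around the centre."""
    mirrored = reversed(bar_heights)
    return all(abs(a - b) <= tolerance
               for a, b in zip(bar_heights, mirrored))

# Hypothetical bar heights (frequencies), not taken from the textbook.
shapes = {
    "mound-shaped": [1, 3, 5, 3, 1],
    "U-shaped": [5, 2, 1, 2, 5],
    "uniform": [4, 4, 4, 4, 4],
    "left-skewed": [1, 1, 2, 4, 7],
}

for name, heights in shapes.items():
    print(name, is_symmetric(heights))
```

The first three shapes pass the check, while the skewed example fails because its tall bars are all at one end.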
Skewed distribution (left or right): Highest frequencies at one end. Left-skewed drops off to the left; right-skewed drops off to the right. E.g. the years on a handful of quarters
MSIP / Homework
Define in your notes: Frequency distribution (p ), Cumulative frequency (p. 148), Relative frequency (p. 148)
Complete p. 146 #1, 2, 4, 9, 11 (data in Excel file on wiki), 13