Histograms & Stemplots for Quantitative Data
Describing Data using Summary Features of Quantitative Variables
Center: location in the middle of all the data
Unusual features: outliers, gaps, clusters
Spread: measure of variability, such as the range
Shape: distribution pattern (symmetric, skewed, uniform, bimodal, etc.)
CUSS in context!
Dotplot for Univariate Quantitative Data
Center: about -50
Unusual features: gap at -45
Spread: 48 degrees (-69 to -21)
Shape: trimodal, representing 3 seasons
Stemplot for Quantitative Data
Ages of Death of U.S. First Ladies
3 | 4, 6
4 | 3
5 | 2, 4, 5, 7, 8
6 | 0, 0, 1, 2, 4, 4, 4, 5, 6, 9
7 | 0, 1, 3, 4, 6, 7, 8, 8
8 | 1, 1, 2, 3, 3, 6, 7, 8, 9, 9
9 | 7
Key: 3 | 4 indicates 34 years old (the stem is to the left of the bar; each leaf is a single digit)
Center: 65 years; Spread: 63 years; Shape: skewed left (towards lower numbers)
How to make a Stemplot (Stem and Leaf Plot)
Separate each observation into a stem (all but the last digit) and a leaf (the last digit).
Sometimes rounding to the nearest hundred, thousand, etc. is a good idea when there are a lot of digits to consider.
Write the stems in a vertical column in order from smallest to largest and draw a vertical line at the right of the column.
Write each leaf in the row to the right of its stem in increasing order.
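The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a standard library routine: the function name `stemplot` and the variable `ages` are made up for this example, and it assumes non-negative whole numbers whose leaf is the ones digit. The data are the First Ladies ages read off the earlier stemplot.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot rows: stem = all but the last digit, leaf = last digit.

    Assumes non-negative integers (hypothetical helper for illustration).
    """
    leaves = defaultdict(list)
    for v in sorted(values):            # sorting puts leaves in increasing order
        stem, leaf = divmod(v, 10)      # e.g. 34 -> stem 3, leaf 4
        leaves[stem].append(leaf)

    rows = []
    # Walk every stem from smallest to largest, keeping empty stems so gaps show.
    for stem in range(min(leaves), max(leaves) + 1):
        text = ", ".join(str(d) for d in leaves.get(stem, []))
        rows.append(f"{stem} | {text}".rstrip())
    return rows

# Ages read from the First Ladies stemplot.
ages = [34, 36, 43, 52, 54, 55, 57, 58,
        60, 60, 61, 62, 64, 64, 64, 65, 66, 69,
        70, 71, 73, 74, 76, 77, 78, 78,
        81, 81, 82, 83, 83, 86, 87, 88, 89, 89, 97]

rows = stemplot(ages)
print("\n".join(rows))
```

Stems with no observations are still printed, which is what lets a gap in the data show up in the plot.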
Make a stemplot with the following data
Joey’s first 14 quiz grades in a marking period were:
Key: 7 | 4 is score of
Center: 86; Spread: 24; Shape: Uniform
Split Stemplot
Age of 27 students randomly selected from Stat 303 at A&M
1 | 7
1 | 8, 9, 9, 9, 9, 9
2 | 0, 0, 0, 0, 1, 1, 1, 1, 1, 1
2 | 2, 2, 2, 3, 3
2 | 4, 5
2 |
2 | 8
3 | 0, 1
Stem is split for every 2 leaves: (0, 1), (2, 3), (4, 5), (6, 7), and (8, 9)
Split Stemplot
Age of 27 students randomly selected from Stat 303 at A&M
1 |
1 | 7, 8, 9, 9, 9, 9, 9
2 | 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 4
2 | 5, 8
3 | 0, 1
3 |
Stem is split for every 5 leaves: (0 thru 4) and (5 thru 9)
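Splitting stems can be sketched the same way: each leaf is assigned to a row by which group of digits it falls in. The function `split_stemplot` and its parameter `leaves_per_row` are hypothetical names for this example. It assumes non-negative whole numbers and a group size of 2 or 5, and it prints rows only from the first non-empty row to the last, so the empty outer rows shown on the slide are not produced.

```python
def split_stemplot(values, leaves_per_row):
    """Stemplot with each stem split into rows of `leaves_per_row` leaf digits.

    leaves_per_row=5 gives rows (0-4) and (5-9); leaves_per_row=2 gives
    (0,1), (2,3), (4,5), (6,7), (8,9). Hypothetical helper for illustration.
    """
    if 10 % leaves_per_row != 0:
        raise ValueError("leaves_per_row must divide 10 evenly")
    rows_per_stem = 10 // leaves_per_row

    groups = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        key = (stem, leaf // leaves_per_row)   # which split row the leaf lands in
        groups.setdefault(key, []).append(leaf)

    rows = []
    (stem, part), last = min(groups), max(groups)
    while (stem, part) <= last:
        text = ", ".join(str(d) for d in groups.get((stem, part), []))
        rows.append(f"{stem} | {text}".rstrip())
        part += 1
        if part == rows_per_stem:              # move on to the next stem
            stem, part = stem + 1, 0
    return rows

# Ages of the 27 students, read from the split stemplots.
student_ages = [17, 18, 19, 19, 19, 19, 19,
                20, 20, 20, 20, 21, 21, 21, 21, 21, 21,
                22, 22, 22, 23, 23, 24, 25, 28, 30, 31]

by_two = split_stemplot(student_ages, 2)
by_five = split_stemplot(student_ages, 5)
print("\n".join(by_two))
print()
print("\n".join(by_five))
```

Comparing the two outputs shows the trade-off: splitting by 2 spreads the ages over more rows and exposes the gap before 28, while splitting by 5 gives a more compact picture of the same data.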
Back-to-back Stemplot
Number of home runs in a season
          Babe Ruth |   | Roger Maris
                    | 0 | 8
                    | 1 | 3, 4, 6
               5, 2 | 2 | 3, 6, 8
               5, 4 | 3 | 3, 9
9, 7, 6, 6, 6, 1, 1 | 4 |
            9, 4, 4 | 5 |
                  0 | 6 | 1
When comparing data, use comparative language! (higher, more than, etc.)
Compare Ruth & Maris
Who’s better? Babe Ruth is centered higher, at about 47 compared to Maris at 23.
Any unusual features? Maris has a possible outlier at 61.
Spread? Maris has a larger spread of 53 compared to Babe’s 38.
Shape? Babe’s is mound-shaped and symmetric, while Maris’s is skewed right with the outlier.
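The center and spread figures in this comparison can be checked numerically. The sketch below assumes the median as the measure of center and the range (max minus min) as the spread; the slide does not say which measure of center it used, and its values appear to be by-eye estimates. The lists are read off the back-to-back stemplot, and `center_and_spread` is a made-up helper name.

```python
from statistics import median

# Home runs per season, read from the back-to-back stemplot.
ruth = [22, 25, 34, 35, 41, 41, 46, 46, 46, 47, 49, 54, 54, 59, 60]
maris = [8, 13, 14, 16, 23, 26, 28, 33, 39, 61]

def center_and_spread(data):
    """Return (median, range) for a list of numbers (illustrative helper)."""
    return median(data), max(data) - min(data)

ruth_center, ruth_spread = center_and_spread(ruth)
maris_center, maris_spread = center_and_spread(maris)

print(f"Ruth:  median {ruth_center}, range {ruth_spread}")
print(f"Maris: median {maris_center}, range {maris_spread}")
```

With these lists the medians are 46 for Ruth and 24.5 for Maris, close to the estimates of about 47 and 23, and the ranges of 38 and 53 match the spreads quoted above.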
Histogram
Horizontal axis: variable being counted
Vertical axis: frequency count
This bin represents the # of people whose age is at least 20 but less than 25
Centered at about 35
Skewed right
Spread of 90 years
Uniform Distribution from rolling a fair six-sided die 300 times
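A quick simulation gives a feel for what such a plot looks like. This is only a sketch: `roll_counts` is a hypothetical helper, the seed is arbitrary, and the counts it produces are not the data behind the slide. With a fair die each face should come up about 50 times in 300 rolls, though individual counts will vary.

```python
import random
from collections import Counter

def roll_counts(n_rolls=300, seed=0):
    """Simulate rolling a fair six-sided die and count each face (illustrative)."""
    rng = random.Random(seed)                       # fixed seed for repeatability
    rolls = [rng.randint(1, 6) for _ in range(n_rolls)]
    counts = Counter(rolls)
    return [counts[face] for face in range(1, 7)]   # counts for faces 1..6

counts = roll_counts()
for face, count in enumerate(counts, start=1):
    print(f"{face}: {'#' * count} ({count})")
```

The bars come out roughly the same height, which is the signature of a uniform shape.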
How to make Histograms
Divide the list of data into groups or classes of equal width (0-5, 5-10, 10-15, etc.).
Scale the horizontal axis using these classes.
Count the number of individuals in each class.
Scale the vertical axis using the counts.
Draw bars representing the count for each class, so each bar has equal width.
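The counting step can be sketched as follows. The helper `histogram_counts` and its parameters `start` and `width` are hypothetical names for this example. Each class is assumed to include its lower boundary but not its upper one (at least 15 but less than 20, and so on), and every value is assumed to be at least `start`. The data are the 27 student ages from the split stemplot slides.

```python
def histogram_counts(data, start, width):
    """Count values in equal-width classes [start, start+width), and so on.

    Returns a list of (lower, upper, count) tuples. Illustrative helper;
    assumes every value in `data` is >= start.
    """
    n_classes = int((max(data) - start) // width) + 1
    counts = [0] * n_classes
    for x in data:
        counts[int((x - start) // width)] += 1     # index of the class holding x
    return [(start + i * width, start + (i + 1) * width, c)
            for i, c in enumerate(counts)]

# Ages of the 27 students from the split stemplots.
student_ages = [17, 18, 19, 19, 19, 19, 19,
                20, 20, 20, 20, 21, 21, 21, 21, 21, 21,
                22, 22, 22, 23, 23, 24, 25, 28, 30, 31]

classes = histogram_counts(student_ages, start=15, width=5)
for lower, upper, count in classes:
    # One text "bar" per class; every class has the same width.
    print(f"{lower}-{upper}: {'#' * count} ({count})")
```

The class counts agree with the row lengths of the stemplot split for every 5 leaves, which is a useful cross-check: a split stemplot turned on its side is essentially a histogram.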
Histograms on the calculator
Enter the data into a List.
Choose the histogram option in StatPlot.
Choose the list you used for Xlist.
Choose 1 for Freq, or a 2nd list if the data is stored in two lists (values in one, frequencies in another).
ZOOM 9:ZoomStat will scale the plot for you, but check the Window to make sure you have reasonable min & max values for both x (values) and y (frequency count). Xscl sets the width of the bars.
Ch. 3 Test Results
Centered at about 80
No unusual features
Skewed left
Spread of about 60
Ch. 3 Test Results (stemplot)
Decimal point is 1 digit(s) to the right of the colon.
4 : 4
4 : 9
5 : 13
5 :
(remaining rows of the stemplot are not recoverable)
Make a histogram using Babe Ruth’s data from the earlier slide