Chapter 4: Quantitative Data Part 1: Displaying Quant Data (Week 2, Wednesday) Part 2: Summarizing Quant Data (Week 2, Friday)
Displaying Quantitative Data Qualitative data: few categories make this data easy to display. Example: Gender has 2 categories (M/F). Example: Grade has 5 categories (A/B/C/D/F). Qual Tools: Pie Graphs, Frequency Tables, Bar Charts. Quantitative data: typically has many distinct values. Example: Weight, Age, Height, Salary. Therefore the qualitative tools above won’t work well. Quant Tools: Histograms, Stem & Leaf, Dot Plots.
Displaying Quantitative Data Histogram (p. 48) Group data into “bins” Example: Test Grades Bins: [55, 60), [60, 65), [65, 70), [70, 75), [75, 80), [80, 85), [85, 90), [90, 95), [95, 100)
Displaying Quantitative Data Histogram (p. 48) *** Notice that the observation 60 is placed in the bin [60, 65), not [55, 60). This is the standard way to place observations that fall on a boundary: each bin includes its left endpoint and excludes its right endpoint. ***
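The boundary rule can be sketched in Python. This is a minimal illustration, not the textbook's procedure: the grade values and the helper name `bin_counts` are hypothetical, and only the bin edges come from the slide.

```python
START, WIDTH, N_BINS = 55, 5, 9   # bins [55, 60), [60, 65), ..., [95, 100)

def bin_counts(values, start=START, width=WIDTH, n_bins=N_BINS):
    """Count how many values fall in each left-closed bin."""
    counts = [0] * n_bins
    for v in values:
        # Floor division sends a boundary value such as 60 to the bin it opens.
        index = int((v - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts

# Hypothetical test grades (not the data from the slide).
grades = [58, 60, 64, 65, 71, 78, 83, 85, 92, 99]
counts = bin_counts(grades)

# Print a sideways text histogram, one row per bin.
for i, count in enumerate(counts):
    low = START + WIDTH * i
    print(f"[{low}, {low + WIDTH}): {'#' * count}")
```

The grade 60 is counted in [60, 65), and the grade 65 is counted in [65, 70).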
Displaying Quantitative Data Stem and Leaf (p. 50) “Histograms provide an easy-to-understand summary of the distribution, but they don’t show the data values themselves” Stem and Leaf displays are the solution.
Displaying Quantitative Data Stem and Leaf (p. 50) Example: Test Grades Bins: [50, 60), [60, 70), [70, 80), [80, 90), [90, 100)
Displaying Quantitative Data Stem and Leaf (p. 50) Example: Test Grades. Key: 1|2 means 12%. *** See page 51 to learn how to build stem and leaf displays with bins that are not 10 units wide ***
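A stem and leaf display with bins of width 10 can be sketched in Python as follows. The grades and the helper name `stem_and_leaf` are hypothetical, chosen only for illustration; the tens digit is used as the stem and the ones digit as the leaf.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return {stem: [leaves]} with tens digits as stems and ones digits as leaves."""
    display = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        display[stem].append(leaf)
    return dict(display)

# Hypothetical test grades (not the data from the slide).
grades = [58, 60, 64, 65, 71, 78, 83, 85, 92, 99]
display = stem_and_leaf(grades)

# Print every stem from smallest to largest, including any empty stems.
for stem in range(min(display), max(display) + 1):
    leaves = "".join(str(leaf) for leaf in display.get(stem, []))
    print(f"{stem} | {leaves}")
```

In this output, a row such as `6 | 045` stands for the grades 60, 64, and 65, so every data value stays visible.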
Displaying Quantitative Data Dotplots (p. 52) Example: Test Grades
Chapter 4: Quantitative Data Part 2: Summarizing Quant Data (Week 2, Friday)
Summarizing Quantitative Data Shape: Mode, Symmetry (Symmetric, Skewed). Center: Median, Mean. Spread: Range, Quartiles, IQR. Advanced Topics (used throughout the rest of the semester): Variance, Standard Deviation.
Summarizing Quantitative Data Mode (p. 53) “Does the histogram have a single, central hump or several separated humps? These humps are called modes.”
Summarizing Quantitative Data Mode (p. 53) Unimodal: one central hump. Bimodal: two humps. Multimodal: three or more humps. Uniform: all the bars are approximately the same height and no mode is obvious.
Summarizing Quantitative Data Symmetry (p. 54) “Can you fold it along a vertical line through the middle and have the edges match pretty closely, or are more of the values on one side?” Symmetric: the two sides roughly match. Skewed to the Left: the longer tail stretches to the left. Skewed to the Right: the longer tail stretches to the right.
Summarizing Quantitative Data Mean VS Median Mean: what we typically think of when we hear the word “average”. Add up the values and divide by the total number Median: the number such that exactly half of the values are above it and half are below it.
Summarizing Quantitative Data Mean VS Median Example 1: Consider the test grades 83, 94, 98, 99, 60. The mean can be found through: Mean = (83+94+98+99+60)/5 = 86.8. The median can be found by first ordering the values from smallest to largest: 60, 83, 94, 98, 99. Then selecting the number that is in the middle. Median = 94
Summarizing Quantitative Data Mean VS Median Example 2: Consider the test grades 83, 94, 98, 99. The mean can be found through: Mean = (83+94+98+99)/4 = 93.5. The median can be found by first ordering the values from smallest to largest: 83, 94, 98, 99. Then “averaging” the two numbers in the middle: Median = (94+98)/2 = 96
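Both examples can be checked with Python's standard `statistics` module. This is only a quick sketch; the variable names `example_1` and `example_2` are introduced here for illustration.

```python
from statistics import mean, median

example_1 = [83, 94, 98, 99, 60]   # odd number of values
example_2 = [83, 94, 98, 99]       # even number of values

for grades in (example_1, example_2):
    # median() orders the values itself; sorted() is shown only for reading.
    print(sorted(grades), "mean =", mean(grades), "median =", median(grades))
```

Notice that the low grade of 60 pulls the mean of Example 1 down to 86.8 while the median stays at 94.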
Summarizing Quantitative Data Range Range = Largest Number – Smallest Number. Example: Consider the test grades 83, 94, 98, 99. The range can be found through: Range = 99 – 83 = 16 *** THE RANGE IS A NUMBER. “83 to 99” IS WRONG ***
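As a quick check in Python, the range is a single subtraction. The variable name `grade_range` is just an illustrative choice.

```python
grades = [83, 94, 98, 99]

# Largest number minus smallest number gives one number, not an interval.
grade_range = max(grades) - min(grades)
print(grade_range)  # 16
```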
Summarizing Quantitative Data Quartiles (p. 58) A special way of splitting the data into fourths Order the data, split it in half, Find the medians of each half. “Lower Quartile” (or “Q1”) is the lower median “Upper Quartile” (or “Q3”) is the upper median Example 1: (even number of values) Find Q1 and Q3 of the following ages: First, order the numbers from lowest to highest: Next, split the data in half (four numbers in each half) First half: Q1 = Median of First Half = (21+22)/2 = 21.5 Last half: Q3 = Median of Last Half = (33+34)/2 = 33.5
Summarizing Quantitative Data Quartiles (p. 58) Example 2: (odd number of values) Find Q1 and Q3 of the following ages: First, order the numbers from lowest to highest: Next, split the data in half (the median, 22, is included in both halves) First half: Q1 = Median of First Half = (21+22)/2 = 21.5 Last half: Q3 = Median of Last Half = (33+34)/2 = 33.5
Summarizing Quantitative Data Quartiles (p. 58) IQR (“Interquartile Range”) IQR = Q3 – Q1. A single number (just like the range is a single number)
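The "split in half, then take the median of each half" rule can be sketched in Python. The helper names `quartiles` and `iqr` are hypothetical, and the data below are test grades used elsewhere in these slides, not the ages from the quartile examples. Other textbooks and software use different quartile conventions, so their results may differ slightly.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    With an odd number of values, the middle value is included in both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = (n + 1) // 2           # size of each half; halves overlap by one value when n is odd
    lower = ordered[:half]
    upper = ordered[n - half:]
    return median(lower), median(upper)

def iqr(values):
    """Return the interquartile range Q3 - Q1 as a single number."""
    q1, q3 = quartiles(values)
    return q3 - q1

even_grades = [80, 85, 90, 95]        # even number of values
odd_grades = [83, 94, 98, 99, 60]     # odd number of values

print(quartiles(even_grades), iqr(even_grades))
print(quartiles(odd_grades), iqr(odd_grades))
```

For the odd case the ordered values are 60, 83, 94, 98, 99, so the halves are 60, 83, 94 and 94, 98, 99, which share the median 94.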
Summarizing Quantitative Data Summation Notation (p. 62) Consider the grades: 80, 85, 90, 95. ∑y (represents a “summation” of the grades) That is: ∑y = 80 + 85 + 90 + 95 = 350. From now on, we will use a new notation for MEAN: ȳ = ∑y / n, where “y-bar” (ȳ) represents the mean and n is the number of values for y
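In Python, the summation and the mean for these four grades look like this. The names `total` and `y_bar` are illustrative stand-ins for ∑y and "y-bar".

```python
grades = [80, 85, 90, 95]

total = sum(grades)    # the summation of y
n = len(grades)        # number of values
y_bar = total / n      # mean: the sum divided by n

print(total, n, y_bar)
```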
Summarizing Quantitative Data Variance The variance of a variable is a measure of how “spread out” the data is. It is given by the following “complicated” formula: s² = ∑(y – ȳ)² / (n – 1). Note that “s-squared” represents the variance. The equation is best understood through an example.
Summarizing Quantitative Data Variance Example Consider the grades: 80, 85, 90, 95. Find the variance through the equation: s² = ∑(y – ȳ)² / (n – 1). With ȳ = 350/4 = 87.5, the deviations y – ȳ are –7.5, –2.5, 2.5, 7.5 and their squares are 56.25, 6.25, 6.25, 56.25, so s² = 125/3 ≈ 41.67
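The calculation can be laid out column by column in Python. This sketch assumes the usual sample variance formula, which divides the sum of squared deviations from the mean by n – 1; the variable names are illustrative.

```python
grades = [80, 85, 90, 95]
n = len(grades)
y_bar = sum(grades) / n   # mean of the grades

# Build the table one row at a time: y, its deviation from the mean, and the square.
print(f"{'y':>4} {'y - ybar':>9} {'(y - ybar)^2':>13}")
squared_deviations = []
for y in grades:
    deviation = y - y_bar
    squared_deviations.append(deviation ** 2)
    print(f"{y:>4} {deviation:>9.2f} {deviation ** 2:>13.2f}")

# Assumed formula: sum of squared deviations divided by n - 1.
variance = sum(squared_deviations) / (n - 1)
print("sum of squared deviations =", sum(squared_deviations))
print("variance =", round(variance, 2))
```

Squaring the deviations keeps the negative and positive ones from cancelling out when they are added.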
Summarizing Quantitative Data Variance Example Let’s take a closer look at what’s happening: When data is more spread out, the result is a higher variance. When all of the values are the same, the variance is 0.
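Both observations can be seen with a few small data sets. Apart from the grades 80, 85, 90, 95, the values below are hypothetical and were chosen only to differ in spread; `statistics.variance` computes the sample variance (dividing by n – 1).

```python
from statistics import variance

same = [87, 87, 87, 87]     # hypothetical: all values identical
tight = [86, 87, 88, 89]    # hypothetical: values close together
spread = [80, 85, 90, 95]   # grades from the variance example
wide = [75, 80, 95, 100]    # hypothetical: values far apart

for name, data in [("same", same), ("tight", tight), ("spread", spread), ("wide", wide)]:
    print(f"{name:>6}: {data} variance = {variance(data):.2f}")
```

The three non-identical sets share the mean 87.5, so the growing variance reflects only how far the values sit from that mean.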
Summarizing Quantitative Data Standard Deviation Standard deviation is the square root of variance: s = √(s²) = √( ∑(y – ȳ)² / (n – 1) ). Note: the symbol for standard deviation is s
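Continuing with the grades 80, 85, 90, 95, the standard deviation is one more step after the variance. This sketch again assumes the n – 1 divisor, and compares the hand calculation with `statistics.stdev`, which uses the same convention.

```python
from math import sqrt
from statistics import stdev

grades = [80, 85, 90, 95]
n = len(grades)
y_bar = sum(grades) / n

# Sample variance (assumed n - 1 divisor), then its square root.
variance = sum((y - y_bar) ** 2 for y in grades) / (n - 1)
s = sqrt(variance)

print(round(s, 2), round(stdev(grades), 2))
```

Unlike the variance, s is in the same units as the data (here, grade points).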