Chris Morgan, MATH G160
April 11, 2012
Lecture 29
Chapter 2.3 & 2.4: Stem and Leaf Plots and Cross Tabulation
Stem and Leaf Plots
- Gives a quick picture of the shape of the distribution
- Shows the rank order and the distribution simultaneously
- Includes actual numerical values
- Works best for small numbers of observations where all observations are greater than zero
Making a Stem and Leaf Plot
- Sort the data from smallest to largest (and trim the data if necessary)
- For each number, set the last digit as its “leaf” and the leading digit(s) as its “stem” (e.g., for the number 24, the 2 is the stem and the 4 is the leaf)
- Separate the stems and leaves into two columns, and left-align the leaves
Making a Stem and Leaf Plot
Number of times per day my mom will yell at me over Christmas break:
0 | 0 0 2 3 4 4 5 6 6 6 8 8 9
1 | 0 1 3 3 5 7 8 9
2 | 1 2 4
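The steps for making a stem and leaf plot can be sketched in a few lines of Python. This is only a minimal illustration, not part of the lecture: the function name `stem_and_leaf` and the variable `yells` are made up here, the data values are read back off the stems and leaves of the plot above, and the code assumes a non-empty list of non-negative whole numbers with one-digit leaves.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the rows of a stem-and-leaf plot as a list of strings.

    Assumes a non-empty list of non-negative integers; the last digit
    is the leaf and the remaining leading digits are the stem.
    """
    stems = defaultdict(list)
    # Step 1: sort the data from smallest to largest.
    for value in sorted(values):
        # Step 2: split each number, e.g. 24 -> stem 2, leaf 4.
        stem, leaf = divmod(value, 10)
        stems[stem].append(leaf)

    rows = []
    # Step 3: one row per stem, leaves left-aligned after the bar.
    # Empty stems are kept so gaps in the distribution stay visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = " ".join(str(leaf) for leaf in stems.get(stem, []))
        rows.append(f"{stem} | {leaves}".rstrip())
    return rows


# Values read off the plot above (assumed to be the original data).
yells = [0, 0, 2, 3, 4, 4, 5, 6, 6, 6, 8, 8, 9,
         10, 11, 13, 13, 15, 17, 18, 19, 21, 22, 24]

for row in stem_and_leaf(yells):
    print(row)
```

Because every leaf is a single digit, the length of each row is proportional to the number of observations on that stem, which is why the plot doubles as a sideways histogram.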
Histogram or Stem and Leaf Plot?

Histogram
- Quantitative variables
- Good for big data sets, especially if technology is available
- Uses a box to represent each data point
- Popular method of conveying information; will be utilized often in this course

Stem and Leaf Plots
- Quantitative variables
- Good for small data sets; convenient for back-of-the-envelope calculations; rarely found in scientific or lay publications
- Uses a digit to represent each data point
- Seen as elementary; will not be utilized often in this course
How to analyze relationships between types of data?
What type(s) of variables we have will determine the method we use to compare the data.

Types of Variables               Method
Categorical vs. Categorical      Cross Tabulation
Categorical vs. Quantitative     ANOVA
Quantitative vs. Quantitative    Regression
Cross Tabulation
[Two-way table of M&M counts. Columns: Yellow, Red, Orange, Green, Blue, Brown, Total. Rows: Peanut, Plain, Total.]
Two-way tables make it easy to compute conditional probability!
P(Row A | Column B) = Cell(A,B) / Column B Total
Similarly, P(Column B | Row A) = Cell(A,B) / Row A Total
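The two formulas differ only in which total goes in the denominator, which a short Python sketch may make concrete. Everything named here is an assumption for illustration: the helper functions, the nested-dictionary layout, and the counts in `table` are hypothetical and are not the lecture's M&M table.

```python
def conditional_given_column(table, row, col):
    """P(row | col) = Cell(row, col) / column total."""
    column_total = sum(cells[col] for cells in table.values())
    return table[row][col] / column_total


def conditional_given_row(table, row, col):
    """P(col | row) = Cell(row, col) / row total."""
    row_total = sum(table[row].values())
    return table[row][col] / row_total


# Hypothetical counts, for illustration only (not the lecture's data).
table = {
    "Row A": {"Column A": 2, "Column B": 6},
    "Row B": {"Column A": 8, "Column B": 4},
}

# Same cell (6), different denominators:
p_row_given_col = conditional_given_column(table, "Row A", "Column B")  # 6 / 10
p_col_given_row = conditional_given_row(table, "Row A", "Column B")     # 6 / 8
print(p_row_given_col, p_col_given_row)
```

The same cell count gives two different answers, so it matters which event is the condition.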
Cross Tabulation
What more can I do with cross tabulation?
- Joint Probability
- Marginal Probability
- Conditional Probability
Joint Probability
Probability (A and B) = cell count of A and B / grand total
P(Red and Peanut) = 3 / 65 = 0.05
P(Blue and Plain) = 9 / 65 =
Marginal Probability
Probability [column A] = total of column A / grand total
P(Orange) = 10 / 65 = 0.15
Probability [row B] = total of row B / grand total
P(Plain) = 40 / 65 =
Conditional Probability
Conditional probability is the probability that event A occurs given that event B has already occurred. For instance, if I observe a yellow M&M, what is the probability it is plain? Or if it is peanut, what is the probability it’s red?
Probability [A | B] = cell count of A and B / total count of B
P(Plain | Yellow) = 6 / 11 = 0.55
P(Red | Peanut) = 3 / 25 =
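As a rough check on how the three kinds of probability fit together, the sketch below uses only counts that appear in the formulas on these slides (3, 6, 11, 25, 65). The variable names are made up for illustration, and the full table is not reproduced. It also checks the standard identity P(A | B) = P(A and B) / P(B), which the slides do not state explicitly.

```python
# Counts taken from the worked formulas in the slides.
grand_total = 65
peanut_total = 25       # row total for Peanut
yellow_total = 11       # column total for Yellow
red_and_peanut = 3      # cell count: Red and Peanut
plain_and_yellow = 6    # cell count: Plain and Yellow

# Joint: cell count / grand total.
p_red_and_peanut = red_and_peanut / grand_total

# Marginal: row (or column) total / grand total.
p_peanut = peanut_total / grand_total

# Conditional: cell count / total of the conditioning event.
p_red_given_peanut = red_and_peanut / peanut_total
p_plain_given_yellow = plain_and_yellow / yellow_total

# Identity (not stated in the slides): P(A | B) = P(A and B) / P(B).
assert abs(p_red_given_peanut - p_red_and_peanut / p_peanut) < 1e-12

print(round(p_red_and_peanut, 2))      # joint
print(round(p_peanut, 2))              # marginal
print(round(p_red_given_peanut, 2))    # conditional
print(round(p_plain_given_yellow, 2))  # conditional
```

Dividing by the Peanut total instead of the grand total is what restricts attention to peanut M&Ms only, and that is the whole difference between the joint and the conditional probability.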