DATA DETECTIVES TWSSP Tuesday
Agenda for today: Distinguishing Distributions; Old Faithful – Data Detectives Activity; Head Measurement; Hand Span Measurement; Standard Deviation. GOALS: Start thinking about how sample size influences variability. Understand what information variability can give you about data. Understand the difference in sources of variability. Understand the meaning of standard deviation.
Distinguishing Distributions Imagine you are describing a person’s face to a friend who has never seen that person. What features would be essential in your description? Now think about describing a graphical representation. What would be the essential features of a graphical representation of data which you would want to include in your description?
Distinguishing Distributions All of the graphs you see represent the exam scores of different classes. For classes A, B, and C, what main feature(s) distinguish the graphs from one another? What might be the source of this difference? Do the same for D, E, and F, and for G, H, and I.
Distinguishing Distributions What strikes you as the most distinguishing characteristic of the distribution of exam scores in graph J? What might be an explanation for this characteristic? Do the same for K and L. Go back to graph D. If you wanted to tell someone how the class did on the exam, what would you say? How would you describe the “bulk” of the data for class D? Do the same for class E.
Distinguishing Distributions Suppose you are told that most exam scores for a class were between 65 and 85, and that the lowest score was 30 and the highest was 100. What might a possible distribution look like? Wrap up – Distinguishing Distributions: What features of a graph are important in describing a distribution? Are there any characteristics that some graphs may have but others won’t?
Old Faithful The data you have show wait times between eruptions of the Old Faithful geyser in Yellowstone National Park in 1985. Each row shows the wait times for one day. In your group, choose any two of the rows (they don’t need to be consecutive) to use to investigate the data. On your own: Look over the data for the two days your group selected, and jot down your notices and wonders. Sketch a graphical representation of the data, and jot down any new notices and wonders.
Old Faithful In your group: Share and compare your graphs. What do you notice and wonder about as you look at other graphs? Agree as a group on a graphical way to display your data. On the basis of your data, make a group decision about how long you would expect to wait between blasts of Old Faithful if you got to the geyser right after it had finished erupting. Prepare to share your graph and defend your decision about wait time to the whole group.
Old Faithful – two weeks’ data
Old Faithful – wait time vs. previous
Old Faithful What does each representation reveal about the data that the other representations do not? How does variability appear in each of these representations? What does variability tell you about the data?
Head Measurement Measure and record the circumference of your head in cm and your hand span in mm. Plot the aggregated class data. I notice, I wonder. Based on your observations, complete the statement: the typical head circumference is _______ cm, give or take ________ cm. What are the possible sources of variability in head measurements? How (if at all) could the variability be reduced?
Head Measurement Plot the data for the measurement of one participant’s head. I notice, I wonder. How does this graph compare to your previous one? What are the possible reasons for variability in the measurements? How (if at all) could it be reduced?
Head Measurement How was the variability in each case different? When might we want and even expect variability? When is variability a type of error?
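The contrast between the two head-measurement plots can also be sketched numerically. The following is a minimal Python sketch under stated assumptions: `class_heads_cm` stands in for one measurement from each participant, and `repeat_heads_cm` stands in for repeated measurements of a single participant's head. Both lists are invented for illustration and are not workshop data, and the helper name `typical_give_or_take` is hypothetical.

```python
def typical_give_or_take(values):
    """Return (mean, average distance from the mean) for a list of numbers."""
    center = sum(values) / len(values)
    spread = sum(abs(v - center) for v in values) / len(values)
    return center, spread

# Hypothetical data, invented for illustration only.
class_heads_cm = [53.0, 54.5, 55.0, 56.0, 56.5, 57.0, 58.0, 59.5]
repeat_heads_cm = [56.8, 57.0, 57.1, 56.9, 57.2, 57.0, 56.9, 57.1]

class_center, class_spread = typical_give_or_take(class_heads_cm)
repeat_center, repeat_spread = typical_give_or_take(repeat_heads_cm)

print(f"Class data: typical {class_center:.1f} cm, give or take {class_spread:.1f} cm")
print(f"One head, repeated: typical {repeat_center:.1f} cm, give or take {repeat_spread:.1f} cm")
```

With these made-up numbers, the class spread is much larger than the repeated-measurement spread. The first reflects real differences between people; the second reflects only measurement error.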
Hand Span Make a dot plot of the class data and use a wedge (▲) to mark the mean below the horizontal axis. I notice, I wonder. What are two sources for the variability in the data?
Hand Span Make a second dot plot where the data points are the differences (deviations) between each hand span measurement and the mean. Use a wedge to mark the mean of the deviations. I notice, I wonder. Could you get from the first plot to the second without any computations? How would you describe the ‘typical’ deviation from the hand span mean?
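For checking the deviation plot afterwards, here is a small Python sketch. The hand span values are hypothetical, invented only for illustration, and averaging the deviations without regard to sign is just one possible way to describe a "typical" deviation.

```python
# Hypothetical hand spans in mm, invented for illustration only.
hand_spans_mm = [180, 195, 200, 205, 210, 215, 230]

span_mean = sum(hand_spans_mm) / len(hand_spans_mm)

# Each deviation is the signed distance of a measurement from the mean.
deviations = [x - span_mean for x in hand_spans_mm]
mean_of_deviations = sum(deviations) / len(deviations)

# Ignoring the sign gives one way to describe a "typical" deviation.
typical_deviation = sum(abs(d) for d in deviations) / len(deviations)

print("mean hand span:", span_mean)
print("deviations:", deviations)
print("mean of deviations:", mean_of_deviations)
print("typical deviation (ignoring sign):", round(typical_deviation, 1))
```

The second plot has the same shape as the first, with every point shifted left by the mean, so the wedge lands at 0.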
Growing Distributions Resting heart rates: Find your pulse in your wrist or your neck. Count your pulse for 60 seconds; I will keep time for you. Plot the resting heart rates for our whole class. I notice, I wonder. What do you think the plot would look like if we compiled all the resting heart rates of all the institute participants? Of everyone on campus right now?
Growing Distributions Data from NHANES (National Health and Nutrition Examination Survey), collected between ,610 participants, nationally representative sample using a “complex, multistage sampling design.” Excluded: 11 people with resting heart rates > 200 bpm; 740 people with high white blood cell counts (likelihood of infections); 9,083 people taking medications which could affect heart rate; 1,331 pregnant women; 523 people with irregular pulses; 462 people with abnormal thyroid function.
Growing Distributions What are the differences between dot plots and smooth curves? What kinds of data generate each? What information do we get from each of the plots about the spread of data?
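One way to explore how a dot plot changes as the group grows is to simulate it. The sketch below is a rough illustration, not the NHANES data: it assumes a bell-shaped population of resting heart rates centered at 70 bpm with a spread of 10 bpm (both numbers are assumptions), and the helper `text_dot_plot` is a hypothetical name.

```python
import random
from collections import Counter

def text_dot_plot(values, bin_width=5):
    """Return dot-plot rows as strings, one row per bin of width bin_width."""
    counts = Counter((int(v) // bin_width) * bin_width for v in values)
    rows = []
    for start in range(min(counts), max(counts) + bin_width, bin_width):
        rows.append(f"{start:3d} | " + "." * counts.get(start, 0))
    return rows

random.seed(1)  # fixed seed so the sketch is repeatable

# Assumed population: bell-shaped, centered at 70 bpm with spread 10 bpm.
for n in (15, 60, 240):
    sample = [random.gauss(70, 10) for _ in range(n)]
    print(f"\nn = {n}")
    print("\n".join(text_dot_plot(sample)))
```

Small samples tend to look lumpy and gappy. As n grows, the outline of the dots settles toward the smooth curve of the assumed population.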
Standard Deviation Calculate the deviations from the mean. Draw horizontal lines from the mean to each data point to show the deviations.
Standard Deviation Draw a line whose length is the average of the deviations from the mean, ignoring whether each deviation is positive or negative. Estimate the length of the line. This length is a good estimate of the standard deviation, a measure of how far the data vary from the mean, regardless of whether a data point is above or below the mean.
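To see how close the "average line length" estimate comes to the computed standard deviation, the two can be compared directly. This is a minimal sketch with hypothetical data values chosen for easy arithmetic. It uses `statistics.pstdev` from the Python standard library, which treats the data as a whole population.

```python
from statistics import pstdev

# Hypothetical data values, invented for illustration only.
data = [2, 4, 4, 5, 7, 8]

mean_value = sum(data) / len(data)

# The slide's estimate: average length of the deviation lines, ignoring sign.
average_abs_deviation = sum(abs(x - mean_value) for x in data) / len(data)

# Population standard deviation: square root of the average squared deviation.
standard_deviation = pstdev(data)

print("mean:", mean_value)
print("average deviation (ignoring sign):", round(average_abs_deviation, 2))
print("standard deviation:", round(standard_deviation, 2))
```

The average deviation is never larger than the standard deviation, because squaring gives extra weight to the points farthest from the mean. For roughly symmetric data without extreme values the two are usually close, which is why the line length works as an estimate.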
Estimate the standard deviation. The mean is 2.5.
Standard Deviation For each of the pairs of graphs on your handout, decide which has the greater standard deviation, and why.
Exit Ticket (sort of) What are the possible sources of variability in data? Is variability always a bad thing? Explain. In your own words, explain what standard deviation measures and what it tells you about a set of data.