Year 8: Data Handling 2
Dr J Frost
Last modified: 11th December 2014
Learning Outcomes: To understand stem and leaf diagrams, frequency polygons, box plots and cumulative frequency graphs.
Stem and Leaf Diagram - What is it?
Suppose this “stem and leaf diagram” represents the lengths of beetles.
Key: 2 | 1 means 2.1cm
The numbers to the left of the line (the stem) represent the first digit of each value. The numbers to the right (the leaves) represent the second digit. The key tells us how the two digits combine, e.g. value represented = 4.5cm. The numbers must be in order.
Example
Here are the weights of a group of cats. Draw a stem-and-leaf diagram to represent this data.
36kg 15kg 35kg 50kg 11kg 36kg 38kg 47kg 12kg 30kg 18kg 57kg
Key: 3 | 8 means 38kg
What do you think are the advantages of displaying data in a stem-and-leaf diagram?
Shows how the data is spread out.
Identifies gaps in the values.
All the original data is preserved (i.e. we don’t ‘summarise’ in any way).
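The cat weights example can be checked with a short sketch. This assumes two-digit whole numbers, so the stem is the tens digit and the leaf is the units digit; `stem_and_leaf` is a hypothetical helper name, not something from the worksheet.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group two-digit whole numbers by tens digit (stem); leaves are the units digits, in order."""
    rows = defaultdict(list)
    for v in sorted(values):
        rows[v // 10].append(v % 10)
    # Keep empty stems too, so that gaps in the data stay visible.
    return {stem: rows.get(stem, []) for stem in range(min(rows), max(rows) + 1)}

cat_weights = [36, 15, 35, 50, 11, 36, 38, 47, 12, 30, 18, 57]
diagram = stem_and_leaf(cat_weights)

for stem, leaves in diagram.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
print("Key: 3 | 8 means 38kg")
```

The empty row for stem 2 shows the gap in the weights, which is one of the advantages listed above.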
Your turn
Here is the brain diameter of a number of members of 8IW. Draw a stem and leaf diagram representing this data.
1.3cm 2.1cm 5.3cm 2.0cm 1.7cm 4.2cm 3.3cm 3.2cm 1.3cm 4.6cm 1.9cm
Key: 3 | 8 means 3.8cm
Median width = 2.1cm
Lower Quartile = 1.7cm
Upper Quartile = 4.2cm
Exercises Q1 on your provided worksheet. (Ref: Yr8-DataHandlingWorksheet.doc)
Frequency Diagram
Suppose we wanted to plot the following data, where each value has a frequency.
[Table and bar chart: Shoe size against Frequency]
A suitable representation of this data would be a bar chart. When bar charts have frequency on the y-axis, they’re known as frequency diagrams.
Frequency Polygons
But suppose that we had data grouped into ranges. What would be a sensible value to represent each range?
[Table: IQ (x) against Frequency, in class intervals starting from 90 ≤ x]
Join the points up with straight lines. This is known as a frequency polygon.
Modal class interval: 100 ≤ x < 110
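A common choice of representative value for a range is its midpoint, and the sketch below assumes that is what is intended here. The frequencies in the IQ table did not survive, so only the modal class interval from the slide is used; `midpoint` and `polygon_points` are hypothetical helper names.

```python
def midpoint(lower, upper):
    """Value halfway between the two ends of a class interval."""
    return (lower + upper) / 2

def polygon_points(classes, frequencies):
    """Pair each class midpoint with its frequency: these are the points to join up."""
    return [(midpoint(lower, upper), f) for (lower, upper), f in zip(classes, frequencies)]

# Modal class interval from the slide: 100 <= x < 110
modal_midpoint = midpoint(100, 110)
print(modal_midpoint)
```

So the frequency for 100 ≤ x < 110 would be plotted above 105 on the IQ axis.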
Frequency Polygons – Exercises on sheet b) 30 < x ≤ 40 c) 16% Q1 Q2 b) 20 < x ≤ 30 c) 16%
The Whole Picture
Widths (cm): 4, 4, 7, 9, 11, 12, 14, 15, 15, 18, 28, 42
Representations: Grouped Frequency Table, Cumulative Frequency Table, Histogram, Frequency Polygon, Cumulative Frequency Graph, Box Plots
What each gives us: Determine Median/LQ/UQ; Median/LQ/UQ class interval; Estimate of Median/LQ/UQ/num values in range
[Tables: Width (cm) against Frequency and Width (cm) against Cum Freq, in three class intervals from 0 < w up to w < 60; the last class has frequency 2 and cumulative frequency 12]
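Going from the raw widths to a grouped frequency table and then a cumulative frequency table can be sketched as below. The class boundaries 0, 10, 20, 60 are an assumption: only the final bound of 60, the last frequency of 2 and the last cumulative frequency of 12 survive in the slide. `grouped_frequencies` is a hypothetical helper name.

```python
from itertools import accumulate

def grouped_frequencies(values, boundaries):
    """Count the values in each class, treating a class as lower < v <= upper."""
    classes = zip(boundaries, boundaries[1:])
    return [sum(1 for v in values if lower < v <= upper) for lower, upper in classes]

widths = [4, 4, 7, 9, 11, 12, 14, 15, 15, 18, 28, 42]
boundaries = [0, 10, 20, 60]  # assumed: only the 60 is visible in the slide

frequencies = grouped_frequencies(widths, boundaries)
cumulative = list(accumulate(frequencies))  # running total of the frequencies

print(frequencies)
print(cumulative)
```

None of the widths sits exactly on a boundary, so the choice between < and ≤ at the class ends does not change the counts here.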
Recap: Lower and Upper Quartile
Suppose that we line up everyone in the school according to height. We already know that the median would be the middle person’s height. 50% of the people in the school would have a height less than them.
The height of the person 25% along the line is known as the lower quartile.
The upper quartile is the height of the person 75% along the line.
Check your understanding
50% of the data has a value more than the median.
75% of the data has a value less than the upper quartile.
25% of the data has a value more than the upper quartile.
75% of the data has a value more than the lower quartile.
[Diagram: line marked 0%, 25%, 50%, 75%, 100%, with LQ, Median and UQ]
Median/Quartile Revision
Here are the ages of 10 people at Pablo’s party. Choose the correct value.
12, 13, 14, 14, 15, 16, 16, 17, 19, 24
Median: 15, 16 or 15.5
Lower: 13, 14 or 13.5
UQ: 17, 19 or 18
Interquartile Range: 3
Range: 12
Quickfire Quartiles
Find the LQ, Median and UQ of each list:
1, 2, 3
1, 2, 3, 4
1, 2, 3, 4, 5
1, 2, 3, 4, 5, 6
Rule for lower quartile:
Even num of items: find median of bottom half.
Odd num of items: throw away middle item, find median of remaining bottom half.
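The rule can be sketched in code as a way of checking answers. This assumes the upper quartile mirrors the lower one (median of the top half), and that there are at least two items; `quartiles` is a hypothetical helper name.

```python
from statistics import median

def quartiles(values):
    """Return (LQ, median, UQ) using the 'median of each half' rule."""
    data = sorted(values)
    half = len(data) // 2                # for an odd count this leaves out the middle item
    bottom_half = data[:half]
    top_half = data[len(data) - half:]
    return median(bottom_half), median(data), median(top_half)

for items in ([1, 2, 3], [1, 2, 3, 4], [1, 2, 3, 4, 5], [1, 2, 3, 4, 5, 6]):
    print(items, quartiles(items))

# Ages at Pablo's party
pablo = quartiles([12, 13, 14, 14, 15, 16, 16, 17, 19, 24])
print(pablo)
```

Other textbooks and calculators sometimes use slightly different quartile rules, so small differences from this method are possible.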
What if there’s lots of items?
There are 31 items, in order of value. What items should we use for the median and lower/upper quartiles?
LQ: Use the 8th item
Median: Use the 16th item
UQ: Use the 24th item
What if there’s lots of items?
Num items: 15. LQ: 4th. Median: 8th. UQ: 12th.
Num items: 23. LQ: 6th. Median: 12th. UQ: 18th.
Num items: 39. LQ: 10th. Median: 20th. UQ: 30th.
Num items: 47. LQ: 12th. Median: 24th. UQ: 36th.
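One rule that agrees with these positions is: LQ at the (n + 1)/4 th item, median at the (n + 1)/2 th item, UQ at the 3(n + 1)/4 th item. The slides do not state this formula, so treat it as an assumption; the sketch only handles item counts where the positions come out as whole numbers. `quartile_positions` is a hypothetical helper name.

```python
def quartile_positions(n):
    """Positions (counting from 1) of the LQ, median and UQ among n ordered items."""
    if (n + 1) % 4 != 0:
        raise ValueError("positions are only whole numbers when n + 1 is a multiple of 4")
    lq = (n + 1) // 4
    return lq, 2 * lq, 3 * lq

for n in (15, 23, 31, 39, 47):
    print(n, quartile_positions(n))
```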
Box Plots
Box Plots allow us to visually represent the distribution of the data.
[Sketch: box plot labelled Minimum, Lower Quartile, Median, Upper Quartile, Maximum]
How is the IQR represented in this diagram?
How is the range represented in this diagram?
[Sketch: IQR and range marked on the box plot]
Box Plots
Sketch a box plot to represent the given weights of cats:
5lb, 6lb, 7.5lb, 8lb, 8lb, 9lb, 12lb, 14lb, 20lb
Find: Minimum, Maximum, Median, Lower Quartile, Upper Quartile
Box Plots
Sketch a box plot to represent the given ages of people at Dhruv’s party:
5, 12, 13, 13, 14, 16, 22
Find: Minimum, Maximum, Median, Lower Quartile, Upper Quartile
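The five values needed for each box plot can be checked with a short sketch. It uses the quartile rule from the Quickfire Quartiles slide (median of each half, leaving out the middle item when the count is odd); `five_number_summary` is a hypothetical helper name.

```python
from statistics import median

def five_number_summary(values):
    """Minimum, LQ, median, UQ and maximum: the five values a box plot shows."""
    data = sorted(values)
    half = len(data) // 2  # for an odd count this leaves out the middle item
    return {
        "minimum": data[0],
        "lower_quartile": median(data[:half]),
        "median": median(data),
        "upper_quartile": median(data[len(data) - half:]),
        "maximum": data[-1],
    }

cats = five_number_summary([5, 6, 7.5, 8, 8, 9, 12, 14, 20])   # weights in lb
party = five_number_summary([5, 12, 13, 13, 14, 16, 22])       # ages

for name, summary in (("cats", cats), ("party", party)):
    iqr = summary["upper_quartile"] - summary["lower_quartile"]  # width of the box
    spread = summary["maximum"] - summary["minimum"]             # whisker end to whisker end
    print(name, summary, "IQR =", iqr, "range =", spread)
```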
Comparing Box Plots
[Box plot comparing house prices of Croydon and Kingston-upon-Thames; axis from £100k to £450k]
“Compare the prices of houses in Croydon with those in Kingston”. (2 marks)
For 1 mark, one of:
The interquartile range of house prices in Kingston is greater than in Croydon.
The range of house prices in Kingston is greater than in Croydon.
For 1 mark:
The median house price in Kingston was greater than that in Croydon.
(Note that in old mark schemes, comparing the minimum/maximum/quartiles would have been acceptable, but currently, you MUST compare the median)
m times at the 2012 London Olympics
[Table: Time (s), Frequency, Cum Freq; four class intervals starting from 9.6 < t, plus a TOTAL row]
< t ≤ 10.2 Modal class interval
< t ≤ 10.2 Median class interval
Estimate of mean
Cumulative Frequency Graphs
[Table: Time (s), Frequency, Cum Freq. Plot: Cumulative Frequency against Time (s)]
This graph tells us how many people had “this value or less”.
Median = 10.07s
Lower Quartile = 9.95s
Upper Quartile = 10.13s
Interquartile Range = 0.18s
Cumulative Frequency Graphs
[Plot: Cumulative Frequency against Time (s)]
Estimate how many runners had a time less than 10.15s. 26 runners
Estimate how many runners had a time more than ...: 32 – 8 = 24 runners
Estimate how many runners had a time between 9.8s and 10s. 11 – 3 = 8 runners
A Cumulative Frequency Graph is very useful for finding the number of values greater/smaller than some value, or within a range.
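Reading values off a cumulative frequency graph can be sketched in code. The Olympic table did not survive here, so this sketch uses the widths from The Whole Picture slide instead, with assumed class boundaries 0, 10, 20 and 60 (only the 60 is certain). It also joins the plotted points with straight lines rather than a smooth curve. `cum_freq_at` is a hypothetical helper name.

```python
from bisect import bisect_right

def cum_freq_at(points, x):
    """Estimate the cumulative frequency at x by joining (upper bound, cum freq) points with straight lines."""
    xs = [px for px, _ in points]
    if x <= xs[0]:
        return float(points[0][1])
    if x >= xs[-1]:
        return float(points[-1][1])
    i = bisect_right(xs, x)                      # first plotted point to the right of x
    (x0, c0), (x1, c1) = points[i - 1], points[i]
    return c0 + (c1 - c0) * (x - x0) / (x1 - x0)

# (upper class boundary, cumulative frequency); boundaries 10 and 20 are assumed
points = [(0, 0), (10, 4), (20, 10), (60, 12)]
total = points[-1][1]

less_than_15 = cum_freq_at(points, 15)                                   # "this value or less"
more_than_20 = total - cum_freq_at(points, 20)                           # total minus the reading
between_10_and_20 = cum_freq_at(points, 20) - cum_freq_at(points, 10)    # difference of two readings

print(less_than_15, more_than_20, between_10_and_20)
```

The same three patterns (read off, subtract from the total, subtract two readings) match the three runner questions above.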
[Table: Time (s), Frequency, Cum Freq. Cumulative Frequency Graph: plot Cumulative Frequency against Time (s). Frequency Polygon: sketch Frequency against Time (s) as a line]
Worksheet Cumulative Frequency Graphs Printed handout. Q5, 6, 7, 8, 9, 10 Reference: GCSE-GroupedDataCumFreq
Median = 34
Lower Quartile = 16
Upper Quartile = 44.5
We previously found: Minimum = 9, Maximum = 57, LQ = 16, Median = 34, UQ = 44.5
1 mark: Range/interquartile range of boys’ times is greater.
1 mark: Median of boys’ times is greater.
B C D A