HISTOGRAMS: Representing Data. Why use a Histogram? When there is a lot of data. When data is continuous: mass, height, volume, time etc. Presented in a.

1 HISTOGRAMS Representing Data

2 Why use a Histogram?
- When there is a lot of data
- When data is continuous: mass, height, volume, time etc.
- When data is presented in a Grouped Frequency Distribution, often in groups or classes that are UNEQUAL

3 Histograms look like this... Continuous data: NO GAPS between bars.

4 Bars may be different in width, as determined by the Grouped Frequency Distribution.

5 AREA is proportional to FREQUENCY, NOT height, because of UNEQUAL classes! So we use FREQUENCY DENSITY = Frequency / Class width

6 Grouped Frequency Distribution
Speed, km/h | 0 < v ≤ 40 | 40 < v ≤ 50 | 50 < v ≤ 60 | 60 < v ≤ 90 | 90 < v ≤ 110
Frequency | 80 | 15 | 25 | 90 | 30
Classes: these classes are well defined; there are no gaps!

7 Drawing
- Sensible scales
- Bases of rectangles correctly aligned: plot the class boundaries carefully
- Heights of rectangles need to be correct: Frequency Density

8 Frequency Densities
Speed, km/h | 0 < v ≤ 40 | 40 < v ≤ 50 | 50 < v ≤ 60 | 60 < v ≤ 90 | 90 < v ≤ 110
Frequency | 80 | 15 | 25 | 90 | 30
Class width | 40 | 10 | 10 | 30 | 20
Frequency Density | 2.0 | 1.5 | 2.5 | 3.0 | 1.5
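The frequency densities for the speed classes can be checked with a few lines of Python. This is only a sketch: the function name `frequency_densities` and the variable names are made up for illustration, and the classes are described by their boundaries (0, 40, 50, 60, 90, 110).

```python
def frequency_densities(boundaries, frequencies):
    """Return frequency / class width for each class.

    boundaries has one more entry than frequencies: class i runs from
    boundaries[i] to boundaries[i + 1].
    """
    if len(boundaries) != len(frequencies) + 1:
        raise ValueError("need exactly one more boundary than frequencies")
    widths = [hi - lo for lo, hi in zip(boundaries, boundaries[1:])]
    return [f / w for f, w in zip(frequencies, widths)]


# Speed data from the grouped frequency distribution (km/h).
speed_boundaries = [0, 40, 50, 60, 90, 110]
speed_frequencies = [80, 15, 25, 90, 30]

speed_densities = frequency_densities(speed_boundaries, speed_frequencies)
print(speed_densities)  # [2.0, 1.5, 2.5, 3.0, 1.5]
```

Multiplying each density by its class width gives back the original frequency, which is why the AREA of each bar represents the frequency.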

9 [Histogram: frequency density (0 to 3.0) against speed (km/h, 0 to 120)]
Frequency = Width x Height
Frequency = 40 x 2.0 = 80

10 Grouped Frequency Distribution
Time taken (nearest minute) | 5-9 | 10-19 | 20-29 | 30-39 | 40-59
Freq | 14 | 9 | 18 | 3 | 5
Classes: GAPS! Need to adjust to continuous.
Speed, km/h | 0 < v ≤ 40 | 40 < v ≤ 50 | 50 < v ≤ 60 | 60 < v ≤ 90 | 90 < v ≤ 110
Frequency | 80 | 15 | 25 | 90 | 30
Classes: no gaps. Ready to graph.

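11 Adjusting Classes
Time taken (nearest minute) | 5-9 | 10-19 | 20-29 | 30-39 | 40-59
Freq | 14 | 9 | 18 | 3 | 5
Class boundaries | 4½ to 9½ | 9½ to 19½ | 19½ to 29½ | 29½ to 39½ | 39½ to 59½
Class widths | 5 | 10 | 10 | 10 | 20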
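The adjustment can be sketched in Python. The helper name `class_boundaries` and its `precision` parameter are illustrative assumptions; the idea is simply that a time recorded to the nearest minute could be up to half a minute either side of the stated limit.

```python
def class_boundaries(limits, precision=1):
    """Turn stated class limits into continuous class boundaries.

    limits is a list of (lower, upper) pairs recorded to the nearest
    `precision` units, so each end is pushed out by half a unit.
    """
    half = precision / 2
    return [(lo - half, hi + half) for lo, hi in limits]


# Time classes as stated in the table (nearest minute).
time_limits = [(5, 9), (10, 19), (20, 29), (30, 39), (40, 59)]

time_bounds = class_boundaries(time_limits)
time_widths = [hi - lo for lo, hi in time_bounds]

print(time_bounds)  # [(4.5, 9.5), (9.5, 19.5), (19.5, 29.5), (29.5, 39.5), (39.5, 59.5)]
print(time_widths)  # [5.0, 10.0, 10.0, 10.0, 20.0]
```

Each upper boundary now equals the next lower boundary, so the gaps have gone and the bars will touch.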

12 Frequency Density
Time taken (nearest minute) | 5-9 | 10-19 | 20-29 | 30-39 | 40-59
Freq | 14 | 9 | 18 | 3 | 5
Class width | 5 | 10 | 10 | 10 | 20
Frequency Density | 2.8 | 0.9 | 1.8 | 0.3 | 0.25

13 Drawing
- Sensible scales
- Bases correctly aligned: plot the class boundaries
- Heights correct: Frequency Density

14 [Histogram: frequency density (0 to 3.0) against time (mins), with bars drawn on the class boundaries from 4.5 to 59.5]

15 Estimating a Frequency
Imagine we want to estimate the number of people with a time between 12 and 25 mins. Because our classes are rounded to the nearest minute, we consider the interval from 11.5 to 25.5.

16 [Histogram: frequency density against time (mins), with the interval from 11.5 to 25.5 marked]
Frequency = FD x Width
11.5 to 19.5: Frequency = 0.9 x 8 = 7.2
19.5 to 25.5: Frequency = 1.8 x 6 = 10.8
Total Frequency = 18
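The same estimate can be reproduced as an area calculation. This sketch assumes the data are spread evenly within each class (which is what reading areas off a histogram implies); the function name `estimate_frequency` is hypothetical.

```python
def estimate_frequency(boundaries, densities, start, end):
    """Estimate the frequency between start and end as area under the bars."""
    total = 0.0
    for lo, hi, density in zip(boundaries, boundaries[1:], densities):
        # Width of the part of this class that lies inside [start, end].
        overlap = min(hi, end) - max(lo, start)
        if overlap > 0:
            total += density * overlap
    return total


# Adjusted class boundaries and frequency densities for the time data.
time_boundaries = [4.5, 9.5, 19.5, 29.5, 39.5, 59.5]
time_densities = [2.8, 0.9, 1.8, 0.3, 0.25]

estimate = estimate_frequency(time_boundaries, time_densities, 11.5, 25.5)
print(round(estimate, 1))  # 18.0
```

Only the 9.5 to 19.5 and 19.5 to 29.5 classes overlap the interval, contributing 0.9 x 8 and 1.8 x 6 respectively.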

17 We can estimate the Mode
Time taken (nearest minute) | 5-9 | 10-19 | 20-29 | 30-39 | 40-59
Freq | 14 | 9 | 18 | 3 | 5
CF | 14 | 23 | 41 | 44 | 49
Mode is therefore in this class.

18 [Histogram: frequency density against time (mins), with the modal class marked]

19 …and the other one?
Speed, km/h | 0 < v ≤ 40 | 40 < v ≤ 50 | 50 < v ≤ 60 | 60 < v ≤ 90 | 90 < v ≤ 110
Frequency | 80 | 15 | 25 | 90 | 30
- Simpler to plot: no adjustments required, class widths are friendly, no ½ values
- Estimation from the EXACT values given: no adjustment required. An estimate from 15 to 56 would use 15 and 56!
- These appear LESS OFTEN in the exam

20 Why use frequency density for the vertical axis of a Histogram?
Unequal class sizes can make a histogram give misleading ideas about the data distribution, so the vertical axis is Frequency Density.

21 Example: Misprediction of Grade Point Average (GPA)
The following table displays the differences between predicted GPA and actual GPA. Positive differences result when predicted GPA > actual GPA.
Class Interval | Frequency | Class width
-2.0 to < -0.4 | 23 | 1.6
-0.4 to < -0.2 | 55 | 0.2
-0.2 to < -0.1 | 97 | 0.1
-0.1 to < 0 | 210 | 0.1
0 to < 0.1 | 189 | 0.1
0.1 to < 0.2 | 139 | 0.1
0.2 to < 0.4 | 116 | 0.2
0.4 to < 2.0 | 171 | 1.6
Total | 1000 |
[Frequency histogram of the differences; the lowest class is annotated "2.3% of data" and the highest class "17.1% of data"]
The frequency histogram considerably exaggerates the incidence of overpredicted and underpredicted values. The areas of the two most extreme rectangles are much too large!

22 Example: Density Histogram of Misreporting GPA
Class Interval | Frequency | Class width | Frequency Density
-2.0 to < -0.4 | 23 | 1.6 | 14
-0.4 to < -0.2 | 55 | 0.2 | 275
-0.2 to < -0.1 | 97 | 0.1 | 970
-0.1 to < 0 | 210 | 0.1 | 2100
0 to < 0.1 | 189 | 0.1 | 1890
0.1 to < 0.2 | 139 | 0.1 | 1390
0.2 to < 0.4 | 116 | 0.2 | 580
0.4 to < 2.0 | 171 | 1.6 | 107
Frequency = (rectangle height) x (class width) = area of rectangle
To avoid a misleading histogram like the one on the last slide, display the data with frequency density.
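A rough way to see how much the frequency histogram distorts the picture is to compare the share of the total bar area taken by the two extreme classes under each choice of height. This is a sketch using the class widths and frequencies from the table; the helper name `extreme_area_share` is made up for illustration.

```python
# Class widths and frequencies from the GPA table.
widths = [1.6, 0.2, 0.1, 0.1, 0.1, 0.1, 0.2, 1.6]
frequencies = [23, 55, 97, 210, 189, 139, 116, 171]


def extreme_area_share(heights, widths):
    """Fraction of the total bar area taken by the first and last bars."""
    areas = [h * w for h, w in zip(heights, widths)]
    return (areas[0] + areas[-1]) / sum(areas)


densities = [f / w for f, w in zip(frequencies, widths)]

share_if_height_is_frequency = extreme_area_share(frequencies, widths)
share_if_height_is_density = extreme_area_share(densities, widths)

print([round(d) for d in densities])           # [14, 275, 970, 2100, 1890, 1390, 580, 107]
print(round(share_if_height_is_frequency, 3))  # 0.761
print(round(share_if_height_is_density, 3))    # 0.194
```

With frequency as the height, the two extreme bars take about 76% of the area even though they hold only 19.4% of the data (2.3% + 17.1%). With frequency density as the height, the area share matches the data share.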

23 [Density histogram of the GPA differences; vertical axis: Frequency density x 10 -3]

24 Chap 2-24 Principles of Excellent Graphs
- The graph should not distort the data.
- The graph should not contain unnecessary things (sometimes referred to as chart junk).
- The scale on the vertical axis should begin at zero.
- All axes should be properly labelled.
- The graph should contain a title.
- The simplest possible graph should be used for a given set of data.

25 Graphical Errors: Chart Junk
Bad Presentation: [Chart "Minimum Wage" labelled 1960: $1.00, 1970: $1.60, 1980: $3.10, 1990: $3.80]
Good Presentation: [Chart "Minimum Wage": $ (0 to 4) against year (1960, 1970, 1980, 1990)]

26 Graphical Errors: No Relative Basis
A's received by students.
Bad Presentation: [Chart: Freq. (0 to 300) for FD, UG, GR, SR]
Good Presentation: [Chart: % (0% to 30%) for FD, UG, GR, SR]
FD = Foundation, UG = UG Dip, GR = Grad Dip, SR = Senior

27 Graphical Errors: Compressing the Vertical Axis
[Two charts of Quarterly Sales ($) for Q1 to Q4, labelled Bad Presentation and Good Presentation; one vertical axis runs 0 to 50 and the other 0 to 200]

28 Graphical Errors: No Zero Point on the Vertical Axis
Graphing the first six months of sales.
[Charts of Monthly Sales ($) for J, F, M, A, M, J, labelled Bad Presentation and Good Presentations; one vertical axis runs 36 to 45 with no zero point, the other includes 0 before 36 to 45]

