SUMMARIZING QUANTITATIVE DATA
A table or a bar graph showing the grouping of data values into classes with their respective frequencies is called the FREQUENCY DISTRIBUTION (or simply the DISTRIBUTION) of the data set.
The table is called the FREQUENCY TABLE. The bar graph is called the HISTOGRAM.
Each entry pairs a CLASS with its FREQUENCY.
FREQUENCY DISTRIBUTION OF A DISCRETE VARIABLE
Example 1: The scores of 20 students in a color sensitivity test are recorded in the following data table (1 – least sensitive; 7 – most sensitive):
5 6 4 3 2 1 7
FREQUENCY TABLE:
Score    Frequency
1        6
2        2
3        4
4        1
5        3
6        3
7        1
For discrete variables, the classes are simply the distinct data values.
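The tallying behind a frequency table for a discrete variable can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_table` and the ten sample scores are made up for the sketch and are not the data of Example 1.

```python
from collections import Counter

def frequency_table(values):
    """Return (class value, frequency) pairs, sorted by class value."""
    counts = Counter(values)        # one class per distinct data value
    return sorted(counts.items())

# Hypothetical scores (NOT the Example 1 data), used only to show the tally.
scores = [1, 3, 1, 5, 7, 3, 1, 2, 5, 6]

print("Score  Frequency")
for score, freq in frequency_table(scores):
    print(f"{score:>5}  {freq:>9}")
```

Values that never occur (score 4 in this sample) simply do not appear; a row with frequency 0 could be added if every possible score should be listed.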
HISTOGRAM:
Each class is represented by a rectangle.
The height of each rectangle corresponds to the frequency of the class it represents.
The base of each rectangle must have its midpoint at the class value.
All the rectangles must be adjacent to each other.
FREQUENCY DISTRIBUTION OF A CONTINUOUS VARIABLE
Example 2: The lengths (in cm) of 30 mango fruits are recorded in the following data table.
13 17 14 21 25 29 18 34 30 22 19 35 26 20 31 23 27 24 28
FREQUENCY TABLE:
Length (cm)    Frequency
12.5 – 16.5    7
16.5 – 20.5    9
20.5 – 24.5    6
24.5 – 28.5    2
28.5 – 32.5    4
32.5 – 36.5    3
For continuous variables, the classes are interval classes. In the interval class A – B, the number A is called the LOWER CLASS LIMIT and B is called the UPPER CLASS LIMIT.
HISTOGRAM:
Each class is represented by a rectangle.
The height of each rectangle corresponds to the frequency of the class it represents.
The base of each rectangle must have its endpoints at the class limits.
All the rectangles must be adjacent to each other.
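As a rough illustration of how interval classes turn raw values into frequencies and bars, the sketch below tallies values between consecutive class limits and prints a sideways text histogram, so bar length plays the role of rectangle height. The function names `tally` and `text_histogram` and the ten sample lengths are hypothetical; only the class limits are taken from Example 2.

```python
def tally(values, limits):
    """Count the values falling in each interval class.

    `limits` lists consecutive class limits, so [12.5, 16.5, 20.5]
    describes the two classes 12.5 – 16.5 and 16.5 – 20.5.
    """
    freqs = [0] * (len(limits) - 1)
    for v in values:
        for i in range(len(freqs)):
            # Class limits carry one more decimal place than the data,
            # so a value never sits exactly on a limit.
            if limits[i] < v < limits[i + 1]:
                freqs[i] += 1
                break
    return freqs

def text_histogram(limits, freqs):
    """One row per class; the bar length equals the class frequency."""
    rows = []
    for i, f in enumerate(freqs):
        rows.append(f"{limits[i]:>5} - {limits[i + 1]:<5} | {'#' * f}")
    return "\n".join(rows)

limits = [12.5, 16.5, 20.5, 24.5, 28.5, 32.5, 36.5]   # class limits from Example 2
lengths = [13, 14, 17, 18, 19, 21, 25, 26, 30, 35]    # hypothetical sample, not the mango data
freqs = tally(lengths, limits)
print(text_histogram(limits, freqs))
```

Because each upper class limit is also the next lower class limit, the bars cover the number line with no gaps, which is why the rectangles of the histogram are adjacent.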
HOW TO CONSTRUCT THE INTERVAL CLASSES
Data set: 13 17 14 21 25 29 18 34 30 22 19 35 26 20 31 23 27 24 28
1. Choose the tentative number of interval classes (at least 5).
   no. of interval classes: 6
2. Take note of the number of decimal places in the data. Choose the first lower class limit to be slightly lower than the lowest data value, with one more decimal place than the data and ending in ‘5’.
   Since the data set has no decimal places, the 1st lower class limit must have 1 decimal place and end in ‘5’.
   1st lower class limit: 12.5
3. Compute the class width:
   class width = (highest data value − lowest data value) / (no. of interval classes)
   The class width must have the same number of decimal places as the data; remove any excess decimal places and add 1 to the last digit.
   Here (35 − 13) / 6 = 3.66... Since the data set has no decimal places, the class width must also have none, so we remove the excess decimal places to get 3 and then add 1 to the last digit.
   class width: 4
4. To get the 1st upper class limit, add the class width to the 1st lower class limit. This upper class limit will also be the lower class limit of the next interval. Keep doing this until the upper class limit exceeds the highest data value.
   Interval classes: 12.5 – 16.5, 16.5 – 20.5, 20.5 – 24.5, 24.5 – 28.5, 28.5 – 32.5, 32.5 – 36.5
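The four steps above can be sketched in Python. This is an illustration under stated assumptions, not a definitive procedure: it assumes the class width in step 3 starts from the range of the data divided by the number of classes, and the function name `interval_classes` and its `decimals` parameter are invented for the sketch. `Decimal` is used so that limits such as 12.5 and 1.05 stay exact.

```python
from decimal import Decimal, ROUND_DOWN

def interval_classes(data, num_classes, first_lower, decimals=0):
    """Return (class width, list of (lower limit, upper limit)).

    `decimals` is the number of decimal places in the data (step 2).
    Assumption: the raw width is (highest - lowest) / num_classes.
    """
    values = [Decimal(str(x)) for x in data]
    lowest, highest = min(values), max(values)

    # Step 3: drop the excess decimal places, then add 1 to the last digit.
    unit = Decimal(1).scaleb(-decimals)   # 1 for whole numbers, 0.1 for one decimal place
    raw_width = (highest - lowest) / num_classes
    width = raw_width.quantize(unit, rounding=ROUND_DOWN) + unit

    # Step 4: keep adding the width until an upper limit exceeds the highest value.
    classes = []
    lower = Decimal(str(first_lower))
    while True:
        upper = lower + width
        classes.append((lower, upper))
        if upper > highest:
            break
        lower = upper
    return width, classes

data = [13, 17, 14, 21, 25, 29, 18, 34, 30, 22, 19, 35, 26, 20, 31, 23, 27, 24, 28]
width, classes = interval_classes(data, num_classes=6, first_lower="12.5")
print("class width:", width)
for lower, upper in classes:
    print(f"{lower} – {upper}")
```

For data with one decimal place, the same call would be made with `decimals=1` and a first lower class limit with two decimal places, such as "1.05". The number of classes produced is whatever step 4 yields, which is why step 1 calls the chosen number tentative.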
Example: Make a frequency table with 7 interval classes and with 1st lower class limit = 12.5 for the following data. 15 16 17 20 23 24 25 27 29 30 31 32 34 35 36 38 39 40 41 42 43 44 45 48 49 50 56 58
QUIZ
Make a frequency table and histogram for the following raw data.
#1. no. of interval classes = 7; 1st lower class limit = 20.5
22 23 24 25 26 27 28 29 30 31 32 33 34 35 37 38 39 42 43 47 48 54
#2. no. of interval classes = 7; 1st lower class limit = 18.5
20 27 31 33 35 38 40 42 45 47 48 49 50 52 55 56 57 58 59 62 63 64 65 66
Example (data set with decimal places): In a study of the one-way commuting distance of FEU students, a random sample of 60 students gives the following data (in km):
13.2 47.8 10.5 3.7 16.4 20.1 17.9 40.3 4.5 2.8 7 25.3 8 21.4 19.6 15.1 3.2 17.8 14.2 6.3 12.2 45.8 1.4 8.2 4.1 16.7 11.2 18.5 23.2 12.4 6 2.5 15.2 13 15.6 46.2 12.5 9.3 18.7 34.2 13.5 41.6 28.1 36 17.2 24 27.6 29.5 9.2 14.6 26.1 10.6 37 31.2 16.8 16
Make a frequency table and histogram with 6 interval classes and 1st lower class limit = 1.05.
BASIC DISTRIBUTION PATTERNS
Data set #1: 15 16 17 20 23 24 25 27 29 30 31 32 34 35 36 38 39 40 41 42 43 44 45 48 49 50 56 58
FREQUENCY TABLE:
Interval class    Frequency
12.5 – 19.5       3
19.5 – 26.5       6
26.5 – 33.5       8
33.5 – 40.5       10
40.5 – 47.5       7
47.5 – 54.5       4
54.5 – 61.5       2
HISTOGRAM: [figure]
NORMALLY DISTRIBUTED
The mean, median and mode are almost equal, and the shape of the distribution is “triangular” or “bell-shaped”. Most of the frequencies accumulate around the middle data values.
Data set #2: 22 23 24 25 26 27 28 29 30 31 32 33 34 35 37 38 39 42 43 47 48 54
FREQUENCY TABLE:
Interval class    Frequency
20.5 – 25.5       8
25.5 – 30.5       12
30.5 – 35.5
35.5 – 40.5       6
40.5 – 45.5       3
45.5 – 50.5       2
50.5 – 55.5       1
HISTOGRAM: [figure]
RIGHT SKEWED DISTRIBUTION
The mean is significantly greater than the median and mode, and the shape of the distribution is “thinned on the right side”. Most of the frequencies accumulate around the lower data values.
Data set #3: 20 27 31 33 35 38 40 42 45 47 48 49 50 52 55 56 57 58 59 62 63 64 65 66
FREQUENCY TABLE:
Interval class    Frequency
18.5 – 25.5       1
25.5 – 32.5       2
32.5 – 39.5       3
39.5 – 46.5       4
46.5 – 53.5       9
53.5 – 60.5       11
60.5 – 67.5       10
HISTOGRAM: [figure]
LEFT SKEWED DISTRIBUTION
The mean is significantly smaller than the median and mode, and the shape of the distribution is “thinned on the left side”. Most of the frequencies accumulate around the higher data values.
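The mean-versus-median comparison behind the three patterns can be sketched in Python. Several things here are assumptions made for illustration only: each data value is approximated by the midpoint of its interval class, the helper names `expand` and `pattern` are invented, and the tolerance of half a class width (3.5) is an arbitrary stand-in for "significantly", not a rule from the slides. The frequencies are those of the tables for Data set #1 and Data set #3.

```python
from statistics import mean, median

def expand(midpoints, freqs):
    """Approximate the raw data: repeat each class midpoint `freq` times."""
    values = []
    for m, f in zip(midpoints, freqs):
        values.extend([m] * f)
    return values

def pattern(values, tolerance):
    """Compare the mean with the median.

    `tolerance` is a hypothetical cutoff for 'almost equal'.
    """
    m, med = mean(values), median(values)
    if abs(m - med) <= tolerance:
        return "roughly symmetric"      # the pattern the text calls normally distributed
    return "right skewed" if m > med else "left skewed"

# Midpoints of the interval classes (e.g. 12.5 – 19.5 has midpoint 16).
set1 = expand([16, 23, 30, 37, 44, 51, 58], [3, 6, 8, 10, 7, 4, 2])
set3 = expand([22, 29, 36, 43, 50, 57, 64], [1, 2, 3, 4, 9, 11, 10])

print("Data set #1:", pattern(set1, tolerance=3.5))
print("Data set #3:", pattern(set3, tolerance=3.5))
```

A mean close to the median does not by itself establish a bell shape, so the result is best read alongside the histogram.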