Chapter 3: Descriptive Measures
Measures of Central Tendency: Mean, Median, Mode
Dispersion: Range, Variance & Standard Deviation
The 3 Ms: Mean, Median, Mode
Mean: the arithmetic average
Median: the midpoint of the distribution
Mode: the most frequent response
Which one to use? It depends on the type of data you have:
Nominal data: mode
Ordinal data: mode and median
Interval/ratio data: mode, median, and mean
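Mean (Arithmetic)
The most common measure of central tendency (CT). It is affected by extreme values (outliers).
Sample mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, where n is the sample size.
Population mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$, where N is the population size.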
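As a minimal sketch of the definition above, the mean can be computed directly and its sensitivity to an outlier checked. The grades below are hypothetical and are not taken from the slides; `arithmetic_mean` is an illustrative helper name.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Hypothetical grades (an assumption for illustration only).
grades = [70, 72, 75, 78, 80]
# The same grades plus one extreme low value (an outlier).
grades_with_outlier = grades + [5]

print(arithmetic_mean(grades))               # 75.0
print(arithmetic_mean(grades_with_outlier))  # about 63.33
```

A single extreme value pulls the mean down by more than 11 points here, which is why the mean alone can misrepresent data that contain outliers.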
Example: All the students in the IT A3-04 class are considered the population. Their course grades are:
(i) Give the formula for the population mean.
(ii) Compute the mean course grade.
Solution:
(i) The population mean is $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$ (see the previous slide).
(ii) With N = 10 grades, µ = (sum of the 10 grades)/10 = 77.6
Median
The mean is affected by extreme values, so it represents a data set poorly when the set contains one or two very large or very small values. The center of such data is better described by the median.
In an ordered array, the median is the middle value.
The position of the median is (n+1)/2.
If the number of values is odd, the median is the middle number.
If the number of values is even, the median is the average of the two middle numbers.
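The odd/even rule above can be sketched as a short function. The price lists are hypothetical and `median_value` is an illustrative name; note that position (n+1)/2 in 1-based counting corresponds to index n // 2 in Python's 0-based indexing.

```python
def median_value(values):
    """Return the median: the middle value of the ordered data."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: 1-based position (n+1)/2 is 0-based index n // 2.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical prices (an assumption for illustration only).
odd_prices = [450, 100, 300, 250, 500]
even_prices = [100, 250, 300, 450]

print(median_value(odd_prices))   # 300
print(median_value(even_prices))  # 275.0
```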
Example: The prices ordered at CP are:
Calculate the median of the data set above.
Solution:
Ordered array:
Median position = (n+1)/2 = (5+1)/2 = 3, so the median is the 3rd value of the ordered array.
Median = 300
Mode
A measure of central tendency
The value that occurs most often
Not affected by extreme values
There may be no mode
There may be several modes
Used for numerical or categorical data
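The three cases listed above (one mode, several modes, no mode) can be sketched with a frequency count. The data are hypothetical, `find_modes` is an illustrative name, and treating "no value repeats" as "no mode" is a convention assumed here.

```python
from collections import Counter


def find_modes(values):
    """Return a list of the most frequent values (empty if nothing repeats)."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    # Assumed convention: if every value occurs once, there is no mode.
    if highest == 1 and len(counts) > 1:
        return []
    return [value for value, count in counts.items() if count == highest]


# Hypothetical data (assumptions for illustration only).
ages = [18, 20, 20, 21, 20, 19, 22, 20, 18]
colors = ["red", "blue", "red", "blue", "green"]  # categorical data
distinct = [1, 2, 3]

print(find_modes(ages))      # [20]
print(find_modes(colors))    # ['red', 'blue']
print(find_modes(distinct))  # []
```

Because the function only counts occurrences, it works for categorical values such as colors as well as for numbers.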
Example: The following are the ages of the 9 people in CP:
What is the mode of the data set?
Solution: The data set shows that the age 20 appears more often than any other age, so the mode is 20.
Why measure dispersion?
It tells us how much the data vary (spread out).
Imagine you go to a restaurant and have a very good meal, so you return for the same meal the next day. This time, however, the food tastes very bad, and you never go back. The standard of quality has varied.
The mean or the median only locates the center of the data; it does not tell us about the spread of the data.
A measure of dispersion can be used to evaluate the reliability of two or more means (averages).
Range
The simplest measure of dispersion is the range. It is the difference between the highest and the lowest values:
Range = Highest value - Lowest value
Example:
In the morning IT class, students’ grades are
In the evening IT class, students’ grades are
IT morning class: Range = 90 - 45 = 45; Mean = (sum of the 9 grades)/9 = 65.11
IT evening class: Range = 70 - 40 = 30; Mean = (sum of the 10 grades)/10 = 55
The range of grades in the IT morning class is greater than the range in the IT evening class. Thus we can conclude that there is greater dispersion in the morning class grades than in the evening class grades, so the morning class grades are not clustered as closely around their mean of 65.11 as the evening class grades are around their mean of 55. Thus, the mean of 55 is more reliable than the mean of 65.11.
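The range comparison can be sketched in a few lines. The grade lists below are hypothetical: only their highest and lowest values (90 and 45, 70 and 40) match the example, and every value in between is an assumption made for illustration.

```python
def value_range(values):
    """Return the highest value minus the lowest value."""
    if not values:
        raise ValueError("the range of an empty data set is undefined")
    return max(values) - min(values)


# Hypothetical grades; only the extremes come from the example.
morning = [45, 58, 66, 73, 90]
evening = [40, 52, 55, 61, 70]

print(value_range(morning))  # 45
print(value_range(evening))  # 30
```

Since only two values enter the calculation, changing any of the grades in between leaves the range untouched.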
Variance and Standard Deviation
The defect of the range is that it is based on only two values: the highest and the lowest.
Variance is the arithmetic mean of the squared deviations from the mean. It is an important measure of variation.
Standard deviation is the positive square root of the variance. It is the most important measure of variation.
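Population variance: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2$, where µ is the population mean and N is the population size.
Population standard deviation: $\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2}$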
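Sample variance: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$, where $\bar{x}$ is the sample mean and n is the sample size.
Sample standard deviation: $s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$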
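The population and sample versions differ only in the divisor (N versus n - 1), which a small sketch makes visible. The data set is hypothetical and the function names are illustrative.

```python
import math


def population_variance(values):
    """Mean of the squared deviations from the mean (divide by N)."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


# Hypothetical data (an assumption for illustration only); the mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_variance(data))             # 4.0
print(math.sqrt(population_variance(data)))  # 2.0
print(sample_variance(data))                 # about 4.57
print(math.sqrt(sample_variance(data)))      # about 2.14
```

Python's standard `statistics` module offers `pvariance`, `pstdev`, `variance`, and `stdev` for the same four quantities.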
Small vs. large standard deviation
[Figure: small standard deviation vs. large standard deviation]
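A rough numerical illustration of the contrast, using two hypothetical data sets that share the same mean but differ in spread (both lists are assumptions, not data from the slides):

```python
import statistics

# Hypothetical data sets with the same mean (50) but different spread.
tight = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

for name, values in (("tight", tight), ("wide", wide)):
    mean = statistics.mean(values)
    # pstdev treats the list as a whole population (divides by N).
    sd = statistics.pstdev(values)
    print(f"{name}: mean = {mean}, standard deviation = {sd:.2f}")
```

Both means are 50, but the values in `tight` sit within a couple of points of it, while those in `wide` lie up to 40 points away. The smaller the standard deviation, the better the mean represents the individual values.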
Example: Compute the population variance and standard deviation for each of the two sets of students’ grades from the previous example. Then draw a conclusion based on the two standard deviations.
Exercises
1. Based on the given data set (sample): 5, 4, 2, 7, 4, 8, and 2
a. Compute the variance.
b. Determine the sample standard deviation.
2. Listed below are self-service prices for a sample of 16 retail stores:
a. What is the arithmetic mean selling price?
b. What is the median selling price?
c. What is the modal selling price?
3. Listed below are the numbers of boxes of cigarettes produced daily by the Luxury cigarette manufacturer:
a. Find the mean, median, and mode of the data set.
b. Calculate the standard deviation.
c. Draw a polygon to present the data set and comment on the distribution of the data.