1
Descriptive Statistics
26134 Business Statistics, Week 3 Tutorial: Descriptive Statistics
Key concepts in this tutorial are listed below:
Measures of Central Tendency
Measures of Dispersion/Spread
2
In statistics we usually want to analyse a population, but collecting data for the whole population is usually impractical, expensive, or simply not possible. That is why we collect samples from the population (sampling) and make inferences about the population parameters using the statistics of the sample (inference) with some level of accuracy (confidence level).
A population is a collection of all possible individuals, objects, or measurements of interest.
A sample is a subset of the population of interest.
3
Measures of Central Tendency:
The mean is the average of a set of numbers. It is applicable to interval and ratio data and is affected by every value in the data set, including extreme values.
The median is the middle value of a set of numbers after they have been arranged in order. It is applicable to ordinal, interval, and ratio data and is unaffected by extremely large and extremely small values.
The mode is the most frequently occurring value in a data set. It is applicable to all levels of data measurement (nominal, ordinal, interval, and ratio).
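The three measures above can be sketched in Python with the standard `statistics` module. The numbers and colour labels below are hypothetical and are not taken from the tutorial; they are chosen only to show that one extreme value moves the mean but leaves the median unchanged, and that the mode also works for nominal data.

```python
import statistics

# Hypothetical interval/ratio data (not from the tutorial).
values = [10, 12, 12, 13, 15]
# The same data with the largest value replaced by an extreme value.
with_outlier = [10, 12, 12, 13, 150]

mean_plain = statistics.mean(values)
mean_outlier = statistics.mean(with_outlier)
median_plain = statistics.median(values)
median_outlier = statistics.median(with_outlier)
mode_plain = statistics.mode(values)

# Hypothetical nominal data: only the mode is meaningful here.
colours = ["black", "red", "black", "blue", "black"]
mode_colour = statistics.mode(colours)

print("mean:", mean_plain, "->", mean_outlier)        # the mean shifts
print("median:", median_plain, "->", median_outlier)  # the median does not
print("mode (numeric):", mode_plain)
print("mode (nominal):", mode_colour)
```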
4
Measures of Dispersion/Spread:
RANGE is the difference between the largest and the smallest values in a set of data.
VARIANCE is the average of the squared deviations from the arithmetic mean.
STANDARD DEVIATION is the square root of the variance. Variance and standard deviation indicate the variability of the measurements.
COEFFICIENT OF VARIATION (CV) is the ratio of the standard deviation to the mean, expressed as a percentage. It is used to compare the variability of data sets with different means.
INTER-QUARTILE RANGE (IQR) is the range of values between the first and third quartiles, i.e. IQR = Q3 - Q1. It is a useful measure for ordinal data.
Z SCORE is the number of standard deviations a value (x) lies above or below the mean of a set of numbers. It is used for normally distributed data. The normal distribution is covered in the lecture on probability distributions.
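As a rough sketch, the measures of spread defined above can be computed with Python's standard `statistics` module. The data set is hypothetical. Two assumptions are worth flagging: `statistics.variance` and `statistics.stdev` compute the sample versions (divisor n - 1), and `statistics.quantiles` uses its default "exclusive" method, so quartiles (and hence the IQR) may differ slightly from values obtained with another textbook or spreadsheet convention.

```python
import statistics

# Hypothetical sample (not from the tutorial).
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: largest value minus smallest value.
data_range = max(data) - min(data)

mean = statistics.mean(data)

# Sample variance and standard deviation (divisor n - 1, an assumption).
variance = statistics.variance(data)
std_dev = statistics.stdev(data)

# Coefficient of variation: standard deviation relative to the mean, in %.
cv_percent = std_dev / mean * 100

# Quartiles with the module's default "exclusive" method (an assumption).
q1, _median, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1

# Z score of one observation: distance from the mean in standard deviations.
x = 9
z_score = (x - mean) / std_dev

print("range:", data_range)
print("variance:", round(variance, 3))
print("standard deviation:", round(std_dev, 3))
print("CV (%):", round(cv_percent, 2))
print("IQR:", iqr)
print("z score of", x, ":", round(z_score, 3))
```

Use `statistics.pvariance` and `statistics.pstdev` instead if the data are the whole population rather than a sample.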
5
a). nominal
b). mode
c). N/A (NOTE: none of the measures of dispersion discussed in this course is meaningful for a nominal variable)
d). The mode is 11, which means the most preferred colour for a laptop bag is black.
6
a). ordinal
b). mode and median
c). IQR is the most appropriate measure of dispersion for ordinal data.
d). The median = mode = 11 and the IQR = 2

a). interval
b). mode (= 11), median (= 11) and mean (= 10.3)
c). Range (= 4), Variance (= 1.9), Standard deviation (= 1.4) and IQR (= 2)
d). In general, men tend to have a shoe size between 10 and 11, although sizes range from 8 to 12, with a standard deviation of 1.38 about the mean of 10.3 and an IQR of 2 around the median.
7
a). Mean = 50
b). Median = 50
c). Mode = 40 and 60
d). Range = 20
e). Variance = 100
f). Standard Deviation = 10
Both the mean and the median are 50, which means that, in general, the fresh graduate salary in small companies is $50,000. The data suggest a possibly bimodal distribution, with the two most frequent salaries being $40,000 and $60,000. The difference between the highest and the lowest salaries is $20,000, with a standard deviation of $10,000 about the average salary.
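The salary data for this question are not reproduced in this transcript. Purely as an illustration, the hypothetical five-salary sample below (in thousands of dollars) happens to give the same six answers when the sample variance (divisor n - 1) is used. It is not claimed to be the tutorial's data set.

```python
import statistics

# Hypothetical salaries in $'000 (an assumption, not the tutorial's data).
salaries = [40, 40, 50, 60, 60]

mean = statistics.mean(salaries)
median = statistics.median(salaries)
# multimode returns every value tied for most frequent.
modes = statistics.multimode(salaries)
salary_range = max(salaries) - min(salaries)
# Sample variance and standard deviation (divisor n - 1).
variance = statistics.variance(salaries)
std_dev = statistics.stdev(salaries)

print("mean:", mean)
print("median:", median)
print("modes:", modes)
print("range:", salary_range)
print("variance:", variance)
print("standard deviation:", std_dev)
```

With the population formulas (`statistics.pvariance`, divisor n) the same hypothetical data would give a variance of 80 rather than 100, so the choice of divisor matters when checking answers.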
8
The most appropriate measure of dispersion for comparing the two cities is the coefficient of variation. Since Sydney's CV (9.78%) is lower than Perth's CV (13.31%), we can conclude that Sydney has the lower dispersion/spread.
In Sydney: the median and the mode of the level of satisfaction are both 3.
In Perth: the median is 4 and the mode is 5.
Overall, Perth has a higher customer satisfaction level than Sydney.