Numerical Descriptive Statistics
Business Statistics Topic 3 Numerical Descriptive Statistics
Business Statistics:Topic 3
Learning Objectives
By the end of this topic you will be able to:
Describe and apply the measures of central location: mean, mode and median.
Determine the significance of the skewness of a distribution.
Describe and apply non-central locations: quartiles.
Describe and apply the measures of variation: range, interquartile range and standard deviation.
Thinking
What is descriptive statistics?
How can we get the various features of a set of data?
How can we summarize the data?
How would you judge which salesperson should receive the bonus?
The management of a company is looking at the performance of two of its salespersons to decide who should receive a bonus.
Number of sales per month:
Month   Salesperson A   Salesperson B
Jan     1100            800
Feb     1400            1800
Mar     1200            600
Apr     1300            1900
May     1100            2000
Jun     1500            500
Calculating the average
Average monthly sales for Salesperson A: 1267
Average monthly sales for Salesperson B: 1267
Measures of Central Tendency
Generally a data set shows a distinct tendency to group about a certain central point or location.
The measures of this central location are:
Mean
Mode
Median
Mean
Definition: the arithmetic average, and the most common measure of central tendency.
Examples: the average rate of return on different investments; the average waiting time for service.
Note:
All values are included in computing the mean.
A set of data has a unique mean.
The mean is affected by unusually large or small data points (outliers or extreme values).
Sample Mean
For raw data:
$$\bar{x} = \frac{\sum x}{n}$$
where x denotes the data values and n is the total number of data values.
Example: the annual incomes of four middle-management employees at Westinghouse are $42,900, $49,100, $38,300 and $56,800.
The average income is (42,900 + 49,100 + 38,300 + 56,800)/4 = $46,775.
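As a quick check of the income example, the sample mean can be sketched in Python. The variable names are illustrative choices, not part of the slides.

```python
from statistics import mean

# Annual incomes from the example above, in dollars.
incomes = [42_900, 49_100, 38_300, 56_800]

# Sample mean: the sum of the values divided by the number of values.
by_formula = sum(incomes) / len(incomes)

print(by_formula)  # 46775.0
```

The standard-library function `statistics.mean` returns the same value for this list.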
Effect of Outliers
The mean of the original (sorted) data is 6.
When an extreme value, 29, is included, the mean of the (sorted) data is 8.
Note: the mean is shifted to the right because of 29 (an outlier, a high extreme value).
What would happen to the mean if you added a low extreme value?
Properties
The mean is unique for a given set of data.
The disadvantage of the mean is that it is affected by the presence of extreme or unusual values, which we call outliers.
Example: 85, 42, 89, 88, 89, 79, 95, 90, 93, 82
If there are outliers or extreme values, the arithmetic mean will not give an accurate idea of the location of most of the values, so the mean loses its representativeness of the data.
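To see how much the single low value 42 pulls the mean down in the data above, here is a small Python sketch. Dropping the outlier is done only for comparison; it is not a recommendation from the slides.

```python
from statistics import mean

scores = [85, 42, 89, 88, 89, 79, 95, 90, 93, 82]

# Same data with the low extreme value removed, for comparison only.
without_outlier = [x for x in scores if x != 42]

print(mean(scores))                     # 83.2
print(round(mean(without_outlier), 2))  # 87.78
```

Nine of the ten values are 79 or higher, yet the mean of the full data set is only 83.2.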
Mode
Definition: the most frequent data value, that is, the value with the highest frequency.
Note:
The mode is not affected by extreme values.
There may not be a mode.
There may be several modes.
It can be used for either numerical or categorical data.
Example (raw data): mode = 5, because 5 occurs most frequently of all the values in the raw data.
Properties
The mode can be used with either numerical or categorical data.
The mode for a given set of data is not necessarily unique: there may be no mode, or more than one mode.
The mode is not affected by outliers.
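The three cases (one mode, several modes, no mode) can be sketched in Python. The helper `modes` and the colour list are hypothetical, and treating "no value repeats" as "no mode" is an assumption about the convention used here.

```python
from collections import Counter

def modes(data):
    """Return every value that occurs most often; [] if no value repeats."""
    counts = Counter(data)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []  # every value occurs once: treated here as "no mode"
    return [value for value, count in counts.items() if count == top]

scores = [85, 42, 89, 88, 89, 79, 95, 90, 93, 82]
colours = ["red", "blue", "red", "blue", "green"]  # hypothetical categorical data

print(modes(scores))     # [89]
print(modes(colours))    # ['red', 'blue']
print(modes([1, 2, 3]))  # []
```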
Median
The median is the value that splits a ranked set of data into two equal parts: the middle observation after the data is sorted (in ascending or descending order).
1. If n is odd, the median is the value of the ((n+1)/2)th ordered observation.
2. If n is even, the median is the mean of the (n/2)th and (n/2 + 1)th ordered observations.
Note: the median is not affected by extremely large or small values and is therefore a valuable measure of central tendency when such values occur.
Calculating the Median
Example (odd n): sort the data in ascending order. Here n (the number of values) = 7, an odd number. The middle position is (7+1)/2 = 4, the 4th position. The 4th observation is 5, therefore the median is 5.
Example (even n): sort the data first. Here n = 8, an even number. There are two middle positions, n/2 and the next one: the 4th and 5th positions. The 4th observation is 5 and the 5th is 7. The average of 5 and 7 is (5+7)/2 = 6, therefore the median is 6.
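The position rule can be sketched in Python as follows. The two data sets are hypothetical, chosen only so that the 4th and 5th sorted observations are 5 and 7 as in the examples above. The function name is also an illustrative choice.

```python
def median_by_position(data):
    """Median found by sorting and picking the middle position(s)."""
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)th observation (lists are 0-indexed).
        return ordered[(n + 1) // 2 - 1]
    # Even n: average of the (n / 2)th and (n / 2 + 1)th observations.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

odd_data = [9, 1, 5, 3, 8, 4, 7]  # hypothetical, n = 7
even_data = odd_data + [10]       # hypothetical, n = 8

print(median_by_position(odd_data))   # 5
print(median_by_position(even_data))  # 6.0
```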
Effect of Outlier
The median of the original (sorted) data is 5.
The median of the (sorted) data with 29 included is also 5.
Note: the median is not affected by 29 (the outlier).
Applying the Measures of Central Tendency
Mean: when the data (polygon) is asymmetrical, or skewed, the mean loses its representativeness.
Median: with skewed distributions, the median is the preferred measure of central tendency. The skew would need to be very strong, however, before the mean is rejected.
Mode: when there is no mode, or there are more than two modes, it is not a useful representation of the data. However, the mode is very suitable for qualitative data.
Non-Central Locations: Quartiles
Calculating quartiles: divide the sorted data into four equal parts.
Uses of quartiles:
Quartiles are often used with scores for aptitude tests.
They are also used in commerce and industry when a large number of observations is involved.
Quartiles
First quartile (Q1): 25% of the observations are below it and 75% are above.
Second quartile (Q2): 50% of the observations are below it and 50% are above.
Third quartile (Q3): 75% of the observations are below it and 25% are above.
Procedure for finding Quartiles
Suppose n/4 is not a whole number. Let m be the next whole number larger than n/4.
Then the lower quartile is the mth observation of the sorted data, counting from the lower end.
The upper quartile is the mth observation of the sorted data, counting from the upper end.
Procedure for finding Quartiles (continued)
Suppose n/4 is a whole number. Let m = n/4.
Then the lower quartile is the average of the mth and (m+1)th observations of the sorted data, counting from the lower end.
The upper quartile is similarly defined, counting from the upper end.
Quartiles: example
Consider the data (sort the data first).
Here n = 7 and 7/4 = 1.75, which is not a whole number. The next whole number is 2.
Therefore the lower quartile is the 2nd observation in the sorted data from the lower end, which is 2.
The upper quartile is the 2nd observation in the sorted data from the upper end, which is 8.
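The two-case procedure for quartiles can be sketched in Python. The seven-value list below is hypothetical, picked so that the 2nd observation from each end is 2 and 8 as in the example. The function name is an assumption, and other textbooks and software use different quartile conventions.

```python
import math

def quartiles(data):
    """Lower and upper quartiles by the counting rule described above."""
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("quartiles of empty data are undefined")
    if n % 4 != 0:
        m = math.ceil(n / 4)    # next whole number larger than n/4
        lower = ordered[m - 1]  # mth observation from the lower end
        upper = ordered[-m]     # mth observation from the upper end
    else:
        m = n // 4
        # Average of the mth and (m+1)th observations from each end.
        lower = (ordered[m - 1] + ordered[m]) / 2
        upper = (ordered[-m] + ordered[-m - 1]) / 2
    return lower, upper

print(quartiles([5, 1, 9, 2, 7, 8, 4]))     # (2, 8)    n = 7, m = 2
print(quartiles([1, 2, 3, 4, 5, 6, 7, 8]))  # (2.5, 6.5)  n = 8, m = 2
```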
2. Measures of Variation
In addition to measures of central tendency, it is often desirable to consider measures of dispersion.
Suppose you are comparing the performance of salespersons who have equal average sales: the average alone cannot tell you who has performed better.
Only the spread of the data can tell us who is more consistent in generating sales.
Start Thinking
How would you judge which salesperson should receive the bonus?
2 Measures of Variability (Dispersion)
The measures of dispersion we will be discussing are:
Range
Interquartile range
Standard deviation
Observing and Thinking
Observe these two data sets.
Small variability: the average value provides a good representation of the observations in the data set.
This data set is now changing to...
Measures of variability
Observe two hypothetical data sets.
Small variability: the average value provides a good representation of the observations in the data set.
Larger variability: the same average value does not provide as good a representation of the observations in the data set as before.
2.1 Range
The range of a set of observations is the difference between the largest and smallest observations.
Its major advantage is the ease with which it can be computed.
Its major shortcoming is its failure to provide information on the dispersion of the observations between the two end points.
But how are the observations spread out between the smallest and the largest observation? The range cannot help answer this question.
Range
The range measures the total spread in the data set.
For example, the range could be used to measure:
the temperature fluctuation in a given day
the movement of share prices
Effect of Outliers
The range is affected by the presence of outliers.
It does not take into account how the data is distributed between the highest and the lowest values.
Interquartile Range
The interquartile range is defined as the difference between the upper and lower quartiles: IQR = Q3 - Q1.
It measures the spread in the middle 50% of the data.
It is not affected by outlying observations.
It does not consider how the data is distributed.
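A rough Python sketch comparing the two measures on the ten scores used earlier in this topic. It applies the quartile counting rule from the procedure slides. The second list, with 42 replaced by 2, is a hypothetical variation added only to show how each measure reacts to a more extreme outlier.

```python
import math

def quartiles(data):
    """Lower and upper quartiles by the counting rule from the procedure slides."""
    ordered = sorted(data)
    n = len(ordered)
    if n == 0:
        raise ValueError("quartiles of empty data are undefined")
    if n % 4 != 0:
        m = math.ceil(n / 4)
        return ordered[m - 1], ordered[-m]
    m = n // 4
    return (ordered[m - 1] + ordered[m]) / 2, (ordered[-m] + ordered[-m - 1]) / 2

def spread(data):
    """Return (range, interquartile range)."""
    q1, q3 = quartiles(data)
    return max(data) - min(data), q3 - q1

scores = [85, 42, 89, 88, 89, 79, 95, 90, 93, 82]
more_extreme = [2 if x == 42 else x for x in scores]  # hypothetical variation

print(spread(scores))        # (53, 8)
print(spread(more_extreme))  # (93, 8)
```

The range jumps from 53 to 93 when the outlier becomes more extreme, while the interquartile range stays at 8.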
2.3 Standard Deviation
Definition: a measure of the variation of data from the mean.
It is the most commonly used measure of variation.
It is represented by the symbol s.
It shows how the data is distributed around the mean.
Calculation of SD from a sample
$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}}$$
where X is a sample data value, $\bar{X}$ is the sample mean and n is the sample size.
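A minimal Python sketch of the sample standard deviation, assuming the usual n - 1 divisor for a sample. The data values and the function name are hypothetical, not taken from the slides.

```python
import math
from statistics import stdev

def sample_sd(data):
    """Sample standard deviation with the n - 1 divisor."""
    n = len(data)
    if n < 2:
        raise ValueError("at least two observations are needed")
    x_bar = sum(data) / n
    squared_deviations = sum((x - x_bar) ** 2 for x in data)
    return math.sqrt(squared_deviations / (n - 1))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample, mean = 5

print(round(sample_sd(data), 3))  # 2.138
```

The standard-library function `statistics.stdev` uses the same n - 1 divisor and should agree with this result.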
Example
Here = 81
Comparing Standard Deviations
Data A: mean = 13.5, s = 3.338
Data B: mean = 13.5, s = 0.9258
Data C: mean = 13.5, s = 4.57
% of data within one SD of the mean
The percentage of data that falls in the interval $(\bar{x} - s, \bar{x} + s)$: for bell-shaped data, approximately 68% (roughly two-thirds) of the data lies within one standard deviation of the mean.
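One way to check this rule of thumb is a small simulation. The sketch below assumes normally distributed (bell-shaped) data. The mean of 50, the standard deviation of 10 and the seed are arbitrary illustrative choices.

```python
import random
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed so the run is repeatable
data = [rng.gauss(50, 10) for _ in range(10_000)]  # simulated bell-shaped data

x_bar, s = mean(data), stdev(data)

# Count the observations inside the interval (x_bar - s, x_bar + s).
inside = sum(1 for x in data if x_bar - s <= x <= x_bar + s)
share = inside / len(data)

print(f"{share:.1%} of the simulated values lie within one SD of the mean")
```

For data that is strongly skewed or has heavy tails, the share can differ noticeably from this figure.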
Sample Variance
The sample variance, s^2, is the square of the sample standard deviation.
Understanding Variation in Data
The more spread out, or dispersed, the data are, the larger the range, interquartile range and standard deviation will be.
Relative location: z-score
The mean and standard deviation can be used together to learn about the relative location, z, of an observation in a data set:
$$z = \frac{x - \bar{x}}{s}$$
where $\bar{x}$ is the sample mean and s is the sample standard deviation.
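As an illustration, the z-score of each of the ten scores used earlier in this topic can be computed as below. The function name is an assumption, and `statistics.stdev` is used for the sample standard deviation.

```python
from statistics import mean, stdev

def z_scores(data):
    """z = (x - sample mean) / sample standard deviation, for each x."""
    x_bar = mean(data)
    s = stdev(data)
    if s == 0:
        raise ValueError("all values are equal, so z-scores are undefined")
    return [(x - x_bar) / s for x in data]

scores = [85, 42, 89, 88, 89, 79, 95, 90, 93, 82]

for x, z in zip(scores, z_scores(scores)):
    print(x, round(z, 2))
```

The value 42 lies more than two standard deviations below the mean, which is one common way to flag it as a possible outlier.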
Shape of distribution
What factors influence the distribution of a set of data?
The shape of a distribution is determined by the relative positions of the measures of central tendency.
Shape of Distribution
Symmetric distribution: mode = median = mean
Right Skewed Distribution
A small proportion of relatively large extreme values pulls the polygon to the right.
Note: skewness is a measure of asymmetry.
Left Skewed Distribution
A small proportion of extreme low values pulls the polygon to the left.
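Because extreme values pull the mean but not the median, comparing the two gives a rough indication of the direction of skew. The sketch below is a heuristic only, not a formal test. The function name and the second and third data sets are hypothetical.

```python
from statistics import mean, median

def skew_direction(data):
    """Rough guide: which way do extreme values pull the mean?"""
    m, med = mean(data), median(data)
    if m > med:
        return "right"
    if m < med:
        return "left"
    return "no skew indicated"

scores = [85, 42, 89, 88, 89, 79, 95, 90, 93, 82]  # low extreme value: 42
waits = [1, 2, 2, 3, 3, 3, 4, 29]                  # hypothetical, high extreme value

print(skew_direction(scores))           # left
print(skew_direction(waits))            # right
print(skew_direction([1, 2, 3, 4, 5]))  # no skew indicated
```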
The Five-Number Summary
Smallest observation
Lower quartile
Median
Upper quartile
Largest observation
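The five numbers usually reported in such a summary are the smallest observation, the lower quartile, the median, the upper quartile and the largest observation. These are also the values a box-and-whisker plot is drawn from. The sketch below uses Python's `statistics.quantiles` with the "inclusive" method, which interpolates between observations. That is a different quartile convention from the counting rule given earlier in this topic, so the quartiles can differ slightly.

```python
from statistics import quantiles

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using interpolated quartiles."""
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    return min(data), q1, q2, q3, max(data)

scores = [85, 42, 89, 88, 89, 79, 95, 90, 93, 82]

smallest, q1, med, q3, largest = five_number_summary(scores)
print(smallest, q1, med, q3, largest)
```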
Box-and-whisker Plot
Symmetrical Distribution
The median is at an equal distance from either quartile.
The whiskers have the same length.
Symmetrical Distribution
Consider the following data:
Skewed to the Left
The median is closer to the third quartile.
The left whisker is much longer than the right whisker.
Skewed to the Left: example
Consider the following data:
Skewed to the Right
The median is closer to the first quartile.
The right whisker is much longer than the left whisker.
Skewed to the Right: example
Consider the following data:
Assessing the strength of skewness
A set of data is said to be significantly skewed
Population Measures
Mean:
$$\mu = \frac{\sum x}{N}$$
Standard deviation:
$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$
where N is the population size.
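When the data make up the whole population, the standard deviation is usually computed with divisor N rather than n - 1. The sketch below shows the difference on a hypothetical data set. The function name is an illustrative choice.

```python
import math
from statistics import pstdev, stdev

def population_sd(values):
    """Population standard deviation: divide by N, not N - 1."""
    N = len(values)
    if N == 0:
        raise ValueError("population must not be empty")
    mu = sum(values) / N
    return math.sqrt(sum((x - mu) ** 2 for x in values) / N)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values, mean = 5

print(population_sd(data))    # 2.0
print(round(stdev(data), 3))  # 2.138 (sample formula, for comparison)
```

The standard library provides both conventions: `statistics.pstdev` for a population and `statistics.stdev` for a sample.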
Ethical Considerations
Results need to be presented in a fair, objective and neutral manner.
Learn to differentiate between poor presentation and unethical presentation.
Summary
In this topic you have:
Looked at calculating and applying the measures of central tendency: mean, mode and median
Reviewed the concept of central tendency and the shape of distributions
Looked at calculating and applying the measures of dispersion: range, interquartile range and standard deviation
Main points
Review what we have learned
Raise a question about variation (lead-in)
Summary measure of variation, the standard deviation: definition, properties, calculation, application and meaning
Visual presentation: bell-shaped curve; box-and-whisker plot
Using Excel: standard deviation, box-and-whisker plot
Final summary
Key terms: range, standard deviation, standard score (z-score), shape of distribution