Published by Wilfred Ross Roberts. Modified over 9 years ago.
Slide 1: IS 310 – Business Statistics, CSU Long Beach
Slide 2: Measures of Distribution Shape, Relative Location, and Detecting Outliers
- Distribution Shape
- Chebyshev's Theorem
- Empirical Rule
- Detecting Outliers
Slide 3: Distribution Shape
- To understand the shape of a distribution, we refer to the histogram discussed in Chapter 2.
- Looking at the histogram lets us determine the shape of the distribution.
- The shape of a distribution is measured with a quantity called skewness.
Slide 4: Distribution Shape: Skewness
- An important measure of the shape of a distribution is called skewness.
- The formula for computing skewness for a data set is somewhat complex.
- Skewness can be easily computed using statistical software.
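The slide leaves the skewness formula to software. As a rough sketch, the function below computes the adjusted sample skewness, n / ((n - 1)(n - 2)) times the sum of cubed standardized deviations, which is the form used by spreadsheet functions such as Excel's SKEW. The choice of this formula, the function name, and the three small data sets are assumptions for illustration and do not come from the slides.

```python
import math

def sample_skewness(data):
    """Adjusted sample skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s)^3)."""
    n = len(data)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(data) / n
    # Sample standard deviation (divides by n - 1).
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    if s == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in data)

# Hypothetical data sets, chosen only to show the sign of the result.
symmetric = [1, 2, 3, 4, 5]
right_tail = [1, 2, 2, 3, 3, 3, 4, 12]
left_tail = [-x for x in right_tail]

for name, values in [("symmetric", symmetric),
                     ("right tail", right_tail),
                     ("left tail", left_tail)]:
    print(f"{name}: skewness = {sample_skewness(values):.2f}")
```

A long right tail produces a positive value, a long left tail a negative one, and a symmetric data set a value of essentially zero.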
Slide 5: Distribution Shape: Skewness
- Symmetric (not skewed)
  - Skewness is zero.
  - Mean and median are equal.
[Figure: relative frequency histogram, vertical axis from 0 to .35; Skewness = 0]
Slide 6: Distribution Shape: Skewness
- Moderately Skewed Left
  - Skewness is negative.
  - Mean will usually be less than the median.
[Figure: relative frequency histogram, vertical axis from 0 to .35; Skewness = -.31]
Slide 7: Distribution Shape: Skewness
- Moderately Skewed Right
  - Skewness is positive.
  - Mean will usually be more than the median.
[Figure: relative frequency histogram, vertical axis from 0 to .35; Skewness = .31]
Slide 8: Distribution Shape: Skewness
- Highly Skewed Right
  - Skewness is positive (often above 1.0).
  - Mean will usually be more than the median.
[Figure: relative frequency histogram, vertical axis from 0 to .35; Skewness = 1.25]
Slide 9: Application of Standard Deviation: Chebyshev's Theorem
How can we apply standard deviation to real-world problems? Take downtown LA as an example. As we move farther and farther from downtown, we capture more and more people. Similarly, as we move farther and farther from the mean of a data set, we capture more and more of the data values. Going three standard deviations from the mean captures at least as many data values as going two standard deviations.
[Figure: two intervals centered on the mean, the second wider than the first]
Slide 10: Chebyshev's Theorem
Chebyshev's Theorem lets us calculate the minimum percentage of data values that lie within a given number of standard deviations of the mean. The formula is $1 - 1/z^2$, where z is the number of standard deviations and must be greater than 1. Within 2 standard deviations of the mean, we capture at least $1 - 1/2^2 = 0.75$, which is 75 percent of the data values.
Slide 11: Chebyshev's Theorem
- At least 75% of the data values must be within z = 2 standard deviations of the mean.
- At least 89% of the data values must be within z = 3 standard deviations of the mean.
- At least 94% of the data values must be within z = 4 standard deviations of the mean.
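Because the theorem holds for any distribution, it can be checked against deliberately skewed data. The sketch below compares the Chebyshev minimum with the share actually observed in a simulated data set; the function names, the seed, and the exponentially distributed sample are assumptions made only for this illustration.

```python
import random
import statistics

def chebyshev_bound(z):
    """Minimum fraction of values within z standard deviations of the mean."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem requires z > 1")
    return 1 - 1 / z ** 2

def share_within(data, z):
    """Observed fraction of values strictly within z sample standard deviations."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)
    return sum(1 for x in data if abs(x - mean) < z * sd) / len(data)

# Hypothetical, strongly right-skewed data (not from the slides).
random.seed(310)
data = [random.expovariate(1.0) for _ in range(1000)]

for z in (2, 3, 4):
    print(f"z = {z}: bound = {chebyshev_bound(z):.4f}, "
          f"observed = {share_within(data, z):.4f}")
```

The observed share is never below the bound, and it is typically well above it, which is why the theorem is stated as "at least".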
Slide 12: Sample Problem
Problem #29 (10-Page 103; 11-Page 107)
Mean = 6.9, Standard Deviation = 1.2
[Figure: number line marking 4.5, 6.9 (mean), and 9.3]
The distance from 6.9 to 4.5 and from 6.9 to 9.3 is 2.4, which is two standard deviations. Using z = 2, we calculate $1 - 1/2^2 = 0.75$. That means at least 75 percent of individuals sleep between 4.5 and 9.3 hours.
Slide 13: Sample Problem #29
Calculate the percentage of individuals who sleep between 4.5 hours and 10.5 hours.
[Figure: number line marking 4.5, 6.9 (mean), and 10.5; 4.5 is 2.4 below the mean and 10.5 is 3.6 above it]
First, calculate the percentage who sleep between 4.5 and 6.9 hours; then calculate the percentage who sleep between 6.9 and 10.5 hours. This split assumes the distribution is symmetric about the mean, which Chebyshev's Theorem alone does not guarantee. Since 4.5 is two standard deviations below the mean, at least half of 75 percent, or 37.5 percent, of individuals sleep between 4.5 and 6.9 hours. Since 10.5 is three standard deviations above the mean, the share who sleep between 6.9 and 10.5 hours is at least half of $1 - 1/3^2 = 0.89$, or 44.5 percent. Adding 37.5 and 44.5 gives 82 percent. So, under this assumption, at least 82 percent of individuals sleep between 4.5 and 10.5 hours.
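The two sleep calculations can be sketched in code. The first helper below gives the share guaranteed for any distribution by using the smaller of the two distances from the mean; the second reproduces the half-and-half calculation, which is valid only if the distribution is symmetric about the mean. Both function names and the split into two helpers are assumptions for illustration.

```python
def chebyshev_bound(z):
    """Minimum fraction of values within z standard deviations of the mean."""
    if z <= 1:
        raise ValueError("Chebyshev's theorem requires z > 1")
    return 1 - 1 / z ** 2

def guaranteed_share(mean, sd, lower, upper):
    """Bound valid for any distribution: the range contains the symmetric
    interval defined by the smaller of the two z-distances."""
    z = min((mean - lower) / sd, (upper - mean) / sd)
    return chebyshev_bound(z)

def share_assuming_symmetry(mean, sd, lower, upper):
    """Half-and-half bound; valid only for a distribution symmetric about the mean."""
    z_low = (mean - lower) / sd
    z_high = (upper - mean) / sd
    return chebyshev_bound(z_low) / 2 + chebyshev_bound(z_high) / 2

mean, sd = 6.9, 1.2
print(f"4.5 to 9.3 hours:  at least {guaranteed_share(mean, sd, 4.5, 9.3):.1%}")
print(f"4.5 to 10.5 hours: at least {guaranteed_share(mean, sd, 4.5, 10.5):.1%} (any distribution)")
print(f"4.5 to 10.5 hours: at least {share_assuming_symmetry(mean, sd, 4.5, 10.5):.1%} (symmetric only)")
```

Without the symmetry assumption, the guarantee for 4.5 to 10.5 hours stays at 75 percent, because that range is only certain to contain the 4.5 to 9.3 interval.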
Slide 14: Empirical Rule
For data having a bell-shaped distribution:
- 68.26% of the values of a normal random variable are within +/- 1 standard deviation of its mean.
- 95.44% of the values of a normal random variable are within +/- 2 standard deviations of its mean.
- 99.72% of the values of a normal random variable are within +/- 3 standard deviations of its mean.
Slide 15: Empirical Rule
[Figure: bell curve over x with the horizontal axis marked at μ - 3σ, μ - 2σ, μ - 1σ, μ, μ + 1σ, μ + 2σ, and μ + 3σ; the bands cover 68.26%, 95.44%, and 99.72% of the values]
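The empirical-rule percentages come from the normal distribution and can be reproduced with the error function. A minimal sketch follows; the function name is an assumption. The values it prints (68.27%, 95.45%, 99.73%) differ from the slide's figures in the last digit because the slide's numbers are built from normal-table entries rounded to four decimals.

```python
import math

def share_within(k):
    """P(|Z| < k) for a standard normal Z, computed with the error function."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"+/- {k} standard deviation(s): {share_within(k):.2%}")
```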
Slide 16: Sample Problem
Problem #30 (10-Page 103; 11-Page 107)
Mean = 2.3, Standard Deviation = 0.1
[Figure: number line marking 2.2, 2.3 (mean), and 2.4]
The distance from 2.3 to 2.2 and from 2.3 to 2.4 is 0.1, which is one standard deviation. Therefore, approximately 68 percent of regular grade gasoline sold for between $2.20 and $2.40.
Slide 17: Continuation of Problem #30
b. What percent of gasoline sold for between $2.20 and $2.50?
[Figure: number line marking 2.2, 2.3 (mean), and 2.5, with 34% between 2.2 and 2.3 and 47.5% between 2.3 and 2.5]
The distance from 2.3 to 2.2 is 0.1, which is one standard deviation. The distance from 2.3 to 2.5 is 0.2, which is two standard deviations. Half of 68% is 34% and half of 95% is 47.5%; adding them gives 81.5 percent. That means approximately 81.5 percent of regular grade gasoline sold for between $2.20 and $2.50.
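The half-and-half arithmetic can be compared with the value from an exactly normal model. The sketch below assumes gasoline prices follow a normal distribution with the stated mean and standard deviation, which is stronger than the "bell-shaped" condition of the empirical rule; it uses `NormalDist` from Python's standard `statistics` module, and the variable names are illustrative.

```python
from statistics import NormalDist

# Assumption: prices are exactly normal with mean 2.3 and standard deviation 0.1.
prices = NormalDist(mu=2.3, sigma=0.1)

# Empirical-rule arithmetic from the slide: half of 68% plus half of 95%.
rule_of_thumb = 0.68 / 2 + 0.95 / 2

# Share between $2.20 (one sd below) and $2.50 (two sd above) under the normal model.
exact = prices.cdf(2.5) - prices.cdf(2.2)

print(f"empirical rule: {rule_of_thumb:.1%}")
print(f"normal model:   {exact:.1%}")
```

The two results agree to within about half a percentage point, which is the precision the empirical rule is meant to give.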
Slide 18: Measures of Association Between Two Variables
- Covariance
- Correlation Coefficient
Slide 19: Covariance
- The covariance is a measure of the linear association between two variables.
- Positive values indicate a positive relationship.
- Negative values indicate a negative relationship.
Slide 20: Covariance
The covariance is computed as follows:
For samples: $s_{xy} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$
For populations: $\sigma_{xy} = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{N}$
Slide 21: Correlation Coefficient
- Correlation is a measure of linear association and not necessarily causation.
- Just because two variables are highly correlated, it does not mean that one variable is the cause of the other.
Slide 22: Correlation Coefficient
The correlation coefficient is computed as follows:
For samples: $r_{xy} = \frac{s_{xy}}{s_x s_y}$
For populations: $\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$
Slide 23: Correlation Coefficient
- The coefficient can take on values between -1 and +1.
- Values near -1 indicate a strong negative linear relationship.
- Values near +1 indicate a strong positive linear relationship.
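To make the -1 to +1 range concrete, the sketch below computes the sample correlation coefficient as the sample covariance divided by the product of the two sample standard deviations. The function name and all four data lists are hypothetical and chosen so the results are easy to check by hand.

```python
import math

def sample_correlation(xs, ys):
    """Sample correlation: s_xy / (s_x * s_y), all computed with n - 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    if s_x == 0 or s_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return s_xy / (s_x * s_y)

# Hypothetical data (not from the slides).
x = [1, 2, 3, 4, 5]
rising = [3, 5, 7, 9, 11]    # exact increasing line
falling = [11, 9, 7, 5, 3]   # exact decreasing line
noisy = [2, 1, 4, 3, 5]      # increasing trend with scatter

for name, y in [("rising", rising), ("falling", falling), ("noisy", noisy)]:
    print(f"{name}: r = {sample_correlation(x, y):.2f}")
```

Points on an exact line give a coefficient of +1 or -1 (up to floating-point rounding), while the scattered data give a value of about 0.8.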
Slide 24: Covariance Problem
- Use the data in Table 3.7 (10-Page 110; 11-Page 115).
- The two variables are number of commercials and sales volume.

Week   No. of Commercials   Sales Volume ($100s)
1      2                    50
2      5                    57
3      1                    41
4      3                    54
5      4                    54
6      1                    38
7      5                    63
8      3                    48
9      4                    59
10     2                    46

$s_{xy}$ = 99/9 = 11 (the calculations are in Table 3.8)
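The result 99/9 = 11 can be reproduced directly from the ten weeks of data on this slide. This is a minimal sketch of the sample covariance calculation; the function and variable names are illustrative.

```python
def sample_covariance(xs, ys):
    """Sample covariance: sum of (x - mean_x)(y - mean_y), divided by n - 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

# Data from the slide: commercials per week and sales volume in $100s.
commercials = [2, 5, 1, 3, 4, 1, 5, 3, 4, 2]
sales = [50, 57, 41, 54, 54, 38, 63, 48, 59, 46]

s_xy = sample_covariance(commercials, sales)
print(f"s_xy = {s_xy}")  # 11.0, matching 99/9 on the slide
```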
Slide 25: Interpretation of Covariance
- Look at Figure 3.8 (10-Page 112) or Figure 3.9 (11-Page 117).
- There are four quadrants.
- Points in Quadrant I correspond to x values greater than $\bar{x}$ and y values greater than $\bar{y}$; points in Quadrant II correspond to x values less than $\bar{x}$ and y values greater than $\bar{y}$; and so on.
- If the value of $s_{xy}$ is positive, the points with the greatest influence on $s_{xy}$ must be in Quadrants I and III. If the value of $s_{xy}$ is negative, the points with the greatest influence must be in Quadrants II and IV.
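The quadrant argument can be illustrated by splitting the numerator of the covariance by quadrant for the commercials and sales data from Slide 24. In this sketch the quadrant labels follow the slide's definitions, and the "on axis" bucket for points that sit exactly on a mean line is an added assumption.

```python
def quadrant(dx, dy):
    """Classify a point by the signs of its deviations from the two means."""
    if dx > 0 and dy > 0:
        return "I"
    if dx < 0 and dy > 0:
        return "II"
    if dx < 0 and dy < 0:
        return "III"
    if dx > 0 and dy < 0:
        return "IV"
    return "on axis"  # point lies exactly on a mean line and contributes zero

# Data from Slide 24: commercials per week and sales volume in $100s.
commercials = [2, 5, 1, 3, 4, 1, 5, 3, 4, 2]
sales = [50, 57, 41, 54, 54, 38, 63, 48, 59, 46]

mean_x = sum(commercials) / len(commercials)
mean_y = sum(sales) / len(sales)

# Sum the products (x - mean_x)(y - mean_y) separately for each quadrant.
totals = {"I": 0.0, "II": 0.0, "III": 0.0, "IV": 0.0, "on axis": 0.0}
for x, y in zip(commercials, sales):
    dx, dy = x - mean_x, y - mean_y
    totals[quadrant(dx, dy)] += dx * dy

for name, total in totals.items():
    print(f"Quadrant {name}: {total}")
```

Every nonzero product for these data falls in Quadrant I or III, which is why the numerator (99) and the covariance are positive.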
Slide 26: The Weighted Mean and Working with Grouped Data
- Weighted Mean
- Mean for Grouped Data
- Variance for Grouped Data
- Standard Deviation for Grouped Data
Slide 27: Weighted Mean
- When the mean is computed by giving each data value a weight that reflects its importance, it is referred to as a weighted mean.
- In the computation of a grade point average (GPA), the weights are the number of credit hours earned for each grade.
- When data values vary in importance, the analyst must choose the weight that best reflects the importance of each value.
Slide 28: Weighted Mean
$\bar{x} = \frac{\sum w_i x_i}{\sum w_i}$
where:
$x_i$ = value of observation i
$w_i$ = weight for observation i
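As a small illustration of the weighted mean, the sketch below computes a GPA with credit hours as weights. The grade points and credit hours are hypothetical values chosen for easy arithmetic, and the function name is illustrative.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum of w_i * x_i divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight

# Hypothetical transcript: grade points earned and credit hours per course.
grade_points = [4.0, 3.0, 2.0]
credit_hours = [3, 4, 3]

gpa = weighted_mean(grade_points, credit_hours)
print(f"GPA = {gpa}")  # (12 + 12 + 6) / 10 = 3.0
```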
Slide 29: Grouped Data
- The weighted mean computation can be used to obtain approximations of the mean, variance, and standard deviation for the grouped data.
- To compute the weighted mean, we treat the midpoint of each class as though it were the mean of all items in the class.
- We compute a weighted mean of the class midpoints using the class frequencies as weights.
- Similarly, in computing the variance and standard deviation, the class frequencies are used as weights.
Slide 30: Mean for Grouped Data
Sample data: $\bar{x} = \frac{\sum f_i M_i}{n}$
Population data: $\mu = \frac{\sum f_i M_i}{N}$
where:
$f_i$ = frequency of class i
$M_i$ = midpoint of class i
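The grouped-data approximations can be sketched as follows. The mean uses class midpoints weighted by class frequencies; the variance uses the usual textbook form for sample grouped data, the sum of f_i (M_i - mean)^2 divided by n - 1, which is an assumption here because no variance formula appears in this transcript. The three-class frequency distribution is hypothetical.

```python
import math

def grouped_sample_stats(midpoints, frequencies):
    """Approximate sample mean, variance, and standard deviation from grouped data."""
    if len(midpoints) != len(frequencies):
        raise ValueError("midpoints and frequencies must have the same length")
    n = sum(frequencies)
    if n < 2:
        raise ValueError("need at least two observations")
    # Mean: class midpoints weighted by class frequencies.
    mean = sum(f * m for m, f in zip(midpoints, frequencies)) / n
    # Variance: frequency-weighted squared deviations of midpoints, divided by n - 1.
    variance = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, frequencies)) / (n - 1)
    return mean, variance, math.sqrt(variance)

# Hypothetical frequency distribution (not the rent data from the slides).
midpoints = [10, 20, 30]
frequencies = [2, 5, 3]

mean, variance, sd = grouped_sample_stats(midpoints, frequencies)
print(f"mean = {mean}, variance = {variance:.3f}, standard deviation = {sd:.3f}")
```

Because every item in a class is treated as if it sat at the class midpoint, these results only approximate the statistics of the ungrouped data.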
Slide 31: Sample Mean for Grouped Data
Given below is the previous sample of monthly rents for 70 efficiency apartments, presented here as grouped data in the form of a frequency distribution.
[Table: frequency distribution of monthly rents]
Slide 32: Sample Mean for Grouped Data
[Figure: calculation of the sample mean from the grouped data]
This approximation differs by $2.41 from the actual sample mean of $490.80.
Slide 33: End of Chapter 3, Part B