LESSON 4: MEASURES OF VARIABILITY AND PROPORTION


OUTLINE
- The range, variance, standard deviation and coefficient of variation
- Interpretation of standard deviation
- Population and sample variance
- Approximation from the grouped data
- Skewness
- Interquartile range and box plots
- The proportion

MEASURES OF VARIABILITY: EXAMPLE
Heights of the players of two teams, in inches, are as follows:
Team I: 72, 73, 76, 76, 78, so mean = 75 and median = mode = 76
Team II: 67, 72, 76, 76, 84, so mean = 75 and median = mode = 76
Both teams have the same mean, median and mode. How about the variation?

MEASURES OF VARIABILITY: RANGE
The first and simplest measure of variability is the range. The range of a set of measurements is the numerical difference between the largest and smallest measurements.
Range = Largest value - Smallest value

MEASURES OF VARIABILITY: RANGE
Team I: Range = 78 - 72 = 6 inches
Team II: Range = 84 - 67 = 17 inches
So, judged by the range, Team I has less variation than Team II.
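The range calculation for the two teams can be checked with a few lines of Python. This is only an illustrative sketch; the function name `value_range` is an assumption and does not come from the slides.

```python
# Heights in inches, taken from the example slide.
team_1 = [72, 73, 76, 76, 78]
team_2 = [67, 72, 76, 76, 84]


def value_range(values):
    """Return the largest value minus the smallest value."""
    return max(values) - min(values)


range_1 = value_range(team_1)
range_2 = value_range(team_2)
print("Team I range:", range_1, "inches")
print("Team II range:", range_2, "inches")
```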

MEASURES OF VARIABILITY: VARIANCE, STANDARD DEVIATION, CV
A major drawback of the range is that it uses only the two extreme values. It ignores all the intermediate values and provides no information on the dispersion of the values between the smallest and largest observations.
The variance, standard deviation and coefficient of variation (CV), on the other hand, use all the values and provide information on the dispersion of the intermediate values.
Computing the variance, standard deviation or CV requires computing the deviation of each value from the mean.

MEASURES OF VARIABILITY: VARIANCE, STANDARD DEVIATION, CV
Team I deviations from the mean: (72 - 75) = -3, (73 - 75) = -2, (76 - 75) = 1, (76 - 75) = 1, (78 - 75) = 3

MEASURES OF VARIABILITY: VARIANCE, STANDARD DEVIATION, CV
Team I deviations from the mean: -3, -2, 1, 1, 3
From the property of the mean (see Lesson 3, Slides 10-11), the sum of the deviations from the mean is zero.
Check: -3 - 2 + 1 + 1 + 3 = 0

MEASURES OF VARIABILITY: VARIANCE, STANDARD DEVIATION, CV
The sum of squared deviations from the mean is not necessarily 0. For example, for Team I the sum of squared deviations is (-3)² + (-2)² + 1² + 1² + 3² = 9 + 4 + 1 + 1 + 9 = 24.
Although the sum of squared deviations increases as the dispersion increases, it also depends on the number of measurements. So the mean squared deviation is the preferred measure of dispersion.

MEASURES OF VARIABILITY: VARIANCE, STANDARD DEVIATION, CV
The variance is the mean squared deviation. For example, with the sum of squared deviations of 24 over 5 measurements,
Team I variance = 24 / 5 = 4.8 inch²

MEASURES OF VARIABILITY: VARIANCE, STANDARD DEVIATION, CV
The standard deviation is the root mean squared deviation, i.e., the square root of the variance. So,
Team I standard deviation = √4.8 ≈ 2.19 inches

MEASURES OF VARIABILITY: VARIANCE, STANDARD DEVIATION, CV
The coefficient of variation is the standard deviation divided by the mean. So,
Team I coefficient of variation = 2.19 / 75 ≈ 0.029

MEASURES OF VARIABILITY: VARIANCE, STANDARD DEVIATION, CV
Why are there three similar terms?
In the above example, the variance has the unit inch². The standard deviation has the unit inch, the unit of the original data, so the standard deviation may sometimes be preferred over the variance.
The coefficient of variation is dimensionless. Hence, it is a useful quantity for comparing the variability of data sets that have different standard deviations and different means.
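The three measures can be compared side by side for the two teams. The sketch below follows the slides' wording, "mean squared deviation", so it divides by the number of measurements; the helper name `spread_summary` is an assumption made for illustration.

```python
import math

# Heights in inches, taken from the example slide.
team_1 = [72, 73, 76, 76, 78]
team_2 = [67, 72, 76, 76, 84]


def spread_summary(values):
    """Variance as the mean squared deviation, plus SD and CV."""
    n = len(values)
    mean = sum(values) / n
    deviations = [x - mean for x in values]
    variance = sum(d ** 2 for d in deviations) / n  # divides by n, not n - 1
    std_dev = math.sqrt(variance)
    return {
        "deviation_sum": sum(deviations),  # should be zero
        "variance": variance,              # unit: inch squared
        "std_dev": std_dev,                # unit: inch
        "cv": std_dev / mean,              # dimensionless
    }


summary_1 = spread_summary(team_1)
summary_2 = spread_summary(team_2)
for name, summary in (("Team I", summary_1), ("Team II", summary_2)):
    print(name, {key: round(value, 4) for key, value in summary.items()})
```

Both teams share the same mean, yet every measure of spread is larger for Team II, which agrees with the comparison of the ranges.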

MEASURES OF VARIABILITY: VARIANCE, STANDARD DEVIATION, CV
Interpretation of standard deviation
- On its own, the standard deviation is difficult to interpret.
- A higher standard deviation implies greater variability.
- The standard deviation is widely used to approximate the proportion of measurements that fall into various intervals of values. This is especially true if the data have a bell-shaped distribution.

MEASURES OF VARIABILITY: VARIANCE, STANDARD DEVIATION, CV
Interpretation of standard deviation
An empirical rule states that if the data have a bell-shaped distribution,
- approximately 68% of the measurements fall within one standard deviation of the mean, i.e., between (mean - standard deviation) and (mean + standard deviation),
- approximately 95% of the measurements fall within two standard deviations of the mean, and
- virtually all the measurements fall within three standard deviations of the mean.

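MEASURES OF VARIABILITY: VARIANCE, STANDARD DEVIATION, CV
[Figure: bell-shaped curve centered at the mean, with the horizontal axis marked at -3, -2, -1, +1, +2 and +3 standard deviations. 68.26% of the area lies within ±1, 95.44% within ±2 and 99.74% within ±3 standard deviations of the mean.]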

MEASURES OF VARIABILITY: VARIANCE, STANDARD DEVIATION, CV
Interpretation of standard deviation
Example: suppose that the final marks have a bell-shaped distribution with a mean of 75 and a standard deviation of 7. Then,
- approximately 68% of the marks fall between (75 - 7) = 68 and (75 + 7) = 82,
- approximately 95% of the marks fall between (75 - 2×7) = 61 and (75 + 2×7) = 89, and
- virtually all the marks fall between (75 - 3×7) = 54 and (75 + 3×7) = 96.
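The intervals in the final-marks example are the mean plus or minus one, two and three standard deviations. A minimal sketch, assuming a bell-shaped distribution as the slide does (the function name `empirical_rule_intervals` is hypothetical):

```python
def empirical_rule_intervals(mean, std_dev):
    """Return (low, high, coverage) for 1, 2 and 3 standard deviations."""
    coverage = {1: "about 68%", 2: "about 95%", 3: "virtually all"}
    return {
        k: (mean - k * std_dev, mean + k * std_dev, label)
        for k, label in coverage.items()
    }


# Final marks example: mean 75, standard deviation 7.
intervals = empirical_rule_intervals(75, 7)
for k, (low, high, label) in intervals.items():
    print(f"within {k} SD: {low} to {high} ({label} of the marks)")
```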

POPULATION VARIANCE
The population variance is the mean squared deviation from the population mean:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$
where
- $\sigma^2$ stands for the population variance
- $\mu$ is the population mean
- $N$ is the total number of values in the population
- $x_i$ is the value of the i-th observation
- $\sum$ represents a summation

SAMPLE VARIANCE
The sample variance is defined as follows:
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$
where
- $s^2$ stands for the sample variance
- $\bar{x}$ is the sample mean
- $n$ is the total number of values in the sample
- $x_i$ is the value of the i-th observation
- $\sum$ represents a summation

SAMPLE VARIANCE
Notice that the sample variance is defined as the sum of the squared deviations divided by n - 1, not by n.
The sample variance is computed to estimate the population variance. Defining the sample variance as the mean squared deviation from the sample mean (dividing by n) tends to underestimate the population variance. Dividing the sum of the squared deviations by n - 1 instead gives an unbiased estimate of the population variance.
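The effect of dividing by n - 1 can be seen on a tiny example. The sketch below is an illustration under stated assumptions rather than a proof: it treats the five Team I heights as a complete population and enumerates every ordered sample of size 2 drawn with replacement, then averages the two candidate estimates over all samples.

```python
from itertools import product

# Assumption for illustration: the Team I heights form the whole population.
population = [72, 73, 76, 76, 78]
N = len(population)
mu = sum(population) / N
sigma_sq = sum((x - mu) ** 2 for x in population) / N  # population variance


def sum_sq_dev(sample):
    """Sum of squared deviations from the sample mean."""
    xbar = sum(sample) / len(sample)
    return sum((x - xbar) ** 2 for x in sample)


n = 2
# Every ordered sample of size n drawn with replacement (equally likely).
samples = list(product(population, repeat=n))

avg_div_n_minus_1 = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)
avg_div_n = sum(sum_sq_dev(s) / n for s in samples) / len(samples)

print("population variance:     ", sigma_sq)
print("average, divisor n - 1:  ", avg_div_n_minus_1)
print("average, divisor n:      ", avg_div_n)
```

Averaged over all possible samples, the n - 1 version recovers the population variance, while the n version falls short of it.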

SAMPLE VARIANCE
A sample of monthly advertising expenses (in thousands of dollars) is taken. The data for five months are as follows: 2.5, 1.3, 1.4, 1.0 and 2.0. Compute the sample variance.
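One way to check the hand calculation for the advertising data is to code the definition directly (sum of squared deviations from the sample mean, divided by n - 1) and compare it with `statistics.variance` from the Python standard library. The function name `sample_variance` is an assumption made for this sketch.

```python
import statistics

# Monthly advertising expenses from the slide, in thousands of dollars.
advertising = [2.5, 1.3, 1.4, 1.0, 2.0]


def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    xbar = sum(values) / n
    return sum((x - xbar) ** 2 for x in values) / (n - 1)


s2_advertising = sample_variance(advertising)
print("sample mean:    ", round(sum(advertising) / len(advertising), 4))
print("sample variance:", round(s2_advertising, 4))
print("library check:  ", round(statistics.variance(advertising), 4))
```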

SAMPLE VARIANCE
An alternate formula for the sample variance:
$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n - 1}$$
where
- $s^2$ stands for the sample variance
- $\bar{x}$ is the sample mean
- $n$ is the total number of values in the sample
- $x_i$ is the value of the i-th observation
- $\sum$ represents a summation

SAMPLE VARIANCE
A sample of monthly sales (in thousands of units) is taken. The data for five months are as follows: 264, 116, 165, 101 and 209. Compute the sample variance using the alternate formula.
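The alternate formula avoids computing each deviation. The sketch below assumes the shortcut form (sum of x² minus n times the squared sample mean, all divided by n - 1) and checks it against the definitional form on the sales data; both function names are hypothetical.

```python
# Monthly sales data from the slide.
sales = [264, 116, 165, 101, 209]


def sample_variance_shortcut(values):
    """Shortcut form: (sum of x^2 - n * xbar^2) / (n - 1)."""
    n = len(values)
    xbar = sum(values) / n
    return (sum(x * x for x in values) - n * xbar * xbar) / (n - 1)


def sample_variance_definition(values):
    """Definitional form: sum of squared deviations / (n - 1)."""
    n = len(values)
    xbar = sum(values) / n
    return sum((x - xbar) ** 2 for x in values) / (n - 1)


s2_sales = sample_variance_shortcut(sales)
print("shortcut formula:  ", s2_sales)
print("definition formula:", sample_variance_definition(sales))
```

The two forms are algebraically identical. The shortcut is convenient by hand, although in floating-point arithmetic it can lose precision when the mean is large relative to the spread.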

POPULATION/SAMPLE STANDARD DEVIATION
The standard deviation is the positive square root of the variance:
Population standard deviation: $\sigma = \sqrt{\sigma^2}$
Sample standard deviation: $s = \sqrt{s^2}$

POPULATION/SAMPLE STANDARD DEVIATION
Compute the sample standard deviation of the advertising data: 2.5, 1.3, 1.4, 1.0 and 2.0
Compute the sample standard deviation of the sales data: 264, 116, 165, 101 and 209

POPULATION/SAMPLE CV
The coefficient of variation is the standard deviation divided by the mean:
Population coefficient of variation: $CV = \sigma / \mu$
Sample coefficient of variation: $CV = s / \bar{x}$

POPULATION/SAMPLE CV
Compute the sample coefficient of variation of the advertising data: 2.5, 1.3, 1.4, 1.0 and 2.0
Compute the sample coefficient of variation of the sales data: 264, 116, 165, 101 and 209
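The sample standard deviation and coefficient of variation for both data sets can be checked with the standard library. `statistics.stdev` uses the n - 1 divisor, which matches the sample variance definition in this lesson; the helper name `sample_sd_and_cv` is an assumption.

```python
import statistics

advertising = [2.5, 1.3, 1.4, 1.0, 2.0]   # thousands of dollars
sales = [264, 116, 165, 101, 209]


def sample_sd_and_cv(values):
    """Return (sample standard deviation, sample coefficient of variation)."""
    s = statistics.stdev(values)          # square root of the sample variance
    return s, s / statistics.mean(values)


s_adv, cv_adv = sample_sd_and_cv(advertising)
s_sales, cv_sales = sample_sd_and_cv(sales)
print(f"advertising: s = {s_adv:.4f}, CV = {cv_adv:.4f}")
print(f"sales:       s = {s_sales:.4f}, CV = {cv_sales:.4f}")
```

The two standard deviations are on very different scales, but the coefficients of variation are dimensionless and can be compared directly.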

SAMPLE VARIANCE APPROXIMATED FROM GROUPED DATA
Sample variance from grouped data:
$$s^2 = \frac{\sum_{k} f_k (M_k - \bar{x})^2}{n - 1}$$
where
- $s^2$ stands for the sample variance
- $\bar{x}$ is the sample mean
- $n$ is the total number of observations
- $M_k$ is the midpoint of the k-th class
- $f_k$ is the frequency of the k-th class
- $\sum_k$ represents a summation over all classes
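A grouped-data approximation treats every observation in a class as if it sat at the class midpoint. The sketch below assumes the deviation form of the formula (frequency-weighted squared deviations of the midpoints from the grouped mean, divided by n - 1) and approximates the mean from the same midpoints. The midpoints and frequencies are hypothetical values invented for illustration, not data from the lesson.

```python
def grouped_sample_variance(midpoints, frequencies):
    """Approximate the sample variance from class midpoints and frequencies."""
    n = sum(frequencies)
    # Grouped mean: each class contributes its midpoint, weighted by frequency.
    xbar = sum(f * m for m, f in zip(midpoints, frequencies)) / n
    weighted_sq_dev = sum(
        f * (m - xbar) ** 2 for m, f in zip(midpoints, frequencies)
    )
    return weighted_sq_dev / (n - 1)


# Hypothetical classes: midpoints 35, 45, 55 with frequencies 2, 5, 3.
example_midpoints = [35, 45, 55]
example_frequencies = [2, 5, 3]
s2_grouped = grouped_sample_variance(example_midpoints, example_frequencies)
print("approximate sample variance:", round(s2_grouped, 4))
```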

SAMPLE VARIANCE APPROXIMATED FROM GROUPED DATA
Compute the sample variance of days to maturity of 40 investments from the following grouped data:

SAMPLE COEFFICIENT OF SKEWNESS
The sample coefficient of skewness:
$$SK = \frac{3(\bar{x} - m)}{s}$$
where
- $SK$ stands for the coefficient of skewness
- $s$ is the sample standard deviation
- $\bar{x}$ is the sample mean
- $m$ is the sample median

SAMPLE COEFFICIENT OF SKEWNESS
Compute the sample coefficient of skewness of the advertising data: 2.5, 1.3, 1.4, 1.0 and 2.0
Mean, $\bar{x}$ = 1.64 (see slide 20)
Sample standard deviation, s = 0.6025 (see slides 20, 24)
Median, m = 1.4 (the middle value of the sorted data 1.0, 1.3, 1.4, 2.0, 2.5)
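The skewness exercise can be checked numerically. The sketch assumes the median-based coefficient SK = 3 × (mean - median) / s, which uses exactly the three quantities listed on the slide; the function name `skewness_coefficient` is hypothetical.

```python
import statistics

advertising = [2.5, 1.3, 1.4, 1.0, 2.0]   # thousands of dollars


def skewness_coefficient(values):
    """Median-based coefficient: 3 * (mean - median) / sample std deviation."""
    xbar = statistics.mean(values)
    m = statistics.median(values)
    s = statistics.stdev(values)           # n - 1 divisor
    return 3 * (xbar - m) / s


sk_advertising = skewness_coefficient(advertising)
print("median:", statistics.median(advertising))
print("SK:    ", round(sk_advertising, 3))
```

A positive value indicates that the mean lies above the median, i.e., a longer tail on the right.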

INTERQUARTILE RANGE AND BOX PLOTS
The interquartile range represents the range of the middle 50% of the observations and is the difference between the third quartile and the first:
$$IQR = Q_3 - Q_1$$
The range and the interquartile range are combined in a box plot.

INTERQUARTILE RANGE AND BOX PLOTS
A box plot is used to represent the data set graphically. These plots involve five values:
- the minimum value, S
- the first quartile
- the second quartile, or median
- the third quartile
- the maximum value, L

INTERQUARTILE RANGE AND BOX PLOTS
Example: Construct a box plot with the following data, which show the assets of the 15 largest North American banks, rounded off to the nearest hundred million dollars:
111, 135, 217, 108, 51, 98, 65, 85, 75, 75, 93, 64, 57, 56, 98

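INTERQUARTILE RANGE AND BOX PLOTS
Sort the data in ascending order (low to high):
51, 56, 57, 64, 65, 75, 75, 85, 93, 98, 98, 108, 111, 135, 217
Find the five values needed for the box plot: the minimum S, the first quartile, the median, the third quartile and the maximum L.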
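The five values for the bank data can be obtained as sketched below. Quartile conventions differ between textbooks and software. This example assumes the convention that places the p-th quantile at position p × (n + 1) in the sorted data, which is what `statistics.quantiles` does with `method="exclusive"`; other conventions can give slightly different quartiles.

```python
import statistics

# Bank assets from the example slide.
bank_assets = [111, 135, 217, 108, 51, 98, 65, 85, 75, 75, 93, 64, 57, 56, 98]

ordered = sorted(bank_assets)
smallest, largest = ordered[0], ordered[-1]

# Assumed convention: quantile position p * (n + 1), i.e. method="exclusive".
q1, q2, q3 = statistics.quantiles(ordered, n=4, method="exclusive")
iqr = q3 - q1

print("S  =", smallest)
print("Q1 =", q1)
print("Q2 =", q2, "(median)")
print("Q3 =", q3)
print("L  =", largest)
print("IQR =", iqr)
```

With 15 observations the assumed positions are 4, 8 and 12, so each quartile coincides with an actual data value and no interpolation is needed.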

INTERQUARTILE RANGE AND BOX PLOTS
Position of the median within the box:
- If the median is near the center of the box, the distribution is approximately symmetric.
- If the median falls to the left of the center of the box, the distribution is positively skewed.
- If the median falls to the right of the center of the box, the distribution is negatively skewed.
Length of the lines on either side of the box:
- If the lines are about the same length, the distribution is approximately symmetric.
- If the line segment to the right of the box is longer than the one to the left, the distribution is positively skewed.
- If the line segment to the left of the box is longer than the one to the right, the distribution is negatively skewed.

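THE PROPORTION
The population proportion is denoted by $p$. This parameter is a number between 0 and 1.
The sample proportion is denoted by $P$. $P$ serves as an estimator of $p$ and is calculated as follows:
$$P = \frac{x}{n}$$
where $x$ is the number of elements in the sample that have the characteristic of interest and $n$ is the sample size.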
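A sample proportion is a count divided by the sample size. As a sketch, the code below reuses the bank assets from the box plot example and asks what proportion of the 15 banks have assets above 100. The threshold of 100 and the function name `sample_proportion` are assumptions chosen only for illustration.

```python
# Bank assets from the box plot example.
bank_assets = [111, 135, 217, 108, 51, 98, 65, 85, 75, 75, 93, 64, 57, 56, 98]


def sample_proportion(values, has_characteristic):
    """Number of values with the characteristic, divided by the sample size."""
    count = sum(1 for v in values if has_characteristic(v))
    return count / len(values)


# Hypothetical characteristic of interest: assets above 100.
p_hat = sample_proportion(bank_assets, lambda assets: assets > 100)
print("sample proportion:", round(p_hat, 4))
```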

READING AND EXERCISES
Lesson 4 Reading: Section 2-3, pp. 50-61
Exercises: 2-30, 2-37, 2-41