Measures of Central Tendency
These measures indicate a value that all the observations tend to have, or a value around which all the observations can be assumed to be located or concentrated. There are three such measures:
i) Mean
ii) Median, Quartiles, Percentiles and Deciles
iii) Mode
Mean
Types of mean: Arithmetic Mean, Harmonic Mean, Geometric Mean.
1) Ungrouped data
$$\text{Mean} = \frac{\text{Sum of Observations}}{\text{Number of Observations}} = \frac{x_1 + x_2 + \dots + x_n}{n}$$
2) Grouped data
When the data is grouped, prepare a frequency table with one row per class interval (rows 1 to k) and the columns: Class Interval; Mid-point of Class Interval ($x_i$); Frequency ($f_i$); $f_i x_i$. The mean is then
$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$$
where $x_i$ is the middle point of the $i$th class interval, $f_i$ is the frequency of the $i$th class interval, $f_i x_i$ is the product of $f_i$ and $x_i$, and $k$ is the number of class intervals.
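The two mean formulas can be sketched in Python as follows. This is a minimal illustration, not part of the original slides: the function names are made up for this example, and the grouped data (`midpoints`, `frequencies`) are hypothetical values chosen only to show the arithmetic.

```python
def arithmetic_mean(values):
    """Mean of ungrouped data: sum of observations / number of observations."""
    return sum(values) / len(values)


def grouped_mean(midpoints, frequencies):
    """Mean of grouped data: sum(f_i * x_i) / sum(f_i)."""
    total_fx = sum(f * x for f, x in zip(frequencies, midpoints))
    return total_fx / sum(frequencies)


# Hypothetical illustration data (assumed, not taken from the slides).
midpoints = [10, 20, 30]     # x_i: class mid-points
frequencies = [2, 5, 3]      # f_i: class frequencies

print(arithmetic_mean([1, 2, 3]))            # 2.0
print(grouped_mean(midpoints, frequencies))  # 21.0
```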
Median
Whenever there are some extreme values in the data, calculation of the arithmetic mean (A.M.) is not desirable, because the mean is pulled towards those values.
The median of a set of values is defined as the middle-most value of the series when the values are arranged in ascending or descending order.
If the number of observations is odd, the median is the middle-most value.
If the number of observations is even, the median is the average of the two middle-most values.
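Example
Arrange the observations in ascending order. Here the number of observations is 20, which is even, so there is no single middle observation. The two middle-most observations are the 10th and the 11th, and
$$\text{Median} = \frac{\text{10th observation} + \text{11th observation}}{2}$$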
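The odd/even rule for the median can be sketched in Python as below. The function name is an assumption made for this illustration, and the 20-value data set is hypothetical (the numbers 1 to 20), chosen only so that the 10th and 11th observations are easy to see.

```python
def median(values):
    """Middle-most value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                      # odd n: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average of the two middle values


# Hypothetical data with 20 observations: the 10th is 10 and the 11th is 11.
print(median(range(1, 21)))  # 10.5
```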
Quartiles
The median divides the data into two parts such that 50% of the observations are less than it and 50% are more than it.
Similarly, there are "quartiles". There are three quartiles, viz. Q1, Q2 and Q3, referred to as the first, second and third quartiles.
The first quartile Q1 divides the data into two parts such that 25% of the observations are less than it and 75% are more than it.
The second quartile Q2 is the same as the median.
The third quartile Q3 divides the data into two parts such that 75% of the observations are less than it and 25% are more than it.
Percentiles
Percentiles split the data into several parts, expressed in percentages.
A percentile, also known as a centile, divides the data in such a way that a given percentage of the observations are less than it.
For example, 95% of the observations are less than the 95th percentile.
Note that the 50th percentile, denoted $P_{50}$, is the same as the median.
Deciles
The deciles divide the data into ten parts: the first decile (10%), the second decile (20%), and so on.
Mode
The mode represents the "fashion" of the observations in the data. It is defined as the most fashionable value, that is, the value that the maximum number of observations have, or tend to have, compared with any other value.
Example: the observations are 2, 4, 4, 6, 8, 8, 8, 10, 12. Here the mode is 8, because three observations have this value, more than any other value.
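The mode of the example observations can be checked with a short Python sketch. The function name `modes` is chosen for this illustration; it returns a list because a data set may have more than one most-frequent value.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


observations = [2, 4, 4, 6, 8, 8, 8, 10, 12]
print(modes(observations))  # [8]
```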
Measures of Variation/Dispersion
Measures of variation/dispersion provide an idea of the extent of variation present among the observations. These are:
i) Range
ii) Mean Deviation
iii) Standard Deviation
iv) Coefficient of Variation
Range
It is the simplest measure of variation, and is defined as the difference between the maximum and the minimum values of the observations:
$$\text{Range} = \text{Maximum Value} - \text{Minimum Value}$$
Since the range depends only on two values, viz. the minimum and the maximum, and does not utilize the full information in the given data, it is not considered very reliable or efficient.
The coefficient of scatter is another measure based on the range of the data:
$$\text{Coefficient of scatter} = \frac{\text{Range}}{\text{Maximum} + \text{Minimum}} = \frac{\text{Maximum} - \text{Minimum}}{\text{Maximum} + \text{Minimum}}$$
It gives an indication of the variability in the data.
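Both range-based measures are simple to compute; the sketch below uses illustrative function names (not from the slides) and reuses the observations from the mode example as sample input.

```python
def value_range(values):
    """Range = maximum value - minimum value."""
    return max(values) - min(values)


def coefficient_of_scatter(values):
    """(maximum - minimum) / (maximum + minimum)."""
    highest, lowest = max(values), min(values)
    return (highest - lowest) / (highest + lowest)


observations = [2, 4, 4, 6, 8, 8, 8, 10, 12]
print(value_range(observations))                       # 10
print(round(coefficient_of_scatter(observations), 3))  # 0.714
```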
Mean Deviation
In order to study the variation in data, one method is to take into consideration the deviations of all the observations from their mean.
Example (Mean = 50): a table with the columns Observation and Deviation from Mean.
Mean Deviation for Ungrouped Data
If the data is ungrouped and the observations for a certain variable $x$ are $x_1, x_2, x_3, \dots, x_n$, then
$$\text{Mean Deviation} = \frac{\sum |x_i - \bar{x}|}{n}$$
For the data comprising the observations 1, 2, 3, it can be calculated as follows:
$x_i$ | $x_i - \bar{x}$ | $|x_i - \bar{x}|$
1 | -1 | 1
2 | 0 | 0
3 | 1 | 1
Sum: 6 | 0 | 2
Mean: 6/3 = 2 | 0 | 2/3
Thus the mean deviation is 2/3 = 0.67.
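The calculation for the observations 1, 2, 3 can be reproduced with a short Python sketch; the function name is an assumption made for this illustration.

```python
def mean_deviation(values):
    """Average absolute deviation of the observations from their mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n


print(round(mean_deviation([1, 2, 3]), 2))  # 0.67
```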
Mean Deviation for Grouped Data
The working table has one row per class interval, followed by Sum and Average rows, with the columns: Class Interval; Middle Point of Class Interval ($x_i$); Frequency ($f_i$); $f_i x_i$; $|x_i - \bar{x}|$; $f_i |x_i - \bar{x}|$.
$$\text{Mean Deviation} = \frac{\sum f_i |x_i - \bar{x}|}{\sum f_i} = 965, \qquad \text{with } \sum f_i = 20$$
where $x_i$ is the middle point of the $i$th class interval, $\bar{x}$ is the mean, and $f_i$ is the frequency of the $i$th class interval.
Variance and Standard Deviation
While calculating the mean deviation, the absolute values of the deviations of the observations from the mean were taken because, without doing so, the total deviation was zero for the data comprising the values 1, 2 and 3, even though there was variation present among these observations.
Another way of getting over this problem of the total deviation being zero is to take the squares of the deviations of the observations from the mean:
$x_i$ | $x_i - \bar{x}$ | $(x_i - \bar{x})^2$
1 | -1 | 1
2 | 0 | 0
3 | 1 | 1
Sum: 6 | 0 | 2
Mean: 2 | 0 | 2/3 (= 0.67)
Calculation of variance and standard deviation for ungrouped data
$$\text{Variance } (\sigma^2) = \frac{1}{n} \sum (x_i - \bar{x})^2 = \frac{1}{3} \times 2 = 0.67$$
The square root of $\sigma^2$, i.e. $\sigma$, is known as the standard deviation:
$$\text{Standard Deviation } (\sigma) = \sqrt{0.67} = 0.82$$
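A minimal Python sketch of this calculation follows. The function names are illustrative, and the code divides by $n$ (the population form), matching the formula above rather than the $n - 1$ form used for sample estimates.

```python
import math


def population_variance(values):
    """(1/n) * sum of squared deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def population_sd(values):
    """Standard deviation: square root of the variance."""
    return math.sqrt(population_variance(values))


print(round(population_variance([1, 2, 3]), 2))  # 0.67
print(round(population_sd([1, 2, 3]), 2))        # 0.82
```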
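Calculation of Variance and Standard Deviation for Grouped Data
The working table has one row per class interval, followed by Sum and Average rows, with the columns: Class Interval; Mid Point of Class Interval ($x_i$); Frequency ($f_i$); $f_i x_i$; $(x_i - \bar{x})$; $(x_i - \bar{x})^2$; $f_i (x_i - \bar{x})^2$. In the example, the average is $\bar{x} = 4550$.
$$\text{Variance} = \frac{\sum f_i (x_i - \bar{x})^2}{\sum f_i}, \qquad \text{S.D.} = \sqrt{\text{Variance}}$$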
Combining Variances of Two Populations
The mean and S.D. of the "lives" of tyres manufactured by two factories of the "Durable" tyre company, each making 50,000 tyres annually, are given below. Calculate the mean and standard deviation of all the tyres produced in a year.
Group | Mean ('000 kms.) | S.D. ('000 kms.)
Factory 1 | 60 | 8
Factory 2 | 55 | 7
We know that if there is one set of data having $n_1$ observations with mean $m_1$ and s.d. $\sigma_1$, and another set of data having $n_2$ observations with mean $m_2$ and s.d. $\sigma_2$, then the mean ($m$) and variance ($\sigma^2$) of the combined data with $(n_1 + n_2)$ observations are given as
$$m = \frac{n_1 m_1 + n_2 m_2}{n_1 + n_2}$$
$$\sigma^2 = \frac{n_1 (\sigma_1^2 + d_1^2) + n_2 (\sigma_2^2 + d_2^2)}{n_1 + n_2}$$
where $d_1 = m_1 - m$, $d_2 = m_2 - m$, and $m$ is the combined mean of both sets of data.
Factory 1: $n_1 = 50$, $m_1 = 60$ and $\sigma_1 = 8$
Factory 2: $n_2 = 50$, $m_2 = 55$ and $\sigma_2 = 7$
(with $n_1$ and $n_2$ in thousands of tyres). Substituting these values in the above formulas:
$$\text{Mean} = \frac{(50 \times 60) + (50 \times 55)}{50 + 50} = \frac{3000 + 2750}{100} = \frac{5750}{100} = 57.5$$
Thus the mean life of the tyres manufactured by the company is 57,500 kms.
Therefore,
$$d_1 = m_1 - m = 60 - 57.5 = 2.5, \qquad d_2 = m_2 - m = 55 - 57.5 = -2.5$$
$$\text{Variance } (\sigma^2) = \frac{50 \times (64 + 6.25) + 50 \times (49 + 6.25)}{50 + 50} = \frac{(50 \times 70.25) + (50 \times 55.25)}{100} = \frac{3512.5 + 2762.5}{100} = \frac{6275}{100} = 62.75$$
$$\text{S.D. } (\sigma) = \sqrt{62.75} = 7.92$$
Thus, the S.D. of the lives of tyres produced by the company is 7,920 kms.
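The tyre calculation can be verified with a short Python sketch of the combining formulas. The function name `combine` and its argument order are assumptions made for this illustration; the input figures are the factory values given above (counts in thousands of tyres, lives in thousands of kms).

```python
import math


def combine(n1, m1, sd1, n2, m2, sd2):
    """Combined mean, variance and s.d. of two groups with known size, mean and s.d."""
    n = n1 + n2
    mean = (n1 * m1 + n2 * m2) / n
    d1, d2 = m1 - mean, m2 - mean          # deviation of each group mean from the combined mean
    variance = (n1 * (sd1 ** 2 + d1 ** 2) + n2 * (sd2 ** 2 + d2 ** 2)) / n
    return mean, variance, math.sqrt(variance)


mean, variance, sd = combine(50, 60, 8, 50, 55, 7)
print(mean, variance, round(sd, 2))  # 57.5 62.75 7.92
```

Note that the combined variance is larger than a plain weighted average of the two variances (56.5), because the $d_1^2$ and $d_2^2$ terms add the spread between the two factory means.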
Mean Deviation
The mean deviation is defined as
$$\text{Mean Deviation} = \frac{\sum f_i |x_i - \bar{x}|}{\sum f_i}$$
where $x_i$ is the middle point of the $i$th class interval, $f_i$ is the frequency of the $i$th class interval, and $\bar{x}$ is the arithmetic mean of the I.Q. scores.
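The working table has one row per class interval, followed by a Summation row, with the columns: Class Interval; Frequency ($f_i$); Mid Point of Class Interval ($x_i$); $f_i x_i$; $|x_i - \bar{x}|$; $f_i |x_i - \bar{x}|$.
From this data, we get
$$\text{Mean} = \frac{\sum f_i x_i}{\sum f_i} = \frac{7150}{100} = 71.5$$
$$\text{Mean Deviation} = \frac{\sum f_i |x_i - \bar{x}|}{\sum f_i} = \frac{1450}{100} = 14.5$$
Thus the average score is 71.5 and the mean deviation of the scores is 14.5.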
Suppose we are required to calculate only the standard deviation for the above data. The table is then constructed with one row per class interval, followed by a Summation row, and the columns: Class Interval; Frequency ($f_i$); Mid Point of Class Interval ($x_i$); $f_i x_i$; $x_i^2$; $f_i x_i^2$.
$$\text{Mean} = \frac{\sum f_i x_i}{\sum f_i} = \frac{7150}{100} = 71.5$$
$$\text{S.D.} = \sqrt{\frac{\sum f_i x_i^2 - \left(\sum f_i\right) \bar{x}^2}{\sum f_i}} = \sqrt{\frac{\sum f_i x_i^2 - 100\,(71.5)^2}{100}} = 16.5$$
Thus, the s.d. of the I.Q. scores is 16.5.
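The form of the standard deviation used here, with $\sum f_i x_i^2$ in place of the squared deviations, follows from expanding the square. A short derivation, using only the symbols already defined ($f_i$, $x_i$, $\bar{x}$), is sketched below.

```latex
\begin{aligned}
\sum f_i (x_i - \bar{x})^2
  &= \sum f_i x_i^2 - 2\bar{x} \sum f_i x_i + \bar{x}^2 \sum f_i \\
  &= \sum f_i x_i^2 - 2\bar{x} \Big( \bar{x} \sum f_i \Big) + \bar{x}^2 \sum f_i
     \qquad \text{since } \sum f_i x_i = \bar{x} \sum f_i \\
  &= \sum f_i x_i^2 - \Big( \sum f_i \Big) \bar{x}^2
\end{aligned}
```

Dividing by $\sum f_i$ and taking the square root gives the s.d. With $\sum f_i = 100$ and $\bar{x} = 71.5$, the subtracted term is $100 \times (71.5)^2 = 511{,}225$.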
Coefficient of Variation
It is a relative measure of dispersion that enables us to compare two distributions. It relates the standard deviation and the mean by expressing the standard deviation as a percentage of the mean:
$$\text{C.V.} = \frac{\sigma}{\bar{x}} \times 100$$
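As an illustration, the coefficient of variation can be computed for the two tyre factories from the earlier example (means 60 and 55, standard deviations 8 and 7, in thousands of kms). The function name is an assumption made for this sketch, and applying the C.V. to the tyre data is an added illustration rather than a step from the slides.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return sd / mean * 100


print(round(coefficient_of_variation(8, 60), 2))  # 13.33  (Factory 1)
print(round(coefficient_of_variation(7, 55), 2))  # 12.73  (Factory 2)
```

The smaller value indicates the relatively more uniform group, so on these figures Factory 2 is slightly more consistent relative to its mean.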
Example
For the data 103, 50, 68, 110, 105, 108, 174, 103, 150, 200, 225, 350, 103, find the range, the coefficient of range and the coefficient of quartile deviation.
1) Range $= H - L = 350 - 50 = 300$
2) Coefficient of range $= \dfrac{H - L}{H + L} = \dfrac{300}{400} = 0.75$
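To find Q1 and Q3 we arrange the data in ascending order:
50, 68, 103, 103, 103, 105, 108, 110, 150, 174, 200, 225, 350
Here $n = 13$, so
$$\frac{n + 1}{4} = \frac{14}{4} = 3.5, \qquad \frac{3(n + 1)}{4} = \frac{3 \times 14}{4} = 10.5$$
Q1 is the 3.5th value: $Q_1 = 103 + 0.5\,(103 - 103) = 103$
Q3 is the 10.5th value: $Q_3 = 174 + 0.5\,(200 - 174) = 187$
$$\text{Coefficient of QD} = \frac{Q_3 - Q_1}{Q_3 + Q_1} = \frac{187 - 103}{187 + 103} = \frac{84}{290} = 0.29$$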
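The quartile positions $(n+1)/4$ and $3(n+1)/4$, with linear interpolation between neighbouring values, can be sketched in Python as follows. The helper names are illustrative and not from the slides; note that statistical libraries often use other quartile conventions, so their results may differ slightly from this rule.

```python
def value_at_position(ordered, position):
    """Value at a 1-based, possibly fractional, position using linear interpolation."""
    lower = int(position)              # whole part of the position
    fraction = position - lower        # fractional part of the position
    if lower < 1:
        return ordered[0]
    if lower >= len(ordered):
        return ordered[-1]
    base = ordered[lower - 1]
    return base + fraction * (ordered[lower] - base)


def quartiles(values):
    """Q1 and Q3 at positions (n + 1) / 4 and 3 (n + 1) / 4 of the sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    q1 = value_at_position(ordered, (n + 1) / 4)
    q3 = value_at_position(ordered, 3 * (n + 1) / 4)
    return q1, q3


data = [103, 50, 68, 110, 105, 108, 174, 103, 150, 200, 225, 350, 103]
q1, q3 = quartiles(data)
print(q1, q3)                           # 103.0 187.0
print(round((q3 - q1) / (q3 + q1), 2))  # 0.29
```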
Example
A purchasing agent obtained a sample of incandescent lamps from each of two suppliers, A and B. He had the samples tested in his laboratory for length of life, and recorded the results in a frequency table with the columns: Length of life (in hours), starting at 700; Sample A; Sample B.
Which company's lamps are more uniform?
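Sample A
The working table has one row per class interval, followed by a Total row, with the columns: Class interval; Sample A frequency ($f$); Midpoint ($x$); coded value ($u$); $f u$; $f u^2$.
$$\bar{u}_A = \frac{\sum f u}{N} = \frac{32}{60} = 0.533$$
The mean $\bar{x}_A$ is obtained by converting $\bar{u}_A = 0.533$ back to hours.
$$\sigma_u^2 = \frac{1}{N} \sum f u^2 - (\bar{u})^2, \qquad N = 60, \quad \bar{u} = 0.533$$
$$\sigma_x = 200 \times \sigma_u$$
$$\text{C.V. for sample A} = \frac{\sigma_A}{\bar{x}_A} \times 100$$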
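Sample B
The working table has the same layout as for Sample A, with the columns: Class interval; Sample B frequency ($f$); Midpoint ($x$); coded value ($u$); $f u$; $f u^2$. The Total row gives $\sum f = 60$ and $\sum f u = 27$.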