Numerical Measures: Measures of Central Tendency (Location), Measures of Non-Central Location, Measures of Variability (Dispersion, Spread), Measures of Shape

1 Numerical Measures

2 Measures of Central Tendency (Location); Measures of Non-Central Location; Measures of Variability (Dispersion, Spread); Measures of Shape

3 Measures of Central Tendency (Location): Mean, Median, Mode

4 Measures of Non-Central Location: Quartiles, Mid-Hinges, Percentiles

5 Measures of Variability (Dispersion, Spread): Variance, Standard Deviation, Range, Inter-Quartile Range

6 Measures of Shape: Skewness, Kurtosis

7 Summation Notation

8 Let $x_1, x_2, x_3, \ldots, x_n$ denote a set of $n$ numbers. Then the symbol $\sum_{i=1}^{n} x_i$ denotes the sum of these $n$ numbers: $x_1 + x_2 + x_3 + \cdots + x_n$.

9 Example: Let $x_1, x_2, x_3, x_4, x_5$ denote the set of 5 numbers in the following table.

  i      1   2   3   4   5
  x_i   10  15  21   7  13

10 Then the symbol $\sum_{i=1}^{5} x_i$ denotes the sum of these 5 numbers: $x_1 + x_2 + x_3 + x_4 + x_5 = 10 + 15 + 21 + 7 + 13 = 66$.

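11 Meaning of the parts of summation notation, using $\sum_{i=1}^{n} x_i$: $i$ is the quantity changing in each term of the sum; $1$ (below the $\Sigma$) is the starting value for $i$; $n$ (above the $\Sigma$) is the final value for $i$; $x_i$ is each term of the sum.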

12 Example: Again let $x_1, x_2, x_3, x_4, x_5$ denote the set of 5 numbers in the following table.

  i      1   2   3   4   5
  x_i   10  15  21   7  13

13 Then the symbol $\sum_{i=2}^{4} x_i^3$ denotes the sum of these 3 numbers: $x_2^3 + x_3^3 + x_4^3 = 15^3 + 21^3 + 7^3 = 3375 + 9261 + 343 = 12979$.
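
The two sums in these examples can be checked with a few lines of Python. This is only an illustrative sketch: the helper name `summation` and its 1-based `start`/`stop` arguments are assumptions made here to mirror the starting and final values of i, not anything defined in the slides.

```python
# Values from the example table. Python lists are 0-indexed,
# so x[0] holds x_1 and x[4] holds x_5.
x = [10, 15, 21, 7, 13]


def summation(terms, start, stop, f=lambda v: v):
    """Return the sum of f(x_i) for i = start, ..., stop (1-based, inclusive)."""
    return sum(f(terms[i - 1]) for i in range(start, stop + 1))


total = summation(x, 1, 5)                    # x_1 + x_2 + ... + x_5
cubes = summation(x, 2, 4, lambda v: v ** 3)  # x_2^3 + x_3^3 + x_4^3
print(total, cubes)
```

The `start` and `stop` arguments play the role of the values written below and above the summation sign, and `f` is the expression applied to each term.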

14 Measures of Central Location (Mean)

15 Mean: Let $x_1, x_2, x_3, \ldots, x_n$ denote a set of $n$ numbers. Then the mean of the $n$ numbers is defined as: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \cdots + x_n}{n}$

16 Example: Again let $x_1, x_2, x_3, x_4, x_5$ denote the set of 5 numbers in the following table.

  i      1   2   3   4   5
  x_i   10  15  21   7  13

17 Then the mean of the 5 numbers is: $\bar{x} = \frac{10 + 15 + 21 + 7 + 13}{5} = \frac{66}{5} = 13.2$

18 Interpretation of the Mean: Let $x_1, x_2, x_3, \ldots, x_n$ denote a set of $n$ numbers. Then the mean, $\bar{x}$, is the centre of gravity of those $n$ numbers. That is, if we drew a horizontal line and placed a weight of one at each value of $x_i$, then the balancing point of that system of masses is at the point $\bar{x}$.

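19 [Figure: the points $x_1, x_2, x_3, x_4, \ldots, x_n$ marked on a horizontal line]

20 In the Example [Figure: the values 7, 10, 13, 15, 21 marked on a number line]

21 The mean, $\bar{x}$, is also approximately the center of gravity of a histogram.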
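
The centre-of-gravity interpretation can be checked numerically: if the mean is the balancing point, the signed distances of the unit weights from it should cancel. The sketch below uses the five example values; the names `mean`, `deviations` and `net_moment` are chosen here for illustration only.

```python
x = [10, 15, 21, 7, 13]


def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    if not values:
        raise ValueError("mean of an empty data set is undefined")
    return sum(values) / len(values)


x_bar = mean(x)

# Signed distance of each unit weight from the candidate balancing point.
deviations = [v - x_bar for v in x]

# At the centre of gravity these distances cancel (up to rounding error).
net_moment = sum(deviations)
print(x_bar, net_moment)
```

Any point other than the mean would leave a nonzero net moment, which is why the line would tip to one side.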

22 Measures of Central Location (Median)

23 The Median: Let $x_1, x_2, x_3, \ldots, x_n$ denote a set of $n$ numbers. Then the median of the $n$ numbers is defined as the number that splits the numbers into two equal parts. To evaluate the median we arrange the numbers in increasing order.

24 If the number of observations is odd, there will be one observation in the middle. This number is the median. If the number of observations is even, there will be two middle observations. The median is the average of these two observations.

25 Example: Again let $x_1, x_2, x_3, x_4, x_5$ denote the set of 5 numbers in the following table.

  i      1   2   3   4   5
  x_i   10  15  21   7  13

26 The numbers arranged in order are: 7 10 13 15 21. The unique “middle” observation, 13, is the median.

27 Example 2: Let $x_1, x_2, x_3, x_4, x_5, x_6$ denote the 6 numbers: 23 41 12 19 64 8. Arranged in increasing order these observations would be: 8 12 19 23 41 64. The two “middle” observations are 19 and 23.

28 Median = average of the two “middle” observations = $\frac{19 + 23}{2} = 21$
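
The odd and even cases can be written as one short function. This is a minimal sketch of the rule stated above, with the function name `median` chosen here for illustration; it is applied to the two examples.

```python
def median(values):
    """Median by the sort-and-pick-the-middle rule."""
    ordered = sorted(values)          # arrange in increasing order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: exactly one middle observation.
        return ordered[mid]
    # Even n: average the two middle observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


example_1 = median([10, 15, 21, 7, 13])
example_2 = median([23, 41, 12, 19, 64, 8])
print(example_1, example_2)
```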

29 Example: The data on N = 23 students. Variables: Verbal IQ, Math IQ, Initial Reading Achievement Score, Final Reading Achievement Score

30 Data Set #3: The following table gives data on Verbal IQ, Math IQ, Initial Reading Achievement Score, and Final Reading Achievement Score for 23 students who have recently completed a reading improvement program.

                              Initial      Final
           Verbal   Math      Reading      Reading
  Student  IQ       IQ        Achievement  Achievement
   1        86       94       1.1          1.7
   2       104      103       1.5          1.7
   3        86       92       1.5          1.9
   4       105      100       2.0          2.0
   5       118      115       1.9          3.5
   6        96      102       1.4          2.4
   7        90       87       1.5          1.8
   8        95      100       1.4          2.0
   9       105       96       1.7          1.7
  10        84       80       1.6          1.7
  11        94       87       1.6          1.7
  12       119      116       1.7          3.1
  13        82       91       1.2          1.8
  14        80       93       1.0          1.7
  15       109      124       1.8          2.5
  16       111      119       1.4          3.0
  17        89       94       1.6          1.8
  18        99      117       1.6          2.6
  19        94       93       1.4          1.4
  20        99      110       1.4          2.0
  21        95       97       1.5          1.3
  22       102      104       1.7          3.1
  23       102       93       1.6          1.9

31 Computing the Median (Stem-Leaf Diagrams): Median = middle observation = 12th observation

32 Summary

43 Some Comments: The mean is the centre of gravity of a set of observations, the balancing point. The median splits the observations equally into two parts of approximately 50% each.

44 The median splits the area under a histogram into two parts of 50% each. The mean is the balancing point of a histogram. [Figure: histogram with 50% of the area on each side of the median]

45 For symmetric distributions the mean and the median will be approximately the same value. [Figure: symmetric histogram with 50% of the area on each side of the median and mean]

46 For positively skewed distributions the mean exceeds the median. For negatively skewed distributions the median exceeds the mean. [Figure: skewed histograms with 50% of the area on each side of the median]

47 An outlier is a “wild” observation in the data. Outliers occur because of: errors (typographical and computational); extreme cases in the population.

48 The mean is altered to a significant degree by the presence of outliers. Outliers have little effect on the value of the median. This is a reason for using the median in place of the mean as a measure of central location. Alternatively, the mean is the best measure of central location when the data are Normally distributed (bell-shaped).
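
A small numerical illustration of this sensitivity, using Python's standard `statistics` module. The outlier here is hypothetical: it assumes the value 21 from the earlier five-number example was mistyped as 210.

```python
from statistics import mean, median

clean = [10, 15, 21, 7, 13]

# Hypothetical typographical error: 21 recorded as 210.
with_outlier = [10, 15, 210, 7, 13]

# How far each measure moves when the single wild value is introduced.
mean_shift = mean(with_outlier) - mean(clean)
median_shift = median(with_outlier) - median(clean)

print("mean:  ", mean(clean), "->", mean(with_outlier))
print("median:", median(clean), "->", median(with_outlier))
```

One bad value drags the mean far from the bulk of the data, while the middle observation in sorted order is unchanged.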

49 Review

50 Summarizing Data: Graphical Methods

51 Histogram; Stem-Leaf Diagram; Grouped Frequency Table

52 Numerical Measures: Measures of Central Tendency (Location); Measures of Non-Central Location; Measures of Variability (Dispersion, Spread); Measures of Shape. The objective is to reduce the data to a small number of values that describe the data, or certain aspects of the data, as completely as possible.

81 Measures of Non-Central Location: Percentiles, Quartiles (Hinges, Mid-hinges)

82 Definition: The P×100 Percentile is a point, $x_P$, underneath a distribution that has a fixed proportion P of the population (or sample) below that value. [Figure: distribution with P×100% of the area below $x_P$]

83 Definition (Quartiles): The first Quartile, $Q_1$, is the 25th Percentile, $x_{0.25}$. [Figure: distribution with 25% of the area below $x_{0.25}$]

84 The second Quartile, $Q_2$, is the 50th Percentile, $x_{0.50}$. [Figure: distribution with 50% of the area below $x_{0.50}$]

85 The second Quartile, $Q_2$, is also the median and the 50th percentile.

86 The third Quartile, $Q_3$, is the 75th Percentile, $x_{0.75}$. [Figure: distribution with 75% of the area below $x_{0.75}$]

87 The Quartiles $Q_1$, $Q_2$, $Q_3$ divide the population into 4 equal parts of 25%. [Figure: distribution split into four 25% regions at $Q_1$, $Q_2$, $Q_3$]

88 Computing Percentiles and Quartiles, Method 1: The first step is to order the observations in increasing order. We then compute the position, k, of the P×100 Percentile: $k = P \times (n+1)$, where n = the number of observations.

89 Example: The data on n = 23 students. Variables: Verbal IQ, Math IQ, Initial Reading Achievement Score, Final Reading Achievement Score. We want to compute the 75th percentile and the 90th percentile.

90 The position, k, of the 75th Percentile: $k = P \times (n+1) = 0.75 \times (23+1) = 18$. The position, k, of the 90th Percentile: $k = P \times (n+1) = 0.90 \times (23+1) = 21.6$. When the position k is an integer, the percentile is the k-th observation (in order of magnitude) in the data set. For example, the 75th percentile is the 18th (in size) observation.

91 When the position k is not an integer but an integer (m) + a fraction (f), i.e. $k = m + f$, then the percentile is $x_P = (1-f) \times (m\text{-th observation in size}) + f \times ((m+1)\text{-st observation in size})$. In the example the position of the 90th percentile is k = 21.6, so m = 21 and f = 0.6. Then $x_{0.90}$ = 0.4 × (21st observation in size) + 0.6 × (22nd observation in size).

92 [Figure: $x_P$ located between the m-th observation and the (m+1)-st observation] $x_P = (1-f) \times (m\text{-th obs}) + f \times ((m+1)\text{-st obs})$

93 Thus the position of $x_P$ is 100f% of the way through the interval between the m-th observation and the (m+1)-st observation.

94 Example: The Verbal IQ data on n = 23 students, arranged in increasing order: 80 82 84 86 86 89 90 94 94 95 95 96 99 99 102 102 104 105 105 109 111 118 119

95 $x_{0.75}$ = 75th percentile = 18th observation in size = 105 (position k = 18). $x_{0.90}$ = 90th percentile = 0.4 × (21st observation in size) + 0.6 × (22nd observation in size) = 0.4(111) + 0.6(118) = 115.2 (position k = 21.6).
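
Method 1 translates directly into code. The sketch below is one possible reading of the rule, with the name `percentile_method1` chosen here for illustration. The slides do not say what to do when the position k falls below 1 or beyond n, so clamping to the smallest or largest observation is an assumption made here.

```python
import math


def percentile_method1(values, p):
    """P x 100 percentile via position k = P * (n + 1), with interpolation."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("percentile of an empty data set is undefined")
    k = p * (n + 1)        # position of the percentile
    m = math.floor(k)      # integer part of the position
    f = k - m              # fractional part of the position
    # Assumption (not from the slides): clamp positions outside 1..n.
    if m < 1:
        return ordered[0]
    if m >= n:
        return ordered[-1]
    # (1 - f) * (m-th observation) + f * ((m+1)-st observation);
    # the list is 0-indexed, so the m-th observation is ordered[m - 1].
    return (1 - f) * ordered[m - 1] + f * ordered[m]


verbal_iq = [80, 82, 84, 86, 86, 89, 90, 94, 94, 95, 95, 96, 99, 99,
             102, 102, 104, 105, 105, 109, 111, 118, 119]

p75 = percentile_method1(verbal_iq, 0.75)
p90 = percentile_method1(verbal_iq, 0.90)
print(p75, round(p90, 1))
```

Because k is computed in floating point, the fractional part can carry a tiny rounding error, so results should be compared with a tolerance rather than for exact equality.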

96 An Alternative Method for Computing Quartiles, Method 2: Sometimes this method will result in the same values for the quartiles as Method 1, and sometimes in different values. For large samples the two methods will result in approximately the same answer.

97 Let $x_1, x_2, x_3, \ldots, x_n$ denote a set of $n$ numbers. The first step in Method 2 is to arrange the numbers in increasing order. From the arranged numbers we compute the median. This is also called the Hinge.

98 Example: Consider the 5 numbers: 10 15 21 7 13. Arranged in increasing order: 7 10 13 15 21. The median (or Hinge), 13, splits the observations in half.

99 The lower mid-hinge (the first quartile) is the “median” of the lower half of the observations (excluding the median). The upper mid-hinge (the third quartile) is the “median” of the upper half of the observations (excluding the median).

100 Consider the five numbers in increasing order: 7 10 13 15 21. Median (Hinge): 13. Lower half: 7 10. Upper half: 15 21. Lower Mid-Hinge (First Quartile): (7+10)/2 = 8.5. Upper Mid-Hinge (Third Quartile): (15+21)/2 = 18.

101 Computing the median and the quartiles using the first method: Position of the median: k = 0.5(5+1) = 3, so $Q_2$ = 13. Position of the first Quartile: k = 0.25(5+1) = 1.5, so $Q_1$ = 0.5(7) + 0.5(10) = 8.5. Position of the third Quartile: k = 0.75(5+1) = 4.5, so $Q_3$ = 0.5(15) + 0.5(21) = 18.

102 Both methods result in the same values here. This is not always true.

103 Example: The Verbal IQ data on n = 23 students, arranged in increasing order: 80 82 84 86 86 89 90 94 94 95 95 96 99 99 102 102 104 105 105 109 111 118 119. Median (Hinge): 96. Lower Mid-Hinge (First Quartile): 89. Upper Mid-Hinge (Third Quartile): 105.

104 Computing the median and the quartiles using the first method: Position of the median: k = 0.5(23+1) = 12, so $Q_2$ = 96. Position of the first Quartile: k = 0.25(23+1) = 6, so $Q_1$ = 89. Position of the third Quartile: k = 0.75(23+1) = 18, so $Q_3$ = 105.
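
Method 2 can be sketched as follows. The function names are chosen here for illustration. The slides only show odd sample sizes, where the median is an actual observation that gets excluded from both halves; for an even sample size this sketch simply splits the ordered data into two equal halves, which is an assumption.

```python
def median_of_sorted(ordered):
    """Median of a list that is already in increasing order."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles_method2(values):
    """Return (lower mid-hinge, hinge, upper mid-hinge)."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = ordered[:half]
    # Odd n: skip the median itself. Even n (assumption): take the top half.
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return (median_of_sorted(lower),
            median_of_sorted(ordered),
            median_of_sorted(upper))


small = quartiles_method2([10, 15, 21, 7, 13])

verbal_iq = [80, 82, 84, 86, 86, 89, 90, 94, 94, 95, 95, 96, 99, 99,
             102, 102, 104, 105, 105, 109, 111, 118, 119]
verbal = quartiles_method2(verbal_iq)
print(small, verbal)
```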

105 Many programs compute percentiles, quartiles etc. Each may use different methods. It is important to know which method is being used. The different methods result in answers that are close when the sample size is large.
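
Python's standard library gives a concrete case. `statistics.quantiles` (Python 3.8+) accepts `method="exclusive"` or `method="inclusive"`. On the Verbal IQ data the exclusive method reproduces the quartiles found above, while the inclusive method gives slightly different values. The correspondence between the exclusive method and the k = P × (n+1) position rule is stated here as an observation on this data set, not as a claim about every edge case.

```python
from statistics import quantiles

verbal_iq = [80, 82, 84, 86, 86, 89, 90, 94, 94, 95, 95, 96, 99, 99,
             102, 102, 104, 105, 105, 109, 111, 118, 119]

# n=4 asks for the three cut points that split the data into quarters.
exclusive = quantiles(verbal_iq, n=4, method="exclusive")
inclusive = quantiles(verbal_iq, n=4, method="inclusive")

print("exclusive:", exclusive)
print("inclusive:", inclusive)
```

The median agrees under both settings, but the first and third quartiles differ by half a point, which is the kind of discrepancy to watch for when comparing output from different programs.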

117 Box-Plots (Box-Whisker Plots): A graphical method of displaying data. An alternative to the histogram and stem-leaf diagram.

118 To Draw a Box Plot: Compute the Hinge (Median, $Q_2$) and the Mid-hinges (first and third quartiles, $Q_1$ and $Q_3$). We also compute the largest and smallest of the observations, the max and the min.

119 Example: The Verbal IQ data on n = 23 students, arranged in increasing order: 80 82 84 86 86 89 90 94 94 95 95 96 99 99 102 102 104 105 105 109 111 118 119. $Q_1$ = 89, $Q_2$ = 96, $Q_3$ = 105, min = 80, max = 119.

120 The Box Plot is then drawn by: drawing above an axis a “box” from $Q_1$ to $Q_3$; drawing a vertical line in the box at the median, $Q_2$; drawing whiskers at the lower and upper ends of the box going down to the min and up to the max.

121 [Figure: box from $Q_1$ to $Q_3$ with a line at $Q_2$, lower whisker to the min, upper whisker to the max]

122 Example: For the Verbal IQ data on n = 23 students: min = 80, $Q_1$ = 89, $Q_2$ = 96, $Q_3$ = 105, max = 119.

123 [Figure: Box Plot of Verbal IQ, horizontal axis from 70 to 130]

124 The Box Plot can also be drawn vertically. [Figure: the same box plot on a vertical axis from 70 to 130]

125 Box-Whisker plots (Verbal IQ, Math IQ)

126 Box-Whisker plots (Initial RA, Final RA )

127 Summary: Information contained in the box plot. The box covers the middle 50% of the population; each whisker covers 25%.
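
The five values a box plot displays can be collected in one helper. This is a sketch only: the name `five_number_summary` is chosen here, and the quartiles are taken from `statistics.quantiles` with `method="exclusive"`, which is assumed to be an acceptable stand-in for the quartile methods described in the slides (it gives the same values on the Verbal IQ data).

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (min, Q1, Q2, Q3, max), the values needed to draw a box plot."""
    q1, q2, q3 = quantiles(values, n=4, method="exclusive")
    return min(values), q1, q2, q3, max(values)


verbal_iq = [80, 82, 84, 86, 86, 89, 90, 94, 94, 95, 95, 96, 99, 99,
             102, 102, 104, 105, 105, 109, 111, 118, 119]

summary = five_number_summary(verbal_iq)
print(summary)
```

The box runs between the second and fourth entries of the result, the line inside the box sits at the third, and the whiskers extend to the first and last.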

128 Next topic: Numerical Measures of Variability

