Descriptive Statistics


Presentation on theme: "Descriptive Statistics"— Presentation transcript:

1 Descriptive Statistics
Dr. Asif Rehman

2 Measures of Central Tendency
Measures of central tendency are the summary measures used to describe the most “typical” value in a set of values. The three most common measures of central tendency are the mean, the median, and the mode.

3 Mean The mean is the most popular measure of central tendency for a quantitative data set and is also known as the average. It is calculated by adding all the observations and dividing by the total number of observations. The sample mean is denoted by x̅ (pronounced x bar) and the population mean is denoted by µ (the Greek letter mu). The mean can only be calculated for quantitative data.

4 Mean Suppose we draw a sample of five women and measure their weights in pounds: 110, 110, 140, 150, 160. The mean weight would be equal to (110 + 110 + 140 + 150 + 160)/5 = 670/5 = 134 pounds.
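
The calculation above can be reproduced in a few lines of Python. This is only a minimal sketch using the five weights from the slide; the variable names are illustrative and not part of the original presentation.

```python
from statistics import mean

# Weights in pounds from the five-woman sample on the slide.
weights = [110, 110, 140, 150, 160]

# Manual calculation: add all observations, divide by how many there are.
manual_mean = sum(weights) / len(weights)

# The standard library gives the same value.
library_mean = mean(weights)

print(manual_mean)  # 134.0
```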

5 Median The median is an important measure of central tendency.
It is the value that divides a distribution into two equal halves. Arrange the observations in order from the smallest to the largest value or vice versa. If there is an odd number of total observations, the median is the middle value. If there is an even number of total observations, the median is the average of the two middle values. The median is useful when some measurements are much bigger or much smaller than the rest. The mean of such data will be pulled towards these extreme values, while the median is not influenced by them.

6 Median Suppose we draw a sample of five women and measure their weights in pounds: 110, 140, 110, 160, 150. Arrange in ascending order (110, 110, 140, 150, 160). The median value would be 140 pounds, since 140 pounds is the middle weight.
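
The sort-then-pick-the-middle procedure can be sketched as a small Python function. The function name `median_by_sorting` and the four-value sample are hypothetical, added only to show the even-count case; the five weights are the ones from the slide.

```python
from statistics import median

def median_by_sorting(values):
    """Median via the slide's procedure: sort, then take the middle."""
    if not values:
        raise ValueError("median of an empty sample is undefined")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value.
        return ordered[mid]
    # Even count: average of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

weights = [110, 140, 110, 160, 150]   # unsorted, as on the slide
even_sample = [110, 110, 140, 150]    # hypothetical four-value sample

print(median_by_sorting(weights))      # 140
print(median_by_sorting(even_sample))  # 125.0
```

The standard-library `statistics.median` follows the same rule and can be used instead of the hand-written function.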

7 Mode The mode is the most frequently occurring value in a set of observations. In the set 110, 110, 140, 150, 160, the most frequent value is 110 (it occurs twice), so the mode of the data is 110 pounds.
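
One way to find the mode is to count how often each value occurs. The sketch below uses Python's `collections.Counter` on the slide's data; it is one possible approach, not the presenter's own method.

```python
from collections import Counter
from statistics import multimode

weights = [110, 110, 140, 150, 160]

# Count occurrences of each value and take the most frequent one.
counts = Counter(weights)
mode_value, frequency = counts.most_common(1)[0]
print(mode_value, frequency)  # 110 2

# multimode returns every value tied for the highest frequency,
# which matters when a data set has more than one mode.
all_modes = multimode(weights)
```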

8 Mean versus Median The median may be a better indicator of the most typical value if a set of scores has an outlier. An outlier is an extreme value that differs greatly from the other values, lying far above or below the mean. For example, suppose that in the data above one individual has a weight of 250 lbs (the weight of 160 lbs is replaced by 250 lbs). This is an extreme value, i.e. an outlier, and it affects the mean: Mean = (110 + 110 + 140 + 150 + 250)/5 = 760/5 = 152 lbs. Because of the 250 lbs value, the mean is now higher than most readings in the data set. Hence, in such cases the median should be reported, which remains 140 lbs. However, when the sample size is large and does not include outliers, the mean usually provides a better measure of central tendency.
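
The effect of the single extreme value can be checked directly. This is a minimal sketch comparing the original sample with the one in which 160 is replaced by 250; the variable names are illustrative.

```python
from statistics import mean, median

original = [110, 110, 140, 150, 160]
with_outlier = [110, 110, 140, 150, 250]  # 160 replaced by 250

for label, data in [("original", original), ("with outlier", with_outlier)]:
    # The mean shifts with the extreme value; the median does not.
    print(f"{label}: mean={mean(data):.1f}, median={median(data):.1f}")
# original: mean=134.0, median=140.0
# with outlier: mean=152.0, median=140.0
```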

9 Measures of Variation
These are the measures that describe the amount of variability or spread in a set of data. The most common measures of variability are the range, the variance, and the standard deviation.

10 Measures of Variation
Range: The range is the simplest measure of variability. It is defined as the difference in value between the highest and lowest observations in the data set. For example, consider the women's weights in the data set 110, 110, 140, 150, 160. The range would be 160 − 110 = 50 pounds.
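
As a quick check, the range of 50 follows from the largest and smallest weights in the sample. A minimal sketch:

```python
weights = [110, 110, 140, 150, 160]

# Range = highest observation minus lowest observation.
data_range = max(weights) - min(weights)
print(data_range)  # 50
```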

11 Measures of Variation
Variance: The variance quantifies the amount of variability or spread about the mean of the sample. For instance, the women's weights in the previous example were 110, 110, 140, 150 and 160 pounds. Variance (S²) = Σ (xᵢ − x̅)² / (n − 1), where xᵢ = individual sample observation, x̅ = sample mean, n = total sample size, and Σ = sum of the squared differences between each individual sample observation and the sample mean.

12 Measures of Variation
Variance: Variance (S²) = Σ (xᵢ − x̅)² / (n − 1) = [(110 − 134)² + (110 − 134)² + (140 − 134)² + (150 − 134)² + (160 − 134)²] / (5 − 1) = [(−24)² + (−24)² + (6)² + (16)² + (26)²] / 4 = [576 + 576 + 36 + 256 + 676] / 4 = 2120/4 = 530 pounds². (Squaring each difference removes the minus sign, so we obtain the squared difference of each value from the mean.)
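
The steps of the variance calculation can be traced in Python. This is a sketch that assumes the sample formula with n − 1 in the denominator, as used on the slide; the variable names are illustrative.

```python
from statistics import variance

weights = [110, 110, 140, 150, 160]
n = len(weights)
sample_mean = sum(weights) / n  # 134.0

# Squared difference of each observation from the sample mean.
squared_deviations = [(x - sample_mean) ** 2 for x in weights]
print(squared_deviations)  # [576.0, 576.0, 36.0, 256.0, 676.0]

# Sample variance: sum of squared differences divided by (n - 1).
sample_variance = sum(squared_deviations) / (n - 1)
print(sample_variance)  # 530.0

# statistics.variance also divides by (n - 1) and should agree.
library_variance = variance(weights)
```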

13 Standard Deviation SD is the square root of the variance.
The SD is a measure that describes how much individual measurements differ, on average, from the mean. SD = S = √530 ≈ 23.02 pounds.
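
Taking the square root of the variance of 530 can be verified as follows. This is a minimal sketch; `statistics.stdev` is the sample standard deviation (n − 1 denominator), which matches the variance formula used in these slides.

```python
import math
from statistics import stdev

weights = [110, 110, 140, 150, 160]
sample_variance = 530  # from the variance slide

# SD is the square root of the variance.
sd_from_variance = math.sqrt(sample_variance)
print(round(sd_from_variance, 2))  # 23.02

# Computing it directly from the data gives the same value.
sd_from_data = stdev(weights)
```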

14 Standard Deviation A large SD reflects that there is a wide scatter of measured values around the mean while a small SD reflects that the individual values are concentrated around the mean with little variation among them.
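
To illustrate the contrast, the sketch below compares two hypothetical samples that share the same mean of 134 but differ in scatter. Both data sets are invented for illustration and do not come from the presentation.

```python
from statistics import mean, stdev

# Hypothetical samples with the same mean but different spread.
tight = [132, 133, 134, 135, 136]  # values close to the mean
wide = [94, 114, 134, 154, 174]    # values scattered widely

for label, data in [("tight", tight), ("wide", wide)]:
    print(f"{label}: mean={mean(data):.1f}, SD={stdev(data):.2f}")
# tight: mean=134.0, SD=1.58
# wide: mean=134.0, SD=31.62
```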

15 Standard Deviation Remember that in planning and decision making we are interested in a figure that tells us the average difference of each value from the mean. What we obtain from the variance, however, is the average of the squared differences. To get back to the scale of the differences themselves, we reverse the squaring by taking the square root of the variance. The resulting figure is the one we are really interested in. It is termed the standard deviation and is one of the most powerful tools in biostatistics.

16 Standard Deviation Exercise
Suppose that, for a study of 300 chronic kidney disease patients, the Hb levels were obtained and plotted. The data are normally distributed, with the mean Hb and SD calculated as 7 mg/dL and 1 mg/dL, respectively. (3 marks) Calculate the number of patients whose Hb level will be within the range of 6 mg/dL to 8 mg/dL.

17 SD

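18 SD Mean = 7 mg/dL, SD = 1 mg/dL
The range 6 mg/dL to 8 mg/dL is the mean ± 1 SD, which covers about 68% of a normal distribution. Number of patients within 1 SD on either side of the mean = 300 × 68% = 204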
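
As a rough check on the exercise, the share of a normal distribution within 1 SD of the mean can be computed with Python's standard-library `NormalDist`. This is a sketch that assumes the Hb values follow an exact normal distribution with the stated mean and SD; the variable names are illustrative.

```python
from statistics import NormalDist

n_patients = 300
hb = NormalDist(mu=7, sigma=1)  # mean and SD from the exercise

# Proportion of the distribution between 6 and 8 (mean ± 1 SD).
share_within_1sd = hb.cdf(8) - hb.cdf(6)
print(f"{share_within_1sd:.4f}")  # 0.6827

# The slide uses the rounded 68% rule.
approx_count = round(n_patients * 0.68)
print(approx_count)  # 204

# With the unrounded proportion the expected count is slightly higher.
expected_count = n_patients * share_within_1sd
print(f"{expected_count:.1f}")  # 204.8
```

The slide's answer of 204 comes from rounding the proportion to 68%. The unrounded proportion gives an expected count of about 205.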

19 THANK YOU

