
Medical Statistics, Tao Yuchun, 2014.3.3


1 2014.3.3 Medical Statistics, Tao Yuchun, http://cc.jlu.edu.cn/ms.html

2 Descriptive Statistics 1. Statistical Description for Numerical Data

3 1.1 Frequency Distribution
Example 1-1: There are 132 observations of blood sugar (mmol/L) from normal adults aged 55 to 58 (see below).
5.17 5.56 4.86 4.87 4.74 5.24 5.51 4.46 4.96 4.82 4.90 5.30
5.22 5.58 4.48 4.80 4.60 4.02 5.16 5.36 4.34 4.24 4.64 4.27
4.25 4.44 4.46 4.62 4.87 4.34 4.90 5.25 4.77 4.85 5.07 4.16
4.66 4.70 4.20 3.95 4.09 4.64 4.33 5.21 4.61 4.98 5.24 4.60
4.25 4.78 5.00 3.60 4.11 4.61 4.08 4.78 4.26 4.44 4.38 4.40
4.79 4.76 4.92 4.60 4.78 5.03 4.35 4.18 4.68 4.65 4.57 4.27
4.99 4.21 4.89 4.71 4.72 4.41 4.38 4.06 4.79 4.96 4.83 4.45
4.51 4.27 4.50 4.31 5.05 5.59 5.08 5.16 3.74 4.36 5.36 4.64
5.09 4.57 4.46 4.56 4.39 5.24 4.61 4.21 4.96 4.34 4.45 4.86
4.50 4.90 4.45 4.49 4.42 4.68 4.56 5.38 4.34 4.46 4.16 4.98
4.29 4.83 4.27 3.68 3.85 3.86 4.56 4.56 4.55 5.16 5.15 5.16

4 Looking at the raw data, what is your impression? Vague and confusing! How can the data be organized? With a frequency table!

5 (1) Steps for building a frequency table
a. Find the minimum and maximum, and calculate the range: Range = R = Max - Min = 5.59 - 3.60 = 1.99.
b. Decide the width of the groups: R/10 = 1.99/10 = 0.199 ≈ 0.2, so i = 0.2 (i: length of each sub-interval).
c. Work out the list of sub-intervals, each of the form [lower limit, upper limit).

6 ★ The first sub-interval must include the minimum.
★ The last sub-interval must include the maximum.
★ The number of sub-intervals should be about 8 to 15.
1st sub-interval: [3.60, 3.80); 2nd sub-interval: [3.80, 4.00); …; last sub-interval: [5.40, 5.60) (see column 1 of Table 1-1).

7 d. Read, mark and count through the whole data set. Marking methods: “正” or “ ”. Tip: go through the values one by one (see column 2 of Table 1-1).
e. Calculate the frequencies. The total number of strokes for each sub-interval equals its frequency (see column 3 of Table 1-1).
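The steps above can be sketched in Python. This is only an illustration under stated assumptions: the function name `frequency_table` and its parameters are hypothetical, the starting point 3.60 and width 0.2 are the values chosen in the slides, and the counting is done in integer hundredths (my choice, not the slides') so that values lying exactly on a boundary, such as 4.20, land in the correct half-open interval.

```python
from collections import Counter

# Blood sugar values (mmol/L) copied from Example 1-1.
RAW = """
5.17 5.56 4.86 4.87 4.74 5.24 5.51 4.46 4.96 4.82 4.90 5.30
5.22 5.58 4.48 4.80 4.60 4.02 5.16 5.36 4.34 4.24 4.64 4.27
4.25 4.44 4.46 4.62 4.87 4.34 4.90 5.25 4.77 4.85 5.07 4.16
4.66 4.70 4.20 3.95 4.09 4.64 4.33 5.21 4.61 4.98 5.24 4.60
4.25 4.78 5.00 3.60 4.11 4.61 4.08 4.78 4.26 4.44 4.38 4.40
4.79 4.76 4.92 4.60 4.78 5.03 4.35 4.18 4.68 4.65 4.57 4.27
4.99 4.21 4.89 4.71 4.72 4.41 4.38 4.06 4.79 4.96 4.83 4.45
4.51 4.27 4.50 4.31 5.05 5.59 5.08 5.16 3.74 4.36 5.36 4.64
5.09 4.57 4.46 4.56 4.39 5.24 4.61 4.21 4.96 4.34 4.45 4.86
4.50 4.90 4.45 4.49 4.42 4.68 4.56 5.38 4.34 4.46 4.16 4.98
4.29 4.83 4.27 3.68 3.85 3.86 4.56 4.56 4.55 5.16 5.15 5.16
"""


def frequency_table(values, start=3.60, width=0.2):
    """Count observations in half-open sub-intervals [lower, upper).

    Hypothetical helper: values are assumed to have two decimals and to be
    no smaller than `start`.
    """
    # Integer hundredths avoid floating-point trouble at class boundaries.
    start_h = round(start * 100)
    width_h = round(width * 100)
    counts = Counter((round(v * 100) - start_h) // width_h for v in values)
    table = []
    for k in range(max(counts) + 1):
        lower = (start_h + k * width_h) / 100
        upper = (start_h + (k + 1) * width_h) / 100
        table.append((lower, upper, counts.get(k, 0)))
    return table


values = [float(tok) for tok in RAW.split()]
value_range = max(values) - min(values)  # step a: R = Max - Min
table = frequency_table(values)          # steps c to e

print(f"n = {len(values)}, range = {value_range:.2f}")
for lower, upper, freq in table:
    print(f"[{lower:.2f}, {upper:.2f})  {freq:3d}")
```

The cumulative counts this produces (14 below 4.20, 37 below 4.40, 61 below 4.60, 86 below 4.80) agree with the figures quoted later in the slides for the median and percentile examples.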

8 2014.3.3 8

9 (2) Frequency plot: histogram

10 (3) Uses of the frequency table
It shows the profile of the data distribution: Symmetric? Skewed? Central position? Variability? Outliers?
★ A frequency table is a good way to summarize the data: 132 figures are reduced to 22 figures, which are much easier to present.

11 Numerical Description: central position (central tendency)

12 Variation (measure of dispersion)

13 1.2 Measures for Average
(1) Arithmetic Mean
Based on observed data: $\bar{X} = \frac{\sum X}{n}$ (1.1)
Example: Blood sugar: 6.2, 5.4, 5.7, 5.3, 6.1, 6.0, 5.8, 5.9, so $\bar{X} = 46.4 / 8 = 5.8$.

14 Based on frequency table

15 $\bar{X} = \frac{\sum f_k X_k}{n}$ (1.2)
where $f_k$ and $X_k$ are the frequency and mid-value of the k-th sub-interval, and n is the total sample size. Formula (1.2) is a weighted mean; for raw data, use (1.1).
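A minimal Python sketch of the two means follows. It assumes (1.1) is the sum of the values divided by n and (1.2) is the frequency-weighted mean of the sub-interval mid-values. The function names are hypothetical, and the frequency list was tallied from the Example 1-1 raw data rather than copied from Table 1-1, which is not reproduced in this transcript.

```python
def mean_raw(values):
    """Arithmetic mean of raw observations: sum(X) / n."""
    return sum(values) / len(values)


def mean_grouped(midpoints, freqs):
    """Weighted mean from a frequency table: sum(f_k * X_k) / n."""
    n = sum(freqs)
    return sum(f * x for f, x in zip(freqs, midpoints)) / n


# Raw-data example from the slide.
blood_sugar = [6.2, 5.4, 5.7, 5.3, 6.1, 6.0, 5.8, 5.9]
print(f"{mean_raw(blood_sugar):.2f}")

# Assumption: frequencies tallied from the Example 1-1 data for the
# sub-intervals [3.60, 3.80), [3.80, 4.00), ..., [5.40, 5.60).
freqs = [3, 3, 8, 23, 24, 25, 20, 12, 10, 4]
midpoints = [(370 + 20 * k) / 100 for k in range(len(freqs))]  # 3.70, 3.90, ...
print(f"{mean_grouped(midpoints, freqs):.3f}")
```

With these inputs the raw mean is 5.8 and the grouped mean is about 4.653.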

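16 (2) Geometric mean
$G = \lg^{-1}\left(\frac{\sum \lg X}{n}\right)$ (1.3)
where $\lg^{-1}$ is the anti-logarithm of lg, i.e. $\lg^{-1}(x) = 10^{x}$.
Example: titer values: 2, 4, 8, 32, 32, 64, 64, so $G = \lg^{-1}(8.4288 / 7) = \lg^{-1}(1.2041) = 16$.
For titer 100×2 << 1×1024!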
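As a quick check of the titer example, here is a small Python sketch of the geometric mean computed through base-10 logarithms. It assumes (1.3) is the anti-logarithm of the mean of the logs; the function name is hypothetical.

```python
import math


def geometric_mean(values):
    """Anti-log (base 10) of the mean of the base-10 logarithms."""
    logs = [math.log10(v) for v in values]
    return 10 ** (sum(logs) / len(logs))


titers = [2, 4, 8, 32, 32, 64, 64]
g = geometric_mean(titers)
print(f"{g:.2f}")
```

For these titers the geometric mean is 16, whereas the arithmetic mean is 206/7 ≈ 29.4, pulled upward by the larger values.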

17 (3) Median
Rank the observed values from the smallest to the largest; the median is the value in the middle.
Based on raw data
Example 1: latent period (days): 6, 5, 4, 7, 12, 4, 5, 7, 9 (9 values).
Ranked: 4, 4, 5, 5, 6, 7, 7, 9, 12. Median = 6.
Tip: odd number of values.

18 Example 2: latent period (days): 6, 5, 4, 7, 12, 4, 5, 7, 9, 20 (10 values).
Ranked: 4, 4, 5, 5, 6, 7, 7, 9, 12, 20. Median = (6+7)/2 = 6.5.
Tip: even number of values.
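The odd and even cases can be written as one short Python function. This is a sketch with a hypothetical name; Python's standard library offers `statistics.median`, which gives the same results for these inputs.

```python
def median(values):
    """Middle value of the ranked data; mean of the two middle values if n is even."""
    ranked = sorted(values)
    n = len(ranked)
    mid = n // 2
    if n % 2 == 1:
        return ranked[mid]
    return (ranked[mid - 1] + ranked[mid]) / 2


latent_odd = [6, 5, 4, 7, 12, 4, 5, 7, 9]
latent_even = latent_odd + [20]
print(median(latent_odd))   # 9 values
print(median(latent_even))  # 10 values
```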

19 Based on frequency table
Example 1-3: The same data as Example 1-1 (see Table 1-3).

20 Formula:
$M = L + \frac{i}{f_M}\left(\frac{n}{2} - \sum f_L\right)$ (1.4)
where L is the lower limit of the interval that contains M, i is the width of each sub-interval, $f_M$ is the frequency of the interval that contains M, n is the total sample size, and $\sum f_L$ is the cumulative frequency of all sub-intervals below L.
How do we determine which interval the median is in?

21 Method
a. Calculate the cumulative frequency or the cumulative frequency (%); see Table 1-3.
b. Find the first cumulative value that reaches n/2 (or 50%): n/2 = 132/2 = 66, and 86 is the first (in percent: 65.15% is the first to reach 50%).
c. The median is in the sub-interval that this value belongs to: M is in [4.60, 4.80).

22 So: L = 4.60, i = 0.2, $f_M$ = 25, $\sum f_L$ = 61. Substituting into (1.4):
$M = 4.60 + \frac{0.2}{25}\left(\frac{132}{2} - 61\right) = 4.64$
Do you remember how to calculate the mean of the raw data?
★ This M is an approximate value of the mean, and a rather good approximation.
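The substitution can be checked with a few lines of Python. This sketch assumes (1.4) is the usual linear interpolation inside the median interval, M = L + (i / f_M)(n/2 - Σf_L); the function and parameter names are hypothetical, and the numbers are the ones given on the slide.

```python
def grouped_median(lower, width, freq_in, cum_below, n):
    """Median from a frequency table by interpolating inside the median interval.

    lower     -- L, lower limit of the interval containing the median
    width     -- i, width of each sub-interval
    freq_in   -- f_M, frequency of that interval
    cum_below -- cumulative frequency of all sub-intervals below L
    n         -- total sample size
    """
    return lower + width / freq_in * (n / 2 - cum_below)


m = grouped_median(lower=4.60, width=0.2, freq_in=25, cum_below=61, n=132)
print(f"{m:.2f}")
```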

23 Add: Percentile $P_x$
Rank the observed values from the smallest to the largest and divide the range into 100 equal parts; the percentile $P_x$ is the x-th value. $P_{50}$ is the 50th value, i.e. the middle value, so $P_{50}$ = Median.
[Diagram: a line from Min to Max; $P_x$ splits the data into x% below and (100-x)% above, with $P_{50}$ = M.]

24 Formula:
$P_x = L + \frac{i}{f_x}\left(n \cdot x\% - \sum f_L\right)$ (1.5)
where L is the lower limit of the interval that contains $P_x$, i is the width of each sub-interval, $f_x$ is the frequency of the interval that contains $P_x$, n is the total sample size, and $\sum f_L$ is the cumulative frequency of all sub-intervals below L.
How do we determine which interval $P_x$ is in? See an example: calculate $P_{25}$ of Example 1-3.

25 Method
a. Calculate the cumulative frequency or the cumulative frequency (%); see Table 1-3.
b. Find the first cumulative value that reaches n × 25% (or 25%): 132 × 25% = 33, and 37 is the first (in percent: 28.03% is the first to reach 25%).
c. $P_{25}$ is in the sub-interval that this value belongs to: $P_{25}$ is in [4.20, 4.40).

26 So: L = 4.20, i = 0.2, $f_{25}$ = 23, $\sum f_L$ = 14. Substituting into (1.5):
$P_{25} = 4.20 + \frac{0.2}{23}\left(132 \times 25\% - 14\right) \approx 4.37$
also:
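The whole percentile procedure, including the search for the right sub-interval, might look like the following Python sketch. It assumes (1.5) is P_x = L + (i / f_x)(n·x% - Σf_L). The function name is hypothetical, and the frequencies were tallied from the Example 1-1 raw data, since Table 1-3 is not reproduced in this transcript.

```python
from itertools import accumulate


def grouped_percentile(lowers, width, freqs, x):
    """Percentile P_x from a frequency table by linear interpolation.

    lowers -- lower limits of the sub-intervals, in increasing order
    width  -- common width i of the sub-intervals
    freqs  -- frequency of each sub-interval
    x      -- percentile rank, 0 < x <= 100
    """
    n = sum(freqs)
    target = n * x / 100
    cumulative = list(accumulate(freqs))
    for k, cum in enumerate(cumulative):
        if cum >= target:  # first cumulative frequency that reaches n * x%
            cum_below = cumulative[k - 1] if k > 0 else 0
            return lowers[k] + width / freqs[k] * (target - cum_below)
    raise ValueError("x must not exceed 100")


# Assumption: frequencies tallied from the Example 1-1 data.
freqs = [3, 3, 8, 23, 24, 25, 20, 12, 10, 4]
lowers = [(360 + 20 * k) / 100 for k in range(len(freqs))]  # 3.60, 3.80, ...

p25 = grouped_percentile(lowers, 0.2, freqs, 25)
p50 = grouped_percentile(lowers, 0.2, freqs, 50)
print(f"P25 = {p25:.3f}, P50 = {p50:.2f}")
```

Calling the function with x = 50 reproduces the grouped median of 4.64, which is consistent with $P_{50}$ = Median.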

27 Symmetric
[Histogram: RBC (10^12/L) of 130 normal male adults in a place; x-axis: RBC (10^12/L), 3.8 to 5.8; y-axis: Frequency, 0 to 30.]

28 Positive skew
[Histogram: Hair mercury (μg/g) of 238 normal adults; x-axis: Hg (μg/g), 0.5 to 4.1; y-axis: Frequency, 0 to 70.]

29 Summary
1. Mean: suitable for symmetric distributions.
2. Geometric mean (G): suitable for positively skewed distributions.
3. Median (M): suitable for all kinds of data, but with poor properties for further analysis.

30 ★ After learning statistics, remember that there are three averages, not only one! http://en.wikipedia.org/wiki/Temple_of_Heaven

