Lecture 9: Nonparametric Methods

Xiaojin YU, Department of Epidemiology and Biostatistics, School of Public Health, Southeast University

Review: Types of data
Qualitative (categorical) data: (1) binary (dichotomous, binomial); (2) multinomial (polytomous); (3) ordinal.
Quantitative data.
We start with a review of the last class, which introduced the two types of data: qualitative and quantitative.

Measures of central tendency: quantitative data
Mean: for normally distributed data.
Geometric mean: for positively skewed data that become normal after a log transformation.
Median: applicable to any data; in general, it is used for non-normal data.
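The three measures can be compared on a small skewed sample. The sketch below uses Python's standard `statistics` module; the data are hypothetical and chosen only so that the effect of skewness is visible.

```python
import statistics

# Hypothetical, positively skewed sample (not from the lecture).
values = [1, 2, 4, 8, 16]

mean = statistics.mean(values)             # arithmetic mean, pulled up by the large value
gmean = statistics.geometric_mean(values)  # exp of the mean of the logged values
median = statistics.median(values)         # middle value of the sorted data

print(mean, gmean, median)
```

For this sample the geometric mean and the median sit close together, while the arithmetic mean is larger, which is the usual pattern for positively skewed data.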

Measures of dispersion: quantitative data
Range, interquartile range, variance and standard deviation, coefficient of variation.

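Compare means by t-test
Single-sample t-test: conditions x̄, S, n, μ0; H0: μ = μ0; ν = n − 1.
Paired t-test: conditions d̄, Sd, n (number of pairs); H0: μd = 0; ν = n − 1.
Two-group t-test: conditions x̄1, s1, n1 and x̄2, s2, n2; H0: μ1 = μ2; ν = n1 + n2 − 2. Assumptions: normality and equality of variance.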

Comparison of Means between two groups
2018/11/18

Compare proportions by chi-square test
Drug      Effective   Not effective   Total   Sample rate (%)
Drug A    41          4               45      91.1
Drug B    24          11              35      68.6
Total     65          15              80
Are the 2 population proportions equal or not? How are the categorical variables distributed between the 2 populations?

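Solution
H0: πA = πB; H1: πA ≠ πB; α = 0.05
Calculate the expected counts T and the chi-square test statistic. For each cell, T = (row total × column total) / n:
Drug      Positive   Negative   Total
A         T          T          n1
B         T          T          n2
Total     m1         m2         n
Observed counts A, with expected counts T in parentheses:
Drug A    41 (36.56)   4 (8.44)
Drug B    24 (28.44)   11 (6.56)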

Test statistic
χ² = Σ (A − T)² / T, with ν = (2 − 1)(2 − 1) = 1
A (T): 41 (36.56), 4 (8.44), 24 (28.44), 11 (6.56)

Conclusion
Since 6.573 > 3.84, P < 0.05. We reject H0 and accept H1 at the 0.05 level, and conclude that the two populations are not homogeneous with respect to the effect of the drug: the effects of drug A and drug B are not equivalent.
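The calculation can be checked with a short Python sketch. It assumes the uncorrected Pearson statistic (no continuity correction) and uses the identity that, for 1 degree of freedom, P(χ² > x) = erfc(sqrt(x / 2)). Because the expected counts are kept unrounded here, the statistic comes out slightly below the 6.573 quoted above, which is based on expected counts rounded to two decimals.

```python
import math

# Observed counts from the drug table: rows = Drug A, Drug B;
# columns = effective, not effective.
observed = [[41, 4], [24, 11]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under H0: T = (row total * column total) / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(2)
    for j in range(2)
)

# Upper-tail probability of a chi-square variable with 1 degree of freedom.
p_value = math.erfc(math.sqrt(chi2 / 2))

print(round(chi2, 3), round(p_value, 4))
```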

Compare Ordinal data

OUTLINE
Basic logic of rank-based methods
Rank sum test for 2 independent groups (completely randomized design)
Signed rank test for paired design
Rank sum test for 3 or more independent groups (completely randomized design)
Multiple comparison

Rank & Rank Sum
Review of median. Example: duration of hospital stay (table columns: month, rank).

Task for you
How can we tell whether boys or girls are taller when no measuring is allowed?

Solution to the task
[Figure: blue = male, red = female]

Small values sit at the front of the ordering (small ranks), and large values sit at the back (large ranks).

Part I: Wilcoxon Rank Sum Test
Rank sum test for comparing the locations of two populations; equivalent to the Mann-Whitney test. Review: the t-test for comparing 2 population means requires normality and homogeneity of variance.

The 3rd and 4th individuals have the same height, so each receives the mid-rank (3 + 4)/2 = 3.5.
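Mid-ranks are easy to compute by hand, but a small helper makes the rule explicit. The following is a minimal Python sketch; the function name `midranks` and the heights are hypothetical, not part of the lecture.

```python
def midranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    sorted_vals = sorted(values)
    rank_of = {}
    for v in set(values):
        first = sorted_vals.index(v) + 1   # first position occupied by v
        count = sorted_vals.count(v)       # number of values tied at v
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]


# Hypothetical heights in cm; the 3rd and 4th smallest are equal.
heights = [150, 152, 155, 155, 160]
print(midranks(heights))  # [1.0, 2.0, 3.5, 3.5, 5.0]
```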

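EXAMPLE 1: Table 9.1 Survival times (minutes) of cats and rabbits without oxygen
Cats, time (rank): 25 (9.5), 34 (13), 44 (15), ? (16), 46 (17), 48 (18), 49 (19), 50 (20)
Rabbits, time (rank): ? (1), ? (2), ? (3), ? (4), ? (5), 21 (6.5), 21 (6.5), 23 (8), 25 (9.5), 28 (11), 30 (12), 35 (14)
(? marks a time that is missing from this transcript.)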

STEP I: Test hypothesis and significance level
H0: M1 = M2, the population locations of survival time of cats and rabbits are equal.
H1: M1 ≠ M2, the population locations of survival time of cats and rabbits are not equal.
α = 0.05

STEP II: Statistic: assign ranks
Pool the n1 + n2 observations to form a single sample.
Rank all observations of the pooled sample from smallest to largest. Tied values receive mid-ranks.
Ranks of cats (n1 = 8): 9.5, 13, 15, 16, 17, 18, 19, 20; T1 = 127.5
Ranks of rabbits (n2 = 12): 1, 2, 3, 4, 5, 6.5, 6.5, 8, 9.5, 11, 12, 14; T2 = 82.5

STEP II: Statistic: test statistic T
Calculate the rank sums of the two samples, denoted T1 and T2. Take the Ti of the sample with the smaller n as T: n1 = 8 < n2 = 12, so T = T1 = 127.5. Check: T1 + T2 = N(N + 1)/2 = 210.
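The pooling, ranking and summing steps can be sketched in Python as follows. The helper names `midranks` and `rank_sums` and the two small samples are hypothetical; the samples are not the cat and rabbit data, whose raw values are incomplete here.

```python
def midranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    sorted_vals = sorted(values)
    rank_of = {}
    for v in set(values):
        first = sorted_vals.index(v) + 1
        count = sorted_vals.count(v)
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]


def rank_sums(sample1, sample2):
    """Pool both samples, rank them together, and return (T1, T2)."""
    pooled = list(sample1) + list(sample2)
    ranks = midranks(pooled)
    t1 = sum(ranks[:len(sample1)])
    t2 = sum(ranks[len(sample1):])
    return t1, t2


# Hypothetical survival times in minutes.
group1 = [25, 34, 44]
group2 = [12, 21, 21, 25]

t1, t2 = rank_sums(group1, group2)
n_total = len(group1) + len(group2)

print(t1, t2)                                  # 17.5 10.5
print(t1 + t2 == n_total * (n_total + 1) / 2)  # True: the total is fixed
```

The final check mirrors the identity T1 + T2 = N(N + 1)/2 used in the lecture.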

STEP III: Determine P value and conclude
From the table in Appendix E, with n1 = 8 and n2 − n1 = 4, the critical interval Tα is 58 to 110. Since T = 127.5 lies outside this interval, P ≤ α. Given α = 0.05, P < 0.05 and H0 is rejected. We conclude that the survival times of cats and rabbits in an environment without oxygen may differ: cats appear to survive longer without oxygen.

BASIC LOGIC
N = n1 + n2. Given N, the total rank sum is fixed and can be calculated as N(N + 1)/2. If H0 is true, the total rank sum should be split between the 2 groups in proportion to ni.
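For Example 1 this logic can be made concrete. The sketch below, in Python, computes the fixed total and the share each group would be expected to receive under H0, n_i(N + 1)/2, and compares it with the observed rank sums.

```python
n1, n2 = 8, 12                  # cats, rabbits (Example 1)
N = n1 + n2

total = N * (N + 1) / 2         # fixed total rank sum
expected_t1 = n1 * (N + 1) / 2  # share of group 1 if H0 is true
expected_t2 = n2 * (N + 1) / 2  # share of group 2 if H0 is true

observed_t1, observed_t2 = 127.5, 82.5

print(total, expected_t1, expected_t2)  # 210.0 84.0 126.0
print(observed_t1 - expected_t1)        # 43.5
```

The cats' rank sum exceeds its expected share by 43.5, and the test asks whether a gap of that size is plausible by chance alone.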

Normal Approximation
Used when n1 > 10 or n2 − n1 > 10, with a correction for ties.
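The formulas for this slide are not preserved in the transcript. The Python sketch below uses the standard form of the approximation: mean n1(N + 1)/2, variance n1·n2·(N + 1)/12, a continuity correction of 0.5, and a tie correction that divides z by sqrt(1 − Σ(t³ − t)/(N³ − N)). The function name `rank_sum_z` is hypothetical; the inputs are the rank sum, group sizes and tie-group sizes of Example 2 (Table 9-2).

```python
import math


def rank_sum_z(t, n1, n2, tie_sizes=()):
    """z statistic for rank sum t of the group of size n1 (ties optional)."""
    n = n1 + n2
    mean_t = n1 * (n + 1) / 2
    var_t = n1 * n2 * (n + 1) / 12
    z = (abs(t - mean_t) - 0.5) / math.sqrt(var_t)  # continuity-corrected
    correction = 1 - sum(s**3 - s for s in tie_sizes) / (n**3 - n)
    return z / math.sqrt(correction)                # tie-corrected


# Example 2: TA = 7663 with nA = 69, nB = 120; tie groups of 87, 38 and 64.
z = rank_sum_z(7663, 69, 120)
z_c = rank_sum_z(7663, 69, 120, tie_sizes=(87, 38, 64))

# Two-sided P value from the standard normal distribution.
p_value = math.erfc(z_c / math.sqrt(2))

print(round(z, 3), round(z_c, 3), p_value < 0.05)
```

With heavy ties, as in ordinal data, the correction increases z noticeably, so omitting it would make the test conservative.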

EXAMPLE 2: Table 9-2 Results from a clinical trial for hypertension
Effect            Drug A   Drug B   Total   Range of rank   Average rank   Rank sum A   Rank sum B
0 (ineffective)   17       70       87      1-87            44             748          3080
1 (effective)     25       13       38      88-125          106.5          2662.5       1384.5
2 (healed)        27       37       64      126-189         157.5          4252.5       5827.5
Total             69       120      189                                    7663         10292
TA = 7663, nA = 69; TB = 10292, nB = 120
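With grouped ordinal data, every subject in a category shares that category's average rank. A short Python sketch of how the rank ranges, average ranks and rank sums in Table 9-2 follow from the counts alone:

```python
# Counts per ordinal outcome from Table 9-2: (label, Drug A, Drug B).
table = [("ineffective", 17, 70), ("effective", 25, 13), ("healed", 27, 37)]

rank_sum_a = 0.0
rank_sum_b = 0.0
next_rank = 1
for label, a, b in table:
    total = a + b
    first, last = next_rank, next_rank + total - 1
    average_rank = (first + last) / 2   # mid-rank shared by the whole category
    rank_sum_a += a * average_rank
    rank_sum_b += b * average_rank
    print(label, f"{first}-{last}", average_rank)
    next_rank = last + 1

print(rank_sum_a, rank_sum_b)  # 7663.0 10292.0
```

The two rank sums add up to 189 × 190 / 2 = 17955, which is a quick check on the table.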

Part II: Wilcoxon’s Signed Rank Test
Wilcoxon (1945). H0: Md = 0.
Example: to illustrate the test procedure, we use data on 28 patients (14 pairs) from a sequential-analysis, double-blind clinical trial for cancer of the head and neck. (Bakowski MT, et al. Int. J. Radiation Oncology Biology Physics 1978, 4: )

When Wilcoxon’s Signed Rank Test is used
1) Quantitative data from a paired design: the paired t-test requires the differences to be normally distributed; if their distribution is skewed, use the signed rank test instead.
2) Qualitative data from a paired design with an ordinal outcome.

Example 9.3
2 treatment groups: radiotherapy + drug (B); radiotherapy + placebo (A).
The tumor response within three months of completion of treatment was assessed for each patient in terms of complete regression (CR), partial regression (PR), no change (NC) and progression of the disease (P). Responses were scored from 1 to 5 as follows: 5 = CR with no subsequent recurrence up to 6 months or more; 4 = CR initially but with a subsequent recurrence within 6 months; 3 = PR; 2 = NC; 1 = P.

STEP I: Test hypothesis
H0: Md = 0, the population median of the differences is equal to zero.
H1: Md ≠ 0, the population median of the differences is not equal to zero.
α = 0.05

STEP II: Statistic: assigning ranks
1) Calculate the differences di = xi − yi and ignore all pairs with zero difference.
2) Rank the absolute values of the non-zero di from smallest to largest so that each di gets a rank. If there is a tie, what do we do?

Ties: these six patients all have differences of 1, so the rank numbers 1, 2, 3, 4, 5 and 6 must be divided among them. That is, each has a rank of (1 + 2 + 3 + 4 + 5 + 6)/6 = 3.5.
3) Assign the original signs of the di to their ranks.

Test statistic T
Valid number of pairs: n = 10.
Find the sum of the ranks with positive signs, denoted T+, and the sum of the ranks with negative signs, denoted T−. Check: T+ + T− = n(n + 1)/2 = 55. Let T = min(T+, T−), or either one. Here T− = 48 and T+ = 7.
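The steps of the signed rank procedure (drop zero differences, rank the absolute differences with mid-ranks, reattach signs, sum by sign) can be sketched in Python. The function names and the paired scores are hypothetical; the individual scores of Example 9.3 are not reproduced in this transcript.

```python
def midranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    sorted_vals = sorted(values)
    rank_of = {}
    for v in set(values):
        first = sorted_vals.index(v) + 1
        count = sorted_vals.count(v)
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]


def signed_rank(x, y):
    """Return (T+, T-, n) for paired samples x and y."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # drop zero differences
    ranks = midranks([abs(d) for d in diffs])        # rank absolute differences
    t_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    t_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return t_plus, t_minus, len(diffs)


# Hypothetical paired response scores on the 1-5 scale.
treatment_a = [3, 2, 4, 1, 3, 2]
treatment_b = [4, 2, 5, 3, 2, 5]

t_plus, t_minus, n = signed_rank(treatment_a, treatment_b)
print(t_plus, t_minus, n)                 # 2.0 13.0 5
print(t_plus + t_minus == n * (n + 1) / 2)  # True
```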

STEP III: Determine the P value and conclude
For n < 25, find the critical interval Tα in Table 10.3 (P184). In this example n = 10 and T = 48 or 7. Given α = 0.05, the critical interval T0.05 is 8 to 47. T is not inside the interval, so P < 0.05 and H0 is rejected. We conclude that the results differ between the 2 treatments.

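Normal Approximation
When n > 25, Table 10.3 cannot help, so we turn to the normal approximation. It can be proved that if H0 is true and n is large enough, the distribution of the statistic T is close to a normal distribution with mean n(n + 1)/4 and standard deviation sqrt{n(n + 1)(2n + 1)/24}.

Correction of Continuity
z = (|T − n(n + 1)/4| − 0.5) / sqrt{n(n + 1)(2n + 1)/24}
If there are ties, the statistic is
zc = (|T − n(n + 1)/4| − 0.5) / sqrt{n(n + 1)(2n + 1)/24 − Σ(tj³ − tj)/48}
where tj is the number of differences in the j-th tied group.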
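A minimal Python sketch of the normal approximation for the signed rank statistic follows, using the standard mean n(n + 1)/4 and variance n(n + 1)(2n + 1)/24, a continuity correction of 0.5, and the usual tie term Σ(t³ − t)/48. The function name `signed_rank_z` is hypothetical. The numbers n = 10 and T = 7 are borrowed from Example 9.3 purely for illustration, since the approximation is intended for n > 25; the single tie group of 6 is the one described in the ranking step, and any other ties in that data set are unknown here.

```python
import math


def signed_rank_z(t, n, tie_sizes=()):
    """Continuity-corrected z for signed rank statistic t with n non-zero pairs."""
    mean_t = n * (n + 1) / 4
    var_t = n * (n + 1) * (2 * n + 1) / 24
    var_t -= sum(s**3 - s for s in tie_sizes) / 48  # tie correction
    return (abs(t - mean_t) - 0.5) / math.sqrt(var_t)


z = signed_rank_z(7, 10)
z_ties = signed_rank_z(7, 10, tie_sizes=(6,))

print(round(z, 3), round(z_ties, 3))
```

By symmetry, passing T = 48 instead of T = 7 gives the same z, because both lie 20.5 away from the mean of 27.5.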

Part III: Kruskal-Wallis Test
Similar in purpose to one-way ANOVA or the chi-square test. Used to compare the locations of more than 2 populations.

Example 9.4
Allocate 24 people randomly to 1 of 3 groups: no exercise; 20 minutes of jogging per day; or 60 minutes of jogging per day. At the end of a month, ask each participant to rate how depressed they now feel, on a Likert scale that runs from 1 ("totally miserable") through to 100 ("ecstatically happy"). Question: does physical exercise alleviate depression?

Report on depression from 3 groups and ranks
[Table: ratings and ranks for the 3 groups, with rank sums Ri]

Test Hypothesis
H0: M1 = M2 = … = Mk, the 3 populations have the same population location.
H1: M1, M2, …, Mk are not all equal: at least one of the populations has a median different from the others.
α = 0.05

Test statistic H
Let N = n1 + n2 + n3.
Ri is the sum of the ranks of the ith sample (here 76.5, 79.5, 144).
The overall average rank is (N + 1)/2.
The average rank of the ith sample is Ri/ni.
H = 12/{N(N + 1)} × Σ ni (Ri/ni − (N + 1)/2)² = 12/{N(N + 1)} × Σ Ri²/ni − 3(N + 1)
The factor 12/{N(N + 1)} standardizes the test statistic in terms of the overall sample size N.

Solution to Example
R1 = 76.5, n1 = 8; R2 = 79.5, n2 = 8; R3 = 144, n3 = 8.
H = 12/(24 × 25) × (76.5² + 79.5² + 144²)/8 − 3 × 25 = 7.27
There are k − 1 = 2 degrees of freedom in this example.
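The value of H can be reproduced from the rank sums alone. The Python sketch below uses the computational form H = 12/{N(N + 1)} × Σ Ri²/ni − 3(N + 1); the function name `kruskal_wallis_h` is hypothetical.

```python
def kruskal_wallis_h(rank_sums, sizes):
    """Kruskal-Wallis H (without tie correction) from rank sums and group sizes."""
    n = sum(sizes)
    weighted = sum(r**2 / m for r, m in zip(rank_sums, sizes))
    return 12 / (n * (n + 1)) * weighted - 3 * (n + 1)


rank_sums = [76.5, 79.5, 144]
sizes = [8, 8, 8]

h = kruskal_wallis_h(rank_sums, sizes)
degrees_of_freedom = len(sizes) - 1

print(round(h, 2), degrees_of_freedom)  # 7.27 2
```

As a sanity check on the inputs, the rank sums add up to 24 × 25 / 2 = 300.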

Adjusted formula for ties
Hc = H / {1 − Σ(tj³ − tj)/(N³ − N)}, where tj is the number of individuals within the j-th tied subgroup.
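The adjustment divides H by a factor slightly below 1, so ties always increase the statistic a little. A minimal Python sketch, assuming the standard correction factor 1 − Σ(t³ − t)/(N³ − N); the function name and the tie-group sizes are hypothetical, since the tie pattern of Example 9.4 is not given in this transcript.

```python
def tie_corrected_h(h, tie_sizes, n):
    """Divide H by the tie correction factor 1 - sum(t^3 - t) / (N^3 - N)."""
    correction = 1 - sum(t**3 - t for t in tie_sizes) / (n**3 - n)
    return h / correction


# Hypothetical tie pattern: two pairs and one triple among N = 24 ratings.
h_c = tie_corrected_h(7.27, [2, 2, 3], 24)

print(round(h_c, 3))
```

With only a few small tied groups the change is minor; it matters more for coarse ordinal scales where large tied groups are common.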

CRITICAL VALUE
Table 11: H critical values; χ² critical values.
When n is big enough, H approximately follows a χ² distribution with ν = k − 1 degrees of freedom.

Conclusion
With k = 3 (ν = 2), the critical value is 5.99.
Since 7.27 > 5.99, P < 0.05 and we reject H0. That is, there is evidence that at least one of the groups differs from the others.
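For 2 degrees of freedom the χ² distribution has a closed form, P(χ² > x) = exp(−x / 2), so both the critical value and the P value can be checked without tables. A short Python sketch, with hypothetical function names; it applies only to ν = 2.

```python
import math


def chi2_upper_tail_df2(x):
    """P(chi-square > x) for exactly 2 degrees of freedom."""
    return math.exp(-x / 2)


def chi2_critical_df2(alpha):
    """Value exceeded with probability alpha, for 2 degrees of freedom."""
    return -2 * math.log(alpha)


critical = chi2_critical_df2(0.05)
p_value = chi2_upper_tail_df2(7.27)

print(round(critical, 2), round(p_value, 4))
```

The critical value agrees with the tabulated 5.99, and the P value for H = 7.27 falls below 0.05, consistent with rejecting H0.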

NONPARAMETRIC test
Nonparametric: tests that are not focused on hypotheses about the parameters of the population.
Distribution-free: tests that make no assumptions about the distribution of the data, and are suitable for small samples or for large samples where parametric assumptions are violated.
– Use ranks of the data values rather than the actual data values themselves
– Loss of power when a parametric test is appropriate

Parametric and non-parametric equivalents

NONPARAMETRIC test
Advantages: applicable to more types of data
– Numerical data with an unknown or skewed distribution
– Ordinal variables, or measurement data that are given as ranks only
Disadvantage: a waste of data; loss of power when a parametric test is appropriate

Learning Objectives
1. Understand when nonparametric statistical methods are appropriate.
2. Know how to perform the Wilcoxon Signed-Rank Test and when it should be used.
3. Know how to perform the Wilcoxon Rank-Sum Test and when it should be used.

THANK YOU FOR YOUR ATTENTION!
