Université d’Ottawa / University of Ottawa 2001 Bio 4118 Applied Biostatistics
L4.1 Lecture 4: Fitting distributions: goodness of fit
• Goodness of fit
• Testing goodness of fit
• Testing normality
• An important note on testing normality!
L4.2 Goodness of fit
• Measures the extent to which some empirical distribution “fits” the distribution expected under the null hypothesis.
[Figure: observed and expected frequency distributions of fork length]
L4.3 Goodness of fit: the underlying principle
• If the match between observed and expected is poorer than would be expected on the basis of measurement precision, then we should reject the null hypothesis.
[Figure: observed and expected frequency of fork length, for a case where H₀ is rejected and a case where H₀ is accepted]
L4.4 Testing goodness of fit: the Chi-square statistic (χ²)
• Used for frequency data, i.e. the number of observations/results in each of n categories compared to the number expected under the null hypothesis.
• $\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$, where $O_i$ and $E_i$ are the observed and expected frequencies in category i.
[Figure: observed and expected frequency by category/class]
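As a minimal sketch of the calculation, the statistic can be computed directly from the observed and expected frequencies. The counts below are hypothetical (they are not from the lecture), and the equal-frequency null hypothesis is chosen only for illustration.

```python
def pearson_chi_square(observed, expected):
    """Return the sum over categories of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts in 6 classes (illustration only).
observed = [8, 15, 22, 25, 18, 12]

# Assumed null hypothesis for this sketch: all classes equally frequent.
total = sum(observed)
expected = [total / len(observed)] * len(observed)

x2 = pearson_chi_square(observed, expected)
print(f"chi-square = {x2:.2f} with {len(observed) - 1} degrees of freedom")
```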
L4.5 How to translate χ² into p?
• Compare to the χ² distribution with n - 1 degrees of freedom.
• If p is less than the desired α level, reject the null hypothesis.
[Figure: probability density of the χ² distribution (df = 5); χ² = 8.5 gives p = 0.13 (accept H₀); the critical value is marked where p = α = 0.05]
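The lookup can be sketched without a statistics package, because the upper-tail probability of the χ² distribution has a closed-form series when the degrees of freedom are a positive integer. The function name `chi2_sf` is an illustrative choice, not part of the lecture. For the figure's example (χ² = 8.5, df = 5) it gives p of about 0.13.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable.

    Uses the closed-form series that holds for positive integer df.
    """
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df = 2k: exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df = 2k + 1:
    # erfc(sqrt(x/2)) + exp(-x/2) * sum_{i=1}^{k} (x/2)^(i - 1/2) / Gamma(i + 1/2)
    total = 0.0
    for i in range(1, (df - 1) // 2 + 1):
        total += half ** (i - 0.5) / math.gamma(i + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


alpha = 0.05
p = chi2_sf(8.5, 5)
print(f"p = {p:.3f}:", "reject H0" if p < alpha else "accept H0")
```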
L4.6 Testing goodness of fit: the log likelihood-ratio Chi-square statistic (G)
• $G = 2 \sum_{i=1}^{n} O_i \ln\left(\frac{O_i}{E_i}\right)$, with $O_i$ and $E_i$ the observed and expected frequencies in category i.
• Similar to χ², and usually gives similar results.
• In some cases, G is more conservative (i.e. will give higher p values).
[Figure: observed and expected frequency by category/class]
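A short sketch comparing the two statistics on the same data, using the standard definition G = 2 Σ O ln(O/E). The counts are hypothetical and the equal-frequency null hypothesis is assumed only for illustration.

```python
import math


def g_statistic(observed, expected):
    """Return G = 2 * sum(O * ln(O / E)); classes with O = 0 contribute 0."""
    return 2.0 * sum(
        o * math.log(o / e) for o, e in zip(observed, expected) if o > 0
    )


def pearson_chi_square(observed, expected):
    """Return the sum over categories of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts in 6 classes (illustration only).
observed = [8, 15, 22, 25, 18, 12]
expected = [sum(observed) / len(observed)] * len(observed)

g = g_statistic(observed, expected)
x2 = pearson_chi_square(observed, expected)
print(f"G = {g:.2f}, chi-square = {x2:.2f}")
```

Both values are then compared to the same χ² distribution, so for a given data set the statistic with the smaller value yields the higher p value.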
L4.7 χ² versus the distribution of X² or G
• For both X² and G (the statistics computed from the sample), p values are calculated assuming a χ² distribution...
• ...but as n decreases, both deviate more and more from χ².
[Figure: probability density of χ² (df = 5) compared with the distributions of X²/G for small n and for very small n]
L4.8 Assumptions (χ² and G)
• n is larger than 30.
• Expected frequencies are all larger than 5.
• Test is quite robust except when there are only 2 categories (df = 1).
• For 2 categories, both X² and G overestimate χ², leading to rejection of the null hypothesis with probability greater than α, i.e. the test is liberal.
L4.9 What if n is too small, there are only 2 categories, etc.?
• Collect more data, thereby increasing n.
• If there are more than 2 categories, combine categories.
• Use a correction factor.
• Use another test.
[Figure: frequency histograms illustrating more data and combined classes]
L4.10 Corrections for 2 categories
• For 2 categories, both X² and G overestimate χ², leading to rejection of the null hypothesis with probability greater than α, i.e. the test is liberal.
• Continuity correction: move each observed frequency 0.5 closer to its expected frequency (i.e. reduce each |O - E| by 0.5) before computing the statistic.
• Williams’ correction: divide the test statistic (G or X²) by $q = 1 + \frac{k^2 - 1}{6N(k - 1)}$, where k is the number of categories and N is the total number of observations.
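Both corrections can be sketched for a two-category case. The counts are hypothetical, and the Williams factor uses the commonly cited form q = 1 + (k² - 1) / (6N(k - 1)), which is an assumption here because the slide's own formula did not survive extraction. The continuity correction is implemented as a reduction of each |O - E| by 0.5.

```python
import math


def continuity_corrected_x2(observed, expected):
    """Chi-square with each |O - E| reduced by 0.5 (never below zero)."""
    return sum(
        max(abs(o - e) - 0.5, 0.0) ** 2 / e for o, e in zip(observed, expected)
    )


def williams_q(n_total, k):
    """Assumed Williams factor: 1 + (k^2 - 1) / (6 * N * (k - 1))."""
    return 1.0 + (k * k - 1) / (6.0 * n_total * (k - 1))


def g_statistic(observed, expected):
    """Return G = 2 * sum(O * ln(O / E)); classes with O = 0 contribute 0."""
    return 2.0 * sum(
        o * math.log(o / e) for o, e in zip(observed, expected) if o > 0
    )


def p_value_df1(statistic):
    """Upper-tail chi-square probability for 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical two-category data tested against a 1:1 expectation.
observed = [14, 6]
expected = [10.0, 10.0]

x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
x2_cc = continuity_corrected_x2(observed, expected)
g = g_statistic(observed, expected)
g_adj = g / williams_q(sum(observed), len(observed))

for name, value in [("X2", x2), ("X2 corrected", x2_cc), ("G", g), ("G adjusted", g_adj)]:
    print(f"{name}: {value:.3f}, p = {p_value_df1(value):.3f}")
```

Each correction lowers the statistic and therefore raises p, which counteracts the liberal behaviour described above.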
L4.11 The binomial test
• Used when there are 2 categories.
• No assumptions.
• Calculate the exact probability of obtaining N - k individuals in category 1 and k individuals in category 2, with k = 0, 1, 2, ..., N.
[Figure: binomial distribution, p = 0.5, N = 10; probability versus number of observations]
L4.12 An example: sex ratio of beavers
• H₀: sex ratio is 1:1, so p = 0.5 = q
• p(0 of one sex, 10 of the other) = .00195
• p(1 of one sex, 9 of the other) = .0195
• p(9 or more individuals of same sex) = .0215, or 2.15%.
• Therefore, reject H₀.
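The probabilities in this example can be reproduced from the binomial formula. The sketch assumes N = 10 beavers, inferred from the "1 and 9" split, and sums both tails because the test concerns either sex being in excess.

```python
from math import comb


def binomial_pmf(k, n, p=0.5):
    """Probability of exactly k individuals in one category out of n."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n = 10  # assumed sample size, inferred from the 1-and-9 split

# Either sex may be the rare one, so each term counts both tails.
p_0_or_10 = binomial_pmf(0, n) + binomial_pmf(10, n)
p_1_or_9 = binomial_pmf(1, n) + binomial_pmf(9, n)
p_9_or_more_same_sex = p_0_or_10 + p_1_or_9

print(f"p(0 and 10)      = {p_0_or_10:.5f}")
print(f"p(1 and 9)       = {p_1_or_9:.4f}")
print(f"p(9+ of one sex) = {p_9_or_more_same_sex:.4f}")
```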
L4.13 Multinomial test
• Simple extension of the binomial test for more than 2 categories.
• For 3 categories, must specify 2 probabilities, p and q, for the null hypothesis; the third, r, follows from p + q + r = 1.0.
• No assumptions...
• ...but so tedious that in practice χ² is used.
L4.14 Multinomial test: segregation ratios
• Hypothesis: both parents Aa, therefore segregation ratio is 1 AA : 2 Aa : 1 aa.
• So under H₀, p = .25, q = .50, r = .25
• For N = 60, p < .001
• Therefore, reject H₀.
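An exact multinomial test for three categories can be sketched by enumerating every possible outcome, which also shows why the calculation is tedious by hand. The observed genotype counts below are hypothetical, since the slide does not list them. The p value is taken as the total probability of all outcomes that are no more probable than the observed one, which is one common convention and is assumed here.

```python
from math import comb


def multinomial_pmf(counts, probs):
    """Probability of one particular set of category counts."""
    remaining = sum(counts)
    coefficient = 1
    for c in counts:
        coefficient *= comb(remaining, c)
        remaining -= c
    probability = float(coefficient)
    for c, p in zip(counts, probs):
        probability *= p ** c
    return probability


def exact_multinomial_p(observed, probs):
    """Sum the probabilities of all 3-category outcomes that are
    no more probable than the observed outcome."""
    n = sum(observed)
    p_observed = multinomial_pmf(observed, probs)
    total = 0.0
    for a in range(n + 1):
        for b in range(n - a + 1):
            p = multinomial_pmf((a, b, n - a - b), probs)
            if p <= p_observed * (1 + 1e-9):  # tolerance for float ties
                total += p
    return total


h0 = (0.25, 0.50, 0.25)    # 1 AA : 2 Aa : 1 aa
observed = (25, 20, 15)    # hypothetical counts, N = 60
p_value = exact_multinomial_p(observed, h0)
print(f"exact p = {p_value:.4f}")
```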
L4.15 Goodness of fit: testing normality
• Since normality is an assumption of many parametric statistical tests, testing for normality is often required.
• Tests for normality include χ² or G, Kolmogorov-Smirnov, Shapiro-Wilk and Lilliefors.
[Figure: observed frequency by category/class and the frequency expected under the hypothesis of a normal distribution]
L4.16 Cumulative distributions
• Areas under the normal probability density function and the cumulative normal distribution function.
[Figure: normal probability density function and cumulative normal distribution function (F), with areas of 2.28%, 50.00% and 68.27% marked]
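The percentages in the figure can be checked against the cumulative normal distribution function. This sketch assumes that 2.28% is the area below 2 standard deviations under the mean, 50.00% the area below the mean, and 68.27% the area within 1 standard deviation of the mean.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


below_minus_2sd = normal_cdf(-2.0)
below_mean = normal_cdf(0.0)
within_1sd = normal_cdf(1.0) - normal_cdf(-1.0)

print(f"below mean - 2 SD : {100 * below_minus_2sd:.2f}%")
print(f"below mean        : {100 * below_mean:.2f}%")
print(f"within mean ± 1 SD: {100 * within_1sd:.2f}%")
```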
L4.17 χ² or G test for normality
• Put data in classes (histogram) and compute expected frequencies from a normal distribution divided into the same classes. Calculate χ².
• Requires large samples (k_min = 10) and is not powerful because of loss of information.
[Figure: observed frequency by category/class and the frequency expected under the hypothesis of a normal distribution]
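The expected frequencies can be sketched as the sample size multiplied by the normal probability of each class. The class limits, counts, mean and standard deviation below are all hypothetical. The degrees of freedom shown follow the usual adjustment of subtracting one further degree for each parameter estimated from the sample, which the slide does not state and is an assumption here.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative normal distribution function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def expected_normal_frequencies(edges, n, mean, sd):
    """Expected count per class; the two outer classes absorb the tails."""
    cdf = [normal_cdf(edge, mean, sd) for edge in edges]
    cdf[0], cdf[-1] = 0.0, 1.0
    return [n * (upper - lower) for lower, upper in zip(cdf, cdf[1:])]


# Hypothetical histogram: 6 classes of width 5 and their observed counts.
edges = [40, 45, 50, 55, 60, 65, 70]
observed = [4, 14, 30, 32, 15, 5]
mean, sd = 55.0, 6.5  # hypothetical sample estimates

expected = expected_normal_frequencies(edges, sum(observed), mean, sd)
x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Assumed adjustment: classes - 1, minus 2 estimated parameters (mean, sd).
df = len(observed) - 1 - 2
print("expected:", [round(e, 1) for e in expected])
print(f"chi-square = {x2:.2f}, df = {df}")
```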
L4.18 “Non-statistical” assessments of normality
• Do a normal probability plot of normal equivalent deviates (NEDs) versus X.
• If the line appears more or less straight, then the data are approximately normally distributed.
[Figure: normal probability plots of NEDs versus X for normal and non-normal data]
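The points for such a plot can be sketched as follows. The plotting position (i - 0.5) / n is one common convention and is an assumption here, as the slide does not say which one is used. The data values are hypothetical.

```python
from statistics import NormalDist


def normal_equivalent_deviates(n):
    """NED for the i-th smallest of n values, using the assumed
    plotting position (i - 0.5) / n."""
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


# Hypothetical measurements, sorted so each X pairs with its NED.
x = sorted([5.1, 4.8, 5.6, 5.0, 5.3, 4.5, 5.9, 5.2])
neds = normal_equivalent_deviates(len(x))

for value, ned in zip(x, neds):
    print(f"X = {value:.1f}  NED = {ned:+.3f}")
```

Plotting the NED column against X and judging the straightness of the points by eye gives the assessment described above.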
L4.19 Kolmogorov-Smirnov goodness of fit
• Compares the observed cumulative distribution to the cumulative distribution expected under the null hypothesis.
• p is based on D_max, the maximum absolute difference between observed and expected cumulative relative frequencies.
[Figure: observed and expected cumulative frequency versus X, with D_max marked]
L4.20 An example: wing length in flies
• 10 flies with wing lengths: 4, 4.5, 4.9, 5.0, 5.1, 5.3, 5.5, 5.6, 5.7, 5.8, 5.9, 6.0
• cumulative relative frequencies: .1, .2, .3, .4, .5, .6, .7, .8, .9, 1.0
[Figure: cumulative frequency versus wing length, with D_max marked]
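A sketch of the D_max calculation for these data follows. Two assumptions are made: the values are used exactly as listed (twelve values, so the observed cumulative frequencies step by 1/12 rather than by .1), and the expected distribution is a normal with the sample mean and standard deviation, since the slide does not state the null distribution. The gap is checked on both sides of each step of the observed cumulative distribution. No p value is computed.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative normal distribution function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def ks_d_max(sample, cdf):
    """Largest absolute gap between the observed and expected
    cumulative relative frequencies."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Compare just after the step (i / n) and just before it ((i - 1) / n).
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# Values as listed on the slide.
wing_lengths = [4, 4.5, 4.9, 5.0, 5.1, 5.3, 5.5, 5.6, 5.7, 5.8, 5.9, 6.0]

n = len(wing_lengths)
mean = sum(wing_lengths) / n
sd = math.sqrt(sum((x - mean) ** 2 for x in wing_lengths) / (n - 1))

d_max = ks_d_max(wing_lengths, lambda x: normal_cdf(x, mean, sd))
print(f"mean = {mean:.3f}, sd = {sd:.3f}, D_max = {d_max:.3f}")
```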
L4.21 Lilliefors test
• The KS test is conservative for tests in which the expected distribution is based on sample statistics.
• Lilliefors corrects for this to produce a more reliable test.
• Should be used when the null hypothesis is intrinsic rather than extrinsic.
L4.22 An important note on testing normality!
• When N is small, most tests have low power.
• Hence, very large deviations are required in order to reject the null.
• When N is large, power is high.
• Hence, very small deviations from normality will be sufficient to reject the null.
• So, exercise common sense!