Hangzhou Normal University (杭州师范大学), Lin Longhui (林隆慧): Probability distributions

Outline:
The binomial distribution
The negative binomial distribution
The Poisson distribution and randomness
The normal distribution
Sampling distributions

Discrete random variables
The pattern of behavior of a discrete random variable is described by a mathematical function called a density function or probability distribution. Let X be a discrete random variable. The probability density function (pdf) f for X is
f(x) = P(X = x), where x is any real number.
Note that:
f is defined for all real numbers;
f(x) ≥ 0, since it is a probability;
f(x) = 0 for most real numbers, because X is discrete and cannot assume most real values;
summing f over all possible values of X produces 1, i.e., ∑ f(x) = 1, where the sum is over all x.

Example: A fair 6-sided die is rolled, with the discrete random variable X representing the number obtained per roll. Give the density function for this variable.
Random variable x:  1    2    3    4    5    6
Density f(x):       1/6  1/6  1/6  1/6  1/6  1/6
For example, f(1) = 1/6, f(3) = 1/6 and f(6) = 1/6, whereas f(0.1) = f(3.5) = f(-2) = 0.

Example: A fair 6-sided die is rolled twice, with the discrete random variable X representing the sum of the numbers obtained on both rolls. Give the density function for this variable.
x:             2     3     4     5     6     7     8     9     10    11    12
Density f(x):  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
f(10) = P(X = 10) = 3/36 = 1/12
P(X = 7 or 11) = 6/36 + 2/36 = 2/9

Suppose you rolled a die 1000 times and recorded the outcomes. What would be the long-run average value? What about 5000 times? The long-run expected value, or mean, of a discrete random variable with density function f(x) is
μ = E(X) = ∑ x f(x), summed over all x.
For a single die (f(x) = 1/6 for x = 1, 2, ..., 6):
μ = E(X) = 1*1/6 + 2*1/6 + 3*1/6 + 4*1/6 + 5*1/6 + 6*1/6 = 3.5

For the sum of two dice (x = 2, 3, ..., 12, with f(x) = 1/36, 2/36, ..., 6/36, ..., 2/36, 1/36):
μ = E(X) = 2*1/36 + 3*2/36 + … + 12*1/36 = 7
Intuitively, we would expect the mean for 2 dice to be twice the mean for a single die.

Let X be a discrete random variable with density f. The cumulative distribution function (CDF) for X is denoted by F and is defined by F(x) = P(X ≤ x) for all real x.
x:             2     3     4     5      6      7      8      9      10     11     12
Density f(x):  1/36  2/36  3/36  4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36
CDF F(x):      1/36  3/36  6/36  10/36  15/36  21/36  26/36  30/36  33/36  35/36  36/36
Q: What is the probability of an 8 or less on a roll of a pair of dice?
A: P(X ≤ 8) = F(8) = 26/36
Q: What is the probability of a total strictly between 4 and 10 on a pair of dice?
A: P(4 < X < 10) = F(9) − F(4) = 30/36 − 6/36 = 2/3
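The density, mean and CDF of the two-dice sum can be checked by direct enumeration. The following is a minimal sketch, not part of the original slides; the names density, cdf and mean are illustrative.

```python
from fractions import Fraction
from itertools import product

# Density of the sum of two fair dice: each of the 36 ordered outcomes
# contributes probability 1/36 to its total.
density = {}
for a, b in product(range(1, 7), repeat=2):
    density[a + b] = density.get(a + b, 0) + Fraction(1, 36)

def cdf(x):
    """F(x) = P(X <= x): add up f(k) over all possible values k <= x."""
    return sum((p for k, p in density.items() if k <= x), Fraction(0))

# E(X) = sum of x * f(x) over all possible values x
mean = sum(k * p for k, p in density.items())

print(mean)             # expected value of the sum
print(cdf(8))           # P(X <= 8)
print(cdf(9) - cdf(4))  # P(4 < X < 10)
```

Exact fractions are used so that the results can be compared directly with the table entries (Fraction reduces 26/36 to 13/18).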

The binomial distribution
When nominal-scale data come from a population with only two categories, the proportion of the population belonging to one category is denoted p and the proportion belonging to the other is 1 − p, written q = 1 − p.
e.g., members of a snake litter may be classified as male or female
e.g., progeny of a Drosophila cross may be classified as white-eyed or red-eyed

Expansion of the binomial (p + q)^n
n    (p + q)^n
1    p + q
2    p^2 + 2pq + q^2
3    p^3 + 3p^2q + 3pq^2 + q^3
4    p^4 + 4p^3q + 6p^2q^2 + 4pq^3 + q^4
5    p^5 + 5p^4q + 10p^3q^2 + 10p^2q^3 + 5pq^4 + q^5
6    p^6 + 6p^5q + 15p^4q^2 + 20p^3q^3 + 15p^2q^4 + 6pq^5 + q^6
Binomial probabilities: P(X) = n!/[X!(n − X)!] p^X q^(n − X)

Four assumptions for the binomial distribution
1. A fixed number n of trials is carried out.
2. The outcome of each trial can be classified in precisely one of two mutually exclusive ways, termed “success” and “failure”. The term “binomial” literally means two names.
3. The probability of a success, denoted by p, remains constant from trial to trial. The probability of a failure is 1 − p.
4. The trials are independent; that is, the outcome of any particular trial is not affected by the outcome of any other trial.

Theorem: Let X be a binomial random variable with parameters n and p. Then
μ = E(X) = np and σ² = Var X = np(1 − p)
Even though binomial distributions are not symmetric if p ≠ 0.5, as n becomes larger they do tend to approach symmetry about their mean.
[Figure: binomial probability density functions (pdf)]

Example: A particular strain of inbred mice has a form of muscular dystrophy that has a clear genetic basis. In this strain the probability of appearance of muscular dystrophy in any one mouse born of specified parents is ¼. If 20 offspring are raised from these parents, find the following probabilities:
Fewer than 5 will have muscular dystrophy;
Five will have muscular dystrophy;
Fewer than 8 and more than 2 will have muscular dystrophy.
Solution. n = 20, p = 1/4
P(X < 5) = F(4) = 0.4148
P(X = 5) = f(5) = F(5) − F(4) = 0.6172 − 0.4148 = 0.2024
P(2 < X < 8) = F(7) − F(2) = 0.8982 − 0.0913 = 0.8069

Example: Suppose it is known that the probability of recovery from infection with the Ebola virus is 0.10. If 16 unrelated people are infected, find the following:
The expected number who will recover;
The probability that 5 or fewer will recover;
The probability that at least 5 will recover;
The probability that exactly 5 will recover.
Solution. n = 16 and p = 0.10
E(X) = μ = np = 16*0.10 = 1.6
P(X ≤ 5) = F(5) = 0.9967
P(X ≥ 5) = 1 − F(4) = 1 − 0.9830 = 0.0170
P(X = 5) = f(5) = F(5) − F(4) = 0.9967 − 0.9830 = 0.0137
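The table look-ups in the two binomial examples can be reproduced from the formula P(X) = n!/[X!(n − X)!] p^X q^(n − X). This is a minimal sketch using only the Python standard library; the helper names binom_pmf and binom_cdf are illustrative and do not come from the slides.

```python
from math import comb

def binom_pmf(x, n, p):
    """f(x) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binom_cdf(x, n, p):
    """F(x) = P(X <= x), the sum of f(k) for k = 0, ..., x."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

# Mouse example: n = 20, p = 1/4
print(round(binom_cdf(4, 20, 0.25), 4))                            # P(X < 5)
print(round(binom_pmf(5, 20, 0.25), 4))                            # P(X = 5)
print(round(binom_cdf(7, 20, 0.25) - binom_cdf(2, 20, 0.25), 4))   # P(2 < X < 8)

# Ebola example: n = 16, p = 0.10
print(16 * 0.10)                                                   # E(X) = np
print(round(binom_cdf(5, 16, 0.10), 4))                            # P(X <= 5)
print(round(1 - binom_cdf(4, 16, 0.10), 4))                        # P(X >= 5)
```

Values computed this way may differ from table-based answers in the last decimal place, because the tables round each cumulative probability before subtracting.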

The Poisson distribution
The Poisson distribution is important in describing random occurrences, which are either objects in space or events in time.
A random distribution of objects in space is one in which each portion of the space has the same probability of containing an object, and the occurrence of an object in any portion of the space in no way influences the occurrence of any other object in any portion of the space (e.g., the distribution of bacteria in a liquid medium).
A random distribution of events in time is one in which each time period has an equal chance of witnessing an event, and the occurrence of any event is independent of the occurrence of any other event (e.g., the firing of certain fibers).

The Poisson assumptions
1. Occurrences are random.
2. The occurrence of one event must be independent of the others.
3. The mean number of occurrences is small relative to the maximum possible.

Poisson probabilities
P(X) = μ^X / (e^μ X!), where e is the base of the natural logarithm, 2.71828…
The expected value is E(X) = μ.
The variance is Var X = μ.
Unlike the binomial distribution, which depends on two parameters n and p, the Poisson distribution depends on the single parameter μ.

Example: A radioactive source emits decay particles at an average rate of 4 particles per second. Find the probability that 2 particles are emitted in a 1-second interval.
μ = 4 particles per second
P(X = 2) = f(2) = μ^X / (e^μ X!) = 4^2 / (e^4 2!) = 0.1465
Alternatively, using Table C.2: P(X = 2) = F(2) − F(1) = 0.2381 − 0.0916 = 0.1465

Example: An ichthyologist studying the spoonhead sculpin, Cottus ricei, catches specimens in a large bag seine that she trolls through the lake. She knows from many years of experience that on average she will catch 2 fish per trolling run. Find the probabilities of catching:
No fish on a particular run;
Fewer than 6 fish on a particular run;
Between 3 and 6 fish on a particular run.
Solution. μ = 2 fish/run
P(X = 0) = f(0) = F(0) = 0.1353
P(X < 6) = F(5) = 0.9834
P(3 < X < 6) = F(5) − F(3) = 0.9834 − 0.8571 = 0.1263
Alternatively, P(3 < X < 6) = f(4) + f(5) = 0.0902 + 0.0361 = 0.1263

Example: A group of forest ecology students survey several 10 m × 10 m plots in a subtropical rainforest. They find a mean of 30 trees per plot. Under the assumption that the trees are randomly distributed, what is the probability of finding no more than 3 trees in a 1-m² plot? At least 2 trees?
Solution. μ = 30/100 = 0.3 tree/m²
Table C.2 does not list μ = 0.3, so we must do the calculations by hand.
P(X ≤ 3) = f(0) + f(1) + f(2) + f(3) = 0.7408 + 0.2222 + 0.0333 + 0.0033 = 0.9997
P(X ≥ 2) = 1 − [f(0) + f(1)] = 1 − 0.9631 = 0.0369
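When a table does not list the required μ, the Poisson probabilities can be computed directly from P(X) = μ^X / (e^μ X!). A minimal sketch follows; the function names poisson_pmf and poisson_cdf are illustrative, not from the slides.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """f(x) = mu^x / (e^mu * x!)."""
    return mu**x * exp(-mu) / factorial(x)

def poisson_cdf(x, mu):
    """F(x) = P(X <= x), the sum of f(k) for k = 0, ..., x."""
    return sum(poisson_pmf(k, mu) for k in range(x + 1))

# Radioactive source, mu = 4: P(X = 2)
print(round(poisson_pmf(2, 4), 4))

# Sculpin, mu = 2: P(X = 0), P(X < 6), P(3 < X < 6)
print(round(poisson_pmf(0, 2), 4))
print(round(poisson_cdf(5, 2), 4))
print(round(poisson_cdf(5, 2) - poisson_cdf(3, 2), 4))

# Trees, mu = 0.3: P(X <= 3) and P(X >= 2)
print(round(poisson_cdf(3, 0.3), 4))
print(round(1 - poisson_cdf(1, 0.3), 4))
```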

Poisson approximation to the binomial distribution
Under certain circumstances the Poisson distribution is a very good approximation to the binomial distribution. Generally, for a good approximation, the binomial parameter n must be large (at least 20) and p must be small (no greater than 0.05). The approximation is very good if n ≥ 100 and np ≤ 10. The Poisson mean is then μ = np.

Example: A certain birth defect occurs with probability p = 0.0001. Assume that n = 5000 babies are born at a particular large, urban hospital in a given year. What is the approximate probability that at least 1 baby is born with the defect? What is the probability that no more than 2 babies are born with the defect?
Solution. Table C.1 does not help here (it covers n = 5 to 20). Because n is very large and np ≤ 10, use the Poisson approximation with μ = np = 0.5. From Table C.2:
P(X ≥ 1) = 1 − F(0) = 1 − 0.6065 = 0.3935
P(X ≤ 2) = F(2) = 0.9856
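The quality of the approximation in this example can be checked by computing the exact binomial probabilities alongside the Poisson ones. This is a sketch under the example's values (n = 5000, μ = np = 0.5, hence p = 0.0001); the helper names are illustrative.

```python
from math import comb, exp, factorial

n, p = 5000, 0.0001
mu = n * p  # Poisson mean used for the approximation

def binom_pmf(x):
    """Exact binomial probability of x defects among n births."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x):
    """Poisson probability of x defects with mean mu = np."""
    return mu**x * exp(-mu) / factorial(x)

exact_at_least_1 = 1 - binom_pmf(0)
approx_at_least_1 = 1 - poisson_pmf(0)
exact_at_most_2 = sum(binom_pmf(k) for k in range(3))
approx_at_most_2 = sum(poisson_pmf(k) for k in range(3))

print(round(exact_at_least_1, 4), round(approx_at_least_1, 4))
print(round(exact_at_most_2, 4), round(approx_at_most_2, 4))
```

With n this large and p this small, the two sets of probabilities agree to about four decimal places.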

Continuous random variables
Unlike counts, which are common discrete random variables, continuous random variables can assume any value over an entire interval. Measurements like the lengths of rainbow trout or the weights of manatees are examples of continuous random variables.
The probability density function (pdf) for a continuous random variable X is a function f defined for all real numbers x such that:
1. f(x) ≥ 0;
2. The region under the graph of f and above the x axis has an area of 1;
3. For any real numbers a and b, P(a ≤ X ≤ b) is given by the area bounded by the graph of f, the lines x = a and x = b, and the x axis.

Example: a graph of a density function of a continuous variable.
[Figure: density function with a shaded triangle to the left of x = 1.0]
P(X < 1.0) = area of the shaded triangle = ½ * 0.5 * 1.0 = 0.25
P(X < c) = P(X ≤ c), because the probability that X exactly equals any particular value is zero. Another way to think about it is that P(X = c) = 0 because it represents the area of the line segment under f(c), which has no width.
This is a fundamental difference between continuous and discrete variables. Remember that in the binomial or Poisson distributions P(X ≤ c) ≠ P(X < c), because P(X = c) can be a probability greater than zero.

The normal distribution
Most frequency distributions of interval- or ratio-scale data are observed to have a preponderance of values around the mean, with progressively fewer observations toward the extremes of the range of values. If n is large, the frequency polygons of many data distributions are “bell-shaped”.
[Figure: bell-shaped frequency polygon]

On the graph, the X axis represents different values of X, and the Y axis is the density, i.e., the frequency or probability of occurrence of X.
μ = mean
σ = standard deviation
The points of inflection (where the curve changes from bending up to bending down) on the curve f occur at μ ± σ.

μ is called a location parameter because it indicates where the graph is centered or positioned. σ determines the shape of f.

History of the normal distribution
The normal distribution was originally studied by De Moivre, who was curious about its use in predicting the probabilities in gambling.
The first person to apply the normal distribution to social data was Adolphe Quetelet. He collected data on the chest measurements of Scottish soldiers and the heights of French soldiers, and found that they were normally distributed.
His conclusion was that the mean was nature's ideal, and that data on either side of the mean were a deviation from nature's ideal.

Properties of the normal distribution curve
Normal distribution curves are bell-shaped and bilaterally symmetrical.
The tails of the curve approach the X axis but never touch it. Although the graph goes on indefinitely, the total area under the curve is 1.00.
Also characteristic of the normal distribution curve is that the mean, median, and mode have the same value.
The mode and median can be estimated by simply looking at a graph: the mode is the value with the highest frequency, and the median is the middle point. It is harder to estimate the mean, however, as that depends on the range of values.
What if these three values did not equal each other?
If mean < median < mode, the graph is negatively skewed: there are small outlier values.
If mean > median > mode, the graph is positively skewed: there are large outlier values.

Standard normal distribution
If a distribution is normal, the percentiles of the data are known: 68.27%, 95.45% and 99.73% of the data fall within one, two and three standard deviations of the mean, respectively.
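These percentages follow from the standard normal CDF. As a quick check (a sketch, not part of the slides), the probability that a normal variable lies within k standard deviations of its mean equals erf(k/√2), which the Python standard library can evaluate; the function name within_k_sd is illustrative.

```python
from math import erf, sqrt

def within_k_sd(k):
    """P(mu - k*sigma < X < mu + k*sigma) for any normal distribution."""
    return erf(k / sqrt(2))

for k in (1, 2, 3):
    print(k, round(100 * within_k_sd(k), 2))  # percentage within k standard deviations
```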

For each different μ and σ, we will have a different distribution. But if each distribution depends on its μ and σ, how can we compare different distributions? We would have to make a table for each and every possible μ and σ.
To solve this problem, the normal distribution is converted to the standard normal distribution: a normal distribution with a mean of 0 and a standard deviation of 1.
Let X be a normal random variable with mean μ and standard deviation σ. The transformation
Z = (X − μ)/σ
expresses X as the standard normal random variable Z with μ = 0 and σ = 1.

Suppose that the scores on an aptitude test are normally distributed with a mean of 100 and a standard deviation of 10. (Some of the original IQ tests were purported to have these parameters.) What is the probability that a randomly selected score is below 90?

Z = (90 − 100)/10 = −1.0
Thus a score of 90 can be represented as 1 standard deviation below the mean.
P(X < 90) = P(Z < −1.0)
Table C.3 catalogues the CDF for the standard normal distribution from Z = −3.99 to Z = 3.99 in increments of 0.01.
P(X < 90) = P(Z < −1.0) = 0.1587 (from Table C.3 we use the row marked −1.0 and the column marked 0.00)

What is the probability of a score between 90 and 115?

P(90 < X < 115) = P(−1.0 < Z < 1.5) = F(1.5) − F(−1.0) = 0.9332 − 0.1587 = 0.7745

What is the probability of a score of 125 or higher?

Z = (125 − 100)/10 = 2.5
P(X ≥ 125) = 1 − P(X < 125) = 1 − P(Z < 2.5) = 1 − F(2.5) = 1 − 0.9938 = 0.0062
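The table look-ups for the aptitude-test example can be reproduced with the error function, since the standard normal CDF is F(z) = ½[1 + erf(z/√2)]. A minimal sketch; the names standardize and std_normal_cdf are illustrative.

```python
from math import erf, sqrt

def standardize(x, mu, sigma):
    """Z = (X - mu) / sigma."""
    return (x - mu) / sigma

def std_normal_cdf(z):
    """F(z) = P(Z <= z) for the standard normal distribution."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 100, 10
z90 = standardize(90, mu, sigma)    # -1.0
z115 = standardize(115, mu, sigma)  #  1.5
z125 = standardize(125, mu, sigma)  #  2.5

print(round(std_normal_cdf(z90), 4))                         # P(X < 90)
print(round(std_normal_cdf(z115) - std_normal_cdf(z90), 4))  # P(90 < X < 115)
print(round(1 - std_normal_cdf(z125), 4))                    # P(X >= 125)
```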

Suppose that diastolic blood pressure X in hypertensive women centers about 100 mmHg, has a standard deviation of 16 mmHg, and is normally distributed. Find P(X < 90), P(X > 124) and P(96 < X < 104). Then find x so that P(X < x) = 0.95.
P(X < 90) = P(Z < −0.625) ≈ 0.2660
P(X > 124) = P(Z > 1.5) = 1 − F(1.5) = 1 − 0.9332 = 0.0668
P(96 < X < 104) = F(0.25) − F(−0.25) = 0.5987 − 0.4013 = 0.1974
To find x so that P(X < x) = 0.95: z = 1.645 = (x − 100)/16, so x = 100 + 16*1.645 = 126.32 mmHg
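The last part of this example runs the table backwards, from a probability to a value of x. A sketch of the same four calculations using statistics.NormalDist from the Python standard library (Python 3.8 or later); the variable name bp is illustrative.

```python
from statistics import NormalDist

# Diastolic blood pressure model from the example: mean 100 mmHg, sd 16 mmHg
bp = NormalDist(mu=100, sigma=16)

print(round(bp.cdf(90), 4))                # P(X < 90)
print(round(1 - bp.cdf(124), 4))           # P(X > 124)
print(round(bp.cdf(104) - bp.cdf(96), 4))  # P(96 < X < 104)
print(round(bp.inv_cdf(0.95), 2))          # x such that P(X < x) = 0.95
```

inv_cdf works with an unrounded z, so its answer can differ slightly from one obtained with the tabled value z = 1.645.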

Review: if p is very small, the Poisson distribution may also be used to approximate the binomial. Generally, for a good approximation, the binomial parameter n must be large (at least 20) and p must be small (no greater than 0.05). The approximation is very good if n ≥ 100 and np ≤ 10, with μ = np.

Normal approximation to the binomial distribution
Let X be a binomial random variable with parameters n and p. For large values of n, X is approximately normal with mean μ = np and variance σ² = np(1 − p).
A simple rule of thumb is that this approximation is acceptable for values of n and p such that σ² = np(1 − p) ≥ 3.
e.g., if p = 0.5, then n should be at least 12 before the approximation may be used, since 12*0.5*(1 − 0.5) = 3.
e.g., if p = 0.1, then n should be at least 34, since 34*0.1*(1 − 0.1) = 3.06.

Example: Recall the strain of inbred mice in which the probability of appearance of muscular dystrophy in any one mouse born of specified parents is ¼. Suppose a larger sample of progeny is generated and we would like to know the probability of fewer than 15 with muscular dystrophy in a sample of 60.
p = 1/4, n = 60
μ = np = 15
σ² = np(1 − p) = 11.25 > 3, so σ = 3.35
Binomial: P(X < 15) = F_B(14). Table C.1 covers only n = 5 to 20, so the exact value requires summing 15 terms.
Normal: P(X < 15) = F_B(14) ≈ F_N((14.5 − 15)/3.35) = F_N(−0.15) = 0.4404
But why was 14.5 used instead of 14 in this approximation?

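The height of each bar of the binomial density function is the probability of the corresponding discrete outcome. Because the width of each bar is 1, the area of the bar also represents the probability of that outcome. The bar for X = 14 spans 13.5 to 14.5, so the area under the normal curve must be taken up to 14.5 to include the whole bar. This adjustment is the correction for continuity.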
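The effect of using 14.5 rather than 14 can be seen by comparing both normal approximations with the exact binomial sum for n = 60 and p = 1/4. A minimal sketch; the variable names exact, with_correction and without_correction are illustrative.

```python
from math import comb, erf, sqrt

n, p = 60, 0.25
mu = n * p                      # 15
sigma = sqrt(n * p * (1 - p))   # about 3.35

def std_normal_cdf(z):
    """CDF of the standard normal distribution."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Exact binomial: P(X < 15) = F_B(14), summing the 15 terms for X = 0, ..., 14
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(15))

# Normal approximation with and without the continuity correction
with_correction = std_normal_cdf((14.5 - mu) / sigma)
without_correction = std_normal_cdf((14 - mu) / sigma)

print(round(exact, 4), round(with_correction, 4), round(without_correction, 4))
```

The value computed with 14.5 lies much closer to the exact sum than the value computed with 14, because it includes the whole bar for X = 14.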

Convergence of a Poisson distribution to a normal distribution
When the mean number of occurrences μ is very large, a Poisson distribution is well approximated by a normal distribution with mean μ and variance μ.

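Sampling distributions
The three major sampling distributions: the chi-square (χ²) distribution, the t distribution, and the F distribution.

Sampling distribution: when random samples of a given size are drawn from a known population, the probability distribution of a statistic computed from those samples is called a sampling distribution. Sampling distributions are the theoretical foundation of statistical inference.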

Sampling distributions
Example: Suppose a population of Peripatus sp., ancient wormlike animals of the Phylum Onychophora that live beneath stones, logs, and leaves in the tropics and subtropics, has a mean length of 3.20 cm and a standard deviation of 0.31 cm. These are population parameters, i.e., μ = 3.20 cm and σ = 0.31 cm.
If a random sample of 25 Peripatus is collected and the average length determined for this sample, it will likely not be 3.20 cm. It may be 3.12 cm or 3.36 cm or a myriad of other values. This one sample will generate a single X̄ that is a guess at μ. How good this guess is depends on:
the nature of the sampling process (randomness);
how variable the attribute is (σ);
how much effort is put into the sampling process (sample size).

Repeating this sampling process over and over indefinitely will generate hundreds, then thousands, of X̄'s. As long as the samples are random and have equal size n, their means will form a theoretical construct that we call the sampling distribution of the mean.
In real life we sample the population only once, but we realize that our sample comes from a theoretical sampling distribution of all possible samples of a particular size. The sampling distribution concept provides a link between sampling variability and probability.

Distribution of the sample mean
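When sampling from a normally distributed population with mean μ and variance σ², the distribution of the sample mean (the sampling distribution) will have the following attributes:
1. The distribution of X̄'s will be normal.
2. The mean of the X̄'s equals the population mean: μ_X̄ = μ.
3. The variance of the X̄'s is σ_X̄² = σ²/n, so σ_X̄ = σ/√n. σ_X̄ is the population standard error of the mean.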

When sampling from a nonnormally distributed population with mean μ and variance σ², the distribution of the sample mean (the sampling distribution) will have the following attributes:
1. The distribution of X̄'s will be approximately normal, with the approach to normality becoming better as the sample size increases. Generally, the sample size required for the sampling distribution of X̄ to approach normality depends on the shape of the original distribution. Samples of 30 or more give very good normal approximations for this sampling distribution of X̄ in nearly all situations. (This property of means is known as the Central Limit Theorem.)
2. The mean of the X̄'s equals the population mean: μ_X̄ = μ.
3. The variance of the X̄'s is σ_X̄² = σ²/n, so σ_X̄ = σ/√n. σ_X̄ is the population standard error of the mean.
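The attributes above can be illustrated by simulation. The sketch below is not from the slides: it assumes, purely for illustration, a strongly skewed exponential population with mean 1 and standard deviation 1, draws 2000 samples of size n = 30, and compares the mean and standard deviation of the sample means with μ and σ/√n.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(1)  # fixed seed so the run is repeatable

n = 30            # size of each sample
n_samples = 2000  # number of repeated samples

# Assumed population: exponential with mean 1 and standard deviation 1
sample_means = [
    mean(random.expovariate(1.0) for _ in range(n))
    for _ in range(n_samples)
]

print(round(mean(sample_means), 3))   # should be close to mu = 1
print(round(stdev(sample_means), 3))  # should be close to sigma / sqrt(n)
print(round(1 / sqrt(n), 3))          # theoretical standard error of the mean
```

A histogram of sample_means would look roughly bell-shaped even though the population itself is far from normal.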

The mean blood cholesterol concentration of a large population of adult males (50-60 years old) is 200 mg/dl with a standard deviation of 20 mg/dl. Assume that blood cholesterol measurements are normally distributed.
What is the probability that a randomly selected individual from this age group will have a blood cholesterol level below 250 mg/dl?
Solution. Apply the standard normal transformation.
P(X < 250) = P(Z < (250 − 200)/20) = P(Z < 2.5) = F(2.5) = 0.9938 (Table C.3)

What is the probability that a randomly selected individual from this age group will have a blood cholesterol level above 225 mg/dl?
Solution. Apply the standard normal transformation.
P(X > 225) = P(Z > (225 − 200)/20) = P(Z > 1.25) = 1 − F(1.25) = 1 − 0.8944 = 0.1056

What is the probability that the mean of a sample of 100 men from this age group will have a value below 204 mg/dl?
Solution. μ_X̄ = 200 mg/dl and σ_X̄ = 20/√100 = 2.0 mg/dl
P(X̄ < 204) = P(Z < (204 − 200)/2.0) = P(Z < 2.0) = F(2.0) = 0.9772

If a group of 25 older men who are strict vegetarians have a mean blood cholesterol level of 188 mg/dl, would you say that vegetarianism significantly lowers blood cholesterol levels? Explain.
Solution. σ_X̄ = 20/√25 = 4.0 mg/dl, so Z = (188 − 200)/4.0 = −3.0
P(X̄ < 188) = P(Z ≤ −3.0) = F(−3.0) = 0.0013 (Table C.3)
A sample mean this low would be very unlikely if these men had the same mean as the general population, so diet may affect blood cholesterol levels.

Portions of prepared luncheon meats should have pH values with a mean of 5.6 and a standard deviation of 1.0. The usual quality control procedure is to randomly sample each consignment of meat by testing the pH value of 25 portions. The consignment is rejected if the pH value of the sample mean exceeds 6.0. What is the probability of a consignment being rejected?
Solution. σ_X̄ = 1.0/√25 = 0.2, so Z = (6.0 − 5.6)/0.2 = 2.0
P(X̄ > 6.0) = P(Z > 2.0) = 1 − F(2.0) = 1 − 0.9772 = 0.0228
Only 2.28% of consignments will be rejected using the quality control procedure above.
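The sample-mean calculations in the cholesterol and luncheon-meat examples all follow one pattern: replace σ by the standard error σ/√n and then use the normal CDF. A minimal sketch using statistics.NormalDist; the function name prob_mean_below is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_below(xbar, mu, sigma, n):
    """P(sample mean < xbar) when sampling n values from a normal(mu, sigma) population."""
    standard_error = sigma / sqrt(n)  # sigma of the sample mean
    return NormalDist(mu, standard_error).cdf(xbar)

# Cholesterol: mean of 100 men below 204 mg/dl
print(round(prob_mean_below(204, 200, 20, 100), 4))

# Cholesterol: mean of 25 vegetarians below 188 mg/dl
print(round(prob_mean_below(188, 200, 20, 25), 4))

# Luncheon meat: mean pH of 25 portions above 6.0
print(round(1 - prob_mean_below(6.0, 5.6, 1.0, 25), 4))
```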

Population standard error: σ_X̄ = σ/√n
Sample standard error: s_X̄ = s/√n

Z (u) distribution and t distribution
Z = (X̄ − μ)/(σ/√n) is distributed as a normal distribution (Z distribution) with μ = 0 and σ = 1.
t = (X̄ − μ)/(s/√n) is distributed as a t distribution with μ = 0 and σ depending on the sample size.
The t distributions are symmetric and bell-shaped like the normal distribution but a little flatter, i.e., they have a larger standard deviation.
The degrees of freedom are just the sample size minus 1: df = n − 1 for any t distribution.

t-distribution & Student’s t-test
The t distribution was first presented by W. S. Gosset, who published it under the pseudonym “Student” (1908), hence the common references to “Student’s t distribution” and “Student’s t test”.

Example: one-tailed test for the hypotheses H0: μ ≥ 0 and HA: μ < 0
The data are weight changes of humans, tabulated after administration of a drug proposed to result in weight loss. Each weight change (in kg) is the weight after minus the weight before drug administration.
0.2 -0.5 -1.3 -1.6 -0.7 0.4 -0.1 0.0 -0.6 -1.1 -1.2 -0.8
n = 12
Mean = −0.61 kg
Variance (s²) = 0.4008 kg²
s = 0.63 kg
s_X̄ = s/√n = 0.18 kg
t = (X̄ − 0)/s_X̄ ≈ −3.33 (computed from unrounded values)
t 0.05(1),11 = 1.796
If t ≤ −t 0.05(1),11, reject H0.
Conclusion: since −3.33 ≤ −1.796, reject H0 and accept HA.
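The summary statistics and the t statistic for the weight-change data can be recomputed as follows. This is a sketch using only the standard library; the critical value 1.796 is taken from the example itself rather than computed, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Weight changes (kg): weight after minus weight before drug administration
changes = [0.2, -0.5, -1.3, -1.6, -0.7, 0.4, -0.1, 0.0, -0.6, -1.1, -1.2, -0.8]

n = len(changes)
xbar = mean(changes)           # sample mean
s2 = variance(changes)         # sample variance (divisor n - 1)
s = stdev(changes)             # sample standard deviation
standard_error = s / sqrt(n)   # sample standard error of the mean
t = (xbar - 0) / standard_error

t_critical = 1.796             # t 0.05(1),11 as given in the example
reject_h0 = t <= -t_critical   # one-tailed test of H0: mu >= 0

print(n, round(xbar, 2), round(s2, 4), round(s, 2))
print(round(t, 2), reject_h0)
```

Working from the rounded values −0.61 and 0.18 gives a slightly different t, but the conclusion is the same.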

Statistical hypothesis
Null hypothesis (abbreviated H0): The population that was sampled had a 3:1 ratio of pink-flowered to white-flowered roses. The hypothesis is referred to as a null hypothesis because it is a statement of “no difference”.
Alternate hypothesis (abbreviated HA): The population sampled had a flower color ratio that is not 3 pink : 1 white.


Chi-square distribution
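The normal deviate u follows a normal distribution with mean 0 and standard deviation 1. Suppose a random sample of size n is drawn from this population, with sample values u1, u2, …, un. Then the random variable χ² = u1² + u2² + … + un² is said to follow a chi-square (χ²) distribution.

With deviations measured from the sample mean, the sum becomes χ² = ∑ (Xi − X̄)²/σ² = (n − 1)s²/σ².
If all possible samples of size n are drawn from a normal population with a variance equal to σ², and for each of these samples the value (n − 1)s²/σ² is computed, these values will form a sampling distribution called a χ² distribution with n − 1 degrees of freedom.
The Greek letter “chi”, or χ, is pronounced like the “ky” in “sky”. The degrees of freedom for the chi-square distribution are often denoted by ν (nu).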

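F distribution
F = (s1²/σ1²) / (s2²/σ2²)

The F distribution is named after the first letter of the surname of the statistician R. A. Fisher.

In probability theory and statistics, the F distribution is a continuous probability distribution that is widely used in likelihood-ratio tests, especially in ANOVA. An F-distributed random variable is the ratio of two chi-square variables, each divided by its degrees of freedom:
F = (U1/d1) / (U2/d2)
U1 and U2 follow chi-square distributions with d1 and d2 degrees of freedom, respectively.
U1 and U2 are independent of each other.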

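Degrees of freedom: For example, consider a sample of 4 observations (n = 4) whose mean m equals 5. Under the constraint m = 5, once three values, say 4, 2 and 5, have been chosen freely, the fourth value can only be 9; otherwise m ≠ 5. The degrees of freedom here are therefore ν = n − 1 = 4 − 1 = 3. More generally, the degrees of freedom of any statistic are ν = n − (number of constraints).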


Chi-square (χ²) goodness of fit
Chi-square goodness of fit is widely used to infer whether the population from which a sample of nominal data came conforms to a certain theoretical distribution.
e.g., a plant geneticist may raise 100 progeny from a cross that is hypothesized to result in a 3:1 phenotypic ratio of pink-flowered to white-flowered roses. Perhaps a ratio of 84 pink : 16 white is observed, although out of this total of 100 roses the geneticist’s hypothesis would predict a ratio of 75 pink : 25 white.
The question to be answered, then, is whether the observed frequencies deviate significantly from the frequencies expected if the hypothesis were true.

The following calculation of a statistic called chi-square is used as a measure of how far a sample distribution deviates from a theoretical distribution:
χ² = ∑ (Oi − Ei)²/Ei
Here, Oi is the frequency, or number of counts, observed in class i, Ei is the frequency expected in class i if the null hypothesis is true, and the summation is performed over all k categories of data.
Larger disagreement between observed and expected frequencies will result in a larger χ² value. Thus, this type of calculation is referred to as a measure of goodness of fit. A calculated χ² value can be as small as zero, in the case of a perfect fit.

Example: Chi-square goodness of fit for two categories
Calculation of chi-square goodness of fit for k = 2 (e.g., data consisting of 100 flower colors compared to a hypothesized color ratio of 3:1)
H0: The sample data came from a population having a 3:1 ratio of pink to white flowers
HA: The sample data came from a population not having a 3:1 flower color ratio
Categories (flower color):  Pink   White   n
Oi:                         84     16      100
(Ei):                       (75)   (25)
Degrees of freedom = ν = k − 1 = 2 − 1 = 1
χ² = (84 − 75)²/75 + (16 − 25)²/25 = 4.320
0.025 < P < 0.05
Therefore, reject H0 and accept HA.
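The χ² calculation can be written as a short function that derives the expected frequencies from the hypothesized ratio. A minimal sketch; the function name chi_square_gof and its arguments are illustrative, and it returns only the statistic, which is then compared with a χ² table as in the example.

```python
def chi_square_gof(observed, ratio):
    """Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E.

    observed: observed counts per category
    ratio:    hypothesized ratio, e.g. (3, 1) for a 3:1 ratio
    """
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Flower colors: 84 pink and 16 white against a hypothesized 3:1 ratio
chi2 = chi_square_gof([84, 16], (3, 1))
print(round(chi2, 3))
```

A perfect fit, such as 75 pink and 25 white, gives a statistic of zero.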

Statistical errors in hypothesis testing
A probability of 5% or less is commonly used as the criterion for rejection of H0. The probability used as the criterion for rejection is termed the significance level, denoted by α, and the value of the test statistic corresponding to this probability is the critical value of the statistic.
It is very important to realize that a true null hypothesis will occasionally be rejected, which of course means that we have committed an error. This error will be committed with a frequency of α. That is, with α = 0.05, if H0 is in fact a true statement about a statistical population, it will be concluded erroneously to be false 5% of the time.

Two types of statistical errors
Type I error: The rejection of a null hypothesis when it is in fact a true statement is a Type I error (also called an α error, or an error of the first kind), i.e., rejecting a true hypothesis.
Type II error: On the other hand, if H0 is in fact false, our test may occasionally not detect this fact, and we shall have reached an erroneous conclusion by not rejecting H0. This error, of not rejecting the null hypothesis when it is in fact false, is a Type II error (also called a β error, or an error of the second kind), i.e., accepting a false hypothesis.
                        If H0 is true         If H0 is false
If H0 is rejected       Type I error (α)      No error (1 − β)
If H0 is not rejected   No error (1 − α)      Type II error (β)

Example: Chi-square goodness of fit for more than two categories
Calculation of chi-square goodness of fit for k = 4
H0: The sample came from a population having a 9:3:3:1 color pattern of flowers
HA: The sample came from a population not having a 9:3:3:1 color pattern of flowers
Categories (flower color):  Red rayed   Red margined   Blizzard   Rayed    n
Oi:                         152         39             53         6        250
(Ei):                       (140.6)     (46.9)         (46.9)     (15.6)
ν = k − 1 = 4 − 1 = 3
χ² = 8.956 (using the rounded expected frequencies shown)
0.025 < P < 0.05
Therefore, reject H0 and accept HA.

Chi-square correction for continuity
Chi-square values obtained from actual data belong to a discrete, or discontinuous, distribution. However, the theoretical χ² distribution is a continuous distribution.
χ² values calculated from discrete data (with ν = 1 in particular) are often overestimated and may therefore cause us to commit a Type I error with a probability greater than the stated α.
The Yates correction should routinely be used when ν = 1:
χ²c = ∑ (|Oi − Ei| − 0.5)²/Ei
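As an illustration of the correction, the sketch below applies the standard Yates formula, χ²c = ∑ (|O − E| − 0.5)²/E, to the 84 pink : 16 white data used earlier (ν = 1). The function name chi_square_yates is illustrative and not from the slides.

```python
def chi_square_yates(observed, expected):
    """Chi-square with Yates correction for continuity (intended for 1 degree of freedom)."""
    return sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))

observed = [84, 16]
expected = [75, 25]

uncorrected = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
corrected = chi_square_yates(observed, expected)

print(round(uncorrected, 3), round(corrected, 3))
```

The corrected statistic is smaller than the uncorrected one, which is the intended effect: it counteracts the overestimation described above.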

The log-likelihood ratio (G-test)
The χ² test is the traditional method for tests of goodness of fit (GOF). The G-test is an alternative to the χ² test for analyzing frequencies. The two methods are interchangeable.
The G-test is increasingly used because:
it is easier to calculate;
mathematicians believe it has theoretical advantages in advanced applications.
G = 2 ∑ O ln(O/E) (ln = natural logarithm)
The G-test statistic (G) uses the same tables as the χ² test. The G-test is based on the principle that the ratio of two probabilities can be used as a test statistic to measure the degree of agreement between sampled and expected frequencies.
Williams (1976) recommends that G be used in preference to χ² whenever any absolute difference |O − E| exceeds the corresponding expected frequency.
The two methods often yield the same conclusions; when they do not, many statisticians prefer the G-test and therefore recommend its routine use.

Example: G-test for more than two categories
H0: The sample came from a population having a 9:3:3:1 color pattern of flowers
HA: The sample came from a population not having a 9:3:3:1 color pattern of flowers
Categories (flower color):  Red rayed   Red margined   Blizzard   Rayed    n
Oi:                         152         39             53         6        250
(Ei):                       (140.6)     (46.9)         (46.9)     (15.6)
ν = k − 1 = 4 − 1 = 3
G = 2 ∑ O ln(O/E) ≈ 10.83 (using unrounded expected frequencies)
0.01 < P < 0.025
Therefore, reject H0 and accept HA.
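The G statistic for this example can be computed directly from G = 2 ∑ O ln(O/E). A minimal sketch; the function name g_statistic is illustrative, and the expected frequencies are derived from the 9:3:3:1 ratio without rounding, so the result can differ slightly from a hand calculation based on the rounded Ei in the table.

```python
from math import log

def g_statistic(observed, ratio):
    """Log-likelihood ratio statistic G = 2 * sum(O * ln(O / E))."""
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    # Categories with O = 0 contribute nothing to the sum and are skipped.
    return 2 * sum(o * log(o / e) for o, e in zip(observed, expected) if o > 0)

# Flower color pattern counts against the hypothesized 9:3:3:1 ratio
g = g_statistic([152, 39, 53, 6], (9, 3, 3, 1))
print(round(g, 2))
```

As with χ², the statistic is compared against a χ² table with ν = k − 1 degrees of freedom.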
