BINOMIAL DISTRIBUTION AND ITS APPLICATION
Binomial Distribution The binomial probability function –$f(x) = {}^{n}C_{x}\, p^{x} q^{n-x} = \frac{n!}{x!\,(n-x)!}\, p^{x} q^{n-x}$ for $x = 0, 1, 2, 3, \ldots, n$, where $q = 1 - p$ –This is called the binomial distribution
Basic Probability Concepts Probability is the foundation of statistics because of the concept of sampling, the concept of variation or dispersion, and the question of how likely it is that an observed difference is due to chance. Probability statements are used frequently in statistics –e.g., we say that we are 90% sure that an observed treatment effect in a study is real
Definition of Probabilities If some process is repeated a large number of times, n, and if some resulting event with the characteristic E occurs m times, the relative frequency of occurrence of E, m/n, will be approximately equal to the probability of E: $P(E) \approx m/n$. This is also known as the relative frequency definition of probability.
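The relative frequency idea can be illustrated with a small simulation. The sketch below is only an illustration, assuming a fair six-sided die and Python's standard pseudo-random generator; the function name `relative_frequency` and the fixed seed are choices made here, not part of the original slides.

```python
import random

def relative_frequency(event, trials, seed=0):
    """Estimate P(event) as m/n by simulating tosses of a fair die."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # m = number of simulated tosses for which the event occurs
    m = sum(1 for _ in range(trials) if event(rng.randint(1, 6)))
    return m / trials

# The estimate m/n should settle near 1/6 as n grows.
for n in (100, 10_000, 100_000):
    print(n, relative_frequency(lambda face: face == 4, n))
```

With few repetitions the estimate can be noticeably off; it is only for large n that m/n is expected to be close to the underlying probability.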
Elementary Properties of Probabilities - I Probability of an event is a non-negative number –Given some process (or experiment) with n mutually exclusive outcomes (events), $E_1, E_2, \ldots, E_n$, the probability of any event $E_i$ is assigned a nonnegative number –$P(E_i) \ge 0$
Elementary Properties of Probabilities - II Sum of the probabilities of all mutually exclusive and exhaustive outcomes is equal to 1 –Property of exhaustiveness refers to the fact that the observer of the process must allow for all possible outcomes –$P(E_1) + P(E_2) + \ldots + P(E_n) = 1$
Elementary Properties of Probabilities - III Probability of occurrence of either of two mutually exclusive events is equal to the sum of their individual probabilities –Given two mutually exclusive events A and B –P(A or B) = P(A) + P(B)
Characteristics of Probabilities Probabilities are expressed as fractions between 0.0 and 1.0 –e.g., 0.01, 0.05, 0.10, 0.50, 0.80 –Probability of a certain event = 1.0 –Probability of an impossible event = 0.0 Application to biomedical research –e.g., ask if results of study or experiment could be due to chance alone –e.g., significance level and power –e.g., sensitivity, specificity, predictive values
Examples: (1) Suppose we toss a die. What is the probability of 4 coming up? Since there are six mutually exclusive and equally likely outcomes, of which 4 is only one, the probability of 4 coming up is 1/6. (2) Suppose we toss 2 coins. We can have the following outcomes: both heads, HH; one head and one tail, TH or HT; and both tails, TT (H=Head; T=Tail). Suppose we want to know the probability of HH. HH being one of the four equally likely outcomes, the probability of obtaining HH is 1/4.
(3) Suppose we throw 2 dice and we want the probability of a total of 7 points. A total of 7 can come in 6 ways (1-6,2-5,3-4,4-3,5-2, or 6-1). So the numerator will be 6. Since we have 6 sides for each die, the total number of ‘equally likely’ ‘mutually exclusive’ outcomes is 6 x 6 =36. So the chance of getting a total of 7 when we throw 2 dice is 6/36 (or 1/6).
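The counting argument for two dice can be checked by listing all 36 outcomes. This is a minimal sketch, assuming two fair six-sided dice; the helper name `prob_total` is hypothetical and used only for illustration.

```python
from fractions import Fraction
from itertools import product

def prob_total(total, sides=6):
    """Probability that two fair dice sum to `total`, by enumeration."""
    # All equally likely, mutually exclusive outcomes: (1,1), (1,2), ..., (6,6)
    outcomes = list(product(range(1, sides + 1), repeat=2))
    favourable = [o for o in outcomes if sum(o) == total]
    return Fraction(len(favourable), len(outcomes))

print(prob_total(7))  # 1/6
```

Using `Fraction` keeps the result exact, so 6/36 is reduced to 1/6 rather than shown as a rounded decimal.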
Examples: (Addition Law) When we toss a die, what is the probability of getting 2 or 4 or 6? The prob. of 2 = 1/6 The prob. of 4 = 1/6 The prob. of 6 = 1/6 Because these outcomes are mutually exclusive, the probability of 2 or 4 or 6 is: 1/6 + 1/6 + 1/6 = 3/6 = 1/2 (Multiplication Law) In tossing 2 coins, Prob. of head in one coin = 1/2 Prob. of head in another coin = 1/2 Because the two tosses are independent, the prob. of head in both coins = 1/2 x 1/2 = 1/4
Combinations Based on the last examples, it is clear that we need an easier way to calculate the probability of a particular result –If a set consists of n objects, and we wish to form a subset of k objects from these n objects, without regard to the order of the objects in the subset, the result is called a combination The number of combinations of n objects taken k at a time is given by –${}^{n}C_{k} = \frac{n!}{k!\,(n-k)!}$ –Where k! (k factorial) is the product of all whole numbers from k down to 1, and 0! = 1
Permutations Similar to combinations –If a set consists of n objects, and we wish to form a subset of k objects from these n objects, taking into account the order of the objects in the subset, the result is called a permutation The number of permutations of n objects taken k at a time is given by –${}^{n}P_{k} = \frac{n!}{(n-k)!}$
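Both counting formulas can be tried out numerically. The sketch below writes them out with factorials and compares the results with `math.comb` and `math.perm` from the Python standard library (available from Python 3.8); the function names `n_choose_k` and `n_permute_k` are illustrative only.

```python
from math import comb, factorial, perm

def n_choose_k(n, k):
    """Combinations: n! / (k! (n-k)!), order ignored."""
    return factorial(n) // (factorial(k) * factorial(n - k))

def n_permute_k(n, k):
    """Permutations: n! / (n-k)!, order taken into account."""
    return factorial(n) // factorial(n - k)

print(n_choose_k(5, 3), comb(5, 3))   # 10 10
print(n_permute_k(5, 3), perm(5, 3))  # 60 60
```

Each combination of 3 objects can be ordered in 3! = 6 ways, which is why the permutation count here is six times the combination count.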
Probability distributions of discrete variables A table, graph, formula, or other device used to specify all possible values of a discrete random variable along with their respective probabilities –P(X=x) Tables – value, frequency, probability Graph – usually bar chart Formula - Binomial distribution
Theoretical Probability Distributions –If we know (reasonably) that data are from a certain distribution, then we know a lot about it –Means, standard deviations, other measures of dispersion –That knowledge makes it easier to make statistical inferences; i.e., to test differences Many types of distributions –1300+ have been documented in the literature Three main ones –Binomial (discrete - 0,1) –Poisson (discrete counts) –Normal (continuous)
Binomial Distribution Derived from a series of binary outcomes called a Bernoulli trial When a random process or experiment, called a trial, can result in only one of two mutually exclusive outcomes, such as dead or alive, sick or well, the trial is called a Bernoulli trial
Bernoulli Process A sequence of Bernoulli trials forms a Bernoulli process under the following conditions –Each trial results in one of two possible, mutually exclusive, outcomes: “success” and “failure” –Probability of success, p, remains constant from trial to trial. Probability of failure is q = 1-p. –Trials are independent; that is, success in one trial does not influence the probability of success in a subsequent trial.
Bernoulli Process - Example Probability of a certain sequence of binary outcomes (Bernoulli trials) is a function of p and q. For example, the probability of a particular sequence of 3 “successes” and 2 “failures” can be represented by $p \cdot p \cdot p \cdot q \cdot q = p^3 q^2$. However, if we ask for the probability of 3 “successes” and 2 “failures” in a set of 5 trials, in any order, then we need to know how many possible combinations of 3 successes and 2 failures there are out of all of the possible outcomes.
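The step from one particular sequence to "3 successes in any order" can be made concrete by listing every sequence of 5 trials. This is a sketch under stated assumptions: the value p = 0.3 is an arbitrary illustration, and the function name `count_and_probability` is hypothetical.

```python
from itertools import product
from math import comb, isclose

def count_and_probability(n, k, p):
    """Enumerate all S/F sequences of length n; keep those with k successes."""
    q = 1 - p
    count = 0
    total = 0.0
    for seq in product("SF", repeat=n):
        if seq.count("S") == k:
            count += 1
            # every such sequence has the same probability p^k q^(n-k)
            total += p**k * q**(n - k)
    return count, total

count, prob = count_and_probability(5, 3, 0.3)
print(count)  # 10 sequences contain exactly 3 successes
print(isclose(prob, comb(5, 3) * 0.3**3 * 0.7**2))  # True
```

The enumeration finds 10 sequences, which matches the number of combinations of 5 objects taken 3 at a time, so the probability is 10 times that of any single sequence.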
Parameters of Binomial distribution n (the number of trials) and p (the probability of a ‘success’) are the two quantities which define a binomial distribution. They are called the parameters of the binomial distribution. The Binomial distribution is applicable to the situation where: i) the n trials are independent (i.e., what occurs in one trial does not affect what will occur in the next trial), ii) at each trial there are only two possible outcomes (‘success’ or ‘failure’), iii) the probability of a ‘success’ (p) is known and is the same for all trials
Mean and Variance of the Binomial distribution To obtain the probability distribution for a particular combination of ‘n’ and ‘p’ we need to calculate the probabilities P(X=0), P(X=1), …, P(X=n), where X is the random variable representing the number of ‘successes’ in ‘n’ trials. When ‘n’ is large, the calculation of these probabilities becomes very tedious and time consuming. However, we can obtain the mean and variance of X as a summary of the distribution. For a binomial distribution the mean ($\mu$) is given by $np$ and the variance ($\sigma^2$) is equal to $np(1-p)$. The mean is the average value of the random variable that would be expected to occur in the long run, and the standard deviation, $\sigma = \sqrt{np(1-p)}$, measures the expected variation from the mean
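The shortcut formulas np and np(1-p) can be compared with the mean and variance computed directly from the individual probabilities. The following is a minimal sketch; the parameter values n = 10 and p = 0.3 and the function names are chosen here purely for illustration.

```python
from math import comb, isclose

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial distribution with parameters n and p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def mean_and_variance(n, p):
    """Mean and variance obtained by summing over P(X=0), ..., P(X=n)."""
    probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
    mean = sum(k * pk for k, pk in enumerate(probs))
    var = sum((k - mean) ** 2 * pk for k, pk in enumerate(probs))
    return mean, var

n, p = 10, 0.3  # illustrative values
mean, var = mean_and_variance(n, p)
print(isclose(mean, n * p), isclose(var, n * p * (1 - p)))  # True True
```

The direct summation needs all n + 1 probabilities, whereas the formulas need only n and p, which is the practical advantage when n is large.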
Binomial Table Normally, we would look up probabilities in the Binomial Table Tables of the Binomial probability distribution function –P(X=k) –Find the probability of x = 4 successes when the number of trials n = 10 and the probability of success p = 0.3 –Find the probability that x ≤ 4 –Find the probability that x ≥ 5
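Where a printed table is not at hand, the same entries can be computed from the binomial formula. The sketch below does this for n = 10 and p = 0.3; the function names `binomial_pmf` and `binomial_cdf` are illustrative, and the cumulative probability shown is an example of the kind of lookup a table supports rather than a value quoted in the slides.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Single table entry: P(X = k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binomial_cdf(k, n, p):
    """Cumulative entry: P(X <= k), the sum of P(X = 0) ... P(X = k)."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))

n, p = 10, 0.3
print(round(binomial_pmf(4, n, p), 4))      # P(X = 4)  -> 0.2001
print(round(binomial_cdf(4, n, p), 4))      # P(X <= 4) -> 0.8497
print(round(1 - binomial_cdf(4, n, p), 4))  # P(X >= 5) -> 0.1503
```

An upper-tail probability is obtained as one minus the cumulative probability just below it, which is how such values are usually read off a cumulative table.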
In medical research, an outcome of interest can often be expressed as the presence or absence of a particular disease, sign or symptom or as whether the patient lived or died, or recovered or did not recover. In each case we are dealing with an outcome in which exactly one of two alternatives can occur.
Suppose we know that the survival rate for a particular disease is 20% and we have 10 patients with this disease. We would use the binomial distribution to calculate the probability of having 3 or fewer patients survive. The answer is 0.88, so that we have about a 90% chance of having 3 or fewer patients surviving (or 7 or more dying)
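The figure of 0.88 can be reproduced by summing the binomial probabilities for 0, 1, 2 and 3 survivors, taking n = 10 and p = 0.2 from the example; the individual terms are rounded to four decimal places:

```latex
\begin{aligned}
P(X \le 3) &= \sum_{k=0}^{3} \binom{10}{k} (0.2)^{k} (0.8)^{10-k} \\
           &\approx 0.1074 + 0.2684 + 0.3020 + 0.2013 \\
           &= 0.8791 \approx 0.88
\end{aligned}
```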
(d) P(two or fewer) = P(X ≤ 2) = 1 − P(X = 3) = 1 − 0.729 = 0.271 (e) P(two or three) = P(X = 2 or X = 3) = P(X = 2) + P(X = 3) = 0.243 + 0.729 = 0.972 (f) P(exactly three) = P(X = 3) = 0.729
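The statement of this exercise is not included above, so the parameters have to be guessed. The three answers are consistent with n = 3 trials and p = 0.9 (since 0.9 cubed is 0.729); the sketch below checks the answers under that assumption only, and the function name `binomial_pmf` is illustrative.

```python
from math import comb, isclose

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial distribution with parameters n and p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# ASSUMPTION: n and p are inferred from the answers, not given in the text.
n, p = 3, 0.9

p_exactly_three = binomial_pmf(3, n, p)                 # part (f)
p_two_or_three = binomial_pmf(2, n, p) + p_exactly_three  # part (e)
p_two_or_fewer = 1 - p_exactly_three                    # part (d)

print(isclose(p_exactly_three, 0.729))  # True
print(isclose(p_two_or_three, 0.972))   # True
print(isclose(p_two_or_fewer, 0.271))   # True
```

Part (d) uses the complement because, with only three trials, "two or fewer" covers every outcome except "exactly three".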