SSE 2193 Engineering Statistics Chapter 2 Special Variables 2A Binomial Distribution.


1 SSE 2193 Engineering Statistics Chapter 2 Special Variables 2A Binomial Distribution

2 Events from experiments While it is educational to learn about general discrete and continuous random variables, such variables are not much use in describing real life events. The result of experiments usually can be expected to mimic some distributions with special properties.

3 Bernoulli trials In experiments where a certain process is carried out repeatedly, with each process independent of the others, it is possible to deduce the result of an event. In particular, if we can separate the outcomes of an experiment into two groups, one which is desirable, and we call it success, while the other a failure, then we have the case of a Bernoulli trial.

4 Success and failure Take the weather. If we wish for a sunny day, then we call a sunny day a success, and a rainy day a failure. A person may expect a grade B+ or better as a success, then anything less will be a failure. If you are expected to reach an office before 10.00 a.m. then arriving before then is a success, and being late will be a failure.

5 Binomial Distribution If we repeat a Bernoulli trial n times and let X count the number of successes, then X has a binomial distribution provided that (1) each trial is independent of the others, and (2) the probability of success is the same in every trial. If the probability of success in each trial is p, we write X ~ Bin(n, p).

6 The values of X When a trial is done n times, the number of successes must be one of 0 (all failures), 1, 2, …, n (all successes). Hence, unlike a general discrete random variable (DRV), a binomial variable X ~ Bin(n, p) takes only the integer values from 0 to n, and, provided 0 < p < 1, every r in that range has a non-zero probability. In short: (1) $X \in \{0, 1, \dots, n\}$; (2) $P(X=r) > 0$ for all $r \in \{0, 1, \dots, n\}$. Of course, P(X=r) may be so small that in practice we just put it as 0. But technically, no value of X has zero probability.

7 P(X=r) If X ~ Bin(n, p), then (1) P(X=0) is the probability of no success at all, which equals $(1-p)^{n}$; (2) P(X=n) is the probability of getting all n successes, which equals $p^{n}$; (3) for any other r, $P(X=r) = {}^{n}C_{r}\, p^{r} (1-p)^{n-r}$. NOTE: For brevity, we usually use the symbol q to represent 1-p. Hence the formula is frequently written as $P(X=r) = {}^{n}C_{r}\, p^{r} q^{n-r}$.

8 nCr ${}^{n}C_{r}$ designates the number of ways to choose r items out of n items. Its value is $n(n-1)(n-2)\cdots(n-r+1)/r!$, where $r! = 1 \times 2 \times 3 \times \cdots \times r$. For example, ${}^{8}C_{3} = (8 \times 7 \times 6)/(1 \times 2 \times 3) = 56$. This means that, given 8 items, you have 56 different ways of selecting 3 items out of the lot. Note that ${}^{n}C_{n} = {}^{n}C_{0} = 1$.

9 Bin(n, p) – Example 1 If a Bernoulli trial has a success rate of 0.2, and it is carried out 7 times, what is the probability we get (i) 4 successes? (ii) No success? Solution: Let X represent the number of successes. Then X ~ Bin(7, 0.2). (i) $P(X=4) = {}^{7}C_{4}\,(0.2)^{4}(0.8)^{3} = 0.0287$. (ii) $P(X=0) = (0.8)^{7} = 0.2097$.
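
As a cross-check on the formula, the following minimal Python sketch evaluates nCr and P(X=r) directly. The helper name `binom_pmf` is our own and is not part of the course material; `math.comb` is the standard-library binomial coefficient (Python 3.8 or later).

```python
from math import comb

def binom_pmf(r: int, n: int, p: float) -> float:
    """P(X = r) for X ~ Bin(n, p), using nCr * p^r * q^(n-r)."""
    q = 1 - p
    return comb(n, r) * p**r * q**(n - r)

ways = comb(8, 3)              # number of ways to choose 3 items out of 8
p_four = binom_pmf(4, 7, 0.2)  # Example 1(i)
p_none = binom_pmf(0, 7, 0.2)  # Example 1(ii)
print(ways, round(p_four, 4), round(p_none, 4))  # prints: 56 0.0287 0.2097
```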

10 Bin(n, p) – Example 2 It is known that in 35% of accidents involving motorcycles, the rider dies. On a day when 10 such accidents are reported, what is the probability that (i) 2 riders die? (ii) At least 3 riders die? Solution: Let D represent the number of deaths. Then D ~ Bin(10, 0.35). (i) $P(D=2) = {}^{10}C_{2}\,(0.35)^{2}(0.65)^{8} = 0.17565$. (ii) In this case, we want $P(D \ge 3)$. This means we need to add up P(D=3), P(D=4), …, P(D=10). However, we note that the sum of all probabilities is 1. So we may also obtain $P(D \ge 3) = 1 - [P(D=0) + P(D=1) + P(D=2)] = 1 - [0.01346 + 0.07249 + 0.17565] = 0.7384$ (correct to 4 decimal places).
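
The complement shortcut in Example 2 can be checked numerically. The sketch below is illustrative only (the helper names `binom_pmf` and `binom_cdf` are our own): it adds up P(D=3) to P(D=10) directly and compares the result with 1 minus the cumulative probability up to 2.

```python
from math import comb

def binom_pmf(r, n, p):
    """P(X = r) for X ~ Bin(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def binom_cdf(k, n, p):
    """P(X <= k): the cumulative probability from below."""
    return sum(binom_pmf(r, n, p) for r in range(k + 1))

n, p = 10, 0.35
direct = sum(binom_pmf(r, n, p) for r in range(3, n + 1))  # long way
via_complement = 1 - binom_cdf(2, n, p)                    # short way
print(round(direct, 4), round(via_complement, 4))
```

Both routes give the same value; the complement simply needs three terms instead of eight.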

11 Tables of binomial distributions In practice, it is not desirable to carry out calculations such as P(D=3), P(D=4), …, P(D=10) one by one as in the last example. Apart from taking time, this tedious work may lead to errors. Particularly when calculators are not available, statisticians find it more convenient to use ready-made tables of binomial distributions. The UTM table provides probabilities cumulative from below, i.e. the table shows $P(X \le k)$ for each k, not P(X=k).

12 UTM Table for Binomial Distribution The binomial distribution table shows cumulative probabilities $P(X \le k)$ for k = 0, …, n. The table lists the probabilities separately for n = 1, 2, …, 20, 23, 25, 27 and 30. It covers p = 0.01, …, 0.09 (first part) and p = 0.10, 0.15, …, 0.50 (second part). For values of p beyond 0.5, we have to use complementary procedures. For values of p between those given, and for n = 21, 22, 26, 28 and 29, we may use linear interpolation to obtain approximate answers.

13 Use and Limitations of Table Going back to Example 2, we could have referred to the table and read off $P(D \le 2) = 0.2616$, and hence obtained $P(D \ge 3) = 1 - 0.2616 = 0.7384$. However, space constraints mean that only $n \le 30$ can be accommodated in a small handbook, and we only have values of p = 0.01, …, 0.09, 0.10, 0.15, …, 0.50. We have to deal with cases of other p values ourselves. In the following examples, you will learn how to use the table when it applies, and how to overcome its limitations when it does not.

14 Example 3 25 trainees undergo a perseverance test. Based on records, it is known that 40% of them will drop out before completion. What is the probability that (i) up to 6 of them will drop out? (ii) 4 to 8 of them will drop out? Solution: Let X represent the number of trainees who drop out. Then X ~ Bin(25, 0.4). (i) $P(X \le 6) = 0.0736$ [read from table]. (ii) $P(4 \le X \le 8) = P(X \le 8) - P(X \le 3) = 0.2735 - 0.0024 = 0.2711$. Note that in (ii), we have to express the value as the difference between two cumulative probabilities.
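
The "difference of two cumulative values" rule used in Example 3 can be sketched in Python as follows. This is an illustration under our own naming (`binom_cdf` and `prob_between` are hypothetical helpers); the cumulative function plays the role of the printed table.

```python
from math import comb

def binom_pmf(r, n, p):
    """P(X = r) for X ~ Bin(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def binom_cdf(k, n, p):
    """P(X <= k), i.e. what a cumulative table would list."""
    return sum(binom_pmf(r, n, p) for r in range(k + 1))

def prob_between(a, b, n, p):
    """P(a <= X <= b) = P(X <= b) - P(X <= a - 1)."""
    return binom_cdf(b, n, p) - binom_cdf(a - 1, n, p)

up_to_six = binom_cdf(6, 25, 0.4)            # Example 3(i)
four_to_eight = prob_between(4, 8, 25, 0.4)  # Example 3(ii)
print(round(up_to_six, 4), round(four_to_eight, 4))
```

Note that the lower limit is shifted down by one: to keep X = 4 in the range we subtract P(X ≤ 3), not P(X ≤ 4).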

15 Example 4 The head of department calls a policy meeting among his 23 clerical staff. From experience, he knows that 15% of them will be absent. What is the probability that (i) 3 to 5 of them will be absent? (ii) At least 3 of them will be absent? Solution: Let A = number absent, so A ~ Bin(23, 0.15). (i) We need $P(3 \le A \le 5)$. In this case, we can read $P(A \le 5) = 0.8811$ and $P(A \le 2) = 0.3080$, so $P(3 \le A \le 5) = P(A \le 5) - P(A \le 2) = 0.5731$. (contd)

16 Example 4 (contd) (i) (contd) However, as $P(3 \le A \le 5) = P(A=3) + P(A=4) + P(A=5)$, it is just as easy to calculate the three values using the formula. Now $P(A=3) = {}^{23}C_{3}\,(0.15)^{3}(0.85)^{20} = 0.23167$; $P(A=4) = {}^{23}C_{4}\,(0.15)^{4}(0.85)^{19} = 0.20442$; $P(A=5) = {}^{23}C_{5}\,(0.15)^{5}(0.85)^{18} = 0.13708$. So $P(3 \le A \le 5) = 0.23167 + 0.20442 + 0.13708 = 0.5732$ (4 d.p.), which agrees with the table result up to rounding. (ii) The event $A \ge 3$ needs to be interpreted as the complement of the event $A \le 2$. Hence $P(A \ge 3) = 1 - P(A \le 2) = 1 - 0.3080 = 0.6920$.

17 Example 5 The probability that a new menu from the fast-food restaurant MOM sells successfully is 0.8. MOM proposes to bring out 18 new menus this year. What is the probability that (i) all will be successful? (ii) At least 12 will be successful? (iii) 10 to 15 menus will be successful? Solution: Let S represent the number of successful menus. Then S ~ Bin(18, 0.8). (i) $P(S=18) = (0.8)^{18} = 0.0180$.

18 Example 5 (contd) (ii) The value we want is $P(S \ge 12)$. Unfortunately, the table does not provide for p = 0.8. Let us interpret this another way. Let S' represent the number of unsuccessful menus. Then S' ~ Bin(18, 0.2). [Refer to the next slide.] Now $S \ge 12$ corresponds to $S' \le 6$, and from the table we see that $P(S' \le 6) = 0.9487$. Hence $P(S \ge 12) = P(S' \le 6) = 0.9487$.

19 Table for S and S' Since $S' = 18 - S$, each value of S pairs with exactly one value of S':
S    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
S'  18  17  16  15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
(ii) $S \ge 12$ corresponds to $S' \le 6$. (iii) $10 \le S \le 15$ corresponds to $3 \le S' \le 8$.

20 Example 5 (contd) (iii) Again we need S' for this. In this case, $10 \le S \le 15$ is identical to $3 \le S' \le 8$. [See the previous slide.] So $P(10 \le S \le 15) = P(3 \le S' \le 8) = P(S' \le 8) - P(S' \le 2) = 0.9957 - 0.2713 = 0.7244$. WARNING: Determining events in S using the complement S' is a tricky process. You need to take care that the events really do match; otherwise you will get wrong results. The tables shown are very useful in identifying the equivalent complement events. Use them.
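
Because matching events in S with events in S' is easy to get wrong, a numerical check can help. The sketch below (our own helper names, not part of the course) computes each probability twice for Example 5: once directly with p = 0.8, and once through the count of failures with p = 0.2.

```python
from math import comb

def binom_pmf(r, n, p):
    """P(X = r) for X ~ Bin(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def binom_cdf(k, n, p):
    """P(X <= k)."""
    return sum(binom_pmf(r, n, p) for r in range(k + 1))

n = 18
# (ii) S >= 12 successes  <->  S' <= 6 failures
direct_ii = 1 - binom_cdf(11, n, 0.8)
via_failures_ii = binom_cdf(6, n, 0.2)
# (iii) 10 <= S <= 15  <->  3 <= S' <= 8
direct_iii = binom_cdf(15, n, 0.8) - binom_cdf(9, n, 0.8)
via_failures_iii = binom_cdf(8, n, 0.2) - binom_cdf(2, n, 0.2)
print(round(direct_ii, 4), round(via_failures_ii, 4))
print(round(direct_iii, 4), round(via_failures_iii, 4))
```

If the two numbers on a line disagree, the event in S' has been matched incorrectly.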

21 Example 6 The probability that a flight of airline A is full is 0.75. There are 19 flights of A from an airport. What is the probability that (i) more than 15 will be full? (ii) 10 to 15 will be full? Solution: Let F represent the number of full flights. Then F ~ Bin(19, 0.75). As p is more than 0.5, we introduce F', which represents the number of flights which are not full. Hence F' ~ Bin(19, 0.25).

22 Table for F and F' Since $F' = 19 - F$, each value of F pairs with exactly one value of F':
F    0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19
F'  19  18  17  16  15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
(i) $F > 15$ corresponds to $F' \le 3$. (ii) $10 \le F \le 15$ corresponds to $4 \le F' \le 9$.

23 Example 6 (contd) (i) More than 15 flights full is equivalent to 3 or fewer flights not full. Hence we conclude that $P(F > 15) = P(F' \le 3) = 0.2631$. (ii) Similarly, we find that $10 \le F \le 15$ is the same as $4 \le F' \le 9$. This means that $P(10 \le F \le 15) = P(4 \le F' \le 9) = P(F' \le 9) - P(F' \le 3) = 0.9911 - 0.2631 = 0.7280$.

24 Properties of binomial distribution For X ~ Bin(n, p), the mean is $\mu = np$ and the variance is $\sigma^{2} = npq$. When p is small, most of the probability lies near X = 0. The opposite is true when p is close to 1, where most of the probability lies near X = n. When p is close to 0.5 (say from 0.3 to 0.7), the distribution is fairly symmetric.
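
The formulas for the mean and variance can be confirmed from the definition by summing over all values of X. The sketch below uses n = 10 and p = 0.35 purely as illustrative values.

```python
from math import comb

def binom_pmf(r, n, p):
    """P(X = r) for X ~ Bin(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 10, 0.35  # illustrative values only
pmf = [binom_pmf(r, n, p) for r in range(n + 1)]

mean = sum(r * pr for r, pr in enumerate(pmf))                    # E(X)
variance = sum((r - mean) ** 2 * pr for r, pr in enumerate(pmf))  # Var(X)

print(mean, n * p)                 # both are 3.5 up to rounding error
print(variance, n * p * (1 - p))   # both are 2.275 up to rounding error
```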

25 Bar charts for p=0.4 and p=0.9

26 Example 7 The mean of a binomial distribution X is 8, and the variance is 4.8. Find $P(3 \le X \le 8)$. Solution: Let X ~ Bin(n, p); then the mean is np = 8 and the variance is npq = 4.8. Hence q = 4.8/8 = 0.6, so p = 0.4 and n = 8/0.4 = 20. And so X ~ Bin(20, 0.4), giving $P(3 \le X \le 8) = P(X \le 8) - P(X \le 2) = 0.5956 - 0.0036 = 0.5920$. NOTE: This example is an exercise to test your skills in using the knowledge about the mean and variance of a binomial distribution. It has no other practical applications. As such, we shall not be dealing with this type of problem anymore.
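
The steps of Example 7 (recover q, then p, then n, then evaluate the range probability) can be sketched as below. The function name `params_from_moments` is hypothetical, and rounding n to the nearest integer is our own guard against floating-point error.

```python
from math import comb

def binom_pmf(r, n, p):
    """P(X = r) for X ~ Bin(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def params_from_moments(mean, variance):
    """Recover (n, p) of a binomial from its mean np and variance npq."""
    q = variance / mean   # npq / np = q
    p = 1 - q
    n = round(mean / p)   # np / p = n, rounded to a whole number of trials
    return n, p

n, p = params_from_moments(8, 4.8)
prob = sum(binom_pmf(r, n, p) for r in range(3, 9))  # P(3 <= X <= 8)
print(n, round(p, 4), round(prob, 4))
```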

27 Example 8 The probability a man suffers from colour-blindness is 0.2; for a woman, it is 0.05. 4 men and 7 women are checked for colour-blindness. What is the probability that (i) exactly 3 people are colour-blind? (ii) More women than men are colour-blind? Solution: Let M and W represent the numbers of men and women who are colour-blind. Then M ~ Bin(4, 0.2) and W ~ Bin(7, 0.05). We first construct a probability table for each of the variables.

28 Distributions of M and W
Probability distribution of M:
r         0       1       2       3       4
P(M=r)    0.4096  0.4096  0.1536  0.0256  0.0016
Probability distribution of W:
r         0       1       2       3       4       5
P(W=r)    0.6983  0.2572  0.0406  0.0036  0.0002  0.0000
Note that P(W=5) = 0.0000059, NOT 0. However, as we keep our values correct to only 4 decimal places, we show it as 0. The values for r = 6 and 7 are even smaller, and are not shown.

29 Answer to Ex 8(i) (i) We need the value of P(M+W=3). There is no short cut here. We work it out case by case, using the independence of M and W: $P(M+W=3) = P(M=0 \cap W=3) + P(M=1 \cap W=2) + P(M=2 \cap W=1) + P(M=3 \cap W=0) = 0.4096 \times 0.0036 + 0.4096 \times 0.0406 + 0.1536 \times 0.2572 + 0.0256 \times 0.6983 = 0.0755$ (4 d.p.).

30 Answer to Ex 8(ii) (ii) We need the value of P(W>M). This also has to be analysed case by case. We reason it this way: either W=1 and M=0, with probability $0.2572 \times 0.4096$; or W=2 and M=0 or 1, with probability $0.0406 \times 0.8192$; or W=3 and M=0, 1 or 2, with probability $0.0036 \times 0.9728$; or W=4 and M=0, 1, 2 or 3, with probability $0.0002 \times 0.9984$. As the probabilities for W=5 or more are very small, we need not consider them. Thus $P(W>M) = 0.2572 \times 0.4096 + 0.0406 \times 0.8192 + 0.0036 \times 0.9728 + 0.0002 \times 0.9984 = 0.1423$.
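
The case-by-case reasoning of Example 8 amounts to summing P(M=m) × P(W=w) over every pair (m, w) that satisfies the event. A minimal sketch, assuming M and W are independent as in the example (variable names are our own):

```python
from math import comb
from itertools import product

def binom_pmf(r, n, p):
    """P(X = r) for X ~ Bin(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

p_m = [binom_pmf(r, 4, 0.2) for r in range(5)]    # M ~ Bin(4, 0.2)
p_w = [binom_pmf(r, 7, 0.05) for r in range(8)]   # W ~ Bin(7, 0.05)

pairs = list(product(range(5), range(8)))         # every (m, w) combination

# (i) exactly 3 colour-blind people in total
p_total_three = sum(p_m[m] * p_w[w] for m, w in pairs if m + w == 3)
# (ii) more colour-blind women than men
p_more_women = sum(p_m[m] * p_w[w] for m, w in pairs if w > m)

print(round(p_total_three, 4), round(p_more_women, 4))
```

Unlike the hand calculation, the code keeps the terms for W = 5, 6 and 7; they are too small to change the fourth decimal place.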

31 A note on combining binomial distributions As can be seen from Ex 8, combining variables with binomial distributions is difficult. In the exceptional event that X ~ Bin($n_1$, p) and Y ~ Bin($n_2$, p) are independent and share the same p, we can conclude that X+Y ~ Bin($n_1 + n_2$, p). However, even when the p's are the same, this does not help with comparisons of the type X>Y (as in 8(ii)). Apart from this case (equal p's), there is no short-cut in combining the two variables. The event has to be broken down into separate cases and calculated step by step. This is of course practical only when $n_1$ and $n_2$ are small. For larger values of n, we need computer programs to perform the calculations, or use approximations. (See 2D)
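
The equal-p rule can be checked numerically by building the distribution of X+Y term by term and comparing it with Bin(n1+n2, p). The values n1 = 4, n2 = 7 and p = 0.2 below are illustrative only, and `convolve` is our own helper.

```python
from math import comb

def binom_dist(n, p):
    """List of P(X = r) for r = 0..n, X ~ Bin(n, p)."""
    return [comb(n, r) * p**r * (1 - p)**(n - r) for r in range(n + 1)]

def convolve(px, py):
    """Distribution of X + Y for independent X and Y."""
    out = [0.0] * (len(px) + len(py) - 1)
    for i, a in enumerate(px):
        for j, b in enumerate(py):
            out[i + j] += a * b   # the case X = i and Y = j
    return out

n1, n2, p = 4, 7, 0.2
sum_dist = convolve(binom_dist(n1, p), binom_dist(n2, p))
direct = binom_dist(n1 + n2, p)
max_gap = max(abs(a - b) for a, b in zip(sum_dist, direct))
print(max_gap)  # rounding error only
```

The same `convolve` step works when the p's differ, but the result is then no longer a binomial distribution.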

32 Interpolation It is impossible to provide a table which includes all values of n and p for binomial distributions. In particular, it is not meaningful to include tables for n > 30. For p, the UTM tables give values in increments of 0.01 from 0.01 to 0.10, then in increments of 0.05 up to 0.50. This means we have no table for p = 0.11, 0.12, 0.13 and 0.14, and similarly for the values between 0.15 and 0.20, and so on. The table would become exceptionally thick if we were to include these values. In problems where such p values appear, we have two ways out: one is to do our own calculations, and the other is to use interpolation.

33 Interpolation Procedure If $f(x_1) = a$ and $f(x_2) = b$, then for a value x between $x_1$ and $x_2$, we can obtain a linear interpolation of f(x) by the formula $f(x) \approx f(x_1) + [f(x_2) - f(x_1)] \times (x - x_1)/(x_2 - x_1)$. For example, if f(5) = 18.7 and f(8) = 22.9, then $f(6.2) \approx f(5) + [f(8) - f(5)] \times (6.2 - 5)/(8 - 5) = 18.7 + (22.9 - 18.7) \times 1.2/3 = 20.38$. As linear interpolation replaces the actual relation f with a straight line, it can only provide an approximate value. It further assumes that the function is not discontinuous or wildly uneven. This method is only used to save time when accuracy is not too important. Otherwise, the only way out is to carry out the calculations yourself.
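
The interpolation formula translates directly into a short function. This is a minimal sketch; the name `linear_interpolate` and its argument order are our own choices.

```python
def linear_interpolate(x, x1, f1, x2, f2):
    """Estimate f(x) on the straight line through (x1, f1) and (x2, f2)."""
    return f1 + (f2 - f1) * (x - x1) / (x2 - x1)

# The worked example: f(5) = 18.7, f(8) = 22.9, estimate f(6.2)
estimate = linear_interpolate(6.2, 5, 18.7, 8, 22.9)
print(round(estimate, 2))  # prints: 20.38
```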

34 Example 9 12% of dengue patients hemorrhage. In a ward, there are 16 dengue patients. What is the probability that (i) up to 3 will hemorrhage? (ii) If, by a certain time, 2 patients have already hemorrhaged, what is the probability that at least 2 more will hemorrhage? Solution: Let H = number who hemorrhage, so H ~ Bin(16, 0.12). Unfortunately, our tables do not include p = 0.12 for n = 16. We have to either resort to direct calculation of each value or use the method of interpolation.

35 Interpolation for Ex 9 We shall use the tables of $H_1$ ~ Bin(16, 0.10) and $H_2$ ~ Bin(16, 0.15). Note that $P(H_1 \le 3) = 0.9316$ and $P(H_2 \le 3) = 0.7899$. (i) By the rule of interpolation, $P(H \le 3) \approx 0.9316 + (0.7899 - 0.9316) \times 2/5 = 0.8749$. Note that exact calculation yields $P(H \le 3) = 0.8838$ (4 d.p.). (ii) This asks for the conditional probability $P(H \ge 4 \mid H \ge 2) = P(H \ge 4 \cap H \ge 2) / P(H \ge 2)$. Since the event $H \ge 4 \cap H \ge 2$ is the same as $H \ge 4$, the answer we need is $P(H \ge 4)/P(H \ge 2)$. Now $P(H \ge 2) = 1 - [P(H=0) + P(H=1)] = 1 - [0.12934 + 0.28219] = 0.5885$, and $P(H \ge 4) = 1 - P(H \le 3) = 1 - 0.8838 = 0.1162$, using the exact value from (i). So $P(H \ge 4)/P(H \ge 2) = 0.1162/0.5885 = 0.1975$. [In this case, we carry out exact calculations because only a few numbers are involved.]
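
To see how far the interpolated value is from the exact one, and to reproduce the conditional probability, the following sketch may help. The two table readings are copied from the slide; the helper names are our own.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Bin(n, p)."""
    return sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(k + 1))

# Table readings quoted in the example
table_p10 = 0.9316   # P(H1 <= 3), H1 ~ Bin(16, 0.10)
table_p15 = 0.7899   # P(H2 <= 3), H2 ~ Bin(16, 0.15)

# (i) interpolate at p = 0.12, which lies 2/5 of the way from 0.10 to 0.15
interpolated = table_p10 + (table_p15 - table_p10) * (0.12 - 0.10) / (0.15 - 0.10)
exact = binom_cdf(3, 16, 0.12)

# (ii) P(H >= 4 | H >= 2) = P(H >= 4) / P(H >= 2)
conditional = (1 - binom_cdf(3, 16, 0.12)) / (1 - binom_cdf(1, 16, 0.12))

print(round(interpolated, 4), round(exact, 4), round(conditional, 4))
```

The gap between the interpolated and exact values shows why interpolation is suitable only when accuracy is not critical.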

36 Complex problems In some cases, the probability of an event is built on another event. The next few examples are complex problems of such nature. In addition to this, we shall also meet problems which combine different distributions, including Poisson, normal and others. We shall come to such problems when they arise.

37 Example 10 A school has 18 classes, each with 20 students. During a flu season, 15% of students are absent. (i) To prevent the epidemic from spreading, a class will be closed if 6 or more students are absent. What is the probability a class will close? (ii) For administrative purposes, the school is forced to close temporarily when three or more classes are closed. What is the probability the school will close?

38 Example 10 - Solution (i) Let A represent the number of absentees in a class; then A ~ Bin(20, 0.15). The probability a class will close is $P(A \ge 6) = 1 - P(A \le 5) = 1 - 0.9327 = 0.0673$. (ii) Let C represent the number of classes closed. Then C ~ Bin(18, 0.0673). The probability the school will close is $P(C \ge 3) = 1 - P(C \le 2) = 1 - [(0.9327)^{18} + 18 \times 0.0673 \times (0.9327)^{17} + {}^{18}C_{2} \times (0.0673)^{2} \times (0.9327)^{16}] = 0.1168$. Note: If we wish to use the statistics tables, use Bin(18, 0.06) and Bin(18, 0.07) and carry out interpolation.

39 Example 11 The probability Chai succeeds in netting a ball is 0.8. (i) Chai is given 6 chances to net a ball, and she is to be rewarded with an ice-cream cone if she nets at least 4. What is the probability she will get the reward in a trial? (ii) 5 classmates challenge Chai to win by netting 4 balls or more in 6 trials. Assuming that she remains consistent throughout the trial, what is the probability she can get 3 ice-cream cones or more?

40 Example 11 (Solution) (i) Let B = number of balls netted, so B ~ Bin(6, 0.8). $P(B \ge 4) = P(B=4) + P(B=5) + P(B=6) = {}^{6}C_{4}\,(0.8)^{4}(0.2)^{2} + {}^{6}C_{5}\,(0.8)^{5}(0.2) + (0.8)^{6} = 0.90112$. (ii) Let I = number of ice-cream cones rewarded, so I ~ Bin(5, 0.90112). $P(I \ge 3) = {}^{5}C_{3}\,(0.90112)^{3}(0.09888)^{2} + {}^{5}C_{4}\,(0.90112)^{4}(0.09888) + (0.90112)^{5} = 0.0715426 + 0.3259935 + 0.5941733 = 0.9917$.
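
Examples 10 and 11 share the same two-stage pattern: the probability found in part (i) becomes the success probability p of a second binomial variable in part (ii). A minimal sketch of that pattern, with a hypothetical helper `prob_at_least`:

```python
from math import comb

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Bin(n, p)."""
    return sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(k, n + 1))

# Example 10: a class closes if 6+ of 20 students are absent;
# the school closes if 3+ of 18 classes close.
p_class = prob_at_least(6, 20, 0.15)
p_school = prob_at_least(3, 18, p_class)

# Example 11: a cone is won with 4+ balls netted out of 6;
# we want 3+ cones out of 5 challenges.
p_cone = prob_at_least(4, 6, 0.8)
p_three_cones = prob_at_least(3, 5, p_cone)

print(round(p_class, 4), round(p_school, 4))
print(round(p_cone, 5), round(p_three_cones, 4))
```

The code carries the unrounded first-stage probability into the second stage, so the last digit may differ slightly from a hand calculation that rounds in between.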

41 Example 12 A lecturer runs a quiz with 4 questions. The questions are a bit confusing and the probability a student can guess at the correct answer for each question is 0.75. A student passes if he/she gets 3 correct answers. What is the probability, out of 40 students, at least 30 will pass?

