
1 Probability Theory (Rosen 5th ed., Chapter 5)

2 Random Experiment - Terminology
A random (or stochastic) experiment is an experiment whose result is not certain in advance.
An outcome is the result of a random experiment.
The sample space S of the experiment is the set of possible outcomes.
An event E is a subset of the sample space.

3 Examples
Experiment: roll two dice.
Possible outcomes: (1,2), (5,6), etc.
Sample space: {(1,1), (1,2), …, (6,6)} (36 possible outcomes).
Event: sum is 7 = {(1,6), (6,1), (2,5), (5,2), (3,4), (4,3)}.

4 Probability
The probability p = Pr[E] ∈ [0,1] of an event E is a real number representing our degree of certainty that E will occur.
- If Pr[E] = 1, then E is absolutely certain to occur.
- If Pr[E] = 0, then E is absolutely certain not to occur.
- If Pr[E] = 1/2, then we are completely uncertain about whether E will occur.
- What about other cases?

5 Four Definitions of Probability
Several alternative definitions of probability are commonly encountered:
- Frequentist, Bayesian, Laplacian, Axiomatic.
They have different strengths and weaknesses.
Fortunately, they coincide and work well with each other in most cases.

6 Probability: Laplacian Definition
First, assume that all outcomes in the sample space are equally likely.
Then, the probability of event E is given by:
Pr[E] = |E| / |S|

7 Example
Experiment: roll two dice.
Sample space S: {(1,1), (1,2), …, (6,6)} (36 possible outcomes).
Event E: sum is 7 = {(1,6), (6,1), (2,5), (5,2), (3,4), (4,3)}.
P(E) = |E| / |S| = 6 / 36 = 1/6
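A minimal Python sketch (not part of the original slides) that confirms the 6/36 count by enumerating the sample space; the names sample_space and event are illustrative.

```python
# Verify Pr[sum of two dice = 7] under the Laplacian definition
# by brute-force enumeration of the 36 equally likely outcomes.
from itertools import product
from fractions import Fraction

sample_space = list(product(range(1, 7), repeat=2))   # all 36 ordered pairs
event = [outcome for outcome in sample_space if sum(outcome) == 7]

prob = Fraction(len(event), len(sample_space))
print(len(event), len(sample_space), prob)            # 6 36 1/6
```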

8 Another Example
What is the probability of winning the lottery, assuming that you have to pick correctly 6 numbers out of 40 numbers?
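The slide leaves the question open; a hedged sketch under the standard reading (an unordered choice of 6 distinct numbers out of 40, with exactly one winning combination) gives 1 / C(40, 6).

```python
# Probability of winning = 1 / (number of 6-element subsets of 40 numbers).
from math import comb
from fractions import Fraction

p_win = Fraction(1, comb(40, 6))
print(comb(40, 6), p_win)   # 3838380 -> 1/3838380
```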

9 Probability of Complementary Events
Let E be an event in a sample space S.
Then, Ē represents the complementary event, and:
Pr[Ē] = 1 − Pr[E]
Proof:
Pr[Ē] = (|S| − |E|) / |S| = 1 − |E|/|S| = 1 − Pr[E]

10 Example
A sequence of ten bits is randomly generated. What is the probability that at least one of these bits is 0?
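Assuming each of the 2^10 bit strings is equally likely, the complement rule gives 1 − (1/2)^10 = 1023/1024; a brute-force sketch (not from the slides) confirming this:

```python
# Pr[at least one 0] = 1 - Pr[all bits are 1], checked by enumeration.
from itertools import product
from fractions import Fraction

strings = list(product("01", repeat=10))            # all 1024 bit strings
favourable = [s for s in strings if "0" in s]       # contain at least one 0
print(Fraction(len(favourable), len(strings)))      # 1023/1024
```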

11 Probability of Unions of Events
Let E1, E2 ⊆ S. Then:
Pr[E1 ∪ E2] = Pr[E1] + Pr[E2] − Pr[E1 ∩ E2]
- By the inclusion-exclusion principle.

12 Example
What is the probability that a positive integer selected at random from the set of positive integers not exceeding 100 is divisible by either 2 or 5?
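By inclusion-exclusion: 50 integers in 1..100 are divisible by 2, 20 by 5, and 10 by both, so the probability is (50 + 20 − 10)/100 = 3/5. A quick enumeration sketch (not from the slides):

```python
# Count integers in 1..100 divisible by 2 or 5 and divide by 100.
from fractions import Fraction

S = range(1, 101)
E = {n for n in S if n % 2 == 0 or n % 5 == 0}
print(len(E), Fraction(len(E), 100))   # 60 3/5
```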

13 Mutually Exclusive Events
Two events E1, E2 are called mutually exclusive if they are disjoint: E1 ∩ E2 = ∅.
Note that two mutually exclusive events cannot both occur in the same instance of a given experiment.
- If event E1 happens, then event E2 cannot, or vice versa.
For mutually exclusive events, Pr[E1 ∪ E2] = Pr[E1] + Pr[E2].

14 Example
Experiment: rolling a die once.
- Sample space S = {1, 2, 3, 4, 5, 6}
- E1 = 'observe an odd number' = {1, 3, 5}
- E2 = 'observe an even number' = {2, 4, 6}
- E1 ∩ E2 = ∅, so E1 and E2 are mutually exclusive.

15 Probability: Axiomatic Definition
How to study probabilities of experiments where outcomes may not be equally likely?
Let S be the sample space of an experiment with a finite or countable number of outcomes.
Define a function p: S → [0,1], called a probability distribution.
- Assign a probability p(s) to each outcome s, where
p(s) = (# of times s occurs) / (# of times the experiment is performed)
as the # of times the experiment is performed → ∞.

16 Probability: Axiomatic Definition
Two conditions must be met:
- 0 ≤ p(s) ≤ 1 for all outcomes s ∈ S.
- ∑ p(s) = 1 (sum is over all s ∈ S).
The probability of any event E ⊆ S is just:
Pr[E] = ∑ p(s), where the sum is over all s ∈ E.

17 Example
Suppose a die is biased (or loaded) so that 3 appears twice as often as each other number but that the other five outcomes are equally likely. What is the probability that an odd number appears when we roll the die?
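Normalising the weights gives p(3) = 2/7 and p(k) = 1/7 for every other face, so Pr[odd] = 1/7 + 2/7 + 1/7 = 4/7. A small sketch (not from the slides) of this distribution:

```python
# Loaded-die distribution: face 3 has twice the weight of each other face.
from fractions import Fraction

p = {face: Fraction(1, 7) for face in range(1, 7)}
p[3] = Fraction(2, 7)

assert sum(p.values()) == 1                      # axiomatic condition
print(sum(p[face] for face in (1, 3, 5)))        # Pr[odd] = 4/7
```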

18 Properties (same as before …)
Pr[Ē] = 1 − Pr[E]
Pr[E1 ∪ E2] = Pr[E1] + Pr[E2] − Pr[E1 ∩ E2]
For mutually exclusive events, Pr[E1 ∪ E2] = Pr[E1] + Pr[E2].

19 Independent Events
Two events E, F are independent if the occurrence of E has no effect on the occurrence of F:
Pr[E ∩ F] = Pr[E] · Pr[F]
Example: flip a coin and roll a die.
Pr[quarter is heads ∩ die is 1] = Pr[quarter is heads] × Pr[die is 1]

20 Example
Are the events
E = “a family with three children has children of both sexes”, and
F = “a family with three children has at most one boy”
independent?
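A sketch (assuming each of the eight boy/girl birth orders is equally likely, which the slide implies) that checks the product rule directly; the events turn out to be independent.

```python
# Compare Pr[E ∩ F] with Pr[E]·Pr[F] over the 8 equally likely birth orders.
from itertools import product
from fractions import Fraction

families = list(product("BG", repeat=3))                      # 8 outcomes
E = [f for f in families if "B" in f and "G" in f]            # both sexes
F = [f for f in families if f.count("B") <= 1]                # at most one boy
EF = [f for f in E if f.count("B") <= 1]

pE, pF, pEF = (Fraction(len(x), 8) for x in (E, F, EF))
print(pE, pF, pEF, pEF == pE * pF)    # 3/4 1/2 3/8 True
```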

21 Mutually Exclusive vs Independent Events
If A and B are mutually exclusive (and both have nonzero probability), they cannot be independent.
If A and B are independent (and both have nonzero probability), they cannot be mutually exclusive.

22 Conditional Probability
Let E, F be events with Pr[F] > 0. Then, the conditional probability of E given F, written Pr[E|F], is defined as
Pr[E|F] = Pr[E ∩ F] / Pr[F]
P(Cavity) = 0.1
- In the absence of any other information, there is a 10% chance that somebody has a cavity.
P(Cavity | Toothache) = 0.8
- There is an 80% chance that somebody has a cavity given that he has a toothache.

23 Example
A bit string of length four is generated at random so that each of the 16 bit strings of length four is equally likely. What is the probability that it contains at least two consecutive 0s, given that its first bit is a 0?
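One way to check the answer (5/8) is to enumerate the 16 strings and apply the definition Pr[E|F] = Pr[E ∩ F] / Pr[F]; a quick sketch, not from the slides:

```python
# Condition on the first bit being 0, then count strings containing "00".
from itertools import product
from fractions import Fraction

strings = ["".join(bits) for bits in product("01", repeat=4)]
F = [s for s in strings if s[0] == "0"]          # first bit is 0
E_and_F = [s for s in F if "00" in s]            # also has two consecutive 0s

print(Fraction(len(E_and_F), len(F)))            # 5/8
```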

24 Conditional Probability (cont’d)
Note that if E and F are independent, then Pr[E|F] = Pr[E].
Using Pr[E|F] = Pr[E ∩ F] / Pr[F] we have:
Pr[E ∩ F] = Pr[E|F] Pr[F]
Pr[E ∩ F] = Pr[F|E] Pr[E]
Combining them we get the Bayes rule:
Pr[F|E] = Pr[E|F] Pr[F] / Pr[E]

25 Bayes’s Theorem
Useful for computing the probability that a hypothesis H is correct, given data D:
Pr[H|D] = Pr[D|H] Pr[H] / Pr[D]
Extremely useful in artificial intelligence applications:
- Data mining, automated diagnosis, pattern recognition, statistical modeling, evaluating scientific hypotheses.

26 Example
Consider the probability of Disease given Symptoms:
P(Disease | Symptoms) = P(Symptoms | Disease) P(Disease) / P(Symptoms)
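A hedged numerical illustration; the prior, likelihood, and evidence values below are made up for demonstration and are not from the slides.

```python
# Bayes' rule with purely hypothetical numbers.
p_disease = 0.01          # hypothetical prior P(Disease)
p_sym_given_dis = 0.9     # hypothetical likelihood P(Symptoms | Disease)
p_sym = 0.1               # hypothetical evidence probability P(Symptoms)

p_dis_given_sym = p_sym_given_dis * p_disease / p_sym
print(p_dis_given_sym)    # 0.09, i.e. a 9% posterior under these assumptions
```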

27 Random Variable
A random variable V is a function (i.e., not a variable) that assigns a numerical value to each outcome of a random experiment!

28 Why Use Random Variables?
It is easier to deal with a summary variable than with the original probability structure.
Example: in an opinion poll, we ask 50 people whether they agree or disagree with a certain issue.
- Suppose we record a "1" for agree and a "0" for disagree.
- The sample space for this experiment has 2^50 possible elements.
- Suppose we are only interested in the number of people who agree.
- Define the variable X = “number of 1’s recorded out of 50”.
- It is easier to deal with this sample space - it has only 51 elements!

29 Random Variables
How is the probability function of a r.v. defined from the probability function of the original sample space?
- Suppose the original sample space is S = {s1, s2, …, sn}.
- Suppose the range of the r.v. X is {x1, x2, …, xm}.
- We will observe X = xj iff the outcome of the random experiment is an s ∈ S such that X(s) = xj, so
P(X = xj) = ∑ p(s), where the sum is over all s ∈ S with X(s) = xj.

30 Example
X = “sum of dice”
- X = 5 corresponds to E = {(1,4), (4,1), (2,3), (3,2)}
P(X=5) = P(E) = P((1,4)) + P((4,1)) + P((2,3)) + P((3,2)) = 4/36 = 1/9

31 Expectation Values
For a random variable X having a numeric domain, its expected value (or weighted average value, or arithmetic mean value) E[X] is defined as
E[X] = ∑ x·P(X = x), where the sum is over all values x in the range of X.
It provides a central point for the distribution of values of the r.v.

32 Example
X = “outcome of rolling a die”
E(X) = ∑ x·P(X=x) = 1·1/6 + 2·1/6 + 3·1/6 + 4·1/6 + 5·1/6 + 6·1/6 = 3.5

33 Another Example
X = “sum of two dice”
P(X=2) = P(X=12) = 1/36
P(X=3) = P(X=11) = 2/36
P(X=4) = P(X=10) = 3/36
P(X=5) = P(X=9) = 4/36
P(X=6) = P(X=8) = 5/36
P(X=7) = 6/36
E(X) = ∑ x·P(X=x) = 2·1/36 + 3·2/36 + … + 12·1/36 = 7
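A sketch (not from the slides) that recomputes E[X] for the sum of two dice directly from the 36 equally likely outcomes; it matches the tabulated value of 7.

```python
# Expected value of the sum of two fair dice by enumeration.
from itertools import product
from fractions import Fraction

outcomes = list(product(range(1, 7), repeat=2))
EX = sum(Fraction(a + b, 36) for a, b in outcomes)
print(EX)   # 7
```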

34 Linearity of Expectation
Let X1, X2 be any two random variables derived from the same sample space. Then:
- E[X1 + X2] = E[X1] + E[X2]
- E[aX1 + b] = aE[X1] + b

35 Example
Find the expected value of the sum of the numbers that appear when a pair of fair dice is rolled.
X1: “number appearing on first die”
X2: “number appearing on second die”
Y = X1 + X2: “sum of dice”
E(Y) = E(X1 + X2) = E(X1) + E(X2) = 3.5 + 3.5 = 7

36 Variance
The variance Var[X] = σ²(X) of a r.v. X is defined as:
Var[X] = E[(X − μ)²] = ∑ (x − μ)²·P(X = x), where μ = E[X].
The standard deviation or root-mean-square (RMS) difference of X is σ(X) = Var[X]^(1/2).
It indicates how spread out the values of the r.v. are.

37 Example
X = “outcome of rolling a die”
E(X) = ∑ x·P(X=x) = 1·1/6 + 2·1/6 + 3·1/6 + 4·1/6 + 5·1/6 + 6·1/6 = 3.5
Var(X) = E[(X − 3.5)²] = ∑ (x − 3.5)²·1/6 = 35/12 ≈ 2.92
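A small sketch (not from the slides) that evaluates Var[X] = E[(X − μ)²] exactly for a fair die, with μ = 7/2:

```python
# Exact variance of a single fair die roll.
from fractions import Fraction

mu = Fraction(7, 2)
var = sum((Fraction(x) - mu) ** 2 * Fraction(1, 6) for x in range(1, 7))
print(var, float(var))   # 35/12 2.9166...
```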

38 Properties of Variance
Let X1, X2 be any two independent random variables derived from the same sample space. Then:
Var[X1 + X2] = Var[X1] + Var[X2]

39 Example
Suppose two dice are rolled and X is a r.v. with
X = “sum of dice”.
Find Var(X).
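By independence, Var(X) = Var(X1) + Var(X2) = 35/12 + 35/12 = 35/6 ≈ 5.83; a brute-force sketch (not from the slides) over the 36 outcomes agrees.

```python
# Variance of the sum of two fair dice, computed directly and exactly.
from itertools import product
from fractions import Fraction

outcomes = list(product(range(1, 7), repeat=2))
EX = sum(Fraction(a + b, 36) for a, b in outcomes)                  # 7
var = sum(Fraction(1, 36) * (a + b - EX) ** 2 for a, b in outcomes)
print(var, float(var))   # 35/6 5.8333...
```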

