Probabilistic thinking – part 1
Nur Aini Masruroh
Events
- An event is a distinction about some state of the world.
- Examples:
  - Whether the next person entering the room is a beer drinker
  - The date of the next general election
  - Whether it will be raining tonight
  - Who our next head of department will be
  - Etc.
Clarity test
- When we identify an event, we have in mind what we mean. But will other people know precisely what we mean? We may not even have a precise definition ourselves.
- To avoid ambiguity, every event should pass the clarity test.
- Clarity test: a check that we are absolutely clear and precise about the definition of every event we are dealing with in a decision problem.
- The clarity test is conducted by submitting our definition of each event to a clairvoyant.
- A clairvoyant is a hypothetical being who:
  - Is competent and trustworthy
  - Knows the outcome of any past and future event
  - Knows the value of any physically defined quantity, both in the past and in the future
  - Has infinite computational (mental) power and can perform any reasoning and computation instantly and without effort
Clarity test (cont’d)
- Passing the clarity test: an event passes if and only if the clairvoyant can tell its outcome without any further judgment.
- Examples:
  - “The next person entering this room is a beer drinker.” What is a beer drinker? What is a beer?
  - “The next person entering this room is a graduate.” What is a graduate?
- Neither statement passes the test until these terms are defined precisely.
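Possibility tree: single-event tree
- Example: the event “the next person entering this room is a businessman”.
- Let B denote the outcome that the person is a businessman and B’ the outcome that the person is not. The tree has a single node with two branches, B and B’.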
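Possibility tree: two-event trees
- Several events can be considered simultaneously.
- Example: the event “the next person entering this room is a businessman” and the event “the next person entering this room is a graduate” can be considered jointly.
- With outcomes B, B’ for the first event and G, G’ for the second, the tree has four paths: (B, G), (B, G’), (B’, G), and (B’, G’).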
Reversing the order of events in a tree
- In the previous example, we considered the distinctions in the order “businessman” then “graduate”, i.e., B to G.
- The same information can be expressed with the events in the reverse order, i.e., G to B.
Multiple event trees
- We can jointly consider three events: businessman, graduate, and gender.
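A possibility tree with several events can be enumerated mechanically: each root-to-leaf path is one combination of outcomes. The sketch below is only an illustration; the outcome labels (in particular M and F for gender) and the helper name `tree_paths` are assumptions, not part of the slides.

```python
from itertools import product

# Outcome labels for each event. The labels, especially M/F for gender,
# are assumed for illustration.
events = {
    "businessman": ["B", "B'"],
    "graduate": ["G", "G'"],
    "gender": ["M", "F"],
}

def tree_paths(order):
    """Return every root-to-leaf path of the possibility tree drawn in the given event order."""
    return list(product(*(events[name] for name in order)))

paths = tree_paths(["businessman", "graduate", "gender"])
for path in paths:
    print(" -> ".join(path))

# Drawing the tree in another order changes the layout, not the set of joint outcomes.
reversed_paths = tree_paths(["gender", "graduate", "businessman"])
same_outcomes = {frozenset(p) for p in paths} == {frozenset(p) for p in reversed_paths}
```

Three two-outcome events give 2 × 2 × 2 = 8 paths, whichever order the tree is drawn in.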
Using probability to represent uncertainty
- Frequentist view
  - Probabilities are fundamentally dispositional properties of non-deterministic physical systems.
  - Probabilities are viewed as long-run frequencies of events.
  - This is the standard interpretation used in classical statistics.
- Subjective (Bayesian) view
  - Probabilities are representations of our subjective degree of belief.
  - Probabilities in general are not necessarily tied to any physical system or process that can be repeated indefinitely.
Assigning probabilities to events
- The probability we assign to an event depends on our state of information about that event.
- Example: information relevant to assessing the likelihood that the next person entering the room is a businessman might include the following:
  - There is an alumni meeting outside the room, and most of the alumni are businessmen.
  - You have arranged to meet a friend here who, to your knowledge, is not a businessman, and she is going to show up at any moment.
  - Etc.
- After considering all relevant background information, we express the likelihood that the next person entering the room is a businessman by assigning a probability value to each of the possible outcomes.
Marginal and conditional probabilities
- In general, given information about the outcome of some events, we may revise our probabilities of other events.
- We do this through the use of conditional probabilities.
- The probability of an event X given a specific outcome of another event Y is called the conditional probability of X given Y.
- The conditional probability of event X given event Y and other background information ξ is denoted by p(X|Y, ξ) and is given by
  p(X|Y, ξ) = p(X, Y|ξ) / p(Y|ξ), provided p(Y|ξ) > 0
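The definition p(X|Y, ξ) = p(X, Y|ξ) / p(Y|ξ) can be tried out on a small joint table. The following is a minimal sketch using the businessman (B) and graduate (G) events; all probability values and the function names are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
from fractions import Fraction

# Hypothetical joint probabilities p(B, G | ξ); the numbers are assumed for illustration.
joint = {
    ("B", "G"): Fraction(24, 100),
    ("B", "G'"): Fraction(6, 100),
    ("B'", "G"): Fraction(28, 100),
    ("B'", "G'"): Fraction(42, 100),
}

def marginal_b(b):
    """Marginal p(b | ξ): add the joint probabilities over all outcomes of G."""
    return sum(p for (b_outcome, _), p in joint.items() if b_outcome == b)

def conditional_g_given_b(g, b):
    """Conditional p(g | b, ξ) = p(b, g | ξ) / p(b | ξ); assumes p(b | ξ) > 0."""
    return joint[(b, g)] / marginal_b(b)

p_b = marginal_b("B")
p_g_given_b = conditional_g_given_b("G", "B")
print(p_b, p_g_given_b)
```

With these assumed numbers, learning that the person is a businessman fixes the probability of "graduate" at (24/100) / (30/100) = 4/5.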
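Factorization rule for joint probability
- The joint probability of two events X and Y can be factored in either order of conditioning:
  p(X, Y|ξ) = p(X|Y, ξ) p(Y|ξ) = p(Y|X, ξ) p(X|ξ)
- For three events X, Y, and Z:
  p(X, Y, Z|ξ) = p(X|Y, Z, ξ) p(Y|Z, ξ) p(Z|ξ)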
Changing the order of conditioning
- Suppose the previous tree is drawn in the order B to G, so that it gives p(B|ξ) and p(G|B, ξ).
- There is no reason why we should always condition G on B. Suppose we want to draw the tree in the order G to B, which requires p(G|ξ) and p(B|G, ξ).
- We need to flip the tree!
Flipping the tree
- Graphical approach:
  - Change the ordering of the underlying possibility tree.
  - Transfer the elemental (joint) probabilities from the original tree to the new tree.
  - Compute the marginal probability for the first variable in the new tree, i.e., G, by adding the elemental probabilities that belong to G_1 and to G_2 respectively.
  - Compute the conditional probabilities for B given G by dividing each elemental probability by the corresponding marginal probability of G.
- Bayes’ theorem: flipping the tree as above is already an application of Bayes’ theorem.
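The tree-flipping procedure can be sketched in a few lines: form the elemental (joint) probabilities, add them up for each outcome of the new first variable, then divide to get the new conditionals. This is an illustrative sketch only; the function name `flip_tree` and all probability values are assumptions, and G' stands in for the second outcome of G.

```python
from fractions import Fraction

# Original tree in the order B to G. All numbers are hypothetical.
p_b = {"B": Fraction(3, 10), "B'": Fraction(7, 10)}
p_g_given_b = {
    "B": {"G": Fraction(4, 5), "G'": Fraction(1, 5)},
    "B'": {"G": Fraction(2, 5), "G'": Fraction(3, 5)},
}

def flip_tree(p_first, p_second_given_first):
    """Reverse the conditioning order of a two-event tree.

    Assumes every outcome of the second event has non-zero marginal probability.
    """
    # Elemental (joint) probabilities: one per path of the original tree.
    joint = {
        (x, y): p_first[x] * p_y
        for x, branch in p_second_given_first.items()
        for y, p_y in branch.items()
    }
    # Marginal of the new first variable: add the elemental probabilities per outcome.
    p_second = {}
    for (x, y), p in joint.items():
        p_second[y] = p_second.get(y, 0) + p
    # Conditionals for the new second variable: elemental probability / marginal.
    p_first_given_second = {y: {} for y in p_second}
    for (x, y), p in joint.items():
        p_first_given_second[y][x] = p / p_second[y]
    return p_second, p_first_given_second

p_g, p_b_given_g = flip_tree(p_b, p_g_given_b)
print(p_g["G"], p_b_given_g["G"]["B"])
```

Flipping the flipped tree once more recovers the original p(B|ξ) and p(G|B, ξ), which is a useful sanity check on the arithmetic.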
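Bayes’ theorem
- Given two uncertain events X and Y, suppose the probabilities p(X|ξ) and p(Y|X, ξ) are known. Then
  p(X|Y, ξ) = p(Y|X, ξ) p(X|ξ) / p(Y|ξ)
  where p(Y|ξ) = Σ_X p(Y|X, ξ) p(X|ξ), with the sum taken over all outcomes of X.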
Probabilistic dependency or relevance
- Let A be an event with n possible outcomes a_i, i = 1, …, n, and let B be an event with m possible outcomes b_j, j = 1, …, m.
- Event A is said to be probabilistically dependent on event B if p(A|b_j, ξ) ≠ p(A|b_k, ξ) for some j ≠ k.
  - The conditional probability of A given B differs for different outcomes or realizations of event B. We also say that B is relevant to A.
- Event A is said to be probabilistically independent of event B if p(A|b_j, ξ) = p(A|b_k, ξ) for all j ≠ k.
  - The conditional probability of A given B is the same for all outcomes or realizations of event B. We also say that B is irrelevant to A.
- In fact, if A is independent of B, then p(A|B, ξ) = p(A|ξ).
- Intuitively, independence means that knowing the outcome of one event provides no information about the probabilities of the outcomes of the other event.
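The definition suggests a direct check: A is independent of B exactly when every row of the conditional probability table p(A|b_j, ξ) is the same. Below is a minimal sketch of that check; the two tables and the function name `is_independent` are hypothetical.

```python
from fractions import Fraction

def is_independent(cpt):
    """Return True if p(A | b_j, ξ) is identical for every outcome b_j of B.

    `cpt` maps each outcome b_j to the distribution of A given b_j.
    """
    rows = list(cpt.values())
    return all(row == rows[0] for row in rows[1:])

# Hypothetical table in which B is relevant to A: the rows differ.
dependent_cpt = {
    "b1": {"a1": Fraction(4, 5), "a2": Fraction(1, 5)},
    "b2": {"a1": Fraction(2, 5), "a2": Fraction(3, 5)},
}

# Hypothetical table in which B is irrelevant to A: every row is the same.
independent_cpt = {
    "b1": {"a1": Fraction(7, 10), "a2": Fraction(3, 10)},
    "b2": {"a1": Fraction(7, 10), "a2": Fraction(3, 10)},
}

print(is_independent(dependent_cpt), is_independent(independent_cpt))
```

Exact fractions are used so that the equality test is not disturbed by floating-point rounding; with floats, a small tolerance would be the safer comparison.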
Joint probability distribution of independent events
- In general, the joint probability distribution for any two uncertain events A and B is
  p(A, B|ξ) = p(A|B, ξ) p(B|ξ)
- If A and B are independent, then since p(A|B, ξ) = p(A|ξ), we have
  p(A, B|ξ) = p(A|ξ) p(B|ξ)
- The joint probability of A and B is simply the product of their marginal probabilities.
- In general, the joint probability for n mutually independent events is
  p(X_1, X_2, …, X_n|ξ) = p(X_1|ξ) p(X_2|ξ) … p(X_{n-1}|ξ) p(X_n|ξ)
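For mutually independent events, the whole joint distribution follows from the marginals alone. The sketch below builds it as a product of marginals for three events; the marginal values, outcome labels, and the function name `independent_joint` are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product
from math import prod

# Hypothetical marginal distributions p(X_i | ξ) of three mutually independent events.
marginals = [
    {"x1": Fraction(1, 2), "x2": Fraction(1, 2)},
    {"y1": Fraction(1, 4), "y2": Fraction(3, 4)},
    {"z1": Fraction(1, 10), "z2": Fraction(9, 10)},
]

def independent_joint(marginals):
    """Joint distribution of mutually independent events: the product of their marginals."""
    joint = {}
    for combo in product(*(m.items() for m in marginals)):
        outcomes = tuple(name for name, _ in combo)
        joint[outcomes] = prod(p for _, p in combo)
    return joint

joint = independent_joint(marginals)
total = sum(joint.values())
print(joint[("x1", "y2", "z1")], total)
```

Three marginals with two outcomes each are enough to specify all eight joint probabilities, which is the practical payoff of independence.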
Conditional independence or relevance
- Suppose two events A and B are found not to be independent.
- Introduce an event C with two outcomes, c_1 and c_2.
  - If C = c_1 is true, we have p(A|B, c_1, ξ) = p(A|c_1, ξ).
  - If C = c_2 is true, we have p(A|B, c_2, ξ) = p(A|c_2, ξ).
- Then we say that event A is conditionally independent of event B given event C.
- Definition (conditional independence): given three distinct events A, B, and C, if p(A|B, c_k, ξ) = p(A|c_k, ξ) for all k, that is, the conditional probability table (CPT) for A given B and C has identical rows across the outcomes of B for every realization of C, then we say that A and B are conditionally independent given C, and denote this by A ⊥ B | C.
Conditional independence (cont’d)
- If A ⊥ B | C, then p(A|B, C, ξ) = p(A|C, ξ).
- Example: given the following conditional probabilities:
  p(a_1|b_1, c_1) = 0.9    p(a_2|b_1, c_1) = 0.1
  p(a_1|b_2, c_1) = 0.9    p(a_2|b_2, c_1) = 0.1
  p(a_1|b_1, c_2) = 0.8    p(a_2|b_1, c_2) = 0.2
  p(a_1|b_2, c_2) = 0.8    p(a_2|b_2, c_2) = 0.2
  we conclude that A ⊥ B | C.
- Note that this does not make A (marginally) independent of B. Marginal independence requires more information, namely enough to show that p(a_1|b_1) = p(a_1|b_2).
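The example can be checked mechanically: for each outcome of C, compare the rows of the table across the outcomes of B. The sketch below uses the conditional probabilities from the example; the distribution p(C|B) used in the second half is not given in the slides and is purely an assumption, added to show how A can still depend on B marginally.

```python
# CPT p(A | B, C) from the example, keyed by (b, c).
cpt = {
    ("b1", "c1"): {"a1": 0.9, "a2": 0.1},
    ("b2", "c1"): {"a1": 0.9, "a2": 0.1},
    ("b1", "c2"): {"a1": 0.8, "a2": 0.2},
    ("b2", "c2"): {"a1": 0.8, "a2": 0.2},
}

def conditionally_independent(cpt, tol=1e-12):
    """True if, for every outcome c, p(A | b, c) is the same for all outcomes b."""
    rows_by_c = {}
    for (b, c), row in cpt.items():
        rows_by_c.setdefault(c, []).append(row)
    return all(
        abs(row[a] - rows[0][a]) <= tol
        for rows in rows_by_c.values()
        for row in rows[1:]
        for a in rows[0]
    )

# Hypothetical p(C | B), NOT from the slides: here B changes our beliefs about C.
p_c_given_b = {
    "b1": {"c1": 0.7, "c2": 0.3},
    "b2": {"c1": 0.2, "c2": 0.8},
}

def p_a_given_b(a, b):
    """p(a | b) = sum over c of p(a | b, c) p(c | b)."""
    return sum(cpt[(b, c)][a] * p_c for c, p_c in p_c_given_b[b].items())

print(conditionally_independent(cpt))
print(p_a_given_b("a1", "b1"), p_a_given_b("a1", "b2"))
```

Under the assumed p(C|B), p(a1|b1) is about 0.87 while p(a1|b2) is about 0.82, so A is conditionally independent of B given C yet not marginally independent of B.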
Joint probability distribution of conditionally independent events
- Recall that, by the factorization rule, the joint probability for A, B, and C is
  p(A, B, C|ξ) = p(A|B, C, ξ) p(B|C, ξ) p(C|ξ)
- If A is independent of B given C, then since p(A|B, C, ξ) = p(A|C, ξ), we have
  p(A, B, C|ξ) = p(A|C, ξ) p(B|C, ξ) p(C|ξ)
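The simplified factorization p(A, B, C|ξ) = p(A|C, ξ) p(B|C, ξ) p(C|ξ) can be verified numerically: build the joint from the three factors, then recover p(A|B, C, ξ) from the joint and confirm that it no longer depends on B. All numbers and names in this sketch are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Hypothetical factors; none of these numbers come from the slides.
p_c = {"c1": Fraction(2, 5), "c2": Fraction(3, 5)}
p_b_given_c = {
    "c1": {"b1": Fraction(1, 2), "b2": Fraction(1, 2)},
    "c2": {"b1": Fraction(1, 4), "b2": Fraction(3, 4)},
}
p_a_given_c = {
    "c1": {"a1": Fraction(9, 10), "a2": Fraction(1, 10)},
    "c2": {"a1": Fraction(4, 5), "a2": Fraction(1, 5)},
}

# Joint built from the simplified factorization p(A|C) p(B|C) p(C).
joint = {
    (a, b, c): p_a_given_c[c][a] * p_b_given_c[c][b] * p_c[c]
    for a, b, c in product(["a1", "a2"], ["b1", "b2"], ["c1", "c2"])
}

def p_a_given_bc(a, b, c):
    """Recover p(a | b, c) from the joint: p(a, b, c) / p(b, c)."""
    p_bc = sum(joint[(a_outcome, b, c)] for a_outcome in ["a1", "a2"])
    return joint[(a, b, c)] / p_bc

print(sum(joint.values()), p_a_given_bc("a1", "b1", "c1"), p_a_given_bc("a1", "b2", "c1"))
```

The recovered p(a|b, c) equals p(a|c) for every b, confirming that the joint built this way satisfies the conditional independence it was built from.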
To be continued… See you next week!