Probability
Randomness
- Long-run (limiting) behavior of a chance (non-deterministic) process.
- Relative frequency: fraction of the time a particular outcome occurs.
- In some cases the structure is known (e.g., tossing a coin, rolling a die).
- Often the structure is unknown and the process must be simulated.
- Probability of an event E, where n is the number of trials and n(E) is the number of trials on which E occurs:
  P(E) ≈ n(E)/n, with the approximation improving as n grows.
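The relative-frequency idea can be illustrated by simulating a process whose structure is known, such as a fair coin toss (a Python sketch; the function name and the fixed seed are my own):

```python
import random

def relative_frequency(n_trials, seed=0):
    """Fraction of n_trials simulated fair-coin tosses that land heads."""
    rng = random.Random(seed)
    heads = sum(rng.random() < 0.5 for _ in range(n_trials))
    return heads / n_trials

# The relative frequency settles near the true probability 0.5
# as the number of trials grows (the long-run, limiting behavior).
for n in (100, 10_000, 1_000_000):
    print(n, relative_frequency(n))
```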
Set Notation
- S: a set of interest
- Subset (A ⊂ S): A is contained in S
- Union (A ∪ B): set of all elements in A or B or both
- Intersection (A ∩ B): set of all elements in both A and B
- Complement (Ā): set of all elements not in A
Probability
- Sample Space (S): set of all possible outcomes of a random experiment. The outcomes are mutually exclusive and exhaustive.
- Event: any subset of a sample space.
- The probability of an event A, P(A), satisfies:
  - P(A) ≥ 0
  - P(S) = 1
  - If A1, A2, ... is a sequence of mutually exclusive events (Ai ∩ Aj = ∅ for i ≠ j):
    P(A1 ∪ A2 ∪ ...) = P(A1) + P(A2) + ...
Counting Rules for Probability (I)
- Multiplicative rule: brute-force method of counting the number of elements in S.
- Experiment consists of k stages; the jth stage has nj possible outcomes.
- Total number of outcomes = n1 × n2 × ... × nk.
- Tabular structure for k = 2, numbering the outcomes (cell (i, j) holds outcome (i-1)n2 + j):

  Stage 1 \ Stage 2:       1            2        …      n2
          1:               1            2        …      n2
          2:             n2+1         n2+2       …     2n2
          …
          n1:        (n1-1)n2+1   (n1-1)n2+2     …    n1n2
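The multiplicative rule can be verified by brute force: list every outcome of a hypothetical k = 3 stage experiment and compare the count to the product n1n2n3 (a sketch; the stages below are invented for illustration):

```python
from itertools import product
from math import prod

# Hypothetical 3-stage experiment: a coin toss, a die roll, and one of 4 suits.
stages = [["H", "T"], [1, 2, 3, 4, 5, 6], ["S", "H", "D", "C"]]

outcomes = list(product(*stages))            # brute-force listing of S
print(len(outcomes))                         # 48
print(prod(len(stage) for stage in stages))  # n1*n2*n3 = 2*6*4 = 48
```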
Counting Rules for Probability (II)
- Permutations: ordered arrangements of r objects selected from n distinct objects without replacement (r ≤ n).
- Stage 1: n possible objects
- Stage 2: n-1 remaining possibilities
- ...
- Stage r: n-r+1 remaining possibilities
- Number of permutations: nPr = n(n-1)...(n-r+1) = n!/(n-r)!
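The stage-by-stage argument can be checked against the closed form n!/(n-r)! (a sketch; n = 5, r = 3 are arbitrary):

```python
from itertools import permutations
from math import perm

n, r = 5, 3

# Stage-by-stage product: n * (n-1) * ... * (n-r+1)
stage_product = 1
for stage in range(r):
    stage_product *= n - stage

# All three agree: 5 * 4 * 3 = 5!/(5-3)! = 60
print(stage_product, perm(n, r), len(list(permutations(range(n), r))))
```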
Counting Rules for Probability (III)
- Combinations: unordered arrangements of r objects selected from n distinct objects without replacement (r ≤ n).
- The number of distinct orderings of a set of r objects is r!.
- Number of combinations (where order does not matter) = number of permutations divided by r!:
  nCr = nPr / r! = n! / (r!(n-r)!)
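Dividing the permutation count by r! can likewise be checked directly (a sketch with arbitrary n = 5, r = 3):

```python
from itertools import combinations
from math import comb, factorial, perm

n, r = 5, 3

# Each unordered selection of r objects can be ordered r! ways,
# so nCr = nPr / r!.
print(perm(n, r) // factorial(r))            # 10
print(comb(n, r))                            # 10
print(len(list(combinations(range(n), r))))  # 10
```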
Counting Rules for Probability (IV)
- Partitions: unordered arrangements of n objects partitioned into k groups of sizes n1, ..., nk (where n1 + ... + nk = n).
- Stage 1: number of combinations of n1 elements from the n objects
- Stage 2: number of combinations of n2 elements from the n - n1 remaining objects
- ...
- Stage k: number of combinations of nk elements from the n - n1 - ... - n(k-1) = nk remaining objects
- Total: n! / (n1! n2! ... nk!)
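The stagewise product of combinations telescopes to the multinomial coefficient n!/(n1! ... nk!); a sketch comparing the two (group sizes 2, 3, 4 are arbitrary):

```python
from math import comb, factorial, prod

def stagewise(counts):
    """Choose each group in turn from the objects still unassigned."""
    remaining, total = sum(counts), 1
    for c in counts:
        total *= comb(remaining, c)
        remaining -= c
    return total

def multinomial(counts):
    """n! / (n1! n2! ... nk!) with n = sum of the group sizes."""
    return factorial(sum(counts)) // prod(factorial(c) for c in counts)

print(stagewise([2, 3, 4]), multinomial([2, 3, 4]))  # both 1260
```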
Counting Rules for Probability (V)
- Runs of binary outcomes (string of m + n trials):
- Observe n Successes (S) and m Failures (F).
- k = number of "runs": maximal unbroken strings of consecutive S's or of F's in the ordered outcomes of the m + n trials.
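Counting the runs in an observed string is a one-liner with itertools.groupby (a sketch; the example string is my own):

```python
from itertools import groupby

def count_runs(seq):
    """Number of runs: maximal blocks of identical consecutive outcomes."""
    return sum(1 for _ in groupby(seq))

print(count_runs("SSFFFSF"))  # 4 runs: SS, FFF, S, F
```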
Runs Examples – 2006/7 UF NCAA Champs
Conditional Probability and Independence
- In many situations one event in a multi-stage experiment occurs temporally before a second event.
- Suppose event A can occur prior to event B. We can consider the probability that B occurs given that A has occurred (or vice versa).
- For B to occur given A has occurred: at the first stage A must have occurred; at the second stage B must occur (along with A).
- Assuming P(A), P(B) > 0, the "probability of B given A" and the "probability of A given B" are as follows:
  P(B|A) = P(A and B) / P(A)     P(A|B) = P(A and B) / P(B)
- If P(B|A) = P(B) and P(A|B) = P(A), A and B are said to be INDEPENDENT.
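The definitions can be checked by enumerating a small sample space; here two fair dice, with events chosen (arbitrarily, for illustration) so that they happen to be independent:

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered rolls of two fair dice, all 36 outcomes equally likely.
S = list(product(range(1, 7), repeat=2))

def prob(event):
    return Fraction(sum(1 for w in S if event(w)), len(S))

def A(w): return w[0] == 6          # first die shows 6
def B(w): return w[0] + w[1] == 7   # total is 7

p_B_given_A = prob(lambda w: A(w) and B(w)) / prob(A)
print(p_B_given_A, prob(B))  # both 1/6, so A and B are independent
```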
Rules of Probability
Bayes’ Rule - Updating Probabilities
- Let A1,…,Ak be a set of events that partition a sample space (mutually exclusive and exhaustive), such that:
  - each event has known P(Ai) > 0 (each event can occur)
  - for any 2 events Ai and Aj with i ≠ j, P(Ai and Aj) = 0 (the events are disjoint)
  - P(A1) + … + P(Ak) = 1 (each outcome belongs to exactly one of the events)
- Let C be an event such that 0 < P(C) < 1 (C can occur, but will not necessarily occur).
- If we know the probability C will occur given each event Ai, P(C|Ai), then we can compute the probability of Ai given that C occurred:
  P(Ai|C) = P(C|Ai)P(Ai) / [P(C|A1)P(A1) + … + P(C|Ak)P(Ak)]
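The rule is a few lines of arithmetic once the priors P(Ai) and likelihoods P(C|Ai) are in hand (a sketch; the three-event partition below is hypothetical):

```python
from fractions import Fraction

def bayes(priors, likelihoods):
    """Posteriors P(Ai|C) from priors P(Ai) and likelihoods P(C|Ai),
    using P(C) = sum over j of P(C|Aj) P(Aj)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    p_c = sum(joint)
    return [j / p_c for j in joint]

priors = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
likelihoods = [Fraction(1, 10), Fraction(1, 2), Fraction(1, 4)]
print(bayes(priors, likelihoods))  # posteriors 1/5, 3/5, 1/5
```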
Example - OJ Simpson Trial
- Given information on a blood test (T+/T-):
  - Sensitivity: P(T+|Guilty) = 1
  - Specificity: P(T-|Innocent) = .9957, so P(T+|Innocent) = .0043
- Suppose you have a prior belief of guilt: P(G) = p*.
- What is the "posterior" probability of guilt after seeing evidence that the blood matches, P(G|T+)?

Source: B. Forst (1996). "Evidence, Probabilities and Legal Standards for Determination of Guilt: Beyond the OJ Trial", in Representing OJ: Murder, Criminal Justice, and the Mass Culture, ed. G. Barak, pp. 22-28. Harrow and Heston, Guilderland, NY.
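With Guilty/Innocent as the two-event partition, Bayes' rule gives P(G|T+) = p* / (p* + .0043(1 - p*)); a sketch tabulating this over a few prior beliefs (the prior values chosen are arbitrary):

```python
def posterior_guilt(p):
    """P(G|T+) for prior P(G) = p, with sensitivity P(T+|G) = 1
    and false-positive rate P(T+|Innocent) = 0.0043."""
    return p / (p + 0.0043 * (1 - p))

# Even modest priors are pushed close to 1 by the matching blood evidence.
for p in (0.001, 0.01, 0.1, 0.5):
    print(p, round(posterior_guilt(p), 4))
```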
OJ Simpson Posterior (to Positive Test) Probabilities
Northern Army at Battle of Gettysburg
- Regiments partition the soldiers (A1,…,A9); Casualty is the event C.
- P(Ai) = (size of regiment) / (total soldiers) = (Col 3)/95369
- P(C|Ai) = (# casualties) / (regiment size) = (Col 4)/(Col 3)
- P(C|Ai) P(Ai) = P(Ai and C) = (Col 5)×(Col 6)
- P(C) = sum of Col 7 = .2416
- P(Ai|C) = P(Ai and C) / P(C) = (Col 7)/.2416
CRAPS
- Player rolls 2 dice (the "come out" roll). After the first roll:
  - 2, 3, 12: Lose (Miss Out)
  - 7, 11: Win (Pass)
  - 4, 5, 6, 8, 9, 10: Makes point; roll until the point (Win) or 7 (Lose)
- Probability distribution for the first (any) roll of two dice:
  P(2) = P(12) = 1/36, P(3) = P(11) = 2/36, P(4) = P(10) = 3/36,
  P(5) = P(9) = 4/36, P(6) = P(8) = 5/36, P(7) = 6/36
- After the first roll: P(Win|2) = P(Win|3) = P(Win|12) = 0 and P(Win|7) = P(Win|11) = 1.
- What about the other conditional probabilities if you make a point?
CRAPS
- Suppose you make a point (4, 5, 6, 8, 9, or 10). You win if your point occurs before 7; otherwise you lose and stop.
- Let P mean you make your point on a roll; let C mean you continue rolling (neither point nor 7).
- You win on any of the mutually exclusive events: P, CP, CCP, …, CC…CP, …
- If your point is 4 or 10: P(P) = 3/36, P(C) = 27/36.
- By the independence, multiplicative, and additive rules:
  P(Win|point) = P(P) + P(C)P(P) + P(C)²P(P) + … = P(P) / (1 - P(C))
  For a point of 4 or 10: (3/36)/(9/36) = 1/3.
CRAPS
- Similar patterns arise for points 5, 6, 8, and 9:
  - For 5 and 9: P(P) = 4/36, P(C) = 26/36, so P(Win|point) = (4/36)/(10/36) = 2/5
  - For 6 and 8: P(P) = 5/36, P(C) = 25/36, so P(Win|point) = (5/36)/(11/36) = 5/11
- Finally, we can obtain the player's overall probability of winning:
  P(Win) = 6/36 + 2/36 + 2[(3/36)(1/3) + (4/36)(2/5) + (5/36)(5/11)] = 244/495 ≈ .493
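The whole calculation can be reproduced with exact fractions (a sketch; p_roll encodes the standard two-dice distribution):

```python
from fractions import Fraction

# P(total = t) for two fair dice.
p_roll = {t: Fraction(6 - abs(t - 7), 36) for t in range(2, 13)}

def p_win_given_point(t):
    """Point t wins if t appears before 7: P(P) / (P(P) + P(7)),
    the sum of the geometric series P(P) * (1 + P(C) + P(C)^2 + ...)."""
    return p_roll[t] / (p_roll[t] + p_roll[7])

p_win = p_roll[7] + p_roll[11] + sum(
    p_roll[t] * p_win_given_point(t) for t in (4, 5, 6, 8, 9, 10))
print(p_win, float(p_win))  # 244/495, about 0.4929
```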
CRAPS - P(Winning)
- The previous slides derived P(Win|Roll). Multiplying each by P(Roll) gives P(Roll and Win), and summing those gives P(Win).
- The last column gives the probability of each come out roll given that we won, P(Roll|Win).
Odds, Odds Ratios, Relative Risk
- Odds: probability an event occurs divided by the probability it does not occur: odds(A) = P(A)/P(Ā)
- Many gambling establishments and lotteries post odds against an event occurring.
- Odds Ratio (OR): odds of A occurring for one group, divided by the odds of A for a second group.
- Relative Risk (RR): probability of A occurring for one group, divided by the probability of A for a second group.
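All three measures are one-line functions (a sketch; the example probabilities are arbitrary):

```python
from fractions import Fraction

def odds(p):
    """Odds of an event: P(A) / P(not A)."""
    return p / (1 - p)

def odds_ratio(p1, p2):
    return odds(p1) / odds(p2)

def relative_risk(p1, p2):
    return p1 / p2

p1, p2 = Fraction(1, 5), Fraction(1, 10)
print(odds(p1))               # 1/4
print(relative_risk(p1, p2))  # 2
print(odds_ratio(p1, p2))     # 9/4
```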
Example – John Snow Cholera Data
- 2 water providers: Southwark & Vauxhall (S&V) and Lambeth (L)

  Provider  | Cholera Death | No Cholera Death |  Total
  S&V       |     3706      |      263919      | 267625
  Lambeth   |      411      |      171117      | 171528
  Total     |     4117      |      435036      | 439153
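From the table, the relative risk and odds ratio of cholera death for S&V versus Lambeth customers can be computed directly (a sketch using the counts above):

```python
sv_deaths, sv_total = 3706, 267625
lam_deaths, lam_total = 411, 171528

p_sv = sv_deaths / sv_total     # risk of cholera death, S&V customers
p_lam = lam_deaths / lam_total  # risk of cholera death, Lambeth customers

rr = p_sv / p_lam                                         # relative risk
odds_r = (p_sv / (1 - p_sv)) / (p_lam / (1 - p_lam))      # odds ratio
print(round(rr, 2), round(odds_r, 2))  # 5.78 5.85
```

S&V customers were nearly six times as likely to die of cholera as Lambeth customers, by either measure.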