THE HONG KONG UNIVERSITY OF SCIENCE & TECHNOLOGY CSIT 5220: Reasoning and Decision under Uncertainty L09: Graphical Models for Decision Problems Nevin L. Zhang Room 3504, phone: 2358-7015, Email: lzhang@cs.ust.hk
L09: Graphical Models for Decision Problems Introduction Extending BN to Include a Single Decision Fundamentals of Rational Decision Making Decision Trees Influence Diagrams Solving influence Diagrams Value of information
Probabilistic Reasoning and Decision. Method 1 (two-stage): in a BN, calculate posterior probabilities, then use the posteriors to make decisions. Method 2: combine the two stages by extending the BN to include decisions. This better reveals the structure of the decision problem and computes optimal decisions directly from the model. Reading: Jensen & Nielsen, Sections 9.1-9.4, 10.2, 11.1
L09: Graphical Models for Decision Problems Extending BN to Include a Single Decision Fundamentals of Rational Decision Making Decision Trees Influence Diagrams Solving influence Diagrams Value of information
Poker (from Lecture 04). Extend the model so that I can calculate the probability that my hand is better than the opponent’s hand. MH: My Hand. BH: Best Hand.
Fold or Call
Fold or Call Information that I have: FC, SC, MH
Modeling One Action. Start with a BN. Add the decision node and utility nodes. Specify what information we have when making the decision, and what chance and utility variables the decision will influence.
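As a minimal sketch of what solving such a single-decision model amounts to, the snippet below chooses between Fold and Call by maximising expected utility under the posterior of BH. The posterior values, the two BH states, and the utility table are hypothetical numbers chosen for illustration; they are not taken from the lecture's poker model.

```python
# Hypothetical posterior P(BH | FC, SC, MH), as it might come out of the BN.
posterior_bh = {"me": 0.6, "opponent": 0.4}

# Hypothetical utility table U(BH, D); the numbers are illustrative only.
UTILITY = {
    ("me", "call"): 2.0,
    ("opponent", "call"): -2.0,
    ("me", "fold"): -1.0,
    ("opponent", "fold"): -1.0,
}


def expected_utility(action, posterior):
    """Expected utility of an action: sum over BH of P(BH | evidence) * U(BH, action)."""
    return sum(p * UTILITY[(bh, action)] for bh, p in posterior.items())


def best_action(posterior, actions=("fold", "call")):
    """Return the action with the highest expected utility."""
    return max(actions, key=lambda a: expected_utility(a, posterior))


eu = {a: expected_utility(a, posterior_bh) for a in ("fold", "call")}
choice = best_action(posterior_bh)
print(eu, choice)
```

With a less favourable posterior (for instance 0.2 for "me") the same function switches to folding, which is the sense in which the decision depends on the observed information.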
Including More Decisions Things become a bit more complicated. Will see later.
L09: Graphical Models for Decision Problems Extending BN to Include Decisions Fundamentals of Rational Decision Making Decision Trees Influence Diagrams Solving influence Diagrams Value of information
Decision Theory. Normative decision theory: how people should decide (rational agent). Descriptive decision theory: how people actually decide.
Normative Decision Theory
Are you rational? Lottery A: [$1mill]. Lottery B: 0.5[$2mill] + 0.5[$0mill]. Which one do you choose? Most people would choose A: U(1) > 0.5 U(2) + 0.5 U(0). Most people are risk-averse, with a concave utility function.
Are you rational? Suppose that you are $2mill in debt. Lottery A: [$1mill]. Lottery B: 0.5[$2mill] + 0.5[$0mill]. Which one do you choose? Probably B: U(1) < 0.5 U(2) + 0.5 U(0). You are being risk-seeking, with a convex utility function.
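The two lottery comparisons can be checked numerically. The sketch below evaluates both lotteries under one concave and one convex utility function; the specific functions (square root and square) are assumptions picked only because they have the right curvature, not utilities given in the lecture.

```python
import math


def expected_utility(lottery, utility):
    """lottery: list of (probability, prize in $mill) pairs."""
    return sum(p * utility(x) for p, x in lottery)


LOTTERY_A = [(1.0, 1.0)]              # [$1mill] for sure
LOTTERY_B = [(0.5, 2.0), (0.5, 0.0)]  # 0.5[$2mill] + 0.5[$0mill]


def concave(x):
    """Assumed risk-averse utility (concave)."""
    return math.sqrt(x)


def convex(x):
    """Assumed risk-seeking utility (convex)."""
    return x ** 2


def preferred(utility):
    """Return 'A' if lottery A has strictly higher expected utility, else 'B'."""
    eu_a = expected_utility(LOTTERY_A, utility)
    eu_b = expected_utility(LOTTERY_B, utility)
    return "A" if eu_a > eu_b else "B"


print(preferred(concave), preferred(convex))
```

Both lotteries have the same expected monetary value ($1mill), so the preference comes entirely from the curvature of the utility function.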
Utilities without Money
Utilities without Money
Marks as Utilities
Other Considerations. 2 is the passing grade. If you fail, you can retake the course and hopefully get a better grade on the transcript. In this case, 2 is the worst!
L09: Graphical Models for Decision Problems Extending BN to Include Decisions Fundamentals of Rational Decision Making Decision Trees Influence Diagrams Solving influence Diagrams Value of information
Decision Trees. The classical way to represent decision problems with multiple decisions. They explicitly show all possible sequences of decisions and observations. Example: Oil Wildcatter. A wildcatter is a person who drills wildcat wells, which are oil wells drilled in areas not known to be oil fields. A test of the seismic structure is available.
Decision Tree for Oil Wildcatter
Decision Trees. Decision nodes: rectangles. Chance nodes: ellipses. Utility values: at leaves, sometimes inside diamonds. To be read from root to leaves. Branches from a decision node: possible actions. Branches from a chance node: possible outcomes and their probabilities. A decision node follows a chance node: the chance node is observed before the decision is made. No-forgetting: the decision maker remembers all the labels from the root to a decision node. Can be viewed as a game between the decision maker and nature.
Solution to a Decision Tree. Strategy: which action to pick at each decision node.
Solution to a Decision Tree Optimal Strategy: The strategy with the highest expected utility
Solving Decision Trees
Example (figure not reproduced; the value 77.59 is marked on the solved tree)
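A decision tree is commonly solved by rolling back from the leaves: take the expectation at chance nodes and the maximum at decision nodes. The sketch below does this for an oil wildcatter tree. The tuple-based tree encoding and the helper names are hypothetical, and the probabilities and payoffs are assumed textbook-style values, since the lecture's own tree is not reproduced in the text; results may therefore differ slightly from the figures quoted in the lecture.

```python
def solve(node, path=()):
    """Roll back a decision tree.

    Returns (expected utility, policy), where policy maps the labels seen
    from the root to a decision node onto the action chosen there.
    """
    kind = node[0]
    if kind == "leaf":
        return node[1], {}
    if kind == "chance":
        value, policy = 0.0, {}
        for label, prob, child in node[1]:
            child_value, child_policy = solve(child, path + (label,))
            value += prob * child_value          # average out
            policy.update(child_policy)
        return value, policy
    # Decision node: keep only the best branch (fold back).
    best_action, best_value, best_policy = None, float("-inf"), {}
    for action, child in node[1]:
        child_value, child_policy = solve(child, path + (action,))
        if child_value > best_value:
            best_action, best_value, best_policy = action, child_value, child_policy
    policy = dict(best_policy)
    policy[path] = best_action
    return best_value, policy


def drill_node(p_dry, p_wet, p_soaking, test_cost):
    """Drill decision followed by the Oil chance node (assumed payoffs)."""
    return ("decision", [
        ("drill", ("chance", [
            ("dry", p_dry, ("leaf", -70 - test_cost)),
            ("wet", p_wet, ("leaf", 50 - test_cost)),
            ("soaking", p_soaking, ("leaf", 200 - test_cost)),
        ])),
        ("no-drill", ("leaf", 0 - test_cost)),
    ])


# Assumed branch probabilities: P(R) and P(O | R) after testing, P(O) otherwise.
TREE = ("decision", [
    ("test", ("chance", [
        ("no-structure", 0.41, drill_node(0.30 / 0.41, 0.09 / 0.41, 0.02 / 0.41, 10)),
        ("open-structure", 0.35, drill_node(0.15 / 0.35, 0.12 / 0.35, 0.08 / 0.35, 10)),
        ("closed-structure", 0.24, drill_node(0.05 / 0.24, 0.09 / 0.24, 0.10 / 0.24, 10)),
    ])),
    ("no-test", drill_node(0.5, 0.3, 0.2, 0)),
])

value, policy = solve(TREE)
print(value, policy)
```

Keying the policy by the path from the root mirrors the no-forgetting assumption: the decision maker's choice may depend on everything seen so far.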
L09: Graphical Models for Decision Problems Extending BN to Include Decisions Fundamentals of Rational Decision Making Decision Trees Influence Diagrams Solving influence Diagrams Value of information
Extending BN to Include one Decision. Start with a BN. Add the decision node and utility nodes. Specify what information we have when making the decision, and what chance and utility variables the decision will influence. To include multiple decision nodes, we need to consider the interactions among the decisions.
Including Multiple Decisions Two more decisions MFC: my first change MSC: my second change
Representing the Decision Sequence. First representation: all nodes observed before a decision are parents of that decision (information arcs). If we assume that the decision maker doesn’t forget, then some links are redundant.
Representing the Decision Sequence. No-forgetting allows a more concise representation. Keep a directed path going through all the decision nodes: it gives the order of the decisions. Arrows into a decision node come only from those nodes observed immediately before that decision. Implicit parents: the parents of earlier decisions.
Influence Diagram A DAG with three types of nodes Chance nodes, decision nodes, and utility nodes There is a directed path containing all the decision nodes. The utility nodes have no children. Each chance node is associated with the conditional distribution given its parents. Each utility node is associated with a utility function, a real-valued function of its parents.
Influence Diagram
Influence Diagram. An influence diagram for the oil wildcatter problem. Decision: T: test = {y, n}; D: drill = {y, n}. Utility: C: cost of test; V: benefit of drilling. Chance: O: oil = {dry, wet, soaking}; R: seismic structure = {no-structure, open-structure, closed-structure, no-result}.
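The structural conditions in the definition of an influence diagram can be checked mechanically. The sketch below encodes the oil wildcatter diagram as a parent map and verifies that the graph is acyclic, that utility nodes have no children, and that a directed path passes through all decision nodes. The arc set is an assumption (the figure is not reproduced in the text), and all function names are hypothetical.

```python
NODES = {"T": "decision", "D": "decision", "O": "chance",
         "R": "chance", "C": "utility", "V": "utility"}

# Assumed arcs, written as child -> list of parents.
PARENTS = {"T": [], "O": [], "R": ["T", "O"], "D": ["R"],
           "C": ["T"], "V": ["O", "D"]}


def children(parents):
    """Invert the parent map into a child map."""
    kids = {n: [] for n in parents}
    for child, ps in parents.items():
        for p in ps:
            kids[p].append(child)
    return kids


def is_acyclic(parents):
    """Kahn's algorithm: the graph is a DAG iff every node can be removed."""
    indegree = {n: len(ps) for n, ps in parents.items()}
    kids = children(parents)
    queue = [n for n, d in indegree.items() if d == 0]
    removed = 0
    while queue:
        n = queue.pop()
        removed += 1
        for k in kids[n]:
            indegree[k] -= 1
            if indegree[k] == 0:
                queue.append(k)
    return removed == len(parents)


def reachable(kids, src, dst):
    """True if there is a directed path from src to dst."""
    stack, seen = [src], set()
    while stack:
        n = stack.pop()
        if n == dst:
            return True
        if n not in seen:
            seen.add(n)
            stack.extend(kids[n])
    return False


def decision_order(nodes, parents):
    """Order of decisions along a directed path through all of them, or None."""
    kids = children(parents)
    decisions = [n for n, kind in nodes.items() if kind == "decision"]
    # In a DAG, the i-th decision on such a path reaches all later decisions.
    order = sorted(decisions, key=lambda d: -sum(
        reachable(kids, d, e) for e in decisions if e != d))
    ok = all(reachable(kids, a, b) for a, b in zip(order, order[1:]))
    return order if ok else None


def is_influence_diagram(nodes, parents):
    kids = children(parents)
    utilities_are_leaves = all(
        not kids[n] for n, kind in nodes.items() if kind == "utility")
    return (is_acyclic(parents) and utilities_are_leaves
            and decision_order(nodes, parents) is not None)


print(is_influence_diagram(NODES, PARENTS), decision_order(NODES, PARENTS))
```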
L09: Graphical Models for Decision Problems Extending BN to Include Decisions Fundamentals of Rational Decision Making Decision Trees Influence Diagrams Solving influence Diagrams Value of information
Strategy (Policy). A policy specifies what to do for each decision; it is a function of the observed variables. Different policies lead to different expected utilities. Optimal policy: the policy that yields the maximum expected utility. How to find the optimal policy?
Finding Optimal Policy First idea: Convert to decision tree and solve it How to convert influence diagram into decision tree Draw tree Root: the thing that happens first Children of root: the thing that happens next … Figure out numerical information
Order of events Tree structure Numerical info Prob for branches from chance node Utility for leaves
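The branch probabilities of the tree are not stored directly in the influence diagram: the diagram holds P(O) and P(R | O), whereas the tree needs P(R) and P(O | R), which follow from Bayes' rule. A minimal sketch of that step is given below; the two probability tables are assumed illustrative values, not the lecture's own tables.

```python
# Assumed prior over Oil and assumed test characteristics P(R | O).
P_O = {"dry": 0.5, "wet": 0.3, "soaking": 0.2}
P_R_GIVEN_O = {
    "dry": {"no-structure": 0.6, "open-structure": 0.3, "closed-structure": 0.1},
    "wet": {"no-structure": 0.3, "open-structure": 0.4, "closed-structure": 0.3},
    "soaking": {"no-structure": 0.1, "open-structure": 0.4, "closed-structure": 0.5},
}


def branch_probabilities(p_o, p_r_given_o):
    """Return P(R) and P(O | R) for the tree, computed from P(O) and P(R | O)."""
    results = sorted({r for dist in p_r_given_o.values() for r in dist})
    # Marginal: P(r) = sum_o P(o) P(r | o)
    p_r = {r: sum(p_o[o] * p_r_given_o[o][r] for o in p_o) for r in results}
    # Posterior: P(o | r) = P(o) P(r | o) / P(r)
    p_o_given_r = {
        r: {o: p_o[o] * p_r_given_o[o][r] / p_r[r] for o in p_o}
        for r in results
    }
    return p_r, p_o_given_r


p_r, p_o_given_r = branch_probabilities(P_O, P_R_GIVEN_O)
print(p_r)
print(p_o_given_r["closed-structure"])
```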
A Side Note. Two decision trees for Oil Wildcatter. First: directly from the problem specification; asymmetric. Second: from the influence diagram; symmetric. Pro of ID: compact. Con of ID: cannot represent asymmetry, so we need to introduce the artificial state R = no-result.
Finding Optimal Policy. First idea: convert to a decision tree and solve it. Still exponential! Next: a Variable Elimination algorithm for solving influence diagrams. Note: in BN inference, all orderings give the correct result but might have different complexity; in an ID, we must use “strong elimination orderings”.
Temporal Order among Decisions and Observations. Notation: decision nodes have a temporal order D1, D2, …, Dn. T0: the set of chance nodes observed prior to any decision. Ti: the set of chance nodes observed after Di is taken and before Di+1 is taken. Oil Wildcatter: D1 = T, D2 = D; T0 = {}, T1 = {R}, T2 = {O}. Partial temporal order: T0, D1, T1, D2, T2, …, Dn, Tn. For Oil Wildcatter: T, R, D, O.
Temporal Order. T0 = {}, T1 = {T}, T2 = {A, B, C}. Partial temporal ordering: D1, T, D2, {A, B, C}. No ordering among A, B, C.
Strong Elimination Ordering. Partial temporal order: T0, D1, T1, D2, T2, …, Dn, Tn. Strong elimination orders: first eliminate the variables in Tn, then eliminate Dn, then the variables in Tn-1, then Dn-1, and so on. Oil Wildcatter: temporal order T, R, D, O; strong elimination ordering O, D, R, T.
Strong Elimination Ordering. T0 = {}, T1 = {T}, T2 = {A, B, C}. Partial temporal ordering: D1, T, D2, {A, B, C}. No ordering among A, B, C. Strong elimination orderings: A, B, C, D2, T, D1; B, C, A, D2, T, D1; C, A, B, D2, T, D1; …
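Producing one strong elimination ordering from the partial temporal order is a matter of walking the blocks backwards. The sketch below assumes a hypothetical input encoding: a list [T0, D1, T1, …, Dn, Tn] in which each Ti is a list of chance-node names and each Di is a string.

```python
def strong_elimination_order(temporal_blocks):
    """Return one strong elimination ordering.

    temporal_blocks: [T0, D1, T1, ..., Dn, Tn]; chance blocks are lists,
    decisions are strings (an encoding assumed for this sketch).
    """
    order = []
    for block in reversed(temporal_blocks):
        if isinstance(block, str):
            order.append(block)            # a decision node
        else:
            # Any order inside a chance block is allowed; sort for determinism.
            order.extend(sorted(block))
    return order


oil = strong_elimination_order([[], "T", ["R"], "D", ["O"]])
abc = strong_elimination_order([[], "D1", ["T"], "D2", ["A", "B", "C"]])
print(oil)
print(abc)
```

Only the order within each chance block is free, which is why A, B and C may be permuted in the second example while D2, T and D1 may not.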
Variable Elimination. Two sets of potentials (factors): probability potentials and utility potentials. Eliminate decision and chance nodes one by one according to a strong elimination ordering. When eliminating variable X:
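The elimination step itself is not spelled out in the text. One standard formulation, along the lines of the treatment in Jensen & Nielsen, is sketched below; the symbols Φ (the set of probability potentials), Ψ (the set of utility potentials), Φ_X and Ψ_X are notation assumed here and may differ from the lecture's.

```latex
\begin{align*}
\Phi_X &= \{\phi \in \Phi : X \in \mathrm{dom}(\phi)\}, &
\Psi_X &= \{\psi \in \Psi : X \in \mathrm{dom}(\psi)\} \\
\text{chance } X:\quad
\phi_X &= \sum_X \prod \Phi_X, &
\psi_X &= \sum_X \Bigl(\prod \Phi_X\Bigr)\Bigl(\sum \Psi_X\Bigr) \\
\text{decision } X:\quad
\phi_X &= \max_X \prod \Phi_X, &
\psi_X &= \max_X \Bigl(\prod \Phi_X\Bigr)\Bigl(\sum \Psi_X\Bigr) \\
\Phi &\leftarrow (\Phi \setminus \Phi_X) \cup \{\phi_X\}, &
\Psi &\leftarrow (\Psi \setminus \Psi_X) \cup \{\psi_X / \phi_X\}
\end{align*}
```

For a decision variable, the maximising value of X for each configuration of the remaining variables is the optimal decision rule.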
Variable Elimination on Oil Wildcatter Strong Elimination Ordering: O, D, R, T
Variable Elimination on Oil Wildcatter Eliminate: O
Potentials after Eliminating O
Eliminating D No probability potential involves D Optimal decision for D
Potentials after Eliminating D
Eliminating R
Potentials after Eliminating R
Eliminating T Optimal decision for T Results same as those by decision tree
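The whole elimination sequence O, D, R, T can be traced in a few lines of code. The sketch below is specialised to the oil wildcatter diagram rather than being a general solver, and it uses assumed textbook-style numbers (prior over O, test characteristics, test cost 10, drilling payoffs -70, 50, 200), because the lecture's potential tables are not reproduced in the text. The state abbreviations ns, os, cs, nr and all function names are hypothetical.

```python
from itertools import product

O_STATES = ["dry", "wet", "soaking"]
R_STATES = ["ns", "os", "cs", "nr"]   # no-/open-/closed-structure, no-result
YN = ["y", "n"]

# Assumed numbers, for illustration only.
P_O = {"dry": 0.5, "wet": 0.3, "soaking": 0.2}
P_R_IF_TESTED = {
    "dry": {"ns": 0.6, "os": 0.3, "cs": 0.1},
    "wet": {"ns": 0.3, "os": 0.4, "cs": 0.3},
    "soaking": {"ns": 0.1, "os": 0.4, "cs": 0.5},
}
TEST_UTILITY = {"y": -10.0, "n": 0.0}                          # C(T)
DRILL_PAYOFF = {"dry": -70.0, "wet": 50.0, "soaking": 200.0}   # V(O, D=y)


def p_r(r, o, t):
    """P(R | O, T): without a test, R is the artificial state no-result."""
    if t == "n":
        return 1.0 if r == "nr" else 0.0
    return P_R_IF_TESTED[o].get(r, 0.0)


def v(o, d):
    """V(O, D): payoff of drilling, zero if we do not drill."""
    return DRILL_PAYOFF[o] if d == "y" else 0.0


def solve_oil_wildcatter():
    # Eliminate O: sum it out of the probability and utility potentials.
    phi_o = {(r, t): sum(P_O[o] * p_r(r, o, t) for o in O_STATES)
             for r, t in product(R_STATES, YN)}
    psi_o = {(r, t, d): sum(P_O[o] * p_r(r, o, t) * v(o, d) for o in O_STATES)
             for r, t, d in product(R_STATES, YN, YN)}
    # Eliminate D: maximise psi_o / phi_o; skip impossible (R, T) pairs.
    policy_d, psi_d = {}, {}
    for (r, t), prob in phi_o.items():
        if prob == 0.0:
            continue
        best = max(YN, key=lambda d: psi_o[(r, t, d)] / prob)
        policy_d[(r, t)] = best
        psi_d[(r, t)] = psi_o[(r, t, best)] / prob
    # Eliminate R: weight each conditional value by the probability of (R, T).
    psi_r = {t: sum(phi_o[rt] * psi_d[rt] for rt in psi_d if rt[1] == t)
             for t in YN}
    # Eliminate T: add the test cost and maximise.
    total = {t: TEST_UTILITY[t] + psi_r[t] for t in YN}
    best_t = max(YN, key=total.get)
    return best_t, total, policy_d


best_t, total, policy_d = solve_oil_wildcatter()
print(best_t, total, policy_d)
```

With these assumed tables the sketch recommends testing, with an expected utility of about 22.5 against about 20 without the test; small differences from the lecture's 22.55 are to be expected since the lecture's own tables are not used here.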
L09: Graphical Models for Decision Problems Extending BN to Include Decisions Fundamentals of Rational Decision Making Decision Trees Influence Diagrams Solving influence Diagrams Value of information
Two types of Decisions Action decisions Result in significant state change of variables of interest Example: D: Drill or not to drill Test decisions Look for more evidence T: Test of Seismic structure
Two types of Decisions Typical scenario Need to make one decision Want to get more information before making the decision Question Is it worthwhile to perform a particular test? Which test to choose if multiple tests are available?
Value of Information What is the value of a test? Create two influence diagrams Solve both Compare their values Example: Oil wildcatter Is it worthwhile to perform the seismic test? ID1: without the test ID2: with the test
Value of Information Expected utility of ID2 U(ID2) = 22.55 What is the expected utility of ID1?
Expected Utility of ID1 Temporal ordering: D, O Elimination ordering: O, D Eliminate O:
Expected Utility of ID1 Potentials after eliminating O Eliminate D Expected utility of ID1 U(ID1) = 20
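The two elimination steps for ID1 can be written out as a short worked calculation. The prior P(O) = (0.5, 0.3, 0.2) and the drilling payoffs (-70, 50, 200) used below are assumed illustrative values, since the lecture's tables are not reproduced in the text; with them the calculation reproduces U(ID1) = 20.

```latex
\begin{align*}
\psi_O(D{=}y) &= \sum_O P(O)\,V(O, D{=}y) = 0.5(-70) + 0.3(50) + 0.2(200) = 20 \\
\psi_O(D{=}n) &= 0 \\
U(\mathrm{ID1}) &= \max_D \psi_O(D) = \max(20,\, 0) = 20
\end{align*}
```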
Value of Information Difference in expected utility U(ID2) – U(ID1) = 22.55 – 20 = 2.55 The expected value of the seismic test is 2.55 The test is worthwhile
Value of Information. If there are multiple tests T1, T2, T3, …: compute the value of each test and pick the best one. If the value of the best test is positive, pick the next test among the remaining tests in the same way. Stop when the value of the selected test is not positive.
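The selection procedure can be sketched as a greedy loop. The callback `value_of_test` is hypothetical: in practice it would solve the influence diagram with and without the candidate test and return the difference in expected utility. The gains used in the example are made-up numbers, not results from the lecture.

```python
def select_tests(tests, value_of_test):
    """Greedily pick tests while the best remaining one has positive value.

    value_of_test(test, chosen) is a hypothetical callback returning the
    expected gain of adding `test` given the tests already chosen.
    """
    chosen, remaining = [], list(tests)
    while remaining:
        best = max(remaining, key=lambda t: value_of_test(t, chosen))
        if value_of_test(best, chosen) <= 0:
            break                      # no remaining test is worthwhile
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Made-up gains: each test is worth less once other tests have been chosen.
BASE_GAIN = {"T1": 2.55, "T2": 1.0, "T3": -0.5}


def value_of_test(test, chosen):
    return BASE_GAIN[test] - 1.5 * len(chosen)


selected = select_tests(["T1", "T2", "T3"], value_of_test)
print(selected)
```

Re-evaluating the value of each remaining test after every pick matters because information from one test can make another redundant.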