
1 Modeling Decision Nur Aini Masruroh

2 Outline
Introduction
Probabilistic thinking
Decision tree
Introduction to Bayesian Network and Influence Diagram

3 Introduction
Why are decisions hard to make?
Complexity: there are many alternatives or possible solutions, and there are many factors to be considered, many of which are interdependent.
Uncertainty: the possible future outcomes are uncertain or difficult to predict, and information may be vague, incomplete, or unavailable.
Multiple conflicting objectives: the decision maker(s) may have many goals and objectives, and many of these may be conflicting in nature.

4 Good decision versus good outcome
A good decision does not guarantee a good outcome – it only enhances the chance of one. A good decision can still lead to a bad outcome, and a bad decision can still lead to a good outcome.

5 Probabilistic thinking
An event is a distinction about some state of the world.
Examples: whether the next person entering the room is a beer drinker; whether it will rain tonight; etc.
When we identify an event, we have in mind what we mean. But will other people know precisely what we mean? We may not even have a precise definition of what we have in mind ourselves.
To avoid ambiguity, every event should pass the clarity test.
Clarity test: ensures that we are absolutely clear and precise about the definition of every event we are dealing with in a decision problem.

6 Possibility tree: single event tree
Example: the event “the next person entering this room is a businessman”. Suppose B represents the outcome that the person is a businessman and B’ the outcome that they are not; the tree has two branches, B and B’.

7 Possibility tree: two-event trees
Several events can be considered simultaneously. Example: the event “the next person entering this room is a businessman” and the event “the next person entering this room is a graduate” can be jointly considered in one tree, with the B/B’ branches followed by the G/G’ branches.

8 Reversing the order of events in a tree
In the previous example, we have considered the distinctions in the order of “businessman” then “graduate”, i.e., B to G. The same information can be expressed with the events in the reverse order, i.e., G to B.

9 Multiple event trees
We can jointly consider three events: businessman, graduate, and gender.

10 Assigning probabilities to events
The probabilities we assign depend on our state of information about the event.
Example: information relevant to assessing the likelihood that the next person entering the room is a businessman might include the following:
There is an alumni meeting outside the room and most of the attendees are businessmen.
You have arranged to meet a friend here; to your knowledge she is not a businessman, and she is going to show up any moment.
Etc.
After considering all relevant background information, we express the likelihood that the next person entering the room is a businessman by assigning a probability value to each of the possibilities or outcomes.

11 Marginal and conditional probabilities
In general, given information about the outcomes of some events, we may revise our probabilities of other events. We do this through the use of conditional probabilities.
The probability of an event X given specific outcomes of another event Y is called the conditional probability of X given Y.
The conditional probability of event X given event Y and background information ξ is denoted p(X|Y, ξ) and is given by p(X|Y, ξ) = p(X, Y|ξ) / p(Y|ξ), provided p(Y|ξ) > 0.

12 Factorization rule for joint probability
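The slide's formula did not survive the transcript; the factorization (product) rule it refers to follows directly from the definition of conditional probability above:

p(X, Y | ξ) = p(X | ξ) p(Y | X, ξ) = p(Y | ξ) p(X | Y, ξ)

and, for more events, e.g. p(X, Y, Z | ξ) = p(X | ξ) p(Y | X, ξ) p(Z | X, Y, ξ).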

13 Changing the order of conditioning
Suppose in the previous tree we have assessed p(B) and p(G|B). There is no reason why we should always condition G on B; suppose we want to draw the tree in the order G then B. We need to flip the tree!

14 Flipping the tree
Graphical approach:
Change the ordering of the underlying possibility tree.
Transfer the elemental (joint) probabilities from the original tree to the new tree.
Compute the marginal probability for the first variable in the new tree, i.e., G, by adding the elemental probabilities associated with G1 and G2 respectively.
Compute the conditional probabilities for B given G.
Bayes’ theorem: doing the above tree flipping is already applying Bayes’ theorem. (A numerical sketch of this procedure is given below.)
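A minimal numerical sketch of the flipping procedure. The probabilities below are assumed for illustration; the slide's actual values are not in the transcript:

```python
# Assumed (hypothetical) assessments: p(B) and p(G | B)
p_B = {"B1": 0.2, "B2": 0.8}                      # B1 = businessman, B2 = not
p_G_given_B = {"B1": {"G1": 0.9, "G2": 0.1},      # G1 = graduate, G2 = not
               "B2": {"G1": 0.4, "G2": 0.6}}

# Steps 1-2: elemental (joint) probabilities p(B, G)
joint = {(b, g): p_B[b] * p_G_given_B[b][g]
         for b in p_B for g in p_G_given_B[b]}

# Step 3: marginal of the first variable in the new tree, p(G)
p_G = {g: sum(p for (b, gg), p in joint.items() if gg == g) for g in ("G1", "G2")}

# Step 4: conditionals in the new order, p(B | G) -- this is Bayes' theorem
p_B_given_G = {g: {b: joint[(b, g)] / p_G[g] for b in p_B} for g in p_G}

print(p_G)          # {'G1': 0.5, 'G2': 0.5}
print(p_B_given_G)  # e.g. p(B1 | G1) = 0.36
```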

15 Bayes’ Theorem
Given two uncertain events X and Y, suppose the probabilities p(X|ξ) and p(Y|X, ξ) are known. Then
p(X|Y, ξ) = p(Y|X, ξ) p(X|ξ) / p(Y|ξ), where p(Y|ξ) = Σx p(Y|x, ξ) p(x|ξ).

16 Application of conditional probability
Direct conditioning: relevance of smoking to lung cancer.
Suppose:
S: a person is a heavy smoker, defined as having smoked at least two packs of cigarettes per day for a period of at least 10 years during a lifetime.
L: a person has lung cancer according to the standard medical definition.
A doctor not associated with lung cancer treatment assigned the following probabilities:

17 Relevance of smoking to lung cancer (cont’d)
A lung cancer specialist remarked: “The probability p(L1|S1, ξ) = 0.1 is too low.” When asked to explain why, he said: “Because in all these years as a lung cancer specialist, whenever I visit my lung cancer ward, it is always full of smokers.”
What’s wrong with the above statement? The answer can be found by flipping the tree:

18 Relevance of smoking to lung cancer (cont’d)
What the specialist referred to as “high” is actually the probability of a person being a smoker given that he has lung cancer, i.e., p(S1|L1, ξ); that is what he was referring to. He has confused p(S1|L1, ξ) with p(L1|S1, ξ).
Notice that p(L1|S1, ξ) << p(S1|L1, ξ). Hence even a highly trained professional can fall victim to wrong reasoning.
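To see how the two quantities can differ so sharply, here is an illustrative calculation with assumed numbers (not the slide's own values): suppose p(L1|S1, ξ) = 0.1, p(L1|S2, ξ) = 0.01, and p(S1|ξ) = 0.3. Then

p(L1|ξ) = 0.1 × 0.3 + 0.01 × 0.7 = 0.037
p(S1|L1, ξ) = 0.1 × 0.3 / 0.037 ≈ 0.81

So roughly 81% of the lung cancer ward would be heavy smokers even though only 10% of heavy smokers develop lung cancer.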

19 Expected value criterion
Suppose you face a situation where you must choose between alternatives A and B as follows:
Alternative A: $10,000 for sure.
Alternative B: 70% chance of receiving $18,000 and 30% chance of losing $4,000.
What is your personal choice?
Now compare Alternative B with:
Alternative C: 70% chance of winning $24,600 and 30% chance of losing $19,400.
Note that EMV(B) = EMV(C) = $11,400, but are they “equivalent”? Alternative C seems to be “more risky” than Alternative B even though they have the same EMV.
Conclusion: EMV does not take risk into account. (A quick check of the EMVs is sketched below.)
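A quick check of the EMVs, plus a standard deviation to make the risk difference concrete (the spread comparison is an addition for illustration, not on the slide):

```python
def emv(lottery):
    """Expected monetary value of a lottery given as (probability, payoff) pairs."""
    return sum(p * x for p, x in lottery)

def stdev(lottery):
    """Standard deviation of the payoff -- one simple way to quantify risk."""
    m = emv(lottery)
    return sum(p * (x - m) ** 2 for p, x in lottery) ** 0.5

B = [(0.7, 18_000), (0.3, -4_000)]
C = [(0.7, 24_600), (0.3, -19_400)]

print(emv(B), emv(C))      # 11400.0 11400.0 -- identical EMVs
print(stdev(B), stdev(C))  # ~10082 vs ~20163 -- C is far more spread out
```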

20 The St. Petersburg Paradox
In 1713 Nicolas Bernoulli suggested playing the following game:
An unbiased coin is tossed until it lands Tails.
The player is paid $2 if Tails comes up on the opening toss, $4 if Tails first appears on the second toss, $8 if Tails first appears on the third toss, $16 if Tails first appears on the fourth toss, and so forth.
What is the maximum you would pay to play this game?
If we follow the EMV criterion, the expected payoff is infinite (see the calculation below). This means that you should be willing to pay any finite amount of money to play the game – so why are people unwilling to pay more than a few dollars?
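The calculation missing from the transcript is the standard one: the probability that Tails first appears on toss k is (1/2)^k and the corresponding payoff is $2^k, so

EMV = Σk≥1 (1/2)^k × 2^k = 1 + 1 + 1 + … = ∞.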

21 The St. Petersburg Paradox
25 years later, Nicolas’s cousin, Daniel Bernoulli, arrived at a solution that contained the first seeds of contemporary decision theory.
Daniel reasoned that the marginal increase in the value or “utility” of money declines with the amount already possessed: a gain of $1,000 is more significant to a poor person than to a rich one, even though both gain the same amount.
Specifically, Daniel Bernoulli argued that the value or utility of money should exhibit some form of diminishing marginal return with increasing wealth. The measure to use to value the game is then the “expected utility”, where u is an increasing concave function; with such a u the expected utility of the game converges to a finite number. (A small check with a logarithmic utility is sketched below.)
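A minimal numerical check, assuming Bernoulli's own logarithmic utility u(x) = ln(x) applied to the payoff alone (the slide does not specify the utility function):

```python
import math

# Expected utility of the St. Petersburg game under u(x) = ln(x):
# sum over k of (1/2)^k * ln(2^k).  Truncate the series; it converges quickly.
expected_utility = sum((0.5 ** k) * math.log(2 ** k) for k in range(1, 60))
print(expected_utility)            # ~1.386 = 2 ln 2

# The certainty equivalent -- the sure amount with the same utility, ignoring
# initial wealth -- is small, which matches what people are willing to pay.
print(math.exp(expected_utility))  # ~4.0 dollars
```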

22 The rules of actional thought
How should a person act or decide rationally under uncertainty? Answer: by following these five rules or axioms:
The ordering rule
The equivalence or continuity rule
The substitution or independence rule
The decomposition rule
The choice rule
These five rules form the axioms of Decision Theory.

23 The ordering rule
The decision maker must be able to state his preference among the prospects, outcomes, or prizes of any deal.
Furthermore, the transitivity property must be satisfied: if he prefers X to Y, and Y to Z, then he must prefer X to Z. Mathematically, X ≻ Y and Y ≻ Z imply X ≻ Z.
The ordering rule implies that the decision maker can provide a complete preference ordering of all the outcomes from the best to the worst.
Suppose a person does not follow the transitivity property: he can be exploited as a “money pump” (the money pump argument).

24 The equivalence or continuity rule
Given prospects A, B, and C such that A ≻ B ≻ C, there exists p, 0 < p < 1, such that the decision maker is indifferent between receiving prospect B for sure and receiving a deal with probability p of prospect A and probability 1 – p of prospect C.
Here B is the certainty equivalent of the uncertain deal, and p is the preference probability of prospect B with respect to prospects A and C.

25 The substitution rule
We can always substitute a deal with its certainty equivalent without affecting preference.
For example, suppose the decision maker is indifferent between B and an A–C deal (probability p of A, 1 – p of C). Then he must be indifferent between any two deals that differ only in that B is substituted for the A–C deal.

26 The decomposition rule
We can reduce compound deals to simple ones using the rules of probability. For example, a decision maker should be indifferent between a two-stage (compound) lottery and the single-stage lottery obtained by multiplying the probabilities along each path.

27 The choice or monotonicity rule
Suppose that a decision maker can choose between two deals: L1, which gives outcome A with probability p1 and outcome B with probability 1 – p1, and L2, which gives A with probability p2 and B with probability 1 – p2.
If the decision maker prefers A to B, then he must prefer L1 to L2 if and only if p1 > p2. In other words, the decision maker must prefer the deal that offers the greater chance of receiving the better outcome.

28 Maximum expected utility principle
Let a decision maker face the choice between two uncertain deals or lotteries, L1 and L2, with outcomes A1, A2, …, An; write p1, …, pn and q1, …, qn for the probabilities that L1 and L2 assign to these outcomes.
There is no loss of generality in assuming that L1 and L2 have the same set of outcomes A1, A2, …, An, because we can always assign zero probability to those outcomes that do not appear in one of them. At this stage it is not clear whether L1 or L2 is preferred.
By the ordering rule, let A1 ≽ A2 ≽ … ≽ An.

29 Maximum expected utility principle
Again, there is no loss of generality, as we can always renumber the subscripts according to the preference ordering. We note that A1 is the most preferred outcome, while An is the least preferred outcome.
By the equivalence rule, for each outcome Ai (i = 1, …, n) there is a number ui with 0 ≤ ui ≤ 1 such that Ai is indifferent to a deal giving A1 with probability ui and An with probability 1 – ui.
Note that u1 = 1 and un = 0. Why?

30 Maximum expected utility principle
By the substitution rule, we replace each Ai (i = 1, …, n) in L1 and L2 with the equivalent lottery constructed above.

31 Maximum expected utility principle
By the decomposition rule, L1 and L2 may be reduced to equivalent deals with only two outcomes (A1 and An), each giving A1 with a different probability: p1u1 + p2u2 + … + pnun for L1, and q1u1 + q2u2 + … + qnun for L2.
Finally, by the choice rule, since A1 is preferred to An, the decision maker should prefer lottery L1 to lottery L2 if and only if p1u1 + … + pnun > q1u1 + … + qnun.

32 Utilities and utility functions
We define the quantity ui (i = 1, …, n) as the utility of outcome Ai, and the function that returns the value ui given Ai as a utility function, i.e., u(Ai) = ui.
The quantities Σi pi u(Ai) and Σi qi u(Ai) are known as the expected utilities of lotteries L1 and L2 respectively. Hence the decision maker must prefer the lottery with the higher expected utility.

33 Case for more than 2 alternatives
The previous result may be generalized to the case where a decision maker faces more than two uncertain alternatives. He should choose the one with maximum expected utility.
Hence the best alternative is the j that maximizes Σi pij u(Ai), where pij is the probability of outcome Ai under alternative j. (A small sketch of this computation follows.)
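A minimal sketch of the maximum expected utility rule in code; all probabilities, utilities, and alternative names below are hypothetical illustrations, not values from the slides:

```python
# Hypothetical utilities u(Ai) for the possible outcomes
utility = {"A1": 1.0, "A2": 0.6, "A3": 0.0}

# Hypothetical probabilities p_ij of each outcome under each alternative j
alternatives = {
    "alt1": {"A1": 0.5, "A2": 0.2, "A3": 0.3},
    "alt2": {"A1": 0.3, "A2": 0.6, "A3": 0.1},
}

def expected_utility(probs):
    """Sum over outcomes of p_ij * u(Ai)."""
    return sum(p * utility[a] for a, p in probs.items())

best = max(alternatives, key=lambda j: expected_utility(alternatives[j]))
print({j: expected_utility(p) for j, p in alternatives.items()})  # {'alt1': 0.62, 'alt2': 0.66}
print(best)  # 'alt2'
```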

34 Comparing expected utility criterion with expected monetary value criterion
The expected utility criterion takes into account both return and risk, whereas the expected monetary value criterion does not consider risk.
The alternative with the maximum expected utility is the best one, taking into account the trade-off between return and risk.
The best trade-off depends on a person’s risk attitude. Different types of utility function represent different attitudes and degrees of aversion to risk taking.

35 Decision tree
Consider the following party problem:
Problem: decide the party location to maximize total satisfaction.
Note: a decision is represented by a square node; uncertainties are represented by circle nodes.

36 Preference
Suppose we have the following preferences. Note:
Best case: O – S (outdoor, sunshine) → preference probability set to 1.
Worst case: O – R (outdoor, rain) → preference probability set to 0.
For the other outcomes, set the preference probability relative to these two values.

37 Assigning probability to the decision tree
Suppose we believe that the probability it will rain is 0.6.

38 Applying substitution rule

39 Using utility values
We may interpret preference probabilities as utility values and roll the tree back to find the best alternative. (A numerical sketch with assumed values follows.)
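A minimal roll-back sketch for a party-style problem. The 0.6 rain probability comes from slide 37; the utility values and the second (indoor) alternative are assumptions for illustration only:

```python
p_rain = 0.6  # from the slide

# Hypothetical preference probabilities (utilities) for each location/weather outcome
utility = {
    ("outdoor", "sun"): 1.0,   # best case, set to 1 (slide 36)
    ("outdoor", "rain"): 0.0,  # worst case, set to 0 (slide 36)
    ("indoor", "sun"): 0.6,    # assumed
    ("indoor", "rain"): 0.5,   # assumed
}

def expected_utility(location):
    """Roll back the chance node: average utility over the weather outcomes."""
    return (1 - p_rain) * utility[(location, "sun")] + p_rain * utility[(location, "rain")]

for loc in ("outdoor", "indoor"):
    print(loc, expected_utility(loc))    # outdoor 0.4, indoor 0.54
best = max(("outdoor", "indoor"), key=expected_utility)
print("best decision:", best)            # indoor (under these assumed utilities)
```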

40 Introduction to Bayesian Network and Influence Diagram

41 “A good representation is the key to good problem solving”

42 Probabilistic modeling using BN
Suppose we have the following problem (represented as a decision tree). It can be represented using a Bayesian Network (BN); a Conditional Probability Table (CPT) is attached to each node.

43 Probabilistic modeling using BN
The network can be extended … Can you imagine the size of the decision tree for these?

44 Bayesian Network: definition
Also called relevance diagram, probabilistic network, causal network, causal graph, etc.
A BN represents the probabilistic relations between uncertain variables.
It is a directed acyclic graph; the nodes in the graph indicate the variables of concern, while the arcs between nodes indicate the probabilistic relations among them.
In each node, we store a conditional probability distribution of the variable represented by that node, conditioned on the outcomes of all the uncertain variables that are parents of that node. (A minimal data-structure sketch is given below.)
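One minimal way to hold such a network in code; this is a hypothetical sketch (not any particular BN library's API), reusing the smoking/lung-cancer variables with assumed numbers except for p(L1|S1) = 0.1 from slide 16:

```python
# Each node stores its parents and a CPT: a map from parent-outcome tuples
# to a distribution over the node's own outcomes.
bayes_net = {
    "Smoking": {
        "parents": [],
        "cpt": {(): {"S1": 0.3, "S2": 0.7}},          # assumed numbers
    },
    "LungCancer": {
        "parents": ["Smoking"],
        "cpt": {
            ("S1",): {"L1": 0.10, "L2": 0.90},        # 0.10 from slide 16
            ("S2",): {"L1": 0.01, "L2": 0.99},        # assumed
        },
    },
}

# Look up p(LungCancer = L1 | Smoking = S1)
print(bayes_net["LungCancer"]["cpt"][("S1",)]["L1"])  # 0.1
```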

45 Two layers of representation of knowledge
Qualitative level: the graphical structure represents the probabilistic dependence or relevance between variables.
Quantitative level: the conditional probabilities represent the local “strength” of the dependence relationships.

46 Where do the numbers in a BN come from?
Direct assessment by domain experts.
Learning from a sufficient amount of data using statistical estimation methods or machine learning and data mining algorithms.
Output from other mathematical models: simulation models, stochastic models, system dynamics models, etc.
Combinations of the above: experts assess the graphical structure and learning algorithms or other models fill in the numbers, or both structure and numbers are learned and the experts fine-tune the results.

47 Properties of BN
Presence of an arc indicates possible relevance.
An arc can be drawn in either direction.
Arc reversal: if we are interested in the probability that a specific person is a smoker given that he has lung cancer, we can reverse the arc; the operation computes and replaces the probabilities at the two nodes.

48 Arc reversal operation
Suppose the original network has an arc from X to Y, with p(X) and p(Y|X) stored. We then want the reversed network, with an arc from Y to X storing p(Y) and p(X|Y).
The probability distributions p(Y) and p(X|Y) for the new network can be computed using Bayes’ Theorem:
p(Y|ξ) = Σx p(Y|x, ξ) p(x|ξ) and p(X|Y, ξ) = p(Y|X, ξ) p(X|ξ) / p(Y|ξ).

49 Arc reversal: example
Note: in arc reversal, sometimes we must add arc(s) so that the reversal remains consistent with Bayes’ theorem. However, if possible, avoid arc reversals that introduce additional arcs, as that implies a loss of conditional independence information.

50 If an arc can be drawn in either direction, which shall I use?
During network construction, draw arcs in the directions in which you know the conditional probabilities, or in which you know there are data that can be used to determine these values later. Arcs drawn in these directions are said to be in assessment order.
During inference, if the arcs are not in the desired directions, reverse them. Arcs in the directions required for inference are said to be in inference order.
Example: the network with the arc from “smoking” to “lung cancer” is in assessment order; the network with the arc from “lung cancer” to “smoking” is in inference order.

51 BN represents a joint probability distribution
A BN can help simplify the joint probability distribution (JPD). Consider a BN over A, B, C, D, E, F.
With the BN: p(A,B,C,D,E,F) = p(A) p(B|A) p(C) p(D|B,C) p(E|B) p(F|B,E)
Without the BN (full chain rule): p(A,B,C,D,E,F) = p(A) p(B|A) p(C|A,B) p(D|A,B,C) p(E|A,B,C,D) p(F|A,B,C,D,E)
(A parameter count illustrating the saving is given below.)
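To make the simplification concrete, here is a small count of the free parameters needed if all six variables are binary (an illustration of my own, not from the slide):

```python
# Number of free parameters in a CPT for a binary variable = 2 ** (number of parents)
parents = {"A": 0, "B": 1, "C": 0, "D": 2, "E": 1, "F": 2}  # structure from the factorization above

bn_params = sum(2 ** k for k in parents.values())
full_joint_params = 2 ** len(parents) - 1

print(bn_params)          # 14 numbers to assess with the BN
print(full_joint_params)  # 63 numbers without it
```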

52 Example of BN: car starting system

53 Example of BN: cause of dyspnea

54 Example of BN: ink jet printer trouble shooting

55 Example of BN: patient monitoring in an ICU (alarm project)

56 Decision modeling using Influence Diagram
BNs represent the probabilistic relationships among uncertain variables and are useful for pure probabilistic reasoning and inference.
A BN can be extended to an Influence Diagram (ID) to represent a decision problem by adding decision nodes and value nodes. This is analogous to extending a probability tree into a decision tree by adding decision branches and attaching values or utilities to the end points of the tree.

57 Decision node
Decision variable: a variable within the control of the decision maker, represented by a rectangular node in an ID.
In each decision node, we store the list of possible alternatives associated with the decision variable.

58 Arcs Information arcs: arc from chance node into decision node
Influence arcs: arcs from decision node to chance node

59 Arcs (cont’d) Chronological arcs:
Arc from one decision node to another decision node indicates the chronological order in which the decisions are being carried out

60 Value node and value arc
The value node is used to represent the utility or value function of the decision maker and is denoted by a diamond.
A value node must be a sink node, i.e. it has only incoming arcs (known as value arcs) and no outgoing arcs.
Value arcs indicate the variables whose outcomes the decision maker cares about, i.e. those that have an impact on his utility.
Only one value node is allowed in a standard ID.

61 Deterministic node
A special type of chance node, denoted by a double oval.
It represents a variable whose outcome is deterministic (i.e. has probability 1) once the outcomes of its conditioning (parent) nodes are known.

62 ID vs decision tree
1. Compact vs combinatorial: the size of an ID grows with the total number of variables, whereas the size of a decision tree grows exponentially with the number of variables (a binary tree over n variables has 2^n leaf nodes).
2. Graphical vs numerical representation of independence: in an ID, conditional independence relations among the variables are represented by the graphical structure of the network, and no numerical computation is needed to determine them; in a decision tree, conditional independence relations can only be determined through numerical computation using the probabilities.
3. Non-directional vs unidirectional: the nodes and arcs of an ID may be added or deleted in any order, which makes the modeling process flexible; a decision tree can only be built from the root to the leaf nodes, so the exact sequence of the nodes or events must be known in advance.
4. Symmetric vs asymmetric models: in an ID, the outcomes of every node must be conditioned on all outcomes of its parents, which implies that the equivalent tree must be symmetric; in a decision tree, the outcomes of some nodes may be omitted for certain outcomes of their parents, leading to an asymmetric tree.

63 Example 1

64 Example 2

65 Example 3

66 Decision model : example 1
The party problem Basic risky decision problem

67 Decision model : example 2
Decision problem with imperfect information

68 Decision model : example 3
Production/sale problem

69 Decision model : example 4
Maintenance decision for space shuttle tiles

70 Decision model : example 5
Basic model for electricity generation investment evaluation

71 Evaluating ID
Goal: find the optimal decision policy of a problem represented by an ID. Methods:
Convert the ID into an equivalent decision tree and perform tree roll-back.
Perform operations directly on the network to obtain the optimal decision policy; the first such algorithm is that of Shachter (1986).


73 Readings
Clemen, R.T. and Reilly, T. (2001). Making Hard Decisions with Decision Tools. California: Duxbury Thomson Learning.
Howard, R.A. (1988). Decision Analysis: Practice and Promise. Management Science, 34(6), pp. 679–695.
Russell, S. and Norvig, P. (2003). Artificial Intelligence: A Modern Approach, 2nd ed. Prentice-Hall, Inc.
Shachter, R.D. (1986). Evaluating Influence Diagrams. Operations Research, 34(6), pp. 871–882.

