Decision Analysis Lecture 4


Decision Analysis Lecture 4 Tony Cox My e-mail: tcoxdenver@aol.com Course web site: http://cox-associates.com/DA/

Agenda
Problem set 3 solutions
Simulation-optimization for Joe's pills
Assignment 4: Bayesian inference
Introduction to Netica for Bayesian inference
Wrap-up on decision trees
Binomial distribution

Homework #4 (Due by 4:00 PM, February 14)
Problems: Machines; Fair coin
Readings:
Required: Clinical vs. statistical predictions, http://emilkirkegaard.dk/en/?p=6085
Recommended: Important probability distributions (binomial), https://www.utdallas.edu/~scniu/OPRE-6301/documents/Important_Probability_Distributions.pdf
Recommended: Binomial distribution in R, http://www.r-tutor.com/elementary-statistics/probability-distributions/binomial-distribution and http://www.stats.uwo.ca/faculty/braun/RTricks/basics/BasicRIV.pdf

Assignment 3, Problem 1 (ungraded) A fair coin is tossed once. Draw the risk profile (cumulative distribution function) for the number of heads. Purpose: Be able to draw, interpret risk profiles Practice! You do not have to turn this in, but we will go over the solution next class Helpful background on discrete CDFs: www.probabilitycourse.com/chapter3/3_2_1_cdf.php

Assignment 3, Solution 1 A fair coin is tossed once. Draw the risk profile (cumulative distribution function) for the number of heads. Solution: http://www.gaussianwaves.com/2008/04/probability/

Assignment 3, Problem 2: Joe's medicine
Joe takes pills to reduce his risk of heart attack.
The pharmacist can prescribe for him either 1 pill per day at full strength, or 2 pills per day, each at half strength.
The probability that Joe forgets to take any given pill on any occasion is p. Its value is uncertain.
Here is how pills affect daily heart attack risk:
If he takes the full-strength pill, multiply his risk by 0.5 (it is cut in half)
If he takes 1 half-strength pill, multiply his risk by 0.7
If he takes both half-strength pills, multiply his risk by 0.5
If he takes no pill, multiply his risk by 1
What should the pharmacist prescribe? Please submit the answer as two ranges (intervals) of p values for which the best choice is (A) prescribe 1 full-strength pill; (B) prescribe 2 half-strength pills.

Binomial distribution with parameters p and N = 2
The probability that Joe takes 0 pills is p^2
Pr(takes 2 pills) = (1 - p)^2
Pr(takes 1 pill) = p(1 - p) + (1 - p)p = 2p(1 - p)

Assignment 3, Solution 2: Joe's medicine
The probability that Joe forgets to take any given pill is p. Here is how pills affect daily heart attack risk: a full-strength pill multiplies his risk by 0.5 (it is cut in half); 1 half-strength pill multiplies it by 0.7; both half-strength pills multiply it by 0.5; no pill leaves it multiplied by 1. What should the pharmacist prescribe?
1 full pill per day gives expected risk multiplier p*1 + (1 - p)*0.5 = 0.5 + 0.5p
2 half pills per day give expected risk multiplier p^2*1 + 2p(1 - p)*0.7 + (1 - p)^2*0.5 = p^2 + 1.4p - 1.4p^2 + (1 - 2p + p^2)*0.5 = 0.1p^2 + 0.4p + 0.5
2 pills are better than 1 (i.e., give lower expected risk for Joe) if 0.5 + 0.5p > 0.1p^2 + 0.4p + 0.5, i.e., 0.5p > 0.1p^2 + 0.4p, i.e., 0.1p > 0.1p^2, i.e., p > p^2
But p > p^2 for all 0 < p < 1. If p = 0 or 1, the two options are equally good. Otherwise, 2 pills are always better than 1, for all p in (0, 1).

Graphical solution
The 2-pill option deterministically dominates the 1-pill option over the whole (infinite) set of states in (0, 1). If the EU(a) lines crossed, then we would have to assess probabilities for the values of p.
Simulation-optimization: we could instead solve this using a decision tree and simulation of EU(a), given the model {u(c), Pr(s), Pr(c | a, s)}. (Here, the state s is p.) For each act a:
Draw s from Pr(s)
Draw c from Pr(c | a, s)
Evaluate u(c)
Repeat and average the u(c) values
Select the a with the greatest mean u(c)
(A minimal R sketch of this recipe follows.)
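A minimal R sketch of the simulation-optimization recipe for Joe's pills. The Uniform(0,1) prior over the forget-probability p and the utility u(c) = -(daily risk multiplier) are illustrative assumptions, not part of the assignment:

set.seed(1)
n.sims <- 100000
p <- runif(n.sims)                    # draw state s = p from an assumed Uniform(0,1) prior
# Act A: 1 full-strength pill; risk multiplier 0.5 if taken, 1 if forgotten
taken <- rbinom(n.sims, 1, 1 - p)
mult.full <- ifelse(taken == 1, 0.5, 1)
# Act B: 2 half-strength pills; multiplier 1 / 0.7 / 0.5 for 0 / 1 / 2 pills taken
k <- rbinom(n.sims, 2, 1 - p)         # number of half pills taken (binomial, N = 2)
mult.half <- c(1, 0.7, 0.5)[k + 1]
mean(-mult.full)                      # estimated EU of act A: about -0.750
mean(-mult.half)                      # estimated EU of act B: about -0.733 (higher, so B wins)

Under this assumed prior, the estimates match the analytic multipliers 0.5 + 0.5p and 0.1p^2 + 0.4p + 0.5 averaged over p, consistent with 2 half pills being better for all p in (0, 1).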

Assignment 3, Problem 3 Certainty Equivalent calculation If you buy a raffle ticket for $2.00 and win, you will get $19.00; else, you will receive nothing from the ticket. The probability of winning is 1/3 Your utility function for final wealth x is u(x) = log(x) Your initial wealth (before deciding whether to buy the ticket) is $10 What is your certainty equivalent (selling price) for the opportunity to buy this raffle ticket? Please submit one number Should you buy it? Please answer Yes or No

Assignment 3, Solution 3 Certainty Equivalent calculation If you buy a raffle ticket for $2 and win, you will get $19.00; else, you will receive nothing from the ticket. The probability of winning is 1/3 X = random variable for final wealth if you buy ticket = 10 - 2 +19 = $27 with probability 1/3, else 10 - 2 = $8. Your utility function for final wealth x is u(x) = log(x) Your initial wealth is $10 Let CE = CE(X) = CE of final wealth if you buy ticket u(CE) = EU(X) = (1/3)*log(10 - 2 + 19) + (2/3)*log(10 - 2) = 2.4849. CE = exp(2.4849) = $12. So, deciding to buy the ticket increases your CE(wealth) from $10 to $12. This transaction is worth $2 to you.

Assignment 3, Solution 3: Certainty Equivalent calculation
u(CE) = EU(X) = (1/3)*log(10 - 2 + 19) + (2/3)*log(10 - 2) = 2.4849; CE = exp(2.4849) = $12. So the ticket increases your CE(wealth) from $10 to $12 and is worth $2 to you.
Note: EMV(X) = (1/3)*(10 - 2 + 19) + (2/3)*(10 - 2) = $14.33, so your risk premium is $14.33 - $12.00 = $2.33
Note: Suppose initial wealth is 1000. Then CE(final wealth) = exp((1/3)*log(1000 - 2 + 19) + (2/3)*log(1000 - 2)) = 1004.29 (compared to EMV = 1000 + (1/3)*17 + (2/3)*(-2) = 1004.33). The risk premium shrinks to $0.04.
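The same arithmetic can be checked in a few lines of R (natural log, as on the slide):

u <- log                          # utility of final wealth
w0 <- 10                          # initial wealth
X <- c(w0 - 2 + 19, w0 - 2)       # final wealth if win, if lose
pr <- c(1/3, 2/3)
EU <- sum(pr * u(X))              # expected utility = 2.4849
exp(EU)                           # certainty equivalent: exactly 27^(1/3) * 8^(2/3) = $12
sum(pr * X) - exp(EU)             # risk premium = $14.33 - $12.00 = $2.33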

Assignment 4, Problem 1: Fair Coin Problem (due 2-14-17)
A box contains two coins: (a) a fair coin; and (b) a coin with a head on each side. One coin is selected at random (we don't know which) and tossed once. It comes up heads.
Q1: What is the probability that the coin is the fair coin?
Q2: If the same coin is tossed again and shows heads again, then what is the new (posterior) probability that it is the fair coin?
Solve manually and/or using Netica.

Assignment 4, Problem 2: Defective Items (due 2-14-17)
Machines 1, 2, and 3 produced (20%, 30%, 50%) of the items in a large batch, respectively. The defect rates for items produced by these machines are (1%, 2%, 3%), respectively. A randomly sampled item is found to be defective. What is the probability that it was produced by Machine 2?
Exercise: (a) solve using Netica; (b) solve manually. E-mail your answer (a single number) to tcoxdenver@aol.com

Introduction to Bayesian inference with Netica®

Example: HIV screening
Pr(s) = 0.01 = fraction of population with HIV; s = has HIV, s′ = does not have HIV; y = test is positive
Pr(test positive | HIV) = 0.99
Pr(test positive | no HIV) = 0.02
Find: Pr(HIV | test positive) = Pr(s | y)
Subjective probability estimates?
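Before turning to Netica, the answer can be computed directly from Bayes' rule in R:

prior <- 0.01                     # Pr(HIV)
sens  <- 0.99                     # Pr(test positive | HIV)
fpos  <- 0.02                     # Pr(test positive | no HIV)
prior * sens / (prior * sens + (1 - prior) * fpos)   # Pr(HIV | positive) = 1/3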

Solution via Bayesian Network (BN) solver
DAG model: "True state → Observation"
DAG = "directed acyclic graph": nodes and arrows, with no cycles allowed
Store "marginal probabilities" at input nodes (those having output arrows only); store "conditional probability tables" at all other nodes
Make observations, enter a query, and the solver calculates the conditional probabilities

Solution in Netica Step 1: Build model, compile network

Solution in Netica Step 1: Build model, compile network Step 2: Condition on observation (right-click, choose “Enter findings”), view conditional probabilities

Wrap-up on Netica introduction User just needs to enter model and observations (“findings”) Netica uses Bayesian Network algorithms to update all probabilities (conditioning them on findings) We will learn to do this manually for small problems Algorithms and software are essential for large, complex inference problems

Review and wrap-up on decision trees and probabilities

Decision tree ingredients Three types of nodes Choice nodes (squares) Chance nodes (circles) Terminal nodes / value nodes Arcs show how decisions and chance events can unfold over time Uncertainties are resolved as time passes and choices are made

Solving decision trees
"Backward induction," "stochastic dynamic programming," "average out and roll back": implicitly, the tree determines Pr(c | a)
Procedure: start at the tips of the tree and work backward
Compute the expected value at each chance node ("averaging out")
Choose the maximum expected value at each choice node
(A small numeric sketch follows.)
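A minimal numeric sketch of averaging out and rolling back, using the develop/do-not-develop numbers from the next slide (the $172,000 there is itself a rolled-back value from a downstream decision, and the $0 payoff for not developing is an illustrative assumption):

ev.develop <- 0.7 * 172000 + 0.3 * (-500000)  # averaging out at the chance node: -29600
ev.not     <- 0                               # assumed payoff of "Do Not Develop"
max(ev.develop, ev.not)                       # rolling back at the choice node: choose Do Not Develop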

Obtaining Pr(s) from decision trees
[Decision-tree figure from eogogics.com]
Decision 1: Develop or Do Not Develop
EV(Develop) = (70% × $172,000) + (30% × (−$500,000)) = $120,400 + (−$150,000) = −$29,600

What happened to act a and state s?
[Same decision-tree figure: Decision 1: Develop or Do Not Develop; (70% × $172,000) + (30% × (−$500,000)) = $120,400 + (−$150,000)]

What happened to act a and state s?
[Same decision-tree figure]
What are the 3 possible acts in this tree? (a) Don't develop; (b) Develop, then rebuild if successful; (c) Develop, then new line if successful.
Optimize decisions!

Key points Solving decision trees (with decisions) requires embedded optimization Make future decisions optimally, given the information available when they are made Event trees = decision trees with no decisions Can be solved, to find outcome probabilities, by forward Monte-Carlo simulation, or by multiplication and addition In general, sequential decision-making cannot be modeled well using event trees. Must include (optimal choice | information)

What happened to state s?
[Same decision-tree figure]
What are the 4 possible states? C1 can succeed or not; demand at C2 can be high or low.

Acts and states cause consequences
[Same decision-tree figure]

Key theoretical insight
A complex decision model can be viewed as a (possibly large) simple Pr(c | a) model:
s = selection of a branch at each chance node
a = selection of a branch at each choice node
c = outcome at the terminal node reached for (a, s)
Pr(c | a) = Σ_s Pr(c | a, s)*Pr(s)
Other complex decision models can also be interpreted as c(a, s), Pr(c | a, s), or Pr(c | s) models:
s = system state & information signal
a = decision rule (information → act)
c may include changes in s and in the possible a

Real decision trees can quickly become “bushy messes” (Raiffa, 1968) with many duplicated sub-trees

Influence diagrams help to avoid large trees
[Influence-diagram figure from en.wikipedia.org]
Often much more compact than decision trees

Limitations of decision trees
Combinatorial explosion. Example: searching for a prize in one of N boxes or locations gives a tree with N! = N(N − 1)…·2·1 possible inspection orders (the tree has depth N but N! paths).
Infinite trees; continuous variables; when to stop growing a tree?
How to evaluate utilities and probabilities?

Optimization formulations of decision problems
Example: the prize is in location j with prior probability p(j), j = 1, 2, …, N. It costs c(j) to inspect location j. What search strategy minimizes the expected cost of finding the prize?
What is a strategy? An order in which to inspect the locations. How many are there? N!

With two locations, 1 and 2
Strategy 1: inspect 1, then 2 if needed. Expected cost: c1 + (1 − p1)c2 = c1 + c2 − p1c2
Strategy 2: inspect 2, then 1 if needed. Expected cost: c2 + (1 − p2)c1 = c1 + c2 − p2c1
Strategy 1 has lower expected cost if p1c2 > p2c1, i.e., p1/c1 > p2/c2
So, look first at the location with the highest success probability per unit cost

With N locations Optimal decision rule: Always inspect next the (as-yet uninspected) location with the greatest success probability-to-cost ratio Example of an “index policy,” “Gittins index” If M players take turns, competing to find prize, each should still use this rule. A decision table or tree can be unwieldy even for such simple optimization problems
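A brute-force check of this index rule in R for a small N (the permn() helper from the combinat package is assumed available; any permutation generator would do):

expected.cost <- function(ord, p, cost) {
  # Pr(prize not yet found when the k-th inspection starts)
  pr.reach <- c(1, 1 - cumsum(p[ord]))[seq_along(ord)]
  sum(pr.reach * cost[ord])
}
set.seed(2)
N <- 4
p <- runif(N); p <- p / sum(p)      # prior Pr(prize in location j)
cost <- runif(N, 1, 10)             # inspection costs c(j)
library(combinat)
orders <- permn(N)                  # all N! inspection orders
ec <- sapply(orders, expected.cost, p = p, cost = cost)
orders[[which.min(ec)]]             # brute-force optimum...
order(p / cost, decreasing = TRUE)  # ...matches inspecting in decreasing p/c order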

Other optimization formulations
max_{a ∈ A} EU(a) subject to g(a) ≤ 0 (the constraints defining the feasible set A), where
EU(a) = Σ_c Pr(c | a)u(c)
Pr(c | a) = Σ_s Pr(c | a, s)p(s)
Typically a is a vector and A is the feasible set; more generally, a is a strategy/policy/decision rule and A is the choice set of feasible strategies. In the previous example, A = the set of permutations.
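A toy R version of this formulation with made-up numbers (2 acts, 2 states, 3 consequences; every probability and utility below is an illustrative assumption):

p.s <- c(0.6, 0.4)                  # Pr(s) over 2 states
u.c <- c(0, 5, 10)                  # u(c) for 3 consequences
# Pr(c | a, s): one 3-by-2 matrix per act, columns indexed by state
P1 <- matrix(c(0.2, 0.3, 0.5,  0.7, 0.2, 0.1), nrow = 3)
P2 <- matrix(c(0.1, 0.4, 0.5,  0.4, 0.4, 0.2), nrow = 3)
eu <- function(P) sum(u.c * (P %*% p.s))   # EU(a) via Pr(c|a) = sum_s Pr(c|a,s)p(s)
c(eu(P1), eu(P2))                   # 4.7 and 5.8
which.max(c(eu(P1), eu(P2)))        # act 2 attains the maximum EU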

Advanced decision tree analysis Game trees Different decision-makers Monte Carlo tree search (MCTS) in games with risk and uncertainty https://jeffbradberry.com/posts/2015/09/intro-to-monte-carlo-tree-search/, http://www.cameronius.com/research/mcts/about/index.html http://www.cameronius.com/cv/mcts-survey-master.pdf Generating trees Apply rules to expand and evaluate nodes Learning trees from data Sequential testing http://stackoverflow.com/questions/23803186/monte-carlo-tree-search-implementation-for-tic-tac-toe

Summary on decision trees
Decision trees show sequences of choices, chance nodes, observations, and final consequences; they mix observations, acts, optimization, and causality
Good for very small problems; less good for medium-sized problems; unwieldy for large problems → use influence diagrams (IDs) instead
Decision trees and other decision models can be viewed as simple c(a, s) models
But we need good optimization solvers!

Road map: Filling in the normal form matrix Assessing probabilities Eliciting well-calibrated probabilities Deriving probabilities from models Estimating probabilities from data Assessing utilities Utility elicitation Single-attribute utility theory Multi-attribute utility theory

Binomial probability model

Some useful probability models (R names)
Uniform = unif
Binomial (n trials, 2 outcomes) = binom
Poisson ("rare events" law) = pois
Exponential (waiting time) = exp
Normal (sums, random errors) = norm
Beta (proportions) = beta
Prefixes: p = distribution function (CDF), q = quantile, d = density, r = random sample (simulation)
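For example, the binomial family follows this prefix convention:

dbinom(2, size = 4, prob = 0.5)     # d: Pr(X = 2) = 0.375
pbinom(2, size = 4, prob = 0.5)     # p: Pr(X <= 2), the CDF, = 0.6875
qbinom(0.5, size = 4, prob = 0.5)   # q: the 0.5 quantile (median) = 2
rbinom(5, size = 4, prob = 0.5)     # r: five random draws from Binomial(4, 0.5)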

Binomial model pbinom(x, n, p) Two outcomes on each of n independent trials, “success” and “failure” Probability of success = p for each trial independently Expected number of successes in n trials with success probability p = ? Probability of no more than x successes in n trials with success probability p = pbinom(x, n, p)

pbinom(x, n, p) includes the probability of x successes
Expected number of successes in n trials with success probability p = np
Probability of no more than x successes in n trials with success probability p = pbinom(x, n, p)
Note that pbinom gives the probability of less than or equal to x successes in n trials

Binomial model: pbinom(x, n, p)
2 outcomes on each of n independent trials; P(success) = p for each trial independently
E(successes in n trials) = np = mean
Pr(x successes in n trials) = C(n, x) p^x (1 - p)^(n-x) = dbinom(x, n, p)
C(n, x) = "n choose x" = number of combinations of n things taken x at a time = n(n - 1)…(n - x + 1)/x!
Example: Pr(1 or 2 heads in 4 tosses of a fair coin) = ?

Binomial model: pbinom(x, n, p), dbinom(x, n, p)
Pr(1 or 2 heads in 4 tosses of a fair coin) = Pr(1 head) + Pr(2 heads)
= C(4, 1) p^1 (1 - p)^3 + C(4, 2) p^2 (1 - p)^2 = (4 + 6)*0.5^4 = 10/16 = 5/8 = 0.625
= dbinom(1, 4, 0.5) + dbinom(2, 4, 0.5)
= pbinom(2, 4, 0.5) - pbinom(0, 4, 0.5)
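All three forms can be checked in R; each line below returns 0.625:

choose(4, 1) * 0.5^1 * 0.5^3 + choose(4, 2) * 0.5^2 * 0.5^2
dbinom(1, 4, 0.5) + dbinom(2, 4, 0.5)
pbinom(2, 4, 0.5) - pbinom(0, 4, 0.5)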

Example of binomial model pbinom(x, n, p) Susan goes skiing each weekend if the weather is good n = 12 weekends in ski season Probability of good weather = 0.65 for each weekend independently What is the probability that she will ski for 8 or more weekends? (Use pbinom) Find her expected number of ski weekends

Do it!

Example of binomial model pbinom(x, n = 12, p = 0.65) Expected number of weekends she skis is np = ? Probability of skiing for 8 or more weekends = 1 – Pr(no more than 7 ski trips in 12 weekends, with p = 0.65 for each) = ?

Example of binomial model: pbinom(x, n = 12, p = 0.65)
Expected number of weekends she skis is np = 12*0.65 = 7.8
Probability of skiing for 8 or more weekends = 1 - Pr(no more than 7 ski trips in 12 weekends, with p = 0.65 for each):
> 1 - pbinom(7, 12, 0.65)
[1] 0.583345

Optional practice problems on binomial (and related) calculations: do using R
1. Ten percent of computer parts produced by a certain supplier are defective. What is the probability that a sample of 10 parts contains more than 3 defective ones?
2. On average, two tornadoes hit major U.S. metropolitan areas every year. What is the probability that more than five tornadoes occur in major U.S. metropolitan areas next year?
3. A lab network consisting of 20 computers was attacked by a computer virus. The virus enters each computer with probability 0.4, independently of other computers. (a) Find the probability that the virus enters at least 10 computers. (b) A computer manager checks the lab computers, one after another, to see if they were infected by the virus. What is the probability that she has to test at least 6 computers to find the first infected one?
Check answers at www.utdallas.edu/~mbaron/3341/Practice4.pdf. E-mail any questions on R solutions to tcoxdenver@aol.com
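One possible set of R one-liners for these problems (problem 2 uses the Poisson model, and 3b reduces to a geometric-style calculation; verify against the posted answers):

1 - pbinom(3, 10, 0.1)    # 1: Pr(more than 3 defective in a sample of 10)
1 - ppois(5, 2)           # 2: Pr(more than 5 tornadoes), Poisson with mean 2
1 - pbinom(9, 20, 0.4)    # 3a: Pr(virus enters at least 10 of 20 computers)
(1 - 0.4)^5               # 3b: Pr(first 5 computers checked are clean) = Pr(at least 6 tests)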

Plotting a binomial distribution (probability mass function)
> x = c(0:12)
> y = dbinom(x, 12, 0.65); plot(x, y)
Probability distribution for the number of ski weekends. This "probability density" or "probability mass" function lets us calculate the expected utility of a season pass if its utility is determined by the number of ski weekends.

Plotting a binomial distribution
> barplot(dbinom(x, 12, 0.65))

Risk profile (CDF) for the binomial
> x <- c(0:12)
> y <- pbinom(x, 12, 0.65)
> plot(x, y)

Using the binomial model to calculate probabilities A company will remain solvent if at least 3 of its 8 markets are profitable. The probability that each market is profitable is 25%. What is the probability that the company remains solvent?

Using the binomial model to calculate probabilities
A company will remain solvent if at least 3 of its 8 markets are profitable. The probability that each market is profitable is 25%. What is the probability that the company remains solvent?
Pr(no more than 5 failures out of 8) = pbinom(5, 8, 0.75)
[1] 0.3214569
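Equivalently, counting profitable markets directly; both lines return 0.3214569:

pbinom(5, 8, 0.75)        # at most 5 unprofitable markets, Pr(unprofitable) = 0.75
1 - pbinom(2, 8, 0.25)    # 1 - Pr(at most 2 profitable), Pr(profitable) = 0.25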

Bayesian analysis and probability basics

How to get needed probabilities?
Derive them from other probabilities and models; condition on data: Bayes' rule, decomposition and logic, event trees, fault trees, probability theory & models, Monte Carlo simulation models
Make them up (subjective probabilities) or ask others (elicitation): calibration, biases (e.g., over-confidence)
Estimate them from data