Utilities and Decision Theory

Slides:

Advertisements

Similar presentations

CPS Utility theory Vincent Conitzer

Advertisements

Utility theory U: O-> R (utility maps from outcomes to a real number) represents preferences over outcomes ~ means indifference We need a way to talk about.

Making Simple Decisions

Strongly based on Slides by Vincent Conitzer of Duke

Lirong Xia Bayesian networks (2) Thursday, Feb 25, 2014.

Rational choice: An introduction Political game theory reading group 3/ Carl Henrik Knutsen.

91.420/543: Artificial Intelligence UMass Lowell CS – Fall 2010 Lecture 16: MEU / Utilities Oct 8, 2010 A subset of Lecture 8 slides of Dan Klein – UC.

Making Simple Decisions Copyright, 1996 © Dale Carnegie & Associates, Inc. Chapter 16.

Decision Making Under Uncertainty Russell and Norvig: ch 16 CMSC421 – Fall 2006.

CS 188: Artificial Intelligence Fall 2009 Lecture 8: MEU / Utilities 9/22/2009 Dan Klein – UC Berkeley Many slides over the course adapted from either.

CMSC 671 Fall 2003 Class #26 – Wednesday, November 26 Russell & Norvig 16.1 – 16.5 Some material borrowed from Jean-Claude Latombe and Daphne Koller by.

Risk attitudes, normal-form games, dominance, iterated dominance Vincent Conitzer

PGM 2003/04 Tirgul7 Foundations of Decision Theory (mostly from Pearl)

CPS 270: Artificial Intelligence Decision theory Instructor: Vincent Conitzer.

Expectimax Evaluation

Reminder Midterm Mar 7 Project 2 deadline Mar 18 midnight in-class

Quiz 5: Expectimax/Prob review

CS 188: Artificial Intelligence Spring 2007 Lecture 21:Reinforcement Learning: I Utilities and Simple decisions 4/10/2007 Srini Narayanan – ICSI and UC.

Decision theory under uncertainty

Decision Making Under Uncertainty CMSC 671 – Fall 2010 R&N, Chapters , , material from Lise Getoor, Jean-Claude Latombe, and.

Uncertainty ECE457 Applied Artificial Intelligence Spring 2007 Lecture #8.

CS 188: Artificial Intelligence Fall 2008 Lecture 7: Expectimax Search 9/18/2008 Dan Klein – UC Berkeley Many slides over the course adapted from either.

CS 188: Artificial Intelligence Fall 2007 Lecture 8: Expectimax Search 9/20/2007 Dan Klein – UC Berkeley Many slides over the course adapted from either.

Markov Decision Process (MDP)

Making Simple Decisions Chapter 16 Some material borrowed from Jean-Claude Latombe and Daphne Koller by way of Marie desJadines,

CS 188: Artificial Intelligence Fall 2006 Lecture 8: Expectimax Search 9/21/2006 Dan Klein – UC Berkeley Many slides over the course adapted from either.

CS 188: Artificial Intelligence Fall 2008 Lecture 8: MEU / Utilities 9/23/2008 Dan Klein – UC Berkeley Many slides over the course adapted from either.

Chapter 4 Consumer and Firm Behavior: The Work-Leisure Decision and Profit Maximization.

Hypothesis testing and statistical decision theory

Reasoning Under Uncertainty: Belief Networks

CS 188: Artificial Intelligence Spring 2006

Vincent Conitzer CPS Utility theory Vincent Conitzer

QUANTITATIVE BUSINESS ANALYSIS

Ralf Möller Universität zu Lübeck Institut für Informationssysteme

The Economics of Information and Choice Under Uncertainty

ECE457 Applied Artificial Intelligence Fall 2007 Lecture #10

ECE457 Applied Artificial Intelligence Spring 2008 Lecture #10

Business Economics (ECO 341) Fall Semester, 2012

CPS 570: Artificial Intelligence Decision theory

12 Uncertainty.

Expectimax Lirong Xia. Expectimax Lirong Xia Project 2 MAX player: Pacman Question 1-3: Multiple MIN players: ghosts Extend classical minimax search.

CS 4/527: Artificial Intelligence

Decision Theory: Single Stage Decisions

L11 Uncertainty.

Risk Chapter 11.

CS 188: Artificial Intelligence Fall 2009

Rational Decisions and

CS 188: Artificial Intelligence Fall 2008

Utility Theory Decision Theory.

Decision Theory: Single Stage Decisions

13. Acting under Uncertainty Wolfram Burgard and Bernhard Nebel

Vincent Conitzer CPS 173 Utility theory Vincent Conitzer

Instructor: Vincent Conitzer

Vincent Conitzer Utility theory Vincent Conitzer

Expectimax Lirong Xia.

Minimax strategies, alpha beta pruning

Constraint satisfaction problems

Behavioral Finance Economics 437.

Vincent Conitzer Utility theory Vincent Conitzer

CS 188: Artificial Intelligence Fall 2007

Artificial Intelligence Decision theory

Utilities and Decision Theory

Utilities and Decision Theory

Bayesian networks (2) Lirong Xia. Bayesian networks (2) Lirong Xia.

Markov Decision Processes

Constraint satisfaction problems

Bayesian networks (2) Lirong Xia.

Markov Decision Processes

Making Simple Decisions

Presentation transcript:

Utilities and Decision Theory Lirong Xia Spring, 2017

Checking conditional independence from BN graph Given random variables Z1,…Zp, we are asked whether X⊥Y|Z1,…Zp dependent if there exists a path where all triples are active independent if for each path, there exists an inactive triple

General method for variable elimination Compute a marginal probability p(x1,…,xp) in a Bayesian network Let Y1,…,Yk denote the remaining variables Step 1: fix an order over the Y’s (wlog Y1>…>Yk) Step 2: rewrite the summation as Σy1 Σy2 …Σyk-1 Σykanything Step 3: variable elimination from right to left sth only involving X’s sth only involving Y1 and X’s sth only involving Y1, Y2,…,Yk-1 and X’s sth only involving Y1, Y2 and X’s

Today Utility theory expected utility: preferences over lotteries maximum expected utility (MEU) principle

Expectimax Search Trees Max nodes (we) as in minimax search Chance nodes Need to compute chance node values as expected utilities Next class we will formalize the underlying problem as a Markov decision Process

Expectimax Pseudocode Def value(s): If s is a max node return maxValue(s) If s is a chance node return expValue(s) If s is a terminal node return evaluations(s) Def maxValue(s): values = [value(s’) for s’ in successors(s)] return max(values) Def expValue(s): weights = [probability(s, s’) for s’ in successors(s)] return expectation(values, weights) Next agent 应该是指父亲节点。。。

Expectimax Quantities

Maximum expected utility Principle of maximum expected utility: A rational agent should chose the action which maximizes its expected utility, given its (probabilistic) knowledge Questions: Where do utilities come from? How do we know such utilities even exist? What if our behavior can’t be described by utilities?

Inference with Bayes’ Rule Example: diagnostic probability from causal probability: Example: F is fire, {f, ¬f} A is the alarm, {a, ¬a} Note: posterior probability of fire still very small Note: you should still run when hearing an alarm! Why? p(f)=0.001 p(a)=0.1 p(a|f)=0.9

0.009 stay 0.991 100% run out

Utilities Utilities are functions from outcomes (states of the world, sample space) to real numbers that represent an agent’s preferences Where do utilities come from? In a game, may be simple (+1/-1) Utilities summarize the agent’s goals -10100 100 -100

Preferences over lotteries An agent chooses among: Prizes: A, B, etc. Lotteries: situations with uncertain prizes Notation: A p L 1-p B A is strictly preferred to B In difference between A and B A is strictly preferred to or indifferent with B

Utility theory in Economics State of the world: money you earn Money does not behave as a utility function, but we can talk about the utility of having money (or being in debt) Which would you prefer? A lottery ticket that pays out $10 with probability .5 and $0 otherwise, or A lottery ticket that pays out $1 with probability 1 How about: A lottery ticket that pays out $100,000,000 with probability .5 and $0 otherwise, or A lottery ticket that pays out $10,000,000 with probability 1 Usually, people do not simply go by expected value

Which one you would prefer? Lottery A: $1M@100% Lottery B: $1M@89% + $5M@10% + 0@1% How about Lottery A: $1M@11%+0@89% Lottery B: $0@90% + $5M@10%

Encoding preferences over lotteries How many lotteries? infinite! Need to find a compact representation Maximum expected utility (MEU) principle which type of preferences (rankings over lotteries) can be represented by MEU?

Rational Preferences We want some constraints on preferences before we call them rational For example: an agent with intransitive preferences can be induced to give away all of its money If B≻C, then an agent with C would pay (say) 1 cent to get B If A≻B, then an agent with B would pay (say) 1 cent to get A If C≻A, then an agent with A would pay (say) 1 cent to get C

Rational Preferences Preference of a rational agent must obey constraints The axioms of rationality: for all lotteries A, B, C Orderability Transitivity Continuity Substitutability Monotonicity Theorem: rational preferences imply behavior describable as maximization of expected utility

MEU Principle Maximum expected utility (MEU) principle: Theorem: [Ramsey, 1931; von Neumann & Morgenstern, 1944] Given any preference satisfying these axioms, there exists a real-value function U such that: Maximum expected utility (MEU) principle: Choose the action that maximizes expected utility Utilities are just a representation! an agent can be entirely rational (consistent with MEU) without ever representing or manipulating utilities and probabilities Utilities are NOT money

What would you do? -10100 0.009 stay 0.991 100 100% run out -100

Common types of utilities

Risk attitudes An agent is risk-neutral if she only cares about the expected value of the lottery ticket An agent is risk-averse if she always prefers the expected value of the lottery ticket to the lottery ticket Most people are like this An agent is risk-seeking if she always prefers the lottery ticket to the expected value of the lottery ticket

Decreasing marginal utility Typically, at some point, having an extra dollar does not make people much happier (decreasing marginal utility) utility buy a nicer car (utility = 3) buy a car (utility = 2) buy a bike (utility = 1) money $200 $1500 $5000

Maximizing expected utility buy a nicer car (utility = 3) buy a car (utility = 2) buy a bike (utility = 1) money $200 $1500 $5000 Lottery 1: get $1500 with probability 1 gives expected utility 2 Lottery 2: get $5000 with probability .4, $200 otherwise gives expected utility .4*3 + .6*1 = 1.8 (expected amount of money = .4*$5000 + .6*$200 = $2120 > $1500) So: maximizing expected utility is consistent with risk aversion

Different possible risk attitudes under expected utility maximization money Green has decreasing marginal utility → risk-averse Blue has constant marginal utility → risk-neutral Red has increasing marginal utility → risk-seeking Grey’s marginal utility is sometimes increasing, sometimes decreasing → neither risk-averse (everywhere) nor risk-seeking (everywhere)

Example: Insurance Because people ascribe different utilities to different amounts of money, insurance agreements can increase both parties’ expected utility You own a car. Your lottery: LY = [0.8, $0; 0.2, -$200] i.e., 20% chance of crashing You do not want -$200! Insurance is $50 UY(LY) = 0.2*UY(-$200)=-200 UY(-$50)=-150 Amount Your Utility UY $0 -$50 -150 -$200 -1000

Example: Insurance Because people ascribe different utilities to different amounts of money, insurance agreements can increase both parties’ expected utility You own a car. Your lottery: LY = [0.8, $0; 0.2, -$200] i.e., 20% chance of crashing You do not want -$200! UY(LY) = 0.2*UY(-$200)=-200 UY(-$50)=-150 Insurance company buys risk: LI = [0.8, $50; 0.2, -$150] i.e., $50 revenue + your LY Insurer is risk-neutral: U(L) = U(EMV(L)) UI(LI) = U(0.8*50+0.2*(-150)) = U($10) >U($0)

Acting optimally over time Finite number of periods: Overall utility = sum of rewards in individual periods Infinite number of periods: … are we just going to add up the rewards over infinitely many periods? Always get infinity! (Limit of) average payoff: limn→∞Σ1≤t≤nr(t)/n Limit may not exist… Discounted payoff: Σtϒt r(t) for some ϒ < 1 Interpretations of discounting: Interest rate r: ϒ= 1/(1+r) World ends with some probability 1-ϒ Discounting is mathematically convenient We will see more in the next class