Definition and Complexity of Some Basic Metareasoning Problems Vincent Conitzer and Tuomas Sandholm Computer Science Department Carnegie Mellon University.

Slides:



Advertisements
Similar presentations
On allocations that maximize fairness Uriel Feige Microsoft Research and Weizmann Institute.
Advertisements

This Segment: Computational game theory Lecture 1: Game representations, solution concepts and complexity Tuomas Sandholm Computer Science Department Carnegie.
Complexity of manipulating elections with few candidates Vincent Conitzer and Tuomas Sandholm Carnegie Mellon University Computer Science Department.
Computing Kemeny and Slater Rankings Vincent Conitzer (Joint work with Andrew Davenport and Jayant Kalagnanam at IBM Research.)
SA-1 Probabilistic Robotics Planning and Control: Partially Observable Markov Decision Processes.
1 NP-Complete Problems. 2 We discuss some hard problems:  how hard? (computational complexity)  what makes them hard?  any solutions? Definitions 
1 Chapter 12 Value of Information. 2 Chapter 12, Value of information Learning Objectives: Probability and Perfect Information The Expected Value of Information.
© The McGraw-Hill Companies, Inc., Chapter 8 The Theory of NP-Completeness.
Complexity Results about Nash Equilibria
Computational problems, algorithms, runtime, hardness
Hardness Results for Problems P: Class of “easy to solve” problems Absolute hardness results Relative hardness results –Reduction technique.
Definition and Complexity of Some Basic Metareasoning Problems Vincent Conitzer Tuomas Sandholm Presented by Linli He 04/22/2004.
1 Optimization problems such as MAXSAT, MIN NODE COVER, MAX INDEPENDENT SET, MAX CLIQUE, MIN SET COVER, TSP, KNAPSACK, BINPACKING do not have a polynomial.
An Algorithm for Automatically Designing Deterministic Mechanisms without Payments Vincent Conitzer and Tuomas Sandholm Computer Science Department Carnegie.
Dealing with NP-Complete Problems
Computing Shapley Values, Manipulating Value Distribution Schemes, and Checking Core Membership in Multi-Issue Domains Vincent Conitzer and Tuomas Sandholm.
Computational Criticisms of the Revelation Principle Vincent Conitzer, Tuomas Sandholm AMEC V.
AWESOME: A General Multiagent Learning Algorithm that Converges in Self- Play and Learns a Best Response Against Stationary Opponents Vincent Conitzer.
The Theory of NP-Completeness
NP-Complete Problems Problems in Computer Science are classified into
Analysis of Algorithms CS 477/677
Complexity of Mechanism Design Vincent Conitzer and Tuomas Sandholm Carnegie Mellon University Computer Science Department.
Chapter 11: Limitations of Algorithmic Power
1 CSE 417: Algorithms and Computational Complexity Winter 2001 Lecture 22 Instructor: Paul Beame.
CS10 The Beauty and Joy of Computing Lecture #23 : Limits of Computing Thanks to the success of the Kinect, researchers all over the world believe.
NP-complete and NP-hard problems. Decision problems vs. optimization problems The problems we are trying to solve are basically of two kinds. In decision.
Hardness Results for Problems P: Class of “easy to solve” problems Absolute hardness results Relative hardness results –Reduction technique.
Hardness Results for Problems
10/31/02CSE Greedy Algorithms CSE Algorithms Greedy Algorithms.
Arbitrage in Combinatorial Exchanges Andrew Gilpin and Tuomas Sandholm Carnegie Mellon University Computer Science Department.
10/31/02CSE Greedy Algorithms CSE Algorithms Greedy Algorithms.
1.1 Chapter 1: Introduction What is the course all about? Problems, instances and algorithms Running time v.s. computational complexity General description.
MCS312: NP-completeness and Approximation Algorithms
Chapter 11 Limitations of Algorithm Power. Lower Bounds Lower bound: an estimate on a minimum amount of work needed to solve a given problem Examples:
Scott Perryman Jordan Williams.  NP-completeness is a class of unsolved decision problems in Computer Science.  A decision problem is a YES or NO answer.
1 Introduction to Approximation Algorithms. 2 NP-completeness Do your best then.
Approximation Algorithms Pages ADVANCED TOPICS IN COMPLEXITY THEORY.
Tonga Institute of Higher Education Design and Analysis of Algorithms IT 254 Lecture 8: Complexity Theory.
RESOURCES, TRADE-OFFS, AND LIMITATIONS Group 5 8/27/2014.
MIT and James Orlin1 NP-completeness in 2005.
Automated Design of Multistage Mechanisms Tuomas Sandholm (Carnegie Mellon) Vincent Conitzer (Carnegie Mellon) Craig Boutilier (Toronto)
Mechanism design for computationally limited agents (previous slide deck discussed the case where valuation determination was complex) Tuomas Sandholm.
Week 10Complexity of Algorithms1 Hard Computational Problems Some computational problems are hard Despite a numerous attempts we do not know any efficient.
P, NP, and Exponential Problems Should have had all this in CS 252 – Quick review Many problems have an exponential number of possibilities and we can.
EMIS 8373: Integer Programming NP-Complete Problems updated 21 April 2009.
1 CPSC 320: Intermediate Algorithm Design and Analysis July 28, 2014.
CSCI 3160 Design and Analysis of Algorithms Tutorial 10 Chengyu Lin.
1 The Theory of NP-Completeness 2 Cook ’ s Theorem (1971) Prof. Cook Toronto U. Receiving Turing Award (1982) Discussing difficult problems: worst case.
Combinatorial Auctions with Structured Item Graphs Vincent Conitzer, Jonathan Derryberry, and Tuomas Sandholm Computer Science Department Carnegie Mellon.
Complexity of Determining Nonemptiness of the Core Vincent Conitzer, Tuomas Sandholm Computer Science Department Carnegie Mellon University.
Umans Complexity Theory Lectures Lecture 1a: Problems and Languages.
Automated Mechanism Design Tuomas Sandholm Presented by Dimitri Mostinski November 17, 2004.
NP-Complete Problems. Running Time v.s. Input Size Concern with problems whose complexity may be described by exponential functions. Tractable problems.
Beauty and Joy of Computing Limits of Computing Ivona Bezáková CS10: UC Berkeley, April 14, 2014 (Slides inspired by Dan Garcia’s slides.)
CS10: The Beauty and Joy of Computing Lecture #22 Limits of Computing Warning sign posted at Stern Hall. Also, Apple releases new operating.
Optimization Problems
CPS Computational problems, algorithms, runtime, hardness (a ridiculously brief introduction to theoretical computer science) Vincent Conitzer.
CS6045: Advanced Algorithms NP Completeness. NP-Completeness Some problems are intractable: as they grow large, we are unable to solve them in reasonable.
The Beauty and Joy of Computing Lecture #23 Limits of Computing Researchers at Facebook and the University of Milan found that the avg # of “friends” separating.
Chapter 11 Introduction to Computational Complexity Copyright © 2011 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 1.
CS216: Program and Data Representation University of Virginia Computer Science Spring 2006 David Evans Lecture 8: Crash Course in Computational Complexity.
NP Completeness Piyush Kumar. Today Reductions Proving Lower Bounds revisited Decision and Optimization Problems SAT and 3-SAT P Vs NP Dealing with NP-Complete.
CSC 413/513: Intro to Algorithms
Great Theoretical Ideas in Computer Science.
Computing Shapley values, manipulating value division schemes, and checking core membership in multi-issue domains Vincent Conitzer, Tuomas Sandholm Computer.
Chapter 10 NP-Complete Problems.
Complexity of Determining Nonemptiness of the Core
Chapter 11 Limitations of Algorithm Power
CPS 173 Computational problems, algorithms, runtime, hardness
Reinforcement Learning Dealing with Partial Observability
Presentation transcript:

Definition and Complexity of Some Basic Metareasoning Problems Vincent Conitzer and Tuomas Sandholm Computer Science Department Carnegie Mellon University

Bounded rationality: descriptive vs. normative Full rationality requires doing all possibly useful deliberation –Computation, information gathering actions, etc. In many settings, not all possibly useful deliberation can be done because of time/other constraints –E.g. full rationality requires solving a (too) hard SAT instance This leads to bounded rationality –Try to do as well as you can given the constraint Two strands of research on bounded rationality: –Descriptive: How do humans deal with the constraint? –Normative (prescriptive): How should the constraint be dealt with? Decision theory The normative strand is useful in the design of agents

Metareasoning Normative bounded rationality requires (optimally) deciding which deliberation actions to take This is a computational problem called metareasoning –“Reasoning about how to do the reasoning” But, of course, … The metareasoning problem itself may be hard! Widely acknowledged, but there is a lack of canonical metareasoning problems in the literature We defined some canonical metareasoning problems and studied their complexity –Any real-world metareasoning system must likely deal with one or more of these problems (in addition to others)

Three problems that we study PERFORMANCE-PROFILES –How do you distribute your deliberation time over multiple disjoint problems? ACTION-EVALUATION –How do you distribute your deliberation time over multiple options available to you? STATE-DISAMBIGUATION –What is the best plan for trying to discern the state of the world?

PERFORMANCE PROFILES

What is a performance profile? Assume that we can perfectly predict how the quality of the solution will improve with the computation time we spend on it This function is a performance profile (curve) deliberation time solution quality

The performance-profiles problem Suppose: –We have multiple problems to compute on; –Our utility is the sum of the solution qualities. For example: a newspaper company has to solve vehicle routing problems to deliver the newspaper orders for the day –One for each of the cities it delivers in –Has until 5am to improve the solutions –Wants to achieve the maximum (overall) savings over standard route How should it schedule its computation time? deliberation time savings

Complexity result Theorem. PERFORMANCE-PROFILES is NP-complete. –Even with piecewise linear performance profiles Reduction is from KNAPSACK Visual proof: deliberation time solution quality v1 c1 v2 c2 But: solvable in polynomial time with concave profiles –[Boddy and Dean, 94]

ACTION EVALUATION

An action evaluation tree A robot is considering whether to dig for gold The prior probability of there being gold is 1/8 The robot can test for gold before digging –Alternative interpretation: the robot could reason about whether there is gold Say that: P(test positive|gold) = 14/15 P(test positive|no gold) = 1/15 Then (using Bayes’ rule) P(test positive) = 7/40 P(gold|test positive) = 2/3 P(gold|test negative) = 1/99 Say the utility of finding gold is 5 Then the action evaluation tree to the right represents the predicament In general, action evaluation trees can have multiple tests/reasoning steps 7/40 33/40 5/8 10/3 5/99 test N = expected utility of digging at this point M = transition probability

The action evaluation problem: example Now suppose the robot could dig in (at most one of) multiple locations, A, B, and C A may have gold (as before) B may have silver, C may have copper (For simplicity of presentation) B and C have errorless tests Different tests may take different amounts of time –Say, tests at A and C take 2 units, test at B 3 units –5 units of time available for testing 7/40 33/40 5/8 10/3 5/99 test N = expected utility of digging here now M = transition probability 1/2 3/2 3 0 test 1/ test A B C Optimal plan: test at B first; if positive, test at A; otherwise at C –Maximizes expected utility

General definition and complexity The general ACTION-EVALUATION problem: Given a set of performance profile trees, what is the expected utility of the best deliberation plan? Theorem. ACTION-EVALUATION is NP-hard even when no tree has depth greater than 1, branching factor 2, and all leaf values -1, 0, or 1. Open question. Is the general problem even harder (for example, PSPACE-hard?)

STATE DISAMBIGUATION

State disambiguation: example A robot encounters a gap in the floor in front of it The gap can be a staircase (S), hole (H), or canyon (C) There are three actions: descend (d), jump (j), or walk away (w) u(S, d) = 2 (new floors are very interesting) u(H, j) = 1 (passing a hole is somewhat interesting) u(S, w) = u(H, w) = u(C, w) = 0 u(S, j) = u(H, d) = u(C, j) = u(C, d) = -infinity (the robot is destroyed)

State disambiguation: example (cont.) To find out what the robot is in front of, it has some tests/queries that it can run Test (1) answers: “Am I inside a building?” –“Yes” is consistent only with S –“No” is consistent with H, C, and S (staircases occur outside, too) Test (2) answers: “If I drop an item in front of me, do I hear a sound?” –“Yes” is consistent with S, H –“No” is consistent with C, H (some holes are very deep) Test (3) answers: “Can I walk around the gap?” –“Yes” is consistent with S, H –“No” is consistent with S, H, C (the robot may be prevented from walking around by a wall in any case) Test (3) Test (1) Test (2) Yes No Yes No

State disambiguation: solutions Given the number of tests the robot can do… … the robot wants to set a contingency plan for which tests to do… … to maximize expected utility (of the best action to take after testing) On the right are the optimal plans with time for one or two tests Test (1) Deadline Yes No Test (1) Deadline Yes No Test (3) No Test (2) No Yes Optimal 2- step plan Optimal 1-step plan

The general state-disambiguation problem (There are some obvious generalizations of this, all of which are at least as hard as the basic variant.) We are given: –A set of possible world states, with a prior distribution –A utility for every state (how much is it worth to know (for certain) the world is in that state) Not knowing the state for certain always gives utility 0 –A set of queries (or tests), each with a set of answers For each query, for each state, some (at least 1) answers are consistent with that state, others are not; one of the consistent answers is chosen at random –A deadline giving the maximum number of reasoning steps We are asked what the expected utility of the best plan is

Complexity results Theorem. STATE-DISAMBIGUATION is NP-hard even when there is only one consistent answer for each state, for each query. –Proof reduces from SET-COVER Theorem. STATE-DISAMBIGUATION is PSPACE-hard in general. –Proof reduces from STOCHASTIC SATISFIABILITY

Conclusions & future research We defined 3 fundamental metareasoning problems –Most real-world metareasoning systems would need to deal with one or more of these problems We showed high computational complexity in solving these problems –Metareasoning policies directly suggested by decision theory not always feasible Future research should address how to deal with this complexity in metareasoning systems –Approximation algorithms –Algorithms for easy special cases –Algorithms that usually run fast –Meta-metareasoning –…

Thank you for your attention!