4/8: Cost Propagation & Partialization

Presentation transcript:

4/8: Cost Propagation & Partialization
Today’s lesson: Beware of solicitous suggestions from juvenile cosmetologists. Exhibit A: Abe Lincoln. Exhibit B: Rao.
Next class: LPG (ICAPS 2003 paper). *READ* it before coming.
Homework on SAPA coming from Vietnam.

Multi-objective search
- Multi-dimensional nature of plan quality in metric temporal planning:
  - Temporal quality (e.g. makespan, slack)
  - Plan cost (e.g. cumulative action cost, resource consumption)
- Necessitates multi-objective optimization:
  - Modeling objective functions
  - Tracking different quality metrics and heuristic estimation
- Challenge: There may be inter-dependent relations between the different quality metrics

Example
- Option 1: Tempe → Phoenix (Bus) → Los Angeles (Airplane)
  - Less time: 3 hours; more expensive: $200
- Option 2: Tempe → Los Angeles (Car)
  - More time: 12 hours; less expensive: $50
- Given a deadline constraint (6 hours), only option 1 is viable
- Given a money constraint ($100), only option 2 is viable
[Figure: map of Tempe, Phoenix, and Los Angeles]
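The effect of the two constraints can be checked with a minimal sketch. The `Plan` class and the `viable` function are hypothetical names introduced for illustration; only the hours and dollar figures come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    hours: float
    dollars: float


# The two options from the slide.
OPTIONS = [
    Plan("Option 1: bus to Phoenix, airplane to Los Angeles", hours=3, dollars=200),
    Plan("Option 2: car to Los Angeles", hours=12, dollars=50),
]


def viable(plans, deadline=None, budget=None):
    """Keep the plans that meet the optional deadline (hours) and budget (dollars)."""
    return [
        p for p in plans
        if (deadline is None or p.hours <= deadline)
        and (budget is None or p.dollars <= budget)
    ]


print([p.name for p in viable(OPTIONS, deadline=6)])   # only option 1 survives
print([p.name for p in viable(OPTIONS, budget=100)])   # only option 2 survives
```

With both constraints at once, no option survives, which is why the two objectives cannot be treated independently.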

Solution Quality in the presence of multiple objectives
- When we have multiple objectives, it is not clear how to define the global optimum
  - E.g. how does a plan with lower cost but longer makespan compare to one with higher cost but shorter makespan?
- Problem: We don’t know what the user’s utility metric is as a function of cost and makespan.

Solution 1: Pareto Sets
- Present Pareto sets/curves to the user
- A Pareto set is a set of non-dominated solutions
  - A solution S1 is dominated by another solution S2 if S1 is worse than S2 in at least one objective and equal or worse in all other objectives. E.g. a plan that is both costlier and longer than another plan is dominated by it.
  - A travel agent shouldn’t bother asking whether I would like a flight that leaves at 6pm, arrives at 9pm, and costs $100, or another one that also leaves at 6 and arrives at 9 but costs $200.
- A Pareto set is exhaustive if it contains all non-dominated solutions
- Presenting the Pareto set allows users to state their preferences implicitly, by choosing what they like, rather than by stating them explicitly.
- Problem: Exhaustive Pareto sets can be large (non-finite in many cases).
  - In practice, travel agents give you non-exhaustive Pareto sets, just so you have the illusion of choice
- Optimizing with Pareto sets changes the nature of the problem: you are looking for multiple solutions rather than a single one.
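The dominance test can be sketched in a few lines. This is an illustrative sketch, assuming every objective is to be minimized and each solution is a tuple of objective values; the names `dominates` and `pareto_set` and the sample data are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_set(solutions):
    """Return the solutions that no other solution dominates (quadratic-time scan)."""
    return [s for s in solutions if not any(dominates(o, s) for o in solutions)]


# (duration in hours, price in dollars); the first two mirror the travel-agent example.
flights = [(3, 100), (3, 200), (12, 50), (13, 60)]
print(pareto_set(flights))  # [(3, 100), (12, 50)]
```

The scan compares every pair, which is fine for a handful of candidate plans but would need a smarter data structure for large solution sets.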

Solution 2: Aggregate Utility Metrics
- Combine the various objectives into a single utility measure
  - E.g. w1*cost + w2*make-span
    - Could model grad students’ preferences, with w1 = infinity, w2 = 0
  - Log(cost) + 5*(Make-span)^25
    - Could model Bill Gates’ preferences.
- How do we assess the form of the utility measure (linear? nonlinear?), and how will we get the weights?
  - Utility elicitation process
  - Learning problem: Ask tons of questions to the users and learn a utility function that fits their preferences
    - Can be cast as a sort of learning task (e.g. learn a neural net that is consistent with the examples)
    - Of course, if you want to learn a truly nonlinear preference function, you will need many more examples, and the training takes much longer.
- With aggregate utility metrics, the multi-objective optimization, in theory, reduces to a single-objective optimization problem
  - *However*, if you are trying to find good heuristics to direct the search, then since estimators are likely to be available for naturally occurring factors of the solution quality, rather than random combinations thereof, we still have to follow a two-step process:
    1. Find estimators for each of the factors
    2. Combine the estimates using the utility measure
THIS IS WHAT WE WILL DO IN THE NEXT FEW SLIDES
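A small sketch of how an aggregate measure turns the choice into a single-objective one. The function names and the weight values are assumptions chosen for illustration, and the measure is treated as something to minimize; the two plans reuse the cost and makespan figures of the earlier travel example.

```python
def linear_measure(cost, makespan, w1, w2):
    """Weighted sum w1*cost + w2*makespan; lower is better in this sketch."""
    return w1 * cost + w2 * makespan


# name -> (cost in dollars, makespan in hours)
plans = {"bus+airplane": (200, 3), "car": (50, 12)}


def best_plan(plans, measure):
    """Pick the plan name with the smallest aggregate measure."""
    return min(plans, key=lambda name: measure(*plans[name]))


# Hypothetical weight settings for three kinds of user.
frugal = best_plan(plans, lambda c, m: linear_measure(c, m, w1=1.0, w2=0.0))
hurried = best_plan(plans, lambda c, m: linear_measure(c, m, w1=0.0, w2=1.0))
balanced = best_plan(plans, lambda c, m: linear_measure(c, m, w1=1.0, w2=20.0))
print(frugal, hurried, balanced)  # car bus+airplane bus+airplane
```

Changing only the weights flips the preferred plan, which is why the elicitation of the weights matters as much as the form of the measure.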

Our approach
- Using the Temporal Planning Graph (Smith & Weld) structure to track the time-sensitive cost function:
  - Estimation of the earliest time (makespan) to achieve all goals.
  - Estimation of the lowest cost to achieve the goals
  - Estimation of the cost to achieve the goals given a specific makespan value.
- Using this information to calculate the heuristic value for an objective function involving both time and cost
- New issue: How to propagate cost over planning graphs?

The (Relaxed) Temporal PG
[Figure: relaxed temporal planning graph over Tempe, Phoenix, and Los Angeles with actions Drive-car(Tempe,LA), Heli(T,P), Shuttle(T,P), Airplane(P,LA) at time points t = 0, 0.5, 1, 1.5, 10]

Time-sensitive Cost Function
- The standard (temporal) planning graph (TPG) shows the time-related estimates, e.g. earliest time to achieve a fact or to execute an action
- The TPG does not show the cost estimates to achieve facts or execute actions
Actions:
- Shuttle(Tempe,Phx): Cost: $20; Time: 1.0 hour
- Helicopter(Tempe,Phx): Cost: $100; Time: 0.5 hour
- Car(Tempe,LA): Cost: $100; Time: 10 hours
- Airplane(Phx,LA): Cost: $200; Time: 1.0 hour
[Figure: the temporal planning graph over Tempe, Phoenix, L.A. (Drive-car(Tempe,LA), Heli(T,P), Shuttle(T,P), Airplane(P,LA); t = 0, 0.5, 1, 1.5, 10) next to a cost-versus-time step function with levels ∞, $300, $220, $100]
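One way to hold such a time-sensitive cost function is as a sorted list of breakpoints. The sketch below assumes the breakpoints for reaching L.A. that follow from the action table (helicopter plus airplane by t = 1.5 for $300, shuttle plus airplane by t = 2 for $220, car by t = 10 for $100); the names `BREAKPOINTS` and `cost_at` are hypothetical.

```python
from bisect import bisect_right
from math import inf

# (time in hours, cost in dollars) at which the cost of At(LA) drops.
BREAKPOINTS = [(1.5, 300), (2.0, 220), (10.0, 100)]


def cost_at(t, breakpoints=BREAKPOINTS):
    """Cheapest known cost of achieving the fact by time t (infinite before the first breakpoint)."""
    times = [bt for bt, _ in breakpoints]
    i = bisect_right(times, t)  # number of breakpoints at or before t
    return inf if i == 0 else breakpoints[i - 1][1]


for t in (1.0, 1.5, 2.0, 9.9, 10.0):
    print(t, cost_at(t))
```

The function is a non-increasing step function of time: waiting longer never makes the goal more expensive to reach.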

Estimating the Cost Function
Actions:
- Shuttle(Tempe,Phx): Cost: $20; Time: 1.0 hour
- Helicopter(Tempe,Phx): Cost: $100; Time: 0.5 hour
- Car(Tempe,LA): Cost: $100; Time: 10 hours
- Airplane(Phx,LA): Cost: $200; Time: 1.0 hour
[Figure: the temporal planning graph over Tempe, Phoenix, L.A. (Drive-car(Tempe,LA), Hel(T,P), Shuttle(T,P), Airplane(P,LA); t = 0, 1, 1.5, 2.0, 10) annotated with the curves Cost(At(LA)) and Cost(At(Phx)) = Cost(Flight(Phx,LA)); cost levels ∞, $300, $220, $100, $20]

Cost Propagation
- Issues:
  - At a given time point, each fact is supported by multiple actions
  - Each action has more than one precondition
- Propagation rules:
  - Cost(f,t) = min {Cost(A,t) : f ∈ Effect(A)}
  - Cost(A,t) = Aggregate(Cost(f,t) : f ∈ Pre(A))
    - Sum-propagation: Σ Cost(f,t)
      - The plans for individual preconditions may be interacting
    - Max-propagation: Max {Cost(f,t)}
    - Combination: 0.5 Σ Cost(f,t) + 0.5 Max {Cost(f,t)}
- Other, probably better, ideas could be tried
- We can’t use something like the set-level idea here, because that would entail tracking the costs of subsets of literals
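The propagation rules can be sketched as an event-driven sweep over time. This is an illustrative sketch, not the planner's actual implementation: it assumes each action adds its own execution cost on top of the aggregated precondition cost, it stops when no fact's cost can be improved, and the name `propagate_costs` and the tuple layout of `ACTIONS` are hypothetical. The action data is the travel example from the earlier slides.

```python
import heapq
from math import inf

# (name, preconditions, effects, duration in hours, execution cost in dollars)
ACTIONS = [
    ("Shuttle(T,P)", ["At(Tempe)"], ["At(Phx)"], 1.0, 20),
    ("Heli(T,P)", ["At(Tempe)"], ["At(Phx)"], 0.5, 100),
    ("Drive-car(Tempe,LA)", ["At(Tempe)"], ["At(LA)"], 10.0, 100),
    ("Airplane(P,LA)", ["At(Phx)"], ["At(LA)"], 1.0, 200),
]


def propagate_costs(init_facts, actions, aggregate=max):
    """Return fact -> list of (time, cost) breakpoints; aggregate is max or sum."""
    best = {}    # fact -> cheapest cost found so far
    curves = {}  # fact -> breakpoints where the cost improves
    queue = [(0.0, 0, f) for f in init_facts]  # (time, cost, fact) events
    heapq.heapify(queue)
    while queue:
        t, cost, fact = heapq.heappop(queue)
        if cost >= best.get(fact, inf):
            continue  # no improvement, so nothing new to propagate
        best[fact] = cost
        curves.setdefault(fact, []).append((t, cost))
        for _name, pre, eff, dur, exec_cost in actions:
            # Re-evaluate every action that uses this fact and is now applicable.
            if fact in pre and all(p in best for p in pre):
                action_cost = aggregate(best[p] for p in pre) + exec_cost
                for e in eff:
                    heapq.heappush(queue, (t + dur, action_cost, e))
    return curves


curves = propagate_costs(["At(Tempe)"], ACTIONS)
print(curves["At(LA)"])  # [(1.5, 300), (2.0, 220), (10.0, 100)]
```

Passing `aggregate=sum` gives sum-propagation instead of max-propagation; on this example every action has a single precondition, so both give the same curves.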

Termination Criteria
- Deadline termination: Terminate at time point t if:
  - ∀ goal G: Deadline(G) ≤ t, or
  - ∃ goal G: (Deadline(G) < t) ∧ (Cost(G,t) = ∞)
- Fix-point termination: Terminate at the time point t where we cannot improve the cost of any proposition.
- K-lookahead approximation: At the time t where Cost(g,t) < ∞, repeat k times the process of applying the (set of) actions that can improve the cost functions.
[Figure: cost-versus-time step function with levels ∞, $300, $220, $100 for the actions Drive-car(Tempe,LA), H(T,P), Shuttle(T,P), Plane(P,LA), marking the earliest time point and the cheapest cost at t = 10]
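A minimal sketch of the deadline test, reading the two conditions as alternatives (stop when every deadline has passed, or when some goal has already missed its deadline with no finite cost). The function name and the dictionary-based inputs are assumptions for illustration.

```python
from math import inf


def deadline_termination(t, deadlines, cost_at_t):
    """deadlines: goal -> deadline; cost_at_t: goal -> Cost(goal, t), missing means infinite."""
    all_deadlines_passed = all(d <= t for d in deadlines.values())
    goal_missed = any(
        d < t and cost_at_t.get(g, inf) == inf for g, d in deadlines.items()
    )
    return all_deadlines_passed or goal_missed


# Hypothetical 6-hour deadline on reaching L.A.
print(deadline_termination(2.0, {"At(LA)": 6.0}, {"At(LA)": 220}))  # False
print(deadline_termination(6.0, {"At(LA)": 6.0}, {"At(LA)": 220}))  # True
```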

Heuristic estimation using the cost functions
- If the objective function is to minimize time: h = t_0
- If the objective function is to minimize cost: h = CostAggregate(G, t_∞)
- If the objective function is a function of both time and cost, O = f(time,cost), then: h = min f(t, Cost(G,t)) s.t. t_0 ≤ t ≤ t_∞
- E.g. f(time,cost) = 100·makespan + Cost; then h = 100×2 + 220 = 420 at t_0 ≤ t = 2 ≤ t_∞
- Earliest achieve time: t_0 = 1.5; lowest cost time: t_∞ = 10
- The cost functions carry the information needed to track both the temporal and the cost metrics of the plan, and their inter-dependent relations!
[Figure: Cost(At(LA)) as a step function of time with t_0 = 1.5, t = 2, t_∞ = 10 and cost levels ∞, $300, $220, $100]
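The three cases can be worked through on the L.A. cost function. The sketch below assumes the objective never decreases as time or cost grows, so the minimum over t_0 ≤ t ≤ t_∞ is attained at one of the breakpoints of the step function; `COST_CURVE` and `heuristic` are hypothetical names.

```python
# Breakpoints (time in hours, cost in dollars) of Cost(At(LA), t).
COST_CURVE = [(1.5, 300), (2.0, 220), (10.0, 100)]


def heuristic(curve, objective):
    """Return (h, t): the smallest objective(t, cost) over the breakpoints and where it occurs."""
    return min((objective(t, c), t) for t, c in curve)


h_time = COST_CURVE[0][0]    # minimize time: h = t_0
h_cost = COST_CURVE[-1][1]   # minimize cost: h = Cost(G, t_inf)
h_both, t_best = heuristic(COST_CURVE, lambda time, cost: 100 * time + cost)

print(h_time, h_cost)   # 1.5 100
print(h_both, t_best)   # 420.0 2.0
```

The candidates are 100×1.5 + 300 = 450, 100×2 + 220 = 420, and 100×10 + 100 = 1100, so the shuttle-plus-airplane option at t = 2 gives the heuristic value.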

Heuristic estimation by extracting the relaxed plan
- The relaxed plan satisfies all the goals while ignoring negative interactions:
  - Takes positive interactions into account
  - Gives a base set of actions for possible adjustment according to the neglected (relaxed) information (e.g. negative interactions, resource usage, etc.)
- Need to find a good relaxed plan (among multiple ones) according to the objective function

Heuristic estimation by extracting the relaxed plan
- Initially supported facts: SF = Init state
- Initial goals: G = Init goals \ SF
- Traverse backward, searching for actions that support all the goals. When A is added to the relaxed plan RP:
  - SF = SF ∪ Effects(A)
  - G = (G ∪ Precond(A)) \ Effects(A)
- If the objective function is f(time,cost), then A is selected such that f(t(RP+A), C(RP+A)) + f(t(G_new), C(G_new)) is minimal, where G_new = (G ∪ Precond(A)) \ Effects(A)
- When A is added, use mutexes to set orderings between A and the actions in RP so that fewer causal constraints are violated
[Figure: cost-versus-time step function over Tempe, Phoenix, L.A. with t_0 = 1.5, t = 2, t_∞ = 10 and cost levels ∞, $300, $220, $100; f(t,c) = 100·makespan + Cost]

Heuristic estimation by extracting the relaxed plan
- General algorithm: Traverse backward, searching for actions that support all the goals. When A is added to the relaxed plan RP:
  - Supported facts: SF = SF ∪ Effects(A)
  - Goals: G = (G ∪ Precond(A)) \ SF
- Temporal planning with cost: If the objective function is f(time,cost), then A is selected such that f(t(RP+A), C(RP+A)) + f(t(G_new), C(G_new)) is minimal, where G_new = (G ∪ Precond(A)) \ Effects(A)
- Finally, use mutexes to set orderings between A and the actions in RP so that fewer causal constraints are violated
[Figure: cost-versus-time step function over Tempe, Phoenix, L.A. with t_0 = 1.5, t = 2, t_∞ = 10 and cost levels ∞, $300, $220, $100; f(t,c) = 100·makespan + Cost]
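The backward traversal of the general algorithm can be sketched as follows. This is a simplified illustration: it assumes the supporting action for each fact has already been chosen (for instance by the objective-function test above) and stored in a hypothetical `supporter` table, and that each supporter's effects include the fact it supports. The mutex-based ordering step is left out.

```python
def extract_relaxed_plan(init_state, goals, supporter):
    """supporter: fact -> (action name, preconditions, effects). Returns action names."""
    supported = set(init_state)           # SF = Init state
    open_goals = set(goals) - supported   # G = Init goals \ SF
    plan = []
    while open_goals:
        goal = min(open_goals)            # deterministic pick; any open goal would do
        name, preconds, effects = supporter[goal]
        plan.append(name)
        supported |= set(effects)                           # SF = SF ∪ Effects(A)
        open_goals = (open_goals | set(preconds)) - supported
    plan.reverse()  # actions were collected from the goals backward
    return plan


# Hypothetical supporter choices for the travel example.
SUPPORTER = {
    "At(LA)": ("Airplane(P,LA)", ["At(Phx)"], ["At(LA)"]),
    "At(Phx)": ("Shuttle(T,P)", ["At(Tempe)"], ["At(Phx)"]),
}
print(extract_relaxed_plan(["At(Tempe)"], ["At(LA)"], SUPPORTER))
# ['Shuttle(T,P)', 'Airplane(P,LA)']
```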

Adjusting the Heuristic Values
- Ignored resource-related information can be used to improve the heuristic values (much like +ve and -ve interactions in classical planning)
- Adjusted cost: C = C + Σ_R ⌈(Con(R) - (Init(R) + Pro(R))) / Δ_R⌉ * C(A_R)
- Cannot be applied to admissible heuristics

4/10

Partialization Example
- A1 (duration 10) gives g1 but deletes p
- A3 (duration 8) gives g2 but requires p at start
- A2 (duration 4) gives p at end
- We want g1, g2
- A position-constrained plan A1, A2, A3 has makespan 22
- Order-constrained plan: [et(A1) = st(A3)] [et(A2) <= st(A3) ….
- The best-makespan dispatch of the order-constrained plan has makespan 14+ε
- There could be multiple o.c. plans because of multiple possible causal sources. Optimization will involve going through them all.
[Figure: timelines of A1, A2, A3 for the position-constrained plan, the order-constrained plan with causal links for p, g1, g2 to the goal G, and the best-makespan dispatch]

Problem Definitions
- Position-constrained (p.c.) plan: The execution time of each action is fixed to a specific time point
  - Can be generated more efficiently by state-space planners
- Order-constrained (o.c.) plan: Only the relative orderings between actions are specified
  - More flexible solutions; causal relations between actions
- Partialization: Constructing an o.c. plan from a p.c. plan
[Figure: a p.c. plan with actions at time points t1, t2, t3 and the corresponding o.c. plan, both over facts Q, R, G with initial state {Q} and goal {G}]

Validity Requirements for a partialization
- An o.c. plan P_oc is a valid partialization of a valid p.c. plan P_pc if:
  - P_oc contains the same actions as P_pc
  - P_oc is executable
  - P_oc satisfies all the top-level goals
  - (Optional) P_pc is a legal dispatch (execution) of P_oc
  - (Optional) P_oc contains no redundant ordering relations
[Figure: two actions linked by causal links P and Q, with an additional ordering crossed out as redundant]

Greedy Approximations
- Solving the optimization problem for makespan and number of orderings is NP-hard (Backstrom, 1998)
- Greedy approaches have been considered in classical planning (e.g. [Kambhampati & Kedar, 1993], [Veloso et al., 1990]):
  - Find a causal explanation of correctness for the p.c. plan
  - Introduce just the orderings needed for the explanation to hold

Partialization: A simple example
[Figure: the p.c. plan Pickup(A), Stack(A,B), Pickup(C), Stack(C,D) and its partialization, with labels On(A,B), On(C,D), Holding(B), Holding(C), Hand-empty]

Modeling greedy approaches as value ordering strategies
- Variation of the [Kambhampati & Kedar, 1993] greedy algorithm for temporal planning as value ordering:
  - Supporting variables: S^p_A = A’ such that:
    - et^p_A’ < st^p_A in the p.c. plan P_pc
    - ¬∃ B s.t.: et^p_A’ < et^¬p_B < st^p_A
    - ¬∃ C s.t.: et^p_C < et^p_A’ and C satisfies the two conditions above
  - Ordering and interference variables:
    - ρ^p_AB = ≺ if st^¬p_B > st^p_A
    - θ^r_AA’ = ≻ if st^r_A > et^r_A’ in P_pc; θ^r_AA’ = ⊥ otherwise.
- Key insight: We can capture many of the greedy approaches as specific value ordering strategies on the CSOP encoding

CSOP Variables and values
- Continuous variables:
  - Temporal: st_A; D(st_A) = {0, +∞}, D(st_init) = {0}, D(st_Goals) = {Dl(G)}.
  - Resource level: V^r_A
- Discrete variables:
  - Resource ordering: θ^r_AA’; Dom(θ^r_AA’) = {≺, ≻} or Dom(θ^r_AA’) = {≺, ≻, ⊥}
  - Causal effect: S^p_A; Dom(S^p_A) = {B_1, B_2, …, B_n}, p ∈ E(B_i)
  - Mutex: ρ^p_AA’; Dom(ρ^p_AA’) = {≺, ≻}; p ∈ E(A), ¬p ∈ E(A’) ∪ P(A’)
- Example: Dom(S^Q_A2) = {A_init, A_1}, Dom(S^R_A3) = {A_2}, Dom(S^G_Ag) = {A_3}; θ^R_A1A2, θ^R_A1A3
[Figure: o.c. plan with actions A1, A2, A3 over facts Q, R, G, with initial state {Q} and goal {G}]

Constraints
- Causal-link protection:
  - S^p_A = B ⇒ ∀ A’, ¬p ∈ E(A’): (ρ^p_A’B = ≺)
- Ordering and temporal variables:
  - S^p_A = B ⇒ et^p_B < st^p_A
  - ρ^p_A’B = ≻ ⇒ et^¬p_A > st^p_A’
  - θ^r_AA’ = ≻ ⇒ st^r_A > et^r_A’
- Optional temporal constraints:
  - Goal deadline: st_Ag ≤ t_g
  - Time constraints on individual actions: L ≤ st_A ≤ U
- Resource precondition constraints:
  - For each precondition that compares V^r_A with a constant K using one of {>, <, ≥, ≤, =}, set up one constraint involving all θ^r_AA’
  - Example: Init_r + Σ_A’ … > K if the comparison is >

Modeling Different Objective Functions
- Temporal quality:
  - Minimum makespan: Minimize Max_A (st_A + dur_A)
  - Maximize summation of slacks: Maximize Σ (st^g_Ag - et^g_A); S^g_Ag = A
  - Maximize average flexibility: Maximize Avg(Dom(st_A))
- Fewest orderings:
  - Minimize #(st_A < st_A’)
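For a fixed set of orderings, the makespan objective Max_A (st_A + dur_A) is minimized by starting every action as early as its predecessors allow. The sketch below illustrates this with Python's `graphlib`; the function name, the four-action plan, and the `before` pairs are hypothetical, and an ordering (a, b) is assumed to mean that a must end before b starts.

```python
from graphlib import TopologicalSorter


def earliest_dispatch(durations, before):
    """Return (start times, makespan) for the earliest dispatch of an order-constrained plan."""
    preds = {a: set() for a in durations}
    for a, b in before:
        preds[b].add(a)  # a must finish before b starts
    start = {}
    # static_order() yields every action after all of its predecessors.
    for a in TopologicalSorter(preds).static_order():
        start[a] = max((start[p] + durations[p] for p in preds[a]), default=0.0)
    makespan = max(start[a] + durations[a] for a in durations)
    return start, makespan


durations = {"Load": 1.0, "Drive": 12.0, "Unload": 1.0, "Flight": 3.0}
before = [("Load", "Drive"), ("Drive", "Unload")]
start, makespan = earliest_dispatch(durations, before)
print(start, makespan)
```

Here Flight is unordered with respect to the other three actions, so it runs in parallel and the makespan is set by the Load, Drive, Unload chain.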

Empirical evaluation
- Objective: Demonstrate that a metric temporal planner armed with our approach is able to produce plans that satisfy a variety of cost/makespan tradeoffs.
- Testing problems: Randomly generated logistics problems from TP4 (Haslum & Geffner)
  - Load/unload(package,location): Cost = 1; Duration = 1
  - Drive-inter-city(location1,location2): Cost = 4.0; Duration = 12.0
  - Flight(airport1,airport2): Cost = 15.0; Duration = 3.0
  - Drive-intra-city(location1,location2,city): Cost = 2.0; Duration = 2.0

LPG Discussion: Look at the notes of Week 12 (as they are more up to date)