Planning Graph-based Heuristics for Cost-sensitive Temporal Planning Minh B. Do & Subbarao Kambhampati CSE Department, Arizona State University {binhminh,rao}@asu.edu

Motivation
Multi-dimensional nature of plan quality in metric temporal planning:
– Temporal quality (e.g. makespan, slack)
– Plan cost (e.g. cumulative action cost, resource consumption)
Necessitates multi-objective optimization:
– Modeling objective functions
– Tracking different quality metrics and heuristic estimation
→ Challenge: there may be interdependencies between the different quality metrics

Example
Option 1: Tempe → Phoenix (Bus) → Los Angeles (Airplane)
– Less time: 3 hours; more expensive: $200
Option 2: Tempe → Los Angeles (Car)
– More time: 12 hours; less expensive: $50
Given a deadline constraint (6 hours) → only Option 1 is viable
Given a money constraint ($100) → only Option 2 is viable
[Figure: Tempe, Phoenix, Los Angeles]

General Problem
[Diagram: Problem specification + Objective function → Planner → Good quality solution]
How to design the objective function? (We do not investigate this.)
– User-defined
– Learning the user's utility model
We investigate: given an objective function that involves both time and cost → find heuristics that are sensitive to the cost function

Our approach
Use the Temporal Planning Graph (Smith & Weld) structure to track the time-sensitive cost function:
– Estimate of the earliest time (makespan) to achieve all goals
– Estimate of the lowest cost to achieve the goals
– Estimate of the cost to achieve the goals for a specific makespan value
→ Use this information to calculate the heuristic value for an objective function involving both time and cost

Outline
Action representation and Temporal Planning Graph
Time-sensitive cost functions:
– Cost propagation using the temporal planning graph
– Termination criteria for the cost propagation process
Deriving heuristic values from cost functions:
– Direct calculation
– Heuristic by relaxed plan extraction
Empirical evaluation
Conclusion and future work

Action Representation
Similar to PDDL2.1 Level 3:
– Actions have non-uniform durations and may consume resources
– Preconditions are true at the start point or hold for the action's duration
– Effects occur at the start or end points
[Figure: Load(package,truck,place) with conditions At(package,place), At(truck,place) and effects ¬At(package,place), In(package,truck)]
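A minimal sketch of how such a durative action could be stored. The class name `DurativeAction`, its field names, and the assignment of each condition and effect of Load to a start, over-all, or end slot are assumptions made for illustration; the slide only lists the conditions and effects. The cost and duration of 1 are the Load/unload values from the empirical evaluation slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DurativeAction:
    """Hypothetical container for a PDDL2.1 Level 3 style action."""
    name: str
    duration: float
    cost: float
    start_conditions: frozenset = frozenset()    # must be true at the start point
    overall_conditions: frozenset = frozenset()  # must hold for the whole duration
    start_effects: frozenset = frozenset()       # take effect at the start point
    end_effects: frozenset = frozenset()         # take effect at the end point

    def conditions(self) -> frozenset:
        return self.start_conditions | self.overall_conditions

    def effects(self) -> frozenset:
        return self.start_effects | self.end_effects


# Which slot each literal goes into is an assumption, not stated on the slide.
load = DurativeAction(
    name="Load(package,truck,place)",
    duration=1.0,
    cost=1.0,
    start_conditions=frozenset({"At(package,place)"}),
    overall_conditions=frozenset({"At(truck,place)"}),
    start_effects=frozenset({"not At(package,place)"}),
    end_effects=frozenset({"In(package,truck)"}),
)
```

Keeping start and end effects apart is what lets a temporal planning graph place a fact at the correct time point rather than at the action's start.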

The (Relaxed) Temporal PG
[Figure: relaxed temporal planning graph over Tempe, Phoenix, Los Angeles with actions Drive-car(Tempe,LA), Heli(T,P), Shuttle(T,P), Airplane(P,LA) and time points t = 0, 0.5, 1, 1.5, 10]

Time-sensitive Cost Function
The standard temporal planning graph (TPG) shows time-related estimates, e.g. the earliest time to achieve a fact or to execute an action
The TPG does not show cost estimates for achieving facts or executing actions
Actions:
– Shuttle(Tempe,Phx): Cost: $20; Time: 1.0 hour
– Helicopter(Tempe,Phx): Cost: $100; Time: 0.5 hour
– Car(Tempe,LA): Cost: $100; Time: 10 hours
– Airplane(Phx,LA): Cost: $200; Time: 1.0 hour
[Figure: TPG over Tempe, Phoenix, L.A. with Drive-car(Tempe,LA), Heli(T,P), Shuttle(T,P), Airplane(P,LA) at t = 0, 0.5, 1, 1.5, 10; plot of cost against time with time ticks 0, 1.5, 2, 10 and cost levels ∞, $300, $220, $100]

Estimating the Cost Function
Actions:
– Shuttle(Tempe,Phx): Cost: $20; Time: 1.0 hour
– Helicopter(Tempe,Phx): Cost: $100; Time: 0.5 hour
– Car(Tempe,LA): Cost: $100; Time: 10 hours
– Airplane(Phx,LA): Cost: $200; Time: 1.0 hour
[Figure: TPG over Tempe, Phoenix, L.A. with Drive-car(Tempe,LA), Hel(T,P), Shuttle(T,P), Airplane(P,LA) at t = 0, 0.5, 1, 1.5, 2.0, 10; plots of Cost(At(LA)) and of Cost(At(Phx)) = Cost(Flight(Phx,LA)) against time, with time ticks 0, 0.5, 1, 1.5, 2, 10 and cost levels ∞, $300, $220, $100, $20]
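The step-shaped cost function in the figure can be reproduced with a short sketch. It assumes every action has exactly one precondition and that an action's own cost is added to the cost of its precondition, which is how the figure's $300 and $220 levels arise from the listed action costs. The event-queue formulation and the names `propagate` and `cost_at` are illustrative choices, not the authors' implementation.

```python
import heapq
import math

# (name, precondition, effect, cost in $, duration in hours), taken from the slide.
ACTIONS = [
    ("Shuttle(Tempe,Phx)", "At(Tempe)", "At(Phx)", 20, 1.0),
    ("Helicopter(Tempe,Phx)", "At(Tempe)", "At(Phx)", 100, 0.5),
    ("Car(Tempe,LA)", "At(Tempe)", "At(LA)", 100, 10.0),
    ("Airplane(Phx,LA)", "At(Phx)", "At(LA)", 200, 1.0),
]


def propagate(init_fact, actions=ACTIONS):
    """Return {fact: [(time, cost), ...]}: breakpoints at which a fact gets cheaper."""
    steps = {}
    queue = [(0.0, 0, init_fact)]  # events ordered by (time, cost, fact)
    while queue:
        t, cost, fact = heapq.heappop(queue)
        history = steps.setdefault(fact, [])
        if history and history[-1][1] <= cost:
            continue  # an earlier event already achieved this fact as cheaply
        history.append((t, cost))
        for _, pre, eff, a_cost, dur in actions:
            if pre == fact:
                heapq.heappush(queue, (t + dur, cost + a_cost, eff))
    return steps


def cost_at(history, t):
    """Cheapest known cost at time t; infinity before the first breakpoint."""
    best = math.inf
    for time, cost in history:
        if time <= t:
            best = cost
    return best


steps = propagate("At(Tempe)")
print(steps["At(LA)"])  # breakpoints of Cost(At(LA))
```

Under these assumptions At(LA) first becomes reachable at t = 1.5 for $300 (helicopter plus airplane), drops to $220 at t = 2 (shuttle plus airplane), and reaches its lowest cost of $100 at t = 10 (car).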

Cost Propagation
Issues:
– At a given time point, each fact may be supported by multiple actions
– Each action may have more than one precondition
Propagation rules:
– Cost(f,t) = min {Cost(A,t) : f ∈ Effect(A)}
– Cost(A,t) = Aggregate(Cost(f,t) : f ∈ Pre(A))
Aggregation options:
– Sum-propagation: Σ Cost(f,t)
– Max-propagation: Max {Cost(f,t)}
– Combination: 0.5 Σ Cost(f,t) + 0.5 Max {Cost(f,t)}
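The two propagation rules can be sketched as a pair of small functions. The function names, the rule labels `"sum"`, `"max"`, and `"combo"`, and the example precondition costs of 30 and 50 are hypothetical; only the three aggregation formulas and the min over supporting actions come from the slide.

```python
def aggregate(precondition_costs, rule="sum"):
    """Cost(A,t): combine the costs of an action's preconditions at time t."""
    costs = list(precondition_costs)
    if not costs:
        return 0.0  # an action without preconditions needs nothing achieved first
    if rule == "sum":
        return float(sum(costs))
    if rule == "max":
        return float(max(costs))
    if rule == "combo":
        return 0.5 * sum(costs) + 0.5 * max(costs)
    raise ValueError(f"unknown aggregation rule: {rule}")


def fact_cost(supporting_action_costs):
    """Cost(f,t): the cheapest of the actions that have f as an effect."""
    return min(supporting_action_costs)


# Hypothetical precondition costs for a two-precondition action such as Load.
pre_costs = [30, 50]
print(aggregate(pre_costs, "sum"))    # 80.0
print(aggregate(pre_costs, "max"))    # 50.0
print(aggregate(pre_costs, "combo"))  # 65.0
```

Sum-propagation treats the preconditions as independent and can overestimate when they share supporting actions, while max-propagation never counts shared work twice but can underestimate; the combination sits between the two.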

Termination Criteria
Deadline termination: terminate at time point t if:
– ∀ goal G: Deadline(G) ≤ t
– ∃ goal G: (Deadline(G) < t) ∧ (Cost(G,t) = ∞)
Fix-point termination: terminate at the time point t where the cost of no proposition can be improved any further.
K-lookahead approximation: at the time point t where Cost(g,t) < ∞, repeat k times the process of applying the set of actions that can improve the cost functions.
[Figure: TPG with Drive-car(Tempe,LA), H(T,P), Shuttle(T,P), Plane(P,LA) at t = 0, 0.5, 1, 1.5, 10; plot of cost against time with time ticks 0, 1.5, 2, 10 and cost levels ∞, $300, $220, $100, marking the earliest time point and the cheapest cost]
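A minimal sketch of the deadline termination test. It assumes the two listed conditions are alternatives, so that either one stops propagation; the slide does not state how they combine. The function name, the dictionary arguments, and the example deadlines are hypothetical.

```python
import math


def deadline_terminated(t, deadlines, cost_at_t):
    """Return True if cost propagation may stop at time point t.

    deadlines: {goal: deadline}; cost_at_t: {goal: Cost(goal, t)}.
    Assumption: the two conditions are combined with "or".
    """
    # Condition 1: every goal's deadline has been reached.
    all_deadlines_reached = all(d <= t for d in deadlines.values())
    # Condition 2: some goal's deadline has passed and it is still unreachable.
    some_goal_missed = any(
        d < t and cost_at_t[goal] == math.inf for goal, d in deadlines.items()
    )
    return all_deadlines_reached or some_goal_missed


# Travel example with a 6-hour deadline on At(LA): keep propagating at t = 2.
print(deadline_terminated(2.0, {"At(LA)": 6.0}, {"At(LA)": 220}))       # False
# A 1-hour deadline cannot be met: At(LA) is still unreachable at t = 1.2.
print(deadline_terminated(1.2, {"At(LA)": 1.0}, {"At(LA)": math.inf}))  # True
```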

Heuristic estimation using the cost functions
If the objective function is to minimize time: h = t_0
If the objective function is to minimize cost: h = CostAggregate(G, t_∞)
If the objective function is a function of both time and cost, O = f(time,cost), then:
h = min f(t, Cost(G,t)) s.t. t_0 ≤ t ≤ t_∞
E.g. if f(time,cost) = 100·makespan + Cost, then h = 100 × 2 + 220 = 420, attained at t = 2 (t_0 ≤ 2 ≤ t_∞)
[Figure: plot of Cost(At(LA)) against time with time ticks 0, t_0 = 1.5, 2, t_∞ = 10 and cost levels ∞, $300, $220, $100. Earliest achievement time: t_0 = 1.5. Lowest-cost time: t_∞ = 10.]
The cost functions carry the information needed to track both the temporal and the cost metrics of the plan, as well as their interdependencies.
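The worked example can be checked numerically. The sketch below assumes that f does not decrease as time grows; since Cost(G,t) is a non-increasing step function, the minimum over t_0 ≤ t ≤ t_∞ is then attained at one of its breakpoints, so only those need to be evaluated. The names `STEPS` and `heuristic` are illustrative.

```python
# Breakpoints (time, cost) of Cost(At(LA)) from the slide: t_0 = 1.5, t_inf = 10.
STEPS = [(1.5, 300), (2.0, 220), (10.0, 100)]


def heuristic(steps, f):
    """Return (h, t) minimizing f(t, Cost(G,t)) over the breakpoints.

    Assumes f is non-decreasing in t, so no point strictly between two
    breakpoints can beat the breakpoint to its left.
    """
    return min((f(t, cost), t) for t, cost in steps)


def combined(t, cost):
    return 100 * t + cost  # the slide's f(time, cost) = 100*makespan + Cost


print(heuristic(STEPS, combined))               # (420.0, 2.0)
print(heuristic(STEPS, lambda t, cost: t))      # minimize time: t_0 = 1.5
print(heuristic(STEPS, lambda t, cost: cost))   # minimize cost: $100 at t = 10
```

The three candidate values are 450 at t = 1.5, 420 at t = 2, and 1100 at t = 10, so the combined objective picks the shuttle-plus-airplane point rather than either extreme.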

Heuristic estimation by extracting the relaxed plan
A relaxed plan (Hoffmann) satisfies all the goals while ignoring negative interactions:
– Takes positive interactions into account
– Provides a base set of actions that can be adjusted for the neglected (relaxed) information (e.g. negative interactions, resource usage)
→ Need to find a good relaxed plan (among multiple ones) according to the objective function

Heuristic estimation by extracting the relaxed plan
Initially supported facts: SF = Init state
Initial goals: G = Init goals \ SF
Traverse backward, searching for actions that support all the goals. When A is added to the relaxed plan RP:
– SF = SF ∪ Effects(A)
– G = (G ∪ Precond(A)) \ Effects(A)
If the objective function is f(time,cost), then A is selected such that f(t(RP+A), C(RP+A)) + f(t(G_new), C(G_new)) is minimal, where G_new = (G ∪ Precond(A)) \ Effects(A)
When A is added, use mutexes to order A relative to the actions in RP so that fewer causal constraints are violated
[Figure: Tempe, Phoenix, L.A.; plot of cost against time with time ticks 0, t_0 = 1.5, 2, t_∞ = 10 and cost levels ∞, $300, $220, $100; f(t,c) = 100·makespan + Cost]

Heuristic estimation by extracting the relaxed plan
General algorithm: traverse backward, searching for actions that support all the goals. When A is added to the relaxed plan RP:
– Supported facts: SF = SF ∪ Effects(A)
– Goals: G = (G ∪ Precond(A)) \ Effects(A)
Temporal planning with cost: if the objective function is f(time,cost), then A is selected such that f(t(RP+A), C(RP+A)) + f(t(G_new), C(G_new)) is minimal, where G_new = (G ∪ Precond(A)) \ Effects(A)
Finally, use mutexes to order A relative to the actions in RP so that fewer causal constraints are violated
[Figure: Tempe, Phoenix, L.A.; plot of cost against time with time ticks 0, t_0 = 1.5, 2, t_∞ = 10 and cost levels ∞, $300, $220, $100; f(t,c) = 100·makespan + Cost]
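The backward traversal with its SF and G bookkeeping can be sketched on the travel example. This is a simplification under stated assumptions: the supporter of each open goal is chosen by a stand-in score, f applied to the action's own duration and cost, rather than the slide's f(t(RP+A), C(RP+A)) + f(t(G_new), C(G_new)); the mutex-based ordering step is omitted; and all of SF is subtracted from the goal set so that preconditions already true in the initial state are not reopened. The names `Action`, `extract_relaxed_plan`, and `score` are hypothetical.

```python
from typing import NamedTuple


class Action(NamedTuple):
    name: str
    pre: frozenset
    eff: frozenset
    cost: float
    dur: float


def extract_relaxed_plan(init, goals, actions, score):
    """Greedy backward extraction; returns supporting actions in forward order."""
    supported = set(init)                 # SF = Init state
    open_goals = set(goals) - supported   # G = Init goals \ SF
    plan = []
    while open_goals:
        goal = min(open_goals)            # deterministic pick of an open goal
        candidates = [a for a in actions if goal in a.eff]
        if not candidates:
            raise ValueError(f"no action supports {goal}")
        best = min(candidates, key=score)  # simplified selection criterion
        plan.append(best)
        supported |= best.eff              # SF = SF ∪ Effects(A)
        # Assumption: subtract all of SF, which contains Effects(A).
        open_goals = (open_goals | best.pre) - supported
    plan.reverse()
    return plan


def fs(*facts):
    return frozenset(facts)


ACTIONS = [
    Action("Shuttle(Tempe,Phx)", fs("At(Tempe)"), fs("At(Phx)"), 20, 1.0),
    Action("Helicopter(Tempe,Phx)", fs("At(Tempe)"), fs("At(Phx)"), 100, 0.5),
    Action("Car(Tempe,LA)", fs("At(Tempe)"), fs("At(LA)"), 100, 10.0),
    Action("Airplane(Phx,LA)", fs("At(Phx)"), fs("At(LA)"), 200, 1.0),
]


def score(action):
    return 100 * action.dur + action.cost  # stand-in for f(t,c) = 100*makespan + Cost


plan = extract_relaxed_plan({"At(Tempe)"}, {"At(LA)"}, ACTIONS, score)
print([a.name for a in plan])
```

On this example the sketch selects the shuttle followed by the airplane, with total duration 2 hours and total cost $220, which matches the point t = 2 at which the direct calculation attains its minimum.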

Empirical evaluation
Objective:
– Demonstrate that a metric temporal planner armed with our approach can produce plans that satisfy a variety of cost/makespan tradeoffs
Test problems:
→ Randomly generated logistics problems from TP4 (Haslum & Geffner)
– Load/unload(package,location): Cost = 1; Duration = 1
– Drive-inter-city(location1,location2): Cost = 4.0; Duration = 12.0
– Flight(airport1,airport2): Cost = 15.0; Duration = 3.0
– Drive-intra-city(location1,location2,city): Cost = 2.0; Duration = 2.0

Empirical Results
Results over 20 randomly generated temporal logistics problems that involve moving 4 packages between different locations in 3 cities:
O = f(time,cost) = α·Makespan + (1 − α)·TotalCost

Empirical Results (cont.)
A higher look-ahead option generally produces better results in terms of solving time and plan quality
The relaxed plan heuristic is generally more informative than the direct calculation heuristic

Related Work
TGP and TP4 aim at makespan optimization (they do not consider cost)
MO-GRT does multi-criteria search, but does not exploit the interdependencies between the criteria
ASPEN (JPL) uses an iterative repair technique to improve multi-dimensional plan quality

Conclusion
Introduced time-sensitive cost functions to guide the heuristic search according to objective functions involving both time (makespan) and monetary action cost:
– Propagate the cost functions while building the temporal planning graph
– Extract the heuristic values using the cost functions
– Preliminary experimental results with Sapa show the utility of the time-sensitive cost functions

Future Work
Experiments with domains and problems from the planning competition
Improving the cost functions through better propagation rules and through mutex information when building the temporal planning graph (the TGP approach)
Heuristics for tracking other types of plan quality, such as execution flexibility
Multi-objective search involving non-combinable criteria
