1
Finding Admissible Bounds for Over-subscribed Planning Problems
J. Benton, Menkes van den Briel, Subbarao Kambhampati
Arizona State University
2
Is this plan “good”?
3
How good is a given plan?
How to drive a planner to find a good plan?
Related: admissible heuristics
Need a heuristic schema that admits degrees of relaxation
  Helps per-node use
  Helps one-shot use
Especially important when quality may vary widely, e.g., when we have many soft goals
5
Challenges
1. Build a strong admissible heuristic
   An integer programming (IP) based heuristic
2. Provide a way to add relaxation for varied use
   Use the linear programming (LP) relaxation
6
PSP^UD: Partial Satisfaction Planning with Utility Dependency
Actions have cost
Goal sets have utility
Locations: loc1, loc2

Goal utilities
  utility((at t loc1)) = 10
  utility((at p1 loc2)) = 10
  utility((at t loc1) & (at p1 loc2)) = 60

Example plan
  S0: (at t loc1) (in p1 t)      util(S0): 10            sum cost: 0    net benefit(S0): 10-0=10
      (move t loc2), cost: 20
  S1: (at t loc2) (in p1 t)      util(S1): 0             sum cost: 20   net benefit(S1): 0-20=-20
      (unload p1 loc2), cost: 5
  S2: (at t loc2) (at p1 loc2)   util(S2): 10            sum cost: 25   net benefit(S2): 10-25=-15
      (move t loc1), cost: 20
  S3: (at t loc1) (at p1 loc2)   util(S3): 10+10+60=80   sum cost: 45   net benefit(S3): 80-45=35
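The arithmetic on this slide can be replayed mechanically. The sketch below is only an illustration, not the authors' planner: it encodes the three-action plan as delete/add sets and recomputes the net benefit after each step. It assumes the 60-unit dependency is over (at t loc1) and (at p1 loc2), which is what the slide's total of 10+10+60 for S3 implies; the names `GOAL_UTILITIES`, `PLAN`, `utility`, and `net_benefits` are hypothetical.

```python
# Goal sets and their utilities, as listed on the slide. The 60-unit
# dependency over (at t loc1) and (at p1 loc2) is an assumption inferred
# from util(S3) = 10+10+60.
GOAL_UTILITIES = [
    (frozenset({"(at t loc1)"}), 10),
    (frozenset({"(at p1 loc2)"}), 10),
    (frozenset({"(at t loc1)", "(at p1 loc2)"}), 60),
]

# Each action: (name, cost, facts deleted, facts added).
PLAN = [
    ("(move t loc2)", 20, {"(at t loc1)"}, {"(at t loc2)"}),
    ("(unload p1 loc2)", 5, {"(in p1 t)"}, {"(at p1 loc2)"}),
    ("(move t loc1)", 20, {"(at t loc2)"}, {"(at t loc1)"}),
]

INITIAL_STATE = frozenset({"(at t loc1)", "(in p1 t)"})


def utility(state):
    """Sum the utility of every goal set fully contained in the state."""
    return sum(u for goals, u in GOAL_UTILITIES if goals <= state)


def net_benefits(initial_state, plan):
    """Return net benefit (utility minus summed cost) after each plan prefix."""
    state, total_cost = frozenset(initial_state), 0
    results = [utility(state) - total_cost]
    for _name, cost, deletes, adds in plan:
        state = (state - deletes) | adds
        total_cost += cost
        results.append(utility(state) - total_cost)
    return results


print(net_benefits(INITIAL_STATE, PLAN))  # [10, -20, -15, 35]
```

Stopping after the first or second action is worse than doing nothing (net benefit 10), while the full plan reaches 35. This is the sense in which plan quality can vary widely when goals are soft.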
7
Building a Heuristic
A network flow model on variable transitions
Capture relevant transitions with multi-valued fluents
  add prevail constraints
  add initial states
  add goal states
  add cost on actions
  add utility on goals
[Diagram labels: truck, package; loc1, loc2; cost: 20, cost: 5; util: 10, util: 60]
8
Building a Heuristic
Constraints of this model
1. If an action executes, then all of its effects and prevail conditions must also.
2. If a fact is deleted, then it must be added to re-achieve a value.
3. If a prevail condition is required, then it must be achieved.
4. A goal utility dependency is achieved if its goals are achieved.
[Diagram labels: truck, package; cost: 20, cost: 5; util: 10, util: 60, util: 10]
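Constraint 2 can be read as a flow-balance condition on each value of a state variable: one unit of flow if the value holds initially, plus the number of effects that add it, must equal the number of effects that delete it, plus one if it is the end value. The sketch below checks that condition for the truck variable in the running example. It is a minimal illustration under that reading; the function name `flow_balanced` and the encoding of transitions as (from, to) pairs are assumptions made here, not part of the presentation.

```python
from collections import Counter


def flow_balanced(initial_value, transitions, end_value, domain):
    """Check the flow-balance reading of constraint 2 for one state variable.

    transitions: list of (from_value, to_value) pairs, one per executed effect.
    Returns True if, for every value f in the domain,
    [f is initial] + #effects adding f == #effects deleting f + [f is end value].
    """
    adds = Counter(to for _frm, to in transitions)
    deletes = Counter(frm for frm, _to in transitions)
    for f in domain:
        inflow = (1 if f == initial_value else 0) + adds[f]
        outflow = deletes[f] + (1 if f == end_value else 0)
        if inflow != outflow:
            return False
    return True


truck_domain = ["loc1", "loc2"]
# (move t loc2) followed by (move t loc1)
truck_transitions = [("loc1", "loc2"), ("loc2", "loc1")]

ends_at_loc1 = flow_balanced("loc1", truck_transitions, "loc1", truck_domain)
ends_at_loc2 = flow_balanced("loc1", truck_transitions, "loc2", truck_domain)
print(ends_at_loc1, ends_at_loc2)  # True False
```

The check only counts transitions and never looks at the order in which they occur, which is one reason a model built this way is a relaxation of the planning problem.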
9
Formulation

Variables
  $\mathrm{action}(a) \in \mathbb{Z}^+$: the number of times $a \in A$ is executed
  $\mathrm{effect}(a,v,e) \in \mathbb{Z}^+$: the number of times a transition $e$ in state variable $v$ is caused by action $a$
  $\mathrm{prevail}(a,v,f) \in \mathbb{Z}^+$: the number of times a prevail condition $f$ in state variable $v$ is required by action $a$
  $\mathrm{endvalue}(v,f) \in \{0,1\}$: equal to 1 if value $f$ is the end value in state variable $v$
  $\mathrm{goaldep}(k)$: equal to 1 if goal dependency $k$ is achieved

Parameters
  $\mathrm{cost}(a)$: the cost of executing action $a \in A$
  $\mathrm{utility}(v,f)$: the utility of achieving value $f$ in state variable $v$
  $\mathrm{utility}(k)$: the utility of achieving goal dependency $G_k$

Constraints
1. If an action executes, then all of its effects and prevail conditions must also.
   $\mathrm{action}(a) = \sum_{\text{effects of } a \text{ in } v} \mathrm{effect}(a,v,e) + \sum_{\text{prevails of } a \text{ in } v} \mathrm{prevail}(a,v,f)$
2. If a fact is deleted, then it must be added to re-achieve a value.
   $\mathbf{1}\{f \in s_0[v]\} + \sum_{\text{effects that add } f} \mathrm{effect}(a,v,e) = \sum_{\text{effects that delete } f} \mathrm{effect}(a,v,e) + \mathrm{endvalue}(v,f)$
3. If a prevail condition is required, then it must be achieved ($M$ is a sufficiently large constant).
   $\mathbf{1}\{f \in s_0[v]\} + \sum_{\text{effects that add } f} \mathrm{effect}(a,v,e) \ge \mathrm{prevail}(a,v,f) / M$
4. A goal utility dependency is achieved if its goals are achieved.
   $\mathrm{goaldep}(k) \ge \sum_{f \text{ in dependency } k} \mathrm{endvalue}(v,f) - (|G_k| - 1)$
   $\mathrm{goaldep}(k) \le \mathrm{endvalue}(v,f) \quad \forall f \text{ in dependency } k$
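Constraint 4 is the usual linear encoding of a logical AND over binary variables. The sketch below enumerates every 0/1 assignment of two end values and lists which values of goaldep(k) satisfy both inequalities, reading the lower bound as the sum of end values minus (|G_k| - 1). The helper name `feasible_goaldep_values` is hypothetical, and the two-goal dependency is chosen only to match the running example.

```python
from itertools import product


def feasible_goaldep_values(endvalues):
    """Return the 0/1 values of goaldep(k) allowed by constraint 4.

    endvalues: tuple of 0/1 end values, one per goal in dependency k.
    """
    n = len(endvalues)  # |G_k|
    feasible = []
    for g in (0, 1):
        lower_ok = g >= sum(endvalues) - (n - 1)
        upper_ok = all(g <= e for e in endvalues)
        if lower_ok and upper_ok:
            feasible.append(g)
    return feasible


# Map each assignment of the two end values to the allowed goaldep values.
table = {ev: feasible_goaldep_values(ev) for ev in product((0, 1), repeat=2)}
for ev, allowed in table.items():
    print(ev, allowed)
# (0, 0) [0]
# (0, 1) [0]
# (1, 0) [0]
# (1, 1) [1]
```

With integer variables the dependency variable is therefore forced to 1 exactly when every goal in the dependency is an end value. Once the integrality requirement is relaxed, fractional values become possible, so the same inequalities only bound goaldep(k) rather than fix it.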
10
Formulation

Objective Function
Maximize Net Benefit
  $\max \sum_{v \in V,\, f \in D_v} \mathrm{utility}(v,f)\,\mathrm{endvalue}(v,f) + \sum_{k \in K} \mathrm{utility}(k)\,\mathrm{goaldep}(k) - \sum_{a \in A} \mathrm{cost}(a)\,\mathrm{action}(a)$
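To make the objective concrete, the sketch below evaluates it for two candidate assignments taken from the truck-and-package example: the full three-action plan and the empty plan. It is a plain evaluation of the formula, not a solver. The function name `net_benefit_objective`, the dictionary encoding, and the dependency key `"k1"` are assumptions introduced for illustration.

```python
def net_benefit_objective(utility_vf, endvalue, utility_k, goaldep, cost, action):
    """Evaluate: goal utility + dependency utility - action cost.

    Missing entries in endvalue, goaldep, or action are treated as 0.
    """
    goal_term = sum(u * endvalue.get(key, 0) for key, u in utility_vf.items())
    dep_term = sum(u * goaldep.get(k, 0) for k, u in utility_k.items())
    cost_term = sum(c * action.get(a, 0) for a, c in cost.items())
    return goal_term + dep_term - cost_term


# Parameters from the running example (keys are hypothetical encodings).
utility_vf = {("truck", "loc1"): 10, ("package", "loc2"): 10}
utility_k = {"k1": 60}
cost = {"(move t loc2)": 20, "(unload p1 loc2)": 5, "(move t loc1)": 20}

# Full plan: every action once, both goals and the dependency achieved.
full_plan = net_benefit_objective(
    utility_vf,
    {("truck", "loc1"): 1, ("package", "loc2"): 1},
    utility_k,
    {"k1": 1},
    cost,
    {a: 1 for a in cost},
)

# Empty plan: the truck stays at loc1, nothing else is achieved.
empty_plan = net_benefit_objective(
    utility_vf, {("truck", "loc1"): 1}, utility_k, {}, cost, {}
)

print(full_plan, empty_plan)  # 35 10
```

Both values agree with the net benefits of S3 and S0 in the earlier example slide. Because the function accepts any numeric values, it can also score fractional assignments of the kind a linear programming relaxation may return.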
11
Experimental Setup
Three modified IPC 3 domains: zenotravel, satellite, rovers (maximize net benefit)
One IPC 5 domain: Rovers, simple preferences (minimize (goal achievement violations + action cost))
Compared with a cost propagation-based heuristic
Heuristic value at initial state versus optimal plan
  Optimal plan found using a branch and bound search
Expected ordering
  Maximizing: LP > IP > OPTIMAL
  Minimizing: LP < IP < OPTIMAL
12
Results
14
[Figure: results panels labeled IP and LP]
15
Summary
IP gives bound on quality of plan
Doubly relaxed (LP) to provide heuristic for search (Search I Session: Monday at 4:10 pm)
16
Future Work
Improve encoding (to give better LP values)
Use fluent merging