1
SOCS 1 GamePlan: AN ADVERSARIAL PLANNER IN MULTI-AGENT SYSTEMS (A Preliminary Report). Wenjin Lue and Kostas Stathis, City University, London. Dagstuhl, 26th November, 2002
2
SOCS 2 Content
–Motivations and Background
–GamePlan, an extension of GraphPlan for Adversarial Planning in MAS
–Further Work
3
SOCS 3 Background and Motivation SOCS (Societies of ComputeeS) Project: Aims at providing a computational logic model for the description, analysis and verification of global and open Societies of Heterogeneous ComputeeS. What is a computee?
4
SOCS 4 A computee is interpreted as an interactive agent with an internal (mental) state consisting of
–Knowledge base (or beliefs): the total knowledge that the computee has,
–Goals: the set of goals that the computee has decided it wants (desires),
–Plans: the “concrete” actions with which the computee plans (intends) to satisfy its goals.
5
SOCS 5 Interaction among computees may be
–Adversarial: a computee competes against other computees in a society for its own goals.
–Co-operative: computees co-operate with each other for common goals in a society.
A plan in SOCS is a conditional plan (not surprising, given the uncertainty of interaction among computees). How to make a plan?
6
SOCS 6 Planning in SOCS
Uncertainty: a planner for a computee has to take other computees’ behaviours into consideration. Existing planners for uncertain domains are not enough, because most of them deal with situations where the actions themselves are uncertain. In our case, the uncertainty comes from the interactions among computees.
SOLUTION: extend GraphPlan [Blum&Furst97] to deal with interactions.
7
SOCS 7 GraphPlan Overview [Blum&Furst97]
What does GraphPlan do?
–Takes a STRIPS domain as input:
Operators, each with preconditions, add-effects and delete-effects
Initial conditions: a conjunction of facts
Goals: a conjunction of facts
–Finds partial-order plans
8
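SOCS 8 GraphPlan Overview (cont.)
Planning Graph and its Expansion
–Mutex (mutual exclusion relation): two actions are mutex if one deletes a precondition of the other, if one deletes an effect of the other, or if they have mutex preconditions.
[Figure: fragment of a planning graph over levels 0, ..., i-1, i, i+1, with proposition nodes P, Q, R, S, T, U, action nodes a, b, c, d, e and NO-OP actions. Legend: proposition node, action node, precondition edge, add edge, delete edge, mutex actions, mutex propositions.]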
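The three mutex conditions for actions named on this slide can be sketched as a small check. This is an illustrative sketch only: the `Action` tuple, `interferes`, `actions_mutex` and the `mutex_props` argument are hypothetical names, not taken from GraphPlan or from the slides.

```python
from typing import NamedTuple


class Action(NamedTuple):
    """A STRIPS-style action (hypothetical representation)."""
    name: str
    pre: frozenset      # preconditions
    add: frozenset      # add-effects
    delete: frozenset   # delete-effects


def interferes(a: Action, b: Action) -> bool:
    """True if a deletes a precondition or an add-effect of b."""
    return bool(a.delete & (b.pre | b.add))


def actions_mutex(a: Action, b: Action, mutex_props=frozenset()) -> bool:
    """Check the three conditions listed on the slide.

    mutex_props is assumed to be a set of frozenset pairs of propositions
    already known to be mutex at the previous proposition level.
    """
    if interferes(a, b) or interferes(b, a):
        return True
    return any(frozenset((p, q)) in mutex_props for p in a.pre for q in b.pre)
```

For instance, an action that deletes P is mutex with any action that needs P as a precondition, whatever `mutex_props` contains.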
9
SOCS 9 GraphPlan Overview (cont.)
Solution Extraction
–If the planning graph has reached a proposition level at which all the goal facts are present and no pair of them is mutex,
–then extract a solution by backtracking.
Assume S and T are the goal propositions. Then {b, d} and {c, d} are both local solutions. Select {b, d}; then {P, R} forms a new goal for the next level.
[Figure: last two levels of the planning graph, with propositions P, Q, R, actions a, b, c, d and goal propositions S, T.]
10
SOCS 10 GraphPlan Overview (cont.)
Algorithm (High Level)
While no plan is found, do
–Graph-expansion phase: extend the planning graph forward until all goal facts are present at the last level.
–Solution-extraction phase: search the resulting graph for a correct plan.
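As a rough illustration of the graph-expansion phase only, the sketch below grows proposition levels until every goal fact is present. It is a simplification under stated assumptions: mutex relations and the solution-extraction phase are left out, delete-effects are ignored because NO-OPs carry every proposition forward, and the function name, the `(name, pre, add)` action tuples and `max_levels` are all hypothetical.

```python
def expand_until_goals(initial, actions, goals, max_levels=50):
    """Return the first proposition level containing all goals, or None.

    actions is a list of (name, pre, add) tuples with frozenset pre/add.
    None is returned when the graph levels off without reaching the goals.
    """
    props = frozenset(initial)
    goals = frozenset(goals)
    for level in range(max_levels + 1):
        if goals <= props:
            return level
        new_props = set(props)          # NO-OPs keep every proposition
        for _name, pre, add in actions:
            if pre <= props:            # action applicable at this level
                new_props |= add
        if new_props == props:          # levelled off: goals unreachable
            return None
        props = frozenset(new_props)
    return None
```

A full planner would alternate this phase with a backward search over the graph and would also track mutex pairs at each level.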
11
SOCS 11 GamePlan
What does GamePlan do?
–Takes a STRIPS-like domain as input:
A set of agents (limited to two agents here)
Operators: available to all agents
Initial condition
An agent-specific goal
–Finds a conditional plan
12
SOCS 12 Example (Tic Tac Toe)
[Figure: 3x3 tic-tac-toe board with cells labelled 11 to 33, with one X and one O already placed.]
13
SOCS 13 Example (cont.)
Agents: X and O, one for each player
Operator: mark(i, j, mark) (0 < i, j < 4, mark ∈ {X, O})
–precondition: blank(i, j)
–add effect: at(i, j, mark)
–delete effect: blank(i, j)
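The mark operator can be written down as a small STRIPS-style structure. The encoding below is only a sketch: facts are represented as tuples such as `("blank", 1, 1)` and `("at", 1, 1, "X")`, and the names `Action`, `mark` and `apply_action` are hypothetical, chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset      # preconditions
    add: frozenset      # add effects
    delete: frozenset   # delete effects


def mark(i: int, j: int, player: str) -> Action:
    """Ground instance of mark(i, j, mark) from the slide."""
    if not (0 < i < 4 and 0 < j < 4) or player not in ("X", "O"):
        raise ValueError("need 0 < i, j < 4 and player in {X, O}")
    return Action(
        name=f"mark({i},{j},{player})",
        pre=frozenset({("blank", i, j)}),
        add=frozenset({("at", i, j, player)}),
        delete=frozenset({("blank", i, j)}),
    )


def apply_action(state: frozenset, action: Action) -> frozenset:
    """Standard STRIPS update: remove delete effects, then add add effects."""
    if not action.pre <= state:
        raise ValueError(f"{action.name} is not applicable")
    return (state - action.delete) | action.add
```

Applying `mark(1, 1, "X")` to a state containing `("blank", 1, 1)` replaces that fact with `("at", 1, 1, "X")`.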
14
SOCS 14 Example (cont.)
Initial Condition: blank(1,1), blank(1,2), blank(1,3), at(2,2,X), at(2,1,O), blank(2,3), blank(3,1), blank(3,2), blank(3,3)
[Figure: the 3x3 board with cells labelled 11 to 33, showing X at cell 22 and O at cell 21.]
15
SOCS 15 Example (cont.)
Goal: wins(Me), which has the definition
wins(Me) ← at(1,1,Me), at(1,2,Me), at(1,3,Me)
wins(Me) ← at(2,1,Me), at(2,2,Me), at(2,3,Me)
wins(Me) ← at(3,1,Me), at(3,2,Me), at(3,3,Me)
wins(Me) ← at(1,1,Me), at(2,2,Me), at(3,3,Me)
wins(Me) ← at(1,3,Me), at(2,2,Me), at(3,1,Me)
wins(Me) ← at(1,1,Me), at(2,1,Me), at(3,1,Me)
wins(Me) ← at(1,2,Me), at(2,2,Me), at(3,2,Me)
wins(Me) ← at(1,3,Me), at(2,3,Me), at(3,3,Me)
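The eight clauses cover the three rows, the two diagonals and the three columns of the board. A minimal sketch of the same definition, assuming facts are encoded as tuples `("at", i, j, player)` (an encoding chosen here for illustration, like the names `WIN_LINES` and `wins`):

```python
# Each winning line is a list of (row, column) cells.
ROWS = [[(i, j) for j in (1, 2, 3)] for i in (1, 2, 3)]
COLUMNS = [[(i, j) for i in (1, 2, 3)] for j in (1, 2, 3)]
DIAGONALS = [[(1, 1), (2, 2), (3, 3)], [(1, 3), (2, 2), (3, 1)]]
WIN_LINES = ROWS + COLUMNS + DIAGONALS


def wins(state, me):
    """True if some line is fully marked by player `me` in `state`."""
    return any(
        all(("at", i, j, me) in state for (i, j) in line)
        for line in WIN_LINES
    )
```

Each element of `WIN_LINES` corresponds to one clause body of wins(Me).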
16
SOCS 16 GamePlan (cont.)
Extending the Planning Graph
–The graph is extended by the agents alternately.
–Each action level is labelled with the agent who controls the actions at that level.
–An alternation is called a TURN.
Bonus: the graph shares all the properties of the planning graph in GraphPlan.
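The alternating expansion can be sketched for the tic-tac-toe example as follows. This is a simplified illustration, not the authors' implementation: mutex bookkeeping is omitted, NO-OPs are implicit (every proposition is carried forward), and `INITIAL`, `expand_turns` and the tuple encoding of facts and actions are hypothetical.

```python
from itertools import cycle

# Initial condition of slide 14, encoded as tuples.
INITIAL = frozenset(
    {("blank", i, j) for (i, j) in
     [(1, 1), (1, 2), (1, 3), (2, 3), (3, 1), (3, 2), (3, 3)]}
    | {("at", 2, 2, "X"), ("at", 2, 1, "O")}
)


def expand_turns(initial, agents=("X", "O"), levels=2):
    """Grow one action level per agent in turn, labelling it with its owner."""
    props = frozenset(initial)
    graph = []
    controller = cycle(agents)
    for _level in range(levels):
        agent = next(controller)
        # Every blank cell present at this level supports a mark action.
        actions = sorted(
            ("mark", f[1], f[2], agent) for f in props if f[0] == "blank"
        )
        added = {("at", i, j, who) for (_op, i, j, who) in actions}
        props = props | added           # implicit NO-OPs keep old facts
        graph.append({"agent": agent, "actions": actions, "props": props})
    return graph
```

With the initial condition above, the first level is labelled X and holds seven mark actions, and the second level is labelled O.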
17
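SOCS 17 Example (cont.)
Notation: Bij for blank(i,j), MijX for mark(i,j,X), Xij for at(i,j,X), Oij for at(i,j,O)
[Figure: the initial proposition level B11, B12, B13, O21, X22, B23, B31, B32, B33. From B11, the action M11X leads to X11 and a NO-OP leads to B11. Board inset with cells labelled 11 to 33.]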
18
SOCS 18 Example (cont.)
Notation: Bij for blank(i,j), MijX for mark(i,j,X), Xij for at(i,j,X), Oij for at(i,j,O)
[Figure: the first turn, controlled by X. For every blank cell Bij (ij = 11, 12, 13, 23, 31, 32, 33), the action MijX leads to Xij and a NO-OP leads to Bij; O21 and X22 persist through NO-OPs. Board inset with cells labelled 11 to 33.]
19
SOCS 19 Example (cont.)
[Figure: the planning graph of slide 18 extended with the next action level, controlled by O. For example, B11 supports M11O with effect O11, while NO-OPs carry X11 and B11 forward. Board inset with cells labelled 11 to 33.]
20
SOCS 20
21
SOCS 21 GamePlan (cont.)
Solution Extraction
–Instead of level by level, it works turn by turn.
–Dynamically merge a turn to produce a set of conditional actions.
–Adapt a conditional planner to extract a local solution at a specific turn.
22
SOCS 22 Merging a turn
–Let (A, B) be a turn and a ∈ A, b ∈ B. We say a blocks b if executing a deletes a precondition of b.
–Merging a (≠ no-op) with level B, denoted a + B, yields a conditional action with the same precondition as a and the set of conditional effects {b′ : eff(a, b) | b′ is a unique label for some b ∈ B that is not blocked by a}, where eff(a, b) = (the effects of a that do not conflict with any effect of b) ∪ (the effects of b).
23
SOCS 23 Example (cont.)
M11X + L_O
Precondition: B11
Post-condition:
M12O: ¬B11, X11, ¬B12, O12
M13O: ¬B11, X11, ¬B13, O13
M23O: ¬B11, X11, ¬B23, O23
M31O: ¬B11, X11, ¬B31, O31
M32O: ¬B11, X11, ¬B32, O32
M33O: ¬B11, X11, ¬B33, O33
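The merge of slide 22 can be sketched directly from its definition and checked against this example. The code is an illustration under assumptions: effects are encoded as (fact, sign) pairs with sign False for a deleted fact, an effect of a is taken to conflict with b when b has the same fact with the opposite sign, and level B is passed without its NO-OPs, as in the example. The names `Action`, `mark`, `blocks`, `effects` and `merge` are hypothetical.

```python
from typing import NamedTuple


class Action(NamedTuple):
    name: str
    pre: frozenset
    add: frozenset
    delete: frozenset


def mark(i, j, player):
    """Ground mark(i, j, player) action, named as on the slides (e.g. M12O)."""
    return Action(f"M{i}{j}{player}",
                  frozenset({("blank", i, j)}),
                  frozenset({("at", i, j, player)}),
                  frozenset({("blank", i, j)}))


def blocks(a, b):
    """a blocks b if executing a deletes a precondition of b."""
    return bool(a.delete & b.pre)


def effects(a):
    """Signed effects: (fact, True) for adds, (fact, False) for deletes."""
    return {(f, True) for f in a.add} | {(f, False) for f in a.delete}


def merge(a, level_b):
    """Return (precondition, {label: eff(a, b)}) for the action a + B."""
    outcomes = {}
    for b in level_b:
        if blocks(a, b):
            continue
        eff_b = effects(b)
        kept = {(f, s) for (f, s) in effects(a) if (f, not s) not in eff_b}
        outcomes[b.name] = frozenset(kept | eff_b)
    return a.pre, outcomes
```

Merging M11X with O's seven mark actions drops M11O, which M11X blocks, and leaves the six labelled outcomes listed on the slide.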
24
SOCS 24 Adapting a Conditional Planner
In principle, any conditional planner that can deal with conditional actions can be adapted for our purpose. What we need is
–to introduce backtrack points.
Bonus: the adapted conditional planner is much simpler than the original one.
25
SOCS 25 Example (cont.)
Instantiate a Goal
–The atoms of the instantiated goal (here X11, X22, X33 for Win(X) 1) are called open conditions, which need to be resolved.
[Figure: goal instance Win(X) 1 with open conditions X11, X22, X33.]
26
SOCS 26 Resolve Open Conditions: Add Causal Links
–Causal link = (action, labels, openCondition)
–Record a backtrack point, since several actions may be used to resolve an open condition.
–Side effect: adding a causal link may introduce conflicts.
[Figure: Win(X) 1 with open conditions X11, X22, X33. X11 is supported by M11X + L_O with labels {12, 13, 23, 31, 32, 33}, X22 by a NO-OP, and X33 by M33X + L_O with labels {11, 12, 13, 23, 31, 32}.]
M11X + L_O: as on slide 23.
M33X + L_O
Precondition: B33
Post-condition:
M11O: ¬B33, X33, ¬B11, O11
M12O: ¬B33, X33, ¬B12, O12
M13O: ¬B33, X33, ¬B13, O13
M23O: ¬B33, X33, ¬B23, O23
M31O: ¬B33, X33, ¬B31, O31
M32O: ¬B33, X33, ¬B32, O32
27
SOCS 27 Resolve Conflict: add conditional links
–Conditional link = (action 1, labels, action 2)
–Side effect: adding a conditional link may cause open outcomes.
[Figure: Win(X) 1 with open conditions X11, X22, X33, supported by M11X + L_O with labels {12, 13, 23, 31, 32, 33}, a NO-OP, and M33X + L_O with labels {11, 12, 13, 23, 31, 32}; a conditional link labelled {33} has been added. The conditional actions M11X + L_O and M33X + L_O are as on slide 26.]
28
SOCS 28 Resolve Open Outcomes: Instantiate new goal
–Side effect: instantiating a new goal may introduce open conditions.
[Figure: the partial plan for Win(X) 1 (X11, X22, X33) extended with a second goal instance Win(X) 2 with open conditions X13, X12, X11. Win(X) 2 is supported by M11X + L_O with labels {12, 13, 23, 31, 32, 33}, by M12X + L_O with labels {11, 13, 23, 31, 32, 33} and conditional label {12}, and by M13X + L_O with labels {11, 12, 23, 31, 32, 33} and conditional label {13}.]
29
SOCS 29 Repeat the process until
–no open conditions
–no open outcomes
[Figure: the completed local plan with three goal instances: Win(X) 1 (X11, X22, X33), Win(X) 2 (X13, X12, X11) and Win(X) 3 (X32, X22, X12), supported by M11X + L_O, M33X + L_O, M12X + L_O, M13X + L_O, M32X + L_O and NO-OPs, each annotated with its label sets; alongside it, the resulting conditional plan as a tree over X's moves M11, M12, M13, M32, M33 and O's replies O13, O33, and the board inset.]
30
SOCS 30 The preconditions of the actions in the local plan form a new goal for the next turn.
–If there is no plan for the new goal in the next turn, then backtrack.
[Figure: the local plan of slide 29; the preconditions B32, B13, B12, B11, B33 of its actions form the new goal.]
31
SOCS 31 Conclusion What we have done –A GamePlan framework –Adversarial planning algorithm What we are planning to do –refinement –implementation and test –exploit the idea to deal with co-operative planning.