Learning Relational Rules for Goal Decomposition

1 Learning Relational Rules for Goal Decomposition
Prasad Tadepalli, Oregon State University, and Chandra Reddy, IBM T.J. Watson Research Center. Supported by the Office of Naval Research.

2 A Critique of Current Research
Most work is confined to learning in isolation.
It predominantly employs propositional representations.
The learner is passive and has to learn from random examples.
The role of prior knowledge in learning is minimal.

3 Our Approach
Learning in the context of hierarchical problem solving.
Goals, states, and actions are represented relationally.
Active learning: the learner can ask questions, pose problems to itself, and solve them.
Declarative prior knowledge guides and speeds up learning.

4 Air Traffic Control (ATC) Task (Ackerman and Kanfer study)

5 Goal Decomposition Rules (D-rules)
D-rules decompose goals into subgoals. For example:
goal: land(?plane)
condition: plane-at(?plane, ?loc) & level(L3, ?loc)
subgoals: move(?plane, L2); move(?plane, L1); land1(?plane)
Problems are solved by recursive decomposition of goals into subgoals.
Control knowledge guides the selection of appropriate decomposition rules.
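A minimal sketch of how a d-rule and the recursive decomposition loop might look in Python (all names here, such as DRule and is_primitive, are illustrative assumptions, not the authors' implementation):

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Goal = Tuple[str, ...]  # e.g. ("land", "p1")

@dataclass
class DRule:
    goal: str                                  # goal predicate name
    condition: Callable[[set, Goal], bool]     # applicability test on the state
    subgoals: Callable[[Goal], List[Goal]]     # instantiate the ordered subgoals

def decompose(goal: Goal, state: set, rules: List[DRule],
              is_primitive: Callable[[Goal], bool]) -> List[Goal]:
    """Solve a goal by recursively decomposing it into primitive actions."""
    if is_primitive(goal):
        return [goal]
    for r in rules:
        if r.goal == goal[0] and r.condition(state, goal):
            plan: List[Goal] = []
            for sub in r.subgoals(goal):
                plan.extend(decompose(sub, state, rules, is_primitive))
            return plan
    raise ValueError(f"no applicable d-rule for {goal}")
```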

6 Domain theory for ATC task
Domain axioms:
can-land-short(?p) :- type(?p, propeller)
can-land-short(?p) :- type(?p, DC10) & wind-speed(low) & runway-cond(dry)
Primitive operators: jump(?cursor-from, ?cursor-to), short-deposit(?plane, ?runway), long-deposit(?plane, ?runway), select(?loc, ?plane)

7 Learning from Demonstration
Input example:
State: at(p1, 10), type(p1, propeller), fuel(p1, 5), cursor-loc(4), free(1), free(2), …, free(9), free(11), …, free(15), runway-cond(wet), wind-speed(high), wind-dir(south)
Goal: land-plane(p1)
Solution: jump(4, 10), select(10, p1), jump(10, 14), short-deposit(p1, R2)
Output: the underlying D-rules

8 Generalizing Examples
Examples are inductively generalized into D-rules: the example goal becomes the D-rule goal, the initial state becomes the condition, and the literals in the other states become the subgoals.
Generalization combines each new example x with the current hypothesis h via the least general generalization (lgg).
Problem: the size of the lgg grows exponentially with the number of examples.
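A minimal sketch of Plotkin-style lgg at the term level, with terms as nested tuples (the representation and names are assumptions for illustration; the lgg of full clauses, which pairs up compatible literals, is where the exponential growth arises):

```python
def lgg(t1, t2, table=None):
    """Least general generalization of two terms.

    Terms are tuples like ("move", "p1", "L2"); variables are strings
    starting with "?". Repeated mismatched pairs reuse the same variable.
    """
    if table is None:
        table = {}
    if t1 == t2:
        return t1
    # Same functor and arity: generalize argument-wise.
    if (isinstance(t1, tuple) and isinstance(t2, tuple)
            and len(t1) == len(t2) and t1[0] == t2[0]):
        return (t1[0],) + tuple(lgg(a, b, table) for a, b in zip(t1[1:], t2[1:]))
    # Otherwise introduce (or reuse) a variable for this pair of terms.
    if (t1, t2) not in table:
        table[(t1, t2)] = f"?v{len(table)}"
    return table[(t1, t2)]

# lgg(("move", "p1", "L2"), ("move", "p2", "L2")) -> ("move", "?v0", "L2")
```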

9 Learning from Queries
Use queries to prevent the exponential growth of the lgg (Reddy and Tadepalli, 1997):
Non-recursive, single-predicate Horn programs are learnable from queries and examples.
Prune each literal in the lgg and ask a membership query (a question) to confirm that the result is not overgeneral.
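A hedged sketch of the literal-pruning loop, where is_overgeneral stands in for the membership-query oracle (both names are hypothetical):

```python
def prune_lgg(literals, is_overgeneral):
    """Drop each condition literal whose removal keeps the rule correct.

    literals: the condition literals of the lgg.
    is_overgeneral(candidate): membership query; True if the rule with
        the remaining literals covers a case it should not.
    """
    kept = list(literals)
    for lit in list(kept):
        candidate = [l for l in kept if l != lit]
        if not is_overgeneral(candidate):
            kept = candidate  # pruning this literal is confirmed safe
    return kept
```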

10-15 Need for Queries
[Figure sequence: a new example x is combined with the current d-rule D via lgg; the result can overshoot the target and become overgeneral, which a membership query detects.]

16 Using Prior Knowledge
Explanation-based pruning: remove literals that don't play a causal role in the plan, e.g., free(1), free(2), etc.
Abstraction by forward chaining: e.g., can-land-short(?p) :- type(?p, propeller) helps learn a more general rule (see the sketch after this list).
Learning subgoal order: subgoal literals are maintained as a sequence of sets of literals; a set is refined into a sequence of smaller sets using multiple examples.
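A minimal sketch of the forward-chaining abstraction step, assuming facts are tuples and axioms are (head, body) pairs with "?"-prefixed variables (all helper names are illustrative):

```python
def substitute(lit, theta):
    """Apply a variable binding to a literal."""
    return tuple(theta.get(a, a) for a in lit)

def match(pattern, fact, theta):
    """Match one pattern literal against a ground fact, extending theta."""
    if len(pattern) != len(fact):
        return None
    theta = dict(theta)
    for p, f in zip(pattern, fact):
        if isinstance(p, str) and p.startswith("?"):
            if theta.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return theta

def match_all(body, facts, theta=None):
    """Yield every binding that satisfies all body literals."""
    theta = theta or {}
    if not body:
        yield theta
        return
    for fact in facts:
        t = match(body[0], fact, theta)
        if t is not None:
            yield from match_all(body[1:], facts, t)

def forward_chain(facts, axioms):
    """Add every fact derivable from the domain axioms to the state."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in axioms:
            derived = {substitute(head, th) for th in match_all(body, facts)}
            fresh = derived - facts
            if fresh:
                facts |= fresh
                changed = True
    return facts

# e.g. forward_chain({("type", "p1", "propeller")},
#                    [(("can-land-short", "?p"), [("type", "?p", "propeller")])])
# adds ("can-land-short", "p1") to the state before generalization.
```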

17 Learning Multiple D-Rules
Maintain a list of d-rules for each goal.
Combine a new example x with the first d-rule h_i for which lgg(x, h_i) is not over-general.
Reduce the result and replace h_i.
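A hedged sketch of this update, reusing the lgg and pruning routines sketched earlier; the fallback of keeping the example as a fresh rule when no existing one fits is an assumption:

```python
def incorporate(example, rules, lgg, reduce_rule, is_overgeneral):
    """Fold a new solved example into the list of d-rules for its goal."""
    for i, h in enumerate(rules):
        candidate = lgg(example, h)
        if not is_overgeneral(candidate):
            rules[i] = reduce_rule(candidate)  # prune redundant literals
            return rules
    rules.append(example)  # assumed fallback: keep the example as a new d-rule
    return rules
```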

18 Results on learning from demonstration

19 Learning from Exercises
Supplying solved training examples is too demanding for the teacher, and solving problems from scratch is computationally hard.
A compromise: learning from exercises.
Exercises are intermediate subproblems whose solutions help solve the main problems: solving easier subproblems makes it possible to solve more difficult ones.

20 Difficulty Levels in ATC Domain

21 Solving Exercises
Use previously learned d-rules as operators.
Iterative-deepening DFS is used to find short solutions, and hence short rules (see the sketch below).
Generalization is done as before.
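A minimal sketch of iterative-deepening DFS over learned d-rules treated as operators; apply and goal_test are hypothetical stand-ins for the domain's transition function and goal check:

```python
def iddfs(start, operators, apply, goal_test, max_depth=10):
    """Return the shortest operator sequence that reaches the goal, or None."""
    def dls(state, limit, plan):
        if goal_test(state):
            return plan
        if limit == 0:
            return None
        for op in operators:
            nxt = apply(state, op)  # None if op is not applicable
            if nxt is not None:
                found = dls(nxt, limit - 1, plan + [op])
                if found is not None:
                    return found
        return None

    for depth in range(max_depth + 1):
        plan = dls(start, depth, [])
        if plan is not None:
            return plan
    return None
```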

22 Query Answering by Testing
Generate test problems {initial state, goal} that match the d-rule.
Use the decomposition that the d-rule suggests, and solve the problems.
If some problem cannot be solved, the rule is over-general.
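A hedged sketch of answering a membership query by testing, with generate_problems and solve_with as hypothetical helpers (solve_with would follow the rule's suggested decomposition, e.g. via the search sketched above):

```python
def is_overgeneral_by_testing(rule, generate_problems, solve_with, n_tests=20):
    """Approximate a membership query: the rule is judged over-general
    if its suggested decomposition fails on some matching test problem."""
    for state, goal in generate_problems(rule, n_tests):
        if solve_with(rule, state, goal) is None:  # decomposition failed
            return True
    return False
```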

23 Results on learning from exercises
14 d-rules

24 Conclusions
It is possible to learn useful problem-solving strategies in expressive representations.
Prior knowledge can be put to good use in learning.
Queries can be implemented approximately using heuristic techniques.
Learning from demonstration and learning from exercises make different tradeoffs between learning and reasoning.

25 Learning for Training Environments (Ron Metoyer)
Task training domains: sports, military. Industry examples: Electronic Arts, Boston Dynamics Inc.
The central question is who creates the training content: today it is a programmer, but it should be the coach.
Example setup: an immersive screen projects a 3D game; the trainee throws a ball to the right on-screen player (a net in front catches it), the system tracks who they threw to, and the characters respond.

26 Research Challenges
Learning must be on-line, and it must be fast, since users can only give a few examples.
Extension to more complex strategy languages that include concurrency, partial observability, real-time execution, and multiple agents, e.g., ConGolog.
Provide a predictable model of generalization.
Allow learning from demonstrations, reinforcement, advice, and hints, e.g., improving or learning to select between strategies.

27 Vehicle Routing & Product Delivery

28 Learning Challenges
Very large number of states and actions.
Stochastic demands by customers and shops.
Multiple agents (trucks, truck companies, shops, distribution centers).
Partial observability.
Hierarchical decision making.
Significant real-world impact.

29 ICML Workshop on Relational Reinforcement Learning
Paper deadline: April 2. Check the ICML website.

