Learning Relational Rules for Goal Decomposition


Learning Relational Rules for Goal Decomposition
Prasad Tadepalli, Oregon State University
Chandra Reddy, IBM T.J. Watson Research Center
Supported by the Office of Naval Research
2/25/2019, Symposium on Reasoning and Learning

A Critique of Current Research
- Most work is confined to learning in isolation.
- It predominantly employs propositional representations.
- The learner is passive and has to learn from random examples.
- The role of prior knowledge in learning is minimal.

Our Approach
- Learning in the context of hierarchical problem solving.
- The goals, states, and actions are represented relationally.
- Active learning: the learner can ask questions, pose problems to itself, and solve them.
- Declarative prior knowledge guides and speeds up learning.

Air Traffic Control (ATC) Task (Ackerman and Kanfer study)

Goal Decomposition Rules (D-rules)
D-rules decompose goals into subgoals:
goal: land(?plane)
condition: plane-at(?plane, ?loc) & level(L3, ?loc)
subgoals: move(?plane, L2); move(?plane, L1); land1(?plane)
Problems are solved by recursive decomposition of goals into subgoals.
Control knowledge guides the selection of appropriate decomposition rules.
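The recursive-decomposition idea above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the `DRule` class and the string encoding of goals and literals are invented here, and rule selection is just first-match rather than learned control knowledge.

```python
from dataclasses import dataclass

# Hypothetical encoding of a goal-decomposition rule; names are illustrative.
@dataclass
class DRule:
    goal: str          # the goal this rule decomposes
    condition: list    # literals that must hold in the current state
    subgoals: list     # ordered subgoals to pursue instead

def decompose(goal, state, rules):
    """Recursively decompose a goal into a sequence of primitive subgoals."""
    for rule in rules:
        if rule.goal == goal and all(lit in state for lit in rule.condition):
            plan = []
            for sub in rule.subgoals:
                plan.extend(decompose(sub, state, rules))
            return plan
    return [goal]  # no matching rule: treat the goal as primitive

rules = [DRule("land", ["at-L3"], ["move-L2", "move-L1", "land1"])]
print(decompose("land", {"at-L3"}, rules))  # ['move-L2', 'move-L1', 'land1']
```

A real problem solver would also update the state as subgoals are achieved; this sketch only shows how a matching d-rule expands a goal into its subgoal sequence.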

Domain Theory for the ATC Task
Domain axioms:
can-land-short(?p) :- type(?p, propeller)
can-land-short(?p) :- type(?p, DC10) & wind-speed(low) & runway-cond(dry)
Primitive operators:
jump(?cursor-from, ?cursor-to), short-deposit(?plane, ?runway), long-deposit(?plane, ?runway), select(?loc, ?plane)
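Axioms like these can be applied by simple forward chaining over ground facts. The sketch below is a toy propositionalized version, assuming facts are encoded as strings and each axiom is a (body, head) pair; the fact spellings are invented for illustration.

```python
# Toy forward chaining over the ATC domain axioms: each axiom is a pair
# (body: set of facts, head: derived fact). All names are illustrative.
AXIOMS = [
    ({"type(p1,propeller)"}, "can-land-short(p1)"),
    ({"type(p1,DC10)", "wind-speed(low)", "runway-cond(dry)"},
     "can-land-short(p1)"),
]

def forward_chain(facts, axioms):
    """Repeatedly apply axioms until no new fact can be derived."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in axioms:
            if body <= facts and head not in facts:
                facts.add(head)
                changed = True
    return facts

derived = forward_chain({"type(p1,DC10)", "wind-speed(low)",
                         "runway-cond(dry)"}, AXIOMS)
print("can-land-short(p1)" in derived)  # True
```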

Learning from Demonstration
Input examples:
State: at(p1, 10), type(p1, propeller), fuel(p1, 5), cursor-loc(4), free(1), free(2), …, free(9), free(11), …, free(15), runway-cond(wet), wind-speed(high), wind-dir(south)
Goal: land-plane(p1)
Solution: jump(4, 10), select(10, p1), jump(10, 14), short-deposit(p1, R2)
Output: the underlying D-rules

Generalizing Examples
Examples are inductively generalized into D-rules:
- the example goal becomes the D-rule goal
- the initial state becomes the condition
- literals in other states become the subgoals
Generalization takes the least general generalization (lgg) of an example X with the current hypothesis H.
Problem: the size of the lgg grows exponentially with the number of examples.
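The lgg operation can be illustrated on single literals. This is a minimal anti-unification sketch for literals with the same predicate and arity, assuming a tuple encoding; the variable naming scheme (`?X0`, `?X1`, …) is invented here, and a full lgg over clauses is considerably more involved (which is the source of the exponential growth noted above).

```python
# Toy least general generalization (anti-unification) of two literals,
# each encoded as a tuple (predicate, arg1, arg2, ...).
def lgg(lit1, lit2, table=None):
    """Return the lgg of two same-predicate literals."""
    if table is None:
        table = {}
    pred1, *args1 = lit1
    pred2, *args2 = lit2
    assert pred1 == pred2 and len(args1) == len(args2)
    out = [pred1]
    for a, b in zip(args1, args2):
        if a == b:
            out.append(a)            # identical terms stay constant
        else:
            key = (a, b)             # the same mismatch maps to the
            if key not in table:     # same fresh variable each time
                table[key] = f"?X{len(table)}"
            out.append(table[key])
    return tuple(out)

print(lgg(("at", "p1", "10"), ("at", "p2", "12")))  # ('at', '?X0', '?X1')
```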

Learning from Queries
Use queries to prevent the exponential growth of the lgg (Reddy and Tadepalli, 1997):
- Non-recursive, single-predicate Horn programs are learnable from queries and examples.
- Prune each literal in the lgg and ask a membership query (a question) to confirm that the result is not over-general.
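The pruning loop can be sketched as follows. This is a simplification, assuming the condition is a flat list of literals and that `is_still_correct` stands in for the membership query (a teacher's answer, or the testing procedure described later in the talk).

```python
# Prune each condition literal in turn; keep the drop only when the oracle
# confirms the pruned rule is still not over-general. `is_still_correct`
# is a stand-in for a membership query.
def prune(condition, is_still_correct):
    condition = list(condition)
    i = 0
    while i < len(condition):
        candidate = condition[:i] + condition[i + 1:]
        if is_still_correct(candidate):
            condition = candidate    # literal was unnecessary; drop it
        else:
            i += 1                   # literal is needed; keep it
    return condition

# Toy oracle: the target rule needs exactly the literals {"a", "c"}.
oracle = lambda cond: {"a", "c"} <= set(cond)
print(prune(["a", "b", "c", "d"], oracle))  # ['a', 'c']
```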

Need for Queries
[Figure: a sequence of diagrams over a domain D showing training examples x, their lgg, and the target concept; the lgg of the examples can overshoot the target and become over-general, which examples alone cannot detect but a query can.]

Using Prior Knowledge
- Explanation-based pruning: remove literals that don't play a causal role in the plan, e.g., free(1), free(2), etc.
- Abstraction by forward chaining: can-land-short(?p) :- type(?p, propeller) helps learn a more general rule.
- Learning subgoal order: subgoal literals are maintained as a sequence of sets of literals; a set is refined into a sequence of smaller sets using multiple examples.
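Explanation-based pruning can be sketched as: keep only the state literals that appear among the preconditions of operators actually used in the observed plan. The precondition table and literal spellings below are invented for illustration; the paper's pruning works over a full causal explanation of the plan.

```python
# Keep only literals that play a causal role, i.e. occur among the
# preconditions of some operator used in the plan. `preconds` is a
# hypothetical operator -> precondition-set table.
def causal_prune(state_literals, plan, preconds):
    needed = set()
    for op in plan:
        needed |= preconds.get(op, set())
    return [lit for lit in state_literals if lit in needed]

preconds = {"select": {"at(p1,10)", "cursor-loc(10)"},
            "short-deposit": {"can-land-short(p1)"}}
state = ["at(p1,10)", "free(1)", "free(2)", "cursor-loc(10)",
         "can-land-short(p1)"]
print(causal_prune(state, ["select", "short-deposit"], preconds))
# ['at(p1,10)', 'cursor-loc(10)', 'can-land-short(p1)']
```

Note how the irrelevant free(1), free(2) literals, which the slide singles out, are dropped.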

Learning Multiple D-Rules
- Maintain a list of D-rules for each goal.
- Combine a new example x with the first D-rule h_i for which lgg(x, h_i) is not over-general.
- Reduce the result and replace h_i.
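The bookkeeping above can be sketched as a small loop. This is a toy instantiation: rules and examples are just sets of literals, the lgg is set intersection, and the over-generality test is a stand-in oracle like the query-answering step described later.

```python
# Fold a new example into the first rule whose lgg with it is still
# specific enough; otherwise keep the example as a new rule.
def incorporate(example, rules, lgg, over_general):
    for i, rule in enumerate(rules):
        candidate = lgg(rule, example)
        if not over_general(candidate):
            rules[i] = candidate     # combine with the first compatible rule
            return rules
    rules.append(example)            # no compatible rule: start a new one
    return rules

# Toy instantiation: lgg of literal sets is intersection, and a rule
# with fewer than 2 literals counts as over-general.
set_lgg = lambda a, b: a & b
too_general = lambda r: len(r) < 2
rules = [{"a", "b", "c"}]
incorporate({"a", "b", "d"}, rules, set_lgg, too_general)
print(sorted(rules[0]))  # ['a', 'b']
```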

Results on Learning from Demonstration

Learning from Exercises
- Supplying solved training examples is too demanding for the teacher.
- Solving problems from scratch is computationally hard.
- A compromise: learning from exercises.
- Exercises are intermediate subproblems that help solve the main problems; solving easier subproblems makes it possible to solve more difficult ones.

Difficulty Levels in the ATC Domain

Solving Exercises
- Use previously learned D-rules as operators.
- Iterative-deepening depth-first search finds short rules.
- Generalization is done as before.
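Iterative-deepening depth-first search can be sketched generically. This is a minimal illustration with invented names; in the talk's setting the operators would be previously learned d-rules applied to relational states, whereas here states are plain integers for brevity.

```python
# Iterative-deepening DFS: run depth-limited search with increasing
# limits, so the first plan found is a shortest one.
def ids(start, is_goal, operators, max_depth=10):
    def dls(state, depth):
        if is_goal(state):
            return []
        if depth == 0:
            return None
        for name, op in operators:
            plan = dls(op(state), depth - 1)
            if plan is not None:
                return [name] + plan
        return None
    for limit in range(max_depth + 1):
        plan = dls(start, limit)
        if plan is not None:
            return plan
    return None

ops = [("inc", lambda s: s + 1), ("dbl", lambda s: s * 2)]
print(ids(1, lambda s: s == 6, ops))  # ['inc', 'inc', 'dbl']
```

Because each depth limit is exhausted before the next is tried, the returned operator sequence is minimal in length, which is why the slide pairs iterative deepening with finding short rules.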

Query Answering by Testing
- Generate test problems {InitialState, Goal} that match the D-rule.
- Use the decomposition that the D-rule suggests, and solve the problems.
- If some problem cannot be solved, the rule is over-general.
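This approximate query-answering loop can be sketched as follows. `generate_problems` and `solve_with` are stand-ins for the paper's problem generator and decomposition-guided solver, and are assumptions of this sketch.

```python
# Approximate a membership query by testing: sample problems that match
# the candidate rule, try each one using the rule's decomposition, and
# declare the rule over-general if any attempt fails.
def rule_is_over_general(rule, generate_problems, solve_with, n=20):
    for problem in generate_problems(rule, n):
        if not solve_with(rule, problem):
            return True          # a matching problem the rule cannot solve
    return False                 # all sampled tests passed (approximate "no")

# Toy instantiation: the "rule" solves exactly the even-numbered problems.
gen = lambda rule, n: range(n)
solve = lambda rule, p: p % 2 == 0
print(rule_is_over_general("r1", gen, solve))  # True (fails on problem 1)
```

Since only finitely many problems are sampled, a "not over-general" answer is heuristic rather than exact, which is the approximation the conclusions slide refers to.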

Results on Learning from Exercises (14 D-rules)

Conclusions
- It is possible to learn useful problem-solving strategies in expressive representations.
- Prior knowledge can be put to good use in learning.
- Queries can be implemented approximately using heuristic techniques.
- Learning from demonstration and learning from exercises make different tradeoffs between learning and reasoning.

Learning for Training Environments (Ron Metoyer)
Task training for sports and the military (e.g., Electronic Arts, Boston Dynamics Inc.).
Today a programmer creates the training content, but the coach should be the one creating it. Imagine an immersive projection screen running a 3D game: the trainee throws to the right person (a net in front catches the ball), the system tracks who they threw to, and the characters respond.
Key question: who creates the training content?

Research Challenges
- Learning must be on-line.
- Must learn quickly, since users can only give a few examples.
- Extension to more complex strategy languages that include concurrency, partial observability, real-time execution, and multiple agents, e.g., ConGolog.
- Provide a predictable model of generalization.
- Allow learning from demonstrations, reinforcement, advice, and hints, e.g., improving or learning to select between strategies.

Vehicle Routing & Product Delivery

Learning Challenges
- Very large number of states and actions
- Stochastic demands by customers and shops
- Multiple agents (trucks, truck companies, shops, distribution centers)
- Partial observability
- Hierarchical decision making
- Significant real-world impact

ICML Workshop on Relational Reinforcement Learning
Paper deadline: April 2. Check the ICML website.