Learning Procedural Planning Knowledge in Complex Environments Douglas Pearson March 2004.

Characterizing the Learner
[Diagram: learners placed on two axes, Method (Deliberate vs. Implicit) and KR (Declarative vs. Procedural). Labels: Symbolic Learners, Reinforcement Learning, IMPROV.]
Simpler Agents: weak, slower learning
Complex Agents: strong, faster learning
Simple Environments
Complex Environments
- Actions: duration and conditional
- Sensing: limited, noisy, delayed
- Task: timely response
- Domain: changes over time, large state space

Why Limit Knowledge Access?
Procedural: knowledge can only be accessed by executing it
Declarative: can answer when a rule will execute and what it will do
Declarative Problems
Availability
- If (x^5 + 3x^3 - 5x^2 + 2) > 7 then Action
- Chains of rules A->B->C->Action
Efficiency
- O(size of knowledge base) or worse
- Agent slows down as it learns more
IMPROV Representation
- Sets of production rules for operator preconditions and actions
- Assume the learner can only execute rules
- But allow declarative knowledge to be added when it's efficient to do so
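
A minimal sketch of what "procedural access only" could look like in code, using the polynomial condition from the slide. The class name `ProceduralRule` and its `execute` method are hypothetical and are not taken from IMPROV itself. The point is only that the learner can run the rule against a state, but cannot ask the rule when it would fire.

```python
from typing import Callable, Dict, Optional

State = Dict[str, float]


class ProceduralRule:
    """A rule the learner can execute but not inspect (hypothetical wrapper)."""

    def __init__(self, condition: Callable[[State], bool], action: str) -> None:
        # The condition is kept private: no declarative query is offered.
        self._condition = condition
        self._action = action

    def execute(self, state: State) -> Optional[str]:
        """Return the action if the rule fires in this state, else None."""
        return self._action if self._condition(state) else None


# Condition from the slide: (x^5 + 3x^3 - 5x^2 + 2) > 7
rule = ProceduralRule(
    lambda s: s["x"] ** 5 + 3 * s["x"] ** 3 - 5 * s["x"] ** 2 + 2 > 7,
    "Action",
)

fired = rule.execute({"x": 2.0})      # 32 + 24 - 20 + 2 = 38, so the rule fires
not_fired = rule.execute({"x": 0.0})  # 2 is not greater than 7
```

Answering "for which x does this rule fire?" would require solving the inequality, which is the availability problem the slide points to. Executing the rule on a concrete state is cheap by comparison.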

Focusing on Part of the Problem
[Figure: task performance from 0% to 100%. Labels: Knowledge Representation, Initial Rule Base, Domain Knowledge ("Learn this").]

The Problem
Cast the learning problem as:
- Error detection (incomplete or incorrect knowledge)
- Error correction (fixing or adding knowledge)
But with only limited, procedural access.
The aim is to support learning in complex, scalable agents and environments.

Error Detection Problem
[Figure: PLAN built from existing (possibly incorrect) knowledge: S1 Speed-30, S2 Speed-10, S3 Speed-0, S4 Speed-30]
How can the plan be monitored during execution without direct access to the knowledge?

Error Detection Solution
Direct monitoring is not possible.
Instead, detect lack of progress toward the goal:
- No rules match, or rules conflict
[Figure: S1 Speed-30, S2 Speed-10, S3 Speed-0, S4: engine stalls, no proposal]
- Does not rely on predicting the behavior of the world (useful in stochastic environments)
- But there is no implicit notion of solution quality
- Domain-specific error conditions can be added, but are not required
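
The detection idea above can be sketched as follows. This is an illustration under stated assumptions, not IMPROV's implementation: rules are modelled as hypothetical (operator name, condition) pairs, and a "conflict" is simplified to mean that more than one distinct operator is proposed for the same state.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, object]
Rule = Tuple[str, Callable[[State], bool]]  # (operator name, precondition test)


def propose(rules: List[Rule], state: State) -> List[str]:
    """Run every rule and collect the operators whose preconditions hold."""
    return [op for op, condition in rules if condition(state)]


def detect_error(rules: List[Rule], state: State) -> str:
    """Classify the state as 'ok', 'no-proposal', or 'conflict'."""
    proposals = propose(rules, state)
    if not proposals:
        return "no-proposal"   # nothing to do next: lack of progress
    if len(set(proposals)) > 1:
        return "conflict"      # simplified notion of conflicting rules
    return "ok"


# Hypothetical rule base for the driving example on the slide.
rules: List[Rule] = [
    ("Speed-10", lambda s: s["speed"] == 30 and s["engine"] == "running"),
    ("Speed-0", lambda s: s["speed"] == 10 and s["engine"] == "running"),
    ("Speed-30", lambda s: s["speed"] == 0 and s["engine"] == "running"),
]

normal = detect_error(rules, {"speed": 30, "engine": "running"})
stalled = detect_error(rules, {"speed": 0, "engine": "stalled"})
```

Note that the check only executes rules. It never asks what a rule predicts, which matches the procedural-access restriction.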

IMPROV's Recovery Method
[Flowchart with two phases]
Search:
- Replan
- Execute
- Record [State, Op -> Result]
- On failure, repeat until the goal is found
Learning (once the goal is reached):
- Identify incorrect operator(s)
- Train inductive learner
- Change domain knowledge
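
A rough sketch of the search phase, assuming a toy deterministic world. The function `recover`, the transition table, and the state names are all hypothetical. The sketch also assumes each candidate plan can be tried from the same start state, which a real environment may not allow.

```python
from typing import Callable, Dict, List, Optional, Tuple

Record = Tuple[str, str, str]  # (state, operator, resulting state)


def recover(
    start: str,
    goal: str,
    candidate_plans: List[List[str]],
    step: Callable[[str, str], str],
) -> Tuple[Optional[List[str]], List[Record]]:
    """Try plans in order, recording every [state, op -> result] instance."""
    record: List[Record] = []
    for plan in candidate_plans:
        state = start  # assumption: the toy world can be reset between plans
        for op in plan:
            result = step(state, op)
            record.append((state, op, result))
            state = result
        if state == goal:
            return plan, record  # reached goal: hand the record to the learner
    return None, record          # every candidate plan failed


# Hypothetical toy world: stopping without changing gear stalls the engine.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("moving", "Speed-0"): "stalled",
    ("moving", "Change-Gear"): "neutral",
    ("neutral", "Speed-0"): "stopped",
}


def toy_step(state: str, op: str) -> str:
    """Apply an operator; unknown (state, op) pairs leave the state unchanged."""
    return TRANSITIONS.get((state, op), state)


plan, record = recover(
    "moving", "stopped", [["Speed-0"], ["Change-Gear", "Speed-0"]], toy_step
)
```

The recorded instances from both the failed and the successful attempt are what the learning phase would then use to identify the incorrect operator and train the inductive learner.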

Finding the Incorrect Operator(s)
[Figure: two operator sequences compared: Speed-30, Speed-10, Speed-0, Speed-30 and Speed-10, Speed-0, Speed-30, Change-Gear]
- Change-Gear is over-specific
- Speed-0 is over-general
By waiting, the learner can do better credit assignment.

Learning to Correct the Operator
Collect a set of training instances:
- [State, Operator -> Result]
- Differences between states can be identified
[Figure: two states compared]
State 1: Speed = 40, Light = green, Self = car, Other = car
State 2: Speed = 40, Light = green, Self = car, Other = ambulance
The differences are used as a default bias when training the inductive learner.
Learn preconditions as a classification problem (predict the operator from the state).
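
The state comparison can be illustrated with a short sketch. The helper `state_differences` is a hypothetical name; it simply reports the features on which two states disagree, using the two states from the slide.

```python
from typing import Dict, Tuple

State = Dict[str, object]


def state_differences(a: State, b: State) -> Dict[str, Tuple[object, object]]:
    """Return {feature: (value in a, value in b)} for features that differ."""
    features = sorted(set(a) | set(b))
    return {f: (a.get(f), b.get(f)) for f in features if a.get(f) != b.get(f)}


state_1 = {"speed": 40, "light": "green", "self": "car", "other": "car"}
state_2 = {"speed": 40, "light": "green", "self": "car", "other": "ambulance"}

diff = state_differences(state_1, state_2)
# diff == {"other": ("car", "ambulance")}
```

Here only `other` differs, so it is the natural first candidate feature for the inductive learner to test when revising the operator's preconditions.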

K-Incremental Learning
Collect a set of k instances, then train the inductive learner.
[Figure: learners ordered by instance set size]
- 1: Reinforcement Learners
- k1: Till Correction (IMPROV)
- k2: Till Unique Cause (EXPO)
- n: Non-Incremental Learners
K-Incremental Learner:
- k does not grow over time => incremental behavior
- Better decisions about what to discard when generalizing
- When doing active learning, bad early learning can really hurt
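
A minimal sketch of the collect-then-train pattern. It assumes a fixed k for simplicity, whereas the slide describes IMPROV as collecting instances until a correction is found. The class `KIncrementalLearner` and its `train` callback are hypothetical.

```python
from typing import Callable, Generic, List, TypeVar

T = TypeVar("T")


class KIncrementalLearner(Generic[T]):
    """Buffer k instances, train on them as a batch, then discard them."""

    def __init__(self, k: int, train: Callable[[List[T]], None]) -> None:
        self.k = k
        self.train = train
        self.buffer: List[T] = []

    def observe(self, instance: T) -> bool:
        """Add one instance; return True if this call triggered training."""
        self.buffer.append(instance)
        if len(self.buffer) < self.k:
            return False
        self.train(list(self.buffer))  # pass a copy so clearing is safe
        self.buffer.clear()            # memory stays bounded by k
        return True


batches: List[List[int]] = []
learner = KIncrementalLearner(k=3, train=batches.append)
triggered = [learner.observe(i) for i in range(7)]
# Training ran after the 3rd and 6th instances; the 7th is still buffered.
```

With k = 1 this degenerates to updating on every instance, and with k equal to the whole data set it becomes a non-incremental learner, which are the two ends of the spectrum on the slide.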

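Extending to Operator Actions
[Figure: operator sequence Speed 30, Speed 0, Speed 20, Speed 30]
Decompose into an operator hierarchy:
[Figure: Speed 0 and Speed 20 decompose into Brake and Release, which decompose into Slow -5, Slow -10, Slow 0]
The hierarchy terminates with operators that modify a single symbol.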

Correcting Actions
Expected effects of braking: Slow -5, Slow -10
Observed effects of braking on ice: Slow -2, Slow -4, Slow -6 => Failure
Use the correction method to change the preconditions of these sub-operators.

Change Procedural Actions
Changing the effects of Brake:
- Specialize Slow -5: Braking & slow=0 & ice => reject slow -5
- Generalize Slow -2: Braking & slow=0 & ice => propose slow -2
Supports Complex Actions:
- Actions with durations (sequence of operators)
- Conditional actions (branches in sequence of operators)
- Multiple simultaneous effects
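
The specialize/generalize step can be sketched with sub-operators represented as accept and reject patterns. This representation is an assumption made for illustration, and `SubOperator`, `matches`, and `proposed` are hypothetical names rather than IMPROV's rule syntax.

```python
from dataclasses import dataclass, field
from typing import Dict, List

State = Dict[str, object]


def matches(pattern: State, state: State) -> bool:
    """True if every feature in the pattern has the same value in the state."""
    return all(state.get(key) == value for key, value in pattern.items())


@dataclass
class SubOperator:
    name: str
    accept: List[State] = field(default_factory=list)  # propose if any matches
    reject: List[State] = field(default_factory=list)  # unless any of these match


def proposed(op: SubOperator, state: State) -> bool:
    """Proposed when some accept pattern matches and no reject pattern does."""
    accepted = any(matches(p, state) for p in op.accept)
    rejected = any(matches(p, state) for p in op.reject)
    return accepted and not rejected


slow_5 = SubOperator("slow -5", accept=[{"braking": True, "slow": 0}])
slow_2 = SubOperator("slow -2")

on_ice = {"braking": True, "slow": 0, "ice": True}
slow_5.reject.append(on_ice)  # specialize: reject slow -5 on ice
slow_2.accept.append(on_ice)  # generalize: propose slow -2 on ice

icy_road = {"braking": True, "slow": 0, "ice": True}
dry_road = {"braking": True, "slow": 0, "ice": False}
```

After the correction, `slow -2` is proposed on the icy road and `slow -5` is still proposed on the dry road, so the original behavior is kept wherever it was correct.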

IMPROV Summary
[Diagram: Method axis (Deliberate vs. Implicit) and KR axis (Declarative, Non-Incremental vs. Procedural, Incremental). Labels: Symbolic Learners, Reinforcement Learning, IMPROV.]
IMPROV support for:
Powerful agents
- Multiple goals
- Faster, deliberate learning
Complex environments
- Noise
- Complex actions
- Dynamic environments
k-Incremental Learning
- Improved credit assignment (which operator, which feature)
General weak deliberate learner with only procedural access assumed
- General-purpose error detection
- General correction method applied to preconditions and actions
- Nice re-use of the precondition learner to learn actions
- Easy to add domain-specific knowledge to make the method stronger

Redux: Diagram-based Example-driven Knowledge Acquisition Douglas Pearson March 2004

1. User specifies desired behavior

2. User selects features, which define the rules
Later we'll use ML to guess this initial feature set.

3. Compare desired with rules
Desired: Move-through(door1), Turn-to-face(threat1), Shoot(threat1)
Actual: Move-through(door1), Turn-to-face(neutral1), Shoot(neutral1)

4. Identify and correct problems
Detect differences between desired behavior and rules:
- Detect overgeneral preconditions
- Detect conflicts within the scenario
- Detect conflicts between scenarios
- Detect choice points where there's no guidance
- etc.
All of these errors are detected automatically when a rule is created.
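
The checks listed above might be sketched as a comparison between a scenario of desired actions and what the current rules propose. This is only an illustration: the function `check_scenario`, the feature names, and the three problem labels are hypothetical and are not Redux's actual analysis.

```python
from typing import Dict, List, Tuple

State = Dict[str, str]
Rule = Tuple[State, str]          # (required feature values, action)
Step = Tuple[State, str]          # (state, desired action)


def check_scenario(rules: List[Rule], scenario: List[Step]) -> List[Tuple[int, str]]:
    """Return (step index, problem) pairs where rules and desired behavior differ."""
    problems: List[Tuple[int, str]] = []
    for index, (state, desired) in enumerate(scenario):
        # A rule fires when all of its required features appear in the state.
        actions = sorted({act for cond, act in rules if cond.items() <= state.items()})
        if not actions:
            problems.append((index, "no-guidance"))   # choice point with no rule
        elif len(actions) > 1:
            problems.append((index, "conflict"))      # rules disagree
        elif actions[0] != desired:
            problems.append((index, "mismatch"))      # e.g. overgeneral precondition
    return problems


# Hypothetical rules: the second one is overgeneral (it shoots at anyone).
rules: List[Rule] = [
    ({"at": "door"}, "move-through"),
    ({"sees": "person"}, "shoot"),
]

scenario: List[Step] = [
    ({"at": "door", "sees": "nothing"}, "move-through"),
    ({"at": "room", "sees": "person", "kind": "threat"}, "shoot"),
    ({"at": "room", "sees": "person", "kind": "neutral"}, "hold-fire"),
    ({"at": "hall", "sees": "nothing"}, "advance"),
]

problems = check_scenario(rules, scenario)
```

In this toy scenario the overgeneral `shoot` rule is flagged at the neutral person, and the final step is flagged because no rule offers guidance there.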

5. Fast rule creation by expert
[Figure: workflow between Expert and Engineer]
- Expert defines behavior with diagram-based examples
- Library of validated behavior examples
- Analysis & generation tools: detect inconsistency, generalize, generate rules, simulate execution
- Generated rules: A -> B; C -> D; E, J -> F; G, A, C -> H; E, G -> I; J, K -> L
- Executable code
- Simulation environment