AN ADAPTIVE PLANNER BASED ON LEARNING OF PLANNING PERFORMANCE
Kreshna Gopal & Thomas R. Ioerger
Department of Computer Science, Texas A&M University, College Station, TX

APPROACHES TO PLANNING
Planning as Problem Solving
Situation Calculus Planning
STRIPS
Partial Order Planning
Hierarchical Planning
Enhance Language Expressiveness
Planning with Constraints
Special Purpose Planning
Reactive Planning
Plan Execution and Monitoring
Distributed, Continual Planning
Planning Graphs
Planning as Satisfiability
Machine Learning Methods for Planning

MACHINE LEARNING METHODS FOR PLANNING
Learning Macro-operators
Learning Bugs and Repairs
Explanation-Based Learning
Reinforcement Learning
Case-Based Planning
Plan Reuse

PLAN REUSE: ISSUES
Plan storage & indexing
Plan retrieval - matching the new problem with previously solved ones
Plan modification - to suit the requirements of the new problem
Nebel & Koehler: plan matching is NP-hard, and in the worst case plan modification is harder than plan generation
Motivations of the proposed method:
Avoid plan modification
Very efficient matching using a neural network

COMPONENTS OF PROPOSED PLANNING SYSTEM
Default planner
Plan library of solved cases (I: initial state, G: goal state, P: solution plan):
(I_1, G_1, P_1), (I_2, G_2, P_2), ..., (I_n, G_n, P_n)
Training: predict the default planner's performance using a neural network
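To make the library structure concrete, here is a minimal Python sketch of a stored case; the original system was written in Common Lisp, and all names below are illustrative rather than taken from the paper.

```python
from dataclasses import dataclass

# A state is modeled here as a frozen set of ground literals,
# e.g. {"ON(C,A)", "ON-TABLE(A)", "CLEAR(C)"}.
State = frozenset

@dataclass
class Case:
    """One solved problem stored in the plan library."""
    initial: State      # I_k
    goal: State         # G_k
    plan: list          # P_k: sequence of operator applications

# The plan library is simply the list of cases (I_1,G_1,P_1), ..., (I_n,G_n,P_n).
library: list[Case] = []
```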

SCHEME OF REUSE
New problem: (I_new, G_new), with (unknown) solution plan P_new
Retrieved case: (I_k, G_k, P_k)
Proposed approach: instead of generating P_new directly, use the default planner to generate P_I (from I_new to I_k) and P_G (from G_k to G_new), and return the concatenation of P_I, P_k and P_G as the solution

DISTANCE AND GAIN METRICS
Distance(I, G): time the default planner is predicted to take to solve the problem (I, G)
Gain(I_new, G_new, I_k, G_k) = Distance(I_new, G_new) / [Distance(I_new, I_k) + Distance(G_k, G_new)]
Choose the case with maximum Gain
Require a minimum Distance and a minimum Gain before reusing a case
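As a rough illustration of these metrics in code, the sketch below assumes a `distance(I, G)` function that returns the time predicted by the learned model; the function names and the guard against a zero bridge cost are my own additions, not part of the original slides.

```python
def gain(distance, I_new, G_new, I_k, G_k):
    """Gain of reusing case k: predicted cost of solving the new problem from
    scratch divided by the predicted cost of bridging into and out of case k."""
    bridge = distance(I_new, I_k) + distance(G_k, G_new)
    return float("inf") if bridge == 0 else distance(I_new, G_new) / bridge

def worth_reusing(distance, I_new, G_new, I_k, G_k, min_time, min_gain):
    """Reuse only if the new problem is hard enough (MinTime) and the best
    case offers enough savings (MinGain)."""
    return (distance(I_new, G_new) >= min_time
            and gain(distance, I_new, G_new, I_k, G_k) >= min_gain)
```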

THE TRAINING PHASE
Target function: time prediction, t
Training experience: solved examples, D
Target function representation: t = Σ_{i=0}^{n} w_i f_i   (n features, with f_0 = 1)
Learning algorithm: gradient descent, which minimizes the error E of the weight vector w:
E(w) = ½ Σ_{d ∈ D} (t_d − o_d)²
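The weight update used on the next slide follows from differentiating this error; spelled out (standard delta-rule algebra, not shown on the original slide, with x_{i,d} denoting the value of feature f_i on example d):

```latex
\frac{\partial E}{\partial w_i}
  = \frac{\partial}{\partial w_i}\,\frac{1}{2}\sum_{d \in D}(t_d - o_d)^2
  = -\sum_{d \in D}(t_d - o_d)\,x_{i,d},
\qquad
\Delta w_i = -\eta\,\frac{\partial E}{\partial w_i}
           = \eta\sum_{d \in D}(t_d - o_d)\,x_{i,d}.
```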

GRADIENT DESCENT ALGORITHM
Inputs:
1. Training examples, where each example is a pair ⟨x, t⟩; x is the input vector and t is the target output value
2. Learning rate, η
3. Number of iterations, m

Initialize each w_i to some small random value
repeat m times {
    Initialize each Δw_i to 0
    for each training example ⟨x, t⟩ do {
        Find the output o of the unit on input x
        for each linear unit weight w_i do
            Δw_i = Δw_i + η (t − o) x_i
    }
    for each linear unit weight w_i do
        w_i = w_i + Δw_i
}
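A compact, runnable Python version of this training loop (the original implementation was in Common Lisp; the function below and its defaults are a sketch, not the authors' code):

```python
import random

def train_linear_unit(examples, eta=0.01, iterations=1000):
    """Batch gradient descent (delta rule) for a linear unit.

    examples: list of (x, t) pairs, where x is a feature vector with x[0] == 1
              (the bias feature f_0) and t is the target output (planning time).
    Returns the learned weight vector w.
    """
    n = len(examples[0][0])
    w = [random.uniform(-0.05, 0.05) for _ in range(n)]     # small random init
    for _ in range(iterations):
        delta_w = [0.0] * n
        for x, t in examples:
            o = sum(w_i * x_i for w_i, x_i in zip(w, x))     # unit output on x
            for i in range(n):
                delta_w[i] += eta * (t - o) * x[i]           # accumulate updates
        for i in range(n):
            w[i] += delta_w[i]                               # batch weight update
    return w
```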

FEATURE EXTRACTION
Feature extraction by domain experts (knowledge-acquisition bottleneck)
Automatic feature extraction methods can be used
Domain dependence: domain knowledge is crucial for efficient planning systems

PLAN RETRIEVAL AND REUSE ALGORITHM
Inputs: LIBRARY, w, MinGain, MinTime and a new problem ⟨I_new, G_new⟩

if Distance(I_new, G_new) < MinTime then
    Call default planner to solve ⟨I_new, G_new⟩
else {
    MaxGain = −∞   /* MaxGain records the maximum Gain so far */
    for k = 1 to n do {   /* There are n cases in LIBRARY */
        k-th case = ⟨I_k, G_k, P_k⟩
        Gain = Distance(I_new, G_new) / [Distance(I_new, I_k) + Distance(G_k, G_new)]
        if Gain > MaxGain then {
            MaxGain = Gain
            b = k   /* b is the index of the best case found so far */
        }
    }
    if MaxGain > MinGain then {
        Call default planner to solve ⟨I_new, I_b⟩, which returns P_I,b
        Call default planner to solve ⟨G_b, G_new⟩, which returns P_G,b
        Return the concatenation of P_I,b, P_b and P_G,b
    }
    else
        Call default planner to solve ⟨I_new, G_new⟩
}
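The same procedure, rendered as a self-contained Python sketch; `distance` is the learned time predictor and `default_planner(I, G)` stands for the underlying STRIPS-style planner, both assumed here rather than taken from the original system:

```python
def solve(problem, library, distance, default_planner, min_gain, min_time):
    """Retrieve-and-reuse: bridge into and out of the best stored case when
    that is predicted to be cheaper than planning from scratch."""
    I_new, G_new = problem
    if distance(I_new, G_new) < min_time:              # easy problem: just plan
        return default_planner(I_new, G_new)

    max_gain, best = float("-inf"), None
    for I_k, G_k, P_k in library:                      # scan the n library cases
        bridge = distance(I_new, I_k) + distance(G_k, G_new)
        g = float("inf") if bridge == 0 else distance(I_new, G_new) / bridge
        if g > max_gain:
            max_gain, best = g, (I_k, G_k, P_k)

    if best is not None and max_gain > min_gain:
        I_b, G_b, P_b = best
        p_i = default_planner(I_new, I_b)              # bridge from new initial state
        p_g = default_planner(G_b, G_new)              # bridge to new goal state
        return p_i + P_b + p_g                         # concatenated solution plan
    return default_planner(I_new, G_new)               # no good case: plan from scratch
```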

EMPIRICAL EVALUATION
Default planner: STRIPS (Shorts & Dickens)
Learning: perceptron
Blocks-world domain (3-7 blocks)
Plan library ( cases)
Implemented in Common LISP on a SPARCstation

EXAMPLE OF REUSE: SUSSMAN ANOMALY PROBLEM
(Block configurations for the new problem (I_new, G_new) and the retrieved case (I_k, G_k) were shown as figures on the original slide.)
P_k: MOVE-BLOCK-TO-TABLE(Blue,Red), MOVE-BLOCK-FROM-TABLE(Red,Blue)
P_I,k: MOVE-BLOCK-TO-BLOCK(Blue,Yellow,Red)
P_G,k: MOVE-BLOCK-FROM-TABLE(Yellow,Red)

BLOCKS-WORLD DOMAIN
BLOCKS = {A, B, C, …}
PREDICATES
ON(A,B) – block A is on block B
ON-TABLE(B) – block B is on the table
CLEAR(A) – block A is clear
OPERATORS
MOVE-BLOCK-TO-BLOCK(A,B,C) – move A from the top of B to the top of C
MOVE-BLOCK-TO-TABLE(A,B) – move A from the top of B to the table
MOVE-BLOCK-FROM-TABLE(A,B) – move A from the table to the top of B
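Below is one way to write these operators as STRIPS-style precondition/add/delete lists in Python; the exact condition lists are my reconstruction of the standard blocks-world encoding rather than a copy of the authors' operator definitions.

```python
def move_block_to_block(A, B, C):
    """Move A from the top of B to the top of C."""
    return {
        "pre": [f"ON({A},{B})", f"CLEAR({A})", f"CLEAR({C})"],
        "add": [f"ON({A},{C})", f"CLEAR({B})"],
        "del": [f"ON({A},{B})", f"CLEAR({C})"],
    }

def move_block_to_table(A, B):
    """Move A from the top of B to the table."""
    return {
        "pre": [f"ON({A},{B})", f"CLEAR({A})"],
        "add": [f"ON-TABLE({A})", f"CLEAR({B})"],
        "del": [f"ON({A},{B})"],
    }

def move_block_from_table(A, B):
    """Move A from the table to the top of B."""
    return {
        "pre": [f"ON-TABLE({A})", f"CLEAR({A})", f"CLEAR({B})"],
        "add": [f"ON({A},{B})"],
        "del": [f"ON-TABLE({A})", f"CLEAR({B})"],
    }
```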

FEATURES
Domain-independent features:
SIZE – size of the problem
SAT-CLEAR, SAT-ON, SAT-ON-TABLE – number of conditions in the goal state already satisfied in the initial state
UNSAT-CLEAR, UNSAT-ON, UNSAT-ON-TABLE – number of conditions in the goal state not satisfied in the initial state
Domain-dependent features:
STACK-INIT, STACK-GOAL – number of stacks in the initial and goal states
IN-PLACE – number of blocks already in place, i.e. blocks that need not be moved to reach the goal configuration
STEPS – heuristic function that guesses the number of planning steps
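As an illustration, the SAT-*/UNSAT-* features could be computed along the following lines, with states represented as sets of ground literal strings (a sketch under that assumption, not the authors' code):

```python
def condition_features(initial, goal):
    """Count goal conditions of each predicate already satisfied (SAT-*) or not
    yet satisfied (UNSAT-*) in the initial state."""
    feats = {}
    for pred in ("CLEAR", "ON", "ON-TABLE"):
        goal_conds = {c for c in goal if c.startswith(pred + "(")}
        sat = goal_conds & initial
        feats[f"SAT-{pred}"] = len(sat)
        feats[f"UNSAT-{pred}"] = len(goal_conds - sat)
    return feats

# Example: Sussman anomaly, goal ON(A,B) and ON(B,C)
initial = {"ON(C,A)", "ON-TABLE(A)", "ON-TABLE(B)", "CLEAR(B)", "CLEAR(C)"}
goal = {"ON(A,B)", "ON(B,C)"}
print(condition_features(initial, goal))   # both ON goal conditions unsatisfied
```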

CONCLUSIONS
Plan modification is avoided
Problem 'matching' is done very efficiently
The planning system is domain-independent
Other target functions (such as plan quality) can be learned and predicted
Open issues: the utility problem, indexing the library, selective storage, and integration with other techniques