Learning in the Large
Information Processing Technology Office Learning Workshop, April 12, 2004
Seedling Overview
MIT CSAIL PIs: Leslie Pack Kaelbling, Tomás Lozano-Pérez, Tommi Jaakkola

Learning in the Large: Three Subprojects
– Learning to behave in huge domains
– Transfer of learned knowledge across problems and domains
– Learning to recognize objects and interpret scenes

Learning in the Large: Learning Objective
Learn to act effectively in highly complex dynamic domains:
– Learn models of complex world dynamics involving objects, properties, and relations
– Learn "meta-cognition" strategies for deciding how to focus computational attention for action selection
Learning is crucial for both problems because human designers are unable to build appropriate models by hand.

Learning in the Large: What Is Being Learned?
Learning probabilistic dynamic rules, e.g.:

pickup(X): on(X,Y), clear(X), table(Z), inhand-nil →
  0.8: inhand(X), ¬on(X,Y), clear(Y), ¬clear(X), ¬inhand-nil
  0.2: ¬on(X,Y), clear(Y), on(X,Z)

An important goal is to learn partial models: some aspects of the world will be easy to learn to predict, others will take longer. The system should take advantage of partial models as soon as they're learned.
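To make the rule format concrete, the sketch below represents a rule as a context plus a distribution over add/delete outcomes and samples a successor state. This is a minimal illustration, not the project's actual code; all names are invented, and variables are assumed already ground for simplicity.

```python
import random

# A probabilistic dynamics rule: when the context holds, exactly one of
# several outcomes occurs, each with a fixed probability.
# (Illustrative sketch; literals are ground strings.)
class ProbabilisticRule:
    def __init__(self, action, context, outcomes):
        self.action = action        # e.g., "pickup(x)"
        self.context = context      # set of literals that must hold
        self.outcomes = outcomes    # list of (prob, add_set, delete_set)

    def applicable(self, state):
        return self.context <= state   # all context literals present

    def sample_successor(self, state):
        # Draw an outcome according to its probability and apply it.
        r, cum = random.random(), 0.0
        for prob, add, delete in self.outcomes:
            cum += prob
            if r < cum:
                return (state - delete) | add
        return set(state)               # fallback for numerical slack

pickup = ProbabilisticRule(
    action="pickup(x)",
    context={"on(x,y)", "clear(x)", "table(z)", "inhand-nil"},
    outcomes=[
        (0.8, {"inhand(x)", "clear(y)"},                 # success
              {"on(x,y)", "clear(x)", "inhand-nil"}),
        (0.2, {"clear(y)", "on(x,z)"},                   # dropped onto table
              {"on(x,y)"}),
    ],
)

state = {"on(x,y)", "on(y,z)", "clear(x)", "table(z)", "inhand-nil"}
if pickup.applicable(state):
    print(pickup.sample_successor(state))
```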

Learning in the Large: How Is It Being Learned?
Search in rule space:
– logic-based methods for learning structure
– convex optimization for probabilities
The effectiveness of learned models is tested by using a planner to select actions. Learning is automatic. The amount of data needed depends on the frequency and reliability of the phenomenon being modeled.
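For a fixed rule structure, fitting the outcome probabilities from experience is a convex maximum-likelihood problem over the probability simplex. In the simple case where each observed transition is explained by exactly one outcome, it reduces to normalized counts, as in this sketch (the interfaces here are hypothetical):

```python
from collections import Counter

def fit_outcome_probabilities(transitions, explaining_outcome):
    """Maximum-likelihood outcome probabilities for one rule.

    transitions:        (state, next_state) pairs on which the rule fired.
    explaining_outcome: maps a transition to the index of the single
                        outcome that accounts for it (assumed unique here;
                        overlapping outcomes need the full convex program).
    """
    counts = Counter(explaining_outcome(s, s2) for s, s2 in transitions)
    total = sum(counts.values())
    return {k: c / total for k, c in counts.items()}
```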

Learning in the Large: How Is the Knowledge Represented?
– Probabilistic dynamics rules
– No background knowledge currently, but it would be easy to build in some rules
– Knowledge is task-independent (though we may use utility to focus learning)
– Models can account for only parts of the state evolution, and they're probabilistic
– Currently, no

Learning in the Large: What Is the Domain?
Currently: a physics simulator of the blocks world.
Would like a simulation of a more complex environment, e.g.:
– battlefield
– disaster relief
– making breakfast

Learning in the Large: How Is Progress Being Measured?
First, by human inspection of the rules for plausibility; second, by the performance of an agent using the rules for planning. Nothing changes in the experimental set-up except the learned rules.
Metrics:
– utility gained by the agent
– computation speed
Experiments are easily done overnight on a workstation.
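The protocol can be pictured as a fixed evaluation harness in which only the learned rules vary between runs. A hypothetical sketch (the planner and simulator interfaces are invented for illustration):

```python
import time

def evaluate_rules(rules, planner, simulator, episodes=100):
    """Run the agent in a fixed set-up; only `rules` changes between runs.
    Returns the two metrics: mean utility and mean planning time."""
    total_utility, planning_time = 0.0, 0.0
    for _ in range(episodes):
        state = simulator.reset()
        while not simulator.done():
            start = time.perf_counter()
            action = planner.plan(state, rules)       # uses the learned model
            planning_time += time.perf_counter() - start
            state, reward = simulator.step(action)
            total_utility += reward
    return total_utility / episodes, planning_time / episodes
```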

Learning in the Large: What Are the Technical Milestones?
Milestones are defined by model sophistication rather than overt performance on the task:
– Learn rules with quantifiers
– Learn to ground symbolic predicates in perception
– Learn rules in partially observable environments
– Postulate hidden causes
– Focus rule-learning based on utility

Learning in the Large: What Is Being Learned?
Learning to formulate small planning problems from a huge state space and competing goals (see the sketch below):
– What are useful subgoals?
– When is it appropriate to ignore certain aspects of the domain?
[Architecture diagram: learning, inference, planning, perception, action]
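One ingredient of such a formulation is projecting a large factored state onto the variables currently judged relevant, keeping only the goals expressible in that projection. A minimal, self-contained sketch (the representation choices here are assumptions for illustration):

```python
def formulate_subproblem(state, goals, relevant):
    """Project a huge factored state onto the relevant variables.

    state:    dict mapping variable name -> value
    goals:    set of (variable, desired_value) pairs, possibly competing
    relevant: set of variable names to keep (what the learner must choose)
    """
    small_state = {v: state[v] for v in relevant}
    small_goals = {(v, val) for (v, val) in goals if v in relevant}
    return small_state, small_goals

# Example: plan about the door and the key, ignore the weather.
state = {"door": "locked", "key": "on-table", "weather": "rain"}
goals = {("door", "open"), ("weather", "sun")}
print(formulate_subproblem(state, goals, relevant={"door", "key"}))
```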

Learning in the Large: How Is It Being Learned?
Learning parameters in abstract models:
– partial observability makes it hard
– gradient descent works, but may be weak
– take advantage of Russell's methods?
We compare the speed and utility of the resulting action-selection system. Learning is automatic. The amount of data needed depends on the frequency and reliability of the phenomenon being modeled.
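The slide notes that gradient descent works but may be weak; one simple instantiation is a finite-difference estimate of the utility gradient with respect to the abstraction parameters, since expected utility is available only through noisy simulation. A hypothetical sketch:

```python
import numpy as np

def finite_difference_gradient(evaluate, theta, eps=0.1):
    """Estimate d(expected utility)/d(theta) for abstraction parameters.

    evaluate: runs the planner with parameters `theta` and returns an
              average utility over some episodes (a noisy estimate).
    theta:    numpy array of abstraction parameters.
    """
    grad = np.zeros_like(theta)
    for i in range(len(theta)):
        up, down = theta.copy(), theta.copy()
        up[i] += eps
        down[i] -= eps
        grad[i] = (evaluate(up) - evaluate(down)) / (2 * eps)
    return grad

# Gradient ascent on utility: theta += step * finite_difference_gradient(...)
```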

Learning in the Large: How Is the Knowledge Represented?
– Parameters in strategies for building abstractions
– Currently, most of the abstraction structure is hand-coded
– The knowledge depends on the distribution of problems an agent has to solve, but not on particular low-level tasks
– Uncertainty isn't represented explicitly, but is handled implicitly in statistical learning
– We are learning at multiple levels of abstraction

Learning in the Large: What Is the Domain?
Currently: NetHack.
Would like a more complex simulated domain.

Learning in the Large: What Are the Technical Milestones?
Meta-learning:
– Learn parameters in hand-built abstractions for MDPs
– Learn new abstractions for MDPs
– Learn to compose abstractions
– Do it all for POMDPs