Beyond Chunking: Learning in Soar
March 22, 2003
John E. Laird
Shelley Nason, Andrew Nuxoll, and a cast of many others
University of Michigan

Research Methodology in Cognitive Architecture
1. Pick basic principles to guide development
2. Pick desired behavioral capabilities
3. Make design decisions consistent with the above
4. Build/modify the architecture
5. Implement tasks
6. Evaluate performance

Soar Basic Principle: Knowledge Search vs. Problem Search
Knowledge search
- Finds knowledge relevant to the current situation
- Architectural: not subject to change with new knowledge
- Not combinatorial or generative
Problem search
- Controlled by knowledge; arises from a lack of knowledge
- Subject to improvement with additional knowledge
- Generative and combinatorial

Desired Behavioral Capabilities
- Interact with a complex world despite limited, uncertain sensing
- Respond quickly to changes in the world
- Use extensive knowledge
- Use methods appropriate for the task
- Be goal-driven
- Perform meta-level reasoning and planning
- Generate human-like behavior
- Coordinate behavior and communicate with others
- Learn from experience
- Integrate the above capabilities across tasks
- Generate behavior with low computational expense

Example Tasks
- TacAir-Soar & RWA-Soar
- Soar Quakebot, Soar Hauntbot, Soar MOUTbot
- Amber
- EPIC-Soar
- NL-Soar ("The horse raced past the barn fell")
- R1-Soar

Soar 101: The Decision Cycle
Input → Propose Operator → Compare Operators → Select Operator → Apply Operator → Output
Production memory holds rules that match against working memory. In an Eaters-style grid task, for example:
- Propose: If the cell in a direction is not a wall, propose the operator to move in that direction.
- Compare: If one operator will move to a bonus food and another will move to a normal food, prefer the first (>); if an operator will move to an empty cell, disprefer it (<).
- Apply: If an operator is selected to move, create the output move-direction.
With candidate operators North, South, and East and preferences such as North > East, South > East, and North = South, an operator (North) is selected and move-direction North is sent to output.
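The cycle above can be pictured as a small loop over propose, compare, select, and apply. The Python sketch below is only an illustration of that loop; the cell contents and the numeric ranking used for comparison are assumptions, not the Soar implementation.

```python
# Minimal sketch of the Soar 101 decision cycle on an Eaters-style grid.
# Cell contents and the ranking used for comparison are assumptions.

def propose_operators(cells):
    """Propose a move operator for every direction that is not a wall."""
    return [d for d, contents in cells.items() if contents != "wall"]

def compare_operators(cells, operators):
    """Create preferences: bonus food > normal food > empty cell."""
    rank = {"bonus-food": 2, "normal-food": 1, "empty": 0}
    return {op: rank[cells[op]] for op in operators}

def select_operator(preferences):
    """Select the most preferred operator (ties broken arbitrarily)."""
    return max(preferences, key=preferences.get)

def apply_operator(operator):
    """Apply the selected operator by creating an output command."""
    return {"move-direction": operator}

cells = {"north": "bonus-food", "south": "normal-food", "east": "empty"}
operators = propose_operators(cells)                  # input -> propose
preferences = compare_operators(cells, operators)     # compare
print(apply_operator(select_operator(preferences)))   # select -> apply -> output
# {'move-direction': 'north'}
```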

Soar 102: Subgoals
If the preferences do not determine a choice among North, South, and East, a tie impasse arises and Soar creates a subgoal.
In the subgoal, an evaluate-operator operator is applied to each tied candidate, and look-ahead produces evaluations (in the example, North = 10, South = 10, East = 5).
Chunking creates a rule that applies evaluate-operator, and rules that create preferences (North > East, South > East, North = South) based on what was tested in the subgoal.
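As a rough illustration of how a tie impasse can be resolved by look-ahead and then cached, the sketch below memoizes the evaluations produced in the "subgoal" so that later decisions need only knowledge search. The toy evaluation values are assumptions standing in for genuine look-ahead search; this is not the Soar chunking mechanism itself.

```python
# Minimal sketch of a tie impasse resolved in a subgoal, with the result
# cached as a learned preference (a stand-in for chunking).

EVALUATION = {"bonus-food": 10, "normal-food": 10, "empty": 5}

chunks = {}  # learned "rules": state description -> {operator: evaluation}

def evaluate_operator(cells, operator):
    """Subgoal look-ahead: imagine applying the operator and score the result."""
    return EVALUATION[cells[operator]]

def select_operator(cells):
    key = tuple(sorted(cells.items()))
    if key not in chunks:
        # Tie impasse: no stored preferences distinguish the candidates,
        # so evaluate each one in a subgoal (problem search)...
        chunks[key] = {op: evaluate_operator(cells, op) for op in cells}
        # ...and cache the resulting preferences for next time (chunking).
    preferences = chunks[key]
    return max(preferences, key=preferences.get)

cells = {"north": "bonus-food", "south": "normal-food", "east": "empty"}
print(select_operator(cells))  # first call: problem search in the subgoal
print(select_operator(cells))  # second call: knowledge search via the chunk
```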

Learning Results (plot): Score vs. Decisions, comparing random behavior, look-ahead without chunking, look-ahead during chunking, and look-ahead after chunking.

Soar 102: Dynamic Task Decomposition
TacAir-Soar operator hierarchy (>250 goals, >600 operators, >8000 rules):
- Execute Mission: Fly-route, Ground Attack, Fly-Wing, Intercept
- Intercept: Achieve Proximity, Employ Weapons, Search, Execute Tactic, Scram
- Employ Weapons: Get Missile LAR, Select Missile, Get Steering Circle, Sort Group, Launch Missile
- Launch Missile: Lock Radar, Lock IR, Fire-Missile, Wait-for Missile-Clear
Example rules:
- Intercept: If instructed to intercept an enemy, then propose intercept.
- Employ Weapons: If intercepting an enemy, the enemy is within range, and the ROE are met, then propose employ-weapons.
- Launch Missile: If employing weapons, a missile has been selected, the enemy is in the steering circle, and the LAR has been achieved, then propose launch-missile.
- Lock IR: If launching a missile, it is an IR missile, and there is currently no IR lock, then propose lock-IR.

Chunking
- A simple architectural learning mechanism
- Automatically builds rules that summarize/cache processing
- Converts deliberate reasoning/planning into reaction: problem search => knowledge search
- Problem solving in subgoals determines what is learned
- Supports deliberate/reflective learning, leading to many different types of learning strategies
- If the reasoning is inductive, so is the learning

Why Beyond Chunking?
Chunking requires deliberate processing (operators) to:
- record experiences
- capture statistical regularities
- learn new concepts (data chunking)
This processing is done only because we want the learning, not because it performs the task, so learning competes with the task at hand, and it is hard to implement and hard to use.
Are there other architectural learning mechanisms?

Episodic Learning [Andrew Nuxoll]
What is it?
- Not facts or procedures, but memories of specific events
- Recording and recalling of experiences with the world
Characteristics of episodic memory:
- Autobiographical
- Not confused with the original experience
- Runs forward in time
- Temporally annotated
Why add it to the Soar architecture?
- Not appropriate as reflective learning
- Provides personal history and identity
- Memories that can aid future decision making and learning
- Can generalize and analyze when time and more knowledge are available

Episodic Learning
When is a memory recorded?
- After a fixed period of time
- On a "significant" event
- On a significant change in the most highly activated working memory elements
What are the cues for retrieval?
- Everything
- Only input
- The most "activated" input / everything
- Domain-specific features
Is retrieval automatic or deliberate?
What is retrieved?
- Changes to input
- Changes to working memory
- Changes to activated elements
How is the memory stored?
- As a production rule
What's missing:
- A sense of the time when the episode occurred
- The current implementation is not task independent
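To make these design choices concrete, here is a small Python sketch of one possible scheme: record snapshots of working memory as time-stamped episodes and retrieve the one that best overlaps a cue. The feature-overlap matching and the example features are assumptions for illustration, not the actual Soar mechanism (which, as noted above, stores memories as production rules).

```python
# Minimal sketch of an episodic memory: episodes are time-stamped snapshots
# of working memory, and retrieval returns the stored episode that best
# overlaps the cue. The matching scheme and features are assumptions.

episodes = []  # each episode: (time, dict of working-memory features)

def record_episode(time, working_memory):
    """Record a snapshot (e.g., every cycle, or on a 'significant' change)."""
    episodes.append((time, dict(working_memory)))

def retrieve_episode(cue):
    """Return the episode whose features best overlap the cue, or None."""
    def overlap(episode):
        _, features = episode
        return sum(features.get(k) == v for k, v in cue.items())
    best = max(episodes, key=overlap, default=None)
    return best if best is not None and overlap(best) > 0 else None

record_episode(1, {"cell-north": "bonus-food", "action": "move-north", "score": 10})
record_episode(2, {"cell-north": "empty", "action": "move-east", "score": 10})

print(retrieve_episode({"cell-north": "bonus-food"}))
# (1, {'cell-north': 'bonus-food', 'action': 'move-north', 'score': 10})
```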

Episodic Recall Implementation
During look-ahead in a tie impasse, evaluate-operator uses retrieved episodes to predict outcomes:
- If a memory matches, it computes the correct next state and its evaluation (e.g., North = 10).
- If no memory matches, a default evaluation (3) is returned.

Two Approaches
1. On-line
- Build memories as actions are taken
- Attempt to recall memories during look-ahead
- Chunk the use of memories during look-ahead
2. Off-line
- Randomly explore while memories are recorded
- Off-line, attempt to recall and learn from the recorded memories
- Chunk the use of memories during look-ahead

On-line Episodic Learning

On-Line Episodic Learning

Off-Line Episodic Learning

Reinforcement Learning [Shelley Nason]
Why add it to Soar?
- It might capture statistical regularities automatically/architecturally
- Chunking can do this only via deliberate learning
Why Soar?
- Potential to integrate RL with a complex problem solver (quantifiers, hierarchy, ...)
How can RL fit into Soar?
- Learn rules that create numeric, probabilistic preferences for operators
- Used only when symbolic preferences are inconclusive
- The decision is based on all preferences recalled for an operator
Why is this going to be cool?
- Dynamically compute Q-values based on all rules that match the state
- Get transfer at different levels of generality

Example Numeric Preferences
Six rules match the current state and create numeric preferences for North: 8, 12, 15, 1, 2, and 10. The combined value for North is their average: 48/6 = 8.
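A minimal sketch of that combination step, using the values from the slide: every matching rule contributes a numeric preference, and the operator's estimate is their average.

```python
# Minimal sketch of combining numeric preferences: every rule that matches
# the current state contributes a value, and the operator's estimate is the
# average of those contributions (values taken from the slide).

def combined_preference(rule_values):
    """Average the numeric preferences created by all matching rules."""
    return sum(rule_values) / len(rule_values)

north_rule_values = [8, 12, 15, 1, 2, 10]
print(combined_preference(north_rule_values))  # 48 / 6 = 8.0
```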

Reinforcement Learning
In the example, the agent selects North (= 10) in state A; in the resulting state B the proposed operators have values North = 11, East = 6, South = 3.
Create a rule that creates a numeric preference for North in state A, using the values in state B and max(proposed operators), according to the standard RL update.
Conditions of the rule?
- Current: all of state A
- Future: what was tested to produce the evaluation of state B but already existed in state A
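The "standard RL" backup referred to here is, roughly, a temporal-difference update toward the reward plus the discounted best value available in the next state. The sketch below illustrates that idea with the slide's numbers; the learning rate, discount factor, and reward are assumptions, and this is not the actual Soar-RL code.

```python
# Minimal sketch of a TD / Q-learning style backup: move the value of the
# operator taken in state A toward reward + gamma * max over the operators
# proposed in state B. Alpha, gamma, and the reward are assumptions.

ALPHA, GAMMA = 0.1, 0.9

def rl_update(q, state, operator, reward, next_state_values):
    """Update the numeric preference for (state, operator)."""
    target = reward + GAMMA * max(next_state_values.values())
    q[(state, operator)] += ALPHA * (target - q[(state, operator)])
    return q[(state, operator)]

q = {("A", "North"): 10.0}
state_b_values = {"North": 11.0, "East": 6.0, "South": 3.0}
print(rl_update(q, "A", "North", reward=0.0, next_state_values=state_b_values))
# 10.0 + 0.1 * (0.0 + 0.9 * 11.0 - 10.0) = 9.99
```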

Reinforcement Learning Results (plot): Score vs. Actions for Random, Learning runs 1-3, and the Learned Greedy policy.

Architectural Learning
- Automatic and ubiquitous
- Task independent and fixed
- Bounded processing
- Based on single experiences
- Examples: chunking, episodic learning, reinforcement learning, semantic/concept learning?

Deliberate/Reflective Learning
- Deliberately engaged, "on top" of the architecture
- Uses knowledge to control learning
- Uses architectural learning
- Can change with learning
- Unbounded processing
- Can generalize across multiple examples through recall
- Examples: task acquisition, learning by instruction, learning by analogy, recovery from incorrect knowledge, ...

Reflective Learning
What is required to support reflective/deliberate learning?
- In Soar, impasses and subgoals are important
What about ACT-R 5.0?
- It seems to have a declarative strategy?
- A way to make decisions at the meta-level?
- Indirect access to memory?
Syntactically complete: it can learn anything that can be represented.