Mental Simulation and Learning in the ICARUS Architecture
Pat Langley
School of Computing and Informatics, Arizona State University, Tempe, Arizona USA
Thanks to D. Choi, G. Cleveland, A. Danielescu, N. Li, D. and D. Stracuzzi for their contributions. This talk reports research partly funded by a grant from the Office of Naval Research, which is not responsible for its contents.
Cognitive Architectures
A cognitive architecture (Newell, 1990) is the infrastructure for an intelligent system that is constant across domains:
- the memories that store domain-specific content
- the system's representation and organization of knowledge
- the mechanisms that use this knowledge in performance
- the processes that learn this knowledge from experience
An architecture typically comes with a programming language that eases construction of knowledge-based systems.
Research in this area incorporates many ideas from psychology about the nature of human thinking.
The ICARUS Architecture
ICARUS (Langley, 2006) is a computational theory of the human cognitive architecture that posits:
1. Short-term memories are distinct from long-term stores
2. Memories contain modular elements cast as symbolic structures
3. Long-term structures are accessed through pattern matching
4. Cognition occurs in retrieval/selection/action cycles
5. Learning involves monotonic addition of elements to memory
6. Learning is incremental and interleaved with performance
It shares these assumptions with other cognitive architectures like Soar (Laird et al., 1987) and ACT-R (Anderson, 1993).
Goals for ICARUS
Our main objectives in developing ICARUS are to produce a computational theory of higher-level cognition in humans:
- that is qualitatively consistent with results from psychology
- that exhibits as many distinct cognitive functions as possible
Although quantitative fits to specific results are desirable, they can distract from achieving broad theoretical coverage.
Distinctive Features of ICARUS
However, ICARUS also makes assumptions that distinguish it from these architectures:
1. Cognition is grounded in perception and action
2. Categories and skills are separate cognitive entities
3. Short-term elements are instances of long-term structures
4. Skills and concepts are organized in a hierarchical manner
5. Inference and execution are more basic than problem solving
Some of these tenets also appear in Bonasso et al.'s (2003) 3T, Freed's (1998) APEX, and Sun et al.'s (2001) CLARION.
Cascaded Integration in ICARUS
Like other unified cognitive architectures, ICARUS incorporates a number of distinct modules.
[Figure: modules in the cascade: conceptual inference, skill execution, problem solving, learning]
ICARUS adopts a cascaded approach to integration in which lower-level modules produce results for higher-level ones.
Structure and Use of Conceptual Memory
ICARUS organizes conceptual memory in a hierarchical manner.
Conceptual inference occurs from the bottom up, starting from percepts to produce high-level beliefs about the current state.
ICARUS Concepts for In-City Driving

    ((in-rightmost-lane ?self ?clane)
     :percepts  ((self ?self) (segment ?seg)
                 (line ?clane segment ?seg))
     :relations ((driving-well-in-segment ?self ?seg ?clane)
                 (last-lane ?clane)
                 (not (lane-to-right ?clane ?anylane))))

    ((driving-well-in-segment ?self ?seg ?lane)
     :percepts  ((self ?self) (segment ?seg)
                 (line ?lane segment ?seg))
     :relations ((in-segment ?self ?seg)
                 (in-lane ?self ?lane)
                 (aligned-with-lane-in-segment ?self ?seg ?lane)
                 (centered-in-lane ?self ?seg ?lane)
                 (steering-wheel-straight ?self)))

    ((in-lane ?self ?lane)
     :percepts  ((self ?self segment ?seg)
                 (line ?lane segment ?seg dist ?dist))
     :tests     ((> ?dist -10) (<= ?dist 0)))
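The bottom-up inference that these concept definitions support can be illustrated with a minimal Python sketch. It is not the ICARUS matcher: the names `infer_primitive`, `infer_bottom_up`, and `RULES` are hypothetical, variable bindings are dropped from the higher-level rules, the relation lists are shortened, and the negated `lane-to-right` condition is ignored.

```python
# Hypothetical, simplified stand-in for ICARUS conceptual inference.

def infer_primitive(percepts):
    """Derive primitive beliefs from percepts, mirroring the in-lane :tests field."""
    beliefs = set()
    for line in percepts["lines"]:
        # Corresponds to (> ?dist -10) and (<= ?dist 0) in the in-lane concept.
        if -10 < line["dist"] <= 0:
            beliefs.add(("in-lane", percepts["self"], line["id"]))
    return beliefs

# Each rule pairs a concept name with the lower-level concepts it requires.
# Arguments are omitted here (an assumption made to keep the sketch short).
RULES = [
    ("driving-well-in-segment", ["in-lane", "centered-in-lane"]),
    ("in-rightmost-lane", ["driving-well-in-segment", "last-lane"]),
]

def infer_bottom_up(predicates):
    """Add higher-level concepts until no rule produces anything new."""
    known = set(predicates)
    changed = True
    while changed:
        changed = False
        for head, body in RULES:
            if head not in known and all(b in known for b in body):
                known.add(head)
                changed = True
    return known
```

For example, `infer_bottom_up({"in-lane", "centered-in-lane", "last-lane"})` adds both higher-level concepts, whereas leaving out `centered-in-lane` blocks them.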
Representing Short-Term Beliefs/Goals

    (current-street me A)               (current-segment me g550)
    (lane-to-right g599 g601)           (first-lane g599)
    (last-lane g599)                    (last-lane g601)
    (at-speed-for-u-turn me)            (slow-for-right-turn me)
    (steering-wheel-not-straight me)    (centered-in-lane me g550 g599)
    (in-lane me g599)                   (in-segment me g550)
    (on-right-side-in-segment me)       (intersection-behind g550 g522)
    (building-on-left g288)             (building-on-left g425)
    (building-on-left g427)             (building-on-left g429)
    (building-on-left g431)             (building-on-left g433)
    (building-on-right g287)            (building-on-right g279)
    (increasing-direction me)           (buildings-on-right g287 g279)
Skill Execution in ICARUS
Skill execution occurs from the top down, starting from goals to find applicable paths through the skill hierarchy.
This process repeats on each cycle to produce goal-directed but reactive behavior, biased toward continuing initiated skills.
ICARUS Skills for In-City Driving

    ((in-rightmost-lane ?self ?line)
     :percepts ((self ?self) (line ?line))
     :start    ((last-lane ?line))
     :subgoals ((driving-well-in-segment ?self ?seg ?line)))

    ((driving-well-in-segment ?self ?seg ?line)
     :percepts ((segment ?seg) (line ?line) (self ?self))
     :start    ((steering-wheel-straight ?self))
     :subgoals ((in-segment ?self ?seg)
                (centered-in-lane ?self ?seg ?line)
                (aligned-with-lane-in-segment ?self ?seg ?line)
                (steering-wheel-straight ?self)))

    ((in-segment ?self ?endsg)
     :percepts ((self ?self speed ?speed)
                (intersection ?int cross ?cross)
                (segment ?endsg street ?cross angle ?angle))
     :start    ((in-intersection-for-right-turn ?self ?int))
     :actions  ((steer 1)))
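Top-down selection of a path through such a skill hierarchy might look like the following Python sketch. It is a propositional simplification under stated assumptions: `SKILLS` and `find_path` are hypothetical names, variable bindings are omitted, the subgoal lists are shortened, and the bias toward continuing initiated skills is not modeled.

```python
# Hypothetical propositional version of the skills shown above.
SKILLS = {
    "in-rightmost-lane": {
        "start": ["last-lane"],
        "subgoals": ["driving-well-in-segment"],
    },
    "driving-well-in-segment": {
        "start": ["steering-wheel-straight"],
        "subgoals": ["in-segment", "centered-in-lane"],
    },
    "in-segment": {
        "start": ["in-intersection-for-right-turn"],
        "actions": ["steer 1"],
    },
}

def find_path(goal, beliefs, skills=SKILLS):
    """Return (skill path, actions) from the goal down to a primitive skill, or None."""
    if goal in beliefs:
        return None  # goal already satisfied, nothing to execute
    skill = skills.get(goal)
    if skill is None or not all(c in beliefs for c in skill["start"]):
        return None  # no skill for this goal, or its start conditions fail
    if "actions" in skill:
        return [goal], skill["actions"]  # primitive skill: executable now
    for sub in skill["subgoals"]:
        if sub in beliefs:
            continue  # skip subgoals that already hold
        found = find_path(sub, beliefs, skills)
        if found is None:
            return None  # the first unsatisfied subgoal has no applicable path
        path, actions = found
        return [goal] + path, actions
    return None
```

Calling `find_path` again on each cycle with updated beliefs gives the reactive character described for skill execution, since the selected path depends on the current state.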
Execution and Problem Solving in ICARUS
[Figure: flow diagram with elements Problem, Skill Hierarchy, Primitive Skills, Reactive Execution, an "impasse?" decision with yes/no branches, Problem Solving, and Executed plan]
Problem solving involves means-ends analysis that chains backward over skills and concept definitions, executing skills whenever they become applicable.
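A minimal Python sketch of means-ends analysis interleaved with execution follows. It is an illustration under stated assumptions, not the ICARUS problem solver: operators are hypothetical (preconditions, additions) pairs, there are no delete effects, chaining over concept definitions is omitted, and failed branches are not undone.

```python
def solve(goal, state, operators, plan, depth=10):
    """Achieve `goal` by backward chaining, executing operators once applicable.

    state     -- mutable set of literals that currently hold
    operators -- dict: name -> (set of preconditions, set of added literals)
    plan      -- list that receives operator names in execution order
    """
    if goal in state:
        return True
    if depth == 0:
        return False  # give up on overly deep chains
    for name, (preconds, adds) in operators.items():
        if goal not in adds:
            continue  # this operator does not achieve the goal
        # Subgoal on each precondition; satisfied ones return immediately.
        if all(solve(p, state, operators, plan, depth - 1) for p in preconds):
            state |= adds  # "execute" the operator by applying its effects
            plan.append(name)
            return True
    return False
```

With hypothetical operators `{"pick-up": (set(), {"holding"}), "stack": ({"holding"}, {"on-a-b"})}` and an empty state, solving for `"on-a-b"` records the plan `["pick-up", "stack"]`.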
ICARUS Learns Skills from Problem Solving
[Figure: the same flow diagram (Problem, Skill Hierarchy, Primitive Skills, Reactive Execution, "impasse?" decision with yes/no branches, Problem Solving, Executed plan), extended with a Skill Learning element]
Learning from Problem Solutions
ICARUS incorporates a mechanism for learning new skills that:
- operates whenever problem solving overcomes an impasse
- incorporates only information available from the goal stack
- generalizes beyond the specific objects concerned
- depends on whether chaining involved skills or concepts
- supports cumulative learning and within-problem transfer
This skill creation process is fully interleaved with means-ends analysis and execution.
Learned skills carry out forward execution in the environment rather than backward chaining in the mind.
ICARUS Summary
ICARUS is a unified theory of the cognitive architecture that:
- includes hierarchical memories for concepts and skills;
- interleaves conceptual inference with reactive execution;
- resorts to problem solving when it lacks routine skills;
- learns such skills from successful resolution of impasses.
We have developed ICARUS agents for a variety of simulated physical environments, including urban driving.
However, it has a number of limitations that we must address to improve its coverage of human intelligence.
Limitations of ICARUS Learning Abilities
ICARUS provides a plausible account for learning hierarchical skills from successful problem solving.
Recent work (Li et al., in press) has adapted this mechanism to learn from worked-out problem solutions by:
- storing states that arise in each step of the given solution
- using means-ends analysis to explain why each step occurred
- acquiring a new skill for each subproblem explained this way
However, ICARUS cannot learn from mistakes, such as those that result from unexpected goal interactions.
Goal-Driven Execution: A Recipe for Disaster
ICARUS incorporates a goal memory that contains a prioritized set of top-level goals.
On each cycle, the architecture notes the most important goal not satisfied by its current beliefs.
This goal determines which path through the skill hierarchy ICARUS selects for execution.
As a result, the system ignores already satisfied goals while working on this objective.
However, unforeseen interactions among goals can produce undesirable outcomes.
For instance, suddenly changing lanes to avoid a stalled vehicle can lead to collision with another one.
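The goal-selection step described here can be sketched in a few lines of Python. The function name `select_goal` and the goal names in the usage note are hypothetical; the sketch assumes goals are stored as a list ordered from most to least important.

```python
def select_goal(goals, beliefs):
    """Return the most important goal not satisfied by current beliefs.

    goals   -- list ordered by priority, most important first
    beliefs -- set of literals that currently hold
    """
    for goal in goals:
        if goal not in beliefs:
            return goal
    return None  # every top-level goal is satisfied
```

Given goals `["avoid-collision", "avoid-stalled-vehicle"]` and beliefs `{"avoid-collision"}`, the function returns `"avoid-stalled-vehicle"`. Nothing in this selection step monitors the already satisfied `"avoid-collision"` goal, which is the source of the interaction problem noted above.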
Learning from Goal Violations
An extended ICARUS that learns from unforeseen events might:
- Encounter a situation in which pursuing goal A leads it to violate previously satisfied goal B
- Use counterfactual reasoning to identify what it could have done differently to avoid the error
- Analyze the alternative to acquire a specialized skill indexed by goals A and B
- In future runs, prefer the specialized skill during execution, leading it to avoid the error.
Implementing this approach requires three basic extensions to the ICARUS architecture.
An Episodic Belief Memory
Before it can analyze the reasons why an error occurred, ICARUS must encode its previous experience.
We have introduced an episodic belief memory (Stracuzzi et al., in press) that:
- retains all beliefs inferred on earlier cognitive cycles; and
- annotates beliefs with time stamps specifying when they held.
These let the architecture reconstruct states that the agent has encountered recently.
The current implementation has no mechanisms for forgetting or retrieval, but we plan to add these in the future.
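One way to picture a time-stamped belief store is the Python sketch below. The class `EpisodicBeliefMemory`, its methods, and the interval encoding are assumptions made for illustration; the source does not specify how ICARUS stores its time stamps.

```python
class EpisodicBeliefMemory:
    """Keeps every belief with the cycle intervals during which it held."""

    def __init__(self):
        # belief -> list of [first_cycle, last_cycle] intervals (inclusive)
        self.intervals = {}

    def record(self, cycle, beliefs):
        """Store the beliefs inferred on one cognitive cycle."""
        for belief in beliefs:
            spans = self.intervals.setdefault(belief, [])
            if spans and spans[-1][1] >= cycle - 1:
                spans[-1][1] = cycle  # belief persisted: extend the interval
            else:
                spans.append([cycle, cycle])  # belief (re)appeared

    def state_at(self, cycle):
        """Reconstruct the set of beliefs that held on a given cycle."""
        return {
            belief
            for belief, spans in self.intervals.items()
            if any(start <= cycle <= end for start, end in spans)
        }
```

After recording three cycles, `state_at(2)` returns the beliefs of the second cycle, which is the kind of state reconstruction that later error analysis relies on. Like the implementation described above, this sketch never forgets anything.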
Learning from Counterfactual Reasoning
Before it can learn what it should have done differently, ICARUS must identify an alternative behavioral trajectory.
We have developed a counterfactual reasoning capability that:
- works backward from the violated goal to consider the agent's choices at each step;
- carries out repeated forward search to find a path that would have avoided the goal violation; and
- analyzes this path to create a new skill that takes both goals into account.
Because analysis starts with the conjoined goal, it produces a new skill with a specialized head and preconditions.
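The first two steps, working backward over past choice points and searching forward from each, can be sketched in Python as follows. All names (`counterfactual`, `forward_search`, and the `step`, `achieved`, `violated` callbacks) are hypothetical, the search is a plain depth-limited one, and the final skill-creation step is not shown.

```python
def forward_search(state, choices, step, achieved, violated, horizon):
    """Depth-limited search for actions that reach the goal without a violation."""
    if violated(state):
        return None
    if achieved(state):
        return []
    if horizon == 0:
        return None
    for action in choices:
        rest = forward_search(step(state, action), choices, step,
                              achieved, violated, horizon - 1)
        if rest is not None:
            return [action] + rest
    return None

def counterfactual(history, choices, step, achieved, violated, horizon=5):
    """Walk backward from the violation; return (index, alternative actions).

    history -- states visited, ending with the state where the violation occurred
    """
    for index in range(len(history) - 2, -1, -1):
        path = forward_search(history[index], choices, step,
                              achieved, violated, horizon)
        if path is not None:
            return index, path
    return None
```

In a toy lane-change setting where a state is a (lane, position) pair, the search would start from the most recent state before the collision and look for an action sequence that passes the stalled vehicle without entering an occupied cell.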
A Trace of Counterfactual Reasoning
[Figure: trace showing two failed attempts and one successful attempt; node labels include avoid-obstacles, on-left-side, on-right-side, crossing-into-left-lane, crossing-into-left-lane-straight, crossing-into-right-lane, lane-aligned-straight, wheels-straight, and throttle-special-value]
A Specificity Bias for Skill Execution
For ICARUS to benefit from skills learned by its counterfactual reasoning, it must prefer them over ones that caused errors.
We have altered the architecture's execution module to prefer:
- skills with more specific heads that match top-level goals
- skills with more specific conditions that match the state
These lead ICARUS to mask skills indexed by single goals with ones that handle goal interactions.
This in turn lets the system improve its ability to avoid errors in an incremental, cumulative manner.
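A specificity preference of this kind might be sketched in Python as below. Measuring specificity by counting matched goals and conditions is an assumption of this sketch, as are the name `prefer_specific` and the dictionary layout of skills.

```python
def prefer_specific(skills, goals, beliefs):
    """Pick the applicable skill with the most specific head, then conditions.

    skills  -- list of dicts with "name", "head" (goals indexed), "conditions"
    goals   -- set of current top-level goals
    beliefs -- set of literals describing the current state
    """
    applicable = [
        s for s in skills
        if set(s["head"]) <= set(goals) and set(s["conditions"]) <= set(beliefs)
    ]
    if not applicable:
        return None
    # Assumed measure: more matched goals first, then more matched conditions.
    return max(applicable, key=lambda s: (len(s["head"]), len(s["conditions"])))
```

When both a single-goal skill and a skill indexed by two goals apply, the two-goal skill wins, which masks the more general one in the way described above.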
Related Work on Error-Driven Learning
Our approach to learning from execution errors differs from, but bears similarities to:
- Learning search-control rules by discrimination in SAGE (Langley, 1985)
- Analytical learning from failure in Soar (Laird et al., 1986) and Prodigy (Minton, 1988)
- Ohlsson's (1996) theory of learning from constraint violations
- Mueller and Dyer's (1985) model of learning by daydreaming
The latter comes closest to our use of counterfactual reasoning, but it was not cast within a unified cognitive architecture.
Research Plans: Reasoning about Others
We designed ICARUS to model intelligent behavior in embodied agents, but our work to date has treated them in isolation.
The framework can deal with other independent agents, but only by viewing them as other objects in the environment.
Yet people can reason more deeply about the goals and actions of others, then use their inferences to make decisions.
Adding this capability to ICARUS will require extending its representation, performance processes, and learning methods.
An Extended Representation
For ICARUS to reason about other agents' mental states, it must first represent and store them.
We plan to introduce modal predicates like belief, goal, and intention to modify inferences like:

    (goal me (in-left-lane me segment16))
    (belief me (goal driver2 (in-right-lane driver2 segment16)))
    (belief me (belief driver2 (in-right-lane me segment16)))
    (goal me (belief driver2 (goal me (in-left-lane me segment16))))

This scheme eliminates the need for separate goal and belief memories, so a single working memory will suffice.
We can also include time stamps with each substructure to indicate its temporal scope.
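Embedded modal literals like these map naturally onto nested tuples. The Python sketch below is one hypothetical encoding, not the planned ICARUS representation; the helper `embedded` is an invented name, and time stamps are left out.

```python
# A single working memory holding nested modal literals as tuples.
WORKING_MEMORY = [
    ("goal", "me", ("in-left-lane", "me", "segment16")),
    ("belief", "me", ("goal", "driver2", ("in-right-lane", "driver2", "segment16"))),
    ("belief", "me", ("belief", "driver2", ("in-right-lane", "me", "segment16"))),
]

def embedded(literal, *modal_path):
    """Strip (modality, agent) prefixes in order; return the inner literal or None."""
    for modality, agent in modal_path:
        if len(literal) < 3 or literal[0] != modality or literal[1] != agent:
            return None  # the literal is not wrapped in this modality
        literal = literal[2]
    return literal

def query(memory, *modal_path):
    """Collect the contents of every literal that matches the modal path."""
    found = (embedded(lit, *modal_path) for lit in memory)
    return [inner for inner in found if inner is not None]
```

For instance, `query(WORKING_MEMORY, ("belief", "me"), ("goal", "driver2"))` retrieves what the agent believes driver2's goals to be, using the same memory that holds the agent's own goals.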
A Flexible Inference Mechanism
The current ICARUS inference process is both deductive and exhaustive, making it implausible and ineffective.
The revised architecture will carry out hill climbing through a space of possible worlds (truth assignments on ground literals).
Each step will involve changing an existing literal's truth value or generating an entirely new literal.
ICARUS will guide its inferential choices either by posterior probabilities or by expected values.
The system will also take into account recency of elements matched by consequents or antecedents.
This approach is influenced by Polyscheme, Markov logic, and theories of spreading activation.
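Hill climbing over truth assignments can be illustrated with the Python sketch below. The scoring function, which sums the weights of rules that are not violated, is an assumption loosely in the spirit of Markov logic; it stands in for the posterior probabilities or expected values mentioned above, and recency is not modeled. All names are hypothetical.

```python
def satisfied(world, antecedents, consequent):
    """A rule is violated only when all antecedents hold and the consequent fails."""
    holds = all(world.get(a, False) for a in antecedents)
    return (not holds) or world.get(consequent, False)

def score(world, rules):
    """Sum the weights of rules that the world does not violate."""
    return sum(w for w, ants, cons in rules if satisfied(world, ants, cons))

def hill_climb(evidence, literals, rules):
    """Flip one non-evidence literal at a time while the score improves."""
    world = dict(evidence)
    while True:
        best, best_score = None, score(world, rules)
        for lit in literals:
            if lit in evidence:
                continue  # observed literals stay fixed
            candidate = dict(world)
            # Flipping an absent literal amounts to generating a new one.
            candidate[lit] = not world.get(lit, False)
            s = score(candidate, rules)
            if s > best_score:
                best, best_score = candidate, s
        if best is None:
            return world  # local maximum: no single flip helps
        world = best
```

Because each accepted step strictly increases the score and the set of literals is finite, the loop terminates, although, as with any hill climbing, it may stop at a local maximum.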
Default Reasoning and Revisions
Given basic inference rules, these changes should let ICARUS make abductive leaps about others' mental states.
The agent's initial statements about others' beliefs will be the same as those for the agent.
But additional information can lead the system to revise these assumptions nonmonotonically when needed.
E.g., we assume that others can see what we see, then alter these beliefs if we note evidence otherwise.
This explains why making inferences about others often takes extra time and effort.
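The default-then-revise pattern can be sketched in Python as follows. The class `ModelOfOther` and its revision counter are hypothetical; the counter is only a rough proxy for the extra effort that revisions entail.

```python
class ModelOfOther:
    """Beliefs attributed to another agent, initialized from the agent's own."""

    def __init__(self, own_beliefs):
        self.beliefs = set(own_beliefs)  # default: the other believes what we do
        self.revisions = 0               # number of nonmonotonic changes made

    def note_evidence(self, belief, holds):
        """Revise an attributed belief when evidence contradicts the default."""
        if holds and belief not in self.beliefs:
            self.beliefs.add(belief)
            self.revisions += 1
        elif not holds and belief in self.beliefs:
            self.beliefs.discard(belief)  # retract a default assumption
            self.revisions += 1
```

Starting from the agent's own beliefs costs nothing, while each call that changes the model adds a revision, which matches the intuition that departing from the default takes additional work.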
Learning to Reason about Others
Reasoning about others comes more easily to the experienced than to children and novices.
We can explain this with a mechanism that learns inference rules from empirical regularities among beliefs by:
- Generating new structures based on co-occurrences of literals in working memory; and
- Updating probabilities associated with antecedents and rules based on later co-occurrences.
When these specialized rules drive inference, they mask more basic ones, reducing the need for later revisions.
This causes more direct inferences about others' mental states, thus reaching conclusions with less time and effort.
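Counting co-occurrences and turning them into rule probabilities might look like the Python sketch below. It estimates a simple conditional frequency for single-antecedent rules; the class `RuleLearner` and this estimator are assumptions, since the source does not specify the update scheme.

```python
from collections import Counter
from itertools import permutations

class RuleLearner:
    """Estimates P(consequent | antecedent) from working-memory snapshots."""

    def __init__(self):
        self.seen = Counter()      # how often each literal appeared
        self.together = Counter()  # how often each ordered pair co-occurred

    def observe(self, literals):
        """Update counts from one snapshot of working memory."""
        lits = set(literals)
        self.seen.update(lits)
        self.together.update(permutations(sorted(lits), 2))

    def probability(self, antecedent, consequent):
        """Relative frequency of the consequent among snapshots with the antecedent."""
        if self.seen[antecedent] == 0:
            return 0.0
        return self.together[(antecedent, consequent)] / self.seen[antecedent]
```

Each new snapshot both introduces candidate rules (new pairs) and refines the estimates for pairs seen before, mirroring the two steps listed above.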
Summary of Planned Research
To provide ICARUS with the capability to reason about others' mental states, we plan to:
- Extend its representation to support embedded modal literals;
- Alter inference to hill climb through possible worlds guided by recencies and probabilities;
- Combine default reasoning about others with nonmonotonic revision when appropriate; and
- Acquire specialized inference rules from experience to reduce the need for such belief revision.
We will implement these extensions to ICARUS and test them in urban driving and other settings.
End of Presentation