Published by Myron Wilkins. Modified over 9 years ago.
1
1 Learning through Interactive Behavior Specifications. Tolga Konik (CSLI, Stanford University), Douglas Pearson (Three Penny Software), John Laird (University of Michigan)
2
2 Goal: Automatically generate cognitive agents. Reduce the cost of agent development. Reduce the expertise required to develop agents.
3
3 Domains Autonomous Cognitive agents Dynamic Virtual Worlds Real time decisions based on knowledge and sensed data Soar agent architecture
4
4 Learning by Observation. Approach: Observe expert behavior. Learn to replicate it. Why? We may want human-like agents. In complex domains, imitating humans may be easier than learning from scratch.
5
5 Bottleneck in pure Learning by Observation. PROBLEM: You cannot observe the internal reasoning of the expert. SOLUTION: Ask the expert for additional information (goal annotations). Use additional knowledge sources (task & domain knowledge).
6
6 Learning by Observation [Diagram labels: Expert, Interface, Environment, Agent, Actions, Percepts, Learner, Goal annotations, Additional Task Knowledge]
7
7 Agent Interface Environment ILP 2004 Machine Learning Journal (forthcoming) Learning by Observation
8
8 Learning by Observation Critic Mode Agent Interface Environment Expert critic Learner
9
9 One Body, Two Minds ? How and when to switch control How the expert and the agent program communicate ? Agent Interface Environment Expert
10
10 Expert Diagrammatic Behavior Specification Agent Environment Redux Learner
11
11 Redux Visual rule editing Diagrammatic Behavior Specification
12
12 Goal Hierarchy. Task-performance knowledge is represented with a hierarchy of durative goals. Goals: Get-item(Item), Get-item-in-room(Item), Get-item-different-room(Item), Goto-next-room, Go-to-door(D), Go-to(Door), Go-through(Door). Map labels: rooms r1 r2 r3 r4; doors d1 d2 d3 d4 d5 d6; items i3 i4.
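One way to picture a hierarchy of durative goals is as a small tree of parameterized goals. The sketch below is illustrative only: the `Goal` class, its methods, and in particular the parent/child nesting are assumptions, since the transcript lists the goal names but does not preserve how the diagram connects them.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    """A durative goal with named parameters and optional subgoals (hypothetical type)."""
    name: str
    params: tuple = ()
    subgoals: list = field(default_factory=list)

    def names(self):
        """Return all goal names in this subtree, depth first."""
        found = [self.name]
        for sub in self.subgoals:
            found.extend(sub.names())
        return found

    def instantiate(self, bindings):
        """Render the goal with parameters replaced by bound values, e.g. Item=i3."""
        if not self.params:
            return self.name
        args = ", ".join(bindings.get(p, p) for p in self.params)
        return f"{self.name}({args})"


# ASSUMPTION: this nesting is a guess; the slide only lists the goal names.
hierarchy = Goal("Get-item", ("Item",), [
    Goal("Get-item-in-room", ("Item",)),
    Goal("Get-item-different-room", ("Item",), [
        Goal("Goto-next-room", (), [
            Goal("Go-to-door", ("D",)),
            Goal("Go-to", ("Door",)),
            Goal("Go-through", ("Door",)),
        ]),
    ]),
])

print(hierarchy.instantiate({"Item": "i3"}))
```

Binding a parameter, as in Item=i3, turns the abstract goal Get-item(Item) into the concrete goal Get-item(i3).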
13
13 Goal Hierarchy. Goals: Get-item(i3), Get-item-in-room(Item), Get-item-in-room(i3), Get-item-different-room(Item), Goto-next-room, Go-to-door(D), Go-to(Door), Go-through(Door). Binding: Item=i3. Map labels: rooms r1 r2 r3 r4; doors d1 d2 d3 d4 d5 d6; items i3 i4.
14
14 Goal Hierarchy. Goals: Get-item(i3), Get-item-in-room(Item), Get-item-different-room(Item), Get-item-different-room(i3), Go-to(Door), Go-to(d1), Go-through(Door). Bindings: Item=i3, Door=d1. Map labels: rooms r1 r2 r3 r4; doors d1 d2 d3 d4 d5 d6; items i3 i4.
15
15 Goal Hierarchy. Goals: Get-item(i3), Get-item-in-room(Item), Get-item-different-room(i3), Goto-next-room, Go-to-door(D), Go-to(Door), Go-through(d1). Binding: Door=d1. Map labels: rooms r1 r2 r3 r4; doors d1 d2 d3 d4 d5 d6; items i3 i4.
16
16 Goal Hierarchy. Goals: Get-item(i3), Get-item-in-room(Item), Get-item-different-room(i3), Goto-next-room, Go-to-door(D), Go-to(d3), Go-through(Door). Binding: Door=d3. Map labels: rooms r1 r2 r3 r4; doors d1 d2 d3 d4 d5 d6; items i3 i4.
17
17 Behavior Specification (Agent, Expert). Expert draws the initial abstract situation. Create a scenario by selecting actions.
18
18 Goal Specification Agent Expert Goals are explicitly selected The agent contributes based on the current situation, current goal and its knowledge
19
19 Switching Roles. Expert generates behavior if the agent doesn’t know how to pursue the current goal. Agent may propose goals, subgoals and actions. If the agent is correct, the expert observes and validates; otherwise the expert rejects, corrects, or takes over. Key to the interaction are shared goals and a shared assumption about the current situation.
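The role-switching policy on this slide can be sketched as a single decision step. This is a minimal illustration, not Redux's actual control logic: the function name, the verdict labels, and the returned tuple layout are all assumptions made for the example.

```python
from typing import Optional, Tuple

# ASSUMPTION: hypothetical verdict labels, not taken from the presentation.
ACCEPT = "accept"
REJECT = "reject"


def interaction_step(
    proposal: Optional[str],
    verdict: str = ACCEPT,
    expert_choice: Optional[str] = None,
) -> Tuple[str, Optional[str], Optional[str]]:
    """Return (controller, selected, rejected) for one goal or action decision.

    proposal: what the agent proposes, or None if it does not know how to proceed.
    verdict: the expert's reaction to the proposal.
    expert_choice: what the expert selects when generating or correcting behavior.
    """
    if proposal is None:
        # The agent has no suggestion, so the expert generates the behavior.
        return ("expert", expert_choice, None)
    if verdict == ACCEPT:
        # The expert observes and validates; the agent keeps control.
        return ("agent", proposal, None)
    # The expert rejects, optionally corrects, and takes over.
    return ("expert", expert_choice, proposal)


print(interaction_step("Go-to(d3)", REJECT, "Go-to(d1)"))
```

Recording the rejected proposal alongside the selected one keeps both kinds of decision available to a learner.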
20
20 Goal Hierarchy. Learning by Observation perspective: unobservable mental reasoning of the expert. Learning perspective: biases the hypothesis space; the “learn agent” problem is reduced to “learn goal selection and termination”. MI (mixed-initiative) perspective: information exchange between the expert and the agent.
21
21 Relevant Knowledge Specification Agent Prepare food Expert can mark important objects in a decision Expert
22
22 Expert specified undesired actions and goals Expert rejected actions and goals of the approximately learned agent program Watch TV Rich Behavior Trace
23
23 Rich Behavior Trace: Hypothetical Actions and Goals. Situation history: a tree structure of possible behaviors.
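A situation history that branches into possible behaviors can be sketched as a tree whose root-to-leaf paths are alternative behaviors. The `SituationNode` class, its fields, and the example labels are hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class SituationNode:
    """One situation in the history; children are alternative continuations."""
    label: str
    hypothetical: bool = False  # True for a branch the expert only imagined
    children: list = field(default_factory=list)

    def branch(self, label, hypothetical=False):
        """Add a continuation of this situation and return it."""
        child = SituationNode(label, hypothetical)
        self.children.append(child)
        return child

    def paths(self):
        """Return every root-to-leaf sequence of labels."""
        if not self.children:
            return [[self.label]]
        return [[self.label] + rest
                for child in self.children
                for rest in child.paths()]


# ASSUMPTION: example labels are invented for illustration.
root = SituationNode("s0")
root.branch("s1: Go-to(d1)")
root.branch("s1-alt: Go-to(d3)", hypothetical=True)

for path in root.paths():
    print(" -> ".join(path))
```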
24
24 Relational Learning by Observation. Input: relational situations; goal and action selections and rejections; additional annotations (e.g. important objects); background knowledge. Output: rule-based agent program. Learn goal/action selection/termination, generalizing over multiple examples. Inductive Logic Programming to combine rich knowledge structures.
25
25 Relational Learning by Observation
26
26 Find the common structures in the decision examples Relational Learning by Observation
27
27 ? “Select a door in the current room, which leads to a room that contains the item the agent wants to get” Learn relations between what the agent wants, perceives and knows. Relational Learning by Observation
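The quoted rule can be read as a conjunction of relations over the current situation. The sketch below hand-codes that rule as a query over relational facts; it is not the output of the ILP learner, and the predicate names (`in-room`, `door-in`, `leads-to`, `contains`) and the example facts are assumptions chosen for illustration.

```python
def candidate_doors(facts, wanted_item):
    """Doors in the agent's current room that lead to a room containing wanted_item.

    facts: a set of (predicate, arg1, arg2) tuples (hypothetical encoding).
    """
    agent_rooms = {f[2] for f in facts if f[0] == "in-room" and f[1] == "agent"}
    doors = set()
    for pred, door, room in facts:
        if pred != "door-in" or room not in agent_rooms:
            continue
        # Check each room this door leads to for the wanted item.
        for pred2, door2, target in facts:
            if (pred2 == "leads-to" and door2 == door
                    and ("contains", target, wanted_item) in facts):
                doors.add(door)
    return sorted(doors)


# ASSUMPTION: a made-up situation, loosely modeled on the room/door/item labels.
facts = {
    ("in-room", "agent", "r1"),
    ("door-in", "d1", "r1"),
    ("door-in", "d2", "r1"),
    ("leads-to", "d1", "r2"),
    ("leads-to", "d2", "r3"),
    ("contains", "r2", "i3"),
    ("contains", "r3", "i4"),
}

print(candidate_doors(facts, "i3"))
```

The rule relates what the agent wants (the item), what it perceives (doors in the current room), and what it knows (where doors lead and what rooms contain).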
28
28 Comparing Redux to LBO. Advantages of Redux: No real-time constraints on behavior, e.g. no waiting for a 2-hour-long goal. Can be used to describe unlikely but critical situations, e.g. “Let’s assume that there is a nuclear melt-down.” Richer annotation opportunities. Increased learning speed and quality. Faster focus on where knowledge is most lacking. Immediate expert feedback on how rules behave.
29
29 Comparing Redux to LBO. Disadvantages of Redux: Can’t learn low-level behavior. Contains domain-specific components, although most of Redux is domain-independent. Generating behavior may be slower. Additional annotations improve learning but require extra expert effort.
30
30 Relational Behavior Trace. A Situation: a symbolic snapshot of the observed environment at a time. Behavior Trace: the set of situations in the execution history.
31
31 Annotated Behavior Traces Behavior is annotated with actions and goals: goto-room(r1), etc.
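An annotated behavior trace can be sketched as a sequence of symbolic snapshots, each tagged with the goals and actions active at that time. The `Situation` type, its fields, and the example facts are hypothetical; the source describes the trace as a set of situations, and the time-ordered list here is a convenience of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Situation:
    """A symbolic snapshot of the observed environment at one time step."""
    time: int
    facts: frozenset          # relational facts, e.g. ("in-room", "agent", "r2")
    annotations: tuple = ()   # goals and actions annotated on this situation


def times_annotated_with(trace, annotation):
    """Return the time steps at which the given goal or action is annotated."""
    return [s.time for s in trace if annotation in s.annotations]


# ASSUMPTION: invented example data; only goto-room(r1) appears in the slides.
trace = [
    Situation(0, frozenset({("in-room", "agent", "r2")}), ("goto-room(r1)",)),
    Situation(1, frozenset({("in-room", "agent", "r2")}), ("goto-room(r1)",)),
    Situation(2, frozenset({("in-room", "agent", "r1")})),
]

print(times_annotated_with(trace, "goto-room(r1)"))
```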
32
32 Summary Diagrammatic behavior specification approach: To extract rich behavior knowledge Interactive behavior specification Communication medium between the agents (explicit goals and assumed situation) Relational learning by observation approach to combine multiple complex knowledge sources
33
33 Future Work Improve mixed initiative interaction of the interface Explore domain independent diagrammatic interface features Allow the expert to enter context sensitive knowledge
34
34 Mixed initiative perspective Interactive behavior specification Diagrammatic representation of behavior communication medium between the agents Explicit goals and desired behavior Facilitates interaction between the agents