1 Update on Learning by Observation: Learning from Positive Examples Only. Tolga Konik, University of Michigan
2 GOAL: Generate AI agents by observing expert task execution. Engineering goal: reduce the cost of agent development; reduce the expertise required for agent development. AI goal: agents that improve themselves by observing experts.
3 Learning Framework (diagram). Components: Expert, Environment, Environmental Interface, Behavior Recorder, Episodic Database, Training Set Generator, Concept Learner (ILP), Knowledge Generator, Agent Program, Agent Architecture. Data labels: behavior trace, annotations, annotated behavior trace, examples, background knowledge, rules. Boundary labels: external, internal.
4 Learning with Redux (diagram). Same components and labels as the Learning Framework diagram, with Redux added.
5 Current Experiments (diagram). Same components and labels as the Learning Framework diagram, with an Expert Soar Agent added.
6 Learning Framework: Mode2 (diagram). Same components and labels as the Learning Framework diagram, with a New Agent Program added.
7 Experiments in Haunt 2 Domain
8 Move-to example (map figure). Operators shown: move-to-area, move-to-via-node, move-to-connected-node. Map labels: r1-r4, d1-d6, d5b, d6b, i3, i4.
9 An Example in Haunt Domain (map figure). Operators: move-to-area(Area), move-to-via-node(Node), move-to-connected-node(Node). Map labels: r1-r4, d1-d6.
10 An Example in Haunt Domain (map figure, continued). Operators: move-to-area(Area), move-to-via-node(Node), move-to-connected-node(Node). Map labels: r1-r4, d1-d6.
11 An Example in Haunt Domain: correct selection condition for move-to-via-node (map figure; labels r1, r3, d1). Operators: move-to-area(Area), move-to-via-node(Node), move-to-connected-node(Node).
12
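13 Example Generation: Operator Concepts. Termination(A) (timeline figure: operator A, with positive and negative example regions marked).
14 Example Generation: Operator Concepts. Selection(A) (timeline figure: operators A and B, with positive and negative example regions marked).
15 Learning Examples. A positive example: selection(Sit20, move-to-via-node(d1)) (map figure; labels r1-r4, d1-d6, d5b, d6b, i3, i4).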
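The example-generation step can be illustrated with a minimal sketch. It assumes a hypothetical trace format of (situation, selected operator, parameter) triples, and it assumes that a situation is a positive example of selection(A) when the expert selects A there and a negative example when the expert selects a different operator. The slides do not spell out either assumption, and the situation names other than Sit20 are made up.

```python
from typing import List, Tuple

# Hypothetical trace format (an assumption, not from the slides):
# one (situation_id, operator_name, parameter) triple per operator selection.
Trace = List[Tuple[str, str, str]]


def selection_examples(trace: Trace, operator: str):
    """Split a trace into positive and negative examples of selection(operator).

    Assumption: a situation is positive when the expert selects `operator`
    in it, and negative when the expert selects some other operator.
    """
    positives, negatives = [], []
    for situation, selected, param in trace:
        if selected == operator:
            # Positive examples carry the parameter the expert chose.
            positives.append((situation, f"{operator}({param})"))
        else:
            # The expert gave no parameter for `operator` in this situation,
            # so the negative example records the operator name only.
            negatives.append((situation, operator))
    return positives, negatives


trace = [
    ("sit20", "move-to-via-node", "d1"),
    ("sit21", "move-to-connected-node", "d1"),  # hypothetical situation
    ("sit35", "move-to-via-node", "d3"),        # hypothetical situation
]
pos, neg = selection_examples(trace, "move-to-via-node")
```

Note that the negative examples have no natural parameter value, since the expert never chose one for the operator in those situations.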
16-20 General-to-Specific Search with Positive and Negative Examples (five-step animated figure; step contents not recoverable).
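As a rough illustration of general-to-specific search, the sketch below works on propositional examples (sets of feature names) rather than the first-order clauses an ILP learner would use. It is a simplified greedy stand-in, not the learner used in this work, and the feature names are hypothetical. The search starts from the most general hypothesis (no conditions) and adds one condition at a time until no negative example is covered.

```python
from typing import FrozenSet, List, Set

Example = FrozenSet[str]


def covers(hypothesis: Set[str], example: Example) -> bool:
    """A conjunctive hypothesis covers an example that has all its features."""
    return hypothesis <= example


def general_to_specific(positives: List[Example], negatives: List[Example]) -> Set[str]:
    """Greedily specialize the empty hypothesis until negatives are excluded.

    Only features shared by every positive example are considered, so the
    hypothesis keeps covering all positives (positives must be non-empty).
    """
    hypothesis: Set[str] = set()
    candidates = set.intersection(*(set(p) for p in positives))
    remaining = list(negatives)  # negatives still covered by the hypothesis
    while remaining and candidates:
        # Pick the feature that excludes the most remaining negatives;
        # sorting makes tie-breaking deterministic.
        best = max(sorted(candidates),
                   key=lambda f: sum(f not in n for n in remaining))
        if all(best in n for n in remaining):
            break  # no candidate excludes anything further
        hypothesis.add(best)
        candidates.discard(best)
        remaining = [n for n in remaining if best in n]
    return hypothesis


# Hypothetical features describing a (situation, node) pair.
positives = [
    frozenset({"agent-in-room", "node-in-current-room", "node-on-path-to-target"}),
    frozenset({"node-in-current-room", "node-on-path-to-target", "target-visible"}),
]
negatives = [
    frozenset({"agent-in-room", "node-in-current-room"}),
    frozenset({"node-on-path-to-target", "target-visible"}),
]
rule = general_to_specific(positives, negatives)
```

In this toy data each negative example is excluded by a different condition, so the search ends with both conditions in the rule.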
21 Problem in Choosing Parameters. Selection(move-to-via-node) (map figure showing move-to-via-node and move-to-connected-node; labels r1-r4, d1-d6, d5b, d6b, i3, i4).
22 Problem in Choosing Parameters. Selection(move-to-via-node), with positive and negative examples marked (map figure showing move-to-via-node and move-to-connected-node; labels r1-r4, d1-d6, d5b, d6b, i3, i4).
23 Specific-to-General Learning with Positive Examples Only. Difficult to deal with inconsistent examples. (Figure: positive example d1.)
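The difficulty with inconsistent examples can be seen in a minimal propositional sketch, assuming examples are sets of hypothetical feature names and that generalization keeps only the features shared by all positive examples. This is a simplification chosen for illustration, not the method used in this work.

```python
from functools import reduce
from typing import FrozenSet, List, Set

Example = FrozenSet[str]


def specific_to_general(positives: List[Example]) -> Set[str]:
    """Start from the first example and drop every feature that any
    other positive example lacks (positives must be non-empty)."""
    return set(reduce(lambda acc, ex: acc & ex, positives))


# Hypothetical features describing a (situation, node) pair.
consistent = [
    frozenset({"node-in-current-room", "node-on-path-to-target", "agent-in-room"}),
    frozenset({"node-in-current-room", "node-on-path-to-target", "target-visible"}),
]
clean_rule = specific_to_general(consistent)

# One inconsistent example (for instance, an expert slip) removes every
# shared condition, leaving a rule that applies everywhere.
inconsistent = consistent + [frozenset({"target-visible"})]
noisy_rule = specific_to_general(inconsistent)
```

With the consistent examples the rule keeps the two shared conditions; after the single inconsistent example is added, the rule is empty.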
24 General-to-Specific Learning with Positive Examples Only (figure: positive examples).
25 General-to-Specific Learning with Positive Examples Only (figure: positive example d1).
26 Learning Examples. A positive example of move-to-via-node (map figure; labels r1-r4, d1-d6, d5b, d6b, i3, i4).
27 Learning Examples. Random examples of move-to-via-node: for each positive example, use the same situation with parameters selected in other situations. (Map figure; labels r1-r4, d1-d6, d5b, d6b, i3, i4.)
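The random-example rule on this slide can be sketched as follows. The (situation, parameter) pair format, the function name, the `per_positive` and `seed` arguments, and the situation names other than sit20 are all assumptions made for illustration. The sketch also assumes that the parameter the expert actually chose in a situation is excluded from that situation's random examples, which the slide does not state.

```python
import random
from typing import List, Tuple

# Hypothetical format: (situation_id, parameter the expert selected there).
Positive = Tuple[str, str]


def random_examples(positives: List[Positive],
                    per_positive: int = 1,
                    seed: int = 0) -> List[Positive]:
    """For each positive example, pair the same situation with parameters
    that the expert selected in other situations."""
    rng = random.Random(seed)
    all_params = sorted({param for _, param in positives})
    result = []
    for situation, param in positives:
        # Assumption: skip the parameter that was actually chosen here.
        others = [p for p in all_params if p != param]
        for other in rng.sample(others, min(per_positive, len(others))):
            result.append((situation, other))
    return result


positives = [("sit20", "d1"), ("sit35", "d3"), ("sit50", "d5")]
randoms = random_examples(positives, per_positive=1)
```

Each random example reuses a situation from the positive set, so no parameter has to be invented. A random example is not guaranteed to be a true negative, only a sample of what could have been selected in that situation.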
28 Nuggets: Move-to operators are learned in the Haunt domain. ~3 min of trace; ~ situations; ~10 min to prepare examples; ~20 min for learning.
29 Coals: Missing components. It is still research, not a tool.