Published by Gloria Bennett. Modified over 9 years ago.
1
Learning Modal Continuous Models
Joseph Xu, Soar Workshop 2012
2
Setting: Continuous Environment
Input to the agent is a set of objects with continuous properties
– Position, rotation, scaling, ...
Output is a fixed-length vector of continuous numbers
Agent runs in lock-step with the environment
Fully observable
[Figure: the Environment sends Input to the Agent as one property vector per object (px, py, pz, rx, ry, rz for objects A and B); the Agent returns an Output vector, e.g. (-9.0, 5.8).]
3
Levels of Problem Solving
Problem Solving Method: Motor Babbling; Continuous Sampling Methods (RRT); Symbolic Model-Free Methods (RL); Symbolic Planning
Knowledge Required: None; Goal Recognition; Continuous Model; Symbolic Abstraction; Symbolic Model
Characteristics: from Slower Task Completion and Specific Solutions to Faster Task Completion and General Solutions
4
Continuous Model Learning
Learn a function y = f(x, u)
– x: current continuous state vector
– u: current output vector
– y: state vector in the next time step
5
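Locally Weighted Regression
Query: the current state x and the motor command u (left voltage: -0.6, right voltage: 1.2)
Find the k nearest neighbors of the query
Fit a weighted linear regression to those neighbors to predict the unknown next state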
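The two steps named on this slide (k nearest neighbors, then weighted linear regression) can be sketched as follows. This is a minimal illustration, not the implementation used in the talk: the function name `lwr_predict`, the inverse-squared-distance weighting, and the default `k` are all assumptions.

```python
import numpy as np

def lwr_predict(X, Y, query, k=5):
    """Predict the target for `query` with locally weighted regression.

    X: (n, d) training inputs, each row a concatenated [x, u] vector.
    Y: (n, m) training targets, each row a next-state vector y.
    query: (d,) input to predict for.
    """
    # Step 1: pick the k nearest neighbors by Euclidean distance.
    dists = np.linalg.norm(X - query, axis=1)
    idx = np.argsort(dists)[:k]
    Xk, Yk, dk = X[idx], Y[idx], dists[idx]

    # Step 2: weight closer neighbors more heavily.
    # Inverse squared distance is an assumed kernel, chosen for brevity.
    w = 1.0 / (dk ** 2 + 1e-8)

    # Step 3: weighted least squares with an intercept column.
    A = np.hstack([Xk, np.ones((len(idx), 1))])
    sw = np.sqrt(w)[:, None]
    coef, *_ = np.linalg.lstsq(A * sw, Yk * sw, rcond=None)

    # Evaluate the local linear model at the query point.
    return np.append(query, 1.0) @ coef
```

Because neighbors are chosen purely by Euclidean distance, a query and a neighbor on opposite sides of a contact boundary are averaged together.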
6
Problems with LWR
Euclidean distance doesn’t capture relational similarity
Averages over neighbors exhibiting different types of interactions
[Figure: a query scene and one of its neighbors]
7
Problems with LWR
Euclidean distance doesn’t capture relational similarity
Averages over neighbors exhibiting different types of interactions
[Figure: a query scene, one of its neighbors, and the resulting prediction]
8
Modal Models
Object behavior can be categorized into different modes
– Behavior within a single mode is usually simple and smooth (inertia, gravity, etc.)
– Behaviors across modes can be discontinuous and complex (collisions, drops)
– Modes can often be distinguished by discrete spatial relationships between objects
Learn two-level models composed of:
– A classifier that determines the active mode using spatial relationships
– A set of linear functions (initial hypothesis), one for each mode
[Figure: Scene → Mode Classifier → Mode 1 model / Mode 2 model / Mode 3 model → Prediction]
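A minimal sketch of the two-level structure described above: a classifier selects the active mode from spatial relations, then that mode's linear function makes the prediction. The class name `ModalModel`, the relation dictionary format, and the toy one-dimensional pushing example are hypothetical, chosen only for illustration.

```python
import numpy as np

class ModalModel:
    """Two-level model: a mode classifier plus one linear function per mode."""

    def __init__(self, classifier, modes):
        self.classifier = classifier  # maps a relation dict to a mode index
        self.modes = modes            # list of (W, b) pairs, one per mode

    def predict(self, relations, xu):
        # Level 1: choose the active mode from discrete spatial relations.
        mode = self.classifier(relations)
        # Level 2: apply that mode's linear model to the continuous input.
        W, b = self.modes[mode]
        return mode, W @ xu + b

# Toy example (assumed): xu = [position of pushed block, dx of controlled block].
# Mode 0 (not touching): the pushed block stays where it is.
# Mode 1 (touching): the pushed block moves by dx.
model = ModalModel(
    classifier=lambda r: 1 if r["touch(A,B)"] else 0,
    modes=[
        (np.array([[1.0, 0.0]]), np.zeros(1)),
        (np.array([[1.0, 1.0]]), np.zeros(1)),
    ],
)
free_mode, free_pred = model.predict({"touch(A,B)": 0}, np.array([2.0, 0.5]))
push_mode, push_pred = model.predict({"touch(A,B)": 1}, np.array([2.0, 0.5]))
```

The discontinuity between the two behaviors lives entirely in the classifier, so each linear function only has to fit one smooth regime.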
9
Unsupervised Learning of Modes From Data
[Figure: over time the environment switches between Mode 1 and Mode 2; the continuous features at each step (e.g. 0.5, 1.1, -0.2, 4, 1721.9) form the training data; Expectation Maximization turns this data into Learned Mode 1 and Learned Mode 2.]
10
Expectation Maximization
Expectation
– Assuming your current model parameters are correct, what is the likelihood that model m generated data point i?
Maximization
– Assuming each data point was generated by the most probable model, modify each model’s parameters to maximize the likelihood of generating the data
Iterate until convergence to a local maximum
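The loop above can be sketched for linear modes as follows. This is an assumption-laden illustration rather than the talk's algorithm: it uses hard assignments (each point goes to its most probable mode, as the Maximization step describes), assumes equal-variance Gaussian noise so that the most likely mode is the one with the smallest squared residual, and fixes the number of modes in advance. The name `hard_em_linear_modes` and the `init` parameter are hypothetical.

```python
import numpy as np

def hard_em_linear_modes(X, y, n_modes=2, n_iters=50, seed=0, init=None):
    """Cluster (X, y) pairs into linear modes with hard-assignment EM.

    X: (n, d) inputs, y: (n,) targets.
    Returns (coefs, assign): coefs is (n_modes, d + 1) with the intercept last.
    """
    rng = np.random.default_rng(seed)
    n = len(X)
    A = np.hstack([X, np.ones((n, 1))])  # append intercept column
    # Start from a supplied or random assignment of points to modes.
    assign = np.asarray(init) if init is not None else rng.integers(n_modes, size=n)
    coefs = np.zeros((n_modes, A.shape[1]))

    for _ in range(n_iters):
        # Maximization: refit each mode's linear model on its assigned points.
        for m in range(n_modes):
            members = assign == m
            if members.sum() >= A.shape[1]:  # skip modes with too few points
                coefs[m], *_ = np.linalg.lstsq(A[members], y[members], rcond=None)

        # Expectation: under equal-variance Gaussian noise, the most likely
        # mode for a point is the one with the smallest squared residual.
        resid = (A @ coefs.T - y[:, None]) ** 2
        new_assign = resid.argmin(axis=1)

        if np.array_equal(new_assign, assign):
            break  # assignments stable: converged to a local optimum
        assign = new_assign

    return coefs, assign
```

As the slide notes, the result is only a local maximum, so different starting assignments can yield different modes.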
11
Learning Classifier
Spatial relations are extracted from the scene at each time step, e.g. for objects A and B:
– left-of(A,B) = 1
– right-of(A,B) = 0
– on-top(A,B) = 0
– touch(A,B) = 0
[Figure: each training example (e.g. 0.5, 1.1, -0.2, 4, 1721.9) yields a binary attribute vector of spatial relations; the mode assigned by Expectation Maximization (Learned Mode 1 or Learned Mode 2) is used as its class label.]
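As a rough illustration of how binary spatial relations like these could be computed from continuous object positions, here is a sketch for axis-aligned boxes. The slides do not define the predicates, so the geometric definitions, the `Box` type, and the tolerance `eps` below are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class Box:
    x: float  # center x
    y: float  # center y
    w: float  # width
    h: float  # height

def spatial_relations(a, b, eps=1e-6):
    """Return assumed binary spatial relations between axis-aligned boxes a and b."""
    a_l, a_r = a.x - a.w / 2, a.x + a.w / 2
    a_b, a_t = a.y - a.h / 2, a.y + a.h / 2
    b_l, b_r = b.x - b.w / 2, b.x + b.w / 2
    b_b, b_t = b.y - b.h / 2, b.y + b.h / 2

    # Signed gap along each axis: negative means the extents overlap.
    gap_x = max(b_l - a_r, a_l - b_r)
    gap_y = max(b_b - a_t, a_b - b_t)

    return {
        "left-of(A,B)": int(a_r <= b_l + eps),
        "right-of(A,B)": int(a_l >= b_r - eps),
        # a rests above b and their x extents overlap.
        "on-top(A,B)": int(a_b >= b_t - eps and gap_x < 0),
        # No gap on either axis: the boxes are in contact or overlapping.
        "touch(A,B)": int(max(gap_x, gap_y) <= eps),
    }

# A is well to the left of B, matching the relation values shown on the slide.
rels = spatial_relations(Box(0, 0, 1, 1), Box(3, 0, 1, 1))
```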
12
Learning Classifier
Classifier training data: binary attribute vectors of spatial relations, each labeled with a class (mode 1 or mode 2)
[Figure: a decision tree that tests touch(A, B) and left-of(A, B), with 1/0 branches leading to mode 1 and mode 2]
Use linear model for items in same mode
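A compact sketch of learning a decision tree over binary spatial-relation attributes with mode labels as classes. The ID3-style information-gain splitting used here is an assumption (the slides do not name the tree-learning algorithm), and the function names and training rows are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum(c / total * log2(c / total) for c in Counter(labels).values())

def build_tree(rows, labels, attrs):
    """Build a tree of (attribute, {0: subtree, 1: subtree}) tuples with label leaves."""
    majority = Counter(labels).most_common(1)[0][0]
    # Leaf: every example agrees, or there is nothing left to split on.
    if len(set(labels)) == 1 or not attrs:
        return majority

    def gain(attr):
        # Information gain of splitting on one binary attribute.
        parts = {0: [], 1: []}
        for row, label in zip(rows, labels):
            parts[row[attr]].append(label)
        remainder = sum(len(p) / len(labels) * entropy(p) for p in parts.values() if p)
        return entropy(labels) - remainder

    best = max(attrs, key=gain)
    rest = [a for a in attrs if a != best]
    branches = {}
    for value in (0, 1):
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        if subset:
            branches[value] = build_tree([r for r, _ in subset], [l for _, l in subset], rest)
        else:
            branches[value] = majority  # no examples: fall back to majority class
    return (best, branches)

def classify(tree, row):
    """Walk the tree until a leaf (a mode label) is reached."""
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[row[attr]]
    return tree

# Hypothetical training data: mode 2 only when A touches B from the left.
rows = [
    {"touch(A,B)": 0, "left-of(A,B)": 0},
    {"touch(A,B)": 0, "left-of(A,B)": 1},
    {"touch(A,B)": 1, "left-of(A,B)": 0},
    {"touch(A,B)": 1, "left-of(A,B)": 1},
]
modes = [1, 1, 1, 2]
tree = build_tree(rows, modes, ["touch(A,B)", "left-of(A,B)"])
```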
13
Prediction Accuracy Experiment
2 Block Environment
– Agent has two outputs (dx, dy) which control the x and y offsets of the controlled block at every time step
– The pushed block can’t be moved except by pushing it with the controlled block
– Blocks are always axis-aligned, and there’s no momentum
Training
– Instantiate Soar agent in a variety of spatial configurations
– Run 10 time steps; each step is a training example
Testing
– Instantiate Soar agent in some configuration
– Check accuracy of prediction for next time step
14
Prediction Accuracy – Pushed Block
15
Classification Performance
16
Prediction Performance Without Classification Errors
17
Levels of Problem Solving
Problem Solving Method: Motor Babbling; Continuous Sampling Methods (RRT); Symbolic Model-Free Methods (RL); Symbolic Planning
Knowledge Required: None; Goal Recognition; Continuous Model; Symbolic Abstraction; Symbolic Model
Characteristics: from Slower Task Completion and Specific Solutions to Faster Task Completion and General Solutions
18
Symbolic Abstraction
Lump continuous states sharing symbolic properties into a single symbolic state
Should be Predictable
– Planning requires an accurate model (e.g. STRIPS operators)
– Tends to require more states, more symbolic properties
Should be General
– Fast planning and transferable solutions
– Tends to require fewer states, fewer symbolic properties
[Figure: objects C1 and C2 in two symbolic states. S1: intersect(C1, C2); S2: ~intersect(C1, C2)]
19
Symbolic Abstraction
Hypothesis: contiguous regions of continuous space that share a single behavioral mode make good abstract states
– Planning within modes is simple because of linear behavior
– Combinatorial search occurs at the symbolic level
Spatial predicates used in the continuous model’s decision tree are a reasonable approximation
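One way to picture this abstraction: map every continuous state to the set of spatial predicates that hold in it, then collapse runs of identical predicate sets into a single abstract state. The sketch below assumes a one-dimensional world and a hypothetical `touch(c1,c2)` threshold. It illustrates the idea only and is not how the Soar agent is implemented.

```python
def abstract_state(state, predicates):
    """Return the sorted tuple of predicate names that hold in a continuous state."""
    return tuple(sorted(name for name, test in predicates.items() if test(state)))

def abstract_trace(states, predicates):
    """Collapse a continuous trajectory into its sequence of distinct abstract states."""
    trace = []
    for state in states:
        symbolic = abstract_state(state, predicates)
        # Consecutive states with identical predicates share one abstract state.
        if not trace or trace[-1] != symbolic:
            trace.append(symbolic)
    return trace

# Hypothetical 1-D example: c1 approaches c2 and then pushes it.
predicates = {"touch(c1,c2)": lambda s: abs(s["c1"] - s["c2"]) <= 1.0}
trajectory = [
    {"c1": 0.0, "c2": 3.0},
    {"c1": 1.0, "c2": 3.0},
    {"c1": 2.0, "c2": 3.0},
    {"c1": 3.0, "c2": 4.0},
]
trace = abstract_trace(trajectory, predicates)
```

Here four continuous states reduce to two abstract states: one where the blocks are apart and one where they touch.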
20
Abstraction Experiment
3 blocks; the goal is to push c2 to t
Demonstrate a solution trace to the agent
Agent stores the sequence of abstract states in the solution in epmem
Agent tries to follow the plan in an analogous task
Abstraction should include predicates about c1, c2, t, and avoid predicates about d1, d2, d3
[Figure: block layouts showing c1, c2, the target t, and d1, d2, d3]
21
Generalization Performance
80 Tasks Total (16 average)
22
Conclusions
For continuous environments with interacting objects, modal models are more general and accurate than uniform models
The relationships that distinguish between modes serve as a useful symbolic abstraction over continuous state
All this work takes Soar toward being able to autonomously learn and improve behavior in continuous environments
23
Evaluation
Coal
– Scaling issues: linear regression is exponential in the number of objects
– Linear modes are insufficient for more complex physics such as bouncing -> catastrophic failure
Nuggets
– Modal model learning is more accurate and general than uniform models
– Abstraction learning results are promising, but preliminary