Curious Reconfigurable Robots Kathryn Merrick ITEE Seminar – Friday 29th August Collaborators: Dayne Schmidt, Elanor Huntington
Overview Why build curious robots? Background Curious robots for creative play Strengths and limitations Current and future work
Why Build Curious Robots? Explore the scalability of existing curiosity algorithms Build adaptive robots: –Fault tolerant –Able to use tools –Reconfigurable –Creative
Adaptable Robots BigDog (Boston Dynamics, 2008) (Bongard et al., 2006)
Self-Reconfigurable Robots Research focuses on: –Hardware –Communication –Navigation (University of Pennsylvania, 2008)
Curious, Developmental Robots Curious Aibo (Sony CSL, ) Research focus is on behaviour Static robot configuration
Curious, Reconfigurable Robots Assume a changing robot configuration Focus on behavioural algorithms
Perceiving the World World comprises states and actions World is represented as a variable-length string from a grammar, rather than a fixed-length vector [Grammar: state S → ε or (label : value) pairs, where label = a unique identifier and value = a real number; action A → ε or a unique identifier]
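A minimal sketch of the variable-length representation, assuming each sensation pairs a unique identifier with a real number as the slide states. The function name `encode_state` and the sensor labels are hypothetical, and the exact string syntax is an assumption.

```python
from typing import Dict


def encode_state(sensations: Dict[str, float]) -> str:
    """Serialise labelled sensations as a variable-length string.

    Each sensation becomes one "(label:value)" pair. An empty mapping
    yields the empty string, which plays the role of epsilon.
    Labels are sorted so the same sensations always give the same string.
    """
    return "".join(
        f"({label}:{value})" for label, value in sorted(sensations.items())
    )


# The string grows or shrinks as sensors are added or removed,
# so no fixed-length vector has to be redefined.
two_sensors = encode_state({"touch6": 1.0, "light4": 0.5})
one_sensor = encode_state({"light4": 0.5})
no_sensors = encode_state({})
```

Because the string length follows the number of attached sensors, the same learning code can keep running when the robot is rebuilt.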
Modelling Curiosity Curiosity is modelled as a function of novelty and interest
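One way to picture curiosity as a function of novelty and interest is sketched below. The slide does not give the functions used, so everything here is an assumption: novelty is taken to halve each time a stimulus is repeated, and interest is taken to be a Wundt-style curve (a reward sigmoid minus a penalty sigmoid) that peaks at moderate novelty. The class name and all constants are hypothetical.

```python
import math
from collections import Counter


class CuriosityModel:
    def __init__(self, decay: float = 0.5) -> None:
        self.decay = decay
        self.counts: Counter = Counter()  # how often each stimulus was seen

    def novelty(self, stimulus: str) -> float:
        # Assumed form: novelty is 1.0 for an unseen stimulus and shrinks
        # geometrically with every repeat observation.
        return self.decay ** self.counts[stimulus]

    def interest(self, novelty: float) -> float:
        # Assumed Wundt-style curve: low for very familiar and very novel
        # stimuli, highest for moderately novel ones.
        reward = 1.0 / (1.0 + math.exp(-10.0 * (novelty - 0.3)))
        penalty = 1.0 / (1.0 + math.exp(-10.0 * (novelty - 0.8)))
        return reward - penalty

    def curiosity(self, stimulus: str) -> float:
        # Curiosity value = interest of the current novelty; observing the
        # stimulus then makes it more familiar.
        value = self.interest(self.novelty(stimulus))
        self.counts[stimulus] += 1
        return value


model = CuriosityModel()
first = model.curiosity("light4 changed")   # brand new: too novel
second = model.curiosity("light4 changed")  # seen once: moderately novel
```

Under these assumptions a stimulus is most rewarding shortly after it is first encountered and loses its appeal as it becomes familiar.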
Curious Reinforcement Learning Agent learns a mapping of states to actions and utility values Agent selects the best action most of the time and a random action some of the time Reward = Curiosity value [Figure: states S1, S2, S3 and actions A1, A2, A3]
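The learning loop described on this slide can be sketched as tabular Q-learning with epsilon-greedy action selection, where the reward comes from a curiosity function. Q-learning is an assumption here, since the slide names no specific algorithm. The class name, the parameter values, and the `curiosity` callable are hypothetical.

```python
import random
from collections import defaultdict
from typing import Callable, Dict, List, Tuple


class CuriousQLearner:
    def __init__(
        self,
        actions: List[str],
        curiosity: Callable[[str, str, str], float],
        alpha: float = 0.5,    # learning rate
        gamma: float = 0.9,    # discount factor
        epsilon: float = 0.1,  # probability of a random action
        seed: int = 0,
    ) -> None:
        self.actions = list(actions)
        self.curiosity = curiosity
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q: Dict[Tuple[str, str], float] = defaultdict(float)
        self.rng = random.Random(seed)

    def select_action(self, state: str) -> str:
        # Random action some of the time, best known action otherwise.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state: str, action: str, next_state: str) -> float:
        # Reward = curiosity value, instead of a task-specific reward.
        reward = self.curiosity(state, action, next_state)
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
        return reward


# Usage with a stand-in curiosity function that always returns 1.0.
learner = CuriousQLearner(["A1", "A2", "A3"], lambda s, a, s2: 1.0, epsilon=0.0)
learner.update("S1", "A1", "S2")
chosen = learner.select_action("S1")
```

Keys in the table are plain strings, so states and actions of any length can be stored without fixing the dimensions in advance.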
Architecture for a Curious, Reconfigurable Robot [Figure: Agent Layer and Device Layer. Components: Device Manager, Resource Manager, Curious Agent (Perception, Curiosity, Learning, Activation), Abstract Sensor, Abstract Actuator, Memory]
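A rough sketch of how a device layer might hide hardware changes from the agent layer. The slide lists only component names, so the class `DeviceLayer`, its methods, and the `sensor<port>` labels are hypothetical. The action labels follow the "move fwd port 4" form used on the measurement slides.

```python
from typing import Callable, Dict, List


class DeviceLayer:
    """Hypothetical device layer: tracks which sensors and motors are attached."""

    def __init__(self) -> None:
        self.sensors: Dict[int, Callable[[], float]] = {}
        self.motor_ports: List[int] = []

    def attach_sensor(self, port: int, read: Callable[[], float]) -> None:
        self.sensors[port] = read

    def attach_motor(self, port: int) -> None:
        if port not in self.motor_ports:
            self.motor_ports.append(port)

    def detach(self, port: int) -> None:
        # Remove whatever device is attached to this port, if any.
        self.sensors.pop(port, None)
        if port in self.motor_ports:
            self.motor_ports.remove(port)

    def sense(self) -> Dict[str, float]:
        # Abstract sensor view: label -> current reading.
        return {f"sensor{p}": read() for p, read in sorted(self.sensors.items())}

    def actions(self) -> List[str]:
        # Abstract actuator view: three actions per attached motor.
        verbs = ("move fwd", "move bkd", "stop")
        return [f"{v} port {p}" for p in sorted(self.motor_ports) for v in verbs]


layer = DeviceLayer()
layer.attach_motor(4)
layer.attach_motor(6)
layer.attach_sensor(1, lambda: 0.5)
before = layer.actions()  # six actions while both motors are attached
layer.detach(6)
after = layer.actions()   # three actions once the port 6 motor is removed
```

In this sketch the agent only ever asks for the current sensations and the current action set, so adding or removing a device changes what it perceives without changing its code.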
Curious Robots for Creative Play Explores relationship between structure and behaviour Creative thinking spiral (Resnick, 2007) Imagination, creativity, play, sharing and reflection
Strengths Robot is reconfigurable – sensors and effectors can be added or removed New behaviour emerges for new structure Relationship between structure and behaviour revealed
Limitations Robot does not exhibit the same cyclic behaviour seen in simulated agents Cyclic behaviour is harder to measure Learning takes a long time [Figure panels: (a) S1, S2, S3; (b) A1, A2, A3; (c) T1, T2, T3]
Curious Social Force Models for Reconfigurable Robots CDF Project: Dayne Schmidt
Measuring the Performance of Curious Robots Characterising attention focus using bifurcation diagrams With Elanor Huntington [Figure legend: 1 = A(move fwd port 4); 2 = A(move fwd port 6); 3 = A(move bkd port 4); 4 = A(move bkd port 6); 5 = A(stop port 4); 6 = A(stop port 6)]
Measuring a Curious, Reconfigurable Robot Focus of attention shifts as robot is reconfigured [Figure legend: 1 = A(move fwd port 4); 2 = A(move bkd port 4); 3 = A(stop port 4); 4 = A(move fwd port 6); 5 = A(move bkd port 6); 6 = A(stop port 6)]
Learning Approaches for Curious Robots Representing the state-action table as a neural network reduces memory requirements Attention focus limited to simple tasks [Figure: network linking s1, s2, s3, s4, s5 to A1, A2, A3, with weights w11, w21, w31, w41, w51]
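To illustrate the memory argument, the sketch below replaces the state-action table with a single-layer network that has one weight per (input, action) pair. The slide does not specify the network used, so the linear form, the class name `LinearQNetwork`, and the learning rate are assumptions.

```python
from typing import List


class LinearQNetwork:
    """Hypothetical single-layer network: Q(s, a) = sum_i w[i][a] * s[i]."""

    def __init__(self, n_inputs: int, n_actions: int) -> None:
        self.n_actions = n_actions
        # One weight per (input, action) pair, regardless of how many
        # distinct states are ever visited.
        self.w: List[List[float]] = [[0.0] * n_actions for _ in range(n_inputs)]

    def q_values(self, s: List[float]) -> List[float]:
        return [
            sum(self.w[i][a] * s_i for i, s_i in enumerate(s))
            for a in range(self.n_actions)
        ]

    def update(self, s: List[float], action: int, target: float, lr: float = 0.1) -> float:
        # Gradient step on the squared error for the chosen action only.
        error = target - self.q_values(s)[action]
        for i, s_i in enumerate(s):
            self.w[i][action] += lr * error * s_i
        return error


net = LinearQNetwork(n_inputs=5, n_actions=3)
n_weights = sum(len(row) for row in net.w)  # 5 inputs x 3 actions
net.update([1.0, 0.0, 0.0, 0.0, 0.0], action=0, target=1.0, lr=0.5)
```

A table needs one entry for every state-action pair it encounters, while this network stores a fixed number of weights. The trade-off is that a small linear network can only represent simple value functions.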
Alternatives to Curiosity? Modelling behaviour cycles as intrinsic reward for learning In natural systems behaviour cycles occur at biological, cognitive and social levels –Sleep-wake cycle, seasonal, tidal or lunar cycles –Learning cycle, habituation and recovery –Fashion cycles, sociological cycles (Ahlgren and Halberg, 1990)
Behaviour Cycles as Intrinsic Reward Advantages in natural systems include: –Anticipation, efficiency, competition, navigation –Creativity through exploration –Social self-advancement Also potential advantages for artificial systems
Conclusions We have created a curious, reconfigurable robot: –Sensors and effectors can be added or removed –New behaviour emerges for new structure Ongoing work for: –Understanding curiosity in complex environments –Learning speed and representation –Measurement of emergent behaviour –Alternatives to curiosity as motivation for reconfigurable robots