Peeping into the Human World Sociable Robots Peeping into the Human World
An Infant’s Advantages Non-hostile environment Actively benevolent, empathic caregiver Co-exists with mature version of self
Baby Scheme Physical form can evoke nurturing response Caregiver exaggerates voice, gestures
Kismet – a Baby Robot
Requirements Robot needs to perceive human state Computer vision, speech processing Human needs to perceive robot state Animatronics, speech generation Closed loop interaction requires both directions
Readable locus of attention Attention can be deduced from behavior Or can be expressed more directly
Expressing locus of attention Animatronic expression of locus of attention Small distance between cameras with wide and narrow field of views, simplifying mapping between the two Central location of wide camera allows head to be oriented accurately independently of the distance to an object Allows coarse open-loop control of eye direction from wide camera – improves gaze stability
Computing locus of attention “Cyclopean” camera Stereo pair Useful to have relatively stable view for computer vision tasks. Deviation from human arrangement. Retinotopic view is pretty unstable Need narrow field of view for precise foveation Could add wider field of view resource to eyes – solution on Cog Tricky aesthetically with cameras at time (and still now for wide field of view and not too expensive) So place on head Auxiliary benefit – not affected at all by eye-movement Makes for more stable system Open-loop control of eyes based on head-fixed imaging, refined by closed-loop control based on eye-fixed imaging
Computing locus of attention Visual input Pre-attentive filters Saliency map Pursuit (Cyclopean camera) Perceptual biases. Don’t have to exact – decision will be made attentively. Analogous to pop-out, pre-attentive processes in humans. Massively parallel, relatively simple operations that filter and prioritize the scene. Coarse behavioral influence. Primary information flow Modulatory influence Behavior system Eye movement
Looking Preference Internal influences bias how salience is measured “Seek toy” – low skin gain, high saturated-color gain Looking time 28% face, 72% block “Seek face” – high skin gain, low color saliency gain Looking time 80% face, 20% block Internal influences bias how salience is measured The robot is not a slave to its environment Prefers behaviorally relevant stimuli
Example (video)
Example (video) Robot’s search is task-specific Still opportunistic when appropriate Visual behavior conveys degree of commitment Gaze direction, expression conveys interest
Interpersonal Distance Comfortable interaction distance
Interpersonal Distance
Interpersonal Distance
Interpersonal Distance
Interpersonal Distance
Examples (video) “Back off buster!” “Come hither, friend” Robot backs away if person comes too close Cues person to back away too – social amplification Robot makes itself salient to call a person closer if too far away
Negotiating object showing Comfortable interaction speed Too fast – irritation response Too fast, Too close – threat response
Example (video)
Facial expressions
Facial Expressions (Russell, Scott&Smith) arousal surprise afraid elated angry stress excitement happy frustrated displeasure pleasure neutral sad content depression calm fatigued relaxed Emotions can be mapped to affect dimensions (Russell) Facial postures are related in a systematic way to these affective dimensions (Smith & Scott) bored sleepy sleep
Generating Posture, Expressions Open stance Low arousal anger fear tired Negative valence content Positive valence High arousal Closed stance
Example Facial Expression
With Posture (video)
Emotive Voice Quality
Synthesized Emotive Speech
With Vocalizations (video)
But… Is there more to life than being really, really, really, ridiculously good-looking?
Affective Intent (video)
Fernald’s Results Four cross-cultural contours of infant-directed speech Exaggerated prosody matched to infant’s innate responses time (ms) pitch, f (kHz) o approval That’s a good bo-o-y! No no baby. time (ms) pitch, f (kHz) o prohibition Can you get it? time (ms) pitch, f (kHz) o attention time (ms) pitch, f (kHz) o MMMM Oh, honey. comfort
Evidence for Fernald-like Contours Approval Prohibition Attention Soothing
Performing Recognition prohibition & high-energy neutral attention & approval energy variance soothing & low-energy neutral pitch mean Breazeal & Aryananda, Humanoids 2000
Examples (video)
Turn-Taking Cornerstone of human-style communication, learning, and instruction Four phases of turn cycle relinquish floor listen to speaker reacquire floor speak Integrates visual behavior & attention facial expression & animation body posture vocalization & lip synchronization
Example (video)
Kismet’s really, really, ridiculously detailed web-pages: Conclusions Robots can partake in “infant-caregiver” interactions These interactions are rich with scaffolding acts Prerequisite for socially situated learning Kismet’s really, really, ridiculously detailed web-pages: http://www.ai.mit.edu/projects/kismet/