Learning about Objects by Experiment – Paul Fitzpatrick – MIT Artificial Intelligence Laboratory
machine perception algorithms robust perception training data parameter tweaking
self-training perceptual system algorithms opportunities training data robust perception more algorithms
making opportunities In robotics, vision is often used to guide manipulation But manipulation can also guide vision Important for… Experimentation progressing when perception is ambiguous Correction recovering when perception is misleading Development bootstrapping when perception is primitive
a simple scene? Edges of table and cube overlap Color of cube and table are poorly separated Cube has misleading surface pattern Maybe some cruel grad-student faked the cube with paper, or glued it to the table
active segmentation Object boundaries are not always easy to detect visually
active segmentation Object boundaries are not always easy to detect visually Solution: Cog sweeps through ambiguous area
active segmentation Object boundaries are not always easy to detect visually Solution: Cog sweeps through ambiguous area Resulting object motion helps segmentation
active segmentation Object boundaries are not always easy to detect visually Solution: Cog sweeps through ambiguous area Resulting object motion helps segmentation Robot can learn to recognize and segment object without further contact
active segmentation
active segmentation
integrated behavior
segmentation examples
segmentation examples table car
boundary fidelity cube car bottle ball 0.05 0.1 0.15 0.2 0.25 0.3 0.04 0.08 0.12 0.16 cube car bottle ball (square root of) second Hu moment measure of anisiotropy measure of size (area at poking distance)
affordance exploitation poking object segmentation edge catalog object detection (recognition, localization, contact-free segmentation) affordance exploitation (rolling) manipulator detection (robot, human)
poking object segmentation
opportunities for segmentation robot, poking (Paul Fitzpatrick, Giorgio Metta) wearable, illumination (Charlie Kemp) camera, periodicity (Artur Arsenio)
segmentation on a wearable System detects when wearer reaches for an object, requests wearer to hold it up, then illuminates it
segmentation by watching a human System detects periodic motion – waving, tapping, etc. – and extracts seed points for segmentation
poking object segmentation edge catalog
sampling oriented regions
sample samples
most frequent samples 1st 21st 41st 61st 81st 101st 121st 141st 161st
Red = horizontal Green = vertical some tests Red = horizontal Green = vertical
natural images 00 22.50 900 22.50 450 22.50 450 22.50
poking object segmentation edge catalog object detection (recognition, localization, contact-free segmentation)
geometry+appearance Advantages: more selective; fast angles + colors + relative sizes invariant to scale, translation, In-plane rotation Advantages: more selective; fast Disadvantages: edges can be occluded; 2D method Property: no need for offline training
in operation
yellow on yellow
open object recognition robot’s current view recognized object (as seen during poking) pokes, segments ball sees ball, “thinks” it is cube correctly differentiates ball and cube
open object recognition
manipulator detection poking object segmentation edge catalog manipulator detection (robot, human) object detection (recognition, localization, contact-free segmentation)
manipulator recognition
affordance exploitation object segmentation poking affordance exploitation (rolling) object segmentation edge catalog manipulator detection (robot, human) object detection (recognition, localization, contact-free segmentation)
objects roll in different ways a bottle it rolls along its side a toy car it rolls forward a toy cube it doesn’t roll easily a ball it rolls in any direction
affordance exploitation
mimicry test Demonstration by human Mimicry in similar situation Mimicry when object is rotated Invoking the object’s natural rolling affordance Going against the object’s natural rolling affordance
mimicry test
affordance exploitation object segmentation poking affordance exploitation (rolling) object segmentation edge catalog manipulator detection (robot, human) object detection (recognition, localization, contact-free segmentation)
referring to objects
beyond objects familiar activities use constraint of familiar activity to discover unfamiliar entity used within it reveal the structure of unfamiliar activities by tracking familiar entities into and through them familiar entities (objects, actors, properties, …)
learning activity structure /ylo/ /lft/ /grin/ /ryt/ /ylo/ /lft/ /grin/ /ryt/ /ylo/ /lft/ /grin/ /ryt/ /ylo/ /lft/ /grin/ /ryt/ /ylo/ /lft/ /grin/ /ryt/ /ylo/ /lft/ /grin/ /ryt/ /ylo/ /lft/ /grin/ /ryt/ /ylo/ /lft/ /grin/ /ylo/ /lft/ /ylo/ /lft/ /ryt/ Extending Vocabulary /lft/ /yn/ /ryt/ Extending Vocabulary /lft/ /ylo/ /ryt/ /grin/ /lft/ /ylo/ /ryt/
conclusions active segmentation (through contact) appearance catalog (for oriented features) open object recognition for correction, enrollment affordance recognition (for rolling) open speech recognition (for isolated words) virtuous circle of development learning about and through activity