The Whole World in Your Hand: Active and Interactive Segmentation – Artur Arsenio, Paul Fitzpatrick, Charlie Kemp, Giorgio Metta – 3 × MIT-CSAIL + 1 × LIRA-LAB

the scientist in the crib “Walk upstairs, open the door gently, and look in the crib. What do you see? Most of us see a picture of innocence and helplessness, a clean slate. But, in fact, what we see in the crib is the greatest mind that has ever existed, the most powerful learning machine in the universe.” (Gopnik, Meltzoff & Kuhl, The Scientist in the Crib)

watch and learn

passive ‘domain general’ learning (Kirkham et al., Cornell)

learning about object boundaries – Prior visual experience facilitates an infant’s perception of object boundaries. Experience = seeing an individual object before seeing it paired with another. Needham et al. (Duke University)

act and learn

learning about object boundaries – Object exploration enhances knowledge of object boundaries compared with age-matched controls. Velcro mittens allow an early start to active object exploration. Needham et al. (Duke University)

perception of object manipulation
- robot ‘manipulation’, first-person perspective (Paul Fitzpatrick, Giorgio Metta)
- human manipulation, external perspective (Artur Arsenio)
- human manipulation, first-person perspective (Charlie Kemp)

perception of object manipulation – robot ‘manipulation’, first-person perspective (Paul Fitzpatrick, Giorgio Metta)

experimentation helps perception
Rachel: We have got to find out if [ugly naked guy]'s alive.
Monica: How are we going to do that? There's no way.
Joey: Well there is one way. His window's open – I say, we poke him. (brandishes the Giant Poking Device)

robots can experiment
Robot: We have got to find out where this object’s boundary is.
Camera: How are we going to do that? There's no way.
Robot: Well there is one way. Looks reachable – I say, let’s poke it. (brandishes the Giant Poking Limb)

active segmentation
- Object boundaries are not always easy to detect visually
- Solution: Cog sweeps through the ambiguous area
- Resulting object motion helps segmentation
- Robot can learn to recognize and segment the object without further contact
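The mechanics are simple enough to sketch. Below is a minimal, illustrative version of poke-triggered segmentation in Python with OpenCV: difference the frames from just before and just after contact, and keep the largest region of change. The function name and thresholds are ours, not from the talk; Cog's actual implementation segmented using optical flow around the moment of impact and masked out the arm first.

```python
import cv2
import numpy as np

def segment_by_poking(frame_before, frame_after, diff_thresh=25):
    """Mask of the region set in motion by the poke (largest changed blob)."""
    g0 = cv2.cvtColor(frame_before, cv2.COLOR_BGR2GRAY)
    g1 = cv2.cvtColor(frame_after, cv2.COLOR_BGR2GRAY)
    # Pixels that changed between the pre- and post-contact frames.
    _, mask = cv2.threshold(cv2.absdiff(g0, g1), diff_thresh, 255,
                            cv2.THRESH_BINARY)
    # Clean up pixel noise and fill small holes in the moving region.
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    # Keep the largest connected component; in practice the arm itself
    # must be excluded first so it is not mistaken for the object.
    n, labels, stats, _ = cv2.connectedComponentsWithStats(mask)
    if n < 2:
        return np.zeros_like(mask)
    largest = 1 + int(np.argmax(stats[1:, cv2.CC_STAT_AREA]))
    return np.uint8(labels == largest) * 255
```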

integrated behavior

segmentation examples: table, car

head segmentation – the hard way!

perception of object manipulation – human manipulation, external perspective (Artur Arsenio)

interactive segmentation
Pipeline: Event Detection → Tracking → Multi-scale periodicity detection
- manipulator motion and no object motion → human is waving to introduce a stationary object to the robot
- object motion and no manipulator motion → human triggered a bouncy/wobbling object
- object motion and manipulator motion → human is showing an object by waving it
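To make the periodicity-detection box concrete, here is a single-scale sketch: given the trajectory of one tracked point, test for a dominant spectral peak. The peak_ratio threshold and fixed window length are illustrative simplifications of ours; the system on the slide runs this test at multiple temporal scales and over many tracked points.

```python
import numpy as np

def is_periodic(xs, fps=30.0, peak_ratio=5.0):
    """Return (periodic?, dominant frequency in Hz) for a 1-D position signal."""
    xs = np.asarray(xs, dtype=float)
    xs = xs - xs.mean()                      # remove the DC component
    spectrum = np.abs(np.fft.rfft(xs))
    freqs = np.fft.rfftfreq(len(xs), d=1.0 / fps)
    k = int(np.argmax(spectrum[1:])) + 1     # strongest non-DC bin
    # Periodic if the dominant peak clearly stands out from the mean energy.
    periodic = spectrum[k] > peak_ratio * spectrum[1:].mean()
    return periodic, freqs[k]

# e.g. a hand waved side to side at ~2 Hz for 3 seconds at 30 fps:
t = np.arange(90) / 30.0
print(is_periodic(100 + 40 * np.sin(2 * np.pi * 2.0 * t)))  # (True, 2.0)
```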

example: child’s book

example: waving a toy System detects periodic motion – waving, tapping, etc. – and extracts seed points for segmentation

example: sweeping brush

perception of object manipulation – human manipulation, first-person perspective (Charlie Kemp)

human action, machine perception

the “Duo” system
- Intersense Cube, which measures absolute orientation
- Backpack with wireless communication to the cluster
- Headphones that give spoken requests from the wearable creature to the human
- Wide-angle camera focused on the workspace of the hand
- LED array, for creature-controlled lighting for object segmentation
- Computer cluster for real-time perceptual processing and control through wireless communication

the “Duo” system – architecture blocks: Humanoid Platform, Wearable, Active Hand Detector, Human Reach Detector, Head Motion Detector, Near Head Segmentation, Head Motion Inhibitor, Behavior Inhibitor, Daily Behavior, Examine Object Behavior
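The slide gives only the block names, so the wiring below is our guess at how they compose: head motion inhibits near-head segmentation (the camera is head-mounted, so frames are only useful while the head is still), and a detected reach lets the examine-object behavior preempt the daily behavior. All names and logic here are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Percepts:
    hand_active: bool   # Active Hand Detector: wearer's hand is in view
    reaching: bool      # Human Reach Detector: wearer is reaching out
    head_moving: bool   # Head Motion Detector: head-mounted camera unstable

def step(p: Percepts) -> str:
    # Head Motion Inhibitor (assumed): near-head segmentation is only
    # attempted when the head-mounted camera is still.
    if p.head_moving:
        return "wait"
    # Behavior Inhibitor (assumed): a reach with an active hand lets the
    # examine-object behavior preempt the daily behavior.
    if p.reaching and p.hand_active:
        return "examine object"
    return "daily behavior"

print(step(Percepts(hand_active=True, reaching=True, head_moving=False)))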

say cheese… System requests the wearer to reach for an object; when it is held up to view, it is illuminated by the LED array.

perception of object manipulation
- robot ‘manipulation’, first-person perspective (Paul Fitzpatrick, Giorgio Metta)
- human manipulation, external perspective (Artur Arsenio)
- human manipulation, first-person perspective (Charlie Kemp)

using the segmentation constraint
object segmentation feeds:
- edge catalog
- object detection (recognition, localization, contact-free segmentation)
- affordance exploitation (rolling)
- manipulator detection (robot, human)
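The edge catalog is the most code-like of these, so here is a sketch of how one might be built from poked-out segmentations: sample small windows along each object's boundary, reduce them to binary patterns, and count pattern frequencies. The 4×4 patch size and mean-thresholding are our illustrative assumptions; the actual catalog also stored an empirical orientation for each pattern.

```python
import cv2
import numpy as np
from collections import Counter

def boundary_patches(gray, mask, size=4):
    """Yield binarized size x size patches sampled along the mask boundary."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    half = size // 2
    for c in contours:
        for (x, y) in c[:, 0, :]:
            patch = gray[y - half:y + half, x - half:x + half]
            if patch.shape == (size, size):
                # Binarize around the patch mean to get a discrete pattern.
                bits = (patch > patch.mean()).astype(np.uint8)
                yield tuple(bits.ravel())

catalog = Counter()
# for gray, mask in segmented_examples:   # masks produced by the poking behavior
#     catalog.update(boundary_patches(gray, mask))
# catalog.most_common(10)   # cf. the "most frequent samples" slide
```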

learning about edges

sample samples

most frequent samples (shown in rank order: 1st, 21st, 41st, 61st, 81st, 101st, 121st, 141st, 161st, 181st, 201st, 221st)

some tests (red = horizontal, green = vertical)

natural images – 45°

object recognition

yellow on yellow

open object recognition
sees ball, “thinks” it is cube → pokes, segments ball → correctly differentiates ball and cube
(panels: robot’s current view; recognized object, as seen during poking)
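As a stand-in for the recognition step, the sketch below matches the current view against appearances collected during poking, using color histograms as a proxy for the system's actual oriented-edge features. The "open" part is the loop: when the best match turns out wrong (ball labelled as cube), poking yields a fresh segmentation that corrects the model.

```python
import cv2
import numpy as np

def hist(bgr, mask=None):
    """Normalized 8x8x8 BGR color histogram of a view or segmented object."""
    h = cv2.calcHist([bgr], [0, 1, 2], mask, [8, 8, 8],
                     [0, 256, 0, 256, 0, 256])
    return cv2.normalize(h, h).flatten()

def recognize(view, models):
    """models: {name: histogram of the object as segmented during poking}."""
    v = hist(view)
    scores = {name: cv2.compareHist(v, m, cv2.HISTCMP_CORREL)
              for name, m in models.items()}
    return max(scores, key=scores.get)

# On a misrecognition, poke the object, segment it (see active segmentation
# above), and overwrite models[name] with the freshly segmented histogram.
```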

open object recognition

conclusions
- Find the hand, and you may find manipulable objects
- The constraint of manipulation simplifies object segmentation
- Offers an opportunity to gather visual experience, and to extend the range of situations within which segmentation is possible…

conclusions The hand is a good starting point!