
the DayOne project: how far can a robot develop in 24 hours? Paul Fitzpatrick MIT CSAIL

the DayOne project presentation: how much can I prepare in 24 hours? Paul Fitzpatrick MIT CSAIL

what is the DayOne project?
 An exercise in integration: creating a robot whose abilities expand qualitatively and quickly
 Motivated by the ability of the young of many species to “hit the ground running” when born
–e.g. a foal can typically trot, groom, follow, and feed from its mare, all within hours of birth
 Human infants, by contrast, are born in a relatively “premature” state

“abilities expand qualitatively, quickly”
 Robot is not just getting better at a single, specific problem
 Low-level vision
–Robot learns a basic edge-orientation filter
 Mid-level vision
–Robot learns to segment familiar objects from the background
 Mid-level audition
–Robot learns to differentiate utterances
 High-level perception
–Robot learns the role of objects and utterances within tasks
 All can run in real time, during a single session

low-level vision
 Orientation filter trained from physical probing of object boundaries
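A minimal sketch of the idea on this slide, not the talk's actual implementation: binary object masks recovered by poking (the hypothetical `masks` input) supply boundary patches whose true orientation is known from the mask gradient, and each small binarized patch accumulates the circular-mean orientation it should signal.

    import numpy as np

    def boundary_orientations(mask):
        """Yield (4x4 binary patch, boundary angle) pairs from an object mask."""
        gy, gx = np.gradient(mask.astype(float))
        ys, xs = np.nonzero(np.hypot(gx, gy) > 0.25)   # boundary pixels
        for y, x in zip(ys, xs):
            if 2 <= y < mask.shape[0] - 2 and 2 <= x < mask.shape[1] - 2:
                patch = mask[y - 2:y + 2, x - 2:x + 2] > 0
                yield patch, np.arctan2(gy[y, x], gx[y, x]) % np.pi

    def train_orientation_lut(masks):
        """Map each binarized 4x4 boundary patch to a circular-mean orientation."""
        acc = {}
        for mask in masks:
            for patch, angle in boundary_orientations(mask):
                c, s = acc.get(patch.tobytes(), (0.0, 0.0))
                # double the angle so 0 and pi (the same orientation) reinforce
                acc[patch.tobytes()] = (c + np.cos(2 * angle), s + np.sin(2 * angle))
        return {k: (np.arctan2(s, c) / 2) % np.pi for k, (c, s) in acc.items()}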

low-level vision
[figure-only slide]

mid-level vision
[Figure panels: camera image; edges found and grouped; response for each object implicated]
 Object appearance found through physical probing is learned, using features that depend on a well-trained orientation filter
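One plausible mechanism, sketched with assumed details (the feature choice, bin counts, and all names here are illustrative, not taken from the slide): the poked segmentation provides positive examples, the learned appearance is a histogram over quantized color and edge-orientation features, and matching that histogram against a new frame yields a per-pixel response map like the one pictured.

    import numpy as np

    N_COLOR, N_ORI = 32, 8  # quantization levels (illustrative)

    def feature_index(color_q, ori_q):
        """Fuse quantized color and edge orientation into one feature id."""
        return color_q * N_ORI + ori_q

    def learn_appearance(color_q, ori_q, mask):
        """Histogram the object's features inside its poked segmentation."""
        f = feature_index(color_q, ori_q)[mask]
        hist = np.bincount(f, minlength=N_COLOR * N_ORI).astype(float)
        return hist / hist.sum()

    def response_map(color_q, ori_q, hist):
        """Per-pixel evidence that the learned object explains each pixel."""
        return hist[feature_index(color_q, ori_q)]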

mid-level audition
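The system-organization diagram below names loudness-based segmentation as the acoustic front end. A minimal sketch of that step, with illustrative frame size and threshold values that are not from the talk:

    import numpy as np

    def segment_utterances(samples, rate, frame_ms=20, thresh_db=-35.0, min_ms=200):
        """Return (start_sec, end_sec) spans where short-term energy stays loud."""
        frame = int(rate * frame_ms / 1000)
        n = len(samples) // frame
        energy = (samples[:n * frame].reshape(n, frame).astype(float) ** 2).mean(axis=1)
        loud = 10 * np.log10(energy + 1e-12) > thresh_db
        spans, start = [], None
        for i, on in enumerate(list(loud) + [False]):  # sentinel closes a final span
            if on and start is None:
                start = i
            elif not on and start is not None:
                if (i - start) * frame_ms >= min_ms:   # drop clicks shorter than min_ms
                    spans.append((start * frame / rate, i * frame / rate))
                start = None
        return spans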

high-level perception

system organization
[Diagram: visual input → active segmentation → orientation training (takes many hours); acoustic input → loudness-based segmentation (takes a few seconds); each stream then passes through fast clustering (takes a few seconds), slow clustering (takes a few minutes), and identity inertia before reaching a shared pattern detector]


identity inertia
 Convention: a sender should not dramatically change the meaning of an outgoing signal line
–Unless requested by the receiver
 Like supporting a legacy API
[Diagram: a perceptual layer converts lower-level percepts into filtered percepts, with sporadic training signals passing in both directions]
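A toy sketch of how a layer might honor the convention when its internal clustering shifts, assuming centroid-based clusters (the formulation is illustrative, not from the talk): new clusters greedily inherit the labels of the nearest old clusters, so the meaning of each outgoing line drifts as little as possible.

    import numpy as np

    def relabel_with_inertia(old_centroids, new_centroids):
        """Assign each new cluster a label, reusing old labels where possible."""
        pairs = sorted((np.linalg.norm(n - o), i, j)
                       for i, n in enumerate(new_centroids)
                       for j, o in enumerate(old_centroids))
        labels, used = {}, set()
        for _, i, j in pairs:            # greedy nearest-pair matching
            if i not in labels and j not in used:
                labels[i] = j            # keep the legacy label
                used.add(j)
        fresh = len(old_centroids)       # genuinely new clusters get new labels
        for i in range(len(new_centroids)):
            if i not in labels:
                labels[i], fresh = fresh, fresh + 1
        return labels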

problem: pattern detector is monolithic
[Diagram: the same pipeline as above, with every clustered visual and acoustic percept converging on a single, central pattern detector]

solution: distribute pattern detector
 Make perceptual layers smarter
 Basically the approach in Fitzpatrick & Arsenio, EpiRob’04
 Periodic patterns are detected early (see the sketch below)
 But what about more complex patterns?
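A minimal sketch of the early, easy case (the formulation is assumed, not from the talk): take the smallest period that explains the whole sequence and predict by continuing it.

    def smallest_period(seq):
        """Length of the shortest prefix whose repetition reproduces seq."""
        for p in range(1, len(seq)):
            if all(seq[i] == seq[i % p] for i in range(len(seq))):
                return p
        return len(seq)

    def predict(seq, n):
        """Continue seq for n more symbols under its smallest period."""
        p = smallest_period(seq)
        return "".join(seq[(len(seq) + i) % p] for i in range(n))

    assert predict("01010", 4) == "1010"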

desired ability

Sequence   Guessed pattern   Prediction
01010      (01)*             1010…
           (01+)*            1010…, 1011…, 1101…, 1110…, 1111…
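A brute-force sketch of this ability (the candidate pool, search depth, and horizon are illustrative assumptions; the slide does not show the actual mechanism): a guessed pattern survives if the observed sequence can be extended to fully match it, and the surviving extensions are the predictions. On 01010 this reproduces both rows above.

    import re
    from itertools import product

    def consistent(pattern, seq, alphabet, depth=4):
        """Can seq be extended by at most `depth` symbols to match pattern?"""
        if re.fullmatch(pattern, seq):
            return True
        if depth == 0:
            return False
        return any(consistent(pattern, seq + c, alphabet, depth - 1)
                   for c in alphabet)

    def predictions(pattern, seq, alphabet, horizon=4, slack=4):
        """All viable length-`horizon` continuations of seq under pattern."""
        return sorted("".join(ext) + "…"
                      for ext in product(alphabet, repeat=horizon)
                      if consistent(pattern, seq + "".join(ext), alphabet, slack))

    print(predictions("(01)*", "01010", "01"))   # ['1010…']
    print(predictions("(01+)*", "01010", "01"))  # ['1010…', '1011…', '1101…', '1110…', '1111…']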

counting patterns

Distinct sequences   With local identity   Length
16,777,216           4,140                 8
387,420,489          21,147                9
10,000,000,000       115,975               10
285,311,670,611      678,570               11
8,916,100,448,256    4,213,597             12
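The rows check out if “distinct sequences” means sequences of length n over n symbols (n^n of them) and “with local identity” means counting up to a renaming of symbols, which leaves one class per set partition of the positions, i.e. the Bell number B(n). A short script (my reconstruction of the arithmetic, under that assumption) reproduces the table:

    def bell(n):
        """Bell number B(n), computed with the Bell triangle."""
        row = [1]
        for _ in range(n - 1):
            nxt = [row[-1]]
            for v in row:
                nxt.append(nxt[-1] + v)
            row = nxt
        return row[-1]

    for n in range(8, 13):
        print(f"{n**n:>17,}  {bell(n):>9,}  {n}")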

results

Sequence   Guessed pattern   Prediction
01010      (01)*             1010…
           (01+)*            1010…, 1011…, 1101…, 1110…, 1111…
           (012013)*         1301…
           (01[23])*         1201…, 1301…
           (001122)*         2200…
           ((.)\2)*          0000…, 0011…, 0022…, 0033…, 1100…, 1111…, 1122…, …, 3344…
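The last row can be replayed through the `predictions` sketch from the “desired ability” slide. The sequence column of this table did not survive transcription, so the input 001122 and the four-symbol alphabet here are illustrative stand-ins (the slide's larger alphabet also admits 3344…):

    # reuses consistent() / predictions() defined after "desired ability"
    print(predictions(r"((.)\2)*", "001122", "0123"))
    # ['0000…', '0011…', '0022…', '0033…', '1100…', '1111…', ..., '3333…']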

obligatory baby pictures