Embodied Machines

Presentation transcript:

Embodied Machines
The Grounding (Binding) Problem
– Real cognizers form multiple associations between concepts:
  – Affordances: how an object is interacted with
  – Frames: the background structure against which a concept is understood; sometimes highly complex (an educational system, family relationships)
  – Emotions: witnessing an event or seeing an object conjures up emotional states
  – Mental simulation: comprehending language may trigger imagistic modeling of the event, based on experience

Embodied Machines
– Mouse: mammal; small, furry, grey to brown; long whiskers; cats like to play with them and then eat them; they're used in experiments; ladies stand on chairs when they're around; they squeak; they're prolific breeders; they're sold live as snake food; they're one kind of rodent; they look a lot like rats; they are sometimes pets; they like to run on a wheel…
– Play: the opposite of work; it's fun; kids do it; scheduled in during grade school; you play games; you play with words; …

Embodied Machines
Approaches to meaning construction
– NLP
  – Text or speech is considered comprehended when it has been parsed syntactically and word meanings have been assigned
  – Meaning is pre-determined by humans in some way
– Embodied approach
  – The world has no structure until a body begins to interact with it; this requires goals and a sensorimotor system
  – Experience --> meaning
  – Words map onto meaning

Embodied Machines
Steels' Talking Heads
– Simple robots with auditory and visual systems; the motivating goal is a language game
– Simple environment: a two-dimensional world containing objects
– Robots determine their own categories for objects
– Robots determine their own labels for those categories
– Robots and environment are real physical entities
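The core dynamic of such a language game can be sketched in a few lines. This is a minimal textbook-style naming game, not Steels' actual Talking Heads algorithm: the real agents ground their categories in camera input, whereas here "objects" are plain integers and each agent keeps a private object-to-word lexicon. All names and parameters are illustrative.

```python
import random

def make_word(rng):
    # Invent a random CVCV-style label, e.g. "bagu".
    return "".join(rng.choice("bdgkmnprst") + rng.choice("aeiou") for _ in range(2))

def play_round(rng, speaker, hearer, objects):
    topic = rng.choice(objects)
    if topic not in speaker:
        speaker[topic] = make_word(rng)   # speaker invents a label on demand
    word = speaker[topic]
    if hearer.get(topic) == word:
        return True                       # game succeeds: labels already agree
    hearer[topic] = word                  # on failure, hearer adopts the label
    return False

def converged(agents, objects):
    # All agents share the same label for every object.
    return all(o in a and a[o] == agents[0][o] for a in agents for o in objects)

rng = random.Random(0)
agents = [{} for _ in range(5)]
objects = [0, 1, 2]
games = 0
while not converged(agents, objects):
    s, h = rng.sample(agents, 2)          # pick a random speaker/hearer pair
    play_round(rng, s, h, objects)
    games += 1
print("population agrees on all labels after", games, "games")
```

The key property this illustrates is self-organization: no agent dictates the vocabulary, yet repeated local invent-and-adopt interactions drive the population to a shared lexicon.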

Embodied Machines
Cangelosi & Parisi
– Virtual agents in a virtual world: a kind of embodied learning
– Agents have a physical location, orientation, and movement capabilities within their environment
– Agents consume mushrooms, which affects their energy status
– Agents (collectively) have a motivating task --> increase the fitness of the species
– Agents sense perceptual characteristics, not "mushrooms" --> they learn which characteristics distinguish edible from poisonous mushrooms
– Agents (collectively) learn to categorize and label mushrooms
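The learning problem a single agent faces can be sketched as follows. This is a toy stand-in, not the original model (Cangelosi & Parisi evolved neural networks over generations): the agent sees only a mushroom's perceptual features, never its category, and the edible/poisonous label here stands in for the energy change it would experience after eating. The feature rule and the perceptron-style update are invented for illustration.

```python
import random

def make_mushroom(rng):
    # Five binary perceptual features plus a constant bias input.
    features = [float(rng.randint(0, 1)) for _ in range(5)] + [1.0]
    # Hidden regularity: a mushroom is edible only when features 0 AND 1 are present.
    edible = features[0] == 1.0 and features[1] == 1.0
    return features, edible

def wants_to_eat(weights, features):
    return sum(w * f for w, f in zip(weights, features)) > 0

rng = random.Random(7)
weights = [0.0] * 6
for _ in range(2000):
    features, edible = make_mushroom(rng)
    target = 1 if edible else -1               # energy consequence of eating
    if wants_to_eat(weights, features) != (target == 1):
        for i, f in enumerate(features):       # correct the linear rule
            weights[i] += 0.1 * target * f

test_set = [make_mushroom(rng) for _ in range(200)]
accuracy = sum(wants_to_eat(weights, f) == e for f, e in test_set) / 200
print("accuracy on unseen mushrooms:", accuracy)
```

The point matches the slide: the agent never receives the category "poisonous" directly; it extracts the regularity from feature/consequence pairings alone.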

Embodied Machines
CELL (Deb Roy): Cross-channel Early Lexical Learning
– Models embodied language learning using input that approximates the input human infants receive
– Instantiated in a robot body with a microphone and camera
– CELL learns word-meaning correspondences from raw (unsegmented) audio and visual input

Embodied Machines
– First task: segmentation
  – Parse the audio stream into segments
  – Parse the video stream into objects
  – The segmentation process produces a channel of 'words' and a channel of shapes
– Second task: build a lexicon by identifying frequently co-occurring pairs of audio and visual segments

Embodied Machines
Illustrative example (not from actual data)
– Imagine the utterance "…don't throw the ball at the cat…"
– Uttered in a scene containing several identified objects, with noise present

Embodied Machines
– Objects are not necessarily identified in the same order as they are named in the utterance
– Time delays between the utterance and object recognition are highly likely
[Figure: the utterance "…throw the ball at the cat" on a timeline against object-recognition events]

Embodied Machines
– Short-term memory (STM): look at a temporal window surrounding each word
– The aim is to go back or forward far enough in time to have the word and its referent in the same window
[Figure: an STM window over "…throw the ball at the cat"]

Embodied Machines
– The window marches through the data stream, collecting segmented objects and words as candidate mappings
[Figure: the STM window at successive positions over "…throw the ball at the cat"]
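The windowing step can be sketched directly. The timestamps, tokens, and window width below are invented for illustration; the point is that every word/object pair falling inside the same temporal window becomes a candidate mapping, including spurious ones that the later mutual-information step must filter out.

```python
# Interleaved stream of (time in seconds, channel, token) events.
STREAM = [
    (0.8, "word", "throw"),
    (1.1, "word", "ball"),
    (1.6, "object", "BALL-SHAPE"),
    (2.0, "word", "cat"),
    (3.9, "object", "CAT-SHAPE"),
]

def candidate_pairs(stream, window=2.0):
    pairs = set()
    words = [(t, tok) for t, ch, tok in stream if ch == "word"]
    objects = [(t, tok) for t, ch, tok in stream if ch == "object"]
    for tw, word in words:
        for to, obj in objects:
            if abs(tw - to) <= window:   # word and referent share a window
                pairs.add((word, obj))
    return pairs

pairs = candidate_pairs(STREAM)
print(sorted(pairs))
```

Note that ("cat", "BALL-SHAPE") also lands in the candidate set here; the window is deliberately permissive, and disambiguation is deferred to the co-occurrence statistics.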


Embodied Machines
– Audio and visual segments that have a high degree of mutual information are likely semantically linked and should be saved in long-term memory (LTM)
[Table: occurrence and co-occurrence counts for words ('ball', 'cat', 'the', …) and visual object categories; the original layout is not recoverable]

Embodied Machines
Mutual information, as used here:
MI(a, b) = P(a & b) / (P(a) * P(b)), estimated as co-occurrence(a & b) / (occurrence(a) * occurrence(b))
– P('the' & [image]) = 40 / (90,000 * 59) ≈ 7.5 x 10^-6
– P('cat' & [image]) = 40 / (100 * 59) ≈ 6.8 x 10^-3
– Words like 'the' are promiscuous: they co-occur with so many categories that they lack predictive power.
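The two scores above can be computed directly from the counts on the slide (40 co-occurrences in each case; 90,000 occurrences of 'the', 100 of 'cat', 59 of the image). The function name is illustrative.

```python
def association(cooc, occ_a, occ_b):
    # The slide's estimate: co-occurrence(a & b) / (occurrence(a) * occurrence(b)).
    return cooc / (occ_a * occ_b)

score_the = association(40, 90_000, 59)   # 'the' & [image]
score_cat = association(40, 100, 59)      # 'cat' & [image]
print(f"'the': {score_the:.2e}   'cat': {score_cat:.2e}")
```

Although 'the' and 'cat' co-occur with the image equally often (40 times each), the score for 'cat' is roughly three orders of magnitude higher, because 'the' is divided by its enormous overall occurrence count; this is exactly why promiscuous function words lose out.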

Embodied Machines
Two implementations of CELL:
– Robot
– Learning from observing infant/caregiver interaction

Embodied Machines
Robot implementation
– Input: spoken utterances, plus images of objects acquired from a video camera mounted on the robot
– An experimenter places objects in front of the robot and describes them
– Lexicon acquisition: the robot gathers visual information about the environment while listening to speech, discovering high-MI pairs
– Speech generation: the robot searches for objects in the environment, then describes them
– Speech understanding: the robot maps words to objects

Embodied Machines
Learning from infant-caregiver interaction
– Infants played with 7 classes of objects: balls, shoes, keys, toy cars, trucks, dogs, and horses
– Caregiver/infant interaction was natural
– CELL attempted to build a lexicon from observing these interactions, evaluated on:
  – Segmentation accuracy: do segment boundaries correspond to word boundaries?
  – Word discovery: does each segment correspond to a single word?
  – Semantic accuracy: when a word is segmented properly, is it mapped to the right object?

Embodied Machines
Results
– Segmentation accuracy: 28% (vs. 7% for an acoustic-only model)
– Word discovery: 72% of segmented items were single words (vs. 31% for an acoustic-only model)
– Semantic accuracy: 57% of hypothesized lexical candidates were both valid words and linked to semantically relevant visual categories
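Two of these three measures can be sketched on invented toy output (boundary-level segmentation accuracy needs acoustic timing information and is omitted). Every list and mapping below is illustrative, not CELL's actual data.

```python
# Hypothesized lexicon: (discovered speech segment, linked visual category).
hypotheses = [
    ("ball", "BALL"),     # correctly segmented, correctly linked
    ("thecat", "CAT"),    # under-segmented: two words fused
    ("shoe", "DOG"),      # a real word, but linked to the wrong category
    ("dog", "DOG"),       # correctly segmented, correctly linked
]
gold_words = {"ball", "cat", "dog", "shoe", "car"}
gold_meaning = {"ball": "BALL", "cat": "CAT", "dog": "DOG", "shoe": "SHOE", "car": "CAR"}

# Word discovery: fraction of hypothesized segments that are single real words.
single_word = [(s, v) for s, v in hypotheses if s in gold_words]
word_discovery = len(single_word) / len(hypotheses)

# Semantic accuracy: among properly segmented words, fraction mapped correctly.
semantic = sum(gold_meaning[s] == v for s, v in single_word) / len(single_word)

print(f"word discovery: {word_discovery:.0%}   semantic accuracy: {semantic:.0%}")
```

Note that semantic accuracy is conditioned on correct segmentation, as on the slide: the fused segment "thecat" never gets a chance to count for or against the meaning mapping.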