gesture features for coreference

Presentation transcript:

gesture features for coreference
Jacob Eisenstein, Randall Davis
MIT CSAIL

coreference resolution
when do two noun phrases refer to the same thing?
"This circle is rotating clockwise and this piece of wood is attached at this point and this point but it can rotate. So as the circle rotates, this moves in and out. So this whole thing is just going back and forth."

coreference resolution
The same?
  “This Wheel”
  “This Bar”
  “This”

coreference resolution
The same?
  “This Wheel” (demonstrative NP; singular / neutral gender)
  “This Bar” (demonstrative NP; singular / neutral gender)
  “This” (pronoun; singular / neutral gender)
Diagram labels: Traditional Coreference Resolution, Multimodal Coreference Resolution

coreference
annotated cheaply and reliably
a building block for NLP applications
  summarization
  segmentation
  information retrieval

coreference and catchments
recurring gesture features match semantic patterns
when gesture features disambiguate coreference → catchment
studying coreference gives a quantitative analysis of catchments

dataset
new corpus of spontaneous multimodal communication
nine speaker-listener pairs
explanations of mechanical device behavior
manipulation: which modalities are available
  speech + {diagram | sketch | gesture only}
  for this study, it’s speech + diagram only
  more deixis, easier to interpret
total of 16 documents, 2-3 minutes in length

tracking hand position
motion, color, and edge cues are used to guide an articulated upper-body model
13 DOF, 2.5D

particle filtering
online search of model configurations
sampled representation to maintain multiple hypotheses
at each time step:
  update weights based on new observation
  resample particles (with replacement)
  “drift” to capture system dynamics
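The three per-time-step operations listed above can be sketched on a toy problem. This is a minimal illustration, not the tracker from the talk: it assumes a one-dimensional state instead of the 13-DOF upper-body model, a Gaussian observation likelihood, and Gaussian drift. The function name `particle_filter_step` and the parameters `obs_noise` and `drift_noise` are hypothetical.

```python
import math
import random


def particle_filter_step(particles, weights, observation,
                         obs_noise=1.0, drift_noise=0.5, rng=random):
    """One time step of a toy 1-D particle filter (illustrative assumptions only)."""
    # 1. Update weights based on the new observation.
    #    The Gaussian likelihood is an assumption made for this sketch.
    updated = [
        w * math.exp(-0.5 * ((p - observation) / obs_noise) ** 2)
        for p, w in zip(particles, weights)
    ]
    total = sum(updated)
    if total == 0.0:
        # Every likelihood underflowed; fall back to uniform weights.
        updated = [1.0 / len(particles)] * len(particles)
    else:
        updated = [w / total for w in updated]

    # 2. Resample particles with replacement, in proportion to their weights.
    resampled = rng.choices(particles, weights=updated, k=len(particles))

    # 3. "Drift": perturb each particle to stand in for the system dynamics.
    drifted = [p + rng.gauss(0.0, drift_noise) for p in resampled]

    # After resampling, all particles carry equal weight again.
    uniform = [1.0 / len(drifted)] * len(drifted)
    return drifted, uniform
```

Calling the function repeatedly with a stream of observations keeps a cloud of hypotheses near the observed value; passing a seeded `random.Random` instance as `rng` makes a run reproducible.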

extracted data
position, velocity, acceleration
  hands, arms, body and head
  occlusion model directly
manually annotated
  speech transcripts
    force-aligned for time synchronization
  coreference annotations
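Velocity and acceleration can be derived from tracked positions by finite differences. The sketch below assumes 2-D pixel positions sampled at a fixed frame rate; the talk does not say how these quantities were computed, so the forward-difference scheme, the `fps` default, and the function names are all assumptions.

```python
def finite_difference(samples, dt):
    """Forward differences of a sequence of (x, y) pairs, divided by dt."""
    return [
        ((x1 - x0) / dt, (y1 - y0) / dt)
        for (x0, y0), (x1, y1) in zip(samples, samples[1:])
    ]


def kinematics(positions, fps=30.0):
    """Return (velocity, acceleration) lists for a track of (x, y) positions.

    fps is a hypothetical frame rate; velocity has one fewer entry than
    positions, and acceleration has one fewer entry than velocity.
    """
    dt = 1.0 / fps
    velocity = finite_difference(positions, dt)
    acceleration = finite_difference(velocity, dt)
    return velocity, acceleration
```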

gesture features
features on pairs of gestures
  to predict coreference
features on individual gestures
  to predict whether an NP introduces a new entity
  to predict whether gesture is relevant to coreference

features on pairs of gestures
distance between gestures
is the same hand gesturing?
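The two pairwise features can be sketched as follows. The `Gesture` record, its field names, and the rule that a pair counts as "no gesture" when either noun phrase lacks one are assumptions for illustration; the three hand categories mirror those reported in the results slide.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Gesture:
    """Hypothetical record of the gesture accompanying one noun phrase."""
    hand: Optional[str]                      # "left", "right", or None if no gesture
    position: Optional[Tuple[float, float]]  # hand position in image pixels


def pairwise_features(a: Gesture, b: Gesture) -> dict:
    """Distance between two gestures and whether the same hand made them."""
    if a.hand is None or b.hand is None:
        # Assumption: the pair is "no gesture" if either NP has none.
        return {"distance": None, "hand_relation": "no gesture"}
    distance = math.hypot(a.position[0] - b.position[0],
                          a.position[1] - b.position[1])
    relation = "same hand" if a.hand == b.hand else "different hands"
    return {"distance": distance, "hand_relation": relation}
```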

features on individual gestures
speed
jitter
purpose = speed / jitter
bimanual synchronization
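The slide gives only the ratio purpose = speed / jitter, not how speed and jitter are measured. The sketch below assumes speed is the mean frame-to-frame displacement and jitter is the mean change in displacement between consecutive frames; both definitions, the `eps` guard against division by zero, and the function name are hypothetical.

```python
import math


def gesture_features(positions, eps=1e-9):
    """Speed, jitter, and purpose for one hand track of (x, y) positions.

    The speed and jitter definitions are assumptions made for this sketch;
    only the ratio purpose = speed / jitter comes from the slide.
    """
    if len(positions) < 3:
        raise ValueError("need at least three positions")
    # Frame-to-frame displacement vectors.
    steps = [(x1 - x0, y1 - y0)
             for (x0, y0), (x1, y1) in zip(positions, positions[1:])]
    speed = sum(math.hypot(dx, dy) for dx, dy in steps) / len(steps)
    # Mean magnitude of the change between consecutive displacements.
    changes = [math.hypot(dx1 - dx0, dy1 - dy0)
               for (dx0, dy0), (dx1, dy1) in zip(steps, steps[1:])]
    jitter = sum(changes) / len(changes)
    purpose = speed / (jitter + eps)  # eps avoids division by zero
    return {"speed": speed, "jitter": jitter, "purpose": purpose}
```

Under these definitions a smooth, straight movement scores a high purpose value, while a zigzag of similar speed scores a low one.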

results: pairwise features
distance between gestures (pixels)
  coreferent: mean distance = 48.4
  non-coreferent: mean distance = 74.8
which hand is used? (%)
                 same hand   different hands   no gesture
  corefer        59.9        19.9              20.2
  non-corefer    52.8        22.2              25.1

results: single-gesture features
does the NP have “parents”?
  not predicted by these features
does the NP have “children”?
  predicted by speed, purpose

results: meta-features
correlate single-gesture features with discriminability of pairwise distance
  speed, purpose (r = -.17)
  x distance from body center (r = .22)
  regression of single-gesture features (r = .42)
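The r values above are correlation coefficients. The sketch below computes a Pearson correlation between a single-gesture feature and a per-gesture discriminability score; that the reported values are Pearson coefficients is an assumption, and the function name is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```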

when do catchments happen?
what types of NP coreference are disambiguated by gesture?
we assumed pronouns, “this.” not so.
definite NPs are not predicted well by gesture

when do catchments happen?
there’s a lot of research on gesture-speech synchronization
  typically measures time at beginning of motion
this is a different way to measure gesture-speech synchronization quite precisely

where do catchments happen?

future work
move beyond deictic data, features
  we have data without diagrams, which includes more representational gestures
  recognize or annotate hand shape
  pairwise features that compare gesture trajectories

done? almost

does gesture actually improve coreference resolution?
initial evaluation described in NAACL 2006
the answer is yes, but not by as much as you’d hope
  54.9% with gestures, 52.8% without
  coreference resolution in spoken dialogues is hard
better feature combination techniques may improve performance, as with prosody
need to figure out how to use the meta-features

All Done! Thank You