Identifying Human-Object Interaction in Range and Video Data

Presentation transcript:

Identifying Human-Object Interaction in Range and Video Data
Ben Packer, Varun Ganapathi, Suchi Saria, and Daphne Koller

Aim: Understand and classify human actions while simultaneously tracking the objects of interaction.

Tasks: What action is being performed? Where is the manipulated object?
Actions considered: Pick Up, Put Down, Drop, Kick, Toss.

Kinect video data: depth image, video, and tracked pose.

Why is this easy?
- The depth sensor allows us to easily detect foreground/background.
- An existing pose tracker accurately finds the human.
- It is extremely efficient and runs in real-time, so a large amount of data can be easily collected.
- Knowing the action will help track the object.

Why is this hard?
- Even with background subtraction and pose estimation, the object may still be in many places.
- Generic object tracking can help locate the object, but it often fails.
- Action recognition involving human-object interaction is largely unsolved.

First stage:
- Capture initial depth with no foreground.
- Capture video/depth of the action involving the object.
- The pose tracker runs simultaneously in real-time.
- Every pixel is labeled as background (same depth as in the initial image), pose, or possible object.
- Train a visual object detector from the most "confident" candidate objects.
- Apply the (smoothed) detector to the full sequence.
Colored blobs indicate candidate objects, ranging from red (least likely) to yellow (most likely).

Full model of action and interaction:
- Use "spatio-temporal interaction primitives," e.g. "moving away from foot," "in hand."
- Model each action as an HMM over primitives.
- This allows for simple learning and inference.

First attempt (model variables):
- C, F: candidate object positions and appearance (observed)
- J: human joint positions (observed)
- A: action of the entire sequence; S: state
- O: object position; P: active primitive

Results of the full model: action classification, comparing the Base Model against the Full Model.
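The per-pixel labeling in the first stage (background where the depth matches the initial empty-scene capture, otherwise person or possible object) can be sketched roughly as follows. The array shapes, the depth tolerance, and the pose-mask input are all assumptions; the poster gives no implementation details.

```python
import numpy as np

def segment_pixels(init_depth, frame_depth, pose_mask, depth_tol=0.03):
    """Label each pixel as background, pose, or candidate object.

    Pixels whose depth matches the initial (empty-scene) capture are
    background; of the remainder, pixels covered by the tracked pose
    belong to the person, and whatever is left is a possible object.
    depth_tol is a hypothetical tolerance in meters.
    """
    background = np.abs(frame_depth - init_depth) < depth_tol
    pose = ~background & pose_mask
    candidate = ~background & ~pose_mask
    return background, pose, candidate
```

In practice the candidate mask would then be split into connected blobs, each scored by the object detector.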
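The poster only says the detector is "(smoothed)" before being applied to the full sequence; one simple reading is temporal smoothing of the per-frame detection scores, so that isolated spurious detections are suppressed. A minimal sketch under that assumption:

```python
import numpy as np

def smooth_scores(scores, window=5):
    """Moving-average smoothing of per-frame detector scores.

    scores: 1-D array, one detection confidence per frame.
    window: hypothetical smoothing width in frames.
    """
    kernel = np.ones(window) / window
    return np.convolve(scores, kernel, mode="same")
```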
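Modeling each action as an HMM over interaction primitives means classification reduces to scoring the observed sequence under each action's HMM with the forward algorithm and picking the best. The sketch below assumes discrete primitives with per-frame log-likelihoods of the observed object/joint features already computed; all names and shapes are illustrative, since the poster states only the HMM structure.

```python
import numpy as np

def sequence_log_likelihood(pi, trans, obs_loglik):
    """Forward algorithm: log P(observations | one action's HMM).

    pi:         (K,) initial distribution over primitives.
    trans:      (K, K) primitive transition matrix.
    obs_loglik: (T, K) per-frame log-likelihood of the observed
                features under each primitive.
    """
    log_alpha = np.log(pi) + obs_loglik[0]
    for t in range(1, len(obs_loglik)):
        # log-sum-exp over the previous primitive, for stability
        m = log_alpha.max()
        log_alpha = np.log(np.exp(log_alpha - m) @ trans) + m + obs_loglik[t]
    m = log_alpha.max()
    return m + np.log(np.exp(log_alpha - m).sum())

def classify_action(models, obs_loglik):
    """Pick the action whose HMM best explains the sequence.

    models: dict mapping action name -> (pi, trans).
    """
    scores = {a: sequence_log_likelihood(pi, T, obs_loglik)
              for a, (pi, T) in models.items()}
    return max(scores, key=scores.get)
```

The emission model (how primitive states generate object positions relative to joints) is where the "spatio-temporal" part would live; it is omitted here.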