The Recognition of Human Movement Using Temporal Templates Liat Koren

Lecture subjects
– Introduction
– Prior work
– The Temporal Templates
– Usage example

Introduction
Computer vision trends:
– Less emphasis on image or camera motion
– More emphasis on labeling of action
Reasons:
– More computational power
– Wireless applications
– Interactive environments

Introduction – cont.
Recent efforts focus on three-dimensional object reconstruction:
– assuming it will have to be used in the recognition of human motion.
This article claims otherwise:
– View-based approach
– Direct recognition

Motivating Example
Static pictures:
– Hard to recognize.
Sequence of motion:
– A human can recognize the action without three-dimensional reconstruction.
Conclusion:
– It is possible to recognize movement using only the motion itself.

Process (3D model-based recognition)
– Recover the pose of the person at each time instant using a 3D model.
– The model's projected image should be as close as possible to the object (e.g., the edges of the body in the image).
Drawbacks:
– Complicated process
– Human intervention is usually required
– Special imaging environment

2D-based recognition
Action is a sequence of static poses of the object.
Requires:
– Normalization
– Removal of background

Wilson and Bobick's approach
Actions are usually hand gestures.
Representation:
– Actual image
– Grayscale
– No background
Benefits:
– Hand appearance is fairly similar over a wide range of people.
Problems:
– Actions that involve the appearance of the whole body are not visually consistent across different people.

Yamato et al.'s approach
Representation:
– No background
– Black-and-white silhouettes
Matching:
– Vector quantization
– Use of a statistical method (hidden Markov models)
Benefits:
– Helps handle the variability between people.
Problems:
– Movement occurring inside the silhouette disappears.

Summary of prior work
– Action is treated as a sequence of static poses.
– Requires individual features or properties that can be extracted and tracked in each frame.
– Recognition of movement from a sequence of images is a complicated task.
– Usually requires prior detection and segmentation of the person.

Motion-based recognition
Attempt to characterize the motion itself, without reference to the underlying static poses of the body.
Possible approaches:
– Blob-like representation
– Tracking of predefined regions (e.g., legs, head, mouth) using motion: facial expression patches, whole-body patches
– Measuring typical patterns of muscle activation

Terms
Movement:
– where – motion has occurred in the image sequence → MEI (Motion-Energy Image)
– how – the motion is moving → MHI (Motion-History Image)
MEI + MHI = Temporal Templates

Temporal Templates
Representation of movement:
– View-specific
– Movement is motion over time
– A vector image that can be matched against stored representations of movements.
Assumptions:
– The background is static
– Camera movements can be removed
– Motion of irrelevant objects can be eliminated

Motion-Energy Images (MEI)
Where did the movement occur?
The MEI accumulates the binary motion images D(x, y, t) over the last τ frames:
E_τ(x, y, t) = D(x, y, t) ∪ D(x, y, t − 1) ∪ … ∪ D(x, y, t − τ + 1)
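
As a concrete illustration of the definition above, here is a minimal sketch (not the authors' implementation) that builds an MEI from a list of grayscale NumPy frames. Simple frame differencing stands in for the binary motion images D, and the threshold value of 25 is an arbitrary illustrative choice.

```python
import numpy as np

def binary_motion_images(frames, thresh=25):
    """D(x, y, t): 1 where consecutive grayscale frames differ noticeably."""
    d_images = []
    for prev, curr in zip(frames, frames[1:]):
        diff = np.abs(curr.astype(np.int16) - prev.astype(np.int16))
        d_images.append((diff > thresh).astype(np.uint8))
    return d_images

def motion_energy_image(d_images, tau):
    """E_tau(x, y, t): union (logical OR) of the last tau binary motion images."""
    mei = np.zeros_like(d_images[0])
    for d in d_images[-tau:]:
        mei |= d
    return mei
```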

Motion-Energy Images (MEI)
Notice that:
– If τ is very large, all the differences are accumulated.
– τ has a vast influence on the temporal representation of a movement.

Motion-Energy Images (MEI)
A smooth change in the viewing angle causes a smooth change in the viewed image; thus a coarse sampling of the viewing circle is enough (30°).

Motion-History Images (MHI)
The intensity of a pixel represents the temporal history of motion at that pixel.
More recent movement is brighter.

Motion-History Images (MHI)
A time window of size τ is used – movement older than τ is ignored.
The results in the article use a simple replacement and decay operator:
H_τ(x, y, t) = τ if D(x, y, t) = 1, and max(0, H_τ(x, y, t − 1) − 1) otherwise.
Notice that the MEI can be calculated from the MHI by painting white any non-black pixel.
One may wonder: why not use only the MHI? Answers will be given later…
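
A minimal sketch of the replacement-and-decay operator above (again, not the authors' code): update_mhi is applied once per frame with a binary motion image D from the MEI sketch, starting from an all-zero float array, and mei_from_mhi shows the "paint any non-black pixel white" relationship.

```python
import numpy as np

def update_mhi(mhi, d, tau):
    """H_tau(x, y, t) = tau where D == 1, else max(0, H_tau(x, y, t-1) - 1)."""
    decayed = np.maximum(mhi - 1.0, 0.0)
    return np.where(d == 1, float(tau), decayed)

def mei_from_mhi(mhi):
    """The MEI: paint white (1) every non-black (non-zero) MHI pixel."""
    return (mhi > 0).astype(np.uint8)

# Example usage: mhi = np.zeros(frame_shape, np.float32)
#                for d in d_images: mhi = update_mhi(mhi, d, tau)
```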

MEI and MHI in a nutshell
MEI and MHI are two vector images designed to encode a variety of motion properties.
A benefit of this representation is that the calculation is recursive, so only up-to-date information needs to be stored, making the computation both fast and space efficient.

Matching Temporal Templates
– Collect training examples of each movement from a variety of viewing angles.
– Compute a statistical representation of the MHI/MEI images (Hu moments; see the sketch below).
Given an input movement:
– Calculate its statistical representation.
– Use the Mahalanobis distance to find the stored movement that is nearest to the input.
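
A minimal sketch of the feature-extraction step, using OpenCV's Hu-moment functions. Concatenating the MEI and MHI moments into a single 14-dimensional vector is an illustrative choice of feature layout, not necessarily the paper's exact one.

```python
import cv2
import numpy as np

def hu_features(mei, mhi):
    """Describe an MEI/MHI pair by the 7 Hu moments of each image."""
    hu_mei = cv2.HuMoments(cv2.moments(mei.astype(np.float32))).flatten()
    hu_mhi = cv2.HuMoments(cv2.moments(mhi.astype(np.float32))).flatten()
    return np.concatenate([hu_mei, hu_mhi])
```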

Mahalanobis Distance Example
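
The example figure for this slide is not reproduced here; for reference, the quantity it illustrates is the Mahalanobis distance of a feature vector x from a class with mean mu and covariance Sigma, both estimated from the training examples of a movement. A minimal sketch:

```python
import numpy as np

def mahalanobis(x, mu, sigma):
    """sqrt((x - mu)^T Sigma^{-1} (x - mu))."""
    diff = np.asarray(x) - np.asarray(mu)
    return float(np.sqrt(diff @ np.linalg.inv(sigma) @ diff))
```

Unlike a plain Euclidean distance, this weights each feature dimension by how much it varies (and co-varies) across the training examples.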

Reasoning for the algorithm
The Mahalanobis distance provides:
– Good matching, as shown in the results of the article.
– A simple calculation, which makes real-time applications feasible.
Hu moments provide a representation of images that is invariant to scale and translation.
One problem with Hu moments: "Hu moments are difficult to reason about intuitively" (the authors).

Testing the system
18 exercises performed by an experienced aerobics instructor.
MEIs are on the bottom rows.

Why both MHI and MEI?
Because the MHI and the MEI capture two different characteristics of the movement (the "where" and the "how"), they look different, and thus both are essential.

First experiment
– Input: a camera 30° to the left of the subject.
– Match against all seven stored views (0°, 30°, 60°, 90°, 120°, 150°, 180°) of all 18 moves.
– 12 out of 18 moves are correctly recognized.

Analysis of the results of the 1st experiment
(Figure comparing an input move, a false match, and the correct match; panel labels: Move 13 at 30°, Move 6 at 0°.)

Combining multiple views
– Two cameras with orthogonal views.
– Minimize the sum of the Mahalanobis distances between the two input templates and two stored views of a movement that are 90° apart (see the sketch below).
– Hidden assumption: we know the angular relationship between the cameras.
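
A minimal sketch of this two-camera rule, reusing the mahalanobis helper from the earlier sketch. The data layout of stored (movement name mapped to per-view (mu, sigma) pairs ordered 0°, 30°, …, 180°) is an illustrative assumption.

```python
def match_two_views(x_cam1, x_cam2, stored, step_deg=30):
    """Pick the movement minimizing the summed distance over two views 90 degrees apart."""
    offset = 90 // step_deg            # number of stored views between orthogonal cameras
    best_dist, best_move = float("inf"), None
    for move, views in stored.items():
        for i in range(len(views) - offset):
            mu1, s1 = views[i]
            mu2, s2 = views[i + offset]
            d = mahalanobis(x_cam1, mu1, s1) + mahalanobis(x_cam2, mu2, s2)
            if d < best_dist:
                best_dist, best_move = d, move
    return best_move
```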

Second experiment
– Input with two cameras: 30° to the left of the subject and 60° to the right of the subject.
– Match against all seven stored views of all 18 moves.
– 15 out of 18 moves are correctly recognized.

Analysis of the results of the 2nd experiment
(Figure comparing an input move, a false match, and the correct match; panel labels: Move 16, Move 15.)

Segmentation and Recognition
Problem: the speed of performance differs among different people.
Solution: segmentation
– When training the system, calculate τ_max and τ_min for each movement.
– Use an algorithm that matches over a wide range of τ (a sketch follows below).
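
One simple way to sketch the variable-window search, assuming the linear-decay MHI above: the MHI for a smaller window τ' can be derived from the τ_max MHI by subtracting (τ_max − τ') and clamping at zero, so every window size between τ_min and τ_max can be tested cheaply. The classifier callable (returning a distance and a label from the Hu-moment features of the earlier sketch) is a hypothetical stand-in for the nearest-stored-movement lookup.

```python
import numpy as np

def best_window_match(mhi_max, tau_max, tau_min, classifier):
    """Try every window size tau in [tau_min, tau_max]; keep the best match."""
    best = (float("inf"), None, None)                  # (distance, tau, label)
    for tau in range(tau_max, tau_min - 1, -1):
        h = np.maximum(mhi_max - (tau_max - tau), 0)   # MHI for window tau
        mei = (h > 0).astype(np.uint8)                 # corresponding MEI
        dist, label = classifier(hu_features(mei, h))  # nearest stored movement
        if dist < best[0]:
            best = (dist, tau, label)
    return best
```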

Problems
Problems with the current system:
– One person partially occludes another. Solution: use several cameras.
– More than one person appears in the field of view. Solution: use a tracking bounding box.

More Problems
Motion of a part of the body is not specified during a movement.
– Possible solutions: automatically mask away regions of this type of motion, or always include them.
Camera motion:
– Rather easy to eliminate, since camera motion is limited.
The person performs the movement while locomoting.

The KidsRoom: An Application
– The room is aware of the children (at most 4).
– The room takes the children through a story; the room's reaction is influenced by the actions of the children.
– Current story: an adventurous tour to monster land.
– In the last scene the monsters teach the children to dance. Then the monsters follow the children if they perform movements they "know".
– The narration steers the children to room locations where occlusion is not a problem.
