ISR – Institute of Systems and Robotics, University of Coimbra, Portugal
WP5: Behavior Learning and Recognition – Structure for Multi-modal Fusion (Part I)

Relationship of WP3, WP4 and WP5
- WP3 (sensor modeling and multi-sensor fusion techniques), Task 3.3: Bayesian network structures for multi-modal fusion
- WP4 (localization and tracking techniques as applied to humans), Task 4.3: online adaptation and learning
- WP5: behavior learning and recognition
- Tracker results, detected events, IDs in re-identification situations
- Fusion levels (pixel, feature or decision level); Bayesian structures for implementing the WP2 scenarios

Proposal: Multi-layer Multi-modal Homography-based Occupancy Grid
Using data from stationary sensors (the structure):
- Image data
- Range data
- Sound-source data

Inertial Compensated Homography [Luiz2007]
Projecting a world point onto a reference plane in two steps:
1. Project the real-world point onto the virtual (gravity-aligned) image plane.
2. Project from the virtual image plane onto a common reference plane.
(Figure: real camera, virtual camera, gravity vector, X-Y-Z axes.)

Inertial Compensated Homography [Luiz2007]
The infinite homography between the real and the virtual camera is H∞ = K R K⁻¹, where K is the camera calibration matrix and R is the rotation between the virtual and real camera (given by the IMU); a second homography then maps between the two planes.
(Figure: real camera, virtual camera, gravity vector.)
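As a minimal NumPy sketch of this relation (assuming a shared calibration matrix K and an IMU-supplied rotation R; the numeric values below are made up for illustration, not calibration results):

```python
import numpy as np

def infinite_homography(K, R):
    """Infinite homography between two cameras sharing the intrinsics K
    and differing by a pure rotation R: H = K R K^-1."""
    return K @ R @ np.linalg.inv(K)

# Hypothetical intrinsics and a 10-degree roll measured by the IMU.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
theta = np.deg2rad(10.0)
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])

H = infinite_homography(K, R)
x = np.array([400.0, 300.0, 1.0])   # homogeneous pixel in the real image
x_virt = H @ x
x_virt /= x_virt[2]                 # dehomogenize: pixel in the virtual image
```

With R equal to the identity the homography collapses to the identity, which is a quick sanity check for the construction.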

Image Registration [Luiz2007]
(Figure: images registered with respect to the gravity-aligned X-Y-Z reference frame.)

Extending the Virtual Plane to More Planes [Khan2007]
The plane-to-image homography is built from the vanishing points of the X, Y and Z directions and the normalized vanishing line of the reference plane. Planes parallel to the reference plane are reached through the vanishing point of the reference direction and a scale value encapsulating the scale factor and the height z.

Relationship Between Different Planes in the Structure [Khan2007]
The homography between views i and j, induced by a plane π parallel to π_ref, follows from the homography of the reference plane, the vanishing point of the reference direction, and the scale value γ: the homography induced by π differs from that of the reference plane only in its third column, which is shifted by γ times the vanishing point.
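A sketch of this parallel-plane shift, under the assumption stated above that only the third column changes (the names H_ref, v_ref and gamma mirror the slide's reference-plane homography, vanishing point and scale value):

```python
import numpy as np

def parallel_plane_homography(H_ref, v_ref, gamma):
    """Homography induced by a plane parallel to the reference plane:
    the third column of the reference homography is shifted by
    gamma times the vanishing point of the reference direction."""
    H = np.array(H_ref, dtype=float)
    H[:, 2] = H[:, 2] + gamma * np.asarray(v_ref, dtype=float)
    return H

# Illustration values only: identity reference homography,
# a made-up vanishing point, and two heights.
v = np.array([0.1, 0.2, 1.0])
H_ground = parallel_plane_homography(np.eye(3), v, 0.0)  # reference plane itself
H_upper  = parallel_plane_homography(np.eye(3), v, 2.0)  # a parallel plane above it
```

Setting gamma to zero recovers the reference-plane homography, so the whole stack of layers is parameterized by a single scalar per plane.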

Image & Laser Geometric Registration [Luiz2007, Hadi2009]
(Figure: camera images and laser data registered in the gravity-aligned X-Y-Z frame.)

Registering LRF Data in a Multi-Camera Scenario [Hadi2009]
Setup: a set of cameras and a laser range finder (LRF), providing image and range data. Points observed by the LRF are projected onto each image plane using the camera projection matrix and the transformation matrix between camera and LRF obtained by calibration. Result: reprojection of the LRF data on the image (blue points).
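The projection step can be sketched as follows (a hypothetical helper; the intrinsics K, the identity extrinsics T and the sample point are illustration values, not the calibration results of [Hadi2009]):

```python
import numpy as np

def project_lrf_points(K, T_cam_lrf, points_lrf):
    """Project Nx3 LRF points into pixel coordinates.
    K: 3x3 camera intrinsics; T_cam_lrf: 4x4 LRF-to-camera transform
    obtained by extrinsic calibration. Assumes points in front of the camera."""
    pts = np.asarray(points_lrf, dtype=float)
    pts_h = np.hstack([pts, np.ones((pts.shape[0], 1))])   # Nx4 homogeneous
    pts_cam = (T_cam_lrf @ pts_h.T)[:3]                    # 3xN in the camera frame
    img = K @ pts_cam                                      # 3xN homogeneous pixels
    return (img[:2] / img[2]).T                            # Nx2 pixel coordinates

K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
T = np.eye(4)  # identity extrinsics, for illustration only
pixels = project_lrf_points(K, T, np.array([[0.0, 0.0, 2.0]]))
```

A point on the optical axis lands on the principal point, independent of its range, which makes the identity-extrinsics case easy to verify.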

Image & Laser & Sound Geometric Registration [JFC2008, 2009]
(Figure: image, range and sound data registered in the gravity-aligned frame; sound sources contribute planes of arrival PA(θ).)

Bayesian Binaural System for 3D Localisation
For sources within a 2-meter range, binaural cues alone (interaural time differences, ITDs, which are quasi frequency-independent, and interaural level differences, ILDs ΔL(f_c^k)) can be used to fully localise the source in 3D space, i.e. a volume confined in azimuth θ, elevation and distance. If the source is more than 2 meters away, it can only be localised to a volume in azimuth: the cone of confusion.

Bayesian Binaural System for 3D Localisation (subset of [JFC2008, 2009])

Bayesian Binaural System for 3D Localisation
Direct Auditory Sensor Model (DASM): Bayesian learning through HRTF calibration using ITDs and ILDs, over azimuth, elevation, distance and a binary variable denoting "cell C occupied by a sound source".
Inverse Auditory Sensor Model (IASM): obtained from the DASM through Bayes' rule, yielding an auditory saliency map.
Solution: cluster local saliency maxima, i.e. cells with maximum probability of occupancy, one per sound source. The front-to-back confusion effect is avoided by considering only frontal-hemisphere estimates.

Bayesian Binaural System for Localisation in Azimuth: Planes of Arrival
Direct Auditory Sensor Model (DASM): Bayesian learning through HRTF calibration of interaural time differences (ITDs).
Inverse Auditory Sensor Model (IASM): obtained through Bayes' rule over azimuth θ ∈ [-90°, 90°], yielding an auditory saliency map.
Solution: cluster local saliency maxima into planes of arrival PA(θ), one per sound source. The front-to-back confusion effect is avoided by considering only frontal-hemisphere estimates.
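A toy version of this azimuth pipeline: a spherical-head (Woodworth-style) ITD model stands in for the HRTF-calibrated DASM, and Bayes' rule with a uniform prior over a 1-degree frontal grid plays the role of the IASM. All parameters here are assumptions for illustration, not the project's calibrated values:

```python
import numpy as np

SPEED_OF_SOUND = 343.0   # m/s
HEAD_RADIUS = 0.0875     # m; assumed spherical-head parameter

def itd_model(azimuth_rad):
    """Woodworth-style spherical-head ITD model (a stand-in for the
    HRTF-calibrated direct model on the slide)."""
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (azimuth_rad + np.sin(azimuth_rad))

def azimuth_posterior(itd_measured, sigma=2e-5):
    """Bayes' rule over a 1-degree frontal azimuth grid:
    P(theta | ITD) is proportional to P(ITD | theta) under a uniform prior."""
    thetas = np.deg2rad(np.linspace(-90.0, 90.0, 181))
    likelihood = np.exp(-0.5 * ((itd_measured - itd_model(thetas)) / sigma) ** 2)
    posterior = likelihood / likelihood.sum()
    return np.rad2deg(thetas), posterior

# Simulate a source at 30 degrees azimuth and recover it from its ITD.
thetas_deg, post = azimuth_posterior(itd_model(np.deg2rad(30.0)))
estimate = thetas_deg[np.argmax(post)]
```

The posterior maximum is the analogue of the saliency maximum clustered into a plane of arrival PA(θ); restricting the grid to [-90°, 90°] mirrors the frontal-hemisphere restriction that avoids front-to-back confusion.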

Demos of the Bayesian Binaural System (arrival direction of a sound source): two talking persons; a walking person.

University of Coimbra A Sample View Of.

Image, Range and Sound Occupancy Grid
(Figure: per-modality grids from image, range and sound data fused into a common occupancy grid.)
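One common way to realise this fusion step is a cell-wise log-odds combination of the per-modality grids under a conditional-independence assumption. This is a sketch of that standard rule, not necessarily the fusion used in the project; the three small grids are made-up values:

```python
import numpy as np

def fuse_occupancy_grids(grids, prior=0.5, eps=1e-6):
    """Fuse per-modality occupancy probabilities cell-wise in log-odds
    space, assuming the sensors are conditionally independent given
    the cell state."""
    g = np.clip(np.asarray(grids, dtype=float), eps, 1.0 - eps)
    logodds = np.log(g / (1.0 - g)).sum(axis=0)
    # Subtract the prior once per extra sensor so it is not over-counted.
    logodds -= (g.shape[0] - 1) * np.log(prior / (1.0 - prior))
    return 1.0 / (1.0 + np.exp(-logodds))

# Hypothetical 2x2 grids from the camera, LRF and binaural sensors.
camera = np.array([[0.9, 0.1], [0.5, 0.5]])
lrf    = np.array([[0.8, 0.2], [0.5, 0.5]])
sound  = np.array([[0.7, 0.5], [0.5, 0.5]])
fused = fuse_occupancy_grids([camera, lrf, sound])
```

Agreeing sensors sharpen the fused estimate toward 0 or 1, while cells where every modality is uninformative (0.5) stay at the prior.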

Bibliography

- Franco, J. & Boyer, E. Fusion of Multi-View Silhouette Cues Using a Space Occupancy Grid. Proceedings of the Tenth IEEE International Conference on Computer Vision (ICCV 2005), 2005.
- Braillon, C., Usher, K., C. P. J. L. C. & Laugier, C. Fusion of Stereo and Optical Flow Data Using Occupancy Grids. IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
- [Khan2007] Khan, S. M., P. Y. & Shah, M. A Homographic Framework for the Fusion of Multi-view Silhouettes. IEEE 11th International Conference on Computer Vision (ICCV 2007), 2007.
- Eshel, R. & Y. M. Homography Based Multiple Camera Detection and Tracking of People in a Dense Crowd. IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008.
- Arsic, D., Hristov, E. & N. L. Applying Multi-Layer Homography for Multi-Camera Person Tracking. Second ACM/IEEE International Conference on Distributed Smart Cameras (ICDSC 2008), 2008.
- Fleuret, F., Berclaz, J., R. L. & Fua, P. Multi-Camera People Tracking with a Probabilistic Occupancy Map. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008.

- Park, S. & M. M. T. Understanding Human Interactions with Track and Body Synergies (TBS) Captured from Multiple Views. Computer Vision and Image Understanding, 2008.
- Jin, Y., Tao, L., H. D., R. N. & Xu, G. Background Modeling from a Free-Moving Camera by Multi-Layer Homography Algorithm. IEEE International Conference on Image Processing (ICIP), 2008.
- [Luiz2007] Mirisola, L. G. B., Dias, J. & A. T. d. A. Trajectory Recovery and 3D Mapping from Rotation-Compensated Imagery for an Airship. Proceedings of the 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2007), San Diego, CA, USA, Oct 29 – Nov 2, 2007.
- Mirisola, L. G. B. & Dias, J. Tracking from a Moving Camera with Attitude Estimates. ICR08, 2008.
- Batista, J. P. Tracking Pedestrians Under Occlusion Using Multiple Cameras. Image Analysis and Recognition, LNCS vol. 3212/2004, Springer Berlin-Heidelberg, 2004.
- Chen, C., Tay, C., K. M. & Laugier, C. (INRIA). Dynamic Environment Modeling with Gridmap: A Multiple-Object Tracking Application. 9th International Conference on Control, Automation, Robotics and Vision (ICARCV '06), 2006.

- [JFC2008] Ferreira, J. F., Bessière, P., Mekhnacha, K., Lobo, J., Dias, J. & Laugier, C. Bayesian Models for Multimodal Perception of 3D Structure and Motion. International Conference on Cognitive Systems (CogSys 2008), University of Karlsruhe, Karlsruhe, Germany, April 2008.
- Pinho, C., Ferreira, J. F., Bessière, P. & Dias, J. A Bayesian Binaural System for 3D Sound-Source Localisation. International Conference on Cognitive Systems (CogSys 2008), University of Karlsruhe, Karlsruhe, Germany, April 2008.
- Ferreira, J. F., Pinho, C. & Dias, J. Implementation and Calibration of a Bayesian Binaural System for 3D Localisation. 2008 IEEE International Conference on Robotics and Biomimetics (ROBIO 2008), Bangkok, Thailand.
- [Hadi2009] Aliakbarpour, H., Nunez, P., Prado, J., Khoshhal, K. & Dias, J. An Efficient Algorithm for Extrinsic Calibration between a 3D Laser Range Finder and a Stereo Camera for Surveillance. ICAR 2009.

University of Coimbra – Institute of Systems and Robotics