ESR 2 / ER 2 Testing Campaign Review
A. Crivellaro, Y. Verdie

Presentation transcript:

Pose Parametrization

Pose: the 6 dofs of the rigid transform between a fixed world reference system and the camera reference system:
- 3x3 rotation matrix (3 dofs)
- translation (3 dofs)

Accuracy: we considered the following error measures:
- L2 norm of the rotation array [ ]
- L2 norm of the translation array [m]
- 3D distance between the predicted position of the box and its real position [m]
- 2D reprojection error [pixels]
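The four error measures above can be sketched in numpy as follows. This is our illustration, not the campaign's evaluation code: the function name is ours, and since the slide leaves the rotation parametrization open, we assume "rotation array" means the axis-angle vector of the relative rotation (whose L2 norm is the rotation angle).

```python
import numpy as np

def error_measures(R_est, t_est, R_gt, t_gt, box_pt, pts3d, K):
    """Per-frame error measures for a 6-dof pose (R, t) against ground truth.
    box_pt: a 3D reference point on the box; pts3d: Nx3 model points;
    K: 3x3 camera intrinsics. Assumption: rotation error = angle of the
    relative rotation R_gt^T R_est (L2 norm of its axis-angle vector)."""
    R_rel = R_gt.T @ R_est
    rot_err = np.arccos(np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0))
    trans_err = np.linalg.norm(t_est - t_gt)  # [m]
    # 3D distance between predicted and real box position [m]
    box_err = np.linalg.norm((R_est @ box_pt + t_est) - (R_gt @ box_pt + t_gt))

    def proj(R, t):
        cam = pts3d @ R.T + t
        uv = cam[:, :2] / cam[:, 2:3]
        return uv @ K[:2, :2].T + K[:2, 2]

    # mean 2D reprojection error over the model points [pixels]
    reproj_err = np.linalg.norm(proj(R_est, t_est) - proj(R_gt, t_gt), axis=1).mean()
    return rot_err, trans_err, box_err, reproj_err
```

For identical poses all four measures are zero; a pure rotation of the camera by theta radians gives a rotation error of exactly theta.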

Evaluation Metric: AUC

[Plot: proportion of frames (y-axis) whose error falls below each error threshold (x-axis).] The Area Under Curve score is normalized to lie in [0, 1].
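A minimal sketch of this metric, assuming per-frame scalar errors and a uniform threshold grid (the function name and grid resolution are our choices, not from the slides):

```python
import numpy as np

def auc_score(errors, max_threshold, steps=200):
    """Normalized AUC of the 'proportion of frames with error below
    threshold' curve; 1.0 means every frame has zero error, 0.0 means
    every frame exceeds max_threshold."""
    errors = np.asarray(errors, dtype=float)
    thresholds = np.linspace(0.0, max_threshold, steps)
    fractions = np.array([(errors <= t).mean() for t in thresholds])
    # on a uniform threshold grid the normalized AUC is the mean fraction
    return float(fractions.mean())
```

A perfect tracker (all errors zero) scores 1.0, and a tracker whose errors all exceed the maximum threshold scores 0.0, matching the normalization on the slide.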

Testing pipeline

The pose GT has to be known, so accuracy tests are run offline:
1. Take a video
2. Extract GT (marker-based tracking, or manual labeling of frames)
3. Run tracking for all frames and evaluate quantitative metrics

Our Algorithm

IDEA: instead of using all the image information, which can be ambiguous and misleading, we focus on exploiting a minimal amount of reliable information: object parts.
1. Detect object parts
2. Compute the pose of each part
3. Compute the object pose

Our Algorithm

- Detect object parts: a real-time, robust 2D detector (TILDE, CNN)
- Compute a "pose" for each part: learn a regressor to predict the projection of a stencil of 3D control points onto the detected patch, yielding 3D-2D correspondences
- Compute the object pose: solve an iterative PnP (Gauss-Newton) minimizing reprojection error and distance from the prior
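The PnP step can be sketched as a Gauss-Newton refinement over the 6 pose dofs (axis-angle rotation plus translation) given the 3D-2D correspondences. This is a minimal numpy illustration under a pinhole-camera assumption; the function names are ours, the Jacobian is numerical for brevity, and the "distance from prior" term from the slide is omitted:

```python
import numpy as np

def rodrigues(rvec):
    """Axis-angle vector -> 3x3 rotation matrix."""
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rvec / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def project(pose, pts3d, K):
    """pose = [rvec (3), t (3)]; returns Nx2 pixel coordinates."""
    R = rodrigues(pose[:3])
    cam = pts3d @ R.T + pose[3:]
    uv = cam[:, :2] / cam[:, 2:3]
    return uv @ K[:2, :2].T + K[:2, 2]

def gauss_newton_pnp(pose0, pts3d, pts2d, K, iters=50):
    """Refine a 6-dof pose by minimizing the 2D reprojection error."""
    pose = np.asarray(pose0, dtype=float).copy()
    for _ in range(iters):
        r = (project(pose, pts3d, K) - pts2d).ravel()
        # numerical Jacobian of the residual w.r.t. the 6 pose parameters
        J = np.zeros((r.size, 6))
        eps = 1e-6
        for j in range(6):
            d = np.zeros(6)
            d[j] = eps
            J[:, j] = ((project(pose + d, pts3d, K) - pts2d).ravel() - r) / eps
        # damped normal equations for numerical stability
        delta = np.linalg.solve(J.T @ J + 1e-9 * np.eye(6), -J.T @ r)
        pose += delta
        if np.linalg.norm(delta) < 1e-10:
            break
    return pose
```

In production one would use an analytic Jacobian (or a library routine such as OpenCV's solvePnP) and add the prior term to the residual; the iteration structure stays the same.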

Our Algorithm

[Diagram: detect object parts -> compute pose for each part -> compute object pose.] In practice, 10 pose priors produce 10 pose hypotheses; each hypothesis is scored (also against the WRM segments), and the final pose is the best-scoring hypothesis.
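The hypothesis-selection step above can be sketched as follows. This is our illustration, not the campaign code: we score only the 2D reprojection agreement (the real scoring also uses WRM segments), and the inlier radius is an assumed parameter.

```python
import numpy as np

def score_hypothesis(pts2d_pred, pts2d_obs, inlier_px=5.0):
    """Score a pose hypothesis by the fraction of detected parts whose
    predicted reprojection lands within inlier_px of the observation."""
    d = np.linalg.norm(pts2d_pred - pts2d_obs, axis=1)
    return float((d <= inlier_px).mean())

def pick_best(hypotheses, pts2d_obs):
    """hypotheses: list of (pose, predicted Nx2 projections) pairs.
    Returns the pose whose predictions best match the observations."""
    scores = [score_hypothesis(p2d, pts2d_obs) for _, p2d in hypotheses]
    return hypotheses[int(np.argmax(scores))][0]
```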

Quantitative Results

1. Quantitative results on offline videos were much better than online results, pointing to a lack of generalization in the learning.
2. We empirically validated some optimization choices (RANSAC detection selection, pose scoring, etc.).
3. WRM showed performance comparable to the much slower segment detector (LSD).

Since then…

We introduced:
- better detections: a CNN detector
- better virtual points: from 3 to 7 points, and a non-linear regressor
- better hypothesis scoring
Result: much better accuracy, at an increased computational cost (for now).

New Qualitative Results

Computational Time [ms]

[Table: rows OLD and NEW; columns Detections, VP, Pose estimation, Segment detection, Total. The numeric values were not preserved in the transcript.]

Learning from Experience

Our current code has to support 144 configurations (2 x 3 x 3 x 2 x 2 x 2 = 144):
- Server vs. PTU
- WRM vs. LSD vs. Canny
- Stand-alone vs. TCP/IP vs. Ubitrack interface
- Marker vs. markerless
- 2 logging protocols (Ubitrack vs. JSON)
- Closed vs. open box

We need to reduce the number of configurations:
- Use WRM only
- Only one of the 'server' and 'PTU' architectures (CERN server?)
- Only one interface supported (Ubitrack? Shared lib? TCP/IP?)
- Only one logging protocol
- Both marker and markerless will be supported

Learning from Experience

- Divide et impera: it was very useful to have pieces of code simulating all the software interacting with ours.
- Don't trust machine learning: it is crucial to test in 'real life' situations ASAP.
- HW can hide surprises: image artifacts and instabilities in the testing HW, not completely under our control (as far as we know).

"Experience is simply the name we give our mistakes." (O. Wilde)

Learning from Experience: BEFORE the final Testing Campaign

- Define the testing HW and testing scenario ASAP (now?), to collect learning data and test robustness in 'real life' situations
- Run quantitative tests on the chosen platform
- Reduce computational time (<= 200 ms)
- Get rid of HW constraints (GPGPU)
- Better exploit (more sophisticated) WRM information

Final Testing Campaign Plan

- Extensive (offline) quantitative tests on the improved visual pose estimation
- Online tests including sensor fusion and rendering; visual quality evaluation
- Integration and support of additional features (e.g. logging)
- Integration and profiling with the final version of WRM

Thank you