Consistent Visual Information Processing Axel Pinz EMT – Institute of Electrical Measurement and Measurement Signal Processing TU Graz – Graz University of Technology
“Consistency” Active vision systems / 4D data streams Imprecision Ambiguity Contradiction Multiple visual information
This Talk: Consistency in Active vision systems: –Active fusion –Active object recognition Immersive 3D HCI: –Augmented reality –Tracking in VR/AR
AR as Testbed Consistent perception in 4D: Space –Registration –Tracking Time –Lag-free –Prediction
Agenda Active fusion Consistency Applications –Active object recognition –Tracking in VR/AR Conclusions
Active Fusion Simple top level decision-action-fusion loop:
Active Fusion (2) Fusion schemes –Probabilistic –Possibilistic (fuzzy) –Evidence theoretic (Dempster & Shafer)
Probabilistic Active Fusion N measurements (sensor inputs): m_i, i = 1, …, N. M hypotheses: o_j, O = {o_1, …, o_M}. Bayes formula: P(o_j | m_1, …, m_N) = P(m_1, …, m_N | o_j) P(o_j) / P(m_1, …, m_N). Use entropy H(O) to measure the quality of P(O)
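A minimal Python sketch of the fusion step on this slide: a Bayes update of P(O) followed by the Shannon entropy H(O) as a quality measure. It assumes the measurements are conditionally independent given the hypothesis, so each one can be folded in separately. The function names and the example numbers are illustrative and do not come from the talk.

```python
import math

def entropy(p):
    """Shannon entropy H(O) in bits of a discrete distribution p."""
    return -sum(x * math.log2(x) for x in p if x > 0.0)

def bayes_update(prior, likelihood):
    """Posterior P(o_j | m) from the prior P(o_j) and the likelihood P(m | o_j)."""
    joint = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(joint)  # P(m), the normalizing constant
    if evidence == 0.0:
        raise ValueError("measurement has zero probability under every hypothesis")
    return [x / evidence for x in joint]

# Hypothetical example: four hypotheses, flat prior, one measurement
# that favors the first hypothesis.
prior = [0.25, 0.25, 0.25, 0.25]
likelihood = [0.7, 0.1, 0.1, 0.1]
posterior = bayes_update(prior, likelihood)

print(entropy(prior), entropy(posterior))
```

The flat prior has the maximum entropy for four hypotheses (2 bits); the posterior is more pronounced, so its entropy is lower.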
Probabilistic Active Fusion (2) Flat distribution: P(o_j) = const. gives H = H_max. Pronounced distribution: P(o_c) = 1; P(o_j) = 0, j ≠ c gives H = 0. Measurements can be difficult or expensive, and N can be prohibitively large, … Find an iterative strategy to minimize H(O)
Probabilistic Active Fusion (3) Start with A ≥ 1 measurements: P(o_j | m_1, …, m_A), H_A. Iteratively take more measurements: m_{A+1}, …, m_B. Until: P(o_j | m_1, …, m_B), H_B ≤ threshold
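The iterative strategy can be sketched as a loop that keeps fusing measurements until the entropy drops to a threshold. This is a simplified illustration, not the planner from the talk: measurements arrive in a fixed order rather than being selected for expected information gain, they are assumed conditionally independent, and the names `active_fusion`, `initial`, and `threshold` are hypothetical.

```python
import math

def entropy(p):
    """Shannon entropy in bits of a discrete distribution p."""
    return -sum(x * math.log2(x) for x in p if x > 0.0)

def bayes_update(prior, likelihood):
    """Posterior from a prior and the likelihood P(m | o_j) of one measurement."""
    joint = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(joint)
    return [x / evidence for x in joint]

def active_fusion(prior, likelihoods, threshold, initial=1):
    """Fuse at least `initial` measurements, then stop once H <= threshold."""
    posterior = list(prior)
    used = 0
    for likelihood in likelihoods:
        if used >= initial and entropy(posterior) <= threshold:
            break  # distribution is pronounced enough; skip remaining measurements
        posterior = bayes_update(posterior, likelihood)
        used += 1
    return posterior, entropy(posterior), used

# Hypothetical example: three hypotheses, up to five measurements available,
# each favoring the first hypothesis.
prior = [1 / 3, 1 / 3, 1 / 3]
likelihoods = [[0.8, 0.1, 0.1]] * 5
posterior, h, used = active_fusion(prior, likelihoods, threshold=0.5)
print(used, round(h, 3))
```

In this example the loop stops early, so only part of the available measurement budget is spent.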
Summary: Active Fusion Multiple (visual) information, many sensors, measurements,… Selection of information sources Maximize information content / quality Optimize effort (number / cost of measurements, …) Information gain by entropy reduction
Summary: Active Fusion (2) Active systems (robots, mobile cameras) –Sensor planning –Control –Interaction with the scene “Passive” systems (video, wearable,…) –Filtering –Selection of sensors / measurements
Consistency Consistency vs. Ambiguity –Unimodal subsets O k Representations –Distance measures
Consistent Subsets Hypotheses O = {o_1, …, o_M}. Ambiguity: P(O) is multimodal. Consistent unimodal subsets O_k ⊂ O. Benefits: application domains, support of hypotheses, outlier rejection
Distance Measures Depend on representations, e.g.:
Pixel-level: SSD, correlation, rank
Eigenspace: Euclidean
3D models: Euclidean
Feature-based: Mahalanobis, …
Symbolic: Mutual information
Graphs: Subgraph isomorphism
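Three of the listed measures are simple enough to sketch in a few lines of Python. The Mahalanobis version below assumes a diagonal covariance matrix (independent feature dimensions) to stay dependency-free; a full covariance would need a matrix inverse. Function names are illustrative.

```python
import math

def ssd(a, b):
    """Sum of squared differences between two equally sized pixel vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def euclidean(a, b):
    """Euclidean distance, e.g. between eigenspace coordinates or 3D points."""
    return math.sqrt(ssd(a, b))

def mahalanobis_diag(x, mean, variances):
    """Mahalanobis distance under an assumed diagonal covariance."""
    return math.sqrt(sum((xi - mi) ** 2 / v
                         for xi, mi, v in zip(x, mean, variances)))

print(ssd([1, 2, 3], [1, 2, 5]))
print(euclidean([0, 0], [3, 4]))
print(mahalanobis_diag([2, 0], [0, 0], [4, 1]))
```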
Mutual Information Shannon's measure of mutual information: O = {o_1, …, o_M}, A ⊂ O, B ⊂ O, I(A,B) = H(A) + H(B) - H(A,B)
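As a worked illustration of I(A,B) = H(A) + H(B) - H(A,B), the Python sketch below computes mutual information from a joint probability table. Treating A and B as two discrete variables described by such a table is an assumption made for the example; the slide does not say how the joint distribution over the subsets is obtained.

```python
import math

def entropy(p):
    """Shannon entropy in bits of a discrete distribution p."""
    return -sum(x * math.log2(x) for x in p if x > 0.0)

def mutual_information(joint):
    """I(A,B) = H(A) + H(B) - H(A,B) for a joint table joint[a][b]."""
    p_a = [sum(row) for row in joint]         # marginal over rows
    p_b = [sum(col) for col in zip(*joint)]   # marginal over columns
    p_ab = [x for row in joint for x in row]  # flattened joint distribution
    return entropy(p_a) + entropy(p_b) - entropy(p_ab)

independent = [[0.25, 0.25], [0.25, 0.25]]
dependent = [[0.5, 0.0], [0.0, 0.5]]
print(mutual_information(independent), mutual_information(dependent))
```

Independent variables share no information (0 bits), while two perfectly coupled binary variables share 1 bit.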
Applications Active object recognition –Videos –Details Tracking in VR / AR –Landmark definition / acquisition –Real-time tracking
Active vision laboratory
Active Object Recognition
Active Object Recognition in Parametric Eigenspace Classifier for a single view Pose estimation per view Fusion formalism View planning formalism Estimation of object appearance at unexplored viewing positions
Applications Active object recognition –Videos –Details Control of active vision systems Tracking in VR / AR –Landmark definition / acquisition –Real-time tracking Selection, combination, evaluation Constraining of huge spaces
Landmark Definition / Acquisition What is a “landmark”? Corners, blobs, natural landmarks
Automatic Landmark Acquisition Capture a dataset of the scene: –calibrated stereo rig –trajectory (by magnetic tracking) –n stereo pairs Process this dataset –visually salient landmarks for tracking
Automatic Landmark Acquisition Visually salient landmarks for tracking: salient points in 2D image → 3D reconstruction → clusters in 3D: –compact, many points –consistent feature descriptions → cluster centers → landmarks
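The last stage of this pipeline (3D points to clusters to landmark positions) can be sketched in Python as follows. The talk does not name a clustering algorithm, so the greedy radius-based grouping here is an assumption chosen for brevity, as are the parameters `radius` and `min_points`. The sketch covers only the "compact, many points" criterion and ignores feature-description consistency.

```python
import math

def centroid(points):
    """Mean position of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def cluster_landmarks(points, radius, min_points):
    """Greedily group 3D points; return centers of sufficiently large clusters."""
    clusters = []
    for p in points:
        for cluster in clusters:
            if math.dist(p, centroid(cluster)) <= radius:
                cluster.append(p)  # point is close to an existing cluster
                break
        else:
            clusters.append([p])   # no cluster nearby: start a new one
    return [centroid(c) for c in clusters if len(c) >= min_points]

# Hypothetical reconstructed points: three near the origin, one isolated.
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (5.0, 5.0, 5.0)]
landmarks = cluster_landmarks(points, radius=0.5, min_points=3)
print(landmarks)
```

The isolated point forms a cluster of size one and is discarded, which matches the idea that only well-supported clusters become landmarks.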
Processing Scheme
Office Scene
Office Scene - Reconstruction
Unknown Scene Real-Time Tracking Landmark Acquisition
Real-Time Tracking Measure position and orientation of object(s) Obtain trajectories of object(s) Stationary observer – “outside-in” –Vision-based Moving observer, egomotion – “inside-out” –Hybrid Degrees of Freedom – DoF –3 DoF (mobile robot) –6 DoF (head and device tracking in AR)
Outside-in Tracking (1) Stereo rig, IR illumination, wireless devices. 1 marker/device: 3 DoF; 2 markers: 5 DoF; 3 markers: 6 DoF
Outside-in Tracking (2)
Consistent Tracking (1) Complexity –Many targets –Exhaustive search vs. Real-time Occlusion –Redundancy (targets | cameras) Ambiguity in 3D –Constraints
Consistent Tracking (2) Dynamic interpretation tree –Geometric / spatial consistency Local constraints –Multiple interpretations can happen –Global consistency is impossible Temporal consistency –Filtering, prediction
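Geometric consistency in an interpretation tree can be illustrated with a small Python sketch: observed 3D markers are assigned to model markers one at a time, and a branch is pruned as soon as a pairwise distance disagrees with the model. This is a static, textbook-style search under assumed names (`interpretations`, `tol`), not the dynamic interpretation tree used in the talk.

```python
import math

def interpretations(model, observed, tol):
    """All assignments observed[i] -> model index with consistent pairwise distances."""
    results = []

    def extend(assign):
        i = len(assign)
        if i == len(observed):
            results.append(tuple(assign))  # every observation is explained
            return
        for j in range(len(model)):
            if j in assign:
                continue  # each model marker is used at most once
            # Local constraint: distances to all already assigned markers must match.
            if all(abs(math.dist(observed[i], observed[k])
                       - math.dist(model[j], model[assign[k]])) <= tol
                   for k in range(i)):
                extend(assign + [j])

    extend([])
    return results

# Hypothetical rigid target with three markers, seen translated and reordered.
model = [(0, 0, 0), (1, 0, 0), (0, 2, 0)]
observed = [(5, 7, 5), (5, 5, 5), (6, 5, 5)]
print(interpretations(model, observed, tol=1e-6))
```

Because all inter-marker distances of this target differ, exactly one interpretation survives; a symmetric target would yield several, which is the ambiguity that temporal filtering then has to resolve.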
Consistent Tracking (3)
Hybrid Inside-Out Tracking (1) 3 accelerometers 3 gyroscopes signal processing interface Inertial Tracker
Hybrid Inside-Out Tracking (2) complementary sensors fusion
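One common way to fuse complementary sensors is a complementary filter, sketched below in Python for a single rotation angle. The slide does not state which fusion method the tracker uses, so this is a generic stand-in: the gyroscope rate is integrated for short-term response, and a drift-free but slower vision measurement pulls the estimate back. The blend factor `alpha` and all numbers are assumptions.

```python
def complementary_filter(angle, gyro_rate, vision_angle, dt, alpha=0.9):
    """Blend the gyro-integrated angle with an absolute vision measurement."""
    predicted = angle + gyro_rate * dt  # short-term prediction from the gyroscope
    return alpha * predicted + (1.0 - alpha) * vision_angle

# Hypothetical scenario: the true angle is 10 degrees and constant, the gyro
# reports a pure bias of 1 deg/s, and vision measures the true angle.
angle = 0.0
for _ in range(200):
    angle = complementary_filter(angle, gyro_rate=1.0, vision_angle=10.0, dt=0.01)

print(round(angle, 4))
```

Pure integration of the biased gyro would drift without bound; with the vision correction the estimate settles close to the true angle, with a small bounded offset caused by the bias.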
Summary: Consistency in Active vision systems: –Active fusion –Active object recognition Immersive 3D HCI: –Augmented reality –Tracking in VR/AR
Conclusion Consistent processing of visual information can significantly improve the performance of active and real-time vision systems
Acknowledgement Thomas Auer, Hermann Borotschnig, Markus Brandner, Harald Ganster, Peter Lang, Lucas Paletta, Manfred Prantl, Miguel Ribo, David Sinclair. Christian Doppler Gesellschaft, FFF, FWF, Kplus VRVis, EU TMR Virgo