Inference in generative models of images and video. John Winn, MSR Cambridge, May 2004.


Overview: generative vs. conditional models; combined approach; inference in the flexible sprite model; extending the model.

Generative vs. conditional models. We have an image I and latent variables H which we wish to infer, e.g. object position, orientation, class. There will also be other sources of variability, e.g. illumination, parameterised by θ. Generative model: P(H, θ, I). Conditional model: P(H, θ | I) or P(H | I).
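As a brief aside, the two kinds of model are linked by the ordinary rules of probability: a conditional distribution over H can be recovered from the generative joint by marginalising out θ and normalising. The sketch below assumes, purely for illustration, that θ is continuous and H is discrete; swap sums and integrals as appropriate.

```latex
P(H \mid I) = \frac{\int P(H, \theta, I)\, d\theta}{P(I)},
\qquad
P(I) = \sum_{H} \int P(H, \theta, I)\, d\theta
```

A conditional model parameterises the left-hand side directly and never needs the joint.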

Conditional models use features. Features are functions of I which aim to be informative about H but invariant to θ. Examples: edge features, corner features, blob features.

Conditional models. Using features f(I), train a conditional model, e.g. using labelled data. Example: Viola & Jones face detection using rectangle features and AdaBoost.
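To make "rectangle features" concrete, here is a minimal sketch of a two-rectangle feature computed with an integral image, the trick Viola & Jones use to evaluate such features in constant time. The function names and the toy image are illustrative assumptions, not taken from the slides, and the AdaBoost training step is not shown.

```python
def integral_image(img):
    """Return an (h+1) x (w+1) table where ii[r][c] is the sum of img[0:r][0:c]."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Left rectangle minus the equally sized rectangle to its right."""
    return (rect_sum(ii, top, left, height, width)
            - rect_sum(ii, top, left + width, height, width))


# Toy image (assumed): bright left half, dark right half.
image = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(image)
feature = two_rect_feature(ii, 0, 0, 3, 2)
```

A large positive or negative value signals a vertical edge under the window; a boosted classifier would combine many such features.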

Conditional models. Advantages: simple, since only the variables of interest are modelled; inference is fast, due to the use of features and a simple model. Disadvantages: non-robust; difficult to compare different models; difficult to combine different models.

Generative models. A generative model defines a process of generating the image pixels I from the latent variables H and θ, giving a joint distribution over all variables: P(H, θ, I). Learning and inference are carried out using standard machine learning techniques, e.g. Expectation Maximisation, MCMC, variational methods. No features!

Generative models Example: image modeled as layers of ‘flexible’ sprites.

Generative models. Advantages: accurate, as the entire image is modelled; can compare different models; can combine different models; can generate new images. Disadvantages: inference is difficult due to local minima; inference is slower due to the complex model; limitations on model complexity.

Combined approach. Use a generative model, but speed up inference using proposal distributions given by a conditional model. A proposal R(X) suggests a new distribution over some of the latent variables X ⊆ {H, θ}. Inference is extended to allow accepting or rejecting the proposal, e.g. depending on whether it improves the model evidence.
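The accept-or-reject step can be sketched as follows. This is a deliberately simplified illustration: the proposals here are point values rather than distributions, and `toy_evidence` is a hypothetical stand-in for the model evidence (or a bound on it), not anything defined in the talk.

```python
def accept_if_better(current, proposed, evidence):
    """Keep the proposed setting only if it raises the (approximate) evidence."""
    return proposed if evidence(proposed) > evidence(current) else current


def toy_evidence(position):
    """Hypothetical evidence score that peaks at the assumed true position 7."""
    return -(position - 7) ** 2


position = 0
# Hypothetical proposals, as might come from a fast conditional model.
for proposal in [3, 12, 6, 7, 5]:
    position = accept_if_better(position, proposal, toy_evidence)
```

Because a proposal is only kept when it improves the score, a poor conditional model costs time but cannot make the generative fit worse.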

Using proposals in an MCMC framework (from Tu et al., 2003). Generative model: textured regions combined with face and text models. Conditional model: face and text detector using AdaBoost (Viola & Jones). Figures: proposals for text and faces; accepted proposals; reconstructed image.

Proposals in the flexible sprite model

Flexible sprite model (graphical model, built up over several slides):
x: set of images, e.g. frames from a video.
π, f: sprite shape and appearance.
T: sprite transform for this image (discretised).
m: transformed mask instance for this image.
b: background.
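One way to read the graphical model is as a recipe for sampling an image. The sketch below assumes the composition rule commonly used for layered sprite models (each pixel shows the transformed sprite where the mask is 1 and the background elsewhere, plus Gaussian noise); the slides do not spell this rule out. It also uses 1-D "images" and pure translation to stay short, and all function names are illustrative.

```python
import random


def translate(values, shift, fill=0.0):
    """Shift a 1-D list right by `shift` pixels, padding with `fill`."""
    n = len(values)
    return [values[i - shift] if 0 <= i - shift < n else fill for i in range(n)]


def generate_image(pi, f, b, shift, noise_std=0.0, rng=random):
    """Sample a mask m and an image x given shape pi, appearance f, background b."""
    pi_t = translate(pi, shift)   # transformed shape (mask prior)
    f_t = translate(f, shift)     # transformed appearance
    # Mask instance: each pixel is foreground with probability pi_t.
    m = [1 if rng.random() < p else 0 for p in pi_t]
    # Assumed composition rule: sprite where m = 1, background where m = 0.
    x = [mi * fi + (1 - mi) * bi + rng.gauss(0.0, noise_std)
         for mi, fi, bi in zip(m, f_t, b)]
    return m, x


pi = [1.0, 1.0, 0.0, 0.0, 0.0, 0.0]   # sprite occupies the first two pixels
f = [0.9, 0.8, 0.0, 0.0, 0.0, 0.0]    # sprite appearance
b = [0.1] * 6                         # background appearance
m, x = generate_image(pi, f, b, shift=3)
```

Inference runs this process in reverse: given many images x, recover π, f and b, together with a transform T and mask m for each image.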

Inference method & problems. Apply variational inference with a factorised Q distribution. Problems: slow, since we have to search the entire discrete transform space; limited size of transform space, e.g. translations only (160 × 120); many local minima.

Proposals in the flexible sprite model. We wish to create a proposal R(T). We cannot use features of the image directly until the object appearance has been found, so we use features of the inferred mask instead. [Figure: graphical model fragment with nodes π, m, T and the proposal.]

Moment-based features. Use the first and second moments of the inferred mask as features. Learn a proposal distribution R(T). Can also use R to get a probabilistic bound on T. Figure labels: true location; C-of-G (centre of gravity) of mask; contour of proposal distribution over object location.
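The moment features themselves are simple to compute. The sketch below returns the centre of gravity (first moments) and the covariance (central second moments) of a soft mask. Using central rather than raw second moments is an assumption, and how R(T) is then learned from these features is not described in the slides, so it is not shown.

```python
def mask_moments(mask):
    """Return the centre of gravity and 2x2 covariance of a soft mask.

    mask[r][c] is the inferred probability that pixel (r, c) is foreground.
    """
    cells = [(r, c, v) for r, row in enumerate(mask) for c, v in enumerate(row)]
    total = sum(v for _, _, v in cells)
    if total == 0:
        raise ValueError("mask is empty")
    mean_r = sum(r * v for r, _, v in cells) / total
    mean_c = sum(c * v for _, c, v in cells) / total
    var_r = sum((r - mean_r) ** 2 * v for r, _, v in cells) / total
    var_c = sum((c - mean_c) ** 2 * v for _, c, v in cells) / total
    cov_rc = sum((r - mean_r) * (c - mean_c) * v for r, c, v in cells) / total
    return (mean_r, mean_c), ((var_r, cov_rc), (cov_rc, var_c))


# Toy mask (assumed): a 2 x 2 foreground block.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
centre, covariance = mask_moments(mask)
```

The centre of gravity gives a cheap estimate of object location, and the covariance indicates the object's extent and orientation, which is what makes these moments plausible inputs for a proposal over T.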

Iterations #1 to #7 (figures).

Results on scissors video. On average, ~1% of transform space searched. Always converges, independent of initialisation. Figures: original; reconstruction; foreground only.

Beyond translation

Extended transform space. Figures: original and reconstruction (two slides); normalised video; learned sprite appearance.

Corner features. Figures: learned sprite appearance; masked normalised image.

Corner feature proposals

Preliminary results

Future directions

Extensions to the generative model. Very wide range of possible extensions: local appearance model, e.g. patch-based; multiple layered objects; object classes; illumination modelling; incorporation of object-specific models, e.g. faces; articulated models.

Further investigation of using proposals. Investigate other bottom-up features, including: optical flow; color/texture; use of standard invariant features, e.g. SIFT; discriminative models for particular object classes, e.g. faces, text.

[Figure: flexible sprite graphical model, labelled π, m, f, b, T, x, N.]