On-the-fly Specific Person Retrieval. University of Oxford, 24th May 2012. Omkar M. Parkhi, Andrea Vedaldi and Andrew Zisserman.

Overview
- Textual queries for people: “Barack Obama”, “George Bush”, “Courteney Cox”
- Search a large collection of un-annotated videos
- Output: ranked shots
- On-the-fly, i.e. with no previous knowledge or model for these queries

Scrubs Data Set
- 12 episodes from Seasons 1-5 and 8
- 5 hours of video data
- About 400k frames, partitioned into 5k shots
- About 300k near-frontal face detections
- 768 x 576, MPEG2 format

Demo
Search for “Courteney Cox” in the Scrubs dataset. Steps:
1. Download example images from Google
2. Train a ranking function
3. Apply the ranking function to the video collection
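The three steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the slides say a "fast linear classifier" is used but not how it is trained, so the sketch uses a Pegasos-style hinge-loss SGD with no bias term, and it scores a face track by its best-scoring face. The function names `train_linear_ranker` and `rank_tracks`, the hyperparameters and the track-scoring rule are all hypothetical.

```python
import random
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(w: Vector, x: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_linear_ranker(
    positives: Sequence[Vector],
    negatives: Sequence[Vector],
    epochs: int = 50,
    lam: float = 0.01,
    seed: int = 0,
) -> List[float]:
    """Fit a linear scoring function w by SGD on the regularised hinge loss.

    Assumption: Pegasos-style updates stand in for whatever linear
    classifier the original system used.
    """
    rng = random.Random(seed)
    data = [(x, 1.0) for x in positives] + [(x, -1.0) for x in negatives]
    w = [0.0] * len(data[0][0])
    t = 0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            t += 1
            eta = 1.0 / (lam * t)          # decaying step size
            margin = y * dot(w, x)         # margin under the current w
            w = [(1.0 - eta * lam) * wi for wi in w]  # regulariser shrink
            if margin < 1.0:               # hinge loss is active
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def rank_tracks(w: Vector, tracks: Dict[str, Sequence[Vector]]) -> List[str]:
    """Order track ids by descending score.

    Assumption: a track's score is the maximum score of its faces.
    """
    return sorted(
        tracks,
        key=lambda tid: max(dot(w, x) for x in tracks[tid]),
        reverse=True,
    )
```

In this sketch the positives would be descriptors of faces found in the downloaded example images, the negatives would come from a fixed pool of unrelated faces, and `tracks` would map each face track in the video collection to the descriptors of its faces.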

Demo - Scrubs Data Set

On-the-fly person retrieval system (diagram)
- Off-line processing: Video Collection -> Face Tracks -> Facial Features & Descriptors; Negative Training Images -> Facial Features & Descriptors
- On-line processing: Text Query “Courteney Cox” -> Google Image Search “Courteney Cox” -> Facial Features & Descriptors -> Fast Linear Classifier -> Ranking -> Results

Detection and Tracking
- Viola-Jones face detection on each frame
- Tracking measures the “connectedness” of a pair of faces by the point tracks intersecting both
- Doesn’t require contiguous detections
- No drift
- Faces clustered into tracks
[Everingham et al. 2006; Apostoloff & Zisserman, 2007]
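The connectedness idea can be illustrated with a small sketch. The slide does not give the exact measure, so the ratio used here (point tracks passing through both face boxes, divided by point tracks passing through either) and the threshold of 0.5 are assumptions, as are the type aliases and function names. Faces are then grouped with a union-find structure, which is one simple way to turn pairwise links into tracks.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]      # x_min, y_min, x_max, y_max
Face = Tuple[int, Box]                       # frame index, bounding box
PointTrack = Dict[int, Tuple[float, float]]  # frame index -> (x, y)


def hits(track: PointTrack, face: Face) -> bool:
    """True if the point track lies inside the face box in the face's frame."""
    frame, (x0, y0, x1, y1) = face
    if frame not in track:
        return False
    x, y = track[frame]
    return x0 <= x <= x1 and y0 <= y <= y1


def connectedness(a: Face, b: Face, tracks: List[PointTrack]) -> float:
    """Fraction of point tracks touching either face that touch both."""
    both = sum(1 for t in tracks if hits(t, a) and hits(t, b))
    either = sum(1 for t in tracks if hits(t, a) or hits(t, b))
    return both / either if either else 0.0


def cluster_faces(
    faces: List[Face], tracks: List[PointTrack], threshold: float = 0.5
) -> List[List[int]]:
    """Group face indices whose pairwise connectedness exceeds the threshold."""
    parent = list(range(len(faces)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(faces)):
        for j in range(i + 1, len(faces)):
            if connectedness(faces[i], faces[j], tracks) > threshold:
                parent[find(j)] = find(i)

    groups: Dict[int, List[int]] = {}
    for i in range(len(faces)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Because the link between two faces depends only on the point tracks they share, two detections can be joined even when the detector missed the frames between them, which is consistent with the slide's remark that contiguous detections are not required.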

Scrubs Data Set
- 12 episodes from Seasons 1-5 and 8
- 5 hours of video data
- About 400k frames, partitioned into 5k shots
- 300k face detections
- 6k face tracks
- 768 x 576, MPEG2 format

Detecting facial feature points
- Pictorial structure model
- Joint model of feature appearance and position
[Felzenszwalb and Huttenlocher, 2004; Everingham et al.]

Face Appearance Representation
- Affine transformation of face to canonical frame
- Independent photometric normalization of parts
- Represent gradients over circle centred on facial feature points
- Feature descriptor is a 3849-dimensional vector
[Everingham et al.]
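As a rough illustration of the "independent photometric normalization of parts" step only, the sketch below normalizes each part to zero mean and unit variance before concatenating the parts into one descriptor. The slide does not say which normalization was used, so this choice is an assumption, and the affine warp and gradient computation are left out. The names `normalise_part` and `face_descriptor` are hypothetical.

```python
import math
from typing import List, Sequence


def normalise_part(values: Sequence[float], eps: float = 1e-8) -> List[float]:
    """Shift a part to zero mean and scale it to unit variance.

    Assumption: this is one common form of photometric normalization,
    not necessarily the one used in the original system.
    """
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(variance)
    return [(v - mean) / (std + eps) for v in values]


def face_descriptor(parts: Sequence[Sequence[float]]) -> List[float]:
    """Normalise each part on its own, then concatenate into one vector."""
    descriptor: List[float] = []
    for part in parts:
        descriptor.extend(normalise_part(part))
    return descriptor
```

Normalizing each part separately means that a brightness or contrast change affecting one facial region (for example a shadow over one eye) does not alter the values contributed by the other regions.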

Negative Training Images
Combination of faces from:
- Randomly downloaded images
- Labeled Faces in the Wild dataset
- Caltech Faces dataset
About 16k face detections.
Datasets: Caltech 10,000 Web Faces; Labeled Faces in the Wild.

On-the-fly Person Retrieval (diagram)
- Off-line processing: Video Collection -> Face Tracks -> Facial Features & Descriptors; Negative Training Images -> Facial Features & Descriptors
- On-line processing: Text Query “Courteney Cox” -> Google Image Search “Courteney Cox” -> Facial Features & Descriptors -> Fast Linear Classifier -> Ranking -> Results

TRECVid 2011 (IACC.1.B)
- About 200 hours of video data
- 8k videos, MPEG4, 320x240 pixels
- 130k shots
- About 3 million face detections
- 25,535 face tracks

Demo - TRECVid 2011 (IACC.1.B)

Facial attributes – FaceTracer project
Examples:
- gender: male, female
- age: baby, child, youth, middle age, senior
- race: white, black, asian
- smiling, mustache, eye-wear, hair colour
N. Kumar, P. N. Belhumeur and S. K. Nayar, FaceTracer: A Search Engine for Large Collections of Images with Faces, European Conference on Computer Vision (ECCV).
Method:
- person-independent training set with attribute
- facial feature representation
- discriminative training of classifier for attribute

Facial attributes – Glasses

Facial attributes – Beard

Facial attributes – Eyes Closed

Quantitative Performance - Scrubs Dataset
- Performance evaluation for 3 guest actors (Brendan Fraser, Courteney Cox and Michael J Fox)
- 12 dataset videos split into training and test sets (3 training, 9 testing)
- Annotations: manual labeling of training and test set for each actor; manual labeling of positive training images from Google
- Negative training images from Caltech Faces dataset

Quantitative Performance - Scrubs Dataset: Retrieval Average Precision (AP)
Table columns: Training Examples Source (Positive, Negative); Average Precision (Brendan Fraser, Courteney Cox, Michael J Fox).
Rows (training source): Scrubs; Google, Scrubs; Google, Caltech.
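For readers unfamiliar with the metric, the sketch below computes non-interpolated average precision for a single ranked list. The slides do not state which AP variant was used in the evaluation, so this particular definition is an assumption, and the function name and the optional `num_relevant` parameter are hypothetical.

```python
from typing import Optional, Sequence


def average_precision(
    ranked_relevance: Sequence[bool], num_relevant: Optional[int] = None
) -> float:
    """Mean of precision@k taken at each rank k that holds a relevant item.

    ranked_relevance[i] is True if the item at rank i + 1 is relevant.
    num_relevant may be given when some relevant items were not retrieved;
    by default it is the number of relevant items in the list.
    """
    found = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            found += 1
            precision_sum += found / rank
    total = found if num_relevant is None else num_relevant
    return precision_sum / total if total else 0.0
```

For a ranking where the first and third results are correct and the second is not, this gives (1/1 + 2/3) / 2, roughly 0.83.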

Quantitative Performance - Scrubs Dataset: Using more training data per track
Table columns: Training Examples Source (Positive, Negative); # samples per track; Average Precision (Brendan Fraser, Courteney Cox, Michael J Fox).
Rows (source, # samples per track): Scrubs, Single; Scrubs, Multiple.
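The "single" versus "multiple" samples-per-track comparison can be pictured with a small helper that picks training faces from a track. The slides do not say how the samples were chosen, so the evenly spaced selection below (and the middle face for a single sample) is purely an assumption, and `sample_track` is a hypothetical name.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sample_track(track: Sequence[T], k: int) -> List[T]:
    """Pick k faces from a track, spread evenly from first to last.

    Assumption: even spacing is used for illustration only; k = 1
    returns the middle face of the track.
    """
    if k <= 0 or not track:
        return []
    if k >= len(track):
        return list(track)
    if k == 1:
        return [track[len(track) // 2]]
    step = (len(track) - 1) / (k - 1)
    return [track[round(i * step)] for i in range(k)]
```

Taking several spread-out faces from one track tends to expose the classifier to more pose and expression variation than a single face would.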

Future Work
- Exploring sources for positive examples
- Better feature representations
- Combination of attributes and identities

Any Questions?