Walter J. Scheirer, Samuel E. Anthony, Ken Nakayama & David D. Cox

Perceptual Annotation: Measuring Human Vision to Improve Computer Vision
Walter J. Scheirer, Samuel E. Anthony, Ken Nakayama & David D. Cox
IEEE Transactions on Pattern Analysis and Machine Intelligence (2014), 36(8), 1679-1686
Presented by: Talia Retter

Human performance vs. computer vision
Introduction: “For many classes of problems, the goal of computer vision is to solve visual challenges for which human observers have effortless expertise…”
(Even further, problems are defined by human perception: the goal of computer vision is not to uncover “ground truths” of an image, but to analyze it in a way that corresponds to human vision, i.e., is functionally useful for us.)

“A case study in face detection”
Face detection: detecting whether an image contains a face or not
Performance: measured in accuracy and speed
from Fig. 7

Human performance > computer vision
Especially in challenging views/environments (“in the wild”)

The inspiration of human vision
Past: computers learn by simple coding = “face” or “no face”
Present: enrich computer learning (support vector machines) with “perceptual annotation” (guidance from human “learnability” of faces)

Visual psychophysics for perceptual annotation
Steps 1 & 2) Two experiments
“Face in the branches”:
  Only 10-30% of face visible
  3-alternative forced choice: “which of 3 images presented together contains a face?” (450 or 900 ms)
  102 trials per subject (~1,000 or 2,000 face images)
  > 3,000 subjects in ~7 weeks with TestMyBrain website
“Fast face finder”:
  Images from AFLW dataset
  50 ms: face or non-face?
  204 trials per subject (1/3 faces) (~4,000 different face images)
  > 400 subjects in ~2 weeks
Measure: accuracy and response time
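As a rough illustration of how the two measures (accuracy and response time) could be turned into a per-image annotation, here is a minimal sketch. The trial record layout `(image_id, correct, rt_ms)` and the function name `summarize_trials` are assumptions made for this example; the slides do not describe the actual data format.

```python
from collections import defaultdict


def summarize_trials(trials):
    """Aggregate per-image accuracy and mean response time.

    `trials` is an iterable of (image_id, correct, rt_ms) tuples.
    This record layout is hypothetical, chosen only for illustration.
    """
    outcomes = defaultdict(list)  # image_id -> list of 0.0 / 1.0
    times = defaultdict(list)     # image_id -> list of response times (ms)
    for image_id, correct, rt_ms in trials:
        outcomes[image_id].append(1.0 if correct else 0.0)
        times[image_id].append(rt_ms)
    return {
        image_id: {
            "accuracy": sum(outcomes[image_id]) / len(outcomes[image_id]),
            "mean_rt_ms": sum(times[image_id]) / len(times[image_id]),
        }
        for image_id in outcomes
    }


# Toy data: two responses to one image, one response to another.
trials = [
    ("img_a", True, 410.0),
    ("img_a", False, 530.0),
    ("img_b", True, 300.0),
]
summary = summarize_trials(trials)
```

Each image ends up with one accuracy value and one mean response time, which is the kind of per-image signal a later training step could consume.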

Perceptual annotation for SVMs
Step 3) Train an SVM classifier to detect faces
(Non-convex) human-weighted loss function that defines the cost of misclassification (for perceptually annotated images)
Leads to fewer support vectors than a hinge-defined loss function
[Figure: number of support vectors, hinge vs. human-weighted loss]
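To make the contrast with the hinge loss concrete, the sketch below compares the standard hinge loss with a simplified human-weighted variant. The exact form used in the paper is not given on the slide, so the variant here (hinge loss plus a per-image cost whenever the margin is violated) is an assumption, as are the function names. The jump at a margin of 1 is what makes this stand-in non-convex.

```python
def hinge_loss(margin):
    """Standard hinge loss, where margin = label * decision_value."""
    return max(0.0, 1.0 - margin)


def human_weighted_loss(margin, cost):
    """Assumed stand-in for a human-weighted loss, not the paper's formula.

    `cost` is a per-image penalty taken from perceptual annotation
    (for example, derived from human accuracy on that image; the
    mapping itself is not specified here). The penalty applies only
    when the margin is violated, giving a discontinuity at margin = 1.
    """
    if margin >= 1.0:
        return 0.0
    return (1.0 - margin) + cost


# Correctly classified with a wide margin: both losses are zero.
wide_margin = (hinge_loss(2.0), human_weighted_loss(2.0, cost=0.5))

# Margin violated: the annotated image is penalized more heavily.
violated = (hinge_loss(0.5), human_weighted_loss(0.5, cost=0.5))
```

With a cost of zero the variant reduces to the ordinary hinge loss, so images without a perceptual annotation would be treated as usual under this assumption.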

Augment the face detector with the annotation
Steps 4 & 5) Improve classifier
Stage 1: Filter using Haar features instead of a sliding window and varying spatial scale
Stage 2: Filter with perceptually annotated SVM
→ Detection predictions
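The two-stage idea can be sketched as a generic filter cascade: a cheap first-stage score prunes candidates, and the more expensive second-stage classifier runs only on the survivors. Everything named below (`cascade_detect`, the score functions, the thresholds, the toy candidate records) is hypothetical; real Haar features and a trained SVM would replace the placeholder scoring functions.

```python
def cascade_detect(candidates, stage1_score, stage2_score,
                   stage1_threshold=0.0, stage2_threshold=0.0):
    """Run a two-stage filter over candidate regions.

    Stage 1 stands in for the Haar-feature filter and stage 2 for the
    perceptually annotated SVM; both are passed in as plain functions.
    """
    # Stage 1: keep only candidates that pass the cheap filter.
    survivors = [c for c in candidates if stage1_score(c) >= stage1_threshold]
    # Stage 2: apply the costlier classifier to the survivors only.
    return [c for c in survivors if stage2_score(c) >= stage2_threshold]


# Toy candidates with precomputed placeholder scores.
candidates = [
    {"id": 1, "haar": 0.9, "svm": 0.7},
    {"id": 2, "haar": -0.4, "svm": 0.8},  # rejected by stage 1
    {"id": 3, "haar": 0.2, "svm": -0.3},  # rejected by stage 2
]
detections = cascade_detect(
    candidates,
    stage1_score=lambda c: c["haar"],
    stage2_score=lambda c: c["svm"],
)
detected_ids = [c["id"] for c in detections]
```

Candidate 2 never reaches the second stage even though its second-stage score is high, which is the trade-off of any cascade: the first stage must be tuned so that it rarely discards true faces.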

Results (1/3)
Human-weighted loss function performs better than the hinge loss function at face detection (across stimulus sets, feature definitions, and behavioral accuracy and RT)
*new dataset: FDDB faces

Results (2/3)
Perceptually annotated classifier with biologically-defined features outperforms all others
*new dataset: FDDB faces

Results (3/3)
Perceptually annotated classifier with biologically-defined features outperforms all others
*new dataset: FDDB faces

Conclusions
Human perceptual annotation is informative for machine learning (SVM classification)
Could be applied with neurophysiological human data
Could also be applied with other classifier techniques (e.g., neural networks)
Interplay between computer science and human perception: but might there be instances in which computers could perform better by working unlike humans? (e.g., incorporating infrared imaging, Pavlidis & Symosek, 2000; Bebis et al., 2006)