Perceptual Annotation: Measuring Human Vision to Improve Computer Vision
Walter J. Scheirer, Samuel E. Anthony, Ken Nakayama & David D. Cox
IEEE Transactions on Pattern Analysis and Machine Intelligence (2014), 36(8), 1679-1686
Presented by: Talia Retter
Introduction: Human performance vs. computer vision
“For many classes of problems, the goal of computer vision is to solve visual challenges for which human observers have effortless expertise…”
Further, the problems themselves are defined by human perception: the goal of computer vision is not to uncover the “ground truths” of an image, but to analyze it in a way that corresponds to human vision, i.e., that is functionally useful for us.
A case study in face detection
Face detection: deciding whether or not an image contains a face
Performance: measured in accuracy and speed (Fig. 7)
Human performance > computer vision
Especially in challenging views/environments (“in the wild”)
The inspiration of human vision
Past: computers learned from simple binary labels (“face” or “no face”)
Present: enrich computer learning (support vector machines) with “perceptual annotation” (guidance from the human “learnability” of each face)
Visual psychophysics for perceptual annotation
Steps 1 & 2) Two experiments:
“Face in the branches”
Only 10-30% of the face visible
3-alternative forced choice: “which of 3 images presented together contains a face?” (450 or 900 ms presentation)
102 trials per subject (~1,000 or 2,000 face images)
>3,000 subjects in ~7 weeks via the TestMyBrain website
“Fast face finder”
Images from the AFLW dataset
50 ms presentation: face or non-face?
204 trials per subject, 1/3 containing faces (~4,000 different face images)
>400 subjects in ~2 weeks
Measures: accuracy and response time
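Turning these trials into per-image behavioral measures is a simple aggregation: for every image, pool the trials in which it appeared and compute accuracy and a typical response time. A minimal sketch follows; the function name and the (image_id, correct, rt_ms) trial layout are illustrative assumptions, not the paper's code.

```python
from collections import defaultdict
from statistics import median

def annotate(trials):
    """Aggregate forced-choice trials into per-image human performance.

    trials: iterable of (image_id, correct: bool, rt_ms) tuples.
    Returns {image_id: (accuracy, median RT over correct trials, or None)}.
    """
    hits = defaultdict(list)   # 1/0 outcomes per image
    rts = defaultdict(list)    # RTs on correct trials only
    for img, correct, rt in trials:
        hits[img].append(1 if correct else 0)
        if correct:
            rts[img].append(rt)
    return {img: (sum(v) / len(v), median(rts[img]) if rts[img] else None)
            for img, v in hits.items()}
```

These per-image accuracy/RT pairs are the raw material that the human-weighted loss function later consumes.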
Perceptual annotation for SVMs
Step 3) Train an SVM classifier to detect faces
A (non-convex) human-weighted loss function defines the cost of misclassifying each perceptually annotated image
Leads to fewer support vectors than a hinge-defined loss function
[Chart: number of support vectors, hinge vs. human-weighted loss]
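The core idea of step 3 can be illustrated with a plain subgradient-descent linear SVM whose hinge loss is scaled by a per-example human weight (derived, say, from accuracy or RT). This is a minimal convex stand-in for the paper's non-convex human-weighted loss, and every name below is an illustrative assumption, not the authors' implementation.

```python
def train_weighted_svm(X, y, human_w, lr=0.1, lam=0.01, epochs=200):
    """Linear SVM trained with a human-weighted hinge loss.

    X: list of feature vectors; y: labels in {-1, +1};
    human_w: per-example weights from perceptual annotation,
    scaling how costly it is to misclassify each example.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, t, wh in zip(X, y, human_w):
            margin = t * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1.0:  # inside margin: weighted hinge subgradient step
                w = [wi - lr * (lam * wi - wh * t * xi) for wi, xi in zip(w, x)]
                b += lr * wh * t
            else:             # outside margin: L2 regularization only
                w = [wi * (1.0 - lr * lam) for wi in w]
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1
```

Setting all weights to 1.0 recovers the standard hinge loss, which makes the comparison in the slide above easy to reproduce on toy data.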
Augment the face detector with the annotation
Steps 4 & 5) Improve the classifier with a two-stage detector:
Stage 1: filter candidate windows using Haar features (sliding window over varying spatial scales)
Stage 2: filter the survivors with the perceptually annotated SVM
Output: detection predictions
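The two-stage pipeline above is a cascade: a cheap first-stage filter rejects most candidate windows, and only the survivors are scored by the more expensive SVM. The sketch below captures that control flow; the scoring callables and thresholds are hypothetical stand-ins for the Haar-feature filter and the trained perceptually annotated SVM.

```python
def cascade_detect(windows, haar_score, svm_score, t1, t2):
    """Two-stage face detector sketch.

    windows: candidate image windows (any representation);
    haar_score / svm_score: callables returning a real-valued score
    (stand-ins for the Haar filter and the perceptually annotated SVM);
    t1 / t2: per-stage acceptance thresholds.
    """
    # Stage 1: cheap filter prunes the vast majority of windows
    survivors = [w for w in windows if haar_score(w) > t1]
    # Stage 2: the expensive SVM runs only on the survivors
    return [w for w in survivors if svm_score(w) > t2]
```

The cascade's speed comes from ordering: almost all windows never reach stage 2, so the SVM's cost is paid only on plausible candidates.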
Results (1/3) (new dataset: FDDB faces)
The human-weighted loss function outperforms the hinge loss at face detection (across stimulus sets, feature definitions, and both behavioral accuracy and RT weightings)
Results (2/3)
Perceptually annotated classifier with biologically-defined features outperforms all others
Results (3/3)
Perceptually annotated classifier with biologically-defined features outperforms all others
Conclusions
Human perceptual annotation is informative for machine learning (SVM classification)
Could be applied with neurophysiological human data
Could also be applied with other classifier techniques (e.g., neural networks)
Interplay between computer science and human perception: but might there be cases in which computers should perform unlike humans in order to perform better? (e.g., incorporating infrared imaging; Pavlidis & Symosek, 2000; Bebis et al., 2006)