Article Review by Todd Hricik: "Building High-Level Features Using Large-Scale Unsupervised Learning" (Le et al., 2012)


Learning High-Level Features

- Previous studies in computer vision have used labeled data to learn "higher-level" features
- This requires a large training set containing the features you wish to recognize, which is difficult to obtain in many cases
- The focus of the work reviewed here is to build high-level, class-specific feature detectors from unlabeled images

Learning Features from Unlabeled Data

- Prior approaches: RBMs (Hinton et al., 2006), autoencoders (Hinton & Salakhutdinov, 2006; Bengio et al., 2007), sparse coding (Lee et al., 2007), and K-means (Coates et al., 2011)
- To date, most have only succeeded in learning low-level features such as "lines" or "blobs"
- The authors consider the possibility of capturing more complex features by training deep autoencoders on unlabeled data

Deep Autoencoders

- Made up of symmetrical encoding (blue in the figure) and decoding (red in the figure) deep belief networks

Training Set

- 200x200-pixel frames randomly sampled from 10 million YouTube videos
- An OpenCV face detector was run on 60x60 patches randomly sampled from the training set
- About 3% of the 100,000 sampled patches contained a face according to the OpenCV detector

Deep Autoencoder Architecture

- 1 billion trainable parameters; still tiny: the human visual cortex is roughly 10^6 times larger
- Local receptive fields (LRF): each feature connects to a small region of the layer below
- Local L2 pooling: square root of the sum of squares of the inputs
- Local contrast normalization (LCN)
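The pooling and normalization sublayers above can be sketched in a few lines of NumPy. This is an illustrative simplification, not the paper's implementation: the pooling window size is arbitrary here, and the LCN step below normalizes globally rather than over local neighborhoods as the paper does.

```python
import numpy as np

def local_l2_pool(features, pool_size=3):
    """Local L2 pooling: each output is the square root of the sum of
    squares of the inputs in a small spatial neighborhood."""
    h, w = features.shape
    out = np.empty((h - pool_size + 1, w - pool_size + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            window = features[i:i + pool_size, j:j + pool_size]
            out[i, j] = np.sqrt(np.sum(window ** 2))
    return out

def contrast_normalize(features, eps=1e-8):
    """Simplified contrast normalization: subtract the mean and divide by
    the standard deviation of the whole map (the paper does this over
    local regions; the global version here is for illustration only)."""
    centered = features - features.mean()
    return centered / (features.std() + eps)

feats = np.random.RandomState(0).randn(8, 8)
pooled = local_l2_pool(feats, pool_size=3)
normalized = contrast_normalize(pooled)
print(pooled.shape)  # (6, 6)
```

Note that L2 pooling, unlike max pooling, gives a smooth response over its window, which is part of what lets the learned features tolerate small translations.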

Learning and Optimization

- The pooling parameters H are fixed to uniform weights; only the encoding weights W1 and decoding weights W2 of the first sublayers are learned
- Lambda: trade-off parameter between sparsity and reconstruction
- m, k: the number of examples and the number of pooling units in a layer, respectively
- The objective function of the model is the sum of the individual objectives of the three layers
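The per-layer objective the bullets above describe can be written out. As a hedged reconstruction from Le et al. (2012), using the slide's symbols (W1, W2, H, lambda, m, k, with epsilon a small constant for numerical stability):

```latex
\min_{W_1, W_2} \; \sum_{i=1}^{m}
\left(
  \underbrace{\bigl\lVert W_2 W_1^{\top} x^{(i)} - x^{(i)} \bigr\rVert_2^2}_{\text{reconstruction}}
  \;+\;
  \lambda \underbrace{\sum_{j=1}^{k} \sqrt{\epsilon + H_j \bigl( W_1^{\top} x^{(i)} \bigr)^2}}_{\text{pooled sparsity}}
\right)
```

The first term drives faithful reconstruction of each input x^(i); the second is the sparsity penalty summed over the k pooling units, with H_j the fixed (uniform) pooling weights of unit j, and lambda trading the two off.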

Validation of the Higher-Level Features Learned

- Control experiments were used to analyze the invariance properties of the face detector
- The test set consists of 37,000 images (containing 13,026 faces) sampled from the Labeled Faces in the Wild dataset (Huang et al., 2007) and the ImageNet dataset (Deng et al., 2009)
- After training, the test set was used to measure the performance of each neuron in classifying faces against distractors
- For each neuron, compute its maximum and minimum activation values, then pick 20 equally spaced thresholds in between
- The reported accuracy is the best classification accuracy among the 20 thresholds

Validation Results

- The best neuron obtained 81.7% accuracy in detecting faces (serendipity?)
- Random guessing achieves 64.8% accuracy
- The best neuron in a one-layer network achieved 71% accuracy
- Results were reported both on a sub-sample of the test set with a positive/negative ratio of 1 and on the entire test set
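The 64.8% random-guess baseline is consistent with the majority-class rate on the full test set: with 13,026 faces among 37,000 images, always predicting "no face" is correct 23,974 / 37,000 of the time. A quick check:

```python
# Majority-class ("always guess distractor") baseline on the full test set.
total_images = 37_000
face_images = 13_026
baseline = (total_images - face_images) / total_images
print(f"{baseline:.1%}")  # 64.8%
```

So the best neuron's 81.7% is a sizeable margin over chance, not an artifact of class imbalance.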

Validation Results Analysis I

- Removing the LCN layer reduced the accuracy of the best-performing neuron to 78.5%
- The face detector is robust to translation, scaling, and out-of-plane rotation (Figs. 4, 5)
- Removing all images that contain faces from the training set and repeating the experiment yields 72.5% accuracy

Can Other Well-Performing Neurons Recognize Other High-Level Features?

- Two datasets were constructed with positive/negative ratios similar to the face ratio in the training data
- Human bodies vs. distractors (Keller et al., 2009)
- Cat faces vs. distractors (Zhang et al., 2008)


Summary of Results and Comparisons to State-of-the-Art Methods

Thank you. Questions?