Neural networks in modern image processing
Petra Budíková, DISA seminar, 30. 9. 2014


Artificial neural networks
- Computational models inspired by an animal's central nervous system
- Systems of interconnected "neurons" which can compute values from inputs
- Capable of approximating non-linear functions of their inputs
- Mathematically, a network's function f(x) is defined as a composition of other functions g_i(x), which can in turn be defined as compositions of further functions
- Known since the 1950s
- Typical applications: pattern recognition in speech or images
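As a small worked example (the notation is assumed here, not taken from the slides), the function computed by a two-layer network is exactly such a composition: each neuron applies a non-linearity to a weighted sum of its inputs, and the network nests these maps:

```latex
% One neuron: non-linearity applied to a weighted sum of inputs
g_i(x) = \sigma(w_i \cdot x + b_i), \qquad \sigma(t) = \frac{1}{1 + e^{-t}}
% The network function is a composition of such functions
f(x) = g_2\bigl(g_1(x)\bigr) = \sigma\bigl(W_2\,\sigma(W_1 x + b_1) + b_2\bigr)
```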

Artificial neural networks (cont.)

Artificial neural networks (cont.)
- Network node in detail
- Network learning process = tuning the synaptic weights
  - Initialize randomly
  - Repeatedly compute the ANN result for a given task, compare it with the ground truth, and update the ANN weights with the backpropagation algorithm to improve performance (see the sketch below)
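A minimal sketch of this learning loop, for a single sigmoid neuron where backpropagation reduces to one application of the chain rule (data, labels and the learning rate are made up for the demo):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                             # 100 samples, 3 inputs
y = (X @ np.array([1.0, -2.0, 0.5]) > 0).astype(float)    # synthetic ground truth

w = rng.normal(size=3)                 # initialize weights randomly
b = 0.0
lr = 0.5

sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

for epoch in range(200):
    out = sigmoid(X @ w + b)           # forward pass: compute the ANN result
    err = out - y                      # compare with ground truth
    delta = err * out * (1 - out)      # chain rule through the sigmoid
    w -= lr * (X.T @ delta) / len(y)   # update weights to improve performance
    b -= lr * delta.mean()
```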

Artificial neural networks – example
- ALVINN system for automatic car driving (ANN illustration from [Mitchell97])

Neural networks before and after ~2009
- Before ~2009: ANNs typically with 2-3 layers
  - Reason 1: computation times
  - Reason 2: problems of the backpropagation algorithm
    - Local optimization only (needs a good initialization, or re-initialization)
    - Prone to over-fitting (too many parameters to estimate, too few labeled examples)
  - => Skepticism: a deep network often performed worse than a shallow one
- After ~2009: deep neural networks
  - Fast GPU-based implementations
  - Weights can be initialized better (use of unlabeled data, Restricted Boltzmann Machines)
  - Large collections of labeled data available
  - Reducing the number of parameters by weight sharing
  - Improved backpropagation algorithm
  - Success in different areas, e.g. traffic sign recognition, the handwritten digits problem

Convolutional neural networks
- A type of feed-forward ANN where the individual neurons are tiled so that they respond to overlapping regions in the visual field
- Inspired by biological processes
- Widely used for image recognition
- Multiple layers of small neuron collections, each looking at a small portion of the input image
- Hidden units in the m-th layer are connected to a local subset of units in the (m-1)-th layer, which have spatially contiguous receptive fields

Convolutional neural networks (cont.)
- Shared weights: each sparse filter h_i is replicated across the entire visual field. The replicated units form a feature map and share the same parametrization, i.e. the same weight vector and the same bias
  - Weights of the same color are shared, i.e. constrained to be identical
  - Replicating units allows features to be detected regardless of their position in the visual field
  - Weight sharing greatly reduces the number of free parameters to learn
- MaxPooling: another important concept of CNNs
  - Non-linear down-sampling: the input image is partitioned into a set of non-overlapping rectangles and the maximum value is taken for each such sub-region
  - Advantages:
    - Reduces the computational complexity for upper layers
    - Provides a form of translation invariance
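To make the two operations concrete, here is an illustrative sketch (not from the slides): a single 3×3 filter with shared weights slid over an image, followed by 2×2 non-overlapping max pooling.

```python
import numpy as np

def conv2d(image, kernel):
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # The SAME kernel weights are reused at every position:
            # this is the weight sharing that cuts down free parameters.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    h, w = fmap.shape[0] // size, fmap.shape[1] // size
    # Partition into non-overlapping size x size rectangles, keep the max.
    return fmap[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

image = np.random.rand(8, 8)
kernel = np.random.rand(3, 3)          # one "sparse filter" h_i
feature_map = conv2d(image, kernel)    # 6x6 feature map
pooled = max_pool(feature_map)         # 3x3 after down-sampling
```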

Krizhevsky 2012: ImageNet neural network
- The ImageNet challenge: recognize 1000 image categories
  - Training data: 1.2M manually cleaned training images (obtained by crowdsourcing)
- Krizhevsky's solution: a deep convolutional neural network
  - 5 convolutional layers, 3 fully connected layers
  - 60 million parameters and 650,000 neurons
  - New activation function for nodes (Rectified Linear Units)
  - Efficient GPU implementation of NN learning, highly optimized implementation of 2D convolution
  - Data augmentation
    - Generating image translations and horizontal reflections
    - Five 224×224 patches (the four corner patches and the center patch) as well as their horizontal reflections are taken from each 256×256 image
    - => transformation invariance, reduces overfitting
  - Additional refinements such as the "dropout" regularization method
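A hedged sketch of the crop-based augmentation just described: from a 256×256 image, take the four corner 224×224 patches plus the center patch, and the horizontal reflection of each, yielding 10 views per image.

```python
import numpy as np

def ten_crops(image, crop=224):
    h, w = image.shape[:2]              # expects a 256x256(x3) array here
    offsets = [(0, 0), (0, w - crop), (h - crop, 0),
               (h - crop, w - crop),
               ((h - crop) // 2, (w - crop) // 2)]   # 4 corners + center
    crops = []
    for top, left in offsets:
        patch = image[top:top + crop, left:left + crop]
        crops.append(patch)
        crops.append(patch[:, ::-1])    # horizontal reflection
    return np.stack(crops)

views = ten_crops(np.random.rand(256, 256, 3))
print(views.shape)                      # (10, 224, 224, 3)
```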

Krizhevsky 2012 (cont.)
- Great success!!!

Krizhevsky 2012 – more than just classification?
- Indications that the last hidden layers carry semantics!
- Suggestion in [Krizhevsky12]:
  - Responses of the last hidden layer can be used as a compact global image descriptor
  - Semantically similar images should have a small Euclidean distance
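A minimal sketch of this suggested usage: treat last-hidden-layer activations as global descriptors and rank a collection by Euclidean distance to a query. The 4096-dimensional features here are random stand-ins, not real network outputs.

```python
import numpy as np

features = np.random.rand(10000, 4096)   # descriptors for 10k database images
query = np.random.rand(4096)             # descriptor of the query image

dists = np.linalg.norm(features - query, axis=1)
nearest = np.argsort(dists)[:10]         # indices of the 10 most similar images
```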

Convolutional neural network implementations
- cuda-convnet
  - Original implementation by Alex Krizhevsky
- decaf
  - Python framework for training neural networks
- Caffe
  - Convolutional Architecture for Fast Feature Embedding
  - Berkeley Vision and Learning Center
  - C++/CUDA framework for deep learning and vision
  - An active research and development community
  - Main advantage compared with other implementations: it is FAST
  - Wrappers for Python and MATLAB
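A hedged sketch of feature extraction via Caffe's Python wrapper (pycaffe); the file names and the 'fc7' layer name follow the reference ImageNet model's conventions and are assumptions, and depending on the Caffe version the constructor may take only (prototxt, weights) without the phase argument.

```python
import numpy as np
import caffe

caffe.set_mode_cpu()
net = caffe.Net('deploy.prototxt',       # network definition (assumed file)
                'model.caffemodel',      # trained weights (assumed file)
                caffe.TEST)

# A preprocessed input image; real code would subtract the mean image etc.
image = np.random.rand(1, 3, 227, 227).astype(np.float32)
net.blobs['data'].data[...] = image
net.forward()
descriptor = net.blobs['fc7'].data.copy().ravel()   # 4096-dim activation
```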

DeCAF
- decaf
  - Python framework for training neural networks
  - Deprecated, replaced by Caffe
- DeCAF
  - Image features derived from a neural network trained for the ImageNet competition
  - 3 types: DeCAF5, DeCAF6, DeCAF7
  - Derived from the last 3 hidden layers of the ImageNet neural network
  - Descriptor sizes: ??? dimensions for DeCAF5, 4096 dimensions for DeCAF6 and DeCAF7

DeCAF (cont.)
- Performance of DeCAF features analyzed in [Donahue14] in the context of several image classification tasks
  - DeCAF5: not so good
  - DeCAF6 and DeCAF7: very good, in many cases outperforming state-of-the-art descriptors
  - DeCAF6 typically more successful, but only by a small margin

Utilization of DeCAF descriptors
- Recognition of new categories (unseen in ImageNet) by training a (linear) classifier on top of the DeCAF descriptors; a sketch follows this list
  - [Donahue14]
  - [Girshick14]
  - Two solutions of the ImageCLEF 2014 Scalable Concept Annotation Challenge
  - ...
- Very good results reported
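A sketch of this transfer-learning recipe: a linear classifier (here scikit-learn's LinearSVC, one plausible choice) trained on fixed DeCAF-style descriptors for categories the network never saw. Random features stand in for real DeCAF6/DeCAF7 activations.

```python
import numpy as np
from sklearn.svm import LinearSVC

X_train = np.random.rand(500, 4096)       # descriptors of labeled images
y_train = np.random.randint(0, 5, 500)    # 5 new category labels

clf = LinearSVC(C=1.0)                    # linear classifier on top of features
clf.fit(X_train, y_train)

X_new = np.random.rand(10, 4096)          # descriptors of unseen images
predictions = clf.predict(X_new)
```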

Similarity search: MPEG7 vs. DeCAF7
[Five slides of example retrieval results over a 20M-image collection; in each example the first image is the query, with results using MPEG7 descriptors shown alongside results using DeCAF7 descriptors.]

Literature
Books
- [Mitchell97] T. Mitchell. Machine Learning. McGraw Hill, 1997.
Research papers
- [Donahue14] J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, and T. Darrell. DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition. ICML 2014.
- [Girshick14] R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. CVPR 2014.
- [Jia14] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell. Caffe: An Open Source Convolutional Architecture for Fast Feature Embedding. Submitted to the ACM Multimedia 2014 Open Source Software Competition.
- [Krizhevsky12] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. NIPS 2012.

Literature (cont.)
Other
- J. Materna: Deep Learning: budoucnost strojového učení? (Deep Learning: the future of machine learning?)
- J. Čech: A Shallow Introduction into the Deep Machine Learning.
- Basic explanation of convolutional neural network principles