Attentional Neural Network: Feature Selection Using Cognitive Feedback


Attentional Neural Network: Feature Selection Using Cognitive Feedback
Qian Wang (1), Sen Song (1), Jiaxing Zhang (2), Zheng Zhang (3)
(1) Department of Biomedical Engineering, Tsinghua University, Beijing, China
(2) Microsoft Research Asia, Beijing, China
(3) Department of Computer Science, NYU Shanghai, Shanghai, China
Presented by Amir K. Afshar, Wayne State University, Department of Computer Science, 5 February 2018

What is an Attentional Neural Network? Motivation and Inspiration
The human visual system achieves remarkably robust performance in identifying and classifying objects and figures. This work proposes a model (the Attentional Neural Network, or aNN) that attempts to explain, or at least gain some insight into, the cognitive processes that make such performance possible.

What is an Attentional Neural Network? An Introduction, Part I: The Building Blocks
The aNN is a novel architecture that combines two major tasks, namely segmentation and classification. It is composed of a collection of simple modules:
1. A segmentation module*
2. A denoising module
3. A classification module
* The construction of the segmentation module is influenced by a "cognitive bias" vector b, with b ∈ {0, 1}^N (detailed on the following slide).

What is an Attentional Neural Network? A Brief Introduction, Part II: The Segmentation Module, Continued
As mentioned, the aNN has two primary tasks; the first is to segment the input data, which is the objective of the segmentation module, as the name implies. The ith element of the bias vector b encodes a prior belief about the membership of a segmented object in class i. For example, if N = 3, then b = (0, 1, 0) indicates a belief that the object belongs to the second class.
The input image x is mapped to a feature vector h = σ(W · x), where W is the feature weight matrix and σ is the sigmoid function. Simultaneously, b generates a gating vector g = σ(U · b), where U holds the feedback weights. The gating vector may then select or deselect features via the element-wise product h_g = h .* g. From h_g, the reconstructed segment is computed as y = σ(W' · h_g); the diagram shows this reconstruction as y, and its relationship to the gated image z is clarified on the next slide. A minimal sketch of this forward pass follows below.
Note that b need not be a binary vector; it may instead be a probability distribution encoding a mixture of guesses about which class the object belongs to. For the sake of simplicity, only two (simpler) scenarios were considered: b is a binary (one-hot) vector indicating one particular class of objects, with its associated feedback weights U_G, or a universal (group) bias b_G with equal weights for all classes, indicating the certain presence of an object (but of no particular class).
The segmentation model (with cognitive bias vector b) is denoted by M.
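As an illustration only, here is a minimal sketch of the segmentation module's forward pass in Python/NumPy. It assumes a single hidden layer with tied reconstruction weights (W' = W transposed is an assumption; the paper trains the generative model as an RBM), and all names are illustrative.

    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    def segment(x, b, W, U):
        # x: raw image (flattened), b: cognitive bias over N classes
        # W: feature weights (H x D), U: feedback weights (H x N)
        h = sigmoid(W @ x)       # feature vector extracted from the input image
        g = sigmoid(U @ b)       # gating vector generated by the cognitive bias
        hg = h * g               # element-wise gating selects/deselects features
        y = sigmoid(W.T @ hg)    # reconstructed segment (y in the figures)
        return y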

What is an Attentional Neural Network? A Brief Introduction, Part III: Segmentation to Classification
The second primary task of an aNN is classification. Denoising is an intermediate step and not nearly as critical (refer to slide 10). It would seem natural to feed the reconstruction y directly into a classifier C (as depicted). A critical issue with this, however, is a proneness to misclassification due to the loss of detail during segmentation; for example, suppose that during the reconstruction of y, M was given the wrong bias vector b. As a precaution against such a mishap, the reconstructed segment y is instead used to gate the raw image x with a threshold ε (that is, y must exceed this threshold), producing the gated image z = (y > ε) .* x for classification, where .* denotes element-wise multiplication. A sketch of this gating step follows below.
Figure 1 illustrates the aNN framework discussed so far. Figure 2 illustrates the same framework in principle, but extended to an iterative design (reminiscent of an RNN) to handle more complex segmentation problems. The red circles in the figures indicate the denoising modules (Wang, Song, Zhang & Zhang, 2014).
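Continuing the names from the sketch above, the thresholded gating step might look like the following; the threshold value is illustrative.

    def gate(x, y, eps=0.5):
        # z = (y > eps) .* x : keep raw pixel detail wherever the segment is confident
        z = (y > eps).astype(x.dtype) * x
        return z

The classifier C then receives z rather than y, so details lost during reconstruction are recovered from the raw image wherever the segment clears the threshold.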

aNN Classification: Some More Details
In iterative classification (Figure 2 of the previous slide), the system may be given an initial cognitive bias. Subsequently, a series of bias guesses b and classification results C is produced. Should a bias b agree with its result C, that b is considered a candidate for the final classification result. If a bias is chosen incorrectly, the raw image x may be transformed incorrectly, but the segments obtained under correct biases will generally be better than the transformed images obtained under wrong biases. A rough sketch of this loop follows.
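This is a hedged sketch of the iterative scheme, reusing the segment and gate functions from the earlier sketches; the classifier callable and the exact agreement rule are assumptions, not taken from the paper.

    import numpy as np

    def iterative_classify(x, W, U, classifier, n_classes, eps=0.5):
        candidates = []
        for k in range(n_classes):
            b = np.eye(n_classes)[k]      # guessed one-hot cognitive bias
            y = segment(x, b, W, U)       # re-segment the image under this bias
            z = gate(x, y, eps)           # gate the raw image with the segment
            c = classifier(z)             # classification result C
            if c == k:                    # bias agrees with the result:
                candidates.append((k, z)) # keep as a candidate final answer
        return candidates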

Some Notes and Details on Training Attentional Neural Networks
A shallow Restricted Boltzmann Machine (RBM) was used as the generative model; qualitatively similar results were achieved with autoencoders. The feature weights W and the feedback weights U are difficult to learn simultaneously because of their multiplicative interaction. To make training feasible, a feedback-disabled RBM was first trained on noisy data to learn W. Next, with W fixed, U was trained on noisy inputs against clean target data using backpropagation. This constrained U to learn to retain relevant features and discard irrelevant ones. A sketch of the second stage follows.
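As a rough illustration of the second stage only (the RBM stage that learns W is omitted), the following trains U by backpropagating a squared reconstruction error from noisy inputs toward clean targets while W stays fixed; the loss, learning rate, and plain gradient descent are assumptions made for the sketch.

    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    def train_U(U, W, noisy_xs, clean_xs, biases, lr=0.1, epochs=10):
        for _ in range(epochs):
            for x, t, b in zip(noisy_xs, clean_xs, biases):
                h = sigmoid(W @ x)              # features from the noisy input
                g = sigmoid(U @ b)              # gating from the cognitive bias
                hg = h * g
                y = sigmoid(W.T @ hg)           # reconstruction of the clean target
                # Backpropagate 0.5*(y - t)^2 through the gating path; update only U.
                d_y = (y - t) * y * (1 - y)     # gradient at the reconstruction layer
                d_hg = W @ d_y                  # gradient at the gated features
                d_ag = d_hg * h * g * (1 - g)   # gradient at the gating pre-activation
                U -= lr * np.outer(d_ag, b)     # W is left untouched
        return U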

Results and Analysis: The Data and Initial Methods
The MNIST and MNIST-2 datasets were used to evaluate effectiveness. "MNIST-2" is a dataset (introduced in this paper) created by laying two randomly chosen MNIST digits on top of each other. A 3-layer perceptron with 256 hidden nodes was trained on clean MNIST data, yielding a 1.6% error rate. One way such an overlay might be composed is sketched below.
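This is a hedged sketch of how one MNIST-2 style example might be composed; the exact compositing rule used in the paper is not stated on this slide, so the pixel-wise maximum below is only an assumption.

    import numpy as np

    def make_mnist2_example(images, labels, rng=np.random):
        # images: array of MNIST digits, labels: their classes
        i, j = rng.randint(0, len(images), size=2)   # two randomly chosen digits
        overlay = np.maximum(images[i], images[j])   # lay one digit on top of the other
        return overlay, (labels[i], labels[j])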

Results and Analysis Continued
If feature selection is sensitive to the choice of cognitive bias (and it is), then any given b should activate the corresponding relevant features. Here, for each bias from the set {0, 1, 2, 8}, the hidden units selected by the associated weights in U were identified and their associated feature weights in W inspected. The top features, when superimposed, compose a rough version of the digit of interest.
The "Top features" image illustrates the most popular features selected by different cognitive biases, namely b = 0, b = 1, b = 2, b = 8, and their accumulations. The "Reconstruction" image illustrates the effects of: 1. denoising without bias, 2. denoising with the group bias, 3. denoising with the correct bias, and 4. denoising with a wrong bias. The "Feature selection" image illustrates how the cognitive bias selects and eliminates features. A small sketch of this inspection follows.
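The sketch below illustrates that inspection: for a one-hot bias on class k, rank hidden units by their feedback weights in U, then superimpose the corresponding feature weights (rows of W) to visualize what the bias selects. The choice of top-k and the 28 x 28 reshape are illustrative assumptions.

    import numpy as np

    def top_feature_image(W, U, k, top_k=20, img_shape=(28, 28)):
        # W: feature weights (H x 784), U: feedback weights (H x N), k: bias class
        idx = np.argsort(U[:, k])[::-1][:top_k]    # hidden units most favored by bias k
        superimposed = W[idx].sum(axis=0)          # accumulate their feature weights
        return superimposed.reshape(img_shape)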

Results and Analysis Continued 2

Questions or Comments?

Thank you!