CS 638/838 Lab 3
Presented by Yuting Liu
Feb 7, 2017

Outline
1. Data Set
2. Code Skeleton
3. Requirements
4. Suggestions

Data Set
- Includes 912 images (from http://www.vision.caltech.edu/Image_Datasets/Caltech101/)
- Down-sampled to 128 x 128
- Classified into 6 categories: Flower, Starfish, Watch, Butterfly, Airplane, Grand Piano
- Image files (in three subdirs) are named in the format label_image_XXXX (4 digits)

Code Skeleton
Code and data (zipped here) are online.
Code hierarchy:
- Lab3.java (main function)
  - Loads each image in as an Instance object
  - Stores all the image instances in the dataset
- Dataset.java
  - Stores all the image instances as a list
- Instance.java
  - Provides methods to separate the RGB channels of an image
  - Provides a method to convert an image to grey scale
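For context only (the provided skeleton already does this), reading an image file into a BufferedImage takes one call to the standard library:

import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// For context only; the provided skeleton already handles loading.
class ImageLoadingSketch {
    // Reads one image file from disk into a BufferedImage.
    static BufferedImage loadImage(String path) throws IOException {
        return ImageIO.read(new File(path));
    }
}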

Code Skeleton (continued)
- Lab3.java
  - void main(): converts images to (1D) 'fixed-length feature vectors'
- Instance.java
  - Fields: BufferedImage image; String label; int width, height
  - 2D matrices with separate RGB channels, plus a grey-scale matrix; you choose whether to use RGB or grey scale
  - FOUR 2D matrices in total: R + G + B + Grey
  - Each matrix element is an int in [0, 255]
- Dataset.java
  - void add(Instance image)
  - List<Instance> getImages()
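As a rough illustration of the 'fixed-length feature vector' step, flattening one of those matrices might look like the sketch below. This is a hypothetical helper, not the provided skeleton; scaling pixels into [0, 1] is an assumption, not a requirement:

// Hypothetical helper, not part of the provided skeleton: flatten a
// grey-scale matrix (ints in [0, 255]) into a 1D feature vector of
// doubles in [0, 1].
class FeatureVectorSketch {
    static double[] toFeatureVector(int[][] grey) {
        int height = grey.length, width = grey[0].length;
        double[] features = new double[height * width];
        for (int row = 0; row < height; row++) {
            for (int col = 0; col < width; col++) {
                features[row * width + col] = grey[row][col] / 255.0;
            }
        }
        return features;
    }
}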

Requirements
- See the Day 1 'course overview' slides for the required network structure
- Only edit Lab3.java (we will load the other provided files during testing, so do NOT edit them! We will fix any bugs, announce them, and update the files)
- Lab3.java should read in UP TO FOUR strings: the names of the train, tune, and test directories, plus imageSize (details later)
- For each of the six categories, we already put every 5th example in TUNE and every 6th in TEST (the rest in TRAIN)

Requirements (continued)
- Use DROPOUT (50% of weights)
- Use ReLU for the hiddens and the six output nodes (a minimal ReLU sketch follows this list)
- How many HUs per layer? Up to you; see how fast your code runs, and discuss in Piazza
- Use early stopping to reduce overfitting
- Max number of epochs? Whatever is feasible; discuss in Piazza as you run experiments
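Since ReLU is required, here is a minimal sketch of the activation and the derivative that back prop needs (taking the derivative at exactly 0 to be 0 is a common convention, not a requirement from the slides):

// ReLU for the forward pass and its derivative for back prop.
class ReluSketch {
    static double relu(double x)      { return Math.max(0.0, x); }
    static double reluDeriv(double x) { return (x > 0.0) ? 1.0 : 0.0; }
}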

Requirements (continued)
- Test-set accuracy will probably be poor; the learning curve (later) will give us a sense of whether more data would help
- Optional Experiment 1: alter the provided TRAIN images to create more examples
  - Shift a little, change colors?, etc (one possible sketch follows this list)
  - Post links to scientific papers with ideas from the literature
- Optional Experiment 2: try more layers
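For Optional Experiment 1, one hypothetical way to create extra TRAIN examples is a left-right mirror; shifts and color changes would follow the same read-pixels/write-pixels pattern:

import java.awt.image.BufferedImage;

// Hypothetical augmentation for Optional Experiment 1: mirror an image
// left-to-right, doubling the number of TRAIN examples.
class AugmentationSketch {
    static BufferedImage flipHorizontally(BufferedImage src) {
        int w = src.getWidth(), h = src.getHeight();
        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        for (int x = 0; x < w; x++) {
            for (int y = 0; y < h; y++) {
                dst.setRGB(w - 1 - x, y, src.getRGB(x, y));
            }
        }
        return dst;
    }
}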

WHAT TO OUTPUT
- Your code should output (in plain ASCII) a CONFUSION MATRIX (see the next slides)
- Each TEST SET example increments the count in ONE CELL
- Also report overall test-set accuracy

(Matrix layout: columns give the CORRECT category and rows give the PREDICTED category; the categories are Flower, Starfish, Watch, Piano, Airplane, Butterfly.)
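A minimal sketch of accumulating and printing such a matrix (the array layout and method name are assumptions; rows are the predicted category and columns the correct one, matching the example slides that follow):

// Hypothetical sketch: counts[predicted][correct], matching the layout
// on the next slides (rows = predicted category, columns = correct).
class ConfusionMatrixSketch {
    static void print(int[][] counts, String[] labels) {
        int total = 0, errors = 0;
        for (int p = 0; p < labels.length; p++) {
            int rowSum = 0;
            StringBuilder row = new StringBuilder(labels[p] + "\t");
            for (int c = 0; c < labels.length; c++) {
                row.append(counts[p][c]).append('\t');
                rowSum += counts[p][c];
                total  += counts[p][c];
                if (p != c) errors += counts[p][c]; // off-diagonal = error
            }
            System.out.println(row + "row SUM = " + rowSum);
        }
        System.out.printf("total examples = %d, errors = %d (%.1f%%)%n",
                          total, errors, 100.0 * errors / total);
    }
}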

PERCEPTRON TEST-SET RESULTS (with dropout)

                 CORRECT category
PREDICTED     airplanes  butterfly  flower  grand_piano  starfish  watch
airplanes         36          2        1         1           1       1    row SUM = 42
butterfly          0          6        3         0           3       2    row SUM = 14
flower             0          2       28         2           4       3    row SUM = 39
grand_piano        0          0        0        13           0       0    row SUM = 13
starfish           1          5        2         1           4       2    row SUM = 15
watch              4          3        3         2           5      38    row SUM = 55
column SUM        41         18       37        19          17      46

total examples = 178, errors = 53 (29.8%)

- Used FOUR input units per pixel (red + blue + green + grey)
- Trained SIX perceptrons; 100% TRAIN accuracy after about 50 epochs (used early stopping)
- Trained EACH as a BINARY CLASSIFIER (e.g., watch vs. not_watch)
- But when PREDICTING, used the perceptron with the largest weighted sum

PERCEPTRON TEST-SET RESULTS (w/o dropout)

                 CORRECT category
PREDICTED     airplanes  butterfly  flower  grand_piano  starfish  watch
airplanes         36          2        1         3           0       3    row SUM = 45
butterfly          0          6        1         0           2       1    row SUM = 10
flower             0          1       23         1           3       2    row SUM = 30
grand_piano        0          0        0        10           1       0    row SUM = 11
starfish           1          8       11         1           8       4    row SUM = 33
watch              4          1        1         4           3      36    row SUM = 49
column SUM        41         18       37        19          17      46

total examples = 178, errors = 59 (33.2%)

(Same setup as the previous slide: four input units per pixel, six binary perceptrons trained with early stopping, prediction by the largest weighted sum.)
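The 'largest weighted sum' prediction step, as a hypothetical sketch (the weight layout, with one extra slot per perceptron for the bias, is an assumption):

// Hypothetical sketch of 'predict with the largest weighted sum': each
// of the six one-vs-rest perceptrons scores the example, and the highest
// pre-activation sum wins. weights[k] has one extra slot at the end for
// the bias (whose input is fixed at 1).
class ArgmaxPredictSketch {
    static int predict(double[][] weights, double[] features) {
        int bestCategory = 0;
        double bestSum = Double.NEGATIVE_INFINITY;
        for (int k = 0; k < weights.length; k++) {
            double sum = weights[k][features.length]; // bias term
            for (int i = 0; i < features.length; i++) {
                sum += weights[k][i] * features[i];
            }
            if (sum > bestSum) { bestSum = sum; bestCategory = k; }
        }
        return bestCategory;
    }
}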

Experiment
- Create and discuss a learning curve
- Plot TEST (and maybe TUNE) ERROR RATES as a function of the # of training examples
- Train on 25%, 50%, 75%, and 100% of the train examples (if your code is fast enough, create more points on the learning curve)
- Use the SAME TUNE AND TEST sets for all points on the learning curve (if feasible, sample the trainset multiple times per point and plot the mean)
- The code you submit for our testing should use 100% of the training examples

[Figure: Perceptron learning curve. X-axis: fraction of training set used. Two curves: 'GREY only (w/ drop out)' and 'with drop out'.]

Image Size
Raw images are 128x128; we provide code for re-sizing. Some perceptron test-set error rates (GREY-only results in parentheses):

  8x8      41% (46%)
  16x16    33% (39%)
  32x32    31% (36%)
  64x64    31% (35%)
  128x128  30% (33%)
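The provided re-sizing code is what you should use; purely for context, one common way to down-sample a square image in plain Java looks roughly like this:

import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

// For context only; use the provided re-sizing code instead.
class ResizeSketch {
    static BufferedImage resize(BufferedImage src, int size) {
        BufferedImage dst = new BufferedImage(size, size, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                           RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        g.drawImage(src, 0, 0, size, size, null); // scale down to size x size
        g.dispose();
        return dst;
    }
}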

One Layer of Hidden Units
- Used 250 HUs, 1000 epochs, dropout on weights
- Not doing as well as the perceptrons; why?
  - 8x8: 48% error rate on the testset
  - 16x16: 58% (46% GREY only)
  - 32x32: (47% GREY only)
- Training is slow; still debugging
  - will post more stats later
  - will start on the convolution net soon

Suggestions
You might want to try using your Lab 1 and/or Lab 2 code first: get everything to work, then add layers. It makes a nice 'baseline' system for evaluating the value of going DEEP.
1. First, get training to work and report errors
2. Next, create a confusion matrix on the train set
3. After that, create confusion matrices on the tune and test sets (use test = train to debug)
4. Add 'drop out' next
5. Then incorporate the early-stopping code
6. Finally, allow using a fraction of TRAIN to generate the 'learning curve' (permute, then copy the first N items; see the sketch below)
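A minimal sketch of that last step, assuming the provided Instance class; fixing the shuffle seed makes the 25% subset a prefix of the 50% subset and so on, while the tune and test sets stay untouched:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of 'permute, then copy the first N items'.
// Instance is the provided class.
class LearningCurveSketch {
    static List<Instance> sampleTrainSet(List<Instance> train,
                                         double fraction, long seed) {
        List<Instance> shuffled = new ArrayList<>(train);
        Collections.shuffle(shuffled, new Random(seed)); // fixed seed: nested subsets
        int n = (int) Math.round(fraction * shuffled.size());
        return new ArrayList<>(shuffled.subList(0, n));
    }
}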

Dropout Tips
- You need to remember which weights are 'dropped out' during forward prop, so they are also ignored during back prop:
  Boolean[] dropOutMask = (dropoutRate > 0.0 ? new Boolean[inputVectorSize] : null);
- Remember, on TUNE and TEST examples, to multiply all weights by (1 - dropoutRate) so the average weighted sum is the same as during training
- I never dropped out the weight representing the BIAS (so I didn't multiply it by 1 - dropoutRate)
- Ideally you would do multiple measurements per plotted point, since random numbers are used
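Putting those tips together, here is a hypothetical end-to-end sketch; the method names and weight layout are assumptions, not the provided skeleton:

import java.util.Random;

// Hypothetical sketch of the dropout bookkeeping above. mask[i] == true
// means weight i is dropped for the current training example.
class DropoutSketch {
    static boolean[] sampleDropOutMask(int inputVectorSize,
                                       double dropoutRate, Random rng) {
        boolean[] mask = new boolean[inputVectorSize];
        for (int i = 0; i < inputVectorSize; i++) {
            mask[i] = rng.nextDouble() < dropoutRate;
        }
        return mask;
    }

    // Weighted sum during TRAINING: masked weights contribute nothing, and
    // back prop must skip their gradient updates using the SAME mask.
    // The bias weight (last slot) is never dropped.
    static double trainSum(double[] weights, double[] inputs, boolean[] mask) {
        double sum = weights[inputs.length]; // bias, input fixed at 1
        for (int i = 0; i < inputs.length; i++) {
            if (!mask[i]) sum += weights[i] * inputs[i];
        }
        return sum;
    }

    // Weighted sum on TUNE/TEST examples: no mask, but every non-bias
    // weight is scaled by (1 - dropoutRate) so the expected weighted sum
    // matches training. The bias is NOT scaled.
    static double testSum(double[] weights, double[] inputs, double dropoutRate) {
        double sum = weights[inputs.length];
        for (int i = 0; i < inputs.length; i++) {
            sum += (1.0 - dropoutRate) * weights[i] * inputs[i];
        }
        return sum;
    }
}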

Enjoy it! 