ANN Generalization
with Daniel L. Silver, Ph.D. and Christian Frey, BBA
April 11-12, 2017

ANN Generalization (Deep Learning Workshop)
with Daniel L. Silver, Ph.D. and Christian Frey, BBA, April 11-12, 2017

Generalization
The objective of learning is to achieve good generalization to new cases; otherwise, just use a look-up table. Generalization can be defined as a mathematical interpolation or regression over a set of training points.
[Figure: a curve f(x) interpolated through a set of training points x]

Generalization: An Example, Computing Parity
A network with n bits of input and (n+1)^2 weights computes the parity bit of its input. Can it learn from m examples to generalize to all 2^n possible input patterns?
[Figure: parity network with n input bits, hidden threshold units (>0, >1, >2, ...), and output weights of +1 and -1, producing the parity bit value]
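As a minimal sketch of this experiment (my own illustration using scikit-learn, not code from the workshop), we can enumerate all 2^n patterns, train a small MLP on a random subset of m of them, and measure accuracy on the full set; n, m, and the layer sizes below are arbitrary choices.

```python
# Illustrative parity-generalization experiment (assumed setup, not the slides' code).
import itertools
import numpy as np
from sklearn.neural_network import MLPClassifier

n = 8                                                      # number of input bits
X = np.array(list(itertools.product([0, 1], repeat=n)))   # all 2^n patterns
y = X.sum(axis=1) % 2                                      # parity bit for each pattern

rng = np.random.default_rng(0)
m = 180                                                    # number of training cases
train_idx = rng.choice(len(X), size=m, replace=False)

net = MLPClassifier(hidden_layer_sizes=(n,), activation='tanh',
                    max_iter=5000, random_state=0)
net.fit(X[train_idx], y[train_idx])

print("train accuracy:", net.score(X[train_idx], y[train_idx]))
print("accuracy on all 2^n patterns:", net.score(X, y))
```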

Generalization: Network test of 10-bit parity (Denker et al., 1987)
When the number of training cases m is much larger than the number of weights, generalization occurs.
[Figure: test error vs. fraction of cases used during training (0.25, 0.50, 0.75, 1.0), approaching 100% correct]

Generalization: A Probabilistic Guarantee
N = # hidden nodes, m = # training cases, W = # weights, ε = error tolerance (< 1/8).
The network will generalize with 95% confidence if:
1. The error on the training set is < ε/2
2. m >= (W/ε) log(N/ε), roughly m > W/ε in practice
Based on PAC theory; this provides a good rule of practice.

Generalization
Consider the 20-bit parity problem: a 20-20-1 net has 441 weights. For 95% confidence that the net will predict within the error tolerance ε, the rule above says we need roughly m > W/ε training examples. Not bad, considering parity is one of the most difficult problems to learn strictly by example using any method; it is, in fact, the XOR problem generalized to n bits.
NOTE: most real-world processes/functions that are learnable are not this difficult. It should also be noted that certain sequences of events or random bit patterns cannot be modeled (example: the weather). Kolmogorov showed that the compressibility of a sequence is an indicator of its ability to be modeled. Interesting facts: (1) consider π: its probability of compression is 1, since it can be compressed to a very small algorithm, yet its limit is unknown (it is a transcendental number); (2) consider a series of 100 random numbers between 1 and 1,000: its probability of compression is less than 1%. Always consider that certain factors affecting a process may be non-deterministic, in which case the best model will be an approximation.
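As a quick check of the arithmetic, the sketch below (illustrative only; ε = 0.1 is my own example value within the < 1/8 limit, not a number from the slide) counts the weights of a 20-20-1 net, including bias weights, and applies the m > W/ε rule.

```python
# Rule-of-thumb sample-size estimate for the 20-20-1 parity network.
n_in, n_hid, n_out = 20, 20, 1
W = (n_in * n_hid + n_hid) + (n_hid * n_out + n_out)   # 441 weights incl. biases

epsilon = 0.1                                           # example tolerance (< 1/8)
m_needed = W / epsilon                                  # about 4,410 examples
print(f"W = {W}, need m > {m_needed:.0f} of the {2**20} possible patterns")
```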

Generalization: Training Sample & Network Complexity
Based on m > W/ε: decreasing W reduces the size of the training sample required, while increasing W supplies the freedom to construct the desired function. The optimum W corresponds to the optimum number of hidden nodes.

Generalization: Over-Training
Over-training is the equivalent of over-fitting a set of data points with a curve that is too complex.
Occam's Razor (1300s): "plurality should not be assumed without necessity." The simplest model that explains the majority of the data is usually the best.
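To make the curve-fitting analogy concrete, here is a small illustration of my own (not from the slides): fitting the same noisy samples with a simple and an overly complex polynomial, then comparing their errors on new points from the underlying function.

```python
# Over-fitting illustration: a too-complex curve fits the noise, not the function.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 15)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)   # noisy samples

simple = np.polyfit(x, y, deg=3)      # simple model
complex_ = np.polyfit(x, y, deg=12)   # over-complex model

x_new = np.linspace(0, 1, 200)
true_new = np.sin(2 * np.pi * x_new)
for name, coeffs in [("degree 3", simple), ("degree 12", complex_)]:
    err = np.mean((np.polyval(coeffs, x_new) - true_new) ** 2)
    print(f"{name}: mean squared error on new points = {err:.3f}")
```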

Generalization
How can we prevent over-fitting and control the number of effective weights?
- Tuning of network architecture
- Early stopping
- Weight decay
- Model averaging (ensemble methods)
- Weight sharing
- Generative pre-training
- Dropout

Tuning of network architecture
Tweak the number of layers and the number of hidden nodes until the best test error is achieved.
[Figure: the available examples are divided randomly, 70% into a training set and validation set used to develop one ANN model, and 30% into a production set used to compute the test error; generalization error = test error]
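A minimal sketch of that split (placeholder data; the 70/30 proportions are from the figure, while the inner training/validation split is my own choice):

```python
# Split available examples into development (training + validation) and production sets.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.random.rand(1000, 10)          # placeholder features
y = (X.sum(axis=1) > 5).astype(int)   # placeholder labels

X_dev, X_prod, y_dev, y_prod = train_test_split(X, y, test_size=0.30, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_dev, y_dev, test_size=0.25, random_state=0)
# X_train/X_val develop and tune one ANN model; X_prod estimates the test error.
```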

Generalization: Preventing Over-training
- Use a separate validation or tuning set of examples.
- Monitor the error on the validation set as the network trains.
- Stop training just prior to the over-fit error occurring (early stopping or tuning).
- The number of effective weights is thereby reduced.
- Most new systems have automated early stopping methods.
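A minimal early-stopping sketch (an assumed implementation of the idea above, not the workshop's Backpropagation_ovt.py): train incrementally, monitor the validation error each epoch, and stop once it has not improved for a set number of epochs.

```python
# Early stopping by monitoring error on a validation set.
import copy
import numpy as np
from sklearn.neural_network import MLPClassifier

def train_with_early_stopping(X_train, y_train, X_val, y_val,
                              n_hidden=20, patience=10, max_epochs=500):
    net = MLPClassifier(hidden_layer_sizes=(n_hidden,), random_state=0)
    classes = np.unique(y_train)
    best_err, best_net, epochs_since_best = np.inf, None, 0
    for epoch in range(max_epochs):
        net.partial_fit(X_train, y_train, classes=classes)   # one training pass
        val_err = 1.0 - net.score(X_val, y_val)               # validation error
        if val_err < best_err:
            best_err, best_net, epochs_since_best = val_err, copy.deepcopy(net), 0
        else:
            epochs_since_best += 1
        if epochs_since_best >= patience:   # stop just before over-fitting sets in
            break
    return best_net, best_err
```

In line with the note that most new systems automate this, scikit-learn's MLPClassifier also offers a built-in version via early_stopping=True and validation_fraction.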

Generalization: Weight Decay
Weight decay is an automated method of effective weight control. Adjust the backpropagation error function to penalize the growth of unnecessary weights:
E' = E + (λ/2) Σi wi^2
where λ = weight-cost parameter. Each weight is decayed by an amount proportional to its magnitude; weights that are not reinforced by training decay toward 0.
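A minimal numpy sketch (my own, assuming the penalized error E' = E + (λ/2) Σ wi^2 above): the gradient of the penalty adds λ·w to each weight's gradient, so weights the task error does not reinforce shrink toward zero.

```python
# One gradient-descent step on the weight-decay-penalized error.
import numpy as np

def weight_decay_step(w, grad_E, lr=0.1, lam=0.05):
    """Step on E' = E + (lam/2) * sum(w**2); penalty gradient is lam * w."""
    return w - lr * (grad_E + lam * w)

# With a zero task gradient, the weights simply decay toward 0.
w = np.array([1.0, -2.0, 0.5])
for _ in range(100):
    w = weight_decay_step(w, grad_E=np.zeros_like(w))
print(w)   # each weight shrunk by a factor of (1 - lr*lam)**100, about 0.61
```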

TUTORIAL 2: Generalization using a validation set (Python code), Backpropagation_ovt.py

Face Image Morphing (Liangliang Tu, 2010)
Image morphing: inductive transfer between tasks that have multiple outputs. Transforms 30x30 grey-scale images using inductive transfer across three mapping tasks (NA, NH, NS).

Face Image Morphing
[Figures: passport image morphed to angry (filtered) and passport image morphed to sad (filtered)]