Intelligent Learning -- A Brief Introduction to Artificial Neural Networks
Chiung-Yao Fang
Learning
What is learning?
Types of learning: Incremental Learning; Active Learning
Types of learning: Supervised Learning; Unsupervised Learning; Reinforcement Learning
Understanding the Brain
Levels of analysis (Marr, 1982): computational theory; representation and algorithm; hardware implementation. Example: sorting.
The same computational theory may have multiple representations and algorithms. A given representation and algorithm may have multiple hardware implementations.
Reverse engineering: from hardware to theory
Understanding the Brain
Parallel processing: SIMD vs. MIMD
SIMD (single instruction, multiple data) machines: all processors execute the same instruction on different pieces of data.
MIMD (multiple instruction, multiple data) machines: different processors may execute different instructions on different data.
Neural net, NIMD (neural instruction, multiple data) machines: each processor corresponds to a neuron, its local parameters correspond to its synaptic weights, and the whole structure is a neural network.
Learning: weights are updated by training/experience; learning from examples.
Biological-Type Neural Networks
Application-Driven Neural Networks
Three main characteristics: adaptiveness and self-organization; nonlinear network processing; parallel processing
Perceptron (Rosenblatt, 1962)
What a Perceptron Does
Regression: y = wx + w0, with x0 = +1 as the bias unit and w, w0 as connection weights.
What a Perceptron Does
Classification: y = 1(wx + w0 > 0). Define s(.) as the threshold function; choose C1 if s(wx + w0) > 0, else choose C2.
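A minimal sketch of both uses of a single perceptron; the weight values w = 2.0 and w0 = -1.0 below are illustrative assumptions, not values from the slides.

w, w0 = 2.0, -1.0                       # illustrative connection weight and bias

def perceptron_regression(x):
    # linear output: y = w*x + w0
    return w * x + w0

def perceptron_classify(x):
    # threshold unit s(.): choose C1 when w*x + w0 > 0, otherwise C2
    return "C1" if w * x + w0 > 0 else "C2"

print(perceptron_regression(1.5))       # 2.0
print(perceptron_classify(0.2))         # C2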
K Outputs
Learning Boolean AND
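For example, the weights w1 = w2 = 1 with bias w0 = -1.5 realize Boolean AND; the small check below is an illustrative sketch rather than the slide's own figure.

def perceptron_and(x1, x2, w1=1.0, w2=1.0, w0=-1.5):
    # threshold unit: fires only when both inputs are 1
    return 1 if w1 * x1 + w2 * x2 + w0 > 0 else 0

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, perceptron_and(x1, x2))   # only (1, 1) yields 1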
XOR
No w0, w1, w2 satisfy (Minsky and Papert, 1969):
w0 ≤ 0, w2 + w0 > 0, w1 + w0 > 0, w1 + w2 + w0 ≤ 0
Adding the two strict inequalities gives w1 + w2 + 2w0 > 0, which together with w0 ≤ 0 contradicts the last condition, so a single perceptron cannot implement XOR.
Multilayer Perceptrons (Rumelhart et al., 1986)
x1 XOR x2 = (x1 AND ~x2) OR (~x1 AND x2)
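Following this decomposition, a multilayer perceptron with hand-set weights realizes XOR: two hidden threshold units compute x1 AND ~x2 and ~x1 AND x2, and the output unit ORs them. The particular weights below are illustrative assumptions.

def step(v):
    # threshold activation
    return 1 if v > 0 else 0

def xor_mlp(x1, x2):
    h1 = step(x1 - x2 - 0.5)        # x1 AND (NOT x2)
    h2 = step(x2 - x1 - 0.5)        # (NOT x1) AND x2
    return step(h1 + h2 - 0.5)      # h1 OR h2

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_mlp(x1, x2))   # 0 0 0 / 0 1 1 / 1 0 1 / 1 1 0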
Structures of Neural Networks
Connection Structures
Four types of weighted connections: feedforward connections, feedback connections, lateral connections, and time-delay connections
Connection Structures
Single-layer example
Taxonomy of Neural Networks (e.g., HAM, SOM)
Supervised and Unsupervised Networks
A Top-down Perspective
Applications: Association
Auto-association and hetero-association
Applications: Classification
Unsupervised classification (clustering); supervised classification
Applications: Pattern Completion
Two kinds of pattern completion problems:
Static pattern completion: multilayer nets, Boltzmann machines, and Hopfield nets
Temporal pattern completion: Markov models and time-delay dynamic networks
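As one concrete illustration of static pattern completion, the sketch below stores a single bipolar pattern in a Hopfield-style network with Hebbian weights and recovers it from a corrupted cue; the pattern itself is invented for the example.

import numpy as np

p = np.array([1, -1, 1, 1, -1])            # stored bipolar pattern (invented)
W = np.outer(p, p).astype(float)
np.fill_diagonal(W, 0)                     # Hebbian weights, no self-connections

state = np.array([1, -1, -1, 1, -1])       # cue with one corrupted component
for _ in range(3):                         # synchronous updates until stable
    state = np.where(W @ state >= 0, 1, -1)

print(state)                               # recovers the stored pattern p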
Applications: Regression and Generalization
Applications: Optimization
Examples: A Toy OCR
Optical character recognition (OCR); supervised learning; the retrieving phase and the training phase
Examples: A Toy OCR
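A hedged sketch of such a toy OCR: each character is a small binary pixel grid, the training phase stores one weight template per class, and the retrieving phase picks the class with the strongest response. The 3x3 glyphs and the class labels "T" and "L" are invented for illustration.

import numpy as np

# Training phase: one labelled 3x3 binary glyph per class (invented examples)
patterns = {
    "T": np.array([1, 1, 1,  0, 1, 0,  0, 1, 0], dtype=float),
    "L": np.array([1, 0, 0,  1, 0, 0,  1, 1, 1], dtype=float),
}
weights = {label: glyph for label, glyph in patterns.items()}   # template per class

# Retrieving phase: classify a noisy glyph by the highest inner product
noisy = np.array([1, 1, 1,  0, 1, 0,  0, 0, 0], dtype=float)    # degraded "T"
scores = {label: float(w @ noisy) for label, w in weights.items()}
print(max(scores, key=scores.get))                              # -> T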
Supervised Learning Neural Networks: Backpropagation and HAM
Backpropagation
Regression: forward and backward passes over the input x
Hidden Layer
Can we have more hidden layers? Yes, but the network becomes more complicated.
"Long and narrow" network vs. "short and fat" network
Two-hidden-layer example: for every input case or region, that region can be delimited by hyperplanes on all sides using hidden units in the first hidden layer; a hidden unit in the second layer then ANDs them together to bound the region.
It has been proven that an MLP with one hidden layer can learn any nonlinear function of the input.
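A minimal sketch of one forward and one backward pass for an MLP with a single tanh hidden layer doing regression on squared error; the layer sizes, learning rate, and random data are my own assumptions, not the slides' notation.

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 3))                  # one input sample with 3 features
t = np.array([[0.5]])                        # regression target

W1 = rng.normal(scale=0.1, size=(3, 4)); b1 = np.zeros(4)   # hidden layer
W2 = rng.normal(scale=0.1, size=(4, 1)); b2 = np.zeros(1)   # linear output unit
lr = 0.1

# forward pass
h = np.tanh(x @ W1 + b1)
y = h @ W2 + b2

# backward pass: gradients of the error 0.5 * (y - t)^2
dy = y - t
dW2 = h.T @ dy;  db2 = dy.sum(axis=0)
dh = (dy @ W2.T) * (1 - h ** 2)              # tanh derivative
dW1 = x.T @ dh;  db1 = dh.sum(axis=0)

# gradient-descent weight update
W2 -= lr * dW2;  b2 -= lr * db2
W1 -= lr * dW1;  b1 -= lr * db1
print(y.item(), t.item())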
HAM (Hetero-Associative Memory) Neural Network
Structure (figure): an input layer with units x_j connected through excitatory weights w_ij to an output (competitive) layer with units v_i (v_1, ..., v_n).
Training Patterns for HAM
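A sketch of the retrieval step under simple assumptions: each output (competitive) neuron i holds excitatory weights w_ij, its net input is v_i = sum_j w_ij x_j, and the competition is approximated by a winner-take-all argmax. The two stored prototypes are invented.

import numpy as np

# one row of excitatory weights w_ij per output (competitive) neuron (invented)
W = np.array([[1., 0., 1., 0.],     # prototype associated with output neuron 0
              [0., 1., 0., 1.]])    # prototype associated with output neuron 1

def ham_recall(x):
    v = W @ x                        # net inputs v_i = sum_j w_ij * x_j
    return int(np.argmax(v)), v      # competition: strongest output neuron wins

x = np.array([1., 0., 1., 1.])       # noisy version of the first prototype
print(ham_recall(x))                 # -> (0, array([2., 1.]))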
Unsupervised Learning Neural Networks: SOM, ART1, ART2
Self-Organizing Feature Maps (SOM)
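A minimal self-organizing map sketch under common textbook assumptions (a 1-D map, Euclidean best-matching unit, Gaussian neighborhood, decaying learning rate); all parameter values are illustrative.

import numpy as np

rng = np.random.default_rng(1)
n_units, dim = 10, 2
W = rng.random((n_units, dim))                    # weight vector of each map unit
data = rng.random((500, dim))                     # toy 2-D input samples

lr, sigma = 0.5, 2.0
for x in data:
    bmu = np.argmin(np.linalg.norm(W - x, axis=1))        # best-matching unit
    d = np.arange(n_units) - bmu                          # distance along the 1-D map
    h = np.exp(-d ** 2 / (2 * sigma ** 2))                # Gaussian neighborhood
    W += lr * h[:, None] * (x - W)                        # pull neighbors toward x
    lr *= 0.995;  sigma = max(0.5, sigma * 0.995)         # shrink rate and radius

print(np.round(W, 2))        # unit weights now spread in an ordered fashion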
An Assembly of SSO Neural Networks for Character Recognition
ART1 Neural Networks
ART2 Neural Networks
Architecture (figure): an attentional subsystem with the input representation field F1 (internal signals p, q, r, u, v, w, x over the input vector i), a category representation field F2, gain control G, a signal generator S, and an orienting subsystem that sends a reset signal to F2.
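The full ART2 dynamics (F1, F2, gain control G, orienting subsystem, reset) are beyond a short example, but the core match-and-reset idea can be sketched with an ART1-style vigilance test on binary inputs. Everything below, including the vigilance value and sample vectors, is a simplified assumption rather than the slides' equations.

import numpy as np

def art1_like(inputs, rho=0.6):
    """Cluster binary vectors: commit a new category when no stored
    prototype passes the vigilance test |x AND w| / |x| >= rho."""
    prototypes, labels = [], []
    for x in inputs:
        placed = False
        # try existing categories in order of match strength
        order = sorted(range(len(prototypes)),
                       key=lambda k: -np.sum(np.minimum(x, prototypes[k])))
        for k in order:
            match = np.sum(np.minimum(x, prototypes[k])) / max(np.sum(x), 1)
            if match >= rho:                                    # resonance
                prototypes[k] = np.minimum(x, prototypes[k])    # fast learning
                labels.append(k); placed = True
                break                                           # else: reset, try next
        if not placed:                                          # no resonance at all
            prototypes.append(x.copy()); labels.append(len(prototypes) - 1)
    return labels

X = np.array([[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [0, 1, 1, 1]])
print(art1_like(X))          # -> [0, 0, 1, 1]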
Road Sign Recognition System
Classification Results of ART2: training set and test set
Conclusions
STA Neural Networks
STA (Spatial-Temporal Attention) Neural Network
Structure (figure): an input layer with units n_j (inputs x_j) connected through excitatory weights w_ij to an output (attention) layer with units n_i, n_k (activations a_i, a_k); attention-layer neurons are linked by inhibitory lateral connections.
STA Neural Network
The input to attention neuron n_i due to the input stimuli x: the linking strengths w_kj between corresponding neurons of the input and attention layers follow a Gaussian function G (figure: input neuron n_j, attention neurons n_i and n_k, radius r_k).
STA Neural Network
The input to attention neuron n_i due to lateral interaction is weighted by a "Mexican-hat" function of the lateral distance (figure: interaction strength vs. lateral distance).
STA Neural Network
The net input to attention neuron n_i: a threshold limits the effects of noise, and the decay parameter d satisfies -1 < d < 0.
STA Neural Network
Figure: the activation of an attention neuron in response to a stimulus (activation vs. time t).
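Since the slides' exact equations are not reproduced above, the sketch below only illustrates the named ingredients under my own assumptions: a Gaussian weighting G of nearby input stimuli, a "Mexican-hat" lateral term (difference of Gaussians), a decay parameter d with -1 < d < 0 applied to the previous activation, and a threshold that suppresses noise.

import numpy as np

def mexican_hat(dist, sigma_e=1.0, sigma_i=3.0):
    # difference of Gaussians: excitatory near the center, inhibitory farther away
    return np.exp(-dist**2 / (2 * sigma_e**2)) - 0.5 * np.exp(-dist**2 / (2 * sigma_i**2))

def sta_step(x, a_prev, d=-0.2, theta=0.1, sigma=1.0):
    idx = np.arange(len(x))
    diff = idx[:, None] - idx[None, :]
    G = np.exp(-diff**2 / (2 * sigma**2))                    # Gaussian input weighting (assumed)
    net = G @ x + mexican_hat(diff) @ a_prev + d * a_prev    # stimulus + lateral + decay
    net[np.abs(net) < theta] = 0.0                           # threshold against noise
    return np.clip(net, 0.0, 1.0)                            # bounded attention activation

x = np.zeros(15); x[7] = 1.0                                 # a single spatial stimulus
a = np.zeros(15)
for _ in range(5):
    a = sta_step(x, a)
print(np.round(a, 2))                                        # attention peaks at position 7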
Results of STA Neural Networks