Transfer functions: hidden possibilities for better neural networks. Włodzisław Duch and Norbert Jankowski, Department of Computer Methods, Nicholas Copernicus University, Toruń, Poland.

Why is this an important issue? MLPs are universal approximators, so are other transfer functions (TFs) needed at all? Yes: the wrong bias leads to poor results and overly complex networks. Two 2-class examples make the point. Class 1 inside a sphere, Class 2 outside: an MLP needs at least N+1 hyperplanes, $O(N^2)$ parameters, while an RBF network needs 1 Gaussian, $O(N)$ parameters. Class 1 in the corner cut off by the $(1,1,\dots,1)$ hyperplane, Class 2 outside: an MLP needs 1 hyperplane, $O(N)$ parameters, while an RBF network needs many Gaussians, $O(N^2)$ parameters, and still approximates poorly. The first case is illustrated numerically below.
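
To make the sphere example concrete, here is a minimal numerical sketch (not from the paper; the data, dimension, and thresholds are illustrative): a single Gaussian unit separates the classes exactly, while a single hyperplane stays near chance level.

```python
# Sphere example (illustrative sketch): one Gaussian unit vs. one hyperplane.
import numpy as np

rng = np.random.default_rng(0)
N = 10                                  # input dimension
X = rng.normal(size=(2000, N))
r = np.linalg.norm(X, axis=1)
y = (r < np.median(r)).astype(int)      # Class 1: inside the sphere

# Single Gaussian unit centered at the origin: G(X) = exp(-||X||^2 / b^2).
b = np.median(r)
gauss_out = np.exp(-(r / b) ** 2)
acc_gauss = np.mean((gauss_out > np.exp(-1.0)) == y)  # threshold at r = b

# Single hyperplane: any linear projection mixes the two classes, since
# points of both classes lie in every half-space.
w = rng.normal(size=N)
proj = X @ w
thr = np.median(proj)
acc_plane = max(np.mean((proj > thr) == y), np.mean((proj < thr) == y))

print(f"1 Gaussian unit: accuracy ~ {acc_gauss:.2f}")  # close to 1.0
print(f"1 hyperplane:    accuracy ~ {acc_plane:.2f}")  # close to 0.5
```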

Inspirations. The logical rule IF $x_1>0$ AND $x_2>0$ THEN Class 1 ELSE Class 2 is represented properly by neither MLP nor RBF! As a result, on some datasets (cf. hypothyroid) decision trees and logical rules perform significantly better than MLPs. Speed of learning and network complexity depend on the TF: fast learning requires flexible "brain modules", i.e. flexible TFs. Biological inspirations: sigmoidal neurons are only a crude approximation of the basic level of neural tissue; interesting brain functions are carried out by interacting minicolumns implementing complex functions. Modular networks: networks of networks. The first step beyond single neurons: transfer functions providing flexible decision borders. A soft version of the rule above is sketched below.
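
As an illustration (my sketch, not from the paper), the rule above can be represented exactly by a single node computing a product of two sigmoids, i.e. a more flexible transfer function, whereas one hyperplane or one Gaussian cannot match its rectangular decision border:

```python
# Soft-AND node for: IF x1>0 AND x2>0 THEN Class 1 (illustrative sketch).
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def soft_and(x1, x2, slope=20.0):
    # Product of two sigmoids: output ~1 only in the quadrant x1>0, x2>0.
    return sigmoid(slope * x1) * sigmoid(slope * x2)

pts = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
for x1, x2 in pts:
    print((x1, x2), "->", round(float(soft_and(x1, x2)), 3))
# (1,1) -> ~1; the other three quadrants -> ~0, matching the rule exactly.
```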

Transfer functions. A transfer function $f(I(X))$ combines a vector activation $I(X)$ with a scalar output function $o(I)$.
1. Fan-in: scalar product activation $I(X)=W\cdot X$, giving hyperplane decision borders.
2. Distance functions as activations, for example Gaussian functions $G(X)=\exp(-\|X-C\|^2/b^2)$, giving spherical borders.
3. Mixed activation functions, combining scalar product and distance terms.
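
A compact sketch of the three activation types (my illustration; the names W, C and the mixing weights alpha, beta are assumptions, not the authors' notation):

```python
# Three activation types for a transfer function f(I(X)) (illustrative sketch).
import numpy as np

def fan_in(X, W, theta=0.0):
    # 1. Scalar product activation: I(X) = W . X + theta (hyperplane border).
    return X @ W + theta

def distance(X, C):
    # 2. Distance activation: I(X) = ||X - C|| (spherical border when
    #    combined with a Gaussian output function).
    return np.linalg.norm(X - C, axis=-1)

def mixed(X, W, C, alpha=0.5, beta=0.5):
    # 3. Mixed activation: weighted combination of both terms.
    return alpha * fan_in(X, W) + beta * distance(X, C)

def gaussian_output(I, b=1.0):
    return np.exp(-(I / b) ** 2)

X = np.array([[0.5, -1.0], [2.0, 1.0]])
W = np.array([1.0, 1.0]); C = np.array([0.0, 0.0])
print(gaussian_output(distance(X, C)))   # Gaussian transfer function
print(1 / (1 + np.exp(-fan_in(X, W))))   # logistic transfer function
```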

Taxonomy of activation functions.

Taxonomy of output functions.

Taxonomy of transfer functions.

TF in neural networks. Choices:
1. Homogeneous NN: select the best TF, trying several types. Ex: RBF networks; SVM kernels (today 50=>80% change).
2. Heterogeneous NN: one network, several types of TF. Ex: Adaptive Subspace SOM (Kohonen 1995), linear subspaces; projections on a space of basis functions.
3. Input enhancement: adding features $f_i(X)$ to achieve separability. Ex: functional link networks (Pao 1989), tensor products of inputs; the D-MLP model. (An input-enhancement sketch follows this list.)
Three ways to build heterogeneous networks:
1. Start from a large network with different TFs and use regularization to prune it.
2. Construct the network by adding nodes selected from a pool of candidates.
3. Use very flexible TFs and force them to specialize.
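
A minimal input-enhancement sketch (my example; the choice of feature is an assumption): adding the tensor-product feature $x_1 x_2$ makes XOR, which no single hyperplane separates in the original inputs, linearly separable.

```python
# Functional-link style input enhancement (illustrative sketch): XOR becomes
# linearly separable after appending the product feature x1*x2.
import numpy as np

X = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])                       # XOR labels

X_enh = np.column_stack([X, X[:, 0] * X[:, 1]])  # append f(X) = x1 * x2

# In the enhanced space the hyperplane x1*x2 = 0 separates the classes:
w = np.array([0.0, 0.0, -1.0])
pred = (X_enh @ w > 0).astype(int)
print(pred, "matches labels:", np.array_equal(pred, y))
```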

Most flexible TFs.
Conical functions: mixed (scalar product plus distance) activations.
Lorentzian functions: mixed activations.
Bicentral functions: separable, built from products of pairs of sigmoids (a sketch follows).
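
A sketch of a bicentral node (my implementation of the standard product-of-sigmoid-pairs form; the parameter names t, b, s are assumptions): each factor opens a soft window $[t_i-b_i,\,t_i+b_i]$, so the product is a soft rectangular box in N dimensions.

```python
# Bicentral transfer function (illustrative sketch): a separable product of
# sigmoid pairs, giving a soft box-shaped decision border.
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def bicentral(X, t, b, s):
    # Per dimension i: sigma(s_i*(x_i - t_i + b_i)) rises at t_i - b_i and
    # (1 - sigma(s_i*(x_i - t_i - b_i))) falls at t_i + b_i; their product
    # is ~1 inside the window [t_i - b_i, t_i + b_i] and ~0 outside.
    left = sigmoid(s * (X - t + b))
    right = 1.0 - sigmoid(s * (X - t - b))
    return np.prod(left * right, axis=-1)

t = np.array([0.0, 0.0])    # box center
b = np.array([1.0, 1.0])    # box half-widths
s = np.array([10.0, 10.0])  # slopes (sharpness of the box walls)
print(bicentral(np.array([0.0, 0.0]), t, b, s))  # inside  -> ~1
print(bicentral(np.array([3.0, 0.0]), t, b, s))  # outside -> ~0
```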

Bicentral functions + rotations: 6N parameters, the most general case. The decision region is a box in N-1 dimensions times a rotated window. A rotation matrix with band structure implements the 2x2 rotations cheaply (a sketch follows).
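
A sketch of such a banded rotation (my construction: a product of 2x2 rotations acting on consecutive coordinate pairs; the angles are illustrative). This needs only N-1 angles instead of the $O(N^2)$ parameters of a full rotation matrix.

```python
# Banded rotation matrix built from 2x2 rotations of consecutive coordinate
# pairs (illustrative sketch).
import numpy as np

def banded_rotation(angles):
    N = len(angles) + 1
    R = np.eye(N)
    for i, a in enumerate(angles):
        G = np.eye(N)                  # 2x2 rotation in the (i, i+1) plane
        G[i, i] = G[i + 1, i + 1] = np.cos(a)
        G[i, i + 1] = -np.sin(a)
        G[i + 1, i] = np.sin(a)
        R = G @ R
    return R

R = banded_rotation([0.3, -0.2, 0.5])    # N = 4, three rotation angles
print(np.allclose(R @ R.T, np.eye(4)))   # True: R is orthogonal
```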

Some properties of TFs. For logistic functions $\sigma(x)=1/(1+e^{-x})$, renormalization of a pair of Gaussians gives the logistic function:
$$\frac{G(X;C+D,b)}{G(X;C+D,b)+G(X;C-D,b)} = \sigma(W\cdot X+\theta), \qquad W_i = \frac{4D_i}{b_i^2},\quad \theta=-\sum_i W_i C_i,$$
where $G(X;C,b)=\exp\!\big(-\sum_i (x_i-C_i)^2/b_i^2\big)$.
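
A quick numerical check of this identity (my sketch; the centers, offsets, and dispersions are arbitrary test values):

```python
# Numerical check: renormalized Gaussians equal a logistic function with
# weights W_i = 4*D_i / b_i**2 (illustrative verification).
import numpy as np

rng = np.random.default_rng(1)
C = rng.normal(size=3); D = rng.normal(size=3); b = rng.uniform(0.5, 2.0, 3)

def G(X, center):
    return np.exp(-np.sum((X - center) ** 2 / b ** 2, axis=-1))

W = 4 * D / b ** 2
theta = -W @ C

X = rng.normal(size=(5, 3))
lhs = G(X, C + D) / (G(X, C + D) + G(X, C - D))
rhs = 1.0 / (1.0 + np.exp(-(X @ W + theta)))
print(np.allclose(lhs, rhs))  # True
```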

Example of input transformation. Minkowski's distance function: $D(X,Y)=\big(\sum_i |x_i-y_i|^\alpha\big)^{1/\alpha}$. The sigmoidal activation is changed from a scalar product to a distance-based form: for vectors of constant norm, $W\cdot X = \tfrac12\big(\|W\|^2+\|X\|^2-D(X,W)^2\big)$ with Euclidean $D$, so $\sigma(W\cdot X+\theta)$ becomes a function of $D(X,W)$, and the Euclidean distance may then be replaced by Minkowski's. Adding a single input renormalizing the vector keeps the norm constant: $x_{N+1}=\sqrt{R^2-\|X\|^2}$ (a sketch follows).
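
A sketch of this transformation (my implementation under the assumptions above; R is simply chosen to dominate all input norms):

```python
# D-MLP style input transformation (illustrative sketch): renormalize inputs
# with one extra component, then use a distance-based sigmoidal node.
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def renormalize(X, R):
    # Append x_{N+1} = sqrt(R^2 - ||X||^2) so every vector has norm R.
    extra = np.sqrt(R ** 2 - np.sum(X ** 2, axis=-1, keepdims=True))
    return np.concatenate([X, extra], axis=-1)

def minkowski(X, W, alpha=2.0):
    return np.sum(np.abs(X - W) ** alpha, axis=-1) ** (1.0 / alpha)

X = np.array([[0.5, -1.0], [1.0, 2.0]])
R = 1.1 * np.max(np.linalg.norm(X, axis=1))
Xr = renormalize(X, R)
print(np.linalg.norm(Xr, axis=1))        # all norms equal to R

W = renormalize(np.array([[0.2, 0.3]]), R)[0]   # weight vector, also norm R
theta = 0.1
# For alpha = 2 this equals sigmoid(W.X + theta), since ||W|| = ||Xr|| = R:
out = sigmoid(0.5 * (2 * R ** 2 - minkowski(Xr, W) ** 2) + theta)
print(out)
```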

Conclusions. Radial and sigmoidal functions are not the only choice; the StatLog report shows large differences between RBF and MLP results on many datasets. Better learning cannot repair the wrong bias of a model. A systematic investigation and taxonomy of TFs is worthwhile, and networks should select and optimize their own functions. Open questions: What is the optimal balance between complex nodes and complex interactions (weights)? How should heterogeneous networks be trained? How should nodes be optimized in constructive algorithms? Hierarchical, modular networks: nodes that are networks themselves.

The End? Perhaps the beginning...