Investigating the role of individual neurons as outlier detectors

Investigating the role of individual neurons as outlier detectors
Carlos López-Vázquez
Laboratorio LatinGEO, SGM+Universidad ORT del Uruguay
September 15th, 2015
carloslopez@uni.ort.edu.uy

Agenda
- Motivation for the outlier detection stage
- ANN as a regression tool
- Formulation of the rule
- Case 1 of application: small dataset
- Case 2 of application: large dataset

Why worry about outliers?
- Outliers are unusual (in some sense) events
- They might adversely affect further calculations, OR they might be the most valuable result!
- An ANN usually produces an output given an input. Always! What about the consequences?
- We might want to detect spurious inputs

Example #1: Medical
Given some inputs, detect/classify a possible coronary disease.
From Lucila Ohno-Machado, Decision Systems Group, Brigham and Women's Hospital, Department of Radiology, 2004.

Myocardial Infarction Network
[Network diagram: inputs are pain duration, pain intensity, ECG: ST elevation, smoker, age, male; single output y.]
The answer is just a number, y = "probability" of MI (e.g. 0.8). No room for I DON'T KNOW!

Example #2: Autonomous Land Vehicle (ALVINN)
The NN learns to steer an autonomous vehicle: 960 input units, 4 hidden units, 30 output units, driving at speeds up to 70 miles per hour.
[Figures: image from a forward-mounted camera; weight values for one of the hidden units.]
From bi.snu.ac.kr/Courses/g-video12s/files/NN_suppl.ppt, Biointelligence Laboratory, Department of Computer Engineering, Seoul National University.

Coming soon?

Goal
Identify unlikely incoming events, and thus (maybe) refuse to estimate outputs!
Supplement the ANN answer (numerical, categorical) with some credibility flag.
How?
- Showing unlikely events during training (supervised)
- Relying on an already trained ANN (unsupervised)

Multi Layer Perceptron (MLP)
[Network diagram: inputs X1..X5, hidden units v1..v3, output y, connected by adjustable weights.]
Example output weightings:
y = 18.4*v1 - 22.1*v2 + 10.2*v3
y = 10.4*v1 + 5.12*v2 + 8.9*v3
y = 20.2*v1 + 0.18*v2 - 9.1*v3
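A network of this shape fits in a few lines. Below is a minimal sketch in Python/NumPy; the input-to-hidden weights and the tanh activation are hypothetical stand-ins, since the slide only gives the hidden-to-output weights.

    import numpy as np

    def mlp_forward(x, W_in, w_out):
        """One-hidden-layer MLP: x (5,) -> hidden v (3,) -> scalar y."""
        v = np.tanh(W_in @ x)      # hidden activations v1..v3 (tanh assumed)
        return w_out @ v           # y = weighted sum of hidden activations

    rng = np.random.default_rng(0)
    W_in = rng.normal(size=(3, 5))           # hypothetical input-to-hidden weights
    w_out = np.array([18.4, -22.1, 10.2])    # first weight set from the slide
    y = mlp_forward(rng.normal(size=5), W_in, w_out)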

Why are the weights so different?
Conjecture: it might denote a specific role for the neuron, and such a role can be connected to outliers. Wow!
Which ones are candidates? Large weights? Small weights?
Preliminary analysis suggested that large weights → outlier detectors. But... convince me!
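Mechanically, the candidate neurons can be ranked by the magnitude of their outgoing weight; a short sketch (the slide's rule only says "large weights", so the ranking criterion is the obvious reading, not a quoted formula):

    import numpy as np

    w_out = np.array([18.4, -22.1, 10.2])     # hidden-to-output weights from the slide
    order = np.argsort(-np.abs(w_out))        # units sorted by |weight|, largest first
    print("candidate outlier-detector units:", order + 1)   # v2 (-22.1) comes first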

Two different problems
1) Does the rule indeed work? If so:
2) How does it perform when compared with other outlier detection procedures?

Example #3: Iris Flower Classification
- 3 species of iris: SETOSA, VERSICOLOR, VIRGINICA
- Each flower has parts called sepal and petal
- The length and width of the sepal and petal can be used to determine the iris type
- Data were collected on a large number of iris flowers
- For example, in one flower petal length = 6.7 mm and width = 4.3 mm, and sepal length = 22.4 mm and sepal width = 62.4 mm; the iris type was SETOSA
- An ANN can be trained to determine the species of an iris for a given set of petal and sepal widths and lengths

Iris training and testing data

Sepal Length  Sepal Width  Petal Length  Petal Width  Iris Class
0.224         0.624        0.067         0.043        Setosa
0.749         0.502        0.627         0.541        Versicolor
…             …            …             …            Virginica
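For concreteness, a sketch of this setup with scikit-learn, normalizing the inputs to [0, 1] as in the table; the 3-neuron hidden layer and the 70/30 split are assumptions, not the talk's exact configuration:

    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier
    from sklearn.preprocessing import MinMaxScaler

    X, y = load_iris(return_X_y=True)
    X = MinMaxScaler().fit_transform(X)      # scale inputs to [0, 1] as in the table
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

    clf = MLPClassifier(hidden_layer_sizes=(3,), max_iter=5000, random_state=0)
    clf.fit(X_tr, y_tr)
    print("test accuracy:", clf.score(X_te, y_te))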

Classification using regression
A somewhat unusual paper: it used regression with a single output instead of the common three binary outputs!
Also unusual: the internal weights of the ANN were published!
[Network diagram: inputs petal width, petal length, sepal width, sepal length; hidden units v1, v2, v3; output y.]
From Benítez et al., 1997.
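One way to realize "regression with a single output" is to map the three classes to the numeric targets 0, 1, 2 and round the prediction back to a class; this encoding is an assumption for illustration, since Benítez et al.'s exact targets are not reproduced here:

    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.neural_network import MLPRegressor
    from sklearn.preprocessing import MinMaxScaler

    X, y = load_iris(return_X_y=True)        # y in {0, 1, 2}
    X = MinMaxScaler().fit_transform(X)

    reg = MLPRegressor(hidden_layer_sizes=(3,), max_iter=5000, random_state=0)
    reg.fit(X, y.astype(float))              # one numeric output instead of 3 binary ones

    pred = np.clip(np.rint(reg.predict(X)), 0, 2).astype(int)   # round back to a class
    print("training accuracy:", (pred == y).mean())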

Classification using regression
[Same network diagram, from Benítez et al., 1997.]
The ANN can be simplified...

Pruned ANN
[Diagram: the pruned network, with inputs sepal length, sepal width, petal length, petal width.]
...and the classification is still good, despite not being exact. Which role did the other two hidden units have?

Modified version
z = "credibility flag"; y = 2.143*v3
All misclassifications are now announced by z = 1!
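In code, the modified network's two outputs might be combined as below; the slide gives y = 2.143*v3 and the meaning of z, but the threshold turning the flag unit's activation into a 0/1 flag is an assumption:

    def classify_with_flag(v3, z_act, z_threshold=0.5):
        """y: class score from the pruned net; z: credibility flag.

        v3          -- activation of the surviving hidden unit
        z_act       -- activation of the large-weight "detector" unit
        z_threshold -- assumed cutoff turning z_act into a 0/1 flag
        """
        y = 2.143 * v3
        z = 1 if z_act > z_threshold else 0   # z = 1 announces a suspect input
        return y, z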

Example #4: daily rain dataset
- Weather records typically have missing values
- Many applications require complete databases
- Well-established linear methods for interpolating spatial observations exist
- Their performance is poor for daily rain records
- Why not ANN?

Data and test area description
- 30 years of daily records for 10 stations were available
- 30% of the events have missing values
- More than 80% of the readings are of zero rain, evenly distributed over the year
- Annual averages range from 500 to 1600 mm/year; time correlation is low

Non-linear interpolants: ANN
- We used ANNs as interpolators, with 9 inputs and 1 output
- Training was performed on one third of the dataset using backpropagation, minimizing the RMSE
- Several architectures were considered (one and two hidden layers, different numbers of neurons, etc.), as well as some transformations of the data
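A sketch of such an interpolator with scikit-learn; the synthetic gamma-distributed "rain" and the single 10-neuron hidden layer are stand-ins for the real records and the architectures actually tried:

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    rng = np.random.default_rng(0)
    rain = rng.gamma(shape=0.3, scale=5.0, size=(10_000, 10))  # stand-in daily records

    X, y = rain[:, 1:], rain[:, 0]       # predict station 0 from the other 9
    n_train = len(X) // 3                # one third of the dataset for training

    net = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
    net.fit(X[:n_train], y[:n_train])    # backpropagation on a squared-error loss

    rmse = np.sqrt(np.mean((net.predict(X[n_train:]) - y[n_train:]) ** 2))
    print("test RMSE:", rmse)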

Skipping other details…
- We applied our rule to each of the 10 ANNs
- We ran a Monte Carlo experiment, seeding known outliers at random and locating them afterwards
- Thorough comparison against state-of-the-art alternatives (details in the paper)
- The ANN-based outlier detection tool performed very well: best when outlier size (Mozilla effect) was ignored, satisfactory otherwise
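The seeding step of such a Monte Carlo experiment could look like the sketch below; the corruption fraction and magnitude are illustrative choices, not the paper's, and the detector itself would be the weight-based rule above:

    import numpy as np

    def seed_outliers(values, frac=0.01, size=50.0, rng=None):
        """Corrupt a random fraction of values by +size; return data and ground truth."""
        if rng is None:
            rng = np.random.default_rng()
        corrupted = values.copy()
        idx = rng.choice(len(values), int(frac * len(values)), replace=False)
        corrupted[idx] += size
        truth = np.zeros(len(values), dtype=bool)
        truth[idx] = True
        return corrupted, truth

    def detection_rate(flagged, truth):
        """Fraction of the seeded outliers that the detector flagged."""
        return (flagged & truth).sum() / truth.sum()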

Pros…
- The training stage is as usual; no special routine is required
- We inspect the internal weights; no retraining is needed
- Unsupervised classification: outliers are not declared as such in advance
- Might offer an objective criterion to suspect underfitting

Cons…
- Weights might be sensitive to outliers (masking effect), which in turn might prevent detecting them
- Which outliers are located? Only some suitable ones?

Questions?
Carlos López-Vázquez
Laboratorio LatinGEO, SGM+Universidad ORT del Uruguay
September 15th, 2015
carloslopez@uni.ort.edu.uy