Anomaly Detection for the IDEAS Mapping System: A Machine Learning Approach
David Meyer
Brocade Chief Scientist, VP and Fellow
Senior Research Scientist, Computer Science, University of Oregon
dmm@{brocade.com,uoregon.edu,1-4-5.net,..}
IDEAS Kickoff Meeting, IETF 97, 13-18 Nov 2016, Seoul, Republic of Korea

Agenda
- Potential Use Cases
- What Can We Do Today, And What Are The Challenges?
- Discussion
Technical explanations/code:
https://github.com/davidmeyer/ml
http://www.1-4-5.net/~dmm/ml

First, What Does The Scientific Method Look Like When Applied To Machine Learning?
- Use case / problem definition: use cases collected through research, customer(s), and others
- Hypothesis: e.g., "autoencoders can detect anomalies in various data sources"
- Design the ML experiment: prototype environment, variable definitions, experimental controls, significance, etc.
- Execute the ML experiment: execute, measure, and update the hypothesis if necessary
This loop needs to be fast: scripting, HPC, GPUs, ... This is where automation meets ML.
"If you want to increase your success rate, double your failure rate." Thomas Watson Sr. (founder of IBM)
Original slide courtesy Armin Wasicek.

Prototype Use Case
We want to use Machine Learning (ML) to detect anomalous traffic flows that could be malicious, e.g., a DDoS against the mapping infrastructure. We want to protect the Map Servers and Map Resolvers (and any other system components that could benefit): Map-Registers, Map-Requests, Map-Replies.
How might we do this?
- Collect KPIs from the Map {Server,Resolver} and surrounding infrastructure
- ETL the data (see the sketch after this slide)
- Use ML to detect anomalies
- Remediate
- Iterate
Proposal (Occam's Razor: do the simplest thing first): use a simple autoencoder to do binary classification.
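
As a concrete sketch of the "collect KPIs, then ETL" steps above: the KPI names below are invented for illustration, not taken from the actual mapping system, and min-max scaling is just one reasonable normalization choice.

```python
import numpy as np

# Hypothetical KPIs per time window; the names are illustrative only.
KPI_KEYS = ["map_register_rate", "map_request_rate",
            "map_reply_rate", "bandwidth_bps"]

def etl(samples):
    """Turn raw per-window KPI dicts into a normalized feature matrix."""
    X = np.array([[s[k] for k in KPI_KEYS] for s in samples],
                 dtype=np.float32)
    # Min-max scale each feature to [0, 1] so the autoencoder sees
    # comparable ranges across counters and bandwidth.
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / np.maximum(hi - lo, 1e-9)
```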

Agenda
- What is the Use Case?
- What Can We Do Today, And What Are The Challenges?
- Discussion
Technical explanations/code:
https://github.com/davidmeyer/ml
http://www.1-4-5.net/~dmm/ml

Briefly, What is Anomaly Detection?
What's going on here? The data don't fit an explanation model: either the data are impossible (assuming the model is correct), or the data do not conform to some normal behavior.
Typical techniques: clustering (k-means, k-NN, ...) and dimension reduction (PCA, autoencoders, ...).
More formally:
1. A = { x | s(x) > t } (the anomaly set: the points whose anomaly score s exceeds a threshold t)
2. f(A) → e (map the anomaly set to an explanation e, while minimizing false positives)
We assume that the anomalous data are generated by a different process than our baseline (stationary) distribution.
Original slide courtesy Armin Wasicek.
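
A minimal rendering of the notation above: the anomaly set A is just the set of points whose score exceeds the threshold. The toy distance-from-mean score is an assumption for the example, not a recommendation.

```python
import numpy as np

def anomaly_set(X, s, t):
    """Return A = {x in X | s(x) > t} for anomaly score s and threshold t."""
    scores = np.array([s(x) for x in X])
    return X[scores > t]

# Toy example: score = distance from the sample mean.
X = np.random.default_rng(0).normal(size=(1000, 4))
mu = X.mean(axis=0)
A = anomaly_set(X, lambda x: np.linalg.norm(x - mu), t=3.0)
```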

Autoencoder Example: Detecting Anomalies in Handwritten Digits
MNIST: 28x28 grayscale handwritten digit dataset
- Training set: 55,000 images
- Test set: 10,000 images
- Validation set: 5,000 images
The features here are pixels (28x28 = 784). I'm using MNIST because you can easily visualize what is going on. We can analyze mapping system data in the same way: instead of the input being vectors of {0,1}, the features will be vectors of counters of map control messages, bandwidth, and other host-based KPIs.

One Way To Learn To Recognize MNIST: Use An Autoencoder
An autoencoder is a special kind of neural network: here, 784 inputs, a hidden layer of 100 artificial neurons, and 784 outputs.
Key characteristic: the hidden layer has fewer units than the input/output layers, which forces compression.
Goal: minimize the reconstruction (decode) error. How do we define the error (loss, cost)?
Binary classification: threshold the reconstruction error to label an input normal/abnormal.
This is unsupervised learning.
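
To make the architecture concrete, here is a minimal NumPy sketch of the 784-100-784 forward pass with squared error as the reconstruction loss; the random weights and the sigmoid/MSE choices are assumptions for illustration (the deck's actual TensorFlow code is linked a few slides later).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1, b1 = rng.normal(0, 0.01, (100, 784)), np.zeros(100)  # encoder: 784 -> 100
W2, b2 = rng.normal(0, 0.01, (784, 100)), np.zeros(784)  # decoder: 100 -> 784

def reconstruct(x):
    h = sigmoid(W1 @ x + b1)      # the compressed 100-unit code
    return sigmoid(W2 @ h + b2)   # the 784-unit reconstruction

def reconstruction_error(x):
    return np.mean((x - reconstruct(x)) ** 2)  # squared-error loss
```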

But First: What Does A Single Neuron Do?
A single neuron computes a dot product of its 784 inputs with its weights, followed by a nonlinearity; each of the 100 hidden neurons does this.
BTW, how many parameters? 784x100 + 100x784 = 156,800 weights (biases would add 100 + 784 more).
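
In code, a single neuron is just that dot product plus a bias, pushed through a nonlinearity (sigmoid is assumed here for the sketch):

```python
import numpy as np

def neuron(x, w, b):
    """One artificial neuron: sigmoid(w . x + b)."""
    return 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))

# The slide's weight count for the 784-100-784 autoencoder:
assert 784 * 100 + 100 * 784 == 156800
```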

All Cool, But How Does The Autoencoder Actually Work?
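
In brief: training nudges the weights so the reconstruction moves toward the input. Continuing the NumPy sketch above, one gradient-descent step on the squared-error loss looks roughly like this (plain backpropagation; a sketch, not the deck's code):

```python
def train_step(x, lr=0.1):
    """One SGD step on error = mean((x - reconstruct(x))**2)."""
    global W1, b1, W2, b2
    h = sigmoid(W1 @ x + b1)
    y = sigmoid(W2 @ h + b2)
    # Backpropagate the error through the decoder...
    dz2 = 2.0 * (y - x) / x.size * y * (1 - y)
    dW2, db2 = np.outer(dz2, h), dz2
    # ...and then through the encoder.
    dz1 = (W2.T @ dz2) * h * (1 - h)
    dW1, db1 = np.outer(dz1, x), dz1
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1
```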

How Hard To Code This Up In TensorFlow? https://github.com/davidmeyer/ml/blob/master/tensorflow/autoencoder.{py,ipynb}
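
The deck's actual implementation lives at the URL above. As a rough standalone approximation, a modern Keras version fits in a dozen lines (note that Keras's MNIST loader uses a 60k/10k split rather than the 55k/5k/10k split on the earlier slide):

```python
import tensorflow as tf

(x_train, _), (x_test, _) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

autoencoder = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(100, activation="sigmoid"),  # encoder: 784 -> 100
    tf.keras.layers.Dense(784, activation="sigmoid"),  # decoder: 100 -> 784
])
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")
autoencoder.fit(x_train, x_train, epochs=10, batch_size=256,
                validation_data=(x_test, x_test))
```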

Autoencoder Output
[Figure: reconstructions after training on 1, 10, 100, and 1000 examples.]
After training, the AE gets low reconstruction error on digits from MNIST and high reconstruction error on everything else: it has learned to recognize MNIST.
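
Continuing the Keras sketch, the binary normal/abnormal decision is a threshold on per-example reconstruction error; the 99th-percentile cutoff below is an illustrative assumption, not the deck's choice:

```python
import numpy as np

recon = autoencoder.predict(x_test)
errors = np.mean((x_test - recon) ** 2, axis=1)  # per-example reconstruction error
threshold = np.percentile(errors, 99)            # assume the top 1% is anomalous
is_anomaly = errors > threshold                  # True = "does not look like MNIST"
```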

BTW, How Much Of This Has Been Applied To Networking? Not too much, for many reasons:
- Still early days
- Diverse types of network data (flows, logs, various KPIs, ...) with no obvious way to combine them
- Incomplete data sets; non-iid data
- Network data was not designed for ML
- Different models for different data types; still an active area of investigation (Occam's Razor)
- Is there a useful "Theory of Networking"? Consider the problem of object recognition/conv nets and transfer learning
- Community challenges: skill sets, proprietary data sets and use cases, ...
- Concern about the probabilistic nature of ML

Next Steps
Build a prototype:
- Dino: provide raw data from the LISP mapping system
- dmm: process the data into an appropriate form
- Use ML to detect anomalies; classify anomaly types
- Remediation (frequency-hopping idea)
- Iterate
Then get feedback from the group, and iterate (again).

Agenda
- What is the Use Case?
- What Can We Do Today, And What Are The Challenges?
- Discussion
Technical explanations/code:
https://github.com/davidmeyer/ml
http://www.1-4-5.net/~dmm/ml

Q&A
Thank you
(have more questions/comments? dmm@1-4-5.net)