Internet Traffic Classification Using Bayesian Analysis Techniques

Presentation transcript:

Internet Traffic Classification Using Bayesian Analysis Techniques
Presentation by Umamaheswararao K

Overview
- Statistical method using supervised machine learning
- Uses only flow records
- Based on discriminators of the flows: port, inter-packet gap, etc.
- Applies naïve Bayesian techniques
- Reasonably high accuracy

Machine-Learned Classification
- Deterministic approach: assigns each data point to exactly one of a set of mutually exclusive classes
- Probabilistic approach: assigns each flow a probability of belonging to each class; the present technique falls into this category (see the sketch below)
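To make the distinction concrete, here is a tiny illustrative snippet; the class names and probabilities are invented, not taken from the paper.

```python
# Hypothetical hard vs. soft class assignment for a single flow.
# The class names and probabilities below are made up for illustration.
hard_label = "WWW"                                      # deterministic: exactly one class
soft_labels = {"WWW": 0.82, "MAIL": 0.11, "P2P": 0.07}  # probabilistic: probability per class

# A soft assignment can always be collapsed to a hard one when needed.
best_guess = max(soft_labels, key=soft_labels.get)
print(hard_label, best_guess)
```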

Probabilistic Approach
- Can identify flows with similar characteristics after their probabilistic class assignment
- Robust to measurement error
- Provides a mechanism for quantifying class-assignment probabilities
- Available in many implementations

Terminology
- Objects: the entities to be classified; here, traffic flows, each identified by the tuple (source IP, destination IP, protocol, source port, destination port)
- Discriminators: characteristics parameterizing flow behaviour, e.g. flow duration, TCP port
- Only complete TCP connections are considered here
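A minimal sketch of these two terms in code; the names FlowKey/FlowRecord and the example discriminator names are our own choices, not the paper's.

```python
# Minimal sketch of an "object" (a traffic flow keyed by its 5-tuple) and its discriminators.
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass(frozen=True)
class FlowKey:
    src_ip: str
    dst_ip: str
    protocol: str   # e.g. "TCP" -- only complete TCP connections are used here
    src_port: int
    dst_port: int

@dataclass
class FlowRecord:
    key: FlowKey
    discriminators: Dict[str, float] = field(default_factory=dict)  # e.g. {"duration_s": 12.4}
    label: Optional[str] = None  # application class, known only for training flows
```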

Discriminators/Categories

Analysis Tools
- Naïve Bayesian classifier
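The classifier itself is standard naïve Bayes: for a flow with discriminators x, the posterior of class c is proportional to p(c)·p(x|c), and the flow is assigned the resulting per-class probabilities. A minimal sketch using scikit-learn's GaussianNB follows; the feature values, labels, and two-discriminator layout are illustrative, and the paper's own implementation and discriminator set are much richer.

```python
# Minimal naive Bayes sketch with scikit-learn's GaussianNB; all data is illustrative.
import numpy as np
from sklearn.naive_bayes import GaussianNB

# X: one row per training flow, one column per discriminator (duration, mean packet size).
X_train = np.array([[12.4, 512.0],
                    [30.1, 760.0],
                    [ 0.3,  80.0],
                    [ 0.6, 120.0]])
y_train = np.array(["WWW", "WWW", "MAIL", "MAIL"])

model = GaussianNB().fit(X_train, y_train)

X_new = np.array([[10.0, 600.0]])
print(model.predict(X_new))        # hard assignment: most probable class
print(model.predict_proba(X_new))  # soft assignment: probability per class
```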

Bayes Technique (contd.)
Assumptions:
- Discriminators are independent, although in reality e.g. TCP header length is proportional to packet length (or vice versa)
- Each discriminator's distribution is normal (Gaussian), although real distributions can be multimodal
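Both assumptions show up directly when the posterior is written out by hand. In the sketch below (parameters, priors, and feature values are invented for illustration), each discriminator gets a single per-class Gaussian, and independence lets the joint likelihood be a product of per-discriminator densities.

```python
# Hand-rolled posterior under the two assumptions; all numbers are illustrative.
import numpy as np
from scipy.stats import norm

# Per-class (mean, std) for each discriminator, fitted from training flows.
params = {
    "WWW":  [(8.0, 3.0), (550.0, 120.0)],   # (duration_s, mean_packet_size)
    "MAIL": [(1.0, 0.5), (120.0,  60.0)],
}
priors = {"WWW": 0.7, "MAIL": 0.3}

x = [10.0, 600.0]  # discriminators of one unseen flow

# Independence: the joint likelihood is the product of per-discriminator Gaussian densities.
scores = {c: priors[c] * np.prod([norm.pdf(xi, mu, sd)
                                  for xi, (mu, sd) in zip(x, params[c])])
          for c in params}
total = sum(scores.values())
print({c: s / total for c, s in scores.items()})  # posterior class probabilities
```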

Example

Naïve Bayes: Kernel Estimation
- Handles the case where the discriminator distribution is not Gaussian
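A sketch of the kernel-estimation variant: instead of fitting one Gaussian per class and discriminator, each class-conditional density is estimated with a kernel density estimator, which copes with multimodal distributions. Here scipy's gaussian_kde stands in for the paper's estimator, and the duration values are made up.

```python
# Kernel-density sketch for ONE discriminator (flow duration); values are illustrative.
import numpy as np
from scipy.stats import gaussian_kde

durations = {
    "WWW":  np.array([0.5, 1.2, 2.0, 8.0, 9.5, 30.0]),
    "MAIL": np.array([0.2, 0.3, 0.4, 0.6, 1.0,  1.1]),
}

# One KDE per class replaces the single per-class Gaussian.
kdes = {c: gaussian_kde(v) for c, v in durations.items()}

x = 1.1
print({c: kde(x)[0] for c, kde in kdes.items()})  # class-conditional densities at x
```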

Naïve Bayes vs Kernel

Discriminator Selection
- Remove irrelevant discriminators: those that cannot differentiate the classes, i.e. have the same distribution for all classes
- Remove redundant discriminators: those highly correlated with another discriminator

Discriminator Reduction
- Filter: uses characteristics of the training data to judge how relevant a discriminator is to the class, e.g. the degree of correlation between discriminator and class
- Wrapper: uses the results of a classifier to build the optimal discriminator set (a rough sketch follows)
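A rough sketch of the wrapper idea, assuming greedy forward selection scored by the cross-validated accuracy of the naïve Bayes classifier itself; the paper does not prescribe this exact search.

```python
# Wrapper-style discriminator reduction: greedy forward selection driven by the
# cross-validated accuracy of the classifier. The search strategy is an illustrative choice.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

def wrapper_forward_selection(X, y, cv=3):
    """X: flows x discriminators (numpy array), y: class labels."""
    remaining = list(range(X.shape[1]))
    selected, best_score = [], 0.0
    improved = True
    while improved and remaining:
        improved = False
        for f in remaining:
            # Score the candidate set with the wrapped classifier.
            score = cross_val_score(GaussianNB(), X[:, selected + [f]], y, cv=cv).mean()
            if score > best_score:
                best_score, best_f, improved = score, f, True
        if improved:
            selected.append(best_f)
            remaining.remove(best_f)
    return selected, best_score
```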

FCBF
- Fast Correlation-Based Filter for discriminator filtering
- Two-stage process:
  - identify the relevance of each discriminator to the class
  - identify the redundancy of each discriminator with respect to the other discriminators
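A compact sketch of the two FCBF stages using symmetrical uncertainty, SU(A, B) = 2·I(A; B) / (H(A) + H(B)), computed on discretized discriminators; the threshold value and the assumption of pre-discretized inputs are ours, not the paper's.

```python
# FCBF sketch: stage 1 ranks discriminators by SU with the class, stage 2 drops
# a discriminator if a better-ranked one predicts it at least as well as the class.
import numpy as np
from sklearn.metrics import mutual_info_score

def entropy(a):
    _, counts = np.unique(a, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))

def symmetrical_uncertainty(a, b):
    h = entropy(a) + entropy(b)
    return 0.0 if h == 0 else 2.0 * mutual_info_score(a, b) / h

def fcbf(X_disc, y, threshold=0.01):
    """X_disc: discretized discriminators (one column each); y: class labels."""
    # Stage 1: relevance -- rank features by SU with the class, drop the weak ones.
    su_class = [symmetrical_uncertainty(X_disc[:, j], y) for j in range(X_disc.shape[1])]
    ranked = sorted([j for j, s in enumerate(su_class) if s >= threshold],
                    key=lambda j: su_class[j], reverse=True)
    # Stage 2: redundancy -- keep a feature only if no already-selected feature
    # is more strongly correlated with it than it is with the class.
    selected = []
    for j in ranked:
        if all(symmetrical_uncertainty(X_disc[:, i], X_disc[:, j]) < su_class[j]
               for i in selected):
            selected.append(j)
    return selected
```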

Results

Results (contd.)
- Accuracy: correctly classified flows / total number of flows
- Trust: the probability that a flow classified into some class does in fact belong to that class
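These two metrics amount to overall accuracy and per-class precision. A small sketch, assuming plain lists of true and predicted labels (the label names and values are illustrative):

```python
# Accuracy and per-class trust from true vs. predicted labels; data is illustrative.
from collections import Counter

y_true = ["WWW", "WWW", "MAIL", "P2P", "WWW", "MAIL"]
y_pred = ["WWW", "MAIL", "MAIL", "WWW", "WWW", "MAIL"]

# Accuracy: correctly classified flows / total flows.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Trust for class c: of the flows classified as c, the fraction that really are c.
predicted_as = Counter(y_pred)
correct_as = Counter(t for t, p in zip(y_true, y_pred) if t == p)
trust = {c: correct_as[c] / n for c, n in predicted_as.items()}

print(accuracy, trust)
```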

Naïve Bayes: Trust

Trust: Kernel Estimation

Results for new data set

Identification of discriminators

Strengths
- No payload access needed
- High accuracy and trust with FCBF
- Easily implementable
- Single-flow based (both a strength and a weakness)
- Allows any categorization

Weaknesses
- A bunch of them, but then…?
- Accuracy and trust depend mainly on how good the training set is
- Trust for some classes is really poor
- Works on a per-flow basis, while characterizing some traffic (e.g. attacks) requires looking at many flows
- Temporal stability is not very good
- Discriminators depend on network dynamics

Weaknesses (contd.)
- Training is not automatic
- Assumes discriminator independence
- The Gaussian distribution assumption is inaccurate

Future Work
- A significantly new approach, hence it can lead to many ideas
- Spatial independence of traffic classification
- See the weaknesses section for further directions