Vipin Kumar, AHPCRC, University of Minnesota Data Mining for Network Intrusion Detection: Experience with KDDCup’99 Data set Vipin Kumar, AHPCRC, University of Minnesota Group members: L. Ertoz, M. Joshi, A. Lazarevic, H. Ramnani, P. Tan, J. Srivastava

Introduction
Key challenge: maintain a high detection rate while keeping the false alarm rate low.
Misuse detection: two-phase rule learning (PNrule); Classification Based on Associations (CBA) approach.
Anomaly detection: unsupervised (e.g., clustering) and supervised methods to detect novel attacks.

DARPA 1998 - KDDCup'99 Data Set
KDDCup'99 is a modification of the DARPA 1998 data set prepared and managed by MIT Lincoln Lab.
The DARPA 1998 data includes a wide variety of intrusions simulated in a military network environment: 9 weeks of raw TCP dump data simulating a typical U.S. Air Force LAN.
7 weeks for training (about 5 million connection records); 2 weeks for testing (about 2 million connection records).

KDDCup'99 Data Set
Connections are labeled as normal or as attacks. Attacks fall into 4 main categories (38 attack types):
- DOS - denial of service
- Probe - e.g., port scanning
- U2R - unauthorized access to root privileges
- R2L - unauthorized remote login to a machine
U2R and R2L are extremely small classes.
3 groups of features: basic, content-based, and time-based features (details).

KDDCup'99 Data Set
Training set: ~5 million connections; the 10% training set has 494,021 connections. Test set: 311,029 connections.
The test data has attack types that are not present in the training data, which makes the problem more realistic: the training set contains 22 attack types, and the test data contains 17 additional new attack types that belong to one of the four main categories.

Performance of the Winning Strategy: cost-sensitive bagged boosting (B. Pfahringer).

Simple RIPPER Classification
RIPPER trained on the 10% training set (494,021 connections) and tested on the entire test set (311,029 connections).

Simple RIPPER on Modified Data
- Remove duplicates and merge the original training and test data sets.
- Sample 69,980 examples from the merged data set: sample from the neptune and normal subclasses; other subclasses remain intact.
- Divide the result in equal proportions into new training and test sets.
- Apply the RIPPER algorithm to the new data set.
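A minimal sketch of this preparation, assuming the connection records sit in pandas DataFrames with a `label` column holding the subclass names; the column name, label values, and per-class sampling proportions are assumptions, not taken from the slides:

```python
# Hypothetical preparation of the modified data set (dedup, down-sample the
# dominant subclasses, split in half); details beyond the slide are assumed.
import pandas as pd

def build_modified_dataset(train_df, test_df, target_size=69_980, seed=0):
    merged = pd.concat([train_df, test_df], ignore_index=True).drop_duplicates()

    # Sample only from the dominant subclasses; keep the rest intact.
    dominant_labels = ["neptune", "normal"]          # assumed label spellings
    rare = merged[~merged["label"].isin(dominant_labels)]
    dominant = merged[merged["label"].isin(dominant_labels)]
    n_dominant = max(target_size - len(rare), 0)
    sampled = pd.concat([rare,
                         dominant.sample(n=min(n_dominant, len(dominant)),
                                         random_state=seed)])

    # Divide in equal proportions into new training and test sets.
    sampled = sampled.sample(frac=1.0, random_state=seed)   # shuffle
    half = len(sampled) // 2
    return sampled.iloc[:half], sampled.iloc[half:]
```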

Building Predictive Models in NID
Models should handle skewed class distributions; accuracy is not a sufficient evaluation metric. Focus on both recall and precision (C denotes the rare class, NC the large class):
Recall (R) = TP / (TP + FN)
Precision (P) = TP / (TP + FP)
F-measure = 2*R*P / (R + P)
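For concreteness, the three measures as they would be computed from binary confusion-matrix counts (a straightforward sketch with illustrative numbers, not figures from the slides):

```python
def rare_class_metrics(tp, fp, fn):
    """Recall, precision and F-measure for the rare class C vs. the large class NC."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f_measure = (2 * recall * precision / (recall + precision)
                 if (recall + precision) else 0.0)
    return recall, precision, f_measure

# Example: 40 rare-class connections caught, 60 missed, 10 false alarms.
print(rare_class_metrics(tp=40, fp=10, fn=60))
```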

Predictive Models for Rare Classes
- Over-sampling the small class [Ling, Li, KDD 1998]
- Down-sizing the large class [Kubat, ICML 1997]
- Internally biasing the discrimination process to compensate for class imbalance [Fawcett, DMKDD 1997]
- PNrule and related work [Joshi, Agarwal, Kumar, SIAM, SIGMOD 2001]
- RIPPER with stratification
- SMOTE algorithm [Chawla, JAIR 2002]
- RareBoost [Joshi, Agarwal, Kumar, ICDM 2001]
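A minimal sketch of the first two ideas, random over-sampling of the rare class and down-sizing of the large class; this is a generic illustration under assumed NumPy inputs, not the cited algorithms:

```python
import numpy as np

def rebalance(X, y, rare_label=1, ratio=1.0, rng=None):
    """Over-sample the rare class (with replacement) until it reaches `ratio`
    times the size of the large class; down-sizing the large class would
    instead subsample large_idx. X, y are NumPy arrays."""
    rng = rng or np.random.default_rng(0)
    rare_idx = np.flatnonzero(y == rare_label)
    large_idx = np.flatnonzero(y != rare_label)
    n_target = int(ratio * len(large_idx))
    extra = rng.choice(rare_idx, size=max(n_target - len(rare_idx), 0), replace=True)
    keep = np.concatenate([large_idx, rare_idx, extra])
    return X[keep], y[keep]
```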

PNrule Learning
P-phase: cover most of the positive examples with high support; seek good recall.
N-phase: remove false positives from the examples covered in the P-phase; N-rules give high accuracy and significant support.
Existing techniques can learn erroneous small signatures for the absence of C; PNrule can instead learn strong signatures for the presence of NC in the N-phase.
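A hypothetical sketch of the two-phase idea only, not the authors' PNrule implementation: shallow decision trees stand in for rule learners, and the class weights are assumptions made to bias the first phase toward recall.

```python
# Phase 1 ("P-phase") learns a high-recall model for the rare class C;
# phase 2 ("N-phase") learns, on the examples flagged by phase 1 only,
# to recognize NC examples (the false positives) and remove them.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def pnrule_like_fit(X, y):
    """X, y are NumPy arrays; y == 1 for the rare class C, 0 for NC."""
    p_model = DecisionTreeClassifier(max_depth=3, class_weight={0: 1, 1: 10})
    p_model.fit(X, y)
    covered = p_model.predict(X) == 1          # examples the P-rules cover
    n_model = DecisionTreeClassifier(max_depth=3)
    n_model.fit(X[covered], y[covered])        # learn NC signatures within the cover
    return p_model, n_model

def pnrule_like_predict(models, X):
    p_model, n_model = models
    pred = p_model.predict(X)                  # tentative C predictions
    covered = pred == 1
    if covered.any():
        pred[covered] = n_model.predict(X[covered])   # keep only confirmed C's
    return pred
```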

RIPPER vs. PNrule Classification

Model    Attack   Recall (%)   Precision (%)   F-value
RIPPER   U2R      17.1         6.7             9.6
RIPPER   R2L      13.9         84.9            23.9
RIPPER   Probe    77.8         64.7            70.7
PNrule   U2R      18.4         56.8            27.8
PNrule   R2L      14.1         72.8            23.7
PNrule   Probe    83.8         69.2            75.9

Trained on a 5% sample of the normal, smurf (DOS), and neptune (DOS) subclasses from the 10% training data (494,021 connections); tested on the entire test set (311,029 connections).

Classification Based on Associations (CBA)
What are association patterns?
Frequent itemset: captures a set of "items" that co-occur frequently in a transaction database.
Association rule: predicts the occurrence of a set of items in a transaction given the presence of other items; written X => y, with support s (fraction of transactions containing both X and y) and confidence a (fraction of transactions containing X that also contain y).
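A small, self-contained illustration of how support and confidence are computed for a rule X => y over a set of transactions; the toy items are invented for the example:

```python
def support_confidence(transactions, X, y):
    """Support = fraction of transactions containing X and y;
       confidence = fraction of transactions containing X that also contain y."""
    X, both = set(X), set(X) | {y}
    n_X = sum(1 for t in transactions if X <= set(t))
    n_both = sum(1 for t in transactions if both <= set(t))
    support = n_both / len(transactions)
    confidence = n_both / n_X if n_X else 0.0
    return support, confidence

# Toy transactions built from discretized connection features.
T = [{"srv=http", "flag=SF", "class=normal"},
     {"srv=ecr_i", "flag=SF", "class=dos"},
     {"srv=ecr_i", "flag=SF", "class=dos"},
     {"srv=http", "flag=REJ", "class=probe"}]
print(support_confidence(T, X={"srv=ecr_i", "flag=SF"}, y="class=dos"))
# -> support 0.5, confidence 1.0
```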

Classification Based on Associations (CBA)
Previous work: use association patterns to improve the overall performance of traditional classifiers.
- Integrating Classification and Association Rule Mining [Liu, Li, KDD 1998]
- CMAR: Accurate Classification Based on Multiple Class-Association Rules [Han, ICDM 2001]
Associations in network intrusion detection:
- Use classification based on associations for anomaly detection and misuse detection [Lee, Stolfo, Mok 1999]
- Look for abnormal associations [Barbara, Wu, Jajodia, 2001]

Frequent Itemset Generation Methodology
[Diagram] Overall data set -> stratification by class (dos, probe, u2r, r2l, normal) -> frequent itemset generation per class (e.g., F1: {A, B, C} => dos, F2: {B, D} => dos, ...) -> feature selection -> feed to classifier.

Methodology
Current approaches use confidence-like measures to select the best rules to add as features into classifiers. This works well only if each class is well represented in the data set. For rare-class problems, some high-recall itemsets can be useful as long as their precision is not too low.
Our approach:
- Apply a frequent itemset generation algorithm to each class.
- Select itemsets to be added as features based on precision, recall, and F-measure.
- Apply a classification algorithm, i.e., RIPPER, to the new data set.
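A sketch of the selection step: score each candidate itemset as a rule "itemset => class" by precision, recall, and F-measure on the training data, and keep the top-scoring itemsets as new binary features. Function names, the record layout (one set of items per record), and the cut-off are illustrative assumptions:

```python
def score_itemset(records, labels, itemset, target_class):
    """records: list of sets of items; labels: class label of each record."""
    covered = [itemset <= r for r in records]
    tp = sum(1 for c, l in zip(covered, labels) if c and l == target_class)
    fp = sum(1 for c, l in zip(covered, labels) if c and l != target_class)
    pos = sum(1 for l in labels if l == target_class)
    recall = tp / pos if pos else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f = 2 * recall * precision / (recall + precision) if (recall + precision) else 0.0
    return precision, recall, f

def select_features(records, labels, candidates, target_class, top_k=20):
    """Keep the top_k itemsets by F-measure; each becomes a 0/1 feature column."""
    ranked = sorted(candidates,
                    key=lambda s: score_itemset(records, labels, s, target_class)[2],
                    reverse=True)
    chosen = ranked[:top_k]
    new_columns = [[int(s <= r) for s in chosen] for r in records]
    return chosen, new_columns
```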

Experimental Results (on modified data)
[Results comparing original RIPPER, RIPPER with high-precision rules, RIPPER with high-recall rules, and RIPPER with high-F-measure rules.]

Experimental Results (on modified data)
[Same comparison: original RIPPER vs. RIPPER with high-precision, high-recall, and high-F-measure rules.] For rare classes, rules ordered according to F-measure produce the best results.

CBA Summary
- Association rules can improve the overall performance of classifiers.
- The measure used to select rules for feature addition can affect classifier performance.
- The proposed F-measure rule selection approach leads to better overall performance.

Anomaly Detection – Related Work
- Detect novel intrusions using pseudo-Bayesian estimators to estimate prior and posterior probabilities of new attacks [Barbara, Wu, SIAM 2001]
- Generate artificial anomalies (intrusions) and then use RIPPER to learn intrusions [Fan et al, ICDM 2001]
- Detect intrusions by computing changes in estimated probability distributions [Eskin, ICML 2000]
- Clustering-based approaches [Portnoy et al, 2001]

SNN Clustering on KDD Cup '99 Data
SNN clustering is suited for finding clusters of varying sizes, shapes, and densities in the presence of noise.
Dataset: 10,000 examples were sampled from the neptune, smurf, and normal subclasses of both the training and test sets; other subclasses remain intact. Total number of instances: 97,000.
Applied shared-nearest-neighbor based clustering and k-means clustering.
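A minimal sketch of the shared-nearest-neighbor similarity that SNN clustering is built on; the sparsification step and the choice of k here are assumptions, and the clustering itself (thresholding the SNN graph) is omitted:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def snn_similarity(X, k=10):
    """SNN similarity of two points = number of neighbors their k-NN lists share,
    counted only when the two points appear in each other's k-NN lists."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)
    neighbor_sets = [set(row[1:]) for row in idx]   # drop each point's self-match
    n = len(X)
    sim = np.zeros((n, n), dtype=int)
    for i in range(n):                              # quadratic loop; fine for a sketch
        for j in range(i + 1, n):
            if j in neighbor_sets[i] and i in neighbor_sets[j]:
                shared = len(neighbor_sets[i] & neighbor_sets[j])
                sim[i, j] = sim[j, i] = shared
    return sim
```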

Clustering Results
SNN clusters of pure new attack types are found.
Columns: cluster name, size, same category, wrong category.
apache2 (dos): 211, 183, 4
mscan (probe): 142, 118
xterm + ps (u2r): 117, 57, 24 (r2l), 36 (normal)
snmpgetattack (r2l): 69, 34 (normal), 131, 104
processtable (dos): 146, 87, 1, 1 (dos), 3 (r2l)

Clustering Results
[Charts comparing k-means performance and SNN clustering performance, shown for all k-means clusters and for the tightest k-means clusters.]

Nearest Neighbor (NN) Based Outlier Detection
- For each point in the training set, calculate the distance to its closest point.
- Build a histogram of these distances.
- Choose a threshold such that a small percentage (e.g., 2%) of the training set is classified as outliers.
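A minimal sketch of this scheme; the distance metric (Euclidean on preprocessed features) and the exact thresholding used in the study are assumptions:

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def fit_nn_threshold(X_train, outlier_pct=2.0):
    """Distance from each training point to its closest *other* training point,
    with a threshold chosen so ~outlier_pct% of training points exceed it."""
    nn = NearestNeighbors(n_neighbors=2).fit(X_train)
    dist, _ = nn.kneighbors(X_train)
    d_closest = dist[:, 1]                        # column 0 is the point itself
    threshold = np.percentile(d_closest, 100.0 - outlier_pct)
    return nn, threshold

def detect_anomalies(nn, threshold, X_new):
    """Flag new connections whose nearest training point is farther than the threshold."""
    dist, _ = nn.kneighbors(X_new, n_neighbors=1)
    return dist[:, 0] > threshold                 # True => anomalous connection
```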

Anomaly Detection Using the NN Scheme
[Illustrative plot with attack connections highlighted.]

Novel Attack Detection Using the NN Scheme

                 Normal   Correct Attack Group   Incorrect Attack Group   Anomaly   Total
Normal           12040    -                      176                      173       12389
Known Attacks    1119     7581                   225                      1814      10739
Novel Attacks    781      347                    139                      2755      4022

Detection rate for novel attacks = 68.50%
False positive rate for normal connections = 2.82%

Novel Attack Detection Using the NN Scheme
[Breakdown of the detected novel attacks; per-attack details appear on the 1-NN slides at the end.]

Conclusions
- Predictive models specifically designed for rare classes can help improve the detection of small attack types.
- The SNN clustering based approach shows promise in identifying novel attack types.
- Simple nearest neighbor based approaches appear capable of detecting anomalies.

KDDCup'99 Data Set
KDDCup'99 contains derived high-level features, in 3 groups:
- Basic features of individual TCP connections (duration, protocol type, service, src & dest bytes, ...)
- Content features within a connection, suggested by domain knowledge (e.g., number of failed login attempts)
- Time-based traffic features of the connection records: "same host" features examine only the connections that have the same destination host as the current connection; "same service" features examine only the connections that have the same service as the current connection.

1-NN on Anomalies

1-NN on Known Attacks