Ensemble-based Adaptive Intrusion Detection
Wei Fan, IBM T.J. Watson Research
Salvatore J. Stolfo, Columbia University



Data Mining for Intrusion Detection
Connection Records -> Feature Construction -> Training Data -> Inductive Learner -> Intrusion Detection Model -> Label Existing Connections
Example records: (telnet, 10,3,...) (ftp,10,20,...)

Some interesting requirements
- New types of intrusions are constantly invented by hackers.
  - Example: the most recent coordinated attacks on many e-business websites.
- Hackers tend to use new types of intrusions that the intrusion detection system is unaware of or weak at detecting.
- Data mining for intrusion detection is a very data-intensive process.
  - very large data
  - evolving patterns
  - real-time detection

Question
- When new types of intrusions are invented, can we quickly adapt our existing model so that it detects them before they cause more damage?
  - If we don't have a solution, the new attack will cause significant damage.
  - For this kind of problem, a solution that is not completely satisfactory is better than no solution.

Naive Approach - Complete Re-training
Existing Training Data + New Data -> Merged Training Data -> Inductive Learner -> NEW Intrusion Detection Model

Problem with the Naive Approach
- Since the data (existing plus new) will be very large, it takes a long time to compute a detection model.
- By the time the model is constructed, the new attack will probably have already done considerable damage to our system.

New Approach
New Data -> Learner -> NEW Model
Existing Model + NEW Model -> Combined Model
Key point: we compute the new model only from the data on new types of intrusions.

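How do we label connections?
- A new connection is first given to the existing model, which outputs a connection type:
  - normal or a previously known intrusion type, or
  - unrecognized.
- An unrecognized connection is passed to the NEW model, which labels it as normal or as a new intrusion type.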
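
The routing on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two models are stand-in functions, and the `UNRECOGNIZED` marker, the toy rules, and the feature layout `(service, f1, f2)` are all hypothetical and not taken from the presentation.

```python
# Hypothetical marker the existing model returns when it cannot classify a connection.
UNRECOGNIZED = "unrecognized"


def label_connection(conn, existing_model, new_model):
    """Ask the existing model first; fall back to the new model only if needed."""
    label = existing_model(conn)
    if label != UNRECOGNIZED:
        return label  # normal or a previously known intrusion type
    return new_model(conn)  # normal or a new intrusion type


# Toy stand-ins for trained classifiers (assumptions, for illustration only).
def toy_existing_model(conn):
    service = conn[0]
    if service == "telnet":
        return "normal"
    if service == "ftp":
        return "known_intrusion"
    return UNRECOGNIZED


def toy_new_model(conn):
    return "new_intrusion" if conn[1] > 100 else "normal"


if __name__ == "__main__":
    for conn in [("telnet", 10, 3), ("ftp", 10, 20), ("smtp", 500, 1)]:
        print(conn, "->", label_connection(conn, toy_existing_model, toy_new_model))
```

Because the new model is consulted only for connections the existing model cannot place, the existing model does not have to be retrained when a new intrusion type appears.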

Basic Idea
- The existing model is built to identify THREE classes:
  - normal
  - known types of intrusions
  - anomaly: a connection that is neither normal nor any known type of intrusion.
- Anomaly detection: we use the artificial anomaly generation method (Fan et al., ICDM 2001).

Anomaly Detection
- Generate "artificial anomalies" from training data: similar to "near misses".
- Artificial anomalies are data points that are different from the training data.
- The algorithm concentrates on feature values that are infrequent in the training data.
- Distribution-based Artificial Anomaly (Fan et al., ICDM 2001)
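
A rough sketch of the idea on this slide follows. It is not the published procedure from Fan et al.; it is one simple way to produce "near miss" points that favors infrequent feature values, written under the assumption that each training record is a tuple of discrete feature values. The function name and the toy data are hypothetical.

```python
import random
from collections import Counter


def generate_artificial_anomalies(data, rng):
    """Create points near the training data that do not occur in it.

    For each feature, a value that is rarer than the most frequent value
    receives more generation attempts, so sparse regions get more coverage.
    """
    known = set(data)
    anomalies = []
    for f in range(len(data[0])):
        counts = Counter(row[f] for row in data)
        max_count = max(counts.values())
        for value, count in counts.items():
            # Rarer values get (max_count - count) attempts.
            for _ in range(max_count - count):
                base = rng.choice(data)
                if base[f] == value:
                    continue  # swapping in the same value changes nothing
                candidate = base[:f] + (value,) + base[f + 1:]
                if candidate not in known:
                    anomalies.append(candidate)
    return anomalies


if __name__ == "__main__":
    training = [("telnet", 10, 3), ("telnet", 10, 3), ("telnet", 10, 4), ("ftp", 10, 20)]
    for point in generate_artificial_anomalies(training, random.Random(0)):
        print(point)
```

Each generated point differs from some training record in a single feature and is never identical to a training record, which is what makes it usable as a labeled "anomaly" example for the learner.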

Four Configurations
- H1(x): existing model.
- H2(x): new model.
- The configurations differ in:
  - how H2(x) is computed,
  - how H1(x) and H2(x) are combined,
  - how a connection is processed and classified.

Configuration I

Configuration II

Configuration III

Configuration IV

Experiment
- 1998 DARPA Intrusion Detection Evaluation Dataset
- 22 different types of intrusions.

Experiment
- Intrusions are introduced into the training data in sequence to simulate new intrusions being invented and launched by hackers.
  - 22! unique sequences
  - we randomly used 3 unique sequences.
- The results are averaged.
- RIPPER
  - unordered rulesets
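
To give a sense of scale for "22! unique sequences", the short sketch below computes the count and draws 3 distinct random orderings. The intrusion type names are placeholders, and the sampling procedure is an assumption; the slides do not say how the 3 sequences were drawn.

```python
import math
import random


def sample_orderings(types, k, rng):
    """Draw k distinct random orderings (permutations) of the given types."""
    seen = set()
    while len(seen) < k:
        seen.add(tuple(rng.sample(types, len(types))))
    return [list(order) for order in seen]


if __name__ == "__main__":
    # Placeholder names standing in for the 22 intrusion types.
    intrusion_types = [f"intrusion_{i:02d}" for i in range(22)]
    print(math.factorial(len(intrusion_types)))  # 1124000727777607680000
    for order in sample_orderings(intrusion_types, 3, random.Random(0)):
        print(order[:3], "...")
```

With roughly 1.1 x 10^21 possible orderings, evaluating all of them is out of reach, which is why a small random sample is averaged instead.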

3 Unique Sequences

Measurements
- All results are on the new intrusion types.
- Precision:
  - If I catch a potential thief, what is the probability that it is a real thief?
- Recall:
  - What is the probability that real thieves are detected?
- Anomaly Detection Rate: new intrusions classified as anomaly.
- Other: new intrusions classified as other types of intrusions.
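
The four measurements can be computed from true and predicted labels as sketched below. The denominators for the anomaly and "other" rates (the number of true connections of the new intrusion type) are an assumption, since the slide does not spell them out; the label strings are hypothetical.

```python
def new_type_metrics(true_labels, predicted_labels, new_type,
                     anomaly="anomaly", normal="normal"):
    """Precision, recall, anomaly rate and 'other' rate for one new intrusion type."""
    pairs = list(zip(true_labels, predicted_labels))
    actual_new = sum(t == new_type for t, _ in pairs)
    predicted_new = sum(p == new_type for _, p in pairs)
    true_pos = sum(t == new_type and p == new_type for t, p in pairs)
    as_anomaly = sum(t == new_type and p == anomaly for t, p in pairs)
    as_other = sum(t == new_type and p not in (new_type, anomaly, normal)
                   for t, p in pairs)
    return {
        # Of the connections flagged as the new type, how many really are?
        "precision": true_pos / predicted_new if predicted_new else 0.0,
        # Of the real new-type connections, how many were flagged as such?
        "recall": true_pos / actual_new if actual_new else 0.0,
        "anomaly_rate": as_anomaly / actual_new if actual_new else 0.0,
        "other_rate": as_other / actual_new if actual_new else 0.0,
    }


if __name__ == "__main__":
    truth = ["new", "new", "new", "new", "normal", "normal"]
    preds = ["new", "new", "anomaly", "old_attack", "new", "normal"]
    print(new_type_metrics(truth, preds, new_type="new"))
```

In this toy run, 2 of the 3 connections flagged as the new type are real (precision 2/3), 2 of the 4 real ones are caught (recall 0.5), and one each is labeled as anomaly and as another intrusion type (0.25 each).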

Precision Results

Recall Results

Anomaly Detection Rate

Other Detection Rate Results

Summary of results
- The most accurate is Configuration 1, where
  - the new model is trained from normal data and the new intrusion type,
  - all connections predicted as normal or anomaly by the old model are examined by the new model.
- Reason:
  - The existing model's precision in detecting normal connections influences the combined model's accuracy.
  - New data is limited in amount, so the artificial anomalies generated from new data are limited as well.
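
The routing rule stated for Configuration 1 can be written down directly. The sketch below covers only what this slide says (connections the old model calls normal or anomaly are re-examined by the new model); the configuration diagrams themselves are not in the transcript, and the toy models and label strings are hypothetical.

```python
def classify_config1(conn, old_model, new_model):
    """Re-examine the old model's 'normal' and 'anomaly' predictions with the new model."""
    label = old_model(conn)
    if label in ("normal", "anomaly"):
        return new_model(conn)  # new model decides: normal or the new intrusion type
    return label  # a previously known intrusion type stands as predicted


# Toy stand-ins for trained classifiers (assumptions, for illustration only).
def toy_old_model(conn):
    if conn[0] == "ftp":
        return "known_intrusion"
    return "anomaly" if conn[1] > 100 else "normal"


def toy_new_model(conn):
    return "new_intrusion" if conn[2] > 50 else "normal"


if __name__ == "__main__":
    for conn in [("ftp", 1, 99), ("telnet", 500, 99), ("telnet", 1, 1)]:
        print(conn, "->", classify_config1(conn, toy_old_model, toy_new_model))
```

Sending the old model's "normal" predictions to the new model as well means a new intrusion that the old model mistakes for normal traffic still gets a second look.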

Training Efficiency

Related Work (incomplete list)
- Anomaly Detection:
  - SRI's IDES uses the probability distribution of past activities to measure the abnormality of host events. We measure network events.
  - Forrest et al. use the absence of subsequences to measure abnormality.
  - Lane and Brodley employ a similar approach but use incremental learning to update stored sequences from UNIX shell commands.
  - Ghosh and Schwartzbard use a neural network to learn a profile of normality and a distance function to detect abnormality.
- Generating Artificial Data:
  - Nigam et al. assign labels to unlabeled data using a classifier trained from labeled data.
  - Chang and Lippmann applied voice transformation techniques to add artificial training talkers to increase variability.
- Multiple classifiers:
  - Asker and Maclin, "Ensembles as a sequence of classifiers"

Summary and Future Work
- Proposed a two-step, two-classifier approach for efficient training and fast model deployment.
- Empirically tested in the intrusion detection domain.
- Need to test whether it works well in other domains.