Jiawei Han and Micheline Kamber Department of Computer Science

Presentation transcript:

Data Mining: Concepts and Techniques
Chapter 11: Data Mining and Intrusion Detection
Jiawei Han and Micheline Kamber
Department of Computer Science, University of Illinois at Urbana-Champaign
www.cs.uiuc.edu/~hanj
©2006 Jiawei Han and Micheline Kamber. All rights reserved.
Acknowledgements: Jian Pei and Huiping Chen (Spring 2004)
9/23/2018 Data Mining: Principles and Algorithms

Outline
- Intrusion detection and computer security
- Current intrusion detection approaches
- Data mining approaches for intrusion detection
- Summary

Intrusion Detection and Computer Security
- Computer security goals: confidentiality, integrity, and availability
- An intrusion is a set of actions aimed at compromising these security goals
- Intrusion prevention (authentication, encryption, etc.) alone is not sufficient
- Intrusion detection is needed

Intrusion Examples
- Intrusions: any set of actions that threaten the integrity, availability, or confidentiality of a network resource
- Examples
  - Denial of service (DoS): attempts to starve a host of the resources it needs to function correctly
  - Scan: reconnaissance on the network or a particular host
  - Worms and viruses: replicate onto other hosts
  - Compromises: obtain privileged access to a host through known vulnerabilities

Intrusion Detection
- Intrusion detection: the process of monitoring and analyzing the events occurring in a computer and/or network system in order to detect signs of security problems
- Primary assumption: user and program activities can be monitored and modeled
- Steps
  - Monitoring and analyzing traffic
  - Identifying abnormal activities
  - Assessing severity and raising an alarm

Monitoring and Analyzing Traffic
- TCPdump and Windump: provide insight into the traffic activity on a network
  - ftp://ftp.ee.lbl.gov/tcpdump.tar.Z
  - http://netgroupserv.polito.it/windump
- Ethereal: GUI to interpret all layers of the packet

Goals of Intrusion Detection System (IDS)
- Detect a wide variety of intrusions
  - Previously known and unknown attacks
  - Suggests the need to learn/adapt to new attacks or changes in behavior
- Detect intrusions in a timely fashion
  - May need to be real-time, especially when the system responds to the intrusion
    - Problem: analyzing commands may impact the response time of the system
  - May suffice to report that an intrusion occurred a few minutes or hours ago

Goals of Intrusion Detection System (IDS) (2)
- Present analysis in a simple, easy-to-understand format
- Be accurate
  - Minimize false positives and false negatives
    - False positive: an event incorrectly identified by the IDS as an intrusion when none has occurred
    - False negative: an event that the IDS fails to identify as an intrusion when one has in fact occurred
  - Minimize time spent verifying attacks and looking for them
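The two error types defined above can be turned into rates once labeled events are available. The following is a minimal sketch, assuming each event carries the IDS verdict and the ground truth as booleans; the function name `error_rates` is hypothetical and not part of the deck.

```python
def error_rates(alerts, truths):
    """Return (false_positive_rate, false_negative_rate).

    alerts[i] is True if the IDS flagged event i as an intrusion.
    truths[i] is True if event i really was an intrusion.
    """
    pairs = list(zip(alerts, truths))
    # False positive: flagged, but no intrusion occurred.
    fp = sum(1 for a, t in pairs if a and not t)
    # False negative: not flagged, but an intrusion did occur.
    fn = sum(1 for a, t in pairs if not a and t)
    negatives = sum(1 for _, t in pairs if not t)
    positives = sum(1 for _, t in pairs if t)
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr
```

Minimizing both rates at once is usually a trade-off: a stricter detector lowers the false negative rate and tends to raise the false positive rate.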

IDS Architecture
- Sensors (agents): collect data and forward information to the analyzer
  - Network packets
  - Log files
  - System call traces
- Analyzers (detectors)
  - Receive input from one or more sensors or from other analyzers
  - Determine whether an intrusion has occurred
- User interface
  - Enables a user to view output from the system or control the behavior of the system

[Figure: IDS architecture]

Signature-Based Intrusion Detection
- Human analysts investigate suspicious traffic
- Extract signatures: features of known intrusions
- Use pre-defined signatures to discover malicious packets
- Examples
  - LaBrea Tarpit by Tom Liston
  - Snort and Snort rules by Marty Roesch

Snort by Marty Roesch
- An open-source, free network intrusion detection system
- Signature-based; uses a combination of rules and preprocessors
- Runs on many platforms, including UNIX and Windows
- www.snort.org
- Preprocessors: IP defragmentation, port-scan detection, web traffic normalization, TCP stream reassembly, …
  - Can analyze streams, not only a single packet at a time

Snort: Overview
- Typically run from the command line
  - GUI available through IDScenter/Demarc/Puresecure
- Modes
  - Sniff: dump sniffed traffic to the screen
  - Packet log: log the packets to disk
  - NIDS: compare the network traffic with a preconfigured set of signatures
- Output can be stored in spool files or a database

Snort Rules
- Two parts
  - Rule header: defines who must be involved
  - Rule options: define what must be involved (action)
- Example (rule header first, rule options in parentheses):
  alert tcp !1.2.3.0/24 any -> 1.2.3.0/24 any (flags: SF; msg: "SYN-FIN scan";)
- The rule triggers when an outsider attempts to make a TCP connection to an internal host
- If both SYN and FIN are set, the message "SYN-FIN scan" is reported with the alert
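To make the header/options split concrete, here is a small Python sketch that mimics the example rule on a packet represented as a dictionary. It is an illustration only, not Snort's engine: the packet layout (`proto`, `src`, `dst`, `flags`) and the function name are assumptions, and `flags: SF` is read as "SYN and FIN are both set", as the slide describes, without modeling how Snort treats other flag bits.

```python
import ipaddress

# Network named in the example rule.
INTERNAL = ipaddress.ip_network("1.2.3.0/24")


def syn_fin_alert(packet):
    """Return the alert message if the packet matches the rule, else None."""
    src = ipaddress.ip_address(packet["src"])
    dst = ipaddress.ip_address(packet["dst"])
    # Rule header: tcp, source NOT in 1.2.3.0/24, destination in 1.2.3.0/24,
    # any source and destination port.
    header_ok = (
        packet["proto"] == "tcp"
        and src not in INTERNAL
        and dst in INTERNAL
    )
    # Rule options: both SYN ("S") and FIN ("F") are set.
    options_ok = {"S", "F"} <= set(packet["flags"])
    return "SYN-FIN scan" if header_ok and options_ok else None
```

A packet from 8.8.8.8 to 1.2.3.4 with flags "SF" would yield the alert message, while the same packet sent from inside 1.2.3.0/24 would not.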

Application of Snort Rules
- A packet triggers the first rule that matches; the remaining rules are not examined
  - The ordering of rules is critical
- Each Snort rule inspects only one packet
  - Use preprocessors such as IP defragmentation or TCP stream reassembly to handle a series of packets
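The effect of first-match evaluation can be seen in a few lines. This sketch assumes rules are represented as ordered (name, predicate) pairs; the rule names and the packet layout are hypothetical.

```python
def first_match(rules, packet):
    """Return the name of the first rule whose predicate matches, else None."""
    for name, predicate in rules:
        if predicate(packet):
            # Stop here: later rules are never examined for this packet.
            return name
    return None


# Two illustrative rules; the broad one shadows the specific one if it comes first.
EXAMPLE_RULES = [
    ("any-tcp", lambda p: p["proto"] == "tcp"),
    ("syn-fin", lambda p: p["proto"] == "tcp" and {"S", "F"} <= set(p["flags"])),
]
```

With `EXAMPLE_RULES` in this order, a SYN-FIN packet is reported only as "any-tcp"; reversing the list lets the more specific rule fire, which is why ordering matters.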

Snort Rule Sets
- Snort comes with a very large set of rules
  - Enabling all rules on installation is not recommended
- New Snort rules are released within hours of the discovery of a new exploit
  - A new rule may not be a good rule
  - Attackers can easily change their signatures

[Figure: Snort and Event Viewer on NT]

Problems in Signature-Based Intrusion Detection Systems
- Many false positives: prone to generating alerts when there is in fact no problem
  - Signatures are not specific enough
  - A packet is not examined in context with those that precede or follow it
- Cannot detect unknown intrusions
  - Rely on signatures extracted by human experts

Misuse vs. Anomaly Detection
- Misuse detection: use patterns of well-known attacks to identify intrusions
  - Classification based on known intrusions
  - E.g., three consecutive login failures: password guessing
- Anomaly detection: use deviation from normal usage patterns to identify intrusions
  - Any significant deviations from the expected behavior are reported as possible attacks
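The password-guessing example is a hand-coded misuse pattern, and it can be sketched directly. The event format (time-ordered `(user, success)` pairs) and the function name are assumptions made for illustration.

```python
def password_guessing_alerts(events, threshold=3):
    """Return users who reach `threshold` consecutive login failures.

    events: iterable of (user, success) tuples in time order.
    """
    streak = {}   # current run of consecutive failures per user
    alerts = []
    for user, success in events:
        if success:
            streak[user] = 0          # a successful login resets the run
        else:
            streak[user] = streak.get(user, 0) + 1
            if streak[user] == threshold:
                alerts.append(user)   # alert once, when the pattern completes
    return alerts
```

The sketch also shows the weakness of misuse detection noted in the deck: it fires only on the pattern that was written down in advance.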

Misuse vs. Anomaly Detection (comparison)
- Definition
  - Misuse detection: matching the sequence of “signature actions” of known intrusion scenarios
  - Anomaly detection: using statistical measures on system features
- Shortcoming
  - Misuse detection: known patterns have to be hand-coded; unable to detect any future intrusion
  - Anomaly detection: depends on how the system features are selected; has to study the sequential interrelation between transactions
- Example
  - Misuse detection: STAT [HLMS90]
  - Anomaly detection: IDES [LTG+92]

Host-based vs. Network-based
- Classified according to data sources
- Host-based detection: the data is collected from an individual host
  - Directly monitors the host's data files and OS processes
  - Can determine exactly which host resources are the targets of a particular attack
- Network-based detection: the data is traffic across the network
  - Uses a set of traffic sensors within the network
  - Sensors can easily be hardened against attacks and hidden from attackers

Current Intrusion Detection Approaches: Misuse Detection
- Record the specific patterns of intrusions
- Monitor current audit trails (event sequences) and match them against the recorded patterns
- Report the matched events as intrusions
- Representation models: expert rules, Colored Petri Nets, state transition diagrams, etc.

Misuse Detection Examples
- Expert systems: use a set of rules to describe attacks
  - IDES, ComputerWatch, NIDX, P-BEST, ISOA
- Signature analysis: capture features of attacks in the audit trail
  - Haystack, NetRanger, RealSecure, MuSig
- State-transition analysis: use state-transition diagrams
  - STAT, USTAT, and NetSTAT
- Other approaches
  - Colored Petri nets, e.g., IDIOT
  - Case-based reasoning, e.g., AUTOGUARD

Current Intrusion Detection Approaches: Anomaly Detection
- Establish the normal behavior profiles
- Observe current activities and compare them with the (normal) profiles
- Report significant deviations as intrusions
- Statistical measures as behavior profiles: ordinal and categorical (binary and linear)
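The three steps (profile, compare, report) can be illustrated with a single ordinal measure. This sketch assumes the profile is simply the mean and standard deviation of a measure observed during normal operation, and that "significant deviation" means more than `k` standard deviations; both choices are illustrative assumptions, not the method of any system named in the deck.

```python
from statistics import mean, stdev


def build_profile(normal_values):
    """Summarize normal behavior of one measure as (mean, standard deviation)."""
    return mean(normal_values), stdev(normal_values)


def is_anomalous(value, profile, k=3.0):
    """Report a value that deviates from the profile by more than k sigmas."""
    mu, sigma = profile
    if sigma == 0:
        return value != mu
    return abs(value - mu) > k * sigma
```

For example, with a profile built from logins-per-hour counts such as 9 to 12, a count of 40 would be reported while a count of 11 would not.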

Anomaly Detection Examples
- Statistical methods: multivariate, temporal analysis
  - IDES, NIDES, EMERALD
- Expert systems
  - ComputerWatch, Wisdom & Sense

Problems of Current Intrusion Detection Approaches
- Main problems: manual and ad hoc
- Misuse detection
  - Known intrusion patterns have to be hand-coded
  - Unable to detect any new intrusions (those with no matching patterns recorded in the system)
- Anomaly detection
  - Selecting the right set of system features to be measured is ad hoc and based on experience
  - Unable to capture the sequential interrelation between events

Data Mining Approaches for Intrusion Detection
- A systematic framework
- Why can data mining help?
- Relevant data mining techniques
- Building classifiers for intrusion detection
- Mining patterns from audit data

A Systematic Framework (J. Stolfo et al.)
- Build good models: select appropriate features of audit data to build intrusion detection models
- Build better models: architect a hierarchical detector system that combines multiple detection models
- Build updated models: dynamically update and deploy new detection systems as needed

A Systematic Framework
- Support for feature selection and model construction:
  - Apply data mining algorithms to find consistent inter- and intra-audit record (event) patterns
  - Use the features and time windows in the discovered patterns to build detection models
  - A support environment to semi-automate this process

A Systematic Framework
- Combining multiple detection models:
  - Each (base) detector model monitors one aspect of the system
  - They can employ different techniques and be independent of each other
  - The learned (meta) detector combines evidence from a number of base detectors
- An intelligent agent-based architecture:
  - Learning agents: continuously compute (learn) the detection models
  - Detection agents: use the (updated) models to detect intrusions

[Figure: A systematic framework]

Why Can Data Mining Help?
- Data mining: applying specific algorithms to extract patterns from data
- Normal and intrusive activities leave evidence in audit data
- From the data-centric point of view, intrusion detection is a data analysis process

Why Can Data Mining Help? (2)
- Successful applications in related domains, e.g., fraud detection, fault/alarm management
- Learn from traffic data
  - Supervised learning: learn precise models from past intrusions
  - Unsupervised learning: identify suspicious activities
- Maintain or update models on dynamic data

Frequent Patterns
- Patterns that occur frequently in a database
- Mining frequent patterns: finding regularities
- Process of mining frequent patterns for intrusion detection
  - Phase I: mine a repository of normal frequent itemsets from attack-free data
  - Phase II: find frequent itemsets in the last n connections and compare the patterns to the normal profile
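The two phases can be sketched as follows. This is a brute-force illustration that enumerates small itemsets directly rather than using an optimized algorithm such as Apriori; the item strings, the `max_size` limit, and the function names are assumptions.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets with support >= min_support."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    n = len(transactions)
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def novel_patterns(normal, recent, min_support):
    """Phase I: profile attack-free data. Phase II: compare recent connections."""
    profile = frequent_itemsets(normal, min_support)
    current = frequent_itemsets(recent, min_support)
    # Itemsets frequent in recent traffic but absent from the normal profile.
    return set(current) - set(profile)
```

Here each connection is a set of attribute=value strings; itemsets returned by `novel_patterns` are candidates for closer inspection, not confirmed attacks.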

Frequent Pattern Mining in MINDS
- MINDS: an IDS using data mining techniques, University of Minnesota
- Summarizing attacks using association rules
  - {Src IP=206.163.27.95, Dest Port=139, Bytes in [150, 200)} => {ATTACK}

Patterns About Alerts
- Ning et al., CCS’02
- Find correlated alerts: the frequent patterns of alerts
- Attack scenarios: the logical connections between alerts
- A hyper-alert correlation graph approach
- Use the correlation of intrusion alerts to identify high-level attacks

Association Rules
- Used for link analysis
- E.g., if the number of failed login attempts (num_failed_login_attempts) and the network service on the destination (service) are features, an example rule is:
  num_failed_login_attempts = 6, service = FTP => attack = DoS [1, 0.28]
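The bracketed pair attached to a rule can be computed from records as below. This sketch reads the pair as [confidence, support], which is how the frequent-episode example in this deck uses the same notation; treating the rule above the same way is an assumption, as are the record layout and the function name.

```python
def rule_stats(records, antecedent, consequent):
    """Return (confidence, support) of the rule antecedent => consequent.

    records: list of dicts mapping attribute name to value.
    antecedent, consequent: dicts of attribute -> required value.
    """
    def matches(record, pattern):
        return all(record.get(k) == v for k, v in pattern.items())

    n_ante = sum(1 for r in records if matches(r, antecedent))
    n_both = sum(
        1 for r in records if matches(r, antecedent) and matches(r, consequent)
    )
    # Support: fraction of all records that satisfy both sides.
    support = n_both / len(records) if records else 0.0
    # Confidence: fraction of antecedent matches that also satisfy the consequent.
    confidence = n_both / n_ante if n_ante else 0.0
    return confidence, support
```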

Sequential Pattern Analysis
- Models sequence patterns
- (Temporal) order is important in many situations
  - Time-series databases and sequence databases
  - Frequent patterns -> (frequent) sequential patterns
- Sequential patterns for intrusion detection
  - Capture the signatures for attacks in a series of packets

Classification: A Two-Step Process
- Model construction: describe a set of predetermined classes
  - Training dataset: tuples for model construction
    - Each tuple/sample belongs to a predefined class
  - The model is represented as classification rules, decision trees, or mathematical formulae
- Model application: classify unseen objects
  - Estimate the accuracy of the model using an independent test set
  - If the accuracy is acceptable, apply the model to classify data tuples with unknown class labels

Classification Methods
- Basic algorithm: ID3 (decision tree induction)
- Neural networks
- Bayesian classification
  - Naïve Bayesian classification
  - Bayesian belief network
- Support vector machines

Classification for Intrusion Detection
- Misuse detection
  - Classification based on known intrusions
- Example: Sinclair et al., “An application of machine learning to network intrusion detection”
  - Use decision trees and ID3 on host session data
  - Use genetic algorithms to generate rules of the form: if <pattern> then <alert>

HIDE
- “A hierarchical network intrusion detection system using statistical processing and neural network classification” by Zheng et al.
- Five major components
  - Probes collect traffic data
  - Event preprocessor preprocesses traffic data and feeds the statistical model
  - Statistical processor maintains a model for normal activities and generates vectors for new events
  - Neural network classifies the vectors of new events
  - Post processor generates reports

Intrusion Detection by NN and SVM
- S. Mukkamala et al., IEEE IJCNN, May 2002
- Discover useful patterns or features that describe user behavior on a system
- Use the set of relevant features to build classifiers
- SVMs have great potential to be used in place of NNs due to their scalability and faster training and running time
- NNs are especially suited for multi-category classification

Clustering
- Group data into clusters
- What is a good clustering?
  - High intra-class similarity and low inter-class similarity
    - Depends on the similarity measure
  - The ability to discover some or all of the hidden patterns
- Clustering approaches
  - K-means
  - Hierarchical clustering
  - Density-based methods
  - Grid-based methods
  - Model-based methods

Clustering for Intrusion Detection
- Anomaly detection
  - Any significant deviations from the expected behavior are reported as possible attacks
  - Build clusters as models for normal activities
- “A scalable clustering for intrusion signature recognition” by Ye and Li
  - Use descriptions of clusters as signatures of intrusions
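One simple way to use clusters as a model of normal activity is sketched below. It uses a single-pass, fixed-radius clustering as a stand-in and is not the method of Ye and Li; the feature vectors, the `radius` parameter, and the function names are assumptions. It relies on `math.dist`, available in Python 3.8 and later.

```python
import math


def build_clusters(normal_points, radius):
    """Single pass: a point starts a new cluster if no centroid is within radius."""
    centroids = []
    for p in normal_points:
        if not any(math.dist(p, c) <= radius for c in centroids):
            centroids.append(p)
    return centroids


def is_possible_attack(point, centroids, radius):
    """A point far from every normal cluster is reported as a possible attack."""
    return all(math.dist(point, c) > radius for c in centroids)
```

The quality of the result depends heavily on the similarity measure and on how the features are scaled, as the clustering slide above notes.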

Alert Correlation
- F. Cuppens and A. Miege, in IEEE S&P’02
- Use clustering and merging functions to recognize alerts that correspond to the same occurrence of an attack
- Create a new alert that merges the data contained in these various alerts
- Generate global and synthetic alerts to further reduce the number of alerts

Mining Data Streams
- Data arrives continuously in multiple, rapid, time-varying, possibly unpredictable and unbounded streams
- Many applications
  - Financial applications, network monitoring, security, telecommunications data management, web applications, manufacturing, sensor networks, etc.

Mining Data Streams for Intrusion Detection
- Maintaining profiles of normal activities
  - The profiles of normal activities may drift
- Identifying novel attacks
  - Identifying clusters and outliers in traffic data streams

Building Classifiers for Intrusion Detection (J. Stolfo et al.)
- Experiments in constructing classification models for anomaly detection
- Two experiments:
  - sendmail system call data
  - network tcpdump data
- Use a meta classifier to combine multiple classification models

Classification Models on sendmail
- The data: sequences of system calls made by sendmail
- Classification models (rules): describe the “normal” patterns of the system call sequences
  - The rule set is the normal profile of sendmail
- Detection: calculate the deviation from the profile
  - A large number (or high scores) of “violations” of the rules in a new trace suggests an exploit
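The profile-and-deviation idea can be illustrated with a much simpler stand-in than learned rules: record every short window of consecutive system calls seen in normal traces, then measure what fraction of windows in a new trace were never seen. This lookup-table substitute is not the rule learner used in the experiments; the window length `n` and all names are assumptions.

```python
def windows(trace, n):
    """All length-n windows of consecutive system call numbers in a trace."""
    return [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]


def build_normal_profile(normal_traces, n=3):
    """The profile is the set of windows observed in normal traces."""
    profile = set()
    for trace in normal_traces:
        profile.update(windows(trace, n))
    return profile


def violation_rate(trace, profile, n=3):
    """Fraction of windows in the trace that violate the normal profile."""
    wins = windows(trace, n)
    if not wins:
        return 0.0
    return sum(1 for w in wins if w not in profile) / len(wins)
```

A trace whose violation rate is well above the rates seen on held-out normal traces would be flagged; where to put that threshold is a tuning decision the sketch leaves open.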

Classification Models on sendmail
- The sendmail data:
  - Each trace has two columns: the process IDs and the system call numbers
  - Normal traces: sendmail and sendmail daemon
  - Abnormal traces: sunsendmailcap, syslog-remote, syslog-local, decode, sm5x, and sm56a attacks

Classification Models on sendmail
- Lessons learned:
  - Normal behavior can be established and used to detect anomalous usage
  - Need to collect near-“complete” normal data in order to build the “normal” model
    - But how do we know when to stop collecting?
  - Need tools to guide the audit data gathering process

Classification Models on tcpdump
- The tcpdump data (part of a public data visualization contest):
  - Packets of incoming, out-going, and internal broadcast traffic
  - One trace of normal network traffic
  - Three traces of network intrusions

Data Preprocessing
- Extract the “connection”-level features:
  - Record connection attempts
  - Watch how each connection is terminated
- Each record has:
  - Start time and duration
  - Participating hosts and ports (applications)
  - Statistics (e.g., number of bytes)
  - Flag: normal or a connection/termination error
  - Protocol: TCP or UDP
- Divide connections into 3 types: incoming, out-going, and inter-LAN
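The connection record described above maps naturally onto a small data structure. The field names, the example flag values, and the rule for assigning the three connection types (based on whether each endpoint lies inside a given LAN prefix) are assumptions made for this sketch.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Connection:
    start_time: float
    duration: float
    src_host: str
    src_port: int
    dst_host: str
    dst_port: int
    n_bytes: int
    flag: str        # e.g. "normal" or a connection/termination error code
    protocol: str    # "TCP" or "UDP"


def connection_type(conn, lan):
    """Classify a connection relative to the LAN prefix, e.g. "10.0.0.0/8"."""
    net = ipaddress.ip_network(lan)
    src_in = ipaddress.ip_address(conn.src_host) in net
    dst_in = ipaddress.ip_address(conn.dst_host) in net
    if src_in and dst_in:
        return "inter-lan"
    if dst_in:
        return "incoming"
    if src_in:
        return "out-going"
    return None  # neither endpoint is on the LAN
```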

Building a Classifier for Each Type of Connection
- Use the destination service (port) as the class label
- Training data: 80% of the normal connections
- Testing data: 20% of the normal connections and the connections in the 3 intrusion traces
- Apply RIPPER to learn rules

Lessons Learned
- Data preprocessing requires extensive domain knowledge
- Adding temporal features improves classification accuracy
- Need tools to guide (temporal) feature selection

Meta Classifier that Combines Evidence from Multiple Detection Models
- Build base classifiers that each model one aspect of the system
- The meta learning task: each record has a collection of evidence from the base classifiers and a class label, “normal” or “abnormal”, on the state of the system
- Apply a learning algorithm to produce the meta classifier
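The meta learning task can be sketched with a deliberately simple learner: a table that stores, for each combination of base-classifier outputs, the label most often seen with it in training. The deck does not say which learning algorithm produces the meta classifier, so this table learner is a stand-in, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def train_meta(records):
    """records: list of (evidence, label); evidence is a tuple of base outputs."""
    votes = defaultdict(Counter)
    for evidence, label in records:
        votes[tuple(evidence)][label] += 1
    # For each evidence combination, keep the label seen most often.
    table = {ev: c.most_common(1)[0][0] for ev, c in votes.items()}
    # Fall back to the overall majority label for unseen combinations.
    default = Counter(label for _, label in records).most_common(1)[0][0]
    return table, default


def predict_meta(model, evidence):
    table, default = model
    return table.get(tuple(evidence), default)
```

Unlike a fixed majority vote, a learned combiner can discover that one base classifier is more trustworthy than another for certain situations.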

Association Rules
- Motivations
  - Audit data can be easily formatted into a database table
  - Program executions and user activities have frequent correlations among system features
  - Incremental updating of the rule set is easy

Frequent Episodes
- Frequent events occurring within a time window
- X => Y, confidence, support, window:
  - X and Y are subsets of the attribute values in a record
  - Support is the percentage of (sliding) windows that contain X and Y

Frequent Episodes
- Motivation: sequence information needs to be included in a detection model
- An example from a department’s web log: home, research => theory, [0.2, 0.05], [30]
  - Meaning: 20% of the time, after the home and research pages are visited (in that order), the theory page is visited within 30 seconds of the visit to home; visits to these three pages constitute 5% of all visits to the web site
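The numbers attached to an episode rule can be computed from a timestamped log roughly as follows. This sketch simplifies the sliding-window definition by starting one window at each event and requiring the items to appear in order; the event format and function names are assumptions, and a production miner would be considerably more careful about window placement.

```python
def contains_in_order(items, pattern):
    """True if pattern occurs in items as an ordered subsequence."""
    it = iter(items)
    return all(p in it for p in pattern)


def episode_stats(events, lhs, rhs, window):
    """Return (confidence, support) for the episode rule lhs => rhs.

    events: list of (timestamp, item) sorted by timestamp.
    One window of the given width is started at each event.
    """
    n_lhs = n_both = 0
    for i, (start, _) in enumerate(events):
        items = [item for t, item in events[i:] if t < start + window]
        if contains_in_order(items, lhs):
            n_lhs += 1
            if contains_in_order(items, list(lhs) + list(rhs)):
                n_both += 1
    support = n_both / len(events) if events else 0.0
    confidence = n_both / n_lhs if n_lhs else 0.0
    return confidence, support
```

With `lhs=("home", "research")`, `rhs=("theory",)`, and `window=30`, the function returns the analogue of the [0.2, 0.05] pair in the web log example for whatever log it is given.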

Using the Axis Attribute(s)
- Axis attribute(s): a form of item constraint; the essential attribute(s) of a record (transaction)
- Level-wise approximate mining

Using the Mined Patterns
- Guide the audit data gathering process
  - Run a program under different settings
  - For each run, calculate the association rules and frequent episodes from its audit data
  - Merge them into an aggregate rule set
  - Stop gathering audit data when no rules can be added from a new run
- Support the feature selection process
  - System features in the association rules and frequent episodes should be included in the classification models
  - The time windows and features in the frequent episodes suggest additional temporal features that should be considered
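The stopping criterion for audit data gathering amounts to a short loop. In this sketch `mine_rules` is a placeholder for whatever procedure computes association rules and frequent episodes from one run; it is passed in as a function because the deck does not specify it.

```python
def gather_until_stable(runs, mine_rules):
    """Merge rules run by run; stop when a new run adds no rules.

    runs: iterable of audit data sets, one per program run.
    mine_rules: function mapping one audit data set to a set of rules.
    Returns (aggregate_rule_set, number_of_runs_examined).
    """
    aggregate = set()
    used = 0
    for audit_data in runs:
        used += 1
        new_rules = mine_rules(audit_data) - aggregate
        if not new_rules:
            break  # this run contributed nothing new: stop gathering
        aggregate |= new_rules
    return aggregate, used
```

In practice one might require several consecutive runs with no new rules before stopping, since a single uninformative run is weak evidence that the normal profile is complete.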

Adaptive Intrusion Detection System
- Intrusion detection model based on data mining and fuzzy logic
  - Integration of fuzzy logic with data mining
  - Similarity function
  - Optimization of fuzzy membership function parameters

[Figure: The framework of the Adaptive Intrusion Detection System]

References
- W. Lee et al. A data mining framework for building intrusion detection models. In Information and System Security, Vol. 3, No. 4, 2000.
- C. Kruegel and G. Vigna. Anomaly detection of web-based attacks. In ACM CCS’03.
- S. Mukkamala et al. Intrusion detection using neural networks and support vector machines. In IEEE IJCNN (May 2002).
- Bertrand Portier. Data Mining Techniques for Intrusion Detection.
- S. Axelsson. Intrusion Detection Systems: A Survey and Taxonomy.
- J. Allen et al. State of the Practice of Intrusion Detection Technologies.
- Susan M. Bridges et al. Data Mining and Genetic Algorithms Applied to Intrusion Detection.