Implementation of Machine Learning and Chaos Combination for Improving Attack Detection Accuracy on Intrusion Detection System (IDS) Bisyron Wahyudi Kalamullah.

Presentation transcript:

Implementation of Machine Learning and Chaos Combination for Improving Attack Detection Accuracy on Intrusion Detection System (IDS)
Bisyron Wahyudi, Kalamullah Ramli
Department of Electrical Engineering, Universitas Indonesia

Network Security

Background: IDS
• The most important element in network security: IDS
• Intrusion detection principles:
  • Misuse detection (signature-based)
  • Anomaly detection (statistics)
  • Classification with machine learning (research)

Background: Problem
• Intrusion detection produces too many false alarms.
• New types of attack arise more and more often.
• An effective and adaptable detection method is required.
• Classification with machine learning gives the best results, but those results depend on the kernel function and its parameters, and on the network data attributes/features.
• There is no systematic theory for choosing the appropriate kernel and parameters.

Data Mining Approach for IDS
1. Capturing packets transferred on the network.
2. Extracting an extensive set of attributes/features of the network packet data that can describe a network connection or a host session.
3. Learning a model that can accurately describe the behavior of abnormal and normal activities by applying data mining techniques.
4. Detecting the intrusions by using the learned models.

Data Mining Approach
Classification (Supervised)      Clustering (Unsupervised)
K Nearest Neighbor (K-NN)        K-Means
Naïve Bayes                      Hierarchical Clustering
Artificial Neural Network        DBSCAN
Support Vector Machine           Fuzzy C-Means
Fuzzy K-NN                       Self Organizing Map

Machine Learning
[Diagram] Model Development: Input Training Data (x,y), Learning Algorithm
[Diagram] Model Implementation: Input Test Data (x,?), Output Test Data (x,y)
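The training and implementation phases in this slide can be illustrated with a very small supervised classifier. The sketch below uses k-nearest neighbours (listed among the classification methods earlier in the slides) as a stand-in, not the SVM used in the presentation. The toy data, the labels "normal" and "attack", and the function names are hypothetical.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def knn_predict(train_x, train_y, x, k=3):
    """Label x by majority vote among its k nearest training points."""
    order = sorted(range(len(train_x)),
                   key=lambda i: squared_distance(train_x[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Model development: labelled training data (x, y). Toy values, not KDD data.
train_x = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
           [0.9, 1.0], [1.0, 0.8], [0.8, 0.9]]
train_y = ["normal", "normal", "normal", "attack", "attack", "attack"]

# Model implementation: unlabelled test data (x, ?) receives a predicted y.
test_x = [[0.05, 0.05], [0.95, 0.9]]
predictions = [knn_predict(train_x, train_y, x) for x in test_x]
```

For a real IDS the training pairs would come from a labelled dataset and the classifier would be replaced by the chosen learning algorithm.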

SVM Classification

Kernel Function
Kernel Name                      Definition of Function
Linear                           $K(x,y) = x \cdot y$
Polynomial                       $K(x,y) = (x \cdot y + c)^d$
Gaussian RBF                     $K(x,y) = \exp\left(-\|x-y\|^2 / (2\sigma^2)\right)$
Sigmoid (Tangent Hyperbolic)     $K(x,y) = \tanh(\sigma (x \cdot y) + c)$
Inverse Multiquadric             $K(x,y) = 1 / \sqrt{\|x-y\|^2 + c}$
x and y: a pair of data points from the training dataset
σ, c, d > 0: constant parameters
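The kernel definitions listed above translate directly into code. The following is a minimal sketch in plain Python; the function names and default parameter values are illustrative assumptions, and the inverse multiquadric follows the slide's form with c (not c squared) under the root.

```python
import math


def dot(x, y):
    """Inner product x . y of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def sq_dist(x, y):
    """Squared Euclidean distance ||x - y||^2."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def polynomial_kernel(x, y, c=1.0, d=2):
    return (dot(x, y) + c) ** d


def rbf_kernel(x, y, sigma=1.0):
    return math.exp(-sq_dist(x, y) / (2.0 * sigma ** 2))


def sigmoid_kernel(x, y, sigma=1.0, c=1.0):
    return math.tanh(sigma * dot(x, y) + c)


def inverse_multiquadric_kernel(x, y, c=1.0):
    return 1.0 / math.sqrt(sq_dist(x, y) + c)
```

Each function takes one pair of training vectors; an SVM implementation would evaluate the chosen kernel over all pairs to build its kernel matrix.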

SVM Performance
• How to choose the optimal/significant features of the input dataset.
• How to set the best kernel function and parameters: σ, ε and C.

Chaos
• Three important dynamic properties: the intrinsic stochastic property, ergodicity and regularity.
• Advantage of chaos: it can escape from local minima.
• Its powerful global searching ability makes it more efficient at obtaining optimized parameters.
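The slides do not say which chaotic system drives the search. As a hedged illustration, the sketch below assumes the logistic map z ← 4z(1 − z), a common choice in chaos optimization, and uses its ergodic sequence to sample candidate parameters inside given bounds. The function names, seeds, and the toy objective are hypothetical; in an IDS setting the objective would be something like the validation error of an SVM for a given (C, σ).

```python
def logistic_map(z):
    """One step of the logistic map in its fully chaotic regime (r = 4)."""
    return 4.0 * z * (1.0 - z)


def chaos_search(objective, bounds, n_iter=2000, seeds=None):
    """Minimise objective(params) by sampling along chaotic trajectories.

    bounds is a list of (low, high) pairs, one per parameter.
    seeds are starting values in (0, 1); they must avoid the map's
    fixed points and pre-images of them (0, 0.25, 0.5, 0.75, 1).
    """
    if seeds is None:
        seeds = [0.123 + 0.211 * i for i in range(len(bounds))]
    z = list(seeds)
    best_params, best_value = None, float("inf")
    for _ in range(n_iter):
        # Advance every chaotic variable, then scale it into its range.
        z = [logistic_map(v) for v in z]
        params = [lo + v * (hi - lo) for v, (lo, hi) in zip(z, bounds)]
        value = objective(params)
        if value < best_value:
            best_params, best_value = params, value
    return best_params, best_value


def toy_objective(p):
    """Stand-in for a model's validation error; minimum at (2, -1)."""
    return (p[0] - 2.0) ** 2 + (p[1] + 1.0) ** 2


best_params, best_value = chaos_search(toy_objective, [(-5.0, 5.0), (-5.0, 5.0)])
```

Because the trajectory visits the whole interval rather than following a gradient, the search is not trapped by a local minimum, which is the advantage the slide refers to. A practical implementation would usually follow this coarse search with a finer local one.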

System Design

Methodology
[Diagram] Stages: Data Collection, Data Preprocessing, Model Development, Data Classification
[Diagram] Data: KDDCUP ’99 DARPA Dataset, Training Dataset, Test Dataset, Predicted Intrusion Data

Data Preprocessing
• Dataset Transformation
• Dataset Normalization
• Range Discretization
• Format Conversion
• Dataset Division: Training & Test
Input: KDDCUP ’99 DARPA Dataset
Output: Training Dataset, Test Dataset
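The slide names the preprocessing steps but not the specific methods. A minimal sketch under stated assumptions: min-max scaling for normalization, integer codes for converting categorical fields, and a simple ordered split into training and test sets. All function names and the sample values are hypothetical.

```python
def encode_categorical(values):
    """Format conversion: map each distinct symbol to an integer code."""
    codes = {name: i for i, name in enumerate(sorted(set(values)))}
    return [codes[v] for v in values]


def min_max_normalize(rows):
    """Normalization: scale every column of a numeric table into [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0  # constant column -> 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def split_dataset(rows, labels, train_fraction=0.7):
    """Dataset division: first part for training, remainder for testing."""
    cut = int(len(rows) * train_fraction)
    return (rows[:cut], labels[:cut]), (rows[cut:], labels[cut:])


protocol_codes = encode_categorical(["tcp", "udp", "tcp", "icmp"])
scaled = min_max_normalize([[0, 10], [5, 10], [10, 10]])
(train_rows, train_labels), (test_rows, test_labels) = split_dataset(
    [[1], [2], [3], [4]], ["a", "b", "c", "d"], train_fraction=0.5)
```

In practice the rows would be shuffled or stratified before the split so that both sets contain every attack class.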

Model Development
[Diagram] Model Development: Input Training Data (x,y), Kernel Function Selection, Parameter Selection with Chaos Optimization, Learning Algorithm (SVM)
[Diagram] Model Implementation: Input Test Data (x,?), Output Test Data (x,y)

Features in KddCup
• Features 1-9 : intrinsic features extracted from the packet header
• Features : content attributes obtained from expert knowledge of the packet
• Features : content attributes from the connections in the previous 2 seconds
• Features : machine traffic attributes obtained from the previous 100 connections
• Payload feature : payload based on time (week)

Intrinsic Attributes
These attributes are extracted from the header area of the network packets.

Content Attributes
These attributes are extracted from the content area of the network packets, based on expert knowledge.

Time Traffic Attributes
To calculate these attributes, we considered the connections that occurred in the past 2 seconds.

Machine Traffic Attributes
To calculate these attributes, we took into account the previous 100 connections.

Network Traffic Classification

Selected Features
Previous work by Mukkamala used eight features: src_bytes, dst_bytes, Count, srv_count, dst_host_count, dst_host_srv_count, dst_host_same_src_port_rate, dst_host_srv_diff_host_rate.
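Selecting a feature subset such as this one amounts to projecting each connection record onto the chosen attribute names. The sketch below assumes records are dictionaries keyed by lowercase KDD-style attribute names; the record values and the helper name are hypothetical.

```python
# The eight attributes attributed to Mukkamala in the slide, in lowercase.
MUKKAMALA_FEATURES = [
    "src_bytes", "dst_bytes", "count", "srv_count",
    "dst_host_count", "dst_host_srv_count",
    "dst_host_same_src_port_rate", "dst_host_srv_diff_host_rate",
]


def select_features(record, names=MUKKAMALA_FEATURES):
    """Return the chosen attributes of one connection record as a vector."""
    return [float(record[name]) for name in names]


# Hypothetical record: the selected attributes plus two that are dropped.
record = {name: i for i, name in enumerate(MUKKAMALA_FEATURES)}
record.update({"duration": 12, "num_failed_logins": 0})
vector = select_features(record)
```

The same helper works for any other subset by passing a different list of names.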

Selected Features
Previous work by Natesan used 24 features: Duration, protocol_type, Service, Flag, src_bytes, dst_bytes, Hot, num_failed_logins, logged-in, num_compromised, root_shell, num_root, num_file_creations, num_shells, num_access_files, is_host_login, is_guest_login, Count, serror_rate, rerror_rate, diff_srv_rate, dst_host_count, dst_host_diff_srv_rate, dst_host_srv_serror_rate.

Proposed Features

Data Pre-processing

Simulation Experiment

Simulation Process Design

Experiment Result
• Using the payload can improve the accuracy of the IDS in detecting R2L attacks.
• Using SVM with the RBF kernel, detection accuracy reaches up to 98.2%.
• In the experiments, the average detection rate over all feature sets is best with 28 features that include the payload.