Roland Kwitt & Tobias Strohmeier

Presentation transcript:

Towards Anomaly Detection in Network Traffic by Statistical Means and Machine Learning
Roland Kwitt & Tobias Strohmeier
Salzburg Research Forschungsgesellschaft m.b.H. | Jakob-Haringer-Str. 5/III | A-5020 Salzburg | T +43.662.2288-200 | F +43.662.2288-222 | info@salzburgresearch.at | www.salzburgresearch.at

Overview
- Motivation
- Dependency between Anomalies and Attacks
- The Detection Process
  - Assumptions
  - Statistical Analysis
  - Machine Learning
- Results
- Further Work

12.11.2018 © Salzburg Research

Motivation
- Attacks against computer networks increase dramatically every year.
- Security solutions based merely on firewalls are no longer enough → prevention + detection of malicious activity.
- Attacks tend to vary over time → signature-based approaches lack flexibility.
- Anomaly detection provides a means to detect variations of old attacks as well as novel attacks.

Dependency between Anomalies and Attacks
There are a great many possibilities. Examples:
- Network probes or scans are necessarily anomalous, since they seek information that legitimate users already possess.
- Many attacks rely on the attacker's ability to construct client protocols themselves → in most cases the target environment is not duplicated carefully enough.
- Many successfully executed attacks result in so-called response anomalies, for example due to the exploitation of implementation flaws.
- Deliberately manipulated protocols are designed to pass improperly configured firewall systems.

The Detection Process (1)::Assumptions
- Many malicious activities deviate from normal activities.
- A subset of them can be detected by monitoring the distributions of certain packet header fields.
- By monitoring benign traffic, a baseline profile describing normal traffic behavior can be established.
  - Do we have enough (almost) anomaly-free traffic to serve as training data?
  - Is the training data representative of normal traffic conditions?
- We assume that the data-generating process is (weakly) stationary, which is an idealization.

The Detection Process (1)::Assumptions (Sample)
[Figure with panels: Training, Test - Normal, Test - Attack]

The Detection Process (2)::Statistical Analysis (1)
- Let a random variable X denote whether a monitored header field takes on a certain value or not → Bernoulli experiment → X ~ Bernoulli(p).
- Let a random variable Y denote the number of successes in v executions of the same experiment → Y ~ Binomial(v, p).
- In fact, we observe the whole domain D_k of a header field (k denotes the k-th header field). Each random experiment can thus result in m_k = |D_k| outcomes → multinomial distribution.
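Written out, the three models on this slide are the standard Bernoulli, binomial and multinomial distributions. The count variables N_j, their observed values n_j and the cell probabilities p_j are notation introduced here for illustration (the slide itself only names X, Y, v, D_k and m_k), and the binomial and multinomial forms assume that the v observations are independent:

```latex
\begin{align*}
X &\sim \mathrm{Bernoulli}(p):
  & P(X = 1) &= p, \quad P(X = 0) = 1 - p \\
Y &\sim \mathrm{Binomial}(v, p):
  & P(Y = y) &= \binom{v}{y}\, p^{y} (1 - p)^{v - y}, \quad y = 0, \dots, v \\
(N_1, \dots, N_{m_k}) &\sim \mathrm{Multinomial}(v;\, p_1, \dots, p_{m_k}):
  & P(N_1 = n_1, \dots, N_{m_k} = n_{m_k})
  &= \frac{v!}{n_1! \cdots n_{m_k}!} \prod_{j=1}^{m_k} p_j^{\,n_j}
\end{align*}
% Constraints: \sum_{j=1}^{m_k} p_j = 1 and \sum_{j=1}^{m_k} n_j = v.
% Maximum likelihood estimate of each cell probability: \hat{p}_j = n_j / v.
```

Here N_j counts how often the j-th value of the domain D_k was observed among the v packets, so the maximum likelihood estimate of p_j is simply the relative frequency of that value.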

The Detection Process (2)::Statistical Analysis (2)
- We introduce a learning window L and a (sliding) test window T of n packets.
- For both windows, we calculate the maximum likelihood estimates (MLEs) of the multinomial distributions.
- The MLEs of L are the expected probabilities under normal traffic conditions → the difference between the MLEs of both windows is an indicator of anomalous activity.
- Problem: the differences fluctuate too much → calculate the ECDFs (empirical cumulative distribution functions) of the MLE differences.
- Same scheme: learning window + (sliding) test window of n fluctuations.
- Determine the difference between the areas under both ECDFs (complexity O(1)) → anomaly score for each header field.
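The steps on this slide can be sketched for a single header field as follows. This is a minimal illustration, not the authors' implementation: the slide does not say how the per-value MLE differences are combined, so the sketch assumes the sum of absolute differences (L1 distance). The function names, the TTL-like domain, the window sizes and the synthetic traffic are all hypothetical.

```python
import numpy as np

def mle(window, domain):
    """Multinomial MLE: relative frequency of each domain value in the window."""
    window = np.asarray(window)
    return np.array([(window == v).mean() for v in domain])

def mle_differences(learning, stream, n, domain):
    """L1 distance (assumed) between the learning-window MLE and the MLE of
    each sliding test window of n packets."""
    p_learn = mle(learning, domain)
    return np.array([
        np.abs(mle(stream[i:i + n], domain) - p_learn).sum()
        for i in range(len(stream) - n + 1)
    ])

def ecdf_area(sample, lo, hi):
    """Area under the ECDF of `sample` on [lo, hi] (step-function integral)."""
    x = np.sort(np.clip(sample, lo, hi))
    grid = np.concatenate(([lo], x, [hi]))
    heights = np.arange(len(x) + 1) / len(x)  # ECDF value on each grid interval
    return float(np.sum(heights * np.diff(grid)))

def anomaly_score(ref_fluct, test_fluct, lo=0.0, hi=2.0):
    """Difference between the areas under both ECDFs; larger means more anomalous.
    The L1 distance between two probability vectors lies in [0, 2]."""
    return ecdf_area(ref_fluct, lo, hi) - ecdf_area(test_fluct, lo, hi)

# Synthetic traffic (hypothetical): one header field with three possible values.
rng = np.random.default_rng(0)
domain = [64, 128, 255]
benign_p = [0.7, 0.2, 0.1]
learning = rng.choice(domain, size=2000, p=benign_p)
reference = rng.choice(domain, size=500, p=benign_p)   # benign fluctuations
normal = rng.choice(domain, size=500, p=benign_p)      # benign test traffic
attack = rng.choice(domain, size=500, p=[0.2, 0.2, 0.6])

n = 50
ref_fluct = mle_differences(learning, reference, n, domain)
score_normal = anomaly_score(ref_fluct, mle_differences(learning, normal, n, domain))
score_attack = anomaly_score(ref_fluct, mle_differences(learning, attack, n, domain))
print(score_normal, score_attack)
```

On a common interval [lo, hi], the area under an ECDF equals hi minus the sample mean, so the score above reduces to the difference of the two window means. Keeping running means while the window slides would make each update constant-time, which is one possible reading of the O(1) remark on the slide.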

The Detection Process (3)::Machine Learning
- Problem: reduce the k-dimensional anomaly vector to a 1-dimensional anomaly score.
- Solution: Self-Organizing Map (SOM, unsupervised learning).
- In the training phase (normal traffic), the SOM builds cluster centers for normal anomaly vectors → the SOM learns the usual ECDF differences under benign traffic conditions.
- Anomalously high ECDF differences result in anomalous vectors → no suitable cluster center can be found → the distance (quantization error) to the nearest cluster center is very high.
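The idea of using the quantization error as a one-dimensional score can be illustrated with a very small SOM written in NumPy. This is a sketch under stated assumptions, not the authors' system: the class name `TinySOM`, the 3x3 grid, the linear decay schedules, the synthetic anomaly vectors and the max-based threshold are all hypothetical choices made for the example.

```python
import numpy as np

class TinySOM:
    """Minimal self-organizing map on a rectangular grid (illustrative only)."""

    def __init__(self, rows, cols, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.weights = rng.random((rows * cols, dim))  # cluster centers
        self.grid = np.array(
            [(r, c) for r in range(rows) for c in range(cols)], dtype=float
        )

    def train(self, data, epochs=20, lr0=0.5, sigma0=1.5, seed=1):
        rng = np.random.default_rng(seed)
        steps = epochs * len(data)
        t = 0
        for _ in range(epochs):
            for x in rng.permutation(data):
                frac = t / steps
                lr = lr0 * (1.0 - frac)                  # decaying learning rate
                sigma = sigma0 * (1.0 - frac) + 1e-3     # shrinking neighborhood
                bmu = np.argmin(np.linalg.norm(self.weights - x, axis=1))
                d2 = np.sum((self.grid - self.grid[bmu]) ** 2, axis=1)
                h = np.exp(-d2 / (2.0 * sigma ** 2))     # Gaussian neighborhood
                self.weights += lr * h[:, None] * (x - self.weights)
                t += 1

    def quantization_error(self, x):
        """Distance from x to the nearest cluster center (the anomaly score)."""
        return float(np.min(np.linalg.norm(self.weights - x, axis=1)))

# Synthetic "normal" anomaly vectors: small ECDF differences for 4 header fields.
rng = np.random.default_rng(42)
train_vectors = np.abs(rng.normal(0.05, 0.02, size=(200, 4)))

som = TinySOM(rows=3, cols=3, dim=4)
som.train(train_vectors)

# Hypothetical threshold: the largest quantization error seen on training data.
threshold = max(som.quantization_error(v) for v in train_vectors)

typical = np.array([0.05, 0.05, 0.05, 0.05])
anomalous = np.array([0.90, 0.80, 0.05, 0.05])  # two fields deviate strongly
print(som.quantization_error(typical), som.quantization_error(anomalous), threshold)
```

Because the map is trained only on benign anomaly vectors, a vector with unusually large components has no nearby cluster center, and its quantization error stands out against the errors observed during training.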

Results (1)
[Figure: Monitor Stack]

Results (2)
Source: DARPA 1999 ID Data Set. Training = Week 1, Day 1; Test = Week 2, Day 4; Host = Marx.

Results (3)
Source: DARPA 1999 ID Data Set. Training = Week 1, Day 1; Test = Week 2, Day 4; Host = Marx.

Further Work
- Eliminate the assumption of stationarity → determine stationary intervals → piecewise stationarity.
- Introduce higher-level features such as connection details or temporal statistics (for example, the number of connections in t seconds).
- Evaluate other machine learning methods (Growing Neural Gas, Growing Cell Structures, …).
- Test with Endace's DAG card to eliminate the performance bottleneck caused by libpcap.

Thanks for your attention!