Models and Security Requirements for IDS

Overview
The system and attack model
Security requirements for IDS
  – Sensitivity
  – Detection
Analysis methodology
IDS satisfying the framework
Combinatorial tools in intrusion detection

The system and attack model
The model of the system:
  – Scenario
      What are the elements of the network?
  – Connectivity
      How are these elements connected?
  – Action
      What traffic is sent between these elements?

The system and attack model
Scenario
  – A large network, called an Autonomous System (AS).
  – An AS can have many points of entry, called the Border Gateways (BG) of the AS.

The system and attack model
Connectivity
  – The traffic is generated by external users.
  – Each user (U) can send traffic to each BG.
[Figure: users (U) sending traffic to the border gateways (BG) of the AS]

The system and attack model
Action (1)
  – The network traffic is a sequence of atomic packets.
  – The abstraction of a packet: p = (sid, time, poe, pl)
      sid – the identity of the sender (U)
      time – a timestamp of the action
      poe – the point of entry (BG)
      pl – the payload, i.e. what is actually sent.
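The packet abstraction above can be written down directly as a record type. This is a minimal sketch: the field names follow the slide, but the field types and the sample values are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    sid: str     # identity of the sender (a user U); type is an assumption
    time: float  # timestamp of the action; type is an assumption
    poe: str     # point of entry (a border gateway BG)
    pl: bytes    # payload: what is actually sent


# Network traffic is a time-ordered sequence of atomic packets.
# The values below are hypothetical.
stream = [
    Packet(sid="U1", time=0.0, poe="BG1", pl=b"GET /"),
    Packet(sid="U2", time=0.5, poe="BG2", pl=b"PING"),
]
```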

The system and attack model
Action (2)
  – At any time, the action in an AS is a stream of packets entering the AS through any of its BGs.
  – Each packet in this stream can trigger an event in the AS.

The system and attack model
The model of an attack (1)
  – An attack is any sequence of c packets, c ≥ 1, that successfully alters the state of the nodes (hosts) in an AS in order to achieve a specific (malicious) goal.
  – Let σ_t be the state of the AS at the time instant t. The state may include, for example:
      Available bandwidth
      Internal states of all hosts within the AS.

The system and attack model
The model of an attack (2)
  – We can then define a polynomial-time computable predicate (predicates are functions that take binary values) π(1^n, t, σ_t)
      n – a security parameter
      1^n – the input, a unary string of length n

The system and attack model
The model of an attack (3)
  – Attack (1)
      A probability distribution A over all packet sequences ps = (p_1,…,p_l).
      Samples from this distribution can be obtained efficiently (efficiently samplable distribution).
      The probability that the experiment E(A) is unsuccessful is negligible, i.e. smaller than 1/p(n), for all positive polynomials p and all sufficiently large n.

The system and attack model
The model of an attack (4)
  – Attack (2)
      The experiment E(D), for any distribution D:
        – A sequence ps of packets is drawn from D
        – The sequence ps is sent to the network
        – The AS turns into the state σ_t
        – The predicate π(1^n, t, σ_t) evaluates to the value b ∈ {0,1}
      E(D) is successful if b = 1.
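The experiment E(D) can be sketched as a short simulation. Everything concrete here is a hypothetical toy chosen for illustration: the network state is a single counter of unacknowledged SYN packets, the predicate ignores the security parameter and the time instant, and the two sampling functions stand in for an attack distribution A and a normal traffic distribution N.

```python
import random


def network_state(packets):
    """Toy AS state: number of SYN packets that were never acknowledged."""
    return {"half_open": packets.count("SYN") - packets.count("ACK")}


def predicate(state):
    """Toy predicate: returns 1 when the (malicious) goal is reached."""
    return 1 if state["half_open"] > 3 else 0


def sample_attack(rng):
    """Stand-in for an attack distribution A: a burst of 5 to 8 SYNs."""
    return ["SYN"] * rng.randint(5, 8)


def sample_normal(rng):
    """Stand-in for a normal distribution N: every SYN is acknowledged."""
    return ["SYN", "ACK"] * rng.randint(1, 3)


def experiment(sample, rng):
    ps = sample(rng)           # a sequence of packets is drawn from D
    state = network_state(ps)  # the sequence is sent; the AS changes state
    b = predicate(state)       # the predicate evaluates to b in {0, 1}
    return b == 1              # E(D) is successful if b = 1


def success_rate(sample, trials=1000, seed=0):
    rng = random.Random(seed)
    return sum(experiment(sample, rng) for _ in range(trials)) / trials
```

In this toy setting E(A) always succeeds and E(N) never does, which mirrors the two requirements on the slides: failure of E(A) is negligible and success of E(N) is negligible.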

The system and attack model
The model of an attack (5)
  – A class of attacks C = {A_1, A_2, …}
  – Normal traffic distribution
      An efficiently samplable probability distribution N over the set of packets, such that the probability that the experiment E(N) is successful is negligible.

The system and attack model
The model of an IDS (1)
  – An IDS is a triple of algorithms:
      A representation algorithm R (data filtering, formatting, feature selection, etc.)
      A data structure algorithm S (data collection, aggregation, knowledge base creation, etc.)
      A classification algorithm C (detection in all forms: pattern-based, rule-based, anomaly-based, response, refinement, information tracing, visualization, etc.)

The system and attack model
The model of an IDS (2)
  – Two phases in the execution of an IDS:
      An initialization phase
      A detection phase.
  – The algorithm S is run in the initialization phase.
  – The algorithm C is run in the detection phase.
  – Both S and C use the algorithm R as a subroutine.

The system and attack model
The model of an IDS (3)
  – In the initialization phase:
      The algorithm S uses the algorithm R to process a stream of packet data obtained from normal traffic distributions or known attack distributions.
      The output from the algorithm S is a data structure that will be used in the detection phase.
      It is assumed that the traffic generated in the initialization phase is not subject to an attack, unless it simulates a known attack.

The system and attack model
The model of an IDS (4)
  – In the detection phase:
      The algorithm C is run on the input data structure and a sequence of traffic packets (possibly subject to a known or a new attack).
      It returns an assessment of whether the input sequence of packets contains an attack (and if so, whether this attack is new).
      The algorithm R maps the sequence of packets entering the AS into a fixed-length tuple having a more compact form (e.g. a point in a high-dimensional space).
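The two-phase structure of the triple (R, S, C) can be sketched as three small functions. This is only an illustration under strong assumptions: packets are reduced to their payload lengths, R computes two hypothetical features (mean and maximum length), S stores the representations of normal windows, and C applies a hypothetical distance threshold. None of these choices come from the slides.

```python
def R(window):
    """Representation: map a window of payload lengths to a fixed 2-tuple."""
    return (sum(window) / len(window), float(max(window)))


def S(normal_stream, rw):
    """Initialization phase: build the data structure ds from normal traffic,
    calling R on consecutive windows of rw packets."""
    return [R(normal_stream[i:i + rw])
            for i in range(0, len(normal_stream) - rw + 1, rw)]


def C(ds, stream, rw, threshold):
    """Detection phase: report an attack if some window's representation
    is farther than `threshold` from every point stored in ds."""
    for i in range(0, len(stream) - rw + 1, rw):
        r = R(stream[i:i + rw])
        nearest = min(((r[0] - v[0]) ** 2 + (r[1] - v[1]) ** 2) ** 0.5
                      for v in ds)
        if nearest > threshold:
            return "attack"
    return "normal"


# Hypothetical payload lengths observed during initialization.
normal = [40, 42, 41, 39, 40, 43, 41, 40]
ds = S(normal, rw=4)
```

With this ds, a call such as C(ds, [41, 40, 42, 40], rw=4, threshold=5.0) stays close to the stored points, while a window containing two 1500-byte payloads does not.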

Security requirements for IDS
Given the following:
  – A security parameter n
  – Normal traffic distribution N
  – (Known) attack distributions A_1,…,A_t
N, A_1,…,A_t are efficiently samplable and pairwise disjoint.

Security requirements for IDS
An IDS is a triple of polynomial-time algorithms R, S, C such that:
  – Given a sequence ps of rw packets, algorithm R returns a d-tuple r.
  – Given distributions N, A_1,…,A_t, algorithm S returns a data structure ds of size at most m_init.
  – Given a data structure ds, a sequence ps of m_det packets, a detection window dw and a class of attacks C_1, algorithm C returns a classification value out.

Security requirements for IDS
IDS data (1):
  rw – representation window
      the window of packets used in a single execution of R
      usually a small value.
  m_init – the length of the stream of packets used in the initialization phase.

Security requirements for IDS
IDS data (2):
  m_det – the length of the stream of packets used in the detection phase, to be classified by algorithm C
      Considered arbitrarily large, but polynomially dependent on n and rw.
  dw – detection window: the maximum distance between the first and the last packet of an attack sequence within the stream of m_det packets.

Security requirements for IDS
In general, rw, d, m_init, m_det and dw are all bounded by a polynomial in n.
A typical setting:
  rw = O(n)
  d = O(1)
  m_init = n^a
  m_det = n^b
  rw ≤ dw ≤ m_det
where a, b > 1 are potentially large constants.

Security requirements for IDS
An IDS can satisfy two requirements:
  – Sensitivity
  – Detection

Sensitivity
We would like the output d-tuple of the algorithm R to capture differences between normal traffic and attack traffic.
Capturing these differences is formalized using the notion of computational distinguishability.
We require this distinguishability with respect to a single sample of the distributions, because an attack may be executed only once.

Sensitivity
Informal definition of sensitivity (1):
  – A is an attack distribution
  – N is a normal traffic distribution
  – The sensitivity of a representation algorithm R is defined on the basis of the distinguishability of the packet streams taken from the distributions A and N.

Sensitivity
Informal definition of sensitivity (2):
  – The measure of sensitivity is probabilistic: it describes the probability (the sensitivity probability) with which an attack distribution A can be distinguished from a normal traffic distribution N.
The definition of sensitivity can be generalized to families of distributions.

Detection
The sensitivity requirement says that the representation algorithm R should give different outputs on fixed-window attack and normal traffic packet streams.
It does not clarify anything about the nature of this difference.
It does not give any constructive algorithm to decide which of two different outputs is of which type.

Detection
We would like the algorithms S and C to directly provide “good enough” detection properties on arbitrarily large traffic sequences, as long as the algorithm R has “good enough” sensitivity properties on small and fixed traffic sequences.

Detection
Operation of an IDS (1):
  – In the first phase, the data structure algorithm S is given access to a stream of m_init packets and can run the representation algorithm R on inputs of length rw.
  – S is allowed to query both the normal traffic distribution N and several (known) attack distributions A_1,…,A_t.
  – At the end of the first phase, S returns the data structure ds.

Detection
Operation of an IDS (2):
  – A sequence q of dw packets is generated, and the classification algorithm C returns an output out saying whether q contains a sample from one of the known attacks A_1,…,A_t, a different (unknown) attack A, or no attack at all.
  – The IDS is successful if this classification is correct.

Detection
Informal definition of detection:
  – If A is an attack distribution (potentially unknown), the IDS will detect that the given packet sequence q originates from A with a certain probability (the detection probability), for any q.
This definition can also be generalized to classes of attack distributions.

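Detection
The detection probability is always smaller than the sensitivity probability.
An IDS is considered a “good” detector if its detection probability is close to its sensitivity probability.
If A is not distinguishable from N (i.e. the sensitivity probability is 0), then no pair of algorithms S, C can be a detector.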

Analysis methodology
An ideal methodology to analyze an IDS would prove that it satisfies:
  – The sensitivity requirement (for some appropriate parameter values)
  – The detection requirement (for some appropriate parameter values) under the assumption that it satisfies the sensitivity requirement.

Analysis methodology
A mathematical proof that an IDS satisfies the sensitivity requirement is difficult to obtain, because of the unpredictable nature of a generic unknown attack.
For that reason, the sensitivity of the representation algorithm is validated by simulation.
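A simulation of this kind can be sketched as a Monte Carlo estimate of how often a single sample from the attack distribution is told apart from a single sample of normal traffic. The distributions, the representation (maximum payload length) and the threshold distinguisher below are hypothetical stand-ins, not the ones used by the authors.

```python
import random


def R(window):
    """Toy representation: the largest payload length in the window."""
    return max(window)


def sample_normal(rng, rw):
    """Hypothetical normal traffic: payload lengths between 40 and 60."""
    return [rng.randint(40, 60) for _ in range(rw)]


def sample_attack(rng, rw):
    """Hypothetical attack: half of the time one payload is oversized."""
    window = sample_normal(rng, rw)
    if rng.random() < 0.5:
        window[rng.randrange(rw)] = 1000
    return window


def distinguisher(r):
    """Outputs 1 ("attack") when the representation looks abnormal."""
    return 1 if r > 500 else 0


def estimate_advantage(trials, rw, seed=0):
    """Estimate |Pr[D(R(a)) = 1] - Pr[D(R(n)) = 1]| from single samples."""
    rng = random.Random(seed)
    hits_attack = sum(distinguisher(R(sample_attack(rng, rw)))
                      for _ in range(trials))
    hits_normal = sum(distinguisher(R(sample_normal(rng, rw)))
                      for _ in range(trials))
    return abs(hits_attack - hits_normal) / trials


advantage = estimate_advantage(trials=10_000, rw=8)
```

In this toy setting the estimate should come out near 0.5, since only about half of the sampled attack windows carry the oversized payload.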

Analysis methodology
Once the sensitivity property is validated for the representation algorithm R, the challenge is to formally prove that the given IDS is a detector.

IDS satisfying the framework
IDS-1
  – The algorithm C is based on approximate nearest neighbour search.
IDS-2
  – The algorithm C is based on clustering
  – It allows for more than one distribution for normal traffic
  – The class of attacks detectable with IDS-2 is larger than that of IDS-1.

IDS satisfying the framework
Approximate nearest neighbour search problem (1)
  – V is a vector space of dimension d.
  – Δ is a distance function defined over V.
  – Given a set Q of k d-component vectors in V, an error parameter ε and a d-component vector q ∈ V, we define a (1+ε)-approximate nearest neighbour of q as a vector v in Q such that Δ(q,v) ≤ (1+ε)·Δ(q,w), for any w ∈ Q.
  – Problem: find a (1+ε)-approximate nearest neighbour in Q for any q ∈ V.

IDS satisfying the framework
Approximate nearest neighbour search problem (2)
  – A solution is a pair of algorithms (Init, Search):
      On input a k-size set Q of d-length vectors and parameters ε and δ, the algorithm Init returns a data structure ds.
      On input a data structure ds, a vector q and the parameter ε, the algorithm Search returns a vector v.
      With probability at least δ, v ∈ Q and v is a (1+ε)-approximate nearest neighbour of q.

IDS satisfying the framework
Approximate nearest neighbour search problem (3)
  – The algorithm Init must run in time polynomial in k and d.
  – The algorithm Search must run in time polynomial in d and log k.
  – Init is used in the initialization phase (off-line).
  – Search is used in the detection phase (on-line).
  – Such algorithms Init and Search exist.
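The (Init, Search) interface and the approximation condition can be illustrated with a brute-force sketch. Note the swap: this Search scans all k vectors, so it runs in time linear in k and does not meet the poly(d, log k) bound required above. It only shows the input/output contract. Euclidean distance is an assumed choice of distance function, and the function names are hypothetical.

```python
import math


def init(Q):
    """Init: here the data structure ds is just a copy of the set Q."""
    return list(Q)


def search(ds, q):
    """Search by linear scan: returns the exact nearest neighbour, which is
    a (1 + eps)-approximate nearest neighbour for every eps >= 0."""
    return min(ds, key=lambda v: math.dist(q, v))


def is_approx_nn(Q, q, v, eps):
    """Check the definition: v is in Q and dist(q, v) <= (1 + eps) * dist(q, w)
    for every w in Q."""
    return v in Q and all(
        math.dist(q, v) <= (1 + eps) * math.dist(q, w) for w in Q
    )


# Hypothetical 2-dimensional example.
Q = [(0.0, 0.0), (3.0, 0.0), (10.0, 10.0)]
q = (1.0, 0.0)
ds = init(Q)
v = search(ds, q)
```

Here (3.0, 0.0) lies at distance 2 from q while the true nearest neighbour lies at distance 1, so it passes the check for eps = 1.0 but fails it for eps = 0.5.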

Combinatorial tools in ID
We would like to have an IDS with an arbitrary detection window.
We start with IDS_1 = (R_1, S_1, C_1) with the representation window rw_1 and detection window dw_1 = k.
IDS_1 with its level of sensitivity can detect attacks having l effective packets.

Combinatorial tools in ID
We construct IDS_2 = (R_2, S_2, C_2) from IDS_1, with representation window rw_2 and detection window dw_2 = m.
This can be done by means of an (l,k,m)-covering set system, a combinatorial object.

Combinatorial tools in ID
Covering set system (covering design) (1)
  – l, k, m – positive integers.
  – S – a set of cardinality m.
  – T = {T_1,…,T_s} – a set of subsets of S, each of cardinality k.
  – T is an (l,k,m)-covering set system for S if for any S_i ⊆ S of cardinality l, there exists a subset T_j ∈ T such that S_i ⊆ T_j.

Combinatorial tools in ID
Covering set system (2)
  – The space efficiency of the covering set system T is the cardinality s of T (can be a function of l, k, m).
  – The time efficiency of T is the running time (as a function of l, k, m) that an algorithm takes to construct T.
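The covering property can be checked mechanically. The sketch below verifies the definition by brute force and builds the trivial covering made of all k-subsets of S; the function names are hypothetical, and the trivial construction is shown only because it is obviously correct, not because it is space-efficient.

```python
from itertools import combinations


def is_covering(T, l, k, m):
    """True if T is an (l, k, m)-covering set system for S = {1, ..., m}."""
    S = range(1, m + 1)
    blocks = [frozenset(block) for block in T]
    if any(len(block) != k or not block <= set(S) for block in blocks):
        return False
    # Every l-subset of S must be contained in at least one block.
    return all(any(set(sub) <= block for block in blocks)
               for sub in combinations(S, l))


def trivial_covering(k, m):
    """All k-subsets of S: always a covering (for l <= k), with s = C(m, k)."""
    return [set(c) for c in combinations(range(1, m + 1), k)]


# A smaller (2, 3, 5)-covering set system with only s = 4 blocks.
small = [{1, 2, 3}, {1, 4, 5}, {2, 3, 4}, {2, 3, 5}]
```

For l = 2, k = 3, m = 5 the trivial system has space efficiency s = 10, while `small` covers the same ten pairs with s = 4.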

Combinatorial tools in ID
Starting from IDS_1 = (R_1, S_1, C_1) with representation window rw_1 and detection window dw_1 = k, and given an (l,k,m)-covering set system for S = {1,…,m} with time efficiency t and space efficiency s, it is possible to construct IDS_2 = (R_2, S_2, C_2) with rw_2 = rw_1 and dw_2 = m, for any m polynomial in k, where C_2 runs in time O(t + s·time(C_1)).
R_2 = R_1, S_2 = S_1.
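One way to read this construction is that C_2 runs C_1 once per block of the covering set system, on the k packets whose positions the block selects, and reports an attack if any run does. That reading accounts for the s·time(C_1) term in the running time. The sketch below follows this reading as an assumption; the toy classifier c1 and the packet labels are hypothetical.

```python
def c2(c1, ds, packets, covering):
    """Classify a window of m packets with a classifier c1 that only
    handles k packets: run c1 on each block of the covering set system."""
    for block in covering:
        # Positions in the covering are 1-based indices into the window.
        sub = [packets[i - 1] for i in sorted(block)]
        if c1(ds, sub) == "attack":
            return "attack"
    return "normal"


def c1(ds, sub):
    """Toy C_1 for k = 3: the attack has l = 2 effective packets, "A" and "B",
    and is detected only when both fall inside the same k-packet window."""
    return "attack" if "A" in sub and "B" in sub else "normal"


# A (2, 3, 5)-covering set system for S = {1, ..., 5}.
covering = [{1, 2, 3}, {1, 4, 5}, {2, 3, 4}, {2, 3, 5}]

attack_window = ["x", "A", "x", "x", "B"]  # effective packets at 2 and 5
normal_window = ["x", "A", "x", "x", "x"]
```

Because every pair of positions lies in some block, the two effective packets always end up together in at least one call to c1, wherever they sit in the window of m = 5 packets.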