Carnegie Mellon Selected Topics in Automated Diversity Stephanie Forrest University of New Mexico Mike Reiter Dawn Song Carnegie Mellon University.

1 Carnegie Mellon Selected Topics in Automated Diversity
Stephanie Forrest, University of New Mexico
Mike Reiter and Dawn Song, Carnegie Mellon University

2 Carnegie Mellon Automated Diversity for Security
Computer systems are highly uniform:
  - Easy targets for standardized attacks.
Use the idea of biological diversity:
  - Introduce changes that make each system unique.
  - An attack will need to be rewritten for each computer.
  - Provide population resilience to unknown environmental threats.
Two approaches:
  - Interface diversity: adapt vulnerable interfaces such as machine language, system call numbers, and standard library locations.
  - Implementation diversity: utilize diverse implementations of common services.
Two projects:
  - Randomized instruction set emulation [Barrantes, Ackley and Forrest]
  - Behavioral distance for anomaly detection [Gao, Reiter and Song]

3 Carnegie Mellon Randomized Instruction Set Emulation (RISE)
An example of interface diversity.
Many current attacks insert binary code into a running program, which then executes it.
RISE protects the code itself, rather than points of entry:
  - Perimeter defense (e.g., stack protection) is not enough.
Randomize the binary code instruction set for every program:
  - Foreign malicious code will try to execute in the standard format and will fail.
  - Knowledge of a particular translation will gain access only to that particular program.
Modify the compiler/virtual machine to accept this "new" language:
  - Prototype in the open-source binary-to-binary translator Valgrind.
  - Related to encrypting compilers.
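The randomization idea can be illustrated with a toy model. The sketch below assumes the scrambling is a byte-wise XOR with a per-program random mask; the slides mention a mask but do not spell out the scheme, so that choice, the function names, and the placeholder byte strings are all assumptions for illustration only.

```python
import secrets

def xor_bytes(data: bytes, mask: bytes) -> bytes:
    """XOR each byte with the mask byte at the same offset (the mask repeats if shorter)."""
    return bytes(b ^ mask[i % len(mask)] for i, b in enumerate(data))

def load_program(code: bytes, mask: bytes) -> bytes:
    """Scramble legitimate code at load time (hypothetical loader step)."""
    return xor_bytes(code, mask)

def fetch(memory: bytes, mask: bytes) -> bytes:
    """Descramble bytes just before execution (hypothetical emulator step)."""
    return xor_bytes(memory, mask)

# Placeholder bytes standing in for legitimate code; not a real program.
program = bytes.fromhex("5589e55dc3")
mask = secrets.token_bytes(len(program))   # per-program secret

memory = load_program(program, mask)
assert fetch(memory, mask) == program      # legitimate code round-trips

# Injected code was never scrambled, so descrambling it yields different bytes
# (unless the random mask happens to be all zeros).
injected = bytes.fromhex("9090909090")
print(fetch(injected, mask).hex())
```

In this model an attacker who does not know the mask cannot predict what the injected bytes will turn into, which is the property the slide describes.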

4 Carnegie Mellon How does foreign code infect a running program?

8 Results
Prototype implementation available under GPL from http://www.cs.unm.edu/~immsec:
  - Normal code runs properly.
  - Binary code injection attacks stopped (100% of tested examples).
Performance (preliminary):
  - Emulation overhead of Valgrind is high.
  - Incremental cost of RISE is small.
  - (Very) roughly a factor of 2 slowdown in the current configuration.
  - Significant space penalty:
      - Libraries
      - Mask

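10 Host-Based Anomaly Detector
[Figure: a detector between user space and kernel space checks each system call request against a model (diagram labels: "Model", "3511", "Anomalous? (Y/N)").]
Is this system call request anomalous?
Can we use another computer as the model?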

11 Carnegie Mellon Fault-Tolerant System
Commercial off-the-shelf applications may not produce the same responses.
Some intrusions do not result in an observable deviation in the responses.
Need to observe the behavior.

12 Carnegie Mellon The Problem
[Figure: two system call sequences, 343534 and 9630210466222. Match?]
Diverse platform (Linux and Windows):
  - System call numbers observed do not have semantic meanings.
  - System calls may not have one-to-one correspondence.
  - System call sequences may have different lengths.
Diverse implementation (Apache and Abyss):
  - Correspondence may not exist between individual system calls.

13 Carnegie Mellon Evolutionary Distance
Are two DNA sequences derived from a common ancestral sequence?
Evolutionary distance between two DNA sequences counts:
  - Substitutions
  - Deletions
  - Insertions

Example sequences:
  ATGCGTCGTT
  ATCCGCGAT

Aligned with insertion/deletion (I/D) symbols ("-"):
  ATGC-GTCGTT
  AT-CCG-CGAT

Cost table (lower triangle; "?" marks an entry missing from the source):
       A     C     G     T     -
  A    0
  C    0.3   0
  G    0.1   ?     0
  T    0.2   ?     0.1   0
  -    0.3   0.6   0.5   0.8   0
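A distance of this kind is commonly computed with dynamic programming over substitutions, insertions, and deletions. The sketch below is a generic weighted edit distance, not necessarily the exact procedure behind the slide; the `unit_cost` function is a stand-in assumption, since the slide's cost table is incomplete in this transcript.

```python
GAP = "-"  # insertion/deletion (I/D) symbol

def alignment_distance(s1, s2, cost):
    """Minimum total cost of aligning s1 with s2.

    cost(a, b) prices a substitution; cost(a, GAP) and cost(GAP, b)
    price a deletion and an insertion respectively.
    """
    rows, cols = len(s1) + 1, len(s2) + 1
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        d[i][0] = d[i - 1][0] + cost(s1[i - 1], GAP)
    for j in range(1, cols):
        d[0][j] = d[0][j - 1] + cost(GAP, s2[j - 1])
    for i in range(1, rows):
        for j in range(1, cols):
            d[i][j] = min(
                d[i - 1][j - 1] + cost(s1[i - 1], s2[j - 1]),  # substitute or match
                d[i - 1][j] + cost(s1[i - 1], GAP),            # delete from s1
                d[i][j - 1] + cost(GAP, s2[j - 1]),            # insert into s1
            )
    return d[-1][-1]

def unit_cost(a, b):
    """Assumed placeholder costs: 0 for a match, 1 for anything else."""
    return 0.0 if a == b else 1.0

print(alignment_distance("ATGCGTCGTT", "ATCCGCGAT", unit_cost))
```

Swapping `unit_cost` for a lookup into a real cost table changes the distance without changing the algorithm.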

14 Carnegie Mellon Behavioral Distance and Evolutionary Distance
Similarities:
  - Both evaluate the difference between two sequences.
  - Both use substitutions, deletions, and insertions.
Differences:
  - The same system call number in two sequences does not denote the "same" call.
  - No cost table is given in advance for the behavioral distance measure.
  - We have training data.

15 Carnegie Mellon Behavioral Distance
Behavioral distance calculation
Learning the cost table:
  - Initializing the cost table
  - Iteratively updating the cost table
System call phrase extraction

16 Carnegie Mellon Behavioral Distance Calculation
For a sequence s and a target length n: the set of sequences obtained by inserting n-len(s) I/D symbols into s, at any location.
Example:
  ATGCGTCGTT  ->  ATGC-GTCGTT
  ATCCGCGAT   ->  AT-CCG-CGAT
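The set described on this slide can be enumerated directly for small inputs. The following sketch is only meant to make the definition concrete; the function name `extensions` and the use of "-" as the I/D symbol are assumptions, and a practical distance computation would use dynamic programming rather than enumeration.

```python
from itertools import combinations

GAP = "-"  # I/D symbol

def extensions(s, n):
    """Yield every length-n sequence made by inserting n - len(s) gaps into s."""
    if n < len(s):
        return
    # Choose which of the n positions keep the original symbols, in order.
    for positions in combinations(range(n), len(s)):
        out = [GAP] * n
        for pos, symbol in zip(positions, s):
            out[pos] = symbol
        yield "".join(out)

for candidate in extensions("AT", 3):
    print(candidate)
```

The number of such sequences is the binomial coefficient C(n, len(s)), which grows quickly and is one reason enumeration does not scale.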

17 Carnegie Mellon Learning the Cost Table
Training data: subject the replicas to a battery of well-formed (benign) requests and observe the system calls induced.
Initializing the cost table:
  - The first approach: comparing semantics of individual system calls.
  - The second approach: using frequency information.
Iteratively updating the cost table:
  - Use the initialized cost table to calculate behavioral distance between system call sequences in the training data.
  - Results of the behavioral distance reveal the "proper alignments" between system calls.
  - Use these "proper alignments" to update the cost table.
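The align-then-update loop can be sketched as follows. The slides do not give the update rule, so the rule used here (cost falls as two calls are aligned more often) is a hypothetical stand-in, as are the function names, the initial costs, and the toy training data.

```python
from collections import Counter

GAP = "-"

def align(s1, s2, cost):
    """Return the aligned pairs of one minimum-cost alignment (DP plus traceback)."""
    n, m = len(s1), len(s2)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + cost[(s1[i - 1], GAP)]
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + cost[(GAP, s2[j - 1])]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(d[i - 1][j - 1] + cost[(s1[i - 1], s2[j - 1])],
                          d[i - 1][j] + cost[(s1[i - 1], GAP)],
                          d[i][j - 1] + cost[(GAP, s2[j - 1])])
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + cost[(s1[i - 1], s2[j - 1])]:
            pairs.append((s1[i - 1], s2[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + cost[(s1[i - 1], GAP)]:
            pairs.append((s1[i - 1], GAP))
            i -= 1
        else:
            pairs.append((GAP, s2[j - 1]))
            j -= 1
    return pairs[::-1]

def initial_cost(calls1, calls2, sub=0.5, gap=1.0):
    """Assumed uniform starting table; the slides describe two other initializations."""
    cost = {(a, b): sub for a in calls1 for b in calls2}
    cost.update({(a, GAP): gap for a in calls1})
    cost.update({(GAP, b): gap for b in calls2})
    return cost

def update_cost(cost, training_pairs):
    """Hypothetical rule: cost(a, b) = 1 - (times a aligned with b) / (times a aligned)."""
    counts = Counter()
    for s1, s2 in training_pairs:
        counts.update(align(s1, s2, cost))
    totals = Counter()
    for (a, _b), c in counts.items():
        totals[a] += c
    new_cost = dict(cost)
    for (a, b) in cost:
        if a != GAP and b != GAP and totals[a] > 0:
            new_cost[(a, b)] = 1.0 - counts[(a, b)] / totals[a]
    return new_cost

# Toy training data: call numbers on replica 1 versus replica 2 (made up).
training = [([3, 5], [10, 20]), ([5, 3], [20, 10]), ([3], [10])]
cost = initial_cost({3, 5}, {10, 20})
for _ in range(3):
    cost = update_cost(cost, training)
print(cost[(3, 10)], cost[(3, 20)])
```

In this toy run, calls that are consistently aligned in benign traffic end up cheap to substitute for each other, while unrelated calls stay expensive.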

18 Carnegie Mellon System Call Phrases
Correspondence may not exist between individual system calls.
Behavioral distance calculation is very slow when sequences are long.
Solution: group system calls into system call phrases:
  - System call phrases are also called system call subsequences.
  - A system call phrase is a sequence of system calls that frequently appear together in program execution.
  - TEIRESIAS algorithm (also taken from biology).
  - The TEIRESIAS algorithm has been used in other intrusion/anomaly detection systems.
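To illustrate what a phrase is, the sketch below counts contiguous runs of system calls and keeps those that occur often. This is plain n-gram counting, a much simpler stand-in for TEIRESIAS and not that algorithm; the function name, parameters, and traces are assumptions for illustration.

```python
from collections import Counter

def frequent_phrases(traces, min_len=2, max_len=4, min_support=2):
    """Return contiguous call subsequences that occur at least min_support times."""
    counts = Counter()
    for trace in traces:
        for n in range(min_len, max_len + 1):
            for i in range(len(trace) - n + 1):
                counts[tuple(trace[i:i + n])] += 1
    return {phrase: c for phrase, c in counts.items() if c >= min_support}

# Made-up system call traces from two benign requests.
traces = [
    ["open", "read", "close"],
    ["open", "read", "write", "close"],
]
print(frequent_phrases(traces))
```

Replacing each frequent phrase with a single symbol shortens the sequences, which is what makes the distance calculation cheaper.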

19 Carnegie Mellon Evaluation – Experimental Setup

20 Carnegie Mellon Behavioral Distance – Same Application
[Figure panels: Apache Webserver; Myserver Webserver]

21 Carnegie Mellon Behavioral Distance – Different Application
[Figure panels: Linux: Apache Webserver, Windows: Myserver Webserver; Linux: Myserver Webserver, Windows: Apache Webserver]

22 Carnegie Mellon Behavioral Distance – Mimicry Attacks
Each cell gives the behavioral distance of the best mimicry attack, followed by the true acceptance rate when the threshold is set to detect that attack.

Column headers as they appear in the source:
  Server on Linux:    Apache, Myserver, Apache
  Server on Windows:  Apache, Myserver, Apache, Myserver

Attacker knows the individual IDS on one replica:
                       Column 1              Column 2              Column 3              Column 4
  Mimicry on Linux     10.283194  99.9093%   26.656983  100%       6.908590   99.4555%   32.764897  100%
  Mimicry on Windows   6.842813   99.4555%   9.967780   99.4555%   13.354194  100%       5.280875   99.4555%

Attacker knows the behavioral distance and the cost table:
                       Column 1              Column 2              Column 3              Column 4
  Mimicry on Linux     3.736      98.9111%   13.657     100%       2.731      98.9111%   13.813     100%
  Mimicry on Windows   2.65       98.7296%   2.174      98.0944%   2.187      98.9111%   2.64       97.8221%
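The relationship between a detection threshold and the true acceptance rate can be sketched as below. It assumes that a request is accepted when its behavioral distance is strictly below the threshold, and that the threshold is set to the distance of the best mimicry attack; the benign distances are made up for illustration and are not data from the slides.

```python
def true_acceptance_rate(benign_distances, threshold):
    """Fraction of benign requests whose behavioral distance stays below the threshold."""
    accepted = sum(1 for d in benign_distances if d < threshold)
    return accepted / len(benign_distances)

# Hypothetical distances observed for benign requests.
benign = [0.4, 0.9, 1.3, 2.0, 3.1]

# Threshold set to one of the best-mimicry distances reported on the slide.
best_mimicry_distance = 2.65

print(true_acceptance_rate(benign, best_mimicry_distance))
```

A lower best-mimicry distance forces a lower threshold, which in turn rejects more benign requests; this is the trade-off the percentages on the slide quantify.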

23 Carnegie Mellon Performance Overhead

24 Carnegie Mellon Conclusion
Behavioral distance detects an attack on one process that causes its behavior to deviate from that of another.
Behavioral distance makes evasion (mimicry) attacks more difficult, with moderate overhead.

