Automatic Network Protocol Analysis

Presentation transcript:

Automatic Network Protocol Analysis
Gilbert Wondracek, Paolo Milani Comparetti, Christopher Kruegel, and Engin Kirda
NDSS 2008
Speaker: Chang Huan Wu, 2009/2/17

Outline
- Introduction
- Protocol Analysis
  - Analysis of a Single Message
  - Analysis of Multiple Messages
- Evaluation
- Conclusions

Introduction (1/3)
- Protocol reverse engineering is the process of extracting application-level protocol specifications
  - Especially useful for closed protocols
- Security applications
  - Black-box testing of protocol implementations
  - Deep packet inspection
  - Revealing differences between server implementations

Introduction (2/3)
- Manual protocol analysis is time-consuming
  - The effort can be justified only for very popular protocols such as SMB
- => Automatic analysis

Introduction (3/3)
Existing automatic approaches
- Input: a binary program; output: the set of inputs that the program accepts
  - Unable to determine the complete set of inputs
  - Low scalability
- Input: network traffic traces
  - Limited precision

Goal
- Focus first on determining the format specification of a certain type of message

Approach
- Use dynamic taint analysis to observe the data flow
  - Observe how the program processes input messages
- Analyze individual messages
- Generalize to a message format from multiple messages of a given type

Dynamic Taint Analysis
- Assign a unique label to each byte of network input
- Monitor the program and record which labeled bytes are processed by which instruction (e.g., mov, sub)
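The labeling and monitoring steps above can be illustrated with a toy model. This is only a sketch, not the instrumentation used in the paper: the `Tainted` and `Tracer` names are hypothetical, and three hand-written operations stand in for real machine instructions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tainted:
    """A byte-sized value plus the set of input labels it depends on (hypothetical type)."""
    value: int
    labels: frozenset = frozenset()


def taint_input(data: bytes):
    """Give every input byte its own unique label: its offset in the message."""
    return [Tainted(b, frozenset({i})) for i, b in enumerate(data)]


class Tracer:
    """Toy monitor: logs which labels each 'instruction' processes."""

    def __init__(self):
        self.log = []  # entries are (mnemonic, sorted list of labels)

    def _record(self, op, *operands):
        labels = frozenset().union(*(o.labels for o in operands))
        if labels:  # only operations that touch tainted data are logged
            self.log.append((op, sorted(labels)))
        return labels

    def mov(self, src):
        return Tainted(src.value, self._record("mov", src))

    def sub(self, a, b):
        return Tainted((a.value - b.value) & 0xFF, self._record("sub", a, b))

    def cmp(self, a, b):
        self._record("cmp", a, b)
        return a.value == b.value


msg = taint_input(b"GET /")
t = Tracer()
reg = t.mov(msg[0])                      # copy input byte 0
diff = t.sub(reg, msg[1])                # result depends on bytes 0 and 1
t.cmp(msg[4], Tainted(ord("/")))         # tainted byte vs. untainted constant
print(t.log)  # [('mov', [0]), ('sub', [0, 1]), ('cmp', [4])]
```

Labels propagate through data movement and arithmetic, so a derived value such as `diff` still records which input bytes it came from.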

Analysis of a Single Message - Finding delimiters
- A delimiter is a sequence of one or more bytes that separates fields or messages
- Record all operations that compare a tainted input byte with an untainted value
- For each compared character, keep the list of labels it was compared with
- Traverse each list and check for runs of consecutive labels
- Ex. The program compares the first three bytes with 'a' and the fourth byte with 'b'

  char    Label list
  a       0, 1, 2
  b       3
  …

  Message: H A B C
  Label: 1 2 3
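A minimal sketch of the label-list bookkeeping described above, assuming the comparison log is already available as `(label, constant)` pairs. The function names and the `min_run` threshold are assumptions made for illustration, not values taken from the paper.

```python
from collections import defaultdict


def label_lists(comparisons):
    """Map each untainted constant to the sorted labels of the bytes it was compared with."""
    lists = defaultdict(set)
    for label, const in comparisons:
        lists[const].add(label)
    return {const: sorted(labels) for const, labels in lists.items()}


def consecutive_runs(labels):
    """Split a sorted, non-empty label list into (first, last) runs of consecutive labels."""
    runs = []
    start = prev = labels[0]
    for lab in labels[1:]:
        if lab != prev + 1:
            runs.append((start, prev))
            start = lab
        prev = lab
    runs.append((start, prev))
    return runs


def delimiter_candidates(comparisons, min_run=3):
    """Constants compared against at least `min_run` consecutive input bytes.

    `min_run` is an assumed threshold: a long run suggests that the program
    was scanning the input for this character.
    """
    out = {}
    for const, labels in label_lists(comparisons).items():
        long_runs = [(a, b) for a, b in consecutive_runs(labels) if b - a + 1 >= min_run]
        if long_runs:
            out[const] = long_runs
    return out


# The slide's example: bytes 0..2 compared with 'a', byte 3 compared with 'b'.
comparisons = [(0, "a"), (1, "a"), (2, "a"), (3, "b")]
print(label_lists(comparisons))           # {'a': [0, 1, 2], 'b': [3]}
print(delimiter_candidates(comparisons))  # {'a': [(0, 2)]}
```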

Analysis of a Single Message - Scopes and delimiter hierarchy
- Scope field: a given delimiter can be present multiple times in the scope field
- Delimited field: a given delimiter is present once in the delimited field
- A delimited field can itself be a scope field for another delimiter character
- The resulting hierarchy of fields reflects nested scopes

Analysis of a Single Message - Identifying length fields
- A length field is one or more bytes that store the length of another field (the target field)
- Use static analysis to detect loops
- Look for loops where an exit condition tests the same labels on every iteration
  => Length field candidate
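The exit-condition check above can be sketched as follows. The sketch assumes loop detection has already been done and that, for each loop, the trace provides the set of labels tested by the exit condition in each iteration; the function name and the input shape are hypothetical.

```python
def length_field_candidate(exit_tests):
    """Return the labels of a length-field candidate, or None.

    exit_tests: one set of labels per loop iteration, holding the input bytes
    tested by the loop's exit condition in that iteration (assumed input shape).
    """
    if len(exit_tests) < 2:
        return None  # a single iteration cannot show that the test repeats
    first = set(exit_tests[0])
    if first and all(set(t) == first for t in exit_tests):
        return sorted(first)
    return None


# A copy loop bounded by a 2-byte length stored at offsets 4 and 5:
# the exit condition tests the same two bytes on every iteration.
copy_loop = [{4, 5}, {4, 5}, {4, 5}]
# A loop scanning for a delimiter tests a different byte each time.
scan_loop = [{6}, {7}, {8}]

print(length_field_candidate(copy_loop))  # [4, 5]
print(length_field_candidate(scan_loop))  # None
```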

Analysis of a Single Message - Identifying target fields
- For each length field candidate, look at the labels that are "touched" inside the loop
- Remove the labels that are touched in all iterations
  - Those bytes are independent of the current loop iteration
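As a sketch of the filtering step above: given the labels touched in each iteration of a candidate loop, drop those touched in every iteration and keep the rest as the target field. The function name and input shape are assumptions.

```python
def target_field(touched_per_iteration):
    """Labels touched inside the loop, minus those touched in every iteration.

    touched_per_iteration: a non-empty list with one collection of labels per
    loop iteration (assumed input shape).
    """
    sets = [set(s) for s in touched_per_iteration]
    common = set.intersection(*sets)   # e.g. the length field itself
    touched = set.union(*sets)
    return sorted(touched - common)


# The loop re-reads the length bytes (4, 5) every time and copies one new
# payload byte per iteration.
iterations = [{4, 5, 6}, {4, 5, 7}, {4, 5, 8}]
print(target_field(iterations))  # [6, 7, 8]
```

Here bytes 4 and 5 are removed because they appear in every iteration, which leaves bytes 6 to 8 as the target field.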

Analysis of a Single Message - Extracting additional information
- Protocol keywords: input data is compared with a constant string
- File names: input data is used as the argument of a system call that opens or creates a file
- Echoed fields
- Pointers (to somewhere else in the packet)
- Unused fields

Analysis of Multiple Messages – Generalization (1/3)
- Message alignment
  - Based on the Needleman-Wunsch algorithm
  - Extended to a hierarchy of fields
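For reference, here is a minimal sketch of plain Needleman-Wunsch global alignment on flat sequences, before any extension to field hierarchies. The scoring values (match +1, mismatch -1, gap -1) and the token-level example are assumptions chosen for illustration.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align sequences a and b; returns (score, aligned_a, aligned_b).

    Gaps are represented by None in the aligned sequences.
    """
    n, m = len(a), len(b)

    def pair(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic-programming table.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + pair(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append(None); i -= 1
        else:
            out_a.append(None); out_b.append(b[j - 1]); j -= 1
    return score[n][m], out_a[::-1], out_b[::-1]


msg1 = ["GET", "/", "HTTP/1.1"]
msg2 = ["GET", "/", "index.html", "HTTP/1.1"]
total, aligned1, aligned2 = needleman_wunsch(msg1, msg2)
print(total)     # 2
print(aligned1)  # ['GET', '/', None, 'HTTP/1.1']
print(aligned2)  # ['GET', '/', 'index.html', 'HTTP/1.1']
```

The gap in the first message marks an element that is present in only one of the two inputs, which is the kind of evidence a generalization step can use to mark a field as optional.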

Analysis of Multiple Messages – Generalization (2/3)
- Operate on a tree of fields, not on a string of bytes
  - To align two inner nodes, recursively call NW on their sequences of child nodes
  - To align two leaf nodes, take field semantics into account
- Repetition detection
  - Merge two or more consecutive optional nodes into a single repetition node
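The recursive tree alignment can be sketched as a similarity score in which the pairwise score of two inner nodes is itself an NW score over their children. The `Node` type, the field kinds, and all scoring constants are hypothetical; the sketch computes scores only and leaves out repetition detection.

```python
from dataclasses import dataclass, field
from typing import List

GAP = -1  # assumed gap penalty


@dataclass
class Node:
    kind: str                      # e.g. "keyword", "text", or "scope" for inner nodes
    value: bytes = b""
    children: List["Node"] = field(default_factory=list)


def leaf_score(a, b):
    """Score two leaves by field semantics (assumed rules and weights)."""
    if a.kind != b.kind:
        return -1
    if a.kind == "keyword":        # keywords must also agree in content
        return 2 if a.value == b.value else -1
    return 1                       # same kind, content free to differ


def align_score(a, b):
    """Similarity of two field trees."""
    if a.children and b.children:            # two inner nodes: recurse via NW
        return nw_score(a.children, b.children)
    if not a.children and not b.children:    # two leaves
        return leaf_score(a, b)
    return -1                                # inner node vs. leaf


def nw_score(xs, ys):
    """Needleman-Wunsch score over two node sequences, using align_score for pairs."""
    prev = [j * GAP for j in range(len(ys) + 1)]
    for i, x in enumerate(xs, 1):
        cur = [i * GAP]
        for j, y in enumerate(ys, 1):
            cur.append(max(prev[j - 1] + align_score(x, y),
                           prev[j] + GAP,
                           cur[j - 1] + GAP))
        prev = cur
    return prev[-1]


def line(keyword, text):
    """Helper: a scope node holding one keyword leaf and one text leaf."""
    return Node("scope", children=[Node("keyword", keyword), Node("text", text)])


m1 = Node("scope", children=[line(b"GET", b"/a"), line(b"Host", b"x")])
m2 = Node("scope", children=[line(b"GET", b"/b/c"), line(b"Host", b"y"),
                             line(b"Accept", b"z")])
print(align_score(m1, m2))  # 5
```

In this example the two `GET` lines and the two `Host` lines each score 3, and the extra `Accept` line costs one gap, which gives 5.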

Analysis of Multiple Messages – Generalization (3/3)

Evaluation (1/4)

Evaluation (2/4)

Evaluation (3/4)
- The results in these tables were obtained by manually comparing our specifications with official RFC documents and with Wireshark output
- Most of the fields were correctly identified
- The generated specifications successfully parsed another set of messages

Evaluation (4/4)

Conclusion
- Introduced a novel approach to automatic protocol reverse engineering
- Tested on common servers and protocols

Comments
- Automatically generates high-precision protocol specifications
- The generated specification may be affected by the program's implementation