Published by Nancy Briggs. Modified over 9 years ago.
1
Polygraph: Automatically Generating Signatures for Polymorphic Worms James Newsome, Brad Karp, and Dawn Song Carnegie Mellon University Presented by Ryan Gates
2
Overview Goal Composition of a worm Invariant bytes and Tokens Types of signatures ◦ Conjunction ◦ Token Subsequence ◦ Bayes Polygraph Signature Generator Metrics Results Evaluation
3
Goal Automate the generation of worm signatures ◦ Specifically polymorphic worms Prevent polymorphic worms from going undetected ◦ Including perfectly polymorphic instances
4
Decomposition of a worm Figure 1. Polymorphed ApacheKnacker Invariant bytes Wild card bytes Code bytes
5
Invariant Bytes Invariant framing ◦ Reserved keywords or well-known binary constants that are part of the wire protocol ◦ For example "HTTP" or "GET" Invariant overwrite values ◦ High-order bytes of the overwritten address ◦ For example in BIND-TSIG "\xFF\xBF" Many invariant substrings are too short to avoid false positives on their own. The solution is to let each set of invariant bytes be represented by a token and to combine several tokens in one signature
6
Tokens A token must not be a substring of another token ◦ For example, "HTTP" is a token, not "TTP" Tokens are combined into three types of signature: Conjunction Signature Token Subsequence Signature Bayes Signature ◦ Each token value represents the probability of that token being present in an actual worm flow.
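A minimal Python sketch of the substring rule above, assuming tokens are byte strings that occur in every sample. The function names, the `min_len` parameter, and the sample flows are hypothetical and chosen for illustration; this brute-force search is not the extraction algorithm used by Polygraph.

```python
def common_substrings(samples, min_len=2):
    """Return every substring of the first sample (length >= min_len)
    that also occurs in all other samples."""
    first = samples[0]
    found = set()
    for i in range(len(first)):
        for j in range(i + min_len, len(first) + 1):
            cand = first[i:j]
            if all(cand in s for s in samples[1:]):
                found.add(cand)
            else:
                # A longer candidate starting at i cannot be common either.
                break
    return found


def maximal_tokens(candidates):
    """Drop every candidate that is a substring of another candidate."""
    return {
        t for t in candidates
        if not any(t != other and t in other for other in candidates)
    }


# Hypothetical worm samples that differ only in the request path.
samples = [
    b"GET /a HTTP/1.1\r\n",
    b"GET /xyz HTTP/1.1\r\n",
    b"GET /q?1 HTTP/1.1\r\n",
]
tokens = maximal_tokens(common_substrings(samples))
```

Here `b"TTP"` is common to all samples but is discarded because it is contained in the longer token `b" HTTP/1.1\r\n"`.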
7
Conjunction Signatures Every token in the conjunction signature must be found in the payload, in any order, for there to be a match Requiring all tokens to match reduces false positives For example, in the Apache-Knacker signature, 'GET', 'HTTP/1.1\r\n', and ':' are tokens in a conjunction signature
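Conjunction matching can be sketched in a few lines of Python. The function name and the payloads are hypothetical; the tokens are the three Apache-Knacker tokens listed on the slide.

```python
def matches_conjunction(tokens, payload):
    """True only if every token occurs somewhere in the payload (order ignored)."""
    return all(token in payload for token in tokens)


signature = [b"GET", b"HTTP/1.1\r\n", b":"]

# Hypothetical payloads for illustration.
worm_like = b"GET /index.html HTTP/1.1\r\nHost: example\r\n"
missing_colon = b"GET /index.html HTTP/1.1\r\n"
```

`matches_conjunction(signature, worm_like)` is true, while `missing_colon` fails because one required token is absent.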
8
Token Subsequence Signatures Similar to the conjunction signature, but more restrictive. All tokens must be present, and in the correct order, which further reduces false positives Typically modeled using regular expressions For example, in the Apache-Knacker signature, "GET.*HTTP/1.1\r\n.*…"
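A hedged Python sketch of ordered matching, shown both as a left-to-right search and as the equivalent regular expression. The function names and payloads are hypothetical; the token list reuses the three Apache-Knacker tokens from the conjunction slide.

```python
import re


def matches_subsequence(tokens, payload):
    """True if all tokens occur in the payload in the given order, without overlap."""
    pos = 0
    for token in tokens:
        idx = payload.find(token, pos)
        if idx < 0:
            return False
        pos = idx + len(token)  # continue searching after this token
    return True


def subsequence_regex(tokens):
    """Build the equivalent pattern: escaped tokens joined by '.*'."""
    pattern = b".*".join(re.escape(token) for token in tokens)
    return re.compile(pattern, re.DOTALL)  # DOTALL lets '.*' span line breaks


tokens = [b"GET", b"HTTP/1.1\r\n", b":"]
in_order = b"GET /index.html HTTP/1.1\r\nHost: example\r\n"
out_of_order = b"Host: x\r\nHTTP/1.1\r\nGET"
```

The `out_of_order` payload contains every token, so a conjunction signature would accept it, but the ordered check rejects it.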
9
Bayes Signature A set of tokens, each with a score If the sum of the scores of the tokens found in a flow exceeds a threshold, the flow is considered a match. A sample signature would include '\x00\x00\xFA': 1.7574 Benefits ◦ Less rigid, which helps prevent false positives for common tokens. ◦ Higher quality signatures with a more diverse suspicious pool.
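The scoring rule can be sketched as follows. Only the token `'\x00\x00\xFA'` with score 1.7574 comes from the slide; the second token's score, the threshold, and the payloads are assumptions made up for illustration.

```python
def bayes_score(scores, payload):
    """Sum the scores of all signature tokens that occur in the payload."""
    return sum(score for token, score in scores.items() if token in payload)


def matches_bayes(scores, threshold, payload):
    """A flow matches when its total score exceeds the threshold."""
    return bayes_score(scores, payload) > threshold


scores = {
    b"\x00\x00\xFA": 1.7574,  # from the slide
    b"\xFF\xBF": 2.0,         # hypothetical score
}
threshold = 3.0               # hypothetical threshold

both_tokens = b"\x01\x02\x00\x00\xFA\x41\x41\xFF\xBF"
one_token = b"\x01\x02\x00\x00\xFA\x41\x41"
```

Unlike the conjunction and subsequence signatures, no single token is mandatory: any combination of tokens whose scores add up past the threshold produces a match.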
10
Limitations of Signature Types Bayes signatures are unaffected by noise until the noise grows beyond 80% of the suspicious pool. Beyond that point they produce 100% false negatives. ◦ Noise at this level means the flow classifier did a very poor job of classifying the flows. Conjunction and Token Subsequence signatures cannot handle a suspicious pool that mixes multiple types of worms ◦ The solution is to use clustering to separate the worms into manageable clusters
11
Clustering Clustering helps the conjunction and token subsequence signatures deal with variety Divides the suspicious pool into several clusters, each containing one type of flow, and generates a signature per cluster ◦ Clusters should not be too general ◦ Clusters should not be too specific
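To make the idea concrete, here is a deliberately simplified Python sketch that groups flows greedily by shared content. It is a stand-in for illustration only, not the clustering algorithm used by Polygraph; the function names, the `min_shared` parameter, and the sample flows are all hypothetical.

```python
def longest_common_substring(a, b):
    """Classic dynamic-programming longest common substring of two byte strings."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return a[best_end - best_len:best_end]


def greedy_cluster(flows, min_shared=4):
    """Put each flow into the first cluster it shares >= min_shared bytes with."""
    clusters = []
    for flow in flows:
        for cluster in clusters:
            shared = longest_common_substring(cluster["shared"], flow)
            if len(shared) >= min_shared:
                cluster["shared"] = shared
                cluster["members"].append(flow)
                break
        else:
            # No existing cluster is similar enough: start a new one.
            clusters.append({"shared": flow, "members": [flow]})
    return clusters


# Hypothetical suspicious pool mixing two kinds of flows.
flows = [
    b"GET /a HTTP/1.1\r\n",
    b"\x00\x01TSIG\xff\xbfAAAA",
    b"GET /bcd HTTP/1.1\r\n",
    b"\x07\x07TSIG\xff\xbfBBBB",
]
clusters = greedy_cluster(flows)
```

The `min_shared` parameter reflects the trade-off on the slide: a very small value merges unrelated flows (clusters too general), while a very large value leaves every flow in its own cluster (clusters too specific).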
12
Polygraph Signature Generator The polygraph monitor must have access to the network's packet flow. An imperfect flow classifier sorts packet flows into either the suspicious or innocuous pool.
13
Polygraph Signature Generator It will not distinguish between different worms, but merely suspicious flows and innocuous flows. Flow classifier is reliable, but imperfect. The result is noise.
14
Polygraph Signature Generator Uses samples to determine appropriate signatures for worms present in the suspicious flow pool. Resilient to noise in the system
15
Metrics Quality ◦ Low percentage of false positives and false negatives Efficiency in generation ◦ Lower computational cost Efficiency in matching ◦ Should not inhibit the network traffic Generate small signature sets ◦ Limit the number of signatures Robustness ◦ Yield high quality signature even with noise and a variety of worms ◦ Resistance to clever evasion by worms
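The quality metric can be illustrated with a short Python sketch that measures false positive and false negative rates for any matcher. The function name, the matcher, and the evaluation flows are hypothetical examples, not data from the paper.

```python
def error_rates(matcher, worm_flows, innocuous_flows):
    """Return (false_positive_rate, false_negative_rate) for a signature matcher."""
    if not worm_flows or not innocuous_flows:
        raise ValueError("both pools must be non-empty")
    false_negatives = sum(1 for flow in worm_flows if not matcher(flow))
    false_positives = sum(1 for flow in innocuous_flows if matcher(flow))
    return (false_positives / len(innocuous_flows),
            false_negatives / len(worm_flows))


# Hypothetical conjunction signature used as the matcher.
signature = [b"GET", b"HTTP/1.1\r\n", b":"]


def matcher(flow):
    return all(token in flow for token in signature)


worm_flows = [
    b"GET /x HTTP/1.1\r\nHost: a\r\n",
    b"GET /y HTTP/1.1\r\nHost: b\r\n",
]
innocuous_flows = [
    b"GET /index.html HTTP/1.1\r\nHost: example.org\r\n",
    b"POST /form HTTP/1.0\r\n",
    b"\x00\x01binary",
    b"HELO mail",
]
fp_rate, fn_rate = error_rates(matcher, worm_flows, innocuous_flows)
```

In this made-up example the signature catches both worm flows but also matches one ordinary HTTP request, which shows why tokens that are common in innocuous traffic hurt signature quality.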
16
Results | Apache-Knacker Table 1. Apache-Knacker signatures. These signatures were successfully generated for suspicious pools containing at least 3 worm samples. The best performer was Token Subsequence. The ordering used in the Token Subsequence signature helps reduce the number of false positives.
17
Results | BIND-TSIG Table 2. BIND-TSIG signatures. These signatures were successfully generated for suspicious pools containing at least 3 worm samples. The best performers were Conjunction and Token Subsequence. Bayes signature quality is degraded when the tokens are also common in innocuous flows.
18
Results | Coincidental Pattern The Coincidental Pattern attack injects invariant bytes into the wildcard bytes to confuse the signature generator.
19
Contribution Polygraph helps to automate signature generation Examined the effects that polymorphism in worms could have on signature generation and matching. Took imperfect classification of network flows (noise) into account
20
Limitations Worms that lack invariant code Requires a flow classifier and at least 3 worm samples If the innocuous pool is too diverse, there will be too many false positives.
21
Improvements and Future Work Take advantage of multiple cores. Incorporate the design of an efficient flow classifier Determine how feasible it is to inspect network traffic Determine an algorithm to choose best signature to use
22
References J. Newsome, B. Karp, and D. Song. Polygraph: Automatically Generating Signatures for Polymorphic Worms. In Proceedings of the IEEE Symposium on Security and Privacy, 2005.