Normalizing Metamorphic Malware Using Term Rewriting


1 Normalizing Metamorphic Malware Using Term Rewriting
A. Walenstein, R. Mathur, M. R. Chouchane, and A. Lakhotia
Software Research Laboratory, The University of Louisiana at Lafayette
Sixth IEEE International Workshop on Source Code Analysis and Manipulation
27th-29th September 2006, Philadelphia, PA, USA

2 About this Work
The core of the paper's work formed the Master's thesis of Rachit Mathur. He has since graduated and is now working at McAfee.
9/28/2006 SCAM'06

3 Malware Identification
Malware are malicious programs such as viruses, worms, and Trojans.
Antivirus scanners use extracted patterns, or "signatures", to identify known malware.
[Diagram: Virus (Form A) checked by the Anti-Virus scanner against a Signature]
Antivirus scanners use a number of static and dynamic techniques to recognize a malicious program. Static signature scanning is among the most popular of these techniques. It consists of extracting a code segment from the malware and using that segment as the malware's signature: when a scanner encounters a program containing the signature, it concludes that the program is, or is infected with, the malware.
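The static signature scanning described above can be sketched as a byte-substring search. This is only an illustration: the function name `scan`, the signature name `Example.A`, and all byte values are made up for the example and do not come from any real scanner or malware.

```python
def scan(program: bytes, signatures: dict[str, bytes]) -> list[str]:
    """Return the names of all signatures found as a byte substring of program."""
    return [name for name, sig in signatures.items() if sig in program]

# Hypothetical signature database; the bytes are arbitrary illustration data.
SIGNATURES = {"Example.A": bytes.fromhex("6a04b804000000")}

infected = b"\x90\x90" + SIGNATURES["Example.A"] + b"\xc3"
clean = b"\x90\x90\xc3"

print(scan(infected, SIGNATURES))  # ['Example.A']
print(scan(clean, SIGNATURES))     # []
```

A variant that rewrites the matched code segment no longer contains these bytes, which is the weakness the following slides address.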

4 Metamorphic Malware
Metamorphic malware change as they propagate.
They create multiple variants of themselves.
[Diagram: Virus Form A -> M -> Virus Form B -> M -> Virus Form C]
The phrase "metamorphic malware" refers to malicious programs that spread modified copies of themselves. These copies are often called variants of the malware, and a metamorphic malware typically has an intractably large number of them. The sheer size of a database holding signatures for all variants of even one metamorphic malware is therefore a considerable challenge to static signature scanning.

5 Metamorphic Malware Challenge
Using different signatures for most variants cannot scale.
Too many signatures challenge the AV scanner.
[Diagram: Virus Form A -> M -> Virus Form B -> M -> Virus Form C, each form requiring its own Signature in the Anti-Virus scanner]

6 Proposed approach: normalizer
Normalizer construction problem: reduce the number of signatures needed to detect all variants.
[Diagram: Virus Form A, Form B, Form C -> N (normalizer) -> Normal Form, matched by a single Signature in the Anti-Virus scanner]
Our work proposes a code normalization approach that assists static signature scanning by reducing the size of the set of signatures needed to detect all of the variants of a given instruction-substituting metamorphic malware.

7 Inspiration: “undo” transformations
The slide shows five successively shorter forms of the same code:
Form 1: push ecx; mov ecx, [ebp + 10]; mov ecx, ebp; push eax; add eax, 2342; mov eax, 33; add ecx, eax; pop eax; mov eax, esi; mov esi, ecx; push edx; xor edx, 778f; mov edx, 34; sub esi, edx; pop edx; mov [esi-2], eax; pop esi; pop ecx
Form 2: push ecx; mov ecx, ebp; push eax; mov eax, 33; add ecx, eax; pop eax; push esi; mov esi, ecx; push edx; mov edx, 34; sub esi, edx; pop edx; mov [esi - 2], eax; pop esi; pop ecx
Form 3: push ecx; mov ecx, ebp; add ecx, 33; push esi; mov esi, ecx; sub esi, 34; mov [esi-2], eax; pop esi; pop ecx
Form 4: push ecx; mov ecx, ebp; add ecx, 33; mov [ecx-36], eax; pop ecx
Form 5: mov [ebp - 3], eax
Our study focused on metamorphic malware that uses a set of transformation rules, each of which maps a code segment (the LHS) to another code segment (the RHS). When an LHS is encountered in the code to be transformed, the rule is applied, provided that the overall semantics of the program does not change. Our normalization approach originated from the need to "undo" these transformations by morphing the variants of a malware back to their original form.
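The idea of undoing a transformation can be sketched as a small rewriting loop over instruction sequences. This is a minimal sketch, not the authors' implementation: the helper names `rewrite_once` and `normalize` are hypothetical, instructions are compared as plain strings, and the single undo rule is the reverse of a rule that appears on the later slides. The trailing `ret` is an invented filler instruction.

```python
# Each rule is (LHS, RHS): a morphed segment and the shorter segment it came from.
UNDO_RULES = [
    (("push ecx", "mov ecx, 0x04", "mov edi, ecx", "pop ecx"), ("mov edi, 0x04",)),
]

def rewrite_once(code, rules):
    """Apply the first matching rule at the leftmost position, or return None."""
    for i in range(len(code)):
        for lhs, rhs in rules:
            if code[i:i + len(lhs)] == lhs:
                return code[:i] + rhs + code[i + len(lhs):]
    return None

def normalize(code, rules):
    """Rewrite until no rule applies; halts because every rule shortens the code."""
    while (nxt := rewrite_once(code, rules)) is not None:
        code = nxt
    return code

variant = ("push ecx", "mov ecx, 0x04", "mov edi, ecx", "pop ecx", "ret")
print(normalize(variant, UNDO_RULES))  # ('mov edi, 0x04', 'ret')
```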

8 Problem 1: “naïve” undo is naïve
Rules shown on the slide (original -> morphed):
1. mov edi, 0x04 -> push ecx; mov ecx, 0x04; mov edi, ecx; pop ecx
2. push eax -> push eax; mov eax, 0x04
3. push 0x04 -> mov eax, 0x04; push eax
Reduction shown: mov eax, 0x04; push eax -> push 0x04
The undoing procedure must be controlled: simply applying the rules in reverse may fail to terminate if, for example, the rewrite system contains loops. We addressed the termination issue by reorienting the rules of the malware's transformation system. This step ensures that the resulting system is strictly length-reducing and hence terminating.
The resulting system is terminating but may still not be convergent. As shown on this slide, completely reducing the code segment at left, which contains two overlapping LHS, may yield two normal forms. This happens because the prefix of one LHS is also the suffix of another. The problem with non-convergent systems is that a variant of the malware may reduce to more than one normal form, so the goal now was to transform our terminating, non-convergent system into a convergent one.
These concerns gave rise to the need to modify the transformation system of the malware and to apply the rules of the modified system judiciously, so that termination is guaranteed and divergence is eliminated or reduced. The ultimate goal is to construct a convergent, terminating normalizer for the variants of the malware.
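The divergence described above can be reproduced by exploring every possible rewrite order. The sketch below assumes the reversed rules 2 and 3 read as `push eax; mov eax, 0x04 -> push eax` and `mov eax, 0x04; push eax -> push 0x04`, and it ignores their liveness conditions. The function names and the three-instruction input are illustrative choices, not taken from the paper.

```python
# Reversed (normalizing) forms of rules 2 and 3, as assumed in the lead-in.
RULES = [
    (("push eax", "mov eax, 0x04"), ("push eax",)),
    (("mov eax, 0x04", "push eax"), ("push 0x04",)),
]

def successors(code, rules):
    """All sequences reachable from code by one rule application anywhere."""
    out = set()
    for i in range(len(code)):
        for lhs, rhs in rules:
            if code[i:i + len(lhs)] == lhs:
                out.add(code[:i] + rhs + code[i + len(lhs):])
    return out

def normal_forms(code, rules):
    """All irreducible sequences reachable from code, over every rewrite order."""
    succ = successors(code, rules)
    if not succ:
        return {code}
    return set().union(*(normal_forms(s, rules) for s in succ))

# The two LHS overlap on the middle instruction of this segment.
segment = ("mov eax, 0x04", "push eax", "mov eax, 0x04")
print(normal_forms(segment, RULES))
```

Under these assumptions the segment has two distinct normal forms, `push 0x04; mov eax, 0x04` and `push 0x04`, so the system is terminating but not convergent.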

9 Problem 2: conditional transformations
Rules and their conditions (original -> morphed):
1. mov edi, 0x04 -> push ecx; mov ecx, 0x04; mov edi, ecx; pop ecx (unconditional)
2. push eax -> push eax; mov eax, 0x04 (eax not live)
3. push 0x04 -> mov eax, 0x04; push eax (eax not live)
Q: how to reorient rules while guaranteeing termination?
We have observed that certain rules are semantics-preserving only if certain conditions are satisfied. Some rules, such as rule 1, are unconditionally semantics-preserving and can safely be applied whenever their left-hand sides are encountered. Rules 2 and 3, on the other hand, are semantics-preserving only if register eax is not live at that point. Hence the application of these rules depends on the liveness of that register.
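A liveness condition such as "eax not live" can be approximated on straight-line code by scanning forward for the next use of the register. The sketch below is a deliberately rough model and an assumption of this example, not the paper's analysis: it understands only `mov` and `pop` as writes, treats every other mention of the register as a read, ignores control flow, and conservatively reports "live" when the block ends without a write.

```python
def reads_writes(instr: str, reg: str) -> tuple[bool, bool]:
    """Return (reads, writes) of reg for one instruction, under a rough model."""
    op, _, rest = instr.partition(" ")
    operands = [o.strip() for o in rest.split(",")] if rest else []
    if op == "mov":
        dst, src = operands
        # A memory destination such as [eax-2] reads eax; a plain register is written.
        reads = reg in src or (dst != reg and reg in dst)
        return reads, dst == reg
    if op == "pop":
        return False, operands[0] == reg
    return any(reg in o for o in operands), False

def is_live_after(code: list[str], i: int, reg: str) -> bool:
    """True if reg may be read after instruction i before being overwritten."""
    for instr in code[i + 1:]:
        reads, writes = reads_writes(instr, reg)
        if reads:
            return True
        if writes:
            return False
    return True  # nothing known past the end of the block: assume live

print(is_live_after(["push eax", "mov eax, 0x04", "mov ebx, eax"], 0, "eax"))  # False
print(is_live_after(["push eax", "mov ebx, eax"], 0, "eax"))                   # True
```

Even this toy version shows why condition checking is imprecise: any answer that depends on code outside the scanned block has to fall back to a conservative guess.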

10 Term rewriting approach
Adopted term-rewriting framework
Model the metamorphic engine as a TRS (term rewriting system)
Modify it to create a normalizing rule set and engine
Apply a completion procedure, which reorients rules
Can guarantee needed properties (termination, confluence)
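Reorienting the engine's rules so that the result is length-reducing, which is the property the speaker notes rely on for termination, can be sketched as follows. The function names are hypothetical, the three engine rules are written as assumed from the slides, and rules whose two sides have equal length are simply rejected here because length alone cannot orient them.

```python
# Engine rules as (original, morphed) pairs of instruction tuples (assumed reading).
ENGINE_RULES = [
    (("mov edi, 0x04",), ("push ecx", "mov ecx, 0x04", "mov edi, ecx", "pop ecx")),
    (("push eax",), ("push eax", "mov eax, 0x04")),
    (("push 0x04",), ("mov eax, 0x04", "push eax")),
]

def orient(rule):
    """Return the rule oriented from its longer side to its shorter side."""
    a, b = rule
    if len(a) == len(b):
        raise ValueError("length alone cannot orient this rule")
    return (a, b) if len(a) > len(b) else (b, a)

def is_length_reducing(rules):
    """True if every rule strictly shortens the code it rewrites."""
    return all(len(lhs) > len(rhs) for lhs, rhs in rules)

normalizer_rules = [orient(r) for r in ENGINE_RULES]
print(is_length_reducing(ENGINE_RULES))      # False
print(is_length_reducing(normalizer_rules))  # True
```

Because each application of an oriented rule removes at least one instruction, any rewrite sequence on a finite program must stop.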

11 Completion procedure sketch
Critical pair: push 0x04; mov eax, 0x04 and mov eax, 0x04; push eax
Reduction shown: mov eax, 0x04; push eax -> push 0x04
These normal forms are called critical pairs in the rewriting literature and can be resolved by adding a rule to the system that maps the larger to the smaller. Other overlaps between the LHS of the system are resolved similarly, and rules are added to the system until no code segment has more than one normal form.
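Finding overlaps where a suffix of one LHS is a prefix of another, and the pair of results each overlap produces, can be sketched as below. The function name is hypothetical, the two rules are the reversed rules 2 and 3 as assumed from the slides, liveness conditions are ignored, and only suffix/prefix overlaps are considered (not one LHS nested inside another).

```python
# Reversed (normalizing) rules 2 and 3, under the assumed reading of the slides.
RULES = [
    (("push eax", "mov eax, 0x04"), ("push eax",)),
    (("mov eax, 0x04", "push eax"), ("push 0x04",)),
]

def critical_pairs(rules):
    """For each suffix/prefix overlap of two LHS, return both one-step results."""
    pairs = []
    for l1, r1 in rules:
        for l2, r2 in rules:
            for k in range(1, min(len(l1), len(l2))):
                if l1[-k:] == l2[:k]:
                    # Overlapped segment is l1 + l2[k:]; rewrite it with each rule.
                    pairs.append((r1 + l2[k:], l1[:-k] + r2))
    return pairs

for left, right in critical_pairs(RULES):
    print("; ".join(left), "  |  ", "; ".join(right))
```

One of the two pairs found is `push 0x04; mov eax, 0x04` against `mov eax, 0x04; push eax`, the pair shown on this slide.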

12 Completion procedure sketch
Critical pair: push 0x04; mov eax, 0x04 and mov eax, 0x04; push eax
Diagram labels: Reorient; New Rule
Reduction shown: mov eax, 0x04; push eax -> push 0x04

13 What to do when completion procedure fails?
Successful completion guarantees a unique normal form for all variants: the "perfect" normalizer. But:
Completion procedure may not terminate!
Number of rules in the normalizer may be too high to be practical
Does not take into account conditions
=> Need alternative scheme
If the completion procedure halts, it returns a "perfect normalizer": a system that is equivalent to the malware's system, terminating, and convergent, which means that it can be used to reduce all of the variants of the metamorphic malware to a unique normal form. This procedure for resolving overlaps is due to a paper by Knuth and Bendix. It cannot, however, be relied upon to generate a perfect normalizer from the malware's transformation system, because it may not terminate and, even if it does, it may return a normalizing system that is simply too large to be useful. It also does not take into account the presence of conditional rules in the malware's transformation system.

14 Priority Scheme
Simple
No need for costly/imprecise condition evaluation
Improved through ad-hoc completion
Partition N into NU (unconditional rules) and NC (conditional rules)
Flowchart: Input program -> normalize w.r.t. NU -> if possible, apply a rule from NC -> still reducible? yes: normalize w.r.t. NU again; no: HALT
The priority scheme repeatedly normalizes the input program with respect to the unconditional rules and then with respect to the conditional ones. We chose this priority scheme instead of condition evaluation because correct condition evaluation is often very time-consuming and suffers from the limitations of static analysis. The priority scheme in our case study was not convergent, so we manually added rules to the system that we knew would further reduce the size of the normal form while preserving its semantics. This ad-hoc completion consisted of manually identifying and undoing junk-inserting rules.
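The flowchart can be sketched as a loop that exhausts the unconditional rules before trying a single conditional rule. This is an illustrative reading of the slide, not the TXL implementation from the case study: the function names are hypothetical, conditions are not evaluated at all, and NU and NC hold the reversed example rules as assumed from the earlier slides.

```python
NU = [  # unconditional rules (reversed rule 1)
    (("push ecx", "mov ecx, 0x04", "mov edi, ecx", "pop ecx"), ("mov edi, 0x04",)),
]
NC = [  # conditional rules (reversed rules 2 and 3), conditions ignored here
    (("push eax", "mov eax, 0x04"), ("push eax",)),
    (("mov eax, 0x04", "push eax"), ("push 0x04",)),
]

def apply_once(code, rules):
    """Apply the first matching rule at the leftmost position, or return None."""
    for i in range(len(code)):
        for lhs, rhs in rules:
            if code[i:i + len(lhs)] == lhs:
                return code[:i] + rhs + code[i + len(lhs):]
    return None

def priority_normalize(code, nu, nc):
    """Normalize w.r.t. nu, then apply one rule from nc; repeat until irreducible."""
    while True:
        while (nxt := apply_once(code, nu)) is not None:
            code = nxt
        nxt = apply_once(code, nc)
        if nxt is None:
            return code  # HALT: no rule from either set applies
        code = nxt

program = ("push ecx", "mov ecx, 0x04", "mov edi, ecx", "pop ecx",
           "mov eax, 0x04", "push eax")
print(priority_normalize(program, NU, NC))  # ('mov edi, 0x04', 'push 0x04')
```

Giving the unconditional rules priority means the rules that are always safe do as much of the work as possible before a rule whose condition was not checked is applied.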

15 Question: condition checking required?
Conditional rules require checking of conditions
Can be expensive, or impossible
What is the practical penalty of incorrectly checking conditions?
e.g., ignoring conditions completely?

16 Case Study
W32.Evol
Virus can generate a huge number of variants
Tested the normalization schemes on 26 variants over 6 generations
Manually extracted rules used by W32.Evol: 55 rules, 84 overlaps
TXL implementations: ordinary and priority-based evaluation

17 Results
Generation: Eve, 2, 3, 4, 5, 6
Avg. size of original: 2182, 3257, 4524, 5788, 6974, 8455
Convergent normalizer, avg. size of normal form: 2173
Priority AC normalizer, avg. size of normal form: 2166
Priority WC normalizer, avg. size of normal form: 2167, 2177, 2183, 2191, 2204
Lines not in common: 10, 16, 24, 37
% in common: 100.0, 99.54, 99.27, 98.90, 98.32
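The percentages can be cross-checked against the line counts. The sketch below assumes, without confirmation from the transcript, that the four "lines not in common" counts pair with the last four Priority WC normal-form sizes, and that "% in common" is 100 * (1 - lines not in common / size of normal form).

```python
# Assumed pairing: the four counts go with the last four Priority WC sizes.
sizes = [2177, 2183, 2191, 2204]
not_in_common = [10, 16, 24, 37]

pct_in_common = [round(100 * (1 - d / n), 2) for n, d in zip(sizes, not_in_common)]
print(pct_in_common)  # [99.54, 99.27, 98.9, 98.32]
```

Under that pairing the computed values agree with the last four reported percentages, which supports reading the first reported value, 100.0, as the case with no differing lines.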

18 Contributions
Applications for assisting malware scanners
Initial exploration of possibility of "perfect" normalization
Indications of usefulness of heuristic alternatives (priority scheme and ignoring conditions)

19 Future Work
Expanded scope and empirical study
Extensions for semantics-non-preserving metamorphic engines?
Localized normalization using term rewriting
M. Chouchane and A. Lakhotia, "Using Engine Signature to Detect Metamorphic Malware", Workshop on Rapid Malcode, Fairfax, VA, Nov (to appear)
More at

20 Software Research Lab
Center for Advanced Computer Studies, University of Louisiana at Lafayette
Arun Lakhotia, Director
Andrew Walenstein, Research Scientist
Michael Venable, Software Engineer and Alumnus
Ph.D. Students: Mohamed R. Chouchane, Md Enamul Karim
M.S. Students: Christopher Thompson, Matthew Hayes
Alumni: Nitin Jyoti (Avertlabs), Aditya Kapoor (McAfee), Erik Uday Kumar (Authentium), Rachit Mathur (McAfee), Moinuddin Mohammed (Microsoft), Prashant Pathak (Symantec), Prabhat Singh (Symantec)
Funded by: Louisiana Governor's IT Initiative

