Catch and Release: A New Look at Detecting and Mitigating Highly Obfuscated Exploit Kits
By Mohamed Saher and Ahmed Garhy
Agenda
- Our Intent
- Rethinking Evasions
- Domain of the Problem
- Current Problem
- Problem with Current Solutions
- Solution #1 (First Method)
- Solution #2 (Second Method)
Our intent
Is this function malicious?

    function Translate(objects, offset, size) {
        var length = 4;
        for (var i = 0; i < size; i++) {
            // rc is defined outside this snippet (not shown on the slide)
            var r = rc.substr(0, length);
            if (offset > 0) {
                // rotate the string left by `offset` characters
                r = r.substr(offset) + r.substr(0, offset);
            }
            objects[i] = r.substr(0, r.length);
        }
    }

Without understanding the context in which a function is used, it is very difficult to determine whether it is malicious.
Our intent
What about this script?

    <script>
    var a = '%25%33%43%69%66%72%61%6d%65 ...';
    var b = unescape(unescape(a));
    var spray = new Function(unescape(b));
    </script>

An “expert’s eye” can probably tell that it looks suspicious. The two examples are actually equal to each other.
Our intent is to allow an attack using the first example script, without depending on obfuscation like that in the second example script, and to propose a superior method for detecting both.
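From the analyst's side, the nested unescape calls in the second script can be peeled off one layer at a time. The sketch below is only an illustration: `decodeLayers` and `maxRounds` are hypothetical names, `decodeURIComponent` stands in for the deprecated `unescape` (the two agree on plain ASCII escapes like these), and only the visible prefix of the slide's string is used because the rest is elided.

```javascript
// Repeatedly percent-decode a string until it stops changing.
// Every intermediate layer is returned so each one can be inspected.
function decodeLayers(input, maxRounds = 5) {
  const layers = [input];
  let current = input;
  for (let round = 0; round < maxRounds; round++) {
    let next;
    try {
      next = decodeURIComponent(current);
    } catch (err) {
      break; // malformed escape sequence: stop and keep what we have
    }
    if (next === current) break; // fixed point: nothing left to decode
    layers.push(next);
    current = next;
  }
  return layers;
}

// Visible prefix of the slide's string only; the elided part is unknown.
const layers = decodeLayers('%25%33%43%69%66%72%61%6d%65');
console.log(layers);
// [ '%25%33%43%69%66%72%61%6d%65', '%3Ciframe', '<iframe' ]
```

Two rounds of decoding are enough to expose the start of an iframe tag, which is the kind of "magic string" a signature would look for.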
Rethinking evasions
Designing a new architecture
- Use a message-oriented architecture (MOA) to split the attack into disparate, self-contained messages; we refer to these as “units of work”
- This is a variation of the “script splitting” technique, except that a message exists within a local scope and is destroyed after it serves its purpose
- Does not require DOM manipulation to hide “magic strings”
- Avoids the “magic redirect IFRAME” that can be a trigger for some analyzers
Rethinking evasions
Avoiding HTTP
- No artifact exists that can be parsed or scanned for patterns, characteristics, and definitions
- An alternative to loading JavaScript in “clear text”
- Load one message at a time, forcing each message to be analyzed independently (remember “units of work”)
- Web Sockets are a perfect candidate for both MOA and bypassing HTTP from a web environment
Rethinking evasions
Avoiding client side state
- Two components involved, client and server
- Client: Listen, Invoke
- Server: State, Send
Rethinking evasions
Avoiding client side state
- Two components involved, client and server
- For each accepted connection from a client, the server maintains a state machine
- Messages are essentially commands and do not depend on each other (remember “units of work”)
- The client evaluates a message, invokes it, and destroys it
Rethinking evasions
Limit control flow and function call hierarchy
- The only client control flow is that of the client listening for and invoking a message
- The order of messages is not guaranteed by the server. The server may send NOP messages as part of an attack to trick certain analyzers
- “Monkey patch” functions that are dynamically evaluated in messages, to trick certain analyzers
Rethinking evasions Designing a new architecture Avoiding HTTP
Avoiding client side state Limit control flow and function call hierarchy Getting creative in transport format Web Sockets are simple TCP pipes, so data can be represented on the wire in an application specific way No longer restricted to sending JavaScript in clear text Create custom binary format Send message in binary on the wire Simply looking at a binary message won't give hints about what its contents are – is it an audio file, an image, even text? To even begin to understand a binary message, its format specification needs to be known beforehand or else it is a very challenging problem in its own
Rethinking evasions
Confusing the context
Remember this function?

    function Translate(objects, offset, size) {
        var length = 4;
        for (var i = 0; i < size; i++) {
            // rc is defined outside this snippet (not shown on the slide)
            var r = rc.substr(0, length);
            if (offset > 0) {
                r = r.substr(offset) + r.substr(0, offset);
            }
            objects[i] = r.substr(0, r.length);
        }
    }

Now that we receive this through our binary format, we ask the question again: how do you determine whether it is malicious?
Domain of the problem
- How can we define a malicious website?
- How can we detect a malicious website?
- How can we detect obfuscation?
- How can we identify obfuscation used for malicious purposes?
- How can we categorize what is malicious and what is not?
Current Problem
- Exploit delivery relies, at some point, on JavaScript
- JavaScript is continuously being obfuscated with increasing complexity
- Current solutions lag well behind in technology
Problems with current solutions
- They rely heavily on invocative functions (fromCharCode, eval, unescape, etc.), which are not a solid basis for calling code malicious and which have plenty of legitimate use cases:
  - DOM and CSS selectors
  - Client-side proxies for client-server interaction
  - Client-side template engines
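The false-positive problem can be illustrated with a deliberately naive signature check. This is a sketch under stated assumptions, not any real product's logic: `INVOCATIVE`, `flaggedFunctions`, and `benignSource` are hypothetical names, and the regular expression only looks for a direct call to one of the three functions named on the slide.

```javascript
// Naive signature check: report which "invocative" functions a script calls.
const INVOCATIVE = ['eval', 'unescape', 'fromCharCode'];

function flaggedFunctions(source) {
  return INVOCATIVE.filter((name) =>
    new RegExp('\\b' + name + '\\s*\\(').test(source)
  );
}

// Benign code: turning a numeric key code into a readable character.
const benignSource =
  'function keyLabel(code) { return String.fromCharCode(code); }';

console.log(flaggedFunctions(benignSource)); // [ 'fromCharCode' ]
```

A harmless helper is flagged purely because of the function it calls, which is why such functions alone are a weak signal.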
Problems with current solutions
- They rely heavily on invocative functions (fromCharCode, eval, unescape, etc.), which are not a solid basis for calling code malicious and which have plenty of legitimate use cases
- Limited sets of characteristics
- The quality of probabilistic decisions is directly proportional to the characteristics extracted
Types of approaches
- Dynamic analysis of embedded JS
- Static analysis of extracted JS (Method #1)
- Static analysis of extracted JS (Method #2)
Dynamic Analysis
AdHoc Forwarding
- Create a middle layer between the browser and the JS engine
- Analyze the control flow graph (CFG) of the scripts being executed
- Analyze the call hierarchy and the order in which functions are called
- Analyze certain combinations of functions used, including known high-risk ones
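One way to picture the middle layer is as a set of wrappers that record the order of calls before forwarding them. The sketch below is an assumption-laden illustration: `instrument`, `containsSequence`, and the `sandbox` object are hypothetical, and a real layer would sit between the browser and the JS engine rather than on a plain object.

```javascript
// Wrap selected functions on a target object so every call is logged
// (by name, in order) before being forwarded to the original function.
function instrument(target, names, log) {
  for (const name of names) {
    const original = target[name];
    if (typeof original !== 'function') continue;
    target[name] = function (...args) {
      log.push(name);
      return original.apply(this, args);
    };
  }
  return log;
}

// True when `pattern` occurs as a contiguous run in the recorded call order.
function containsSequence(log, pattern) {
  return log.some((_, i) => pattern.every((name, j) => log[i + j] === name));
}

// Hypothetical sandbox standing in for the functions a script can reach.
const sandbox = {
  unescape: (s) => decodeURIComponent(s),
  fromCharCode: (...codes) => String.fromCharCode(...codes),
};
const callLog = instrument(sandbox, ['unescape', 'fromCharCode'], []);

sandbox.unescape('%41');
sandbox.fromCharCode(66);
sandbox.unescape('%43');

console.log(callLog); // [ 'unescape', 'fromCharCode', 'unescape' ]
console.log(containsSequence(callLog, ['fromCharCode', 'unescape'])); // true
```

The recorded log is the raw material for both the call-order analysis and the check for risky combinations.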
Dynamic Analysis AdHoc Forwarding Browser Automation
Attach to IE process Use shdocvw.dll to automate COM callbacks Capture events while they trigger and manipulate them Analyze in the same manner as AdHoc Forwarding
Dynamic Analysis
Browser In-Memory Injection
- Inject JS into the DOM to monitor events
- Use a JS debugger (Firebug or other)
Static analysis (Method 1)
- Extract local scripts
- Extract remote scripts
Static analysis (Method 1)
- Analyze the scripts and categorize them based on certain criteria
- Web page encoding
- Detect the language used and extract features
- Check the WHOIS record for the web page
- Determine probabilistically which category it belongs to
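The slides do not list the features that are extracted, so the following is only a guess at what a feature vector for one script might look like. The function `extractFeatures` and all four feature names are hypothetical, chosen to show the idea of turning source text into numbers a classifier can use.

```javascript
// Turn a script's source text into a small set of numeric/boolean features.
function extractFeatures(source) {
  const escapes = source.match(/%[0-9a-fA-F]{2}/g) || [];
  const literals = source.match(/'[^']*'|"[^"]*"/g) || [];
  // Length of the longest string literal, excluding its two quote characters.
  const longestLiteral = literals.reduce(
    (max, lit) => Math.max(max, lit.length - 2),
    0
  );
  return {
    length: source.length,
    escapeCount: escapes.length, // number of %XX sequences
    longestLiteral,
    usesEval: /\beval\s*\(/.test(source),
  };
}

const features = extractFeatures("var a = '%41%42'; eval(unescape(a));");
console.log(features);
```

Each feature on its own says little; the probabilistic step combines them into a category decision.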
Shannon’s entropy
Formula: $H(X) = -\sum_{i=1}^{n} p(x_i) \log_2 p(x_i)$
We use Shannon’s entropy to measure the entropy of the file only as a side signal, not as a main criterion for deciding whether it is malicious
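As a rough illustration, the entropy of a script's character distribution can be computed as follows. The sketch assumes a base-2 logarithm, so the result is in bits per character, and assumes the distribution is taken over characters; `shannonEntropy` is a hypothetical name and the slides do not say which symbol alphabet is used.

```javascript
// Shannon entropy (bits per character) of the character distribution of `text`.
function shannonEntropy(text) {
  if (text.length === 0) return 0;
  const counts = new Map();
  for (const ch of text) counts.set(ch, (counts.get(ch) || 0) + 1);
  let total = 0;
  for (const c of counts.values()) total += c;
  let entropy = 0;
  for (const c of counts.values()) {
    const p = c / total; // relative frequency of this character
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

console.log(shannonEntropy('aaaa')); // 0
console.log(shannonEntropy('abab')); // 1
console.log(shannonEntropy('abcd')); // 2
```

Packed or encoded payloads tend to score higher than hand-written code, but minified legitimate scripts do too, which fits the slide's point that entropy is only a side signal.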
Naïve Bayesian
A machine-learning technique that can be used to predict to which category a particular data case belongs
Formula: $P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$
Given the above formula: an event A is INDEPENDENT of event B if the conditional probability is the same as the marginal probability, i.e. $P(A \mid B) = P(A)$
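A two-category version of the classifier fits in a few lines. This sketch assumes the categories are "malicious" and "benign" and that each feature contributes a pair of likelihoods; the function name `naiveBayesPosterior` and all numbers are illustrative, not taken from the talk.

```javascript
// Posterior probability of "malicious" given observed features, assuming the
// features are conditionally independent (the "naive" assumption).
// `likelihoods` is a list of [P(feature | malicious), P(feature | benign)].
function naiveBayesPosterior(prior, likelihoods) {
  let malicious = prior;
  let benign = 1 - prior;
  for (const [pGivenMalicious, pGivenBenign] of likelihoods) {
    malicious *= pGivenMalicious;
    benign *= pGivenBenign;
  }
  return malicious / (malicious + benign);
}

// Illustrative numbers: 10% prior, one feature four times likelier in malicious code.
console.log(naiveBayesPosterior(0.1, [[0.8, 0.2]])); // about 0.31

// A feature that is equally likely in both categories leaves the prior unchanged.
console.log(naiveBayesPosterior(0.1, [[0.5, 0.5]])); // about 0.1
```

The second call is the independence statement in action: when the conditional probability carries no extra information, the posterior equals the marginal.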
Laplacian Smoothing
To avoid a zero in any partial probability, which would zero out the joint probability, we use the add-one smoothing technique.
Given an observation x = (x1, …, xd) from a multinomial distribution with N trials and parameter vector θ = (θ1, …, θd), a "smoothed" version of the data gives the estimator
$\hat{\theta}_i = \frac{x_i + \alpha}{N + \alpha d}, \quad i = 1, \dots, d$
where α > 0 is the smoothing parameter (α = 0 corresponds to no smoothing)
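The estimator is straightforward to compute from raw counts. In the sketch below, `smoothedEstimates` is a hypothetical helper, `counts` plays the role of x, and `alpha` defaults to 1 for add-one smoothing; the example counts are made up.

```javascript
// Additive (Laplace) smoothing: theta_i = (x_i + alpha) / (N + alpha * d),
// where N is the total count and d is the number of categories.
function smoothedEstimates(counts, alpha = 1) {
  const n = counts.reduce((sum, x) => sum + x, 0);
  const d = counts.length;
  return counts.map((x) => (x + alpha) / (n + alpha * d));
}

// A feature value never seen in training (count 0) no longer gets probability 0.
console.log(smoothedEstimates([3, 0, 1]));    // [4/7, 1/7, 2/7]
console.log(smoothedEstimates([3, 0, 1], 0)); // [0.75, 0, 0.25] (no smoothing)
```

With α = 1 the unseen value gets probability 1/7 instead of 0, so a single missing observation cannot wipe out the whole product of probabilities.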
Static analysis (Method 2)
How is JS executed/handled?
- The code is scanned for all function declarations. Each declaration is processed by creating a function object and a named reference to it, so that the function can be called from within a statement.
- The statements are then evaluated and executed in the order in which they appear on the page, once fully loaded.
JS Example #1
This works:

    <script>
    DoNothing();
    function DoNothing() { return; }
    </script>

JS Example #2
This does not work:

    <script>
    DoNothing();
    </script>
    <script>
    function DoNothing() { return; }
    </script>

JS Example #3
This works:

    <script>
    function DoNothing() { return; }
    </script>
    <script>
    DoNothing();
    </script>

JS Example #4
This does not work:

    <script>
    // assuming that DoNothing is not defined
    DoNothing();
    alert(1);
    </script>

JS Example #5
This works:

    <script>
    // assuming that DoNothing is not defined
    DoNothing();
    </script>
    <script>
    alert(1);
    </script>
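The behaviour in these examples can be reproduced outside a browser by treating each string as its own script block. This is a simulation under assumptions: `runBlocks` is a hypothetical helper, indirect eval is used because it evaluates code in the global scope much like a script block does, and `globalThis.alerted = true` stands in for `alert(1)`.

```javascript
// Run each string as if it were a separate <script> block sharing one global
// scope. Returns 'ok' or the error name for every block, in order.
function runBlocks(blocks) {
  delete globalThis.DoNothing; // start each scenario with DoNothing undefined
  const globalEval = eval;     // calling eval through an alias makes it indirect
  return blocks.map((code) => {
    try {
      globalEval(code);
      return 'ok';
    } catch (err) {
      return err.name; // an error stops this block only, not the later ones
    }
  });
}

const decl = 'function DoNothing() { return; }';
const example1 = runBlocks(['DoNothing(); ' + decl]);    // hoisted within one block
const example2 = runBlocks(['DoNothing();', decl]);      // defined in a later block
const example3 = runBlocks([decl, 'DoNothing();']);      // defined in an earlier block
const example4 = runBlocks(['DoNothing(); globalThis.alerted = true;']);
const example5 = runBlocks(['DoNothing();', 'globalThis.alerted = true;']);

console.log(example1, example2, example3, example4, example5);
// [ 'ok' ] [ 'ReferenceError', 'ok' ] [ 'ok', 'ok' ] [ 'ReferenceError' ] [ 'ReferenceError', 'ok' ]
```

Declarations are hoisted only within their own block, and an uncaught error aborts only the block it occurs in, which is what lets a split script continue after a failing piece.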
Static analysis (Method 2)
- Semantic analysis that focuses on “what does this mean”
- An optimizer-compiler for JS that focuses on structure rather than on extracted invocative functions
Optimizer-compiler
The following describes the architecture of any ordinary compiler, and of the current compiler as well:
Lexer → Tokens → Parser → AST → Translator → IR → Optimizer
Optimizer-compiler
At this phase, after the AST has been generated and converted into an IR, the optimizer tries to optimize the JS input using established optimization techniques:
- Inline Expansion
- Loop Invariant Code Motion
- Hidden Classes
- Constant Folding
- Type Inference
- Copy Propagation
- Inline Caches
- Common Sub-Expression Elimination
- Function Synthesis
- Dead Code Elimination
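Constant folding is the easiest of these passes to show, and it hints at why an optimizer helps against obfuscation: split strings collapse back into a single literal. The sketch below is not the authors' implementation. It works on a tiny hand-built, ESTree-like tree instead of a real IR, handles only the `+` operator on literals, and `fold`, `lit`, and `add` are hypothetical helpers.

```javascript
// Fold constant sub-expressions bottom-up. Only Literal + Literal is folded;
// every other node is returned with its (possibly folded) children.
function fold(node) {
  if (node.type !== 'BinaryExpression') return node;
  const left = fold(node.left);
  const right = fold(node.right);
  if (node.operator === '+' && left.type === 'Literal' && right.type === 'Literal') {
    return { type: 'Literal', value: left.value + right.value };
  }
  return { ...node, left, right };
}

// Small helpers for building the tree by hand.
const lit = (value) => ({ type: 'Literal', value });
const add = (left, right) => ({ type: 'BinaryExpression', operator: '+', left, right });

// Tree for the expression: "un" + "esc" + "ape"
const folded = fold(add(add(lit('un'), lit('esc')), lit('ape')));
console.log(folded); // { type: 'Literal', value: 'unescape' }

// A non-constant operand blocks folding of the outer expression.
const partial = fold(add(add(lit('a'), lit('b')), { type: 'Identifier', name: 'x' }));
console.log(partial.left); // { type: 'Literal', value: 'ab' }
```

After folding, the analysis sees the reconstructed name directly instead of three unrelated fragments.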