By Mohamed Saher and Ahmed Garhy


Catch and Release: A New Look at Detecting and Mitigating Highly Obfuscated Exploit Kits

Agenda
- Our Intent
- Rethinking Evasions
- Domain of the Problem
- Current Problem
- Problem with Current Solutions
- Solution #1: First Method
- Solution #2: Second Method

Our intent
Is this function malicious?
    function Translate(objects, offset, size) {
        var length = 4;
        for (var i = 0; i < size; i++) {
            var r = rc.substr(0, length);
            if (offset > 0) {
                r = r.substr(offset) + r.substr(0, offset);
            }
            objects[i] = r.substr(0, r.length);
        }
    }
Without understanding the context in which a function is used, it is very difficult to determine whether it is malicious.

Our intent
What about this script?
    <script>
    var a = '%25%33%43%69%66%72%61%6d%65 ...';
    var b = unescape(unescape(a));
    var spray = new Function(unescape(b));
    </script>
An "expert's eye" can probably tell that it looks suspicious, yet the two examples are actually equivalent.
Our intent is to allow an attack that uses the first example script, without depending on obfuscation like that in the second example script, and to propose a superior method for detecting both.
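
The nested `unescape` calls on the slide can be unwound mechanically. The sketch below is an illustration rather than the authors' tooling: it peels percent-encoding layers off the visible prefix of the string (the rest is elided on the slide and is not guessed at here). `peelPercentEncoding` is a hypothetical helper name, and `decodeURIComponent` stands in for the deprecated `unescape`.

```javascript
// Repeatedly decode percent-encoding until the text stops changing.
// Only decodes; nothing is ever evaluated.
function peelPercentEncoding(input, maxLayers = 10) {
  const layers = [];
  let current = input;
  for (let i = 0; i < maxLayers; i++) {
    let decoded;
    try {
      decoded = decodeURIComponent(current);
    } catch (err) {
      break; // malformed escape sequence: stop peeling
    }
    if (decoded === current) break; // fixed point reached
    layers.push(decoded);
    current = decoded;
  }
  return layers;
}

// Visible prefix of the string from the slide (the remainder is elided there).
const prefix = '%25%33%43%69%66%72%61%6d%65';
console.log(peelPercentEncoding(prefix)); // [ '%3Ciframe', '<iframe' ]
```

Two layers are enough to expose the start of an IFRAME tag, which is the kind of "magic string" a pattern-based analyzer would look for.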

Rethinking evasions: Designing a new architecture
- Use a message-oriented architecture (MOA) to split the attack into disparate, self-contained messages, which we refer to as "units of work".
- This is a variation of the "script splitting" technique, except that a message exists within a local scope and is destroyed after it serves its purpose.
- It does not require DOM manipulation to hide "magic strings".
- It avoids the "magic redirect IFRAME" that can be a trigger for some analyzers.

Rethinking evasions: Avoiding HTTP
- No artifact exists that can be parsed or scanned for patterns, characteristics, and definitions.
- This is an alternative to loading JavaScript in "clear text".
- Load one message at a time, forcing each message to be analyzed independently (remember "units of work").
- Web Sockets are a perfect candidate for both MOA and bypassing HTTP from a web environment.
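
From the defender's side, the consequence of per-message analysis can be sketched as follows. This is a toy illustration under stated assumptions: the two indicator strings and the co-occurrence rule are hypothetical, and real analyzers use far richer features. It shows why a rule that fires on the whole conversation can stay silent on every individual message.

```javascript
// Hypothetical indicators chosen for illustration only.
const INDICATORS = ['unescape(', 'new Function('];

// Flags a piece of text only when every indicator occurs in it.
function looksSuspicious(text) {
  return INDICATORS.every((needle) => text.includes(needle));
}

// Each "unit of work" arrives as its own message.
const messages = [
  'var b = unescape(a);',
  'var f = new Function(b);',
];

const perMessage = messages.map(looksSuspicious);        // each message alone
const perSession = looksSuspicious(messages.join('\n')); // whole conversation

console.log(perMessage, perSession); // [ false, false ] true
```

An analyzer that keeps per-connection history sees the combination; one that inspects each message in isolation does not.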

Rethinking evasions: Avoiding client side state
- Two components are involved, a client and a server. The client listens and invokes; the server keeps state and sends.
- For each accepted connection from a client, the server maintains a state machine.
- Messages are essentially commands and do not depend on each other (remember "units of work").
- The client evaluates a message, invokes it, and destroys it.

Rethinking evasions: Limit control flow and function call hierarchy
- The only client control flow is that of the client listening for a message and invoking it.
- The order of messages is not guaranteed by the server. The server may send NOP messages as part of an attack to trick certain analyzers.
- "Monkey patch" functions that are dynamically evaluated in messages to trick certain analyzers.

Rethinking evasions: Getting creative in transport format
- Web Sockets behave much like simple TCP pipes, so data can be represented on the wire in an application-specific way.
- We are no longer restricted to sending JavaScript in clear text.
- Create a custom binary format.
- Send the message in binary on the wire, for example: 0100100001100101011011000110110001101111001000000100100001100001011011010110001001110101011100100110011100100001
- Simply looking at a binary message gives no hint about its contents: is it an audio file, an image, or even text?
- To even begin to understand a binary message, its format specification must be known beforehand; otherwise, interpreting it is a very challenging problem in its own right.
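
As a small illustration of why the format must be known, the bit string shown on the slide becomes readable only once a framing is assumed. The sketch below assumes plain 8-bit ASCII, which is the simplest possible "format specification"; a custom binary format would give an analyzer no such shortcut. `bitsToAscii` is a hypothetical helper name.

```javascript
// The bit string from the slide, split into 8-bit groups for readability.
const bits = [
  '01001000', '01100101', '01101100', '01101100', '01101111', '00100000',
  '01001000', '01100001', '01101101', '01100010', '01110101', '01110010',
  '01100111', '00100001',
].join('');

// Interpret a string of '0'/'1' characters as consecutive 8-bit ASCII codes.
function bitsToAscii(bitString) {
  let out = '';
  for (let i = 0; i + 8 <= bitString.length; i += 8) {
    out += String.fromCharCode(parseInt(bitString.slice(i, i + 8), 2));
  }
  return out;
}

console.log(bitsToAscii(bits)); // Hello Hamburg!
```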

Rethinking evasions: Confusing the context
Remember this function?
    function Translate(objects, offset, size) {
        var length = 4;
        for (var i = 0; i < size; i++) {
            var r = rc.substr(0, length);
            if (offset > 0) {
                r = r.substr(offset) + r.substr(0, offset);
            }
            objects[i] = r.substr(0, r.length);
        }
    }
Now that we receive this through our binary format, we ask the question again: how do you determine whether it is malicious?

Domain of the problem
- How can we define a malicious website?
- How can we detect a malicious website?
- How can we detect obfuscation?
- How can we identify obfuscation used for malicious purposes?
- How can we categorize what is malicious and what is not?

Current Problem
- Exploit delivery relies on JavaScript at some point.
- JavaScript obfuscation keeps growing more complex.
- Current solutions lag well behind technologically.

Problems with current solutions
- They rely heavily on invocative functions (fromCharCode, eval, unescape, etc.) that are not a concrete basis for calling code malicious and that have plenty of legitimate use cases:
  - DOM and CSS selectors
  - Client-side proxies for client-server interaction
  - Client-side template engines
- They use limited sets of characteristics.
- The quality of a probabilistic decision is directly proportional to the characteristics extracted.

Types of approaches
- Dynamic analysis of embedded JS
- Static analysis of extracted JS (Method #1)
- Static analysis of extracted JS (Method #2)

Dynamic Analysis: AdHoc Forwarding
- Create a middle layer between the browser and the JS engine.
- Analyze the control flow graph (CFG) of the scripts being executed.
- Analyze the call hierarchy and the order in which functions are called.
- Analyze certain combinations of functions used, including known high-risk ones.
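
A "middle layer" of this kind can be approximated in miniature by wrapping functions and recording the order in which they are called. The sketch below is a simplified stand-in rather than the authors' implementation: `instrument`, `hasRiskySequence`, the `env` object, and the rule "decode, then create code" are all hypothetical, and `makeFunction` only returns text so that nothing is ever executed.

```javascript
// Replace selected functions on `target` with wrappers that log each call.
function instrument(target, names, callLog) {
  for (const name of names) {
    const original = target[name];
    target[name] = function (...args) {
      callLog.push(name);
      return original.apply(this, args);
    };
  }
}

// Hypothetical rule: a decode step followed later by dynamic code creation.
function hasRiskySequence(callLog) {
  const decodeAt = callLog.indexOf('unescape');
  return decodeAt !== -1 && callLog.indexOf('makeFunction', decodeAt + 1) !== -1;
}

// Stand-in for the environment a script would run in (an assumption).
const env = {
  unescape: (s) => decodeURIComponent(s),
  makeFunction: (body) => `function(){${body}}`, // returns text, never runs it
};

const callLog = [];
instrument(env, ['unescape', 'makeFunction'], callLog);
env.makeFunction(env.unescape('%61%62'));

console.log(callLog, hasRiskySequence(callLog)); // [ 'unescape', 'makeFunction' ] true
```

The same log could feed a call-hierarchy or control-flow analysis; only the ordering check is shown here.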

Dynamic Analysis: Browser Automation
- Attach to the Internet Explorer (IE) process.
- Use shdocvw.dll to automate COM callbacks.
- Capture events as they trigger and manipulate them.
- Analyze in the same manner as AdHoc Forwarding.

Dynamic Analysis: Browser In-Memory Injection
- Inject JS into the DOM to monitor events.
- Use a JS debugger (Firebug or another).

Static analysis (Method 1)
- Extract local scripts.
- Extract remote scripts.

Static analysis (Method 1)
- Analyze the scripts and categorize them based on certain criteria.
- Web page encoding.
- Detect the language used and extract features.
- Check the WHOIS record for the web page.
- Determine probabilistically which category it belongs to.
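
Feature extraction for such a classifier can be sketched as below. The slides do not list the actual features used, so every feature here (`evalCalls`, `percentEscapes`, and so on) is a hypothetical example chosen for illustration, and `extractFeatures` is an assumed helper name.

```javascript
// Turn a script's source text into a small numeric feature record.
function extractFeatures(script) {
  const count = (re) => (script.match(re) || []).length;
  return {
    length: script.length,
    evalCalls: count(/\beval\s*\(/g),
    unescapeCalls: count(/\bunescape\s*\(/g),
    fromCharCodeCalls: count(/\bfromCharCode\s*\(/g),
    percentEscapes: count(/%[0-9a-fA-F]{2}/g),
  };
}

const sample = "var a = '%25%33%43'; var b = unescape(unescape(a));";
console.log(extractFeatures(sample));
```

For this sample the record reports two `unescape` calls, three percent escapes, and no `eval` calls. As the earlier slides point out, such counts alone are a weak basis for a verdict, which is why they only feed a probabilistic decision.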

Shannon's entropy
Formula: H(X) = -Σ p(x_i) log2 p(x_i), summed over the symbols x_i that occur in the file.
We compute the Shannon entropy of the file only as a side effect; it is not a main criterion for deciding whether the file is malicious.
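
A minimal sketch of the entropy computation, assuming the standard definition H = -Σ p_i log2 p_i over character frequencies and measuring in bits per character. `shannonEntropy` is a hypothetical helper name, and per-character symbols are an assumption; a byte-level variant would work the same way.

```javascript
// Shannon entropy of a string, in bits per character.
function shannonEntropy(text) {
  const symbols = Array.from(text); // iterate by code point
  if (symbols.length === 0) return 0;

  const counts = new Map();
  for (const ch of symbols) {
    counts.set(ch, (counts.get(ch) || 0) + 1);
  }

  let entropy = 0;
  for (const n of counts.values()) {
    const p = n / symbols.length;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}

console.log(shannonEntropy('aaaa')); // one symbol: 0 bits
console.log(shannonEntropy('aabb')); // two equally likely symbols: 1 bit
console.log(shannonEntropy('abcd')); // four equally likely symbols: 2 bits
```

Heavily encoded or packed scripts tend to score higher than hand-written code, but so do minified libraries, which fits the slide's point that entropy is a side signal rather than a deciding criterion.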

Naïve Bayesian
A machine-learning technique that can be used to predict which category a particular data case belongs to.
Given the conditional probability P(A|B): an event A is independent of event B if the conditional probability is the same as the marginal probability, that is, P(A|B) = P(A).

Laplacian Smoothing
To avoid a zero factor in any partial probability of the joint, we use the add-one smoothing technique.
Given an observation x = (x1, …, xd) from a multinomial distribution with N trials and parameter vector θ = (θ1, …, θd), a "smoothed" version of the data gives the estimator θ̂i = (xi + α) / (N + αd) for i = 1, …, d, where α > 0 is the smoothing parameter (α = 0 corresponds to no smoothing). Add-one smoothing is the case α = 1.
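
The two ideas combine naturally. The sketch below is a toy multinomial naive Bayes classifier with additive smoothing, using the standard estimator (x_i + α) / (N + αd), where d is the vocabulary size. The training data, the labels, and the function names `train` and `classify` are all hypothetical, and the sketch makes no claim to match the authors' feature set.

```javascript
// Count token occurrences per class.
function train(examples, alpha = 1) {
  const model = { alpha, total: examples.length, vocab: new Set(), classes: new Map() };
  for (const { tokens, label } of examples) {
    if (!model.classes.has(label)) {
      model.classes.set(label, { docs: 0, tokenTotal: 0, counts: new Map() });
    }
    const c = model.classes.get(label);
    c.docs += 1;
    for (const t of tokens) {
      c.counts.set(t, (c.counts.get(t) || 0) + 1);
      c.tokenTotal += 1;
      model.vocab.add(t);
    }
  }
  return model;
}

// Pick the class with the highest log prior plus smoothed log likelihoods.
function classify(model, tokens) {
  const d = model.vocab.size;
  let best = null;
  for (const [label, c] of model.classes) {
    let score = Math.log(c.docs / model.total);
    for (const t of tokens) {
      const theta = ((c.counts.get(t) || 0) + model.alpha) / (c.tokenTotal + model.alpha * d);
      score += Math.log(theta);
    }
    if (best === null || score > best.score) best = { label, score };
  }
  return best.label;
}

// Toy, made-up training set.
const model = train([
  { tokens: ['unescape', 'eval', 'fromCharCode'], label: 'suspicious' },
  { tokens: ['unescape', 'unescape', 'Function'], label: 'suspicious' },
  { tokens: ['getElementById', 'addEventListener'], label: 'benign' },
  { tokens: ['querySelector', 'eval'], label: 'benign' },
]);

console.log(classify(model, ['unescape', 'Function']));             // suspicious
console.log(classify(model, ['getElementById', 'querySelector'])); // benign
```

Without smoothing, the token `unescape` would have probability zero under the benign class and drive that class's whole product to zero regardless of any other evidence; with α = 1 it only lowers the score.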

Static analysis (Method 2): How is JS executed/handled?
- The code is scanned for all function declarations. Each declaration is processed by creating a function object and a named reference to it, so that the function can be called from within a statement.
- The statements are then evaluated and executed in the order in which they appear, after the script block has fully loaded.

JS Example #1
This works:
    <script>
    DoNothing();
    function DoNothing() { return; }
    </script>

JS Example #2
This does not work:
    <script>
    DoNothing();
    </script>
    <script>
    function DoNothing() { return; }
    </script>

JS Example #3
This works:
    <script>
    function DoNothing() { return; }
    </script>
    <script>
    DoNothing();
    </script>

JS Example #4
This does not work:
    <script>
    // assuming that DoNothing is not defined
    DoNothing();
    alert(1);
    </script>

JS Example #5
This works:
    <script>
    // assuming that DoNothing is not defined
    DoNothing();
    </script>
    <script>
    alert(1);
    </script>
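
The behaviour in these examples can be reproduced outside a browser. The sketch below assumes that running each string through Node's `vm` module against one shared context is a fair stand-in for consecutive script blocks on a page; `runBlocks` is a hypothetical helper, and a `log` array replaces `alert` so that the result can be inspected.

```javascript
const vm = require('vm');

// Run each string as its own script block against one shared global object.
function runBlocks(blocks) {
  const context = vm.createContext({ log: [] });
  const results = blocks.map((code) => {
    try {
      vm.runInContext(code, context);
      return 'ok';
    } catch (err) {
      return err.name; // for example "ReferenceError"
    }
  });
  return { results, log: context.log };
}

// Same block: the declaration is hoisted, so the early call works.
const sameBlock = runBlocks(['DoNothing(); function DoNothing() { return; }']);

// Call in an earlier block than the declaration: the call fails.
const callFirst = runBlocks(['DoNothing();', 'function DoNothing() { return; }']);

// A failing call stops its own block but not the next one.
const oneBlock = runBlocks(["DoNothing(); log.push('alert');"]);
const twoBlocks = runBlocks(['DoNothing();', "log.push('alert');"]);

console.log(sameBlock.results);  // [ 'ok' ]
console.log(callFirst.results);  // [ 'ReferenceError', 'ok' ]
console.log(oneBlock.log);       // []
console.log(twoBlocks.log);      // [ 'alert' ]
```

An error is therefore contained within its block, which matters for any analyzer that stops at the first failing statement.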

Static analysis (Method 2)
- Semantic analysis focuses on "what does this mean".
- An optimizer-compiler for JS focuses on structure rather than on extracted invocative functions.

Optimizer-compiler
The following describes the architecture of any ordinary compiler, and of the current compiler as well:
Lexer -> Tokens -> Parser -> AST -> Translator -> IR -> Optimizer

Optimizer-compiler: Optimizer
At this phase, after the AST has been generated and converted into an IR, the optimizer tries to optimize the JS input using established optimization techniques:
- Inline Expansion
- Loop Invariant Code Motion
- Hidden Classes
- Constant Folding
- Type Inference
- Copy Propagation
- Inline Caches
- Common Sub-Expression Elimination
- Function Synthesis
- Dead Code Elimination
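
To show why an optimizer pass helps against obfuscation, here is a minimal sketch of one of the listed techniques, constant folding, applied to string concatenation. The AST node shapes (`Literal`, `Binary`, `Identifier`) and the helper names are assumptions made for this example and do not correspond to any particular parser or to the authors' IR.

```javascript
// Fold `literal + literal` sub-expressions bottom-up into a single literal.
function fold(node) {
  if (node.type !== 'Binary') return node;
  const left = fold(node.left);
  const right = fold(node.right);
  if (node.op === '+' && left.type === 'Literal' && right.type === 'Literal') {
    return { type: 'Literal', value: left.value + right.value };
  }
  return { ...node, left, right };
}

// Tiny constructors for the hypothetical AST.
const lit = (value) => ({ type: 'Literal', value });
const add = (left, right) => ({ type: 'Binary', op: '+', left, right });

// Represents the expression ('une' + 'sc') + 'ape'.
const split = add(add(lit('une'), lit('sc')), lit('ape'));
console.log(fold(split)); // { type: 'Literal', value: 'unescape' }

// A variable operand blocks folding at that node only.
const partial = add(add(lit('a'), lit('b')), { type: 'Identifier', name: 'x' });
console.log(fold(partial).left); // { type: 'Literal', value: 'ab' }
```

After folding, a name that was split across string fragments appears whole in the tree, so later analysis can reason about structure instead of matching raw text.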