Sruthi Bandhakavi Samuel T. King P. Madhusudan Marianne Winslett University of Illinois at Urbana Champaign

Slides:



Advertisements
Similar presentations
PHP I.
Advertisements

Hossain Shahriar Mohammad Zulkernine. One of the worst vulnerabilities in web applications It involves the generation of dynamic HTML contents with invalidated.
CPSC 388 – Compiler Design and Construction
Semantics Static semantics Dynamic semantics attribute grammars
Data-Flow Analysis II CS 671 March 13, CS 671 – Spring Data-Flow Analysis Gather conservative, approximate information about what a program.
JavaScript FaaDoOEngineers.com FaaDoOEngineers.com.
Context-Sensitive Interprocedural Points-to Analysis in the Presence of Function Pointers Presentation by Patrick Kaleem Justin.
Programming Languages and Paradigms
GATEKEEPER MOSTLY STATIC ENFORCEMENT OF SECURITY AND RELIABILITY PROPERTIES FOR JAVASCRIPT CODE Salvatore Guarnieri & Benjamin Livshits Presented by Michael.
Ensuring Operating System Kernel Integrity with OSck By Owen S. Hofmann Alan M. Dunn Sangman Kim Indrajit Roy Emmett Witchel Kent State University College.
The Assembly Language Level
Program Representations. Representing programs Goals.
An Evaluation of the Google Chrome Extension Security Architecture
By Philipp Vogt, Florian Nentwich, Nenad Jovanovic, Engin Kirda, Christopher Kruegel, and Giovanni Vigna Network and Distributed System Security(NDSS ‘07)
1 / 28 Modeling the HTML DOM and Browser API in Static Analysis of JavaScript Web Applications ESEC/FSE 2011 Anders Møller, Magnus Madsen and Simon Holm.
CS 330 Programming Languages 10 / 16 / 2008 Instructor: Michael Eckmann.
Tutorial 10 Programming with JavaScript
XP Tutorial 1 New Perspectives on JavaScript, Comprehensive1 Introducing JavaScript Hiding Addresses from Spammers.
C++ fundamentals.
CSC 8310 Programming Languages Meeting 2 September 2/3, 2014.
D ATABASE S ECURITY Proposed by Abdulrahman Aldekhelallah University of Scranton – CS521 Spring2015.
VEX: Vetting Browser Extensions For Security Vulnerabilities Sruthi Bandhakavi, Samuel T. King, P. Madhusudan, Marianne Winslett University of Illinois.
4.1 JavaScript Introduction
Syntax & Semantic Introduction Organization of Language Description Abstract Syntax Formal Syntax The Way of Writing Grammars Formal Semantic.
JavaScript, Fifth Edition Chapter 1 Introduction to JavaScript.
VEX: VETTING BROWSER EXTENSIONS FOR SECURITY VULNERABILITIES XIANG PAN.
CSCI 6962: Server-side Design and Programming Secure Web Programming.
NDSS 2007 Philipp Vogt, Florian Nentwich, Nenad Jovanovic, Engin Kirda, Christopher Kruegel, Giovanni Vigna.
Behavior-based Spyware Detection By Engin Kirda and Christopher Kruegel Secure Systems Lab Technical University Vienna Greg Banks, Giovanni Vigna, and.
Lecture slides prepared for “Computer Security: Principles and Practice”, 3/e, by William Stallings and Lawrie Brown, Chapter 5 “Database and Cloud Security”.
CSC3315 (Spring 2009)1 CSC 3315 Programming Languages Hamid Harroud School of Science and Engineering, Akhawayn University
Computer Security and Penetration Testing
JavaScript: Functions © by Pearson Education, Inc. All Rights Reserved.
CSC-682 Cryptography & Computer Security Sound and Precise Analysis of Web Applications for Injection Vulnerabilities Pompi Rotaru Based on an article.
1 JavaScript in Context. Server-Side Programming.
Tutorial 10 Programming with JavaScript. XP Objectives Learn the history of JavaScript Create a script element Understand basic JavaScript syntax Write.
Tutorial 10 Programming with JavaScript
 2008 Pearson Education, Inc. All rights reserved JavaScript: Functions.
Extending HTML CPSC 120 Principles of Computer Science April 9, 2012.
WEB BASED DATA TRANSFORMATION USING XML, JAVA Group members: Darius Balarashti & Matt Smith.
Defending Browsers against Drive-by Downloads:Mitigating Heap-Spraying Code Injection Attacks Authors:Manuel Egele, Peter Wurzinger, Christopher Kruegel,
Formal Semantics Chapter Twenty-ThreeModern Programming Languages, 2nd ed.1.
Java server pages. A JSP file basically contains HTML, but with embedded JSP tags with snippets of Java code inside them. A JSP file basically contains.
Introduction to JavaScript CS101 Introduction to Computing.
Programming Languages and Design Lecture 3 Semantic Specifications of Programming Languages Instructor: Li Ma Department of Computer Science Texas Southern.
Software Engineering Laboratory, Department of Computer Science, Graduate School of Information Science and Technology, Osaka University IWPSE 2003 Program.
David Lawrence 7/8/091Intro. to PHP -- David Lawrence.
Protecting Browsers from Extension Vulnerabilities Paper by: Adam Barth, Adrienne Porter Felt, Prateek Saxena at University of California, Berkeley and.
Testing OO software. State Based Testing State machine: implementation-independent specification (model) of the dynamic behaviour of the system State:
Plug-in Architectures Presented by Truc Nguyen. What’s a plug-in? “a type of program that tightly integrates with a larger application to add a special.
Introduction Program File Authorization Security Theorem Active Code Authorization Authorization Logic Implementation considerations Conclusion.
CMSC 330: Organization of Programming Languages Operational Semantics a.k.a. “WTF is Project 4, Part 3?”
Review of Parnas’ Criteria for Decomposing Systems into Modules Zheng Wang, Yuan Zhang Michigan State University 04/19/2002.
JavaScript: Functions © by Pearson Education, Inc. All Rights Reserved.
Rich Internet Applications 2. Core JavaScript. The importance of JavaScript Many choices open to the developer for server-side Can choose server technology.
/ PSWLAB Evidence-Based Analysis and Inferring Preconditions for Bug Detection By D. Brand, M. Buss, V. C. Sreedhar published in ICSM 2007.
JavaScript Introduction and Background. 2 Web languages Three formal languages HTML JavaScript CSS Three different tasks Document description Client-side.
XML Schema – XSLT Week 8 Web site:
REEM ALMOTIRI Information Technology Department Majmaah University.
Database and Cloud Security
Automatic Web Security Unit Testing: XSS Vulnerability Detection Mahmoud Mohammadi, Bill Chu, Heather Richter, Emerson Murphy-Hill Presenter:
Static Detection of Cross-Site Scripting Vulnerabilities
CS 326 Programming Languages, Concepts and Implementation
JavaScript: Functions
PHP.
UNIT V Run Time Environments.
JavaScript CS 4640 Programming Languages for Web Applications
An Introduction to JavaScript
Protecting Browsers from Extension Vulnerabilities
Presentation transcript:

Sruthi Bandhakavi Samuel T. King P. Madhusudan Marianne Winslett University of Illinois at Urbana Champaign Presented by Doron Tromer

Introduction What is VEX in a few words ? A framework for highlighting potential security vulnerabilities in browser extensions by applying static information-flow analysis to the JavaScript code used to implement extensions. 2

Browser Extensions What are they? A browser extension is a computer program that extends the functionality of a web browser in some way. Extensions can be used to modify the behavior of existing features of the application or add entirely new features. Extensions are especially popular with Firefox, because Mozilla developers intend for the browser to be a fairly minimalistic application in order to reduce software bloat and bugs, while retaining a high degree of extensibility. 3

Mozilla Firefox Extensions Extension technologies: HTML, CSS, DOM, JavaScript, XPCOM, XPConnect, XPI, XUL, Mozilla JetPack Uses of extensions in Firefox: adding features - RSS readers, bookmark organizers, toolbars, FTP, Firebug, HttpFox modifying how the user views web pages – Adblock, Greasemonkey Plugins - Acrobat Reader, Flash Player, Windows Media Player 4

Mozilla privilege levels Page - for web pages displayed in the browser’s content window. restrictive - a page loaded from site x cannot access content from sites other than x Chrome - for elements belonging to Firefox and its extensions. Gives access to: all browser states and events OS resources all web pages Extensions have full chrome privileges by using an API called the XPCOM Components to extension JavaScript, thereby allowing the extensions to have access to all the resources Firefox can access. 5

Mozilla privilege levels – cont. 6 Chrome Page

privilege issues Extensions can: Access objects that run with page privileges and interact with page content. Access objects that run with full chrome privileges. Include user interface components via a chrome document, which also runs with full chrome privileges: window.content.document.  Thus, it can Lead to execution of remote code in privileged context, e.g. RSS reader extension takes the content of the RSS feed (HTML code) and insert it into the extension window. 7

Vulnerabilities in Browser Extensions Extensions might be malicious and exploit the full privileges. But even extensions written with benign intent can have subtle vulnerabilities that expose the user to a disastrous attack from the web.  mostly by injecting JavaScript into a data item that is executed by the extension under full browser privileges. Doing so attackers can: take over the browser steal cookies or protected passwords compromise confidential information hijack the host system 8

What is done today regarding these vulnerabilities Mozilla provides a set of security primitives to extension developers the goal: reducing the attack surface for extensions. disadvantages: discretionary primitives, difficult to understand and use correctly. Example: evalInSandbox (text, sandbox) 9

What is done today regarding these vulnerabilities – cont. The research community propose dynamic techniques such as SABRE system the goal: improving the security of extensions. how is it done: The SABRE system tracks JavaScript objects to prevent extensions from accessing sensitive information unsafely (using security labels for every JavaScript object inside the browser). pros: SABRE can prevent potentially malicious flows from both exploited extensions and from malicious extensions. cons: overhead (SunSpider - 6.1x, V8 JavaScript x). security violation notification: users must determine if a particular flow is malicious or benign. Determining whether extensions are malicious or harbor security vulnerabilities is a hard problem 10

Assumptions in VEX 1. The developer isn’t malicious, but he could write incorrect code that contains vulnerabilities. 2. There are no bugs in the browser itself. 3. There are no bugs in other browser extensibility mechanisms, such as plug-ins. 11

Attack models considered 1. Attacks that originate from web sites. The attacker can send arbitrary HTML and JavaScript to the user’s browser, that might lead to code injection or privilege escalation through buggy extensions. 2. In the second model some web sites are considered trusted. 12

Points of attack VEX focuses on vulnerable points for code injection and privilege escalation attacks: Eval: interprets string data as JavaScript and executes it dynamically. InnerHTML: each HTML element for a page has an innerHTML property that defines the text that occurs between that element’s tags. Extensions can change DOM (document object model) elements, or add new ones. EvalInSandbox: execution of JavaScript in the extension’s context with restricted privileges. WrappedJSObject: lets the extension access modified properties of the document object, even when automatic wrapping is on. 13

Information flow A variable A is said to depend on another variable B in a procedure if there is a path such that the value of B can cause the value of A to change we also say that there is a flow from variable B to variable A types of dependencies: strongly dependent: A = B + 1 weakly dependent: if (condition) A = B + 1 conditionally dependent: if (B > 0) A = 0 Information Flow Analysis (also called variable dependency analysis) is a study of the interdependencies of the program variables 14

Suspicious flow patterns tracked by VEX 1. From content document data to eval. 2. From content document data to innerHTML. 3. From Resource Description Framework (RDF) data to innerHTML. 4. EvalInSandbox return objects used improperly by code running with chrome privileges. 5. WrappedJSObject return object used improperly by code running with chrome privileges. These flows: Don’t always result in a vulnerability. Are not all of the possible extension security bugs. 15

An example for a suspicious flow pattern A flow from content document data to eval Wikipedia Toolbar, up to version

VEX’s work flow scheme 17

VEX’s anticipated contribution Such flow patterns may occur in only a few of the extensions that use these constructs. Mozilla offers an open-source automatic tool to help with reviews (see ) it just greps for strings that indicate dangerous patterns. then the reviewer needs to manually check all of the suspect extensions. this checking is difficult and error-prone. VEX is designed to help vetting the flows automatically, greatly reducing the number of extensions that need to be manually reviewed. 18

Static information flow analysis VEX is a general explicit information flow static analysis tool Computes flows between any source and sink. Tracks the precise dependencies of flows from variables to objects created in the JavaScript extension. This is a difficult task: large number of objects and functions. there are program defined objects as well as objects of DOM and of the extension (using XPCOM components) the objects are dynamic. new object properties can be created dynamically at run-time. functions are objects in JavaScript, they can be created, redefined dynamically, and passed as parameters. 19

Abstract Heaps The analysis uses an abstract heap (AH) the analysis keeps track of one abstract heap at each program point. VEX creates a node for every: object function Property Ignores the exact primitive values in the heap. The AH records explicit-flow dependencies to heap nodes. 20

Abstract Heaps – cont. A definition: Pvar – A set of all the program variables An abstract heap is a tuple: (ns,n,d,fr,dm,tm) ns - a set of heap locations. - represents the current node. - represents the subset of program variables that flow in to the current node n. - encodes the pointers representing properties (fields). What does mean? 21

Abstract Heaps – cont. - a relation that denotes a dependency map. What does mean? - a “this-map” relation, which is actually the relation of a function. What does mean? 22

A core subset of JavaScript Reflects the aspects of JavaScript, omitting certain features (such as eval) 23

The rules Big step operational semantics on abstract heaps: A relation prog - an program expression or statement - the initial abstract heap - the abstract heap obtained from the complete evaluation of prog starting from the heap This resulting heap, in every iteration, will be merged with the current heap, conservatively taking the union of dependencies. 24

Evaluating expressions 25

Evaluating expressions – cont. What happens to the AH when evaluating a constant? the only change is that the current node isn’t a heap location, and there isn’t any program variable that flow into it. Thus: Rule (CONSTANT) evaluates to a node with empty dependencies: 26

Evaluating expressions – cont. What happens to the AH when evaluating “this”? the current node is the node that is the scope of the current node. the program variables that flow into the current node are the those who flow into the scope of the current node. Thus: Rule (THIS) extracts the scope of the current node – 27

Evaluating expressions – cont. What happens to the AH when evaluating a variable access? There are 3 kinds of variable accesses: local JavaScript variables declared global JavaScript variables undeclared variables – automatically global 28

Evaluating expressions – cont. Thus, first the existence property x is checked in the current scope if it exists, the current node is the node of the variable, and so is the d part of AH Otherwise, the global node is checked for property x -if it exists, the same happens 29

Evaluating expressions – cont. Otherwise (not in the current or global scope), a new node is created and added to the global scope: a new heap location is created a new node is created its dependency is empty the existence property of x in the global heap is added to the fr the fact that the scope of the new node is the global heap, is added to “this-map” 30

Evaluating expressions – cont. What happens to the AH when evaluating a field access? if the variable x already exists in one of the heaps, and the field f of the node resulted from the variable access evaluation may be located in the field node, then: all the sets, maps and relations resulted by the evaluation are those of the AH resulted from the evaluation of the variable x only two additions: the current node is the one of the field, and the dependencies includes the program variables that flow into 31

Evaluating expressions – cont. Otherwise (if the variable x exists but the field node doesn’t) a new is created and added to the AH with the variable x : a new heap location is created a new node is created the dependencies are those of the AH resulted from the variable evaluation the existence property of f in the AH with x is added to the fr the fact that the scope of the new node is the node representing x, is added to “this-map” 32

Evaluating expressions – cont. What happens to the AH when evaluating a binary operation? the new AH is the union of dependencies of both the expressions includes union of heap locations, dependencies, fr’s, dependency maps and “this-maps”. The current node is a new node representing the operation. 33

Evaluating expressions – cont. What happens to the AH when evaluating a object literal? example: a summary is computed by recursively creating heap locations for each of its properties. 34

Evaluating expressions – cont. What happens to the AH when evaluating a function definition? like with object literals, except that new summary locations are created for each of the function arguments and also for the return variable. the function body is evaluated with respect to the new heap. the result of the evaluation is the new heap with the function summary attached to the node of the return value. 35

Evaluating expressions – cont. What happens to the AH when evaluating a function call? uses this summary to compute the node and dependencies of the return value. the return value of the function can be obtained by evaluating each of the function argument expressions, and replacing the appropriate nodes in the function summary with the values returned. if the function is not defined, then the dependencies of the return values are the union of dependencies of the individual function parameters. 36

Evaluating statements 37

Evaluating statements – cont. What happens to the AH when evaluating skip and sequence statements? What happens to the AH when evaluating a variable declaration? a new node is created in the current scope. if the heap node for that variable already exists, it is replaced by this new node. 38

Evaluating statements – cont. What happens to the AH when evaluating assignment statements? the left hand side and the right hand side expressions are evaluated, and the node on the left hand side is replaced with the node on the right hand side. 39

Evaluating statements – cont. What happens to the AH when evaluating conditionals? they are not evaluated as our heaps are symbolic What happens to the AH when evaluating a return statement? If evaluation of e with the AH results in, then the AH after returning e is the same, with the emphasis on the change in fr. 40

Evaluating statements – cont. What happens to the AH when evaluating while statements? while statements, like conditionals, are not evaluated as our heaps are symbolic the while body is evaluated until we reach a fixed point (or until we reach a fixed number of loop un-rollings) the abstract heap is also allowed to immediately go across a while-loop The analyze begins with an initial state consisting of a global heap (with summaries for a few built-in objects like Array) The evaluation of the rules either proceed until we converge on a least fixed-point, or until we reach a preset bound on the number of iterations 41

Handling other features of JavaScript Dynamic code (eval): an accurate analysis of the structure of dynamically created code is too complex furthermore, eval statements cannot be simply ignored VEX implements a static constant-string analysis for strings, and subject the strings that are eval-ed to this analysis Strings that are not statically known but subject to eval are essentially ignored innerHTML: creating a symbolic representation of the source, computing summaries of innerHTML and allowing outside methods to instantiate the symbolic source to a concrete source in whichever context it becomes available. 42

Notes about the analysis The analysis is: Flow-sensitive: takes into account the order of statements in a program. Path-sensitive: computes different pieces of analysis information dependent on the predicates at conditional branch instructions. Context-sensitive: interprocedural analysis that considers the calling context when analyzing the target of a function call. 43

Notes about the analysis – cont. Unsoundness: a static analysis tool like VEX is inherently conservative if VEX reports a flow, there may be no such feasible flow in the program (false positives) Incompleteness false negatives are also possible because of several unsummarized objects VEX has several sources of unsoundness and incompleteness: eval prototypes higher-order functions fixed number of unrolls of loops exceptions 44

Evaluation: VEX implementation VEX: is implemented in Java (2000 LOC) utilizes a JavaScript parser built using the ANTLR parser generator for the JavaScript 1.5 grammar. ANTLR outputs Java-based Abstract Syntax Trees (AST) for JavaScript files. VEX walks through the ASTs computing the flow sets from all sources to all sinks, in a single pass analysis 45

Evaluation - cont. 1. The current version of VEX checks these flow patterns that capture flows from injectable sources to executable sinks: 46

Evaluation - cont. 2. Furthermore, VEX searches for these patterns that characterize unsafe programming practices that could lead to security vulnerabilities: The VEX tool can be adapted to other kinds of suspect flows 47

Evaluation methodology The experiment’s steps: 1. Chose a random sample of 1827 extensions from the Mozilla add-ons web site (first extensions in alphabetical order for all subject categories) 2. Chose 699 of the most popular extensions (74 extensions in common, total of 2452 extensions) 3. Extracted the JavaScript files from these extensions 4. Ran VEX on them, using a 2.4GHz 64 bit x86 processor with a maximum heap size of 4GB for the JVM 48

Experimental results Finding flows from injectible sources to executable sinks: on average, VEX took 15.5 seconds per extension 49

Experimental results – cont. Finding unsafe programming practices: 15 of the alerts were analyzed manually 50

Successful attacks Attack scripts example: 51

Successful attacks Vulnerabilities founded by VEX: Wikipedia Toolbar, up to version Fizzle versions 0.5, 0.5.1, Beatnik version

Conclusion Advantages of VEX: VEX vets the flows automatically greatly reduces the number of extensions that need to be manually reviewed 15.5 seconds per extension instead of hours more accurate than manual review VEX performs the analysis only once and from the results, allow us to search for any source-to-sink flow Flow-sensitive, path-sensitive, context-sensitive analysis 53

Conclusion – cont. Disadvantages of VEX: Unsoundness and incompleteness false positives and false negatives The design choices aren’t necessarily optimal No modeling of actual values conditional and while statements aren’t evaluated The evaluation is executed until reaching a specific condition No evaluation of prototypes No evaluation of statically unknown strings subject to eval There is no information about the existence of known vulnerabilities that VEX hasn’t detected 54

Future Work 1. A points-to analysis more precise on certain aspects of JavaScript such as higher order functions, prototypes, and scoping 2. Defining a more complete set of flow-patterns (sources and sinks) that capture vulnerabilities 3. Automatically building attack vectors for statically discovered flows, by a constraint solver can help synthesize attacks (handling sanitization routines effectively) 55

תודה רבה ! 56