Towards Fully Automatic Placement of Security Sanitizers and Declassifiers Ben Livshits, Microsoft Research Stephen Chong, Harvard University.

Slides:



Advertisements
Similar presentations
Accelerating The Application Lifecycle. DEPLOY DEFINE DESIGN TEST DEVELOP CHANGE MANAGEMENT Application Lifecycle Management #1 in Java Meta, Giga, Gartner.
Advertisements

University of South Australia Distributed Reconfiguration Avishek Chakraborty, David Kearney, Mark Jasiunas.
Overview Structural Testing Introduction – General Concepts
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) SSA Guo, Yao.
2014 Network and Distributed System Security Symposium AppSealer: Automatic Generation of Vulnerability-Specific Patches for Preventing Component Hijecking.
Building web applications on top of encrypted data using Mylar Presented by Tenglu Liang Tai Liu.
Program Slicing Mark Weiser and Precise Dynamic Slicing Algorithms Xiangyu Zhang, Rajiv Gupta & Youtao Zhang Presented by Harini Ramaprasad.
Presented by Vaibhav Rastogi.  Advent of Web 2.0 and Mashups  Inclusion of untrusted third party content a necessity  Need to restrict the functionality.
By Philipp Vogt, Florian Nentwich, Nenad Jovanovic, Engin Kirda, Christopher Kruegel, and Giovanni Vigna Network and Distributed System Security(NDSS ‘07)
Partial Redundancy Elimination. Partial-Redundancy Elimination Minimize the number of expression evaluations By moving around the places where an expression.
Using Programmer-Written Compiler Extensions to Catch Security Holes Authors: Ken Ashcraft and Dawson Engler Presented by : Hong Chen CS590F 2/7/2007.
1 Steve Chenoweth Friday, 10/21/11 Week 7, Day 4 Right – Good or bad policy? – Asking the user what to do next! From malware.net/how-to-remove-protection-system-
Technical Architectures
Future Work Needed Kenneth Wade Najim Yaqubie. Outline 1.Model is simple 2.Too many assumptions 3.Conflicting internal architectures 4.Security Challenges.
The Most Dangerous Code in the Browser Stefan Heule, Devon Rifkin, Alejandro Russo, Deian Stefan Stanford University, Chalmers University of Technology.
6/9/2015© Hal Perkins & UW CSEU-1 CSE P 501 – Compilers SSA Hal Perkins Winter 2008.
It’s always better live. MSDN Events Security Best Practices Part 2 of 2 Reducing Vulnerabilities using Visual Studio 2008.
Dept. of Computer Science & Engineering, CUHK1 Trust- and Clustering-Based Authentication Services in Mobile Ad Hoc Networks Edith Ngai and Michael R.
ReferencesReferences DiscussionDiscussion Vulnerability Example: SQL injection Auditing Tool for Eclipse LAPSE: a Security Auditing Tool for Eclipse IntroductionIntroductionResultsResults.
4/25/08Prof. Hilfinger CS164 Lecture 371 Global Optimization Lecture 37 (From notes by R. Bodik & G. Necula)
Incremental Path Profiling Kevin Bierhoff and Laura Hiatt Path ProfilingIncremental ApproachExperimental Results Path profiling counts how often each path.
Robust Declassification Steve Zdancewic Andrew Myers Cornell University.
Jonas Thomsen, Ph.d. student Computer Science University of Aarhus Best Practices and Techniques for Building Secure Microsoft.
1 RAKSHA: A FLEXIBLE ARCHITECTURE FOR SOFTWARE SECURITY Computer Systems Laboratory Stanford University Hari Kannan, Michael Dalton, Christos Kozyrakis.
Mobile Data Sharing over Cloud Group No. 8 - Akshay Kantak - Swapnil Chavan - Harish Singh.
Firewalls and VPNS Team 9 Keith Elliot David Snyder Matthew While.
Leveraging User Interactions for In-Depth Testing of Web Application Sean McAllister Secure System Lab, Technical University Vienna, Austria Engin Kirda.
Automatic Mediation Of Privacy-sensitive Resource Access In Smartphone Applications Ben Livshits and Jaeyeon Jung Microsoft Research.
OWASP Mobile Top 10 Why They Matter and What We Can Do
D ATABASE S ECURITY Proposed by Abdulrahman Aldekhelallah University of Scranton – CS521 Spring2015.
Basic Concepts of Computer Networks
CSCI 5801: Software Engineering
Secure Embedded Processing through Hardware-assisted Run-time Monitoring Zubin Kumar.
Peter R. Pietzuch Ioannis Papagiannis Peter Pietzuch Large-Scale Distributed Systems Group ACM Cloud Computing.
Research on cloud computing application in the peer-to-peer based video-on-demand systems Speaker : 吳靖緯 MA0G rd International Workshop.
1 3 Web Proxies Web Protocols and Practice. 2 Topics Web Protocols and Practice WEB PROXIES  Web Proxy Definition  Three of the Most Common Intermediaries.
Calculating Discrete Logarithms John Hawley Nicolette Nicolosi Ryan Rivard.
Secure Web Applications via Automatic Partitioning Stephen Chong, Jed Liu, Andrew C. Meyers, Xin Qi, K. Vikram, Lantian Zheng, Xin Zheng. Cornell University.
Preventing SQL Injection Attacks in Stored Procedures Alex Hertz Chris Daiello CAP6135Dr. Cliff Zou University of Central Florida March 19, 2009.
M1G Introduction to Database Development 6. Building Applications.
1 A Static Analysis Approach for Automatically Generating Test Cases for Web Applications Presented by: Beverly Leung Fahim Rahman.
Preventing Web Application Injections with Complementary Character Coding Raymond Mui Phyllis Frankl Polytechnic Institute of NYU Presented at ESORICS.
SDN based Network Security Monitoring in Dynamic Cloud Networks Xiuzhen CHEN School of Information Security Engineering Shanghai Jiao Tong University,
CSC-682 Cryptography & Computer Security Sound and Precise Analysis of Web Applications for Injection Vulnerabilities Pompi Rotaru Based on an article.
Mining Multiple Private Databases Topk Queries Across Multiple Private Databases (2005) Li Xiong (Emory University) Subramanyam Chitti (GA Tech) Ling Liu.
SECURE WEB APPLICATIONS VIA AUTOMATIC PARTITIONING S. Chong, J. Liu, A. C. Myers, X. Qi, K. Vikram, L. Zheng, X. Zheng Cornell University.
Module 7 Planning and Deploying Messaging Compliance.
M. Alexander Helen J. Wang Yunxin Liu Microsoft Research 1 Presented by Zhaoliang Duan.
Sep 13, 2006 Scientific Computing 1 Managing Scientific Computing Projects Erik Deumens QTP and HPC Center.
Protecting Browsers from Extension Vulnerabilities Paper by: Adam Barth, Adrienne Porter Felt, Prateek Saxena at University of California, Berkeley and.
Security Attacks CS 795. Buffer Overflow Problem Buffer overflow Analysis of Buffer Overflow Attacks.
1 Firewalls - Introduction l What is a firewall? –Firewalls are frequently thought of as a very complex system that is some sort of magical, mystical..
Web Technologies Lecture 6 State preservation. Motivation How to keep user data while navigating on a website? – Authenticate only once – Store wish list.
1 Software Testing & Quality Assurance Lecture 13 Created by: Paulo Alencar Modified by: Frank Xu.
Saner: Composing Static and Dynamic Analysis to Validate Sanitization in Web Applications Davide Balzarotti, Marco Cova, Vika Felmetsger, Nenad Jovanovic,
Key Management and Distribution Anand Seetharam CST 312.
PGP Desktop (Client only) By: Courtney Wirtz & Vincent Verner.
Cooperative Caching in Wireless P2P Networks: Design, Implementation And Evaluation.
Detecting Web Attacks Using Multi-Stage Log Analysis
Web Application Vulnerabilities, Detection Mechanisms, and Defenses
TaintART: A Practical Multi-level Information-Flow Tracking System for Android RunTime Sadiq Basha.
Test Case Test case Describes an input Description and an expected output Description. Test case ID Section 1: Before execution Section 2: After execution.
Topic 5: Communication and the Internet
ONLINE SECURE DATA SERVICE
CSC-682 Advanced Computer Security
Unit 32 Every class minute counts! 2 assignments 3 tasks/assignment
CSE P 501 – Compilers SSA Hal Perkins Autumn /31/2019
Presentation transcript:

Towards Fully Automatic Placement of Security Sanitizers and Declassifiers Ben Livshits, Microsoft Research Stephen Chong, Harvard University

Why Track Explicit Information Flow? (1) preventing injection attacks such as cross-site scripting (XSS) and SQL injection; and (2) preventing private data leaks in mobile applications 2

A Decade of Research & Development  Type systems [8, 32]  Static analysis [17, 18, 22, 42, 43, 47]  Runtime monitoring and enforcement [6, 7, 11, 25] String p1 = request.getParameter(“p1”); // source p2 = p1;... out.println(p2); // sink 3

Developers Are Not Able to Manage  Places burden for sanitizer placement on developer  Uses analysis (static/runtime/hybrid) to find placement violations  Large modern applications exceed developer’s ability to reason about inter- procedural and inter- modular data flow  We argue that data sanitization should not be developer’s responsibility  Runtime tracking is a possibility, but overhead too high!  Here we provide a hybrid static/runtime technique for taint tracking with low overhead 4

I. Motivation Three motivating applications from security and privacy: Web Mobile Cloud

Data Flow Graph & Sanitization Policy  Examined large-scale applications  Data processing done via a fixed set of sanitizers (for integrity) or declassifiers (for confidentiality)  Reasoning done via reachability on data flow graph  Developer’s mental model involves deciding which sanitizer to use at which point  Sanitizer used depends on where data comes from (source) and where it is going (sink) 6

Sanitizing Inputs in Web Applications  The OWASP Enterprise Security API (ESAPI) is an open-source web- application security library.  Usage guidelines of ESAPI reveal that the correct sanitization to apply to data depends on how the data will be used: sink context.  To sanitize user-provided URL, use encodeForURL(input). To sanitize input that will be used to construct a CSS attribute use encodeForCSS(input) 7

Declassifying Sensitive Data in Android Mobile Apps  Consider a Gmail app that needs to communicate with its parent site mail.google.com. It is necessary to send information to that hosting URL, including keystrokes, files to be attached to .  No compelling need to send user data to AdMob.com, a third- party mobile ad provider, whose library is embedded in the app.  So data sent to a third-party should be cleansed, i.e., should have sensitive information removed, a form of declassification 8

Encrypted Cloud  Consider a web application using a public cloud provider for storage.  Web application does not trust the cloud to protect the confidentiality of its data.  The application therefore will use encryption when serializing data to the database, and decryption when deserializing. 9

Contributions 1) Fully automatic sanitizer placement. Argue that sanitizer placement should be automatic, given a policy and an application. 2) Propose node-based placement, a simple node- based strategy for static sanitizer placement.  simple to implement  incurs no run-time overhead 3) Propose edge-based placement  attempts to place sanitizers statically  “spills over" into run time whenever necessary.  appropriate when simple node-based strategy fails 4) Evaluation on programs 1.8M LOC  edge-based approach provides full sanitization,  reduces number of instrumentation points by 6.19x on average  27x for sparse graphs 10

Graph & Valid Placement  Our algorithms work on the inter-procedural data flow graph  The graph can be obtained through different means  Sound pointer analysis  Unsound dataflow extraction tool such as used by static analysis companies Fortify and Coverity  Our graph came from CAT.NET, a static analysis tool for security for.NET applications Definition: Valid placement  Given a data flow graph G= and a source-sink pair  if P(I, O)=S, on every path from I -> O, sanitizer S is applied exactly once and no other sanitizer is applied;  if P(I, O)= ⊥, no sanitizer is applied on the path. 11

Motivating Example 12

I. Algorithms Here: intuition Glimpses of algorithm specification Paper: full algorithms Theorem showing correctness of placement

Node-based placementEdge-based placement Two Placement Strategies + Intuitive, easy to understand + Fast to compute - Fails to find a valid placement all too frequently + Finds valid placement and escapes into runtime tracking whenever necessary - Quite a bit more involved to compute and understand 14 Given inter-procedural flow graph G= and Policy table P

15 Node-Based Placement  We say that a node n is S i - possible if it is on a path from a source node I to a sink node O that requires sanitizer S i  These are merely possible points for S i sanitizer placement  S i -possible are pretty plentiful in most graphs  We say a node n is S i - exclusive if it is S i - possible, and it is not S j - possible for any j ≠ I  Placing at S i -exclusive nodes can be done without fear of trampling over another path  Unfortunately, exclusive nodes are pretty rare Given inter-procedural flow graph G= and Policy table P

How Do We Compute It? 16  Data flow formulation  Inspired by research on Common Subexpression Elimination (CSE) from the 1990s  Scales linearly  See Dragon Book for details

Node-based Placement Example S3S3 17 Exclusive nodes are not necessarily unique: multiple S i -exclusive nodes on a single path!

Latest-Exclusive Nodes  Late sanitization:  prefer to perform sanitization as late as possible  saving work in case program is terminated  sanitized values are larger than original ones  We say that node n is S i -latest-exclusive if  it is S i -exclusive, and  for every path through n, it is the last S i -exclusive on that path. 18

Latest-Exclusive Nodes Example S3S3 S3S3 S4S4 S4S4 S1S1 S1S1  Does not protect  2 → 18  2 → 19  3 → 20  4 → 20  Does not protect  2 → 18  2 → 19  3 → 20  4 → 20 19

Edge-based Placement  Computation pipeline  Classify edges into groups  Compute groups using multiple (pipelined) data flow analysis stages  Try fully static placement  Minimize the amount of runtime tracking Source-dependent edge Sink-dependent edge Source-dependent edge Sink-dependent edge In-trigger edge Out-trigger edge In-trigger edge Out-trigger edge Sanitization edge Tag edge Untag edge Carry edge Tag edge Untag edge Carry edge 20

Source- and Sink-Dependent  An edge e is source- dependent if there exist sources I 0 and I 1 and sink O such that  e is on a path from I 0 to O and  on a path from I 1 to O and  P(I 0, O) ≠ P(I 1,O)  In other words, sanitizer to use depends on the source

In- and Out-Trigger  Edge e is an out-trigger edge if it is a sink- independent edge but has a predecessor edge that is source-dependent.  At out-trigger edges we have sufficient information to know where a value came from.  We can cease tracking at out-trigger edges.  Edge e is an in-trigger edge if it is a source- independent edge but has a successor edge that is source-dependent.  At in-trigger edges we have sufficient information to know where a value came from.  Need to start run-time tracking because origin affects sanitizer choice. 22

Sanitization Edges sanitization(e) dom_sani(e): whether edge e is dominated by sanitization edges  Sanitization edges are sink-independent edges that are the earliest sink- independent edge on some path from a source to a sink.  We prefer early sanitization to avoid extra tracking 23

Tagging, Untagging, and Carry  Edge e is a tag edge if e is an in-trigger edge that is not dominated by sanitization edges.  An untag edge is either  (a) an out-trigger edge that is not dominated by a sanitization edge; or  (b) a sanitization edge. in-trigger sanitization carry tag untag [ [ 24

Putting it All Together… Source-dependent edge Sink-dependent edge Source-dependent edge Sink-dependent edge In-trigger edge Out-trigger edge In-trigger edge Out-trigger edge Sanitization edge Tag edge Untag edge Carry edge Tag edge Untag edge Carry edge 25

III. Experiments Evaluation on real and synthetic benchmarks

Benchmark Applications  10 large, long-running server- side applications written in C#, up to 1.8 M LOC [PLDI’09] 27  Runtime overhead of taint tracking is typically workload-and path-specific  Goal: Reducing # instrumentation points

What is The Baseline?  What is the naïve approach?  Traditional explicit tainting:  Find nodes or edges that are both reached from a source and go to a sink (FW and BW data slices)  Instrument these to carry taint information forward 28 Unfortunately, in practice, there are too many such places

Node-based Placement Many nodes tainted: tracking taint involves instrumenting half the graph Many nodes tainted: tracking taint involves instrumenting half the graph Last-exclusive node placement: considerably less instrumentation Most of the time, does not achieve complete sanitization! Most of the time, does not achieve complete sanitization! 29

Edge-based Placement Results Savings compared to the naïve approach: 6.19x on average Savings compared to the naïve approach: 6.19x on average Our analysis is most effective at reducing # of instrumentation points for sparsely-connected graphs. 30 Total number of instrumented edges 100% of source-sink paths are protected. Some runtime tracking is needed

Savings for Sparse Graphs Significant savings compared to the naïve approach Significant savings compared to the naïve approach sparsedense 31

Conclusions 1) The algorithms developed in out paper pave the way for completely automatic placement of sanitizers and declassifiers. 2) Theory of placement provides a way to reason about static and runtime taint tracking in the same framework 3) Showed that it works well in practice and produces significantly lower overhead (as high as 27x) compared to other strategies 32