Effective Real-time Android Application Auditing

Slides:



Advertisements
Similar presentations
All You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been afraid to ask) Edward J. Schwartz, ThanassisAvgerinos,
Advertisements

P3 / 2004 Register Allocation. Kostis Sagonas 2 Spring 2004 Outline What is register allocation Webs Interference Graphs Graph coloring Spilling Live-Range.
Abstraction and Modular Reasoning for the Verification of Software Corina Pasareanu NASA Ames Research Center.
Yoshi
Exception Handling. Background In a perfect world, users would never enter data in the wrong form, files they choose to open would always exist, and code.
Sim-alpha: A Validated, Execution-Driven Alpha Simulator Rajagopalan Desikan, Doug Burger, Stephen Keckler, Todd Austin.
Using Programmer-Written Compiler Extensions to Catch Security Holes Authors: Ken Ashcraft and Dawson Engler Presented by : Hong Chen CS590F 2/7/2007.
Expert System Human expert level performance Limited application area Large component of task specific knowledge Knowledge based system Task specific knowledge.
CS 536 Spring Intermediate Code. Local Optimizations. Lecture 22.
About the Presentations The presentations cover the objectives found in the opening of each chapter. All chapter objectives are listed in the beginning.
Introduction & Overview CS4533 from Cooper & Torczon.
Java Security Updated May Topics Intro to the Java Sandbox Language Level Security Run Time Security Evolution of Security Sandbox Models The Security.
Efficient Privilege De-Escalation for Ad Libraries in Mobile Apps Bin Liu (SRA), Bin Liu (CMU), Hongxia Jin (SRA), Ramesh Govindan (USC)
Software Testing Verification and validation planning Software inspections Software Inspection vs. Testing Automated static analysis Cleanroom software.
DroidKungFu and AnserverBot
Java Security. Topics Intro to the Java Sandbox Language Level Security Run Time Security Evolution of Security Sandbox Models The Security Manager.
D2Taint: Differentiated and Dynamic Information Flow Tracking on Smartphones for Numerous Data Sources Boxuan Gu, Xinfeng Li, Gang Li, Adam C. Champion,
NDSS 2007 Philipp Vogt, Florian Nentwich, Nenad Jovanovic, Engin Kirda, Christopher Kruegel, Giovanni Vigna.
Protection and the Kernel: Mode, Space, and Context.
CS 501: Software Engineering Fall 1999 Lecture 16 Verification and Validation.
High Performance Architectures Dataflow Part 3. 2 Dataflow Processors Recall from Basic Processor Pipelining: Hazards limit performance  Structural hazards.
CMSC 345 Fall 2000 Unit Testing. The testing process.
ITEC224 Database Programming
Presented by: Tom Staley. Introduction Rising security concerns in the smartphone app community Use of private data: Passwords Financial records GPS locations.
Behavior-based Spyware Detection By Engin Kirda and Christopher Kruegel Secure Systems Lab Technical University Vienna Greg Banks, Giovanni Vigna, and.
Programmer's view on Computer Architecture by Istvan Haller.
Department of Computer Science A Static Program Analyzer to increase software reuse Ramakrishnan Venkitaraman and Gopal Gupta.
University of Central Florida TaintDroid: An Information-Flow Tracking System for Realtime Privacy Monitoring on Smartphones Written by Enck, Gilbert,
CSC-682 Cryptography & Computer Security Sound and Precise Analysis of Web Applications for Injection Vulnerabilities Pompi Rotaru Based on an article.
Introduction of Geoprocessing Topic 7a 4/10/2007.
Jose Sanchez 1 o Tielei Wang†, TaoWei†, Zhiqiang Lin‡, Wei Zou†. o Purdue University & Peking University o Proceedings of NDSS'09: Network and Distributed.
©Ian Sommerville 2004Software Engineering, 7th edition. Chapter 22 Slide 1 Software Verification, Validation and Testing.
Static Program Analyses of DSP Software Systems Ramakrishnan Venkitaraman and Gopal Gupta.
Exceptions in Java. Exceptions An exception is an object describing an unusual or erroneous situation Exceptions are thrown by a program, and may be caught.
Introduction to Problem Solving. Steps in Programming A Very Simplified Picture –Problem Definition & Analysis – High Level Strategy for a solution –Arriving.
Semantic Clipboard User Interface is integrated in the Browser Architecture of the Semantic Clipboard Illustration of a license incompliant content reuse.
1 CSCD 326 Data Structures I Software Design. 2 The Software Life Cycle 1. Specification 2. Design 3. Risk Analysis 4. Verification 5. Coding 6. Testing.
Java Basics Hussein Suleman March 2007 UCT Department of Computer Science Computer Science 1015F.
Highly Scalable Distributed Dataflow Analysis Joseph L. Greathouse Advanced Computer Architecture Laboratory University of Michigan Chelsea LeBlancTodd.
CSV 889: Concurrent Software Verification Subodh Sharma Indian Institute of Technology Delhi Scalable Symbolic Execution: KLEE.
Hui Xu, Yangfan Zhou, Cuiyun Gao, Yu Kang, Michael R. Lyu
Software Engineering1  Verification: The software should conform to its specification  Validation: The software should do what the user really requires.
Intrusion Detection System
Sampling Dynamic Dataflow Analyses Joseph L. Greathouse Advanced Computer Architecture Laboratory University of Michigan University of British Columbia.
Concepts and Realization of a Diagram Editor Generator Based on Hypergraph Transformation Author: Mark Minas Presenter: Song Gu.
Introduction to Software Analysis CS Why Take This Course? Learn methods to improve software quality – reliability, security, performance, etc.
Mobile Testing Overview. Agenda Mobile application quality poses a unique challenge Mobile changes the ALM cycle – Interoperability is unique to mobile.
Testing Overview Software Reliability Techniques Testing Concepts CEN 4010 Class 24 – 11/17.
Language-Based Information- Flow Security (Sabelfeld and Myers) “Practical methods for controlling information flow have eluded researchers for some time.”
AppAudit Effective Real-time Android Application Auditing Andrew Jeong
INFORMATION-FLOW ANALYSIS OF ANDROID APPLICATIONS IN DROIDSAFE JARED YOUNG.
The Fallacy Behind “There’s Nothing to Hide” Why End-to-End Encryption Is a Must in Today’s World.
WHAT THE APP IS THAT? DECEPTION AND COUNTERMEASURES IN THE ANDROID USER INTERFACE.
More Security and Programming Language Work on SmartPhones
Introduction to Operating Systems
Security and Programming Language Work on SmartPhones
TriggerScope: Towards Detecting Logic Bombs in Android Applications
TRUST Area 3 Overview: Privacy, Usability, & Social Impact
Java Programming Language
TaintART: A Practical Multi-level Information-Flow Tracking System for Android RunTime Sadiq Basha.
Presented by Xiaohui (Amy) Lin
MobiSys 2017 Symbolic Execution of Android Framework with Applications to Vulnerability Discovery and Exploit Generation Qiang Zeng joint work with Lannan.
TriggerScope Towards Detecting Logic Bombs in Android Applications
Taint tracking Suman Jana.
CMPE419 Mobile Application Development
TriggerScope Towards detecting logic bombs in android applications
Introduction to Operating Systems
All You Ever Wanted to Know About Dynamic Taint Analysis & Forward Symbolic Execution (but might have been afraid to ask) Edward J. Schwartz, Thanassis.
Autonomous Network Alerting Systems and Programmable Networks
CMPE419 Mobile Application Development
Presentation transcript:

Effective Real-time Android Application Auditing Presented by Lihua Ren

Personal data in mobile devices 1

Problem Definition Abuses of personal data Data leaks Tamper user privacy Abandon apps Harming app developers Harming app market intentionally e.g. for improper advertising revenue unintentionally e.g. exposing there data in plain-text over public networks 2

Existing work Analyze and identify data-leaking apps based on static program analysis: AppIntent Pios Flowdroid Merits : comprehensively examine program data flows comprehensively reveal data-leaking code paths Limitation : inefficient (time- and memory -consuming) produces false alarms 3

Outline Problem definition Prior work Evaluation Discussion AppAudit (new idea) Design overview Efficient Static API Analysis Approximated Execution Evaluation Discussion 4

Approach of this study Approach: AppAudit: a program analysis framework that can analyze apps efficiently and effectively. In real-time Report actual data leaks http://californiasupplementalexam.com/2014/01/setting-goals-to-pass-the-cse-in-2014/ 5

AppAudit: Use cases integrated into IDEs to check apps for developers before release identify problematic 3rd-party modules deployed as an automatic app auditing service at app markets wipe out human involvement in validating analysis results installed on mobile devices to check apps before installation protect users against data-leaking apps from untrusted sources or app markets that lack auditing service 6

AppAudit: Design overview Goal: tackle analysis efficiency and false positives Idea: 2 steps start with a very lightweight static API analysis rely on a dynamic analysis to prune its false positives 7

Tackle false positives Static analysis solutions (existing work) VS AppAudit Static analysis solutions: explore all code paths AppAudit: only explores code paths that could happen in real -> few false positives Challenge of AppAudit When dynamic analysis meets unknowns, it can hardly explore deeply into code paths, which will cause false negatives. Approach of AppAudit design a novel object model to represent and propagate unknowns design several execution mechanisms to increase the depth of our analysis and avoid false negatives 8

Step 1: Efficient Static API Analysis Goal: Efficiently find functions that can potentially cause data leaks Essential conditions for data leak Reach a Source API to retrieve personal data Reach a Sink API to transmit data out of the device Finding data leak 〓 Finding one path from the function to a source API and another to a sink API 9

Step 1: Efficient Static API Analysis (CONT) API Usage Analysis Firstly ,build a standard call graph from program bytecode Then extend it Finally, perform a breadth-first search to mark all suspicious functions Why extend? how to extend? 10

Call Graph Extensions Traditional call graph Missing paths: Dynamic Java language features, Android programming model AppAudit: Call Graph Extensions Java Virtual Calls and Reflection Calls Assume virtual calls can reach any matching method from all inherited classes while a reflection call will directly be marked suspicious Static Fields as Intermediates Android Life Cycle Methods Multi-threading GUI Event Callbacks Android Remote Procedure Call (RPC) 11

Call Graph Extensions(CONT) 12

Call Graph Extensions(CONT) 13

Step 2: Approximated execution Definition A dynamic analysis that executes the bytecode instructions of a suspicious function and reports when data leak happens Component typical register set a program counter (pc) a call stack as its execution context 3 working modes “execution (exec)” mode: interpret bytecodes and perform operations “check” mode: check the parameters for the sink API “approximation (approx)” mode: facing unknown -> switch to this mode for approximations to continue the execution. 14

Approximated Executor State Machine exec: Tainted objects propagate with the execution and taint any object derived from them check: Tainted objects are found, report and terminate (“end” final state); otherwise, revert back approx: If fail, enter “leap” final state leap: terminate current execution and start executing one of its caller function 15

Object Representation Type: i.e. int, long, string object kind: concrete object (CON): created during the execution process prior unknown (PU): exist prior to the execution process and contain unknown values derived unknown (DU): a prior unknown but is changed during the execution process store the known value(s) of the object i.e. Unknown value 16

Execution Rules (Table Ⅱ) 17

Taint Taint representation: Tainting rules Personal data is marked as tainted. Taints propagate along with the object. If a sink API meets a tainted object, report a leak. 18

Approximation Mode When to change to approximation mode: a conditional jump instruction meets unknown values Unknown Branching Approximation Idea: skip unknown loop as these loops cannot provide useful known information from unknowns. http://californiasupplementalexam.com/2014/01/setting-goals-to-pass-the-cse-in-2014/ 19

Unknown Branching Approximation Choose not to take the conditional branch to skip these loops 20

Unknown Branching Approximation Cannot distinguish ifs and loops -> Only explore the “then” branch for unknown if-else structures This bias is benign 21

x, y are correlated but it always produce untainted y Accuracy Analysis Execution mode: faithfully reproduce the actual path of the real execution Approximation mode: only misses non-leaking paths and is benign to the overall accuracy Limitations of Taint Analysis Taint Sanitization: only add and propagate taints but never remove them -> inaccuracy and false positives. (i.e. ) Array Indexing: i.e. , will be tainted if is tainted. -> over-taints -> false positives Control Flow Dependent Taints : x, y are correlated but it always produce untainted y 22

Outline Problem definition Prior work Evaluation Discussion AppAudit (new idea) Design overview Efficient Static API Analysis Approximated Execution Evaluation Discussion 23

Evaluation Methodology: Evaluation datasets: Completeness of Static API Analysis: by micro-benchmark Detection Accuracy: by malware samples Usability: by real-word apps Characterization of Data Leaks in Real Apps: to provide guidance Evaluation datasets: 24

Completeness of Static API Analysis FlowDroid: a state-of-the-art pure static analysis tool False negatives: When: particular user inputs happen in a particular order Why: AppAudit cannot model infinite possibility of user input orderings Explanation: Some particular ordering of user inputs might imply user awareness of the data leak, so a detection tool should not report such leaks. 25

Detection Accuracy 19

Detection Accuracy 19

Usability 19

Characterization of Data Leaks in Real Apps 19

Characterization of Data Leaks in Real Apps 19

Questions Question 1: AppAudit adopts a two-stage design. What are the two stages and their responsibilities? Question 2: what might lead to approximation failure? Question 3: Unknown branching approximation Only explores the “then” branch for unknown if-else structures, but this bias is benign. Why? 44

Thanks!