INFORMATION-FLOW ANALYSIS OF ANDROID APPLICATIONS IN DROIDSAFE JARED YOUNG
INTRODUCTION Leaking of sensitive information Permission granting Is the information used legitimately?
CURRENT ANALYSIS TECHNIQUES Dynamic analysis: can miss information flows; malicious applications can alter their behaviour. Static analysis: must scale while maintaining precision; needs a model of the Android API and runtime.
DROIDSAFE Static analysis. Tracks information flows from sources to sinks. “Accurately and precisely analyzes sensitive explicit information flows in large, real-world Android applications”
ANDROID DEVICE IMPLEMENTATION (ADI) Uses the Android Open Source Project (AOSP) as the Java basis for the model. Parts of the Android runtime missing from AOSP led to the development of Accurate Analysis Stubs. Stubs model the runtime behaviour only partially. Added stubs cover native methods, event callback initiation and hidden state. AOSP + Accurate Analysis Stubs = ADI
POINTS-TO ANALYSIS Given two variables p and q, can p and q point to the same object at some point during runtime? Uses a global object-sensitive points-to analysis (example next slide). Stores state. Removes classes irrelevant to information flow.
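The question on this slide can be illustrated with a much simpler analysis than the one DroidSafe uses. The sketch below is a context-insensitive, inclusion-based points-to analysis over local variables only; it is not object-sensitive, and the class name `PointsToSketch`, its methods and the allocation-site labels are all hypothetical names chosen for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy inclusion-based points-to analysis; an illustration, not DroidSafe's algorithm. */
class PointsToSketch {
    // variable -> abstract objects (allocation-site labels) it may point to
    private final Map<String, Set<String>> pointsTo = new HashMap<>();
    // copy edges of a tiny pointer assignment graph: source variable -> target variables
    private final Map<String, Set<String>> copyEdges = new HashMap<>();

    /** Models `v = new T()` at the allocation site labelled `site`. */
    void alloc(String v, String site) {
        pointsTo.computeIfAbsent(v, k -> new HashSet<>()).add(site);
    }

    /** Models the assignment `to = from`. */
    void copy(String to, String from) {
        copyEdges.computeIfAbsent(from, k -> new HashSet<>()).add(to);
    }

    /** Propagates points-to sets along copy edges until nothing changes. */
    void solve() {
        Deque<String> worklist = new ArrayDeque<>(pointsTo.keySet());
        while (!worklist.isEmpty()) {
            String from = worklist.poll();
            Set<String> objs = pointsTo.getOrDefault(from, new HashSet<>());
            for (String to : copyEdges.getOrDefault(from, new HashSet<>())) {
                Set<String> target = pointsTo.computeIfAbsent(to, k -> new HashSet<>());
                if (target.addAll(objs)) {
                    worklist.add(to); // the set grew, so its successors must be revisited
                }
            }
        }
    }

    /** True if p and q may point to the same abstract object. */
    boolean mayAlias(String p, String q) {
        Set<String> common = new HashSet<>(pointsTo.getOrDefault(p, new HashSet<>()));
        common.retainAll(pointsTo.getOrDefault(q, new HashSet<>()));
        return !common.isEmpty();
    }

    public static void main(String[] args) {
        PointsToSketch a = new PointsToSketch();
        a.alloc("p", "site1"); // p = new A()
        a.copy("q", "p");      // q = p
        a.alloc("r", "site2"); // r = new A()
        a.solve();
        System.out.println(a.mayAlias("p", "q")); // prints true
        System.out.println(a.mayAlias("p", "r")); // prints false
    }
}
```

An object-sensitive analysis would additionally distinguish the points-to facts of a method by the allocation site of its receiver object, which this sketch leaves out.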
DROIDSAFE OPTIMISATIONS - POINTS-TO ANALYSIS Scaling becomes an issue as the analysis needs greater context depth. Uses a pointer assignment graph, an explicit representation of the program. This exhausts main memory quickly. Machines now have more main memory, so the approach is feasible. Adds Android-specific optimisations. Without them, 3 of the 24 APAC applications could not finish within the 64 GB memory limit. With them, all finished using a maximum of 34 GB.
FLOW-INSENSITIVE ANALYSIS Assumes that statements can be executed in any order. Therefore covers every ordering of asynchronous callbacks between Android applications and the environment. Improves scalability, as the analysis does not need to track the order of statements.
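The difference between the two approaches can be seen on a two-statement program. The sketch below is a toy under stated assumptions: the class `FlowInsensitiveSketch`, the `Copy` statement type and the variable names are hypothetical, and the program consists only of copy assignments. It compares a single pass in program order with a fixed point over the statements treated as an unordered set.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy comparison of ordered and order-free taint propagation. */
class FlowInsensitiveSketch {
    /** A copy statement `to = from`. */
    static final class Copy {
        final String to;
        final String from;
        Copy(String to, String from) { this.to = to; this.from = from; }
    }

    /** Flow-sensitive for straight-line code: one pass in program order. */
    static Set<String> taintedInOrder(List<Copy> stmts, String source) {
        Set<String> tainted = new HashSet<>();
        tainted.add(source);
        for (Copy s : stmts) {
            if (tainted.contains(s.from)) {
                tainted.add(s.to);
            } else {
                tainted.remove(s.to); // overwritten with an untainted value
            }
        }
        return tainted;
    }

    /** Flow-insensitive: iterate over the statements in any order until a fixed point. */
    static Set<String> taintedAnyOrder(List<Copy> stmts, String source) {
        Set<String> tainted = new HashSet<>();
        tainted.add(source);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Copy s : stmts) {
                if (tainted.contains(s.from) && tainted.add(s.to)) {
                    changed = true;
                }
            }
        }
        return tainted;
    }

    public static void main(String[] args) {
        // Think of each statement as the body of a separate callback.
        List<Copy> stmts = Arrays.asList(
                new Copy("out", "tmp"),     // callback A: out = tmp
                new Copy("tmp", "secret")); // callback B: tmp = secret
        System.out.println(taintedInOrder(stmts, "secret").contains("out"));  // prints false
        System.out.println(taintedAnyOrder(stmts, "secret").contains("out")); // prints true
    }
}
```

If callback B can fire before callback A, `out` really can receive the secret, so the order-free result is the safe one here. The cost is that the same treatment can report flows that no real execution order produces.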
INTER-COMPONENT COMMUNICATION (ICC) Communication between application components and/or separate applications. Messages are sent through dynamically constructed Strings packaged in an Intent object. Uses the ADI model to increase precision by storing state in Java objects. Uses the Java String Analyzer (JSA) to resolve strings. Replaces the resolved strings with regular expressions. Can then perform points-to analysis using the state and the regular expressions.
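The idea of resolving Intent targets from regular expressions can be sketched in plain Java. This is an illustration only: it does not use JSA or the Android SDK, and the class `IccResolutionSketch`, the component names and the action strings are hypothetical. It assumes a string analysis has already summarised a dynamically built action string as a regular expression.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Toy resolution of Intent targets from a regular expression over action strings. */
class IccResolutionSketch {
    // component name -> action string declared in its intent filter (hypothetical data)
    private final Map<String, String> filters = new LinkedHashMap<>();

    void register(String component, String action) {
        filters.put(component, action);
    }

    /** Returns every component whose declared action matches the abstracted action string. */
    List<String> possibleTargets(String actionRegex) {
        Pattern pattern = Pattern.compile(actionRegex);
        List<String> targets = new ArrayList<>();
        for (Map.Entry<String, String> e : filters.entrySet()) {
            if (pattern.matcher(e.getValue()).matches()) {
                targets.add(e.getKey());
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        IccResolutionSketch icc = new IccResolutionSketch();
        icc.register("SmsActivity", "com.example.SEND_SMS");
        icc.register("MapActivity", "com.example.SHOW_MAP");
        icc.register("MailActivity", "com.example.SEND_MAIL");
        // Abstraction of the code `"com.example.SEND_" + suffix` with an unknown suffix.
        System.out.println(icc.possibleTargets("com\\.example\\.SEND_.*"));
        // prints [SmsActivity, MailActivity]
    }
}
```

The more precisely the string is resolved, the fewer candidate targets remain; an unresolved string would degrade to `.*` and match every component.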
IDENTIFYING SOURCES AND SINKS Initially used SuSi to identify sources and sinks. For the malicious flows in the APAC applications, SuSi failed to classify 53% of source calls as “sensitive sources” and 32% of sink calls as “sinks”. Sources and sinks were therefore identified manually: 4,051 sensitive sources and 2,116 sensitive sinks.
INFORMATION-FLOW ANALYSIS Computes an approximation of all memory states. Defines a memory state transformation for each statement of code. Stores a tuple of information and memory location in an InfoVal.
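A minimal sketch of per-statement state transformations follows. It assumes that an InfoVal can be pictured as a pair of an information kind and a memory location; the class `InfoFlowSketch`, its `InfoVal` stand-in and the `Stmt` interface are hypothetical and do not reflect DroidSafe's actual data structures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

/** Toy information-flow analysis built from per-statement state transformers. */
class InfoFlowSketch {
    /** Hypothetical stand-in for an InfoVal: a kind of information paired with a location. */
    static final class InfoVal {
        final String info;
        final String location;
        InfoVal(String info, String location) { this.info = info; this.location = location; }
        @Override public boolean equals(Object o) {
            return o instanceof InfoVal
                    && ((InfoVal) o).info.equals(info)
                    && ((InfoVal) o).location.equals(location);
        }
        @Override public int hashCode() { return Objects.hash(info, location); }
    }

    /** A statement is a transformer that may add facts; it returns true if the state grew. */
    interface Stmt { boolean apply(Set<InfoVal> state); }

    /** `location = <source call>` introduces one kind of sensitive information. */
    static Stmt source(String location, String info) {
        return state -> state.add(new InfoVal(info, location));
    }

    /** `to = from` copies every kind of information held at `from` to `to`. */
    static Stmt copy(String to, String from) {
        return state -> {
            boolean changed = false;
            for (InfoVal v : new ArrayList<>(state)) {
                if (v.location.equals(from)) {
                    changed |= state.add(new InfoVal(v.info, to));
                }
            }
            return changed;
        };
    }

    /** Applies all transformers repeatedly until the approximated state stops growing. */
    static Set<InfoVal> analyze(List<Stmt> program) {
        Set<InfoVal> state = new HashSet<>();
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Stmt s : program) {
                changed |= s.apply(state);
            }
        }
        return state;
    }

    /** Kinds of information that may reach the argument location of a sink call. */
    static Set<String> reaching(Set<InfoVal> state, String sinkArg) {
        Set<String> kinds = new HashSet<>();
        for (InfoVal v : state) {
            if (v.location.equals(sinkArg)) {
                kinds.add(v.info);
            }
        }
        return kinds;
    }

    public static void main(String[] args) {
        List<Stmt> program = Arrays.asList(
                copy("msg", "id"),           // msg = id
                source("id", "IMEI"),        // id = <device identifier source>
                source("loc", "LOCATION"));  // loc = <location source>
        Set<InfoVal> state = analyze(program);
        System.out.println(reaching(state, "msg")); // prints [IMEI]
    }
}
```

Because the state only ever grows and the sets of locations and information kinds are finite, the loop terminates; the result says which kinds of information may reach each location, not which ones definitely do.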
EVALUATION Tested against FlowDroid + IccTA (Inter-component communication Taint Analysis) Used DroidBench, a test suite developed by the creators of FlowDroid Suite of 94 Android information-flow benchmarks
ADDITIONAL TESTS Developed their own suite of 40 small applications. Largest app is 255 lines of code. 42 total leaks. DroidSafe: 100% accuracy and precision. FlowDroid + IccTA: 34.88% accuracy and 79.0% precision.
APAC TESTS Automated Program Analysis for Cybersecurity (APAC) program. Tested against the APAC test suite, which consists of 24 real-world applications in which the developers have intentionally hidden malicious flows. 200 to 80,000 lines of code. Flows are hidden in places such as exceptions, application native methods and string manipulation. The applications use difficult-to-model Android API methods such as Object.clone and System.arraycopy.
ADVANTAGES Accurate Android model (ADI). Flow-insensitive information-flow analysis. ICC modelling.
LIMITATIONS Assumes a non-rooted device. DroidSafe's sources and sinks are defined in terms of the Android API. Does not fully handle Java native methods, dynamic class loading and reflection.
CRITICISM Sources and sinks are identified manually. Uses Android version. Their own test suite happens to give 100% accuracy and precision. Should be tested against other static analysis tools and more applications.