Information flow. Landon Cox. April 1, 2016.


1 Information flow Landon Cox April 1, 2016

2 Information flow Crucial goal of secure system –Prevent inappropriate information flows –Can model “appropriateness” with a lattice of tags –i.e., only allow “low” objects to flow into “high” objects –Non-interference := all flows are appropriate Information-flow analysis –Helps track where sensitive data goes –Getting this right is tricky
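The lattice rule on this slide can be sketched as follows. This is a minimal illustration that assumes the simplest possible lattice, two totally ordered levels; the names `Level`, `can_flow`, and `join` are hypothetical and do not come from the lecture.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical two-point lattice: LOW sits below HIGH."""
    LOW = 0
    HIGH = 1


def can_flow(src: Level, dst: Level) -> bool:
    """A flow is appropriate only if the destination is at least as sensitive."""
    return src <= dst


def join(a: Level, b: Level) -> Level:
    """Least upper bound: the level of data derived from both a and b."""
    return max(a, b)


print(can_flow(Level.LOW, Level.HIGH))   # low into high is allowed
print(can_flow(Level.HIGH, Level.LOW))   # high into low is rejected
```

A richer lattice (for example, sets of tags ordered by inclusion) would keep the same two operations and change only how the ordering and the join are computed.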

3 Information flow Building blocks –Storage objects (information receptacles) –Processes (move information to/from objects) Tracking information –Tag (or label) describes information sensitivity –Each storage object is assigned a tag –Need to update tags as processes execute

4 Information flow Issue 1: precision Say that storage object is an address space –If process P reads sensitive data item D –P’s entire address space is tagged What must we assume about any of P’s outputs? –Must assume that they contain sensitive information Which processes are allowed to communicate with P? –Other processes that are allowed to read D Why is this problematic? –Probably want P to communicate with processes that can’t access D –Hard to do anything useful otherwise

5 Information flow Issue 1: precision Say that storage object is an address space –If process P reads sensitive data item D –P’s entire address space is tagged Password file accept uid/pw; if (pw not in file) { return error; } else { fork/exec shell; } SSH client

6 Information flow Issue 1: precision Say that storage object is an address space –If process P reads sensitive data item D –P’s entire address space is tagged Password file accept uid/pw; if (pw not in file) { return error; } else { fork/exec shell; } SSH client uid/pw

9 Information flow Issue 1: precision Say that storage object is an address space –If process P reads sensitive data item D –P’s entire address space is tagged Password file accept uid/pw; if (pw not in file) { return error; } else { fork/exec shell; } SSH client uid/pw How do you solve this?

10 Information flow Issue 1: precision Say that storage object is an address space –If process P reads sensitive data item D –P’s entire address space is tagged Password file accept uid/pw; if (pw not in file) { return error; } else { fork/exec shell; } SSH client uid/pw How do you solve this? Often use a trusted “declassifier”

11 Information flow Issue 1: precision Say that storage object is an address space –If process P reads sensitive data item D –P’s entire address space is tagged Password file accept uid/pw; if (pw not in file) { return error; } else { fork/exec shell; } SSH client uid/pw Declassifier: small piece of code trusted to remove tags
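To make the declassifier idea concrete, here is a rough sketch under stated assumptions: tags are tracked per process, the password file carries a single tag `"pw"`, and the client may only receive untagged data. `Process`, `declassifier`, and the file contents are all made up for illustration.

```python
PASSWORD_FILE = {"alice": "hunter2"}   # made-up contents, tagged "pw" as a whole


class Process:
    """Process-grained tracking: one tag set for the whole address space."""

    def __init__(self, name):
        self.name = name
        self.tags = set()

    def read_password_file(self):
        self.tags.add("pw")            # reading D tags the entire process
        return PASSWORD_FILE

    def may_send_to_client(self):
        return not self.tags           # the client may not receive tagged data


def declassifier(uid, pw):
    """Trusted code: reads the tagged file, releases only an untagged yes/no."""
    return PASSWORD_FILE.get(uid) == pw


naive = Process("sshd-naive")
naive.read_password_file()
print(naive.may_send_to_client())          # False: cannot even report an error

careful = Process("sshd-with-declassifier")
ok = declassifier("alice", "hunter2")
print(ok, careful.may_send_to_client())    # True True
```

The design choice is that only `declassifier` ever touches the tagged file, so the amount of code that must be trusted stays small.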

12 Information flow Issue 1: precision Say that storage object is an address space –If process P reads sensitive data item D –P’s entire address space is tagged What else could we do to improve precision? –Use finer-grained storage objects –Tag program variables or memory words What are the implications for performance? –Have to update tags much more frequently –i.e., every time an instruction executes –Can introduce a lot of overhead

13 Tracking explicit flows Propagate taint tags with data flows –Rule: c ← a op b gives taint(c) ← taint(a) ∪ taint(b) –setTaint(a,t): taint(a) ← {t} –c = a + b: taint(c) ← {t} ∪ {} = {t} –Send(c, see a?
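The propagation rule on this slide can be sketched in a few lines. This is an illustrative model only: the `Tainted` wrapper is a hypothetical name, tags are modeled as Python sets, and only addition is instrumented.

```python
class Tainted:
    """A value paired with its set of taint tags."""

    def __init__(self, value, tags=frozenset()):
        self.value = value
        self.tags = frozenset(tags)

    def __add__(self, other):
        # c <- a op b  gives  taint(c) <- taint(a) U taint(b)
        return Tainted(self.value + other.value, self.tags | other.tags)


a = Tainted(3, {"t"})     # setTaint(a, t)
b = Tainted(4)            # untainted operand
c = a + b
print(c.value, set(c.tags))   # 7 {'t'}
```

A check at a sink such as a network send would then inspect `c.tags` before letting the value leave.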

14 Information flow Issue 2: explicit vs implicit flows Two ways to propagate information –Explicitly := direct transfer from one object to another –Implicitly := indirect transfer usually via control flow
    // a is sensitive
    int foo (int a) {
      int b, w, x, y, z;
      a = 11;
      b = 5;
      w = a * 2;
      x = b + 1;
      y = w + 1;
      z = x + y;
      print (z);
    }
Each line is an explicit flow from source operands to destination operand

15 Information flow Issue 2: explicit vs implicit flows Two ways to propagate information –Explicitly := direct transfer from one object to another –Implicitly := indirect transfer usually via control flow
    // a is sensitive
    int foo (int a) {
      int b, w, x, y, z;
      a = 11;
      b = 5;
      w = a * 2;
      x = b + 1;
      y = w + 1;
      z = x + y;
      print (z);
    }
Very easy to implement: just interpose on each instruction to update each var’s tag
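Interposing on each instruction might look like the sketch below, which encodes `foo` as a list of (destination, source variables) pairs. The encoding and the `track` helper are assumptions made for illustration. Read literally, the slide's `a = 11` overwrites `a` with an untagged constant, which clears its tag under the union rule, so the sketch runs the program both with and without that first assignment.

```python
def track(program, tags):
    """Interpose on each instruction: dst's tag becomes the union of its sources' tags."""
    tags = {var: set(t) for var, t in tags.items()}
    for dst, srcs in program:
        tags[dst] = set().union(*(tags.get(s, set()) for s in srcs))
    return tags


# Each entry is (destination, [variable source operands]); constants carry no tag.
FOO = [
    ("a", []),          # a = 11
    ("b", []),          # b = 5
    ("w", ["a"]),       # w = a * 2
    ("x", ["b"]),       # x = b + 1
    ("y", ["w"]),       # y = w + 1
    ("z", ["x", "y"]),  # z = x + y
]

literal = track(FOO, {"a": {"t"}})
without_overwrite = track(FOO[1:], {"a": {"t"}})
print(literal["z"], without_overwrite["z"])   # set() {'t'}
```

With the overwrite skipped, the tag travels from `a` through `w` and `y` into `z`, while `b` and `x` stay untagged.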

16 Information flow Issue 2: explicit vs implicit flows Two ways to propagate information –Explicitly := direct transfer from one object to another –Implicitly := indirect transfer usually via control flow
    // a is sensitive
    void foo (int a) {
      int x, y;
      if (a > 10) {
        x = 1;
      }
      y = 10;
      print (x);
      print (y);
    }
Where is the implicit flow?

17 Information flow Issue 2: explicit vs implicit flows Two ways to propagate information –Explicitly := direct transfer from one object to another –Implicitly := indirect transfer usually via control flow
    // a is sensitive
    void foo (int a) {
      int x, y;
      if (a > 10) {
        x = 1;
      }
      y = 10;
      print (x);
      print (y);
    }
How would you update x’s tag?
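One common answer, sketched here as an assumption rather than as the lecture's own solution, is to keep a tag for the current control context (often called the pc tag) and fold it into every assignment made inside the branch. The variable names and the default value of `x` are illustrative.

```python
def foo(a, a_tags):
    """The slide's foo with a tag per variable and a control-context tag `pc`."""
    x = 0                        # assumed default; the slide leaves x uninitialized
    tags = {"x": set(), "y": set()}
    pc = set()                   # tags of the branch conditions currently in effect
    if a > 10:
        outer, pc = pc, pc | a_tags   # entering a branch that depends on a
        x = 1
        tags["x"] = set(pc)      # the constant 1 is untagged, the context is not
        pc = outer               # leaving the branch restores the old context
    y = 10
    tags["y"] = set(pc)          # assigned outside the branch, so untagged
    return x, y, tags


print(foo(11, {"t"}))   # (1, 10, {'x': {'t'}, 'y': set()})
```

When the branch is taken, `x` picks up the tag of `a` even though its value comes from a constant.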

18 Information flow Issue 2: explicit vs implicit flows Two ways to propagate information –Explicitly := direct transfer from one object to another –Implicitly := indirect transfer usually via control flow
    // a is sensitive
    void foo (int a) {
      int x, y;
      if (a > 10) {
        x = 1;
      } else {
        y = 10;
      }
      print (x);
      print (y);
    }
What is tricky about this code?
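One way to see the difficulty: a tracker that only tags the variable written on the executed path leaves the other variable untagged, yet the fact that it was not written also reveals something about `a`. The sketch below contrasts that with a conservative variant that tags every variable assigned in either branch. The `conservative` flag, the default values, and the hard-coded list of assigned variables are assumptions for illustration.

```python
def foo(a, a_tags, conservative):
    x, y = 0, 0                         # assumed defaults; the slide leaves them uninitialized
    tags = {"x": set(), "y": set()}
    if a > 10:
        x = 1
        tags["x"] = set(a_tags)         # written under a branch on a
    else:
        y = 10
        tags["y"] = set(a_tags)
    if conservative:
        # Hypothetical static fact: x and y are both assigned somewhere under the branch.
        for var in ("x", "y"):
            tags[var] |= a_tags
    return x, y, tags


print(foo(5, {"t"}, False))   # (0, 10, {'x': set(), 'y': {'t'}})
print(foo(5, {"t"}, True))    # (0, 10, {'x': {'t'}, 'y': {'t'}})
```

In the first call, printing the untagged `x` still tells an observer that `a > 10` was false, which is the flow a purely dynamic tracker misses.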

19 Information flow Issue 2: explicit vs implicit flows Two ways to propagate information –Explicitly := direct transfer from one object to another –Implicitly := indirect transfer usually via control flow
    // a is sensitive
    void foo (int a) {
      int x, y;
      if (a > 10) {
        baz (&x);
      } else {
        bar (&y);
      }
      print (x);
      print (y);
    }
What is trickier about this code?

20 Information flow Issue 2: explicit vs implicit flows Two ways to propagate information –Explicitly := direct transfer from one object to another –Implicitly := indirect transfer usually via control flow
    // a is sensitive
    void foo (int a) {
      int x, y;
      if (a > 10) {
        exit(0);
      } else {
        exit(1);
      }
      y = 10;
      print (x);
      print (y);
    }
Where is the implicit flow here?

21 Information flow Issue 2: explicit vs implicit flows Two ways to propagate information –Explicitly := direct transfer from one object to another –Implicitly := indirect transfer usually via control flow
    // a is sensitive
    void foo (int a) {
      int x, y;
      if (a > 10) {
        exit(0);
      } else {
        exit(1);
      }
      y = 10;
      print (x);
      print (y);
    }
How would you track this?

22 Hidden channels Get system to communicate in unintended ways Example: TENEX (supposedly secure OS) –Created a team to break in –Team had all passwords within 48 hours … oops. –Goal: require 256^8 tries to see if password is right Password checker for (i=0; i<8; i++) { if (input[i] != password[i]) { break; } }

23 Hidden channels: TENEX How to break? (user passes in input buffer, virtual mem faults are visible) –Specially arrange the input’s layout in memory –Force a page fault if second character is read –If you get a fault, the first character was right –Do again for third, fourth, … eighth character Can check the password in 256*8 tries Password checker for (i=0; i<8; i++) { if (input[i] != password[i]) { break; } }
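The attack can be simulated in a few lines. This is a toy model, not the real TENEX interface: the page fault is modeled as a Python exception raised when the checker reads past the attacker's mapped prefix, and `SECRET`, `check`, and `attack` are made-up names.

```python
class PageFault(Exception):
    """Stands in for a visible virtual-memory fault."""


SECRET = b"s3cr3tpw"    # made-up 8-byte password


def check(read_byte):
    """The slide's loop: compare byte by byte and stop at the first mismatch."""
    for i in range(8):
        if read_byte(i) != SECRET[i]:
            return False
    return True


def attack():
    known, tries = b"", 0
    while len(known) < 8:
        for guess in range(256):
            tries += 1
            prefix = known + bytes([guess])

            def read_byte(i, prefix=prefix):
                if i >= len(prefix):
                    raise PageFault   # the next byte sits on an unmapped page
                return prefix[i]

            try:
                matched = check(read_byte)
            except PageFault:
                matched = True        # the checker got past our prefix
            if matched:
                known = prefix
                break
    return known, tries


password, tries = attack()
print(password == SECRET, tries <= 256 * 8, 256 ** 8)
```

The last line contrasts the two costs from the slides: at most 256 * 8 = 2048 guesses with the fault channel, against 256^8 = 18446744073709551616 candidates without it.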

24 Course administration Project proposals –Due today (ok if you send it to me by Monday) –Guidelines in the syllabus –One page should be fine Amount of work –Two-three weeks of effort –Focus on answering one interesting question

25 Cloud → large-scale analysis, collection, dissemination. Sensors → rich, personal data. Mobile → present at work, home, and play.

26 App-centric operating systems Apps access sensitive information in many contexts –Location, images, and communication –Home, work, and play Apps run on behalf of many stakeholders –Users, services, developers, platform providers, advertisers

27 Monitoring app behavior Permissions are coarse. No insight into what is collected and by whom.

28 Consumer: “Why is my wallpaper app sending my phone number to China?”

29 Enterprise: “Who is collecting information about our workers?”

30 Wider interest in the issue

31 Emerging malware threat New mobile malware [1] New mobile malware family or variant [2] Sources: [1] McAfee Threats Report: Q1 2012 [2] F-Secure Mobile Threat Report Q1 2012

32 Where does data go after you grant access?

33 Monitoring goals Monitor where apps send data –What happens after you grant access? –Is observed behavior expected? Monitor apps at runtime –Want users to monitor their own apps –Must balance accuracy and efficiency Solution: TaintDroid –Original collaboration with Penn State, Intel –Will Enck (NCSU), Jaeyeon Jung (MSR), others

34 Taint tracking TaintDroid: system-wide taint tracking for Android –Records “explicit” data dependencies via taint tags –Does not capture “implicit” data dependencies Tag data as it enters the app Track how information propagates Check tags of emitted data

35 Taint tracking TaintDroid: system-wide taint tracking for Android –Records “explicit” data dependencies via taint tags –Does not capture “implicit” data dependencies Key issues for tag propagation –How are tags stored? –What is the tag-propagation logic? –Is tracking precise and efficient? Project website: http://appanalysis.org

36 Tag propagation Goal: balance precision and efficiency [Chart: precision (imprecise to precise) against speed (slow to fast)] Process-grained (all outputs tainted) Instruction-grained (2-20x overhead) Ideal

37 Multi-level approach Variable-level tracking through Dalvik VM (DEX instructions) Patch state after native method invocation Extend tracking to IPC and file system [Diagram labels: Network, File system, Application code, Dalvik VM, Native system libraries, msg; tracking levels: variable-level, method-level, file-level, message-level]

38 Variable-level tracking Tag-propagation logic for Dalvik executables (DEX)

39 Variable-level tracking Modified Dalvik VM –Store and propagate 32-bit tags Local vars and args –Store tags adjacent to vars on stack –Correspond to VM registers –64-bit vars require two tags Class fields –Store tags inside heap objects Arrays –One tag per array –Trade precision for efficient storage Performance optimizations –Per-variable tags reduce storage overhead –Adjacent tags provide spatial locality [Stack diagram, between SP and FP: out0, out0 taint tag, out1, out1 taint tag, (unused), VM goop, v0 == local0, v0 taint tag, v1 == local1, v1 taint tag, v2 == in0, …, v4 taint tag]
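The storage layout on this slide can be modeled roughly as below. This is a toy model, not TaintDroid's actual data structures: the bit assignments `TAINT_LOCATION` and `TAINT_IMEI`, the class names, and the use of bitwise OR to merge 32-bit tags are assumptions for illustration.

```python
TAINT_LOCATION = 0x1   # hypothetical bit assignments within a 32-bit tag
TAINT_IMEI = 0x2


class Frame:
    """Stack frame where register v_i lives in slot 2*i and its tag in slot 2*i + 1."""

    def __init__(self, num_regs):
        self.slots = [0] * (2 * num_regs)

    def put(self, reg, value, tag=0):
        self.slots[2 * reg] = value
        self.slots[2 * reg + 1] = tag & 0xFFFFFFFF   # keep the tag to 32 bits

    def get(self, reg):
        return self.slots[2 * reg], self.slots[2 * reg + 1]


class TaggedArray:
    """One tag for the whole array: compact, but every element shares it."""

    def __init__(self, length):
        self.items = [0] * length
        self.tag = 0

    def store(self, index, value, tag):
        self.items[index] = value
        self.tag |= tag          # the array tag only ever grows

    def load(self, index):
        return self.items[index], self.tag


frame = Frame(2)
frame.put(0, 42, TAINT_IMEI)
arr = TaggedArray(2)
arr.store(0, 47.66, TAINT_LOCATION)
arr.store(1, 7, 0)
print(frame.get(0), arr.load(1))   # (42, 2) (7, 1)
```

Loading the untainted element at index 1 still returns the location tag, which is the precision given up in exchange for storing a single tag per array.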

40 Method-grained tracking Huge opportunity for performance gains –JNI code is often CPU intensive Challenge for method-grained tracking –In worst case, must manually reason about side effects –Luckily, a very simple heuristic works most of the time class java.lang.Math { public static double cos (double d); }

41 Method-grained tracking Tainting heuristic: “Assign union of arguments’ tags to return value on exit.” –Most JNI methods have no side effects –Many JNI methods operate on native types When it doesn’t work, use method profiles –Generic framework for defining argument/retval dependencies –So far, only needed to define for IBM charset converter –See paper for more details … class java.lang.Math { public static double cos (double d); }
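The heuristic quoted on this slide can be sketched as a wrapper around an uninstrumented function. Python's `math.cos` stands in for the native `cos` method here, and `track_native` and the (value, tags) pair convention are hypothetical.

```python
import math


def track_native(func):
    """Wrap an untracked function: the return tag is the union of the argument tags."""
    def wrapper(*args):
        values = [value for value, _ in args]
        tags = frozenset().union(*(t for _, t in args))
        return func(*values), tags
    return wrapper


cos = track_native(math.cos)
result, tags = cos((0.0, frozenset({"location"})))
print(result, set(tags))   # 1.0 {'location'}
```

The wrapper never looks inside `func`, which is why the heuristic is cheap and also why it can miss side effects on objects passed by reference.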

42 Method-grained tracking Found 2,844 JNI methods in Android source –913 did not use Object references –Others could induce false negatives Third-party JNI is not supported –Apps must be written entirely in Java –Survey of Android Market, ~25% file –Subject of ongoing research

43 Evaluation Is TaintDroid fast and precise? [Chart: precision (imprecise to precise) against speed (slow to fast)] Process-grained (all outputs tainted) Instruction-grained (2-20x overhead) TaintDroid

44 Performance evaluation (higher is better) 20% overhead (extra memory accesses) 14% overhead Not shown: 4.4% memory overhead

45 Performance evaluation (higher is better) Reasons for efficiency (1) Method-grained tracking of JNI calls (2) Spatial locality of taint tags (3) One tag per array

46 App study Selected 30 apps from Android Market –Biased toward popular apps –Sampled from 12 categories App permissions –Access to Internet –Access to location, camera, phone state, mic –No native libraries Ran apps manually under TaintDroid

47 App study Of 105 flagged connections, only 37 to expected servers

48 App study: location 15 of 30 apps shared location with ad server –,,, Most traffic was plaintext (e.g., AdMob HTTP GET) – used binary format In no cases were users informed by EULA –In one case, app sent location every 30 seconds
    ...&s=a14a4a93f1e4c68&..&t=062A1CB1D476DE85B717D9195A6722A9&d%5Bcoord%5D=47.661227890000006%2C-122.31589477&...
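To see what the plaintext request on this slide exposes, the `coord` parameter can be decoded with the standard library. The fragment below is copied from the slide, assuming the space after `%2C-` is a line-break artifact; the rest of the request is elided in the source and is not reconstructed.

```python
from urllib.parse import parse_qs

# Only the coord parameter from the slide; the surrounding request is elided.
fragment = "d%5Bcoord%5D=47.661227890000006%2C-122.31589477"

params = parse_qs(fragment)                       # percent-decodes keys and values
lat, lon = (float(part) for part in params["d[coord]"][0].split(","))
print(lat, lon)
```

Anyone on the network path can perform the same decoding, since the request is an unencrypted HTTP GET.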

49 App study: phone identifiers 7 apps sent device id (IMEI) 2 apps sent phone info (Ph. #, IMSI *, ICC-ID) –Done without informing the user –One app’s EULA indicated the IMEI was sent –Another app sent the hash of the IMEI Frequency was app-specific –One sent info every time the phone booted

50 Source code available –Most recent version is for Android 4.3 Great platform for research –Compatible with vast majority of Android apps –Playground for all kinds of information-flow projects Video demo by Peter Gilbert

51 TaintDroid demo

52 Media coverage

53 Limitations Implicit flows –Fundamentally difficult problem –Can handle passwords (SpanDex, USENIX Sec) Native code –Ongoing work –Talk to Ali and Alex …
