Automatic for the people: Reducing inadvertent leaks by personal machines Landon Cox Duke University.


1 Automatic for the people: Reducing inadvertent leaks by personal machines Landon Cox Duke University

2 Inadvertent leaks Usability and privacy: A Study of Kazaa... ‣ Good and Krekelberg, CHI, 2003 ‣ In 12 hours, found 150 inboxes on Kazaa ‣ Observed people downloading dummy inbox Problem hasn’t gone away

3 Stories from 2009

4 Technical solution? Reference monitor [diagram: a reference monitor applies a policy to processes accessing the network, files, and IPC] ‣ Servers: Asbestos, HiStar, Flume ‣ Languages: Jif, Laminar, Resin ‣ Desktop: PrivacyScope, TightLip [diagram labels: Dev, Admin, User, Automation]

5 Automatic policy specification State of the art: pattern matching ‣ Look for strings that look like SSNs, CCs, etc. ‣ find_SSNs, Firefly, SENF, Spider, etc. ‣ A bit brittle and error-prone ‣ High false positive/negative rates Let’s take a different approach
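To make the brittleness concrete, here is a minimal Python sketch of content-based pattern matching. The regular expression and the function name `find_ssn_like` are illustrative assumptions; this is not the implementation of find_SSNs, Firefly, SENF, or Spider.

```python
import re

# Hypothetical pattern: three digits, two digits, four digits, hyphen-separated.
SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def find_ssn_like(text: str) -> list[str]:
    """Return every substring that merely looks like an SSN."""
    return SSN_LIKE.findall(text)

# A true SSN-shaped string is found.
hit = find_ssn_like("SSN: 123-45-6789")
# A part number with the same shape is also reported (false positive).
false_positive = find_ssn_like("Order part no. 555-12-3456 today")
# The same digits written with spaces are missed (false negative).
false_negative = find_ssn_like("SSN 123 45 6789")
```

The matcher only sees what the data looks like, so both kinds of error come from formatting alone.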

6 Key observations 1) Personal machines often cache sensitive data 2) Servers force clients to access files using crypto 3) Crypto is general technique, used across admin. domains and applications

7 RedFlag overview Identifies processes that store decrypted data ‣ Unobtrusive (requires no user input) ‣ Compatible with legacy applications ‣ Compatible with existing Internet protocols High-level insights ‣ Stop trying to figure out what sensitive data looks like ‣ Use heuristics of how sensitive data is handled

8 Caveats We cannot stop all inadvertent leaks ‣ Stop large, important class of leaks Trust and threat model ‣ Uncompromised host ‣ No IP spoofing or DNS hijacking ‣ Correct, trusted reference monitor (take your pick) ‣ Buggy/absent access-control policies

9 RedFlag system overview Monitor sockets Inspect process Compose rules

10 Monitoring sockets Goal ‣ Try to identify incoming encrypted data ‣ Only at application level (e.g., SSL) Easy for most widely used apps ‣ Look at remote port (e.g., 443 or 993) Not always sufficient ‣ Non-standard ports: Skype, Groove, Groupwise ‣ XMPP sends SSL, non-SSL data to same port (5222/TCP)
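The port check can be sketched in a few lines of Python. The set contents come from the slide (443 and 993); the function name and the decision to treat every other port as ambiguous are assumptions made for illustration.

```python
# Ports the slide names as carrying application-level encryption.
WELL_KNOWN_ENCRYPTED_PORTS = {443, 993}

def classify_remote_port(port: int) -> str:
    """Classify a socket by its remote port.

    Assumption: any port not on the well-known list is 'ambiguous' and is
    left for a later check, as with Skype's non-standard ports or XMPP's
    mixed SSL/non-SSL traffic on 5222/TCP.
    """
    if port in WELL_KNOWN_ENCRYPTED_PORTS:
        return "encrypted"
    return "ambiguous"
```

For example, `classify_remote_port(443)` yields `"encrypted"`, while `classify_remote_port(5222)` yields `"ambiguous"`.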

11 Information entropy Compute entropy score for ambiguous ports ‣ Negligible performance overhead ‣ If score above threshold (~7.9 bits/byte), invoke inspection process Can induce false positives ‣ Compressed data sent in the clear (e.g., MP3s) ‣ On-the-fly compression schemes (e.g., HTTP content-coding gzip) Luckily, doesn’t need to be 100% accurate ‣ Really just a performance optimization to save work ‣ Only used as a first-pass filter ‣ Correct any mistakes in inspection phase
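A minimal sketch of the entropy score, assuming it is the Shannon entropy of the byte histogram of a buffer read from the socket. The 7.9 bits/byte threshold is the approximate value from the slide; the function names are hypothetical.

```python
import math
from collections import Counter

ENTROPY_THRESHOLD = 7.9  # bits per byte, approximate value from the slide

def entropy_bits_per_byte(data: bytes) -> float:
    """Shannon entropy of the byte-value distribution, in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def needs_inspection(data: bytes) -> bool:
    """First-pass filter: flag buffers whose entropy exceeds the threshold."""
    return entropy_bits_per_byte(data) > ENTROPY_THRESHOLD

uniform = bytes(range(256)) * 16          # every byte value equally likely
plain = b"GET /index.html HTTP/1.1\r\n" * 64
```

Here `uniform` scores 8 bits/byte and is flagged, while `plain` scores far lower and is skipped. A buffer shorter than 256 bytes cannot reach 8 bits/byte, so a real filter would need a reasonably large sample before comparing against the threshold.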

12 RedFlag system overview Monitor sockets Inspect process Compose rules

13 Inspect process Goals of inspection ‣ Infer when file write depends on network read ‣ Determine whether file write is decrypted data Use taint-tracking ‣ Too slow to perform in critical path of desktop apps ‣ Perform asynchronously via deterministic replay ‣ Fork if network monitor flags process (port or entropy) ‣ Log libc calls in original, use log in replay process ‣ Attach taint-tracker to replayed process (e.g., PIN) ‣ Perform analysis on a free core in the background
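The log-and-replay idea can be illustrated with a toy Python model: the original run records the result of each input call, and a second run consumes the log instead of touching the network. The classes `Recorder` and `Replayer` and the `app` function are hypothetical stand-ins; the real system logs libc calls with a modified Jockey, which this sketch does not attempt to reproduce.

```python
from collections import deque

class Recorder:
    """Wraps a nondeterministic input source and logs every result."""
    def __init__(self, source):
        self._source = source
        self.log = []

    def read(self, n):
        data = self._source(n)
        self.log.append(data)
        return data

class Replayer:
    """Feeds logged results back in order, so the run is deterministic."""
    def __init__(self, log):
        self._log = deque(log)

    def read(self, n):
        return self._log.popleft()

def app(io):
    """Stand-in application whose output depends only on its reads."""
    header = io.read(4)
    body = io.read(8)
    return header.upper() + body

def fake_network():
    """Hypothetical input source that returns two fixed chunks."""
    chunks = deque([b"http", b"payload!"])
    return lambda n: chunks.popleft()

original = Recorder(fake_network())
first_run = app(original)                   # critical path: only logging
replayed_run = app(Replayer(original.log))  # background: heavy analysis
```

Because the replayed run sees exactly the same inputs, a slow analysis such as taint tracking can be attached to it without delaying the original process.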

14 Taint tracking Implement with PIN ‣ Rewrite instructions to propagate taint ‣ Record taint in shadow memory Key questions ‣ What are the taint sources? ‣ What info to send to the policy composer?
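As a rough illustration of taint propagation through shadow memory, the following Python toy keeps one label per tainted address and copies labels whenever data is copied. It is a sketch under simplifying assumptions (a dictionary as shadow memory, a single `copy` operation) and does not use PIN or model real instructions.

```python
class ShadowMemory:
    """Maps addresses to taint labels; untainted addresses are absent."""
    def __init__(self):
        self.labels = {}

    def taint(self, addr, length, label):
        """Mark a buffer as derived from a taint source (e.g., a network read)."""
        for i in range(length):
            self.labels[addr + i] = label

    def copy(self, dst, src, length):
        """Propagate taint for a byte-wise copy from src to dst."""
        for i in range(length):
            label = self.labels.get(src + i)
            if label is None:
                self.labels.pop(dst + i, None)  # overwritten with clean data
            else:
                self.labels[dst + i] = label

    def labels_in(self, addr, length):
        """Set of labels found in a buffer, e.g., one about to be written to a file."""
        return {self.labels[a] for a in range(addr, addr + length) if a in self.labels}

shadow = ShadowMemory()
shadow.taint(0x1000, 4, label=1)   # 4 bytes read from the network
shadow.copy(0x2000, 0x1000, 4)     # application copies them elsewhere
```

Checking `shadow.labels_in(0x2000, 4)` at a file write then reveals that the written bytes depend on the network read labelled 1.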

15 Shadow memory [figure: address space mapped to shadow memory holding taint labels] Taint label (byte): 100000 Source table (ID → source): 1 → 74.125.45.83:443, 2 → 10.212.1.3:443, ..., 63 → - Example: data “<!DOCTYPE html PUBLIC...” yields “/tmp/attach.pdf, 74.125.45.83:443” Fine when there is no ambiguity about the source But what about ambiguous ports?

16 Ambiguous ports Search process memory for AES s-boxes ‣ S-boxes are set by algorithm designer ‣ S-boxes are unlikely to appear randomly ‣ (also look for well-known transformations)
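A simple way to picture the s-box search is a byte-pattern scan over a memory image. The sketch below uses only the first 16 bytes of the AES forward S-box as a signature, which is a shortcut taken here for brevity; the function name is hypothetical and the actual search may match the full table and other well-known transformations.

```python
# First row of the AES forward S-box (fixed by the algorithm specification).
AES_SBOX_PREFIX = bytes([
    0x63, 0x7C, 0x77, 0x7B, 0xF2, 0x6B, 0x6F, 0xC5,
    0x30, 0x01, 0x67, 0x2B, 0xFE, 0xD7, 0xAB, 0x76,
])

def find_sbox_offsets(image: bytes) -> list[int]:
    """Return every offset in the image where the S-box signature begins."""
    offsets = []
    start = image.find(AES_SBOX_PREFIX)
    while start != -1:
        offsets.append(start)
        start = image.find(AES_SBOX_PREFIX, start + 1)
    return offsets

# Toy "library data section" with the table embedded at offset 100.
image = b"\x00" * 100 + AES_SBOX_PREFIX + b"\xff" * 50
```

Calling `find_sbox_offsets(image)` locates the embedded table, while an image without the constants returns an empty list. A plain byte scan like this would miss implementations that store the table in a transformed layout, which is presumably why the slide also mentions well-known transformations.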

17 Ambiguous ports If we find s-boxes in a library data section ‣ Assume image is a crypto library ‣ Vast majority of crypto libraries include an AES implementation Instrument lib to set “crypto bit” of inbound taint labels ‣ If crypto bit == 1, network data was “routed” through crypto lib ‣ If crypto bit == 0, assume network data was not decrypted Also use s-boxes as taint source ‣ Data derived from s-boxes has “AES bit” set ‣ Can use to gauge strength of crypto algorithm Taint label (byte): 100000 1 1 = ID index, AES bit, crypto bit
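The one-byte taint label can be sketched as a packed bit field. The layout below (six high bits for the ID index, then the AES bit, then the crypto bit as the lowest bit) is an assumption read off the slide's figure; the helper names are hypothetical.

```python
ID_BITS = 6            # assumed: ID index occupies the six high bits
MAX_ID = (1 << ID_BITS) - 1

def pack_label(source_id: int, aes: bool, crypto: bool) -> int:
    """Pack source ID, AES bit, and crypto bit into one byte."""
    if not 0 <= source_id <= MAX_ID:
        raise ValueError("source ID must fit in 6 bits")
    return (source_id << 2) | (int(aes) << 1) | int(crypto)

def unpack_label(label: int) -> tuple[int, bool, bool]:
    """Return (source_id, aes_bit, crypto_bit) from a label byte."""
    return label >> 2, bool(label & 0b10), bool(label & 0b01)

# The bit pattern from the slide: ID index 100000, AES bit 1, crypto bit 1.
label = pack_label(0b100000, aes=True, crypto=True)
```

With this layout the slide's example corresponds to the byte `0b10000011`, and six ID bits allow 64 distinct entries in the source table.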

18 RedFlag system overview Monitor sockets Inspect process Compose rules

19 Compose rules Taint-tracking gives three pieces of info ‣ Description of network source ‣ If data was routed through crypto library ‣ If data was derived from AES s-box Can use this to compose policies

20 Compose rules Same source ‣ Allow sensitive files to be copied back to their source ‣ Raise alert otherwise ‣ Generalize hostnames (e.g., *.google.com) Obfuscation vs. confidentiality ‣ Many P2P clients use crypto to obfuscate ‣ Aren’t trying to protect data so use weak algorithms ‣ (e.g., BitTorrent and LimeWire explicitly do not support AES) ‣ If ambiguous port + no AES, then ignore file
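The rules on this slide can be sketched as a small decision function. The record type, the naive hostname generalization (keep the last two labels), and the assumption that data from a well-known encrypted port always counts as sensitive are all illustrative choices, not details confirmed by the talk.

```python
from dataclasses import dataclass
from fnmatch import fnmatch

@dataclass
class TaintInfo:
    source_host: str      # where the file's data came from
    ambiguous_port: bool  # True if the port did not identify encryption
    crypto_bit: bool      # data was routed through a crypto library
    aes_bit: bool         # data was derived from AES s-boxes

def generalize(hostname: str) -> str:
    """Naive generalization: mail.google.com -> *.google.com (assumption)."""
    labels = hostname.split(".")
    return "*." + ".".join(labels[-2:]) if len(labels) > 2 else hostname

def decide(info: TaintInfo, dest_host: str) -> str:
    """Return 'ignore', 'allow', or 'alert' for sending the file to dest_host."""
    # Ambiguous port without AES-backed decryption: treat as obfuscation only.
    if info.ambiguous_port and not (info.crypto_bit and info.aes_bit):
        return "ignore"
    # Same-source rule: sensitive files may flow back to where they came from.
    if fnmatch(dest_host, generalize(info.source_host)):
        return "allow"
    return "alert"

mail = TaintInfo("mail.google.com", ambiguous_port=False, crypto_bit=True, aes_bit=True)
p2p = TaintInfo("peer.example.net", ambiguous_port=True, crypto_bit=True, aes_bit=False)
```

Under these assumptions, `decide(mail, "docs.google.com")` is allowed, `decide(mail, "files.example.org")` raises an alert, and the P2P file is ignored because no AES was involved.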

21 RedFlag implementation Runs on Ubuntu 8.10 Modified Jockey for logging/replay ‣ Supports multi-threaded programs ‣ User-level thread library PIN tool for tainting ‣ Based on sequential taint tracker from Speck ‣ Modified to allow tainting during replay ‣ Implemented s-box search, crypto and AES bits in taint label

22 Evaluation Accuracy ‣ How well can RedFlag identify crypto libraries using s-boxes? ‣ How well does RedFlag categorize sensitive files? Performance ‣ Will asynchronous taint-tracking fall behind?

23 Identifying crypto libraries Looked at 10 Ubuntu programs ‣ Email: checkgmail, thunderbird ‣ IM: pidgin ‣ P2P: Azureus, LimeWire, Skype, Transmission ‣ Web: Firefox, Opera, wget Successfully identified crypto libs in all ‣ Including custom implementations, plugins (Flash Player) ‣ Interesting case: Opera folds crypto into executable

24 Categorizing sensitive files Non-sensitive files ‣ Used Firefox ‣ Loaded 30 most popular websites (Alexa) ‣ RedFlag produced no false positives/negatives Sensitive files ‣ Downloaded 17 representative sensitive docs ‣ Firefox, thunderbird, pidgin

25 Categorizing sensitive files

26 Taint-tracking performance

27 Conclusions RedFlag automates policy specification ‣ Heuristic-based approach ‣ Monitor process behavior, not file content ‣ Sensitive files usually downloaded using crypto ‣ Deal with ambiguous ports using entropy scores, AES s-boxes Evaluation highlights ‣ Automatically identified crypto libraries ‣ Correctly categorized files in 45/47 scenarios ‣ No false positives, three false negatives ‣ Sufficient idle time in long-running process

28 Thanks! I’m happy to take questions

