1
LIFT: A Low-Overhead Practical Information Flow Tracking System for Detecting Security Attacks Feng Qin, Cheng Wang, Zhenmin Li, Ho-seop Kim, Yuanyuan Zhou, Youfeng Wu University of Illinois at Urbana-Champaign Intel Corporation The Ohio State University
2
Information Flow Tracking Taint Analysis To detect / prevent security attacks For attacks that corrupt control data General: not for specific types of software vulnerabilities Even for unknown attacks
3
Approach 1. Tag (label) the input data from unsafe channels: network 2. Propagate the data tags through the computation Any data derived from unsafe data are also tagged as unsafe 3. Detect unexpected usages of the unsafe data Switch the program control to the unsafe data
4
A Simple Example a is unsafe Information flows from a to b: b is unsafe If c is unsafe, jumping to the location pointed by c fails
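The three steps and the example above can be mimicked in a few lines of Python. This is only a minimal sketch of the idea, not LIFT's implementation: the `Value` wrapper, `read_network`, and `indirect_jump` are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    """A value paired with a one-bit tag (hypothetical helper, not from LIFT)."""
    data: int
    unsafe: bool = False

    def __add__(self, other: "Value") -> "Value":
        # Step 2: data derived from unsafe data is itself unsafe.
        return Value(self.data + other.data, self.unsafe or other.unsafe)


def read_network(data: int) -> Value:
    """Step 1: tag input that arrives from an unsafe channel."""
    return Value(data, unsafe=True)


def indirect_jump(target: Value) -> int:
    """Step 3: refuse to switch control to an unsafe target."""
    if target.unsafe:
        raise RuntimeError("unsafe data used as a jump target")
    return target.data


a = read_network(0x41414141)  # a is unsafe
b = a + Value(4)              # information flows from a to b, so b is unsafe
c = Value(0x1000)             # c was never derived from unsafe data
```

Here `indirect_jump(c)` succeeds, while `indirect_jump(b)` raises because `b` inherited its tag from `a`.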
5
Three Ways Language-based For programs written in special type-safe programming languages To track information flow at compile time Good: No runtime overhead Bad: Only for specific programming languages Not Practical
6
Three Ways Instrumentation To track the information flow and detect exploits at runtime Source code instrumentation Lower overhead Cannot track in third-party library code Requires a specification of library calls Complex, error-prone, side-effects Binary code instrumentation Runtime overhead: 37 times
7
Three Ways Hardware-based RIFLE Good: low overhead Bad: Non-trivial hardware extensions
8
Overview of LIFT Dynamically instruments the binary code for (1) tracking information flow (2) detecting security exploits Advantages: Low overhead, software-only, No source code Built on top of StarDBT Binary translator by Intel
9
Design of LIFT Basic design Tag management Information flow tracking Exploit detection Protection of the tag space Optimizations
10
Tag Management: Design Associate a one-bit tag with each byte of data in memory and each general data register 0: safe; 1: unsafe At the beginning: all tags are cleared to zero Data may be tagged with 1 when It is read from network or standard input Information flows from other unsafe data to it Unsafe data can become safe if it is reassigned from safe data
11
Tag Management: Storage For memory data Storage: a special memory region (tag space) Look-up: one-to-one mapping between a tag bit and a memory byte in the virtual address space Space overhead: 12.5% (one tag bit per 8-bit byte) Compression: memory data near each other usually have similar tag values For general registers Store tags in a dedicated extra register (64-bit) Reduce overhead If no spare registers: a special memory area No significant overhead, as the area is likely to stay in the L1 cache Hardware ??
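The one-to-one mapping between tag bits and memory bytes can be sketched as a flat bitmap. This is an illustrative model under stated assumptions, not LIFT's actual layout: the `TagSpace` class and its method names are hypothetical, and compression is not modeled.

```python
class TagSpace:
    """One tag bit per memory byte, stored in a flat bitmap (illustrative only)."""

    def __init__(self, memory_bytes: int) -> None:
        self.memory_bytes = memory_bytes
        # Eight tag bits fit in one byte; round up to whole bytes.
        # All tags start cleared (0 = safe).
        self.bits = bytearray((memory_bytes + 7) // 8)

    def get(self, addr: int) -> int:
        # Byte index is addr / 8, bit position is addr % 8.
        return (self.bits[addr >> 3] >> (addr & 7)) & 1

    def set(self, addr: int, unsafe: bool) -> None:
        mask = 1 << (addr & 7)
        if unsafe:
            self.bits[addr >> 3] |= mask
        else:
            self.bits[addr >> 3] &= (~mask) & 0xFF

    def overhead(self) -> float:
        # Tag storage relative to the memory it describes.
        return len(self.bits) / self.memory_bytes


tag_space = TagSpace(4096)
tag_space.set(10, True)  # mark byte 10 as unsafe
```

For a 4096-byte region the bitmap takes 512 bytes, which is the 1/8 = 12.5% overhead quoted on the slide.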
12
Information Flow Tracking Dynamically instrument instructions Instrumented once at runtime, and executed multiple times The instrumentation is done before the instruction in the original program Tracks information flow based on data dependencies but not control dependencies
13
Information Flow Tracking For data movement-based instructions E.g., MOV, PUSH, POP Tag propagation: source operand → destination For arithmetic instructions E.g., ADD, OR Tag propagation: both source operands → destination For instructions that involve only one operand E.g., INC The tag does not change
14
Information Flow Tracking Special cases XOR reg, reg and SUB reg, reg: both reset reg to zero Clear the corresponding tag
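The propagation rules from the last two slides can be written as one small function. This is a simplified sketch that assumes two-operand x86-style instructions where the destination is also a source; the `propagate` function and the dictionary of register tags are hypothetical, and PUSH/POP are modeled simply as moves between named locations.

```python
from typing import Dict, Optional


def propagate(tags: Dict[str, int], op: str, dst: str,
              src: Optional[str] = None) -> None:
    """Update the tag of `dst` for one instruction (0 = safe, 1 = unsafe)."""
    if op in ("XOR", "SUB") and src == dst:
        # Special case: the result is always zero, so clear the tag.
        tags[dst] = 0
    elif op in ("MOV", "PUSH", "POP"):
        # Data movement: the destination inherits the source tag.
        tags[dst] = tags[src]
    elif op in ("ADD", "OR", "XOR", "SUB"):
        # Arithmetic/logic: dst is also a source, so combine both tags.
        tags[dst] = tags[dst] | tags[src]
    elif op == "INC":
        # Single operand: the tag does not change.
        pass
    else:
        raise ValueError(f"unhandled opcode: {op}")


tags = {"eax": 1, "ebx": 0, "ecx": 0}
propagate(tags, "MOV", "ebx", "eax")  # ebx becomes unsafe
propagate(tags, "ADD", "ecx", "ebx")  # ecx becomes unsafe
propagate(tags, "XOR", "eax", "eax")  # eax becomes safe again
```

Only data dependencies are followed here, matching the slides; control dependencies are not tracked.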
15
Exploit Detection Also instrument instructions to detect exploits Unsafe data cannot be used as a return address or the destination of an indirect jump instruction
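A check of this kind might look like the sketch below, which inspects the tags of every byte of a stored return address or jump target before control is transferred. The function name, the `ExploitDetected` exception, the 4-byte target width, and the dictionary used as a stand-in for the tag space are all assumptions made for illustration.

```python
from typing import Dict


class ExploitDetected(Exception):
    """Raised when unsafe data is about to be used as a control target."""


def check_control_transfer(mem_tags: Dict[int, int], slot_addr: int,
                           width: int = 4) -> None:
    """Raise if any byte of the target stored at `slot_addr` is tagged unsafe."""
    for addr in range(slot_addr, slot_addr + width):
        if mem_tags.get(addr, 0):
            raise ExploitDetected(f"unsafe byte at {addr:#x} in jump target")


# Byte 0x1003 of a saved return address was overwritten by network input.
mem_tags = {0x1003: 1}
```

With these tags, `check_control_transfer(mem_tags, 0x2000)` returns normally, while `check_control_transfer(mem_tags, 0x1000)` raises `ExploitDetected`.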
16
Protection of Tag Space and Code It is necessary to protect them To protect the LIFT code Make the memory pages that store the LIFT code read-only To protect the tag space Turn off the access permission of the pages that store the tag values of the tag space itself Any access of the original program or hijacked code to the tag space results in access to the corresponding tag and triggers a fault
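The tag-space protection can be illustrated with address arithmetic. The sketch below assumes a 32-bit address space, a tag space starting at the made-up address `TAG_BASE`, and a linear byte-to-bit mapping; none of these specifics come from the slides. Because the tag space is itself part of the address space, it has its own tags, and the pages holding those tags are the ones made inaccessible.

```python
ADDR_SPACE = 1 << 32            # assumed 32-bit virtual address space
TAG_BASE = 0x4000_0000          # hypothetical start of the tag space
TAG_SIZE = ADDR_SPACE // 8      # one tag bit per byte


def tag_address(addr: int) -> int:
    """Address of the tag byte that holds the tag bit for `addr`."""
    return TAG_BASE + (addr >> 3)


# Tags of the tag space itself: these pages have access permission turned off.
PROTECTED_LO = tag_address(TAG_BASE)
PROTECTED_HI = tag_address(TAG_BASE + TAG_SIZE)  # exclusive upper bound


def lookup_tag_for(addr: int) -> int:
    """Model the tag look-up that instrumentation performs for a memory access."""
    t = tag_address(addr)
    if PROTECTED_LO <= t < PROTECTED_HI:
        # A real system would take a page fault here.
        raise MemoryError(f"access to tag space at {addr:#x} faults")
    return t
```

An ordinary access such as `lookup_tag_for(0x1000)` resolves to a normal tag byte, whereas any access that targets the tag space, for example `lookup_tag_for(0x4000_0010)`, lands in the protected range and faults.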
17
Optimizations 47 times runtime overhead Three binary optimizations
18
Fast Path (FP): Motivation Observation: for most server applications, the majority of tag propagations are zero-to-zero From safe data sources to a safe destination
19
FP: Approach Before a code segment, insert a check Check whether all its live-in and live-out registers and memory data are safe or not If so, no need to do tracking inside the code segment Run the fast binary version (check version) If not, run the slow version (track version)
20
FP: Approach Live-in: source operand Live-out: may change to safe after the execution if they are unsafe before the execution Others: (a) not used in the code segment (b) dead at the beginning or end of the code segment
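The fast-path decision can be sketched as a check in front of a code segment. This is a toy model, not LIFT's generated code: the segment (`ebx += eax`), the two versions of it, and `run_segment` are hypothetical, and only registers are modeled.

```python
from typing import Dict, Iterable


def segment_check_version(regs: Dict[str, int]) -> None:
    # Fast version: the original computation with no tag updates.
    regs["ebx"] += regs["eax"]


def segment_track_version(regs: Dict[str, int], tags: Dict[str, int]) -> None:
    # Slow version: the same computation plus tag propagation.
    regs["ebx"] += regs["eax"]
    tags["ebx"] |= tags["eax"]


def run_segment(regs: Dict[str, int], tags: Dict[str, int],
                live_in: Iterable[str], live_out: Iterable[str]) -> str:
    """Run the fast version only if every live-in and live-out tag is safe."""
    locations = list(live_in) + list(live_out)
    if all(tags[loc] == 0 for loc in locations):
        segment_check_version(regs)
        return "fast"
    segment_track_version(regs, tags)
    return "slow"


regs = {"eax": 1, "ebx": 2}
tags = {"eax": 0, "ebx": 0}
first = run_segment(regs, tags, live_in=["eax", "ebx"], live_out=["ebx"])
```

Live-outs are included in the check because an unsafe destination could become safe after the segment runs, and the fast version would fail to record that.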
21
FP: More Technique Details Difficult to know the address of all units at the beginning Run the check version first Postpone the check until the memory location is known Jump to track version when the check fails Granularity of code segments Basic blocks Hot trace Remove unnecessary checks Network processing component
22
Merged Check (MC): Motivation Temporal / Spatial Locality Recently accessed data is likely to be accessed again in the near future After an access to a location, memory locations that are nearby are also likely to be accessed in the near future To combine multiple checks into one Combine the temporally and spatially nearby checks
23
Merged Check: Approach Clustering the memory references into groups Scan all the instructions and build a data dependency graph for each memory reference Introduce a version number to represent the timing attribute Clustering based on spatial / temporal distance
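The spatial half of this idea can be sketched by grouping nearby byte addresses so that one masked test covers all of them. This is a deliberate simplification: it assumes a one-bit-per-byte tag bitmap, uses the hypothetical helpers `merge_checks` and `all_safe`, and does not model the dependency graph or version numbers.

```python
from collections import defaultdict
from typing import Dict, Iterable


def merge_checks(addresses: Iterable[int]) -> Dict[int, int]:
    """Group byte addresses by the tag byte holding their bits: index -> bit mask."""
    masks: Dict[int, int] = defaultdict(int)
    for addr in addresses:
        masks[addr >> 3] |= 1 << (addr & 7)
    return dict(masks)


def all_safe(tag_bits: bytearray, masks: Dict[int, int]) -> bool:
    """One masked test per group instead of one test per referenced byte."""
    return all((tag_bits[idx] & mask) == 0 for idx, mask in masks.items())


# Five byte references collapse into two checks.
masks = merge_checks([0x10, 0x11, 0x12, 0x13, 0x20])
tag_bits = bytearray(8)
```

The four adjacent bytes at `0x10` to `0x13` share one tag byte, so their four separate checks become a single comparison against the mask `0x0F`.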
24
Fast Switch (FS) When the program execution switches between the original binary code and the instrumented code, it requires saving and restoring the context These switches introduce large runtime overhead because they are inserted at many locations Use cheaper instructions and remove unnecessary saves / restores
25
Evaluation Effectiveness Performance
26
Evaluation: Effectiveness
27
Evaluation: Performance Throughput and response time of Apache Throughput: 6.2% (StarDBT: 3.4%) Time: 90.9%
28
Evaluation: Performance SPEC2000: 3.6 times runtime overhead on average
29
Conclusion A “Practical” Information flow tracking system Low-overhead Not requiring hardware extension Not requiring source code
30
Discussions Source-code instrumentation 81% on average for CPU-intensive C programs 5% on average for IO-intensive (server) programs If we are able to apply similar optimization techniques to source-code instrumentation, the performance could be “practical” Binary-code instrumentation CPU-bound: 24 times Apache server: worst case 25 times, most cases: 5~10 times
31
More Discussions Focus on basic design and three optimizations Not many details about the taint analysis Evaluation Effectiveness: false positive / false negative Performance IO-intensive vs. CPU-intensive More benchmarks Formal model to analyze taint analysis