Path-Based Fault Correlations

1 Path-Based Fault Correlations
Wei Le and Mary Lou Soffa, University of Virginia

2 Motivation
Fault detection: many techniques
  Many commercial and academic tools
  Automatic and fast
  Thousands, even millions, of reports can be generated [Prefix, Prefast, FindBugs]
Fault diagnosis: manual and inefficient
  15-20 reports per person-day [Microsoft06]
  120 lines of code per person-hour [Hatton08]
University of Virginia © Wei Le

3 Fault Diagnosis
Statically done over program source code
Goal: identify and fix causes of faults
Challenges
  Root causes located far from detection
  Little runtime information
  False alarms and benign errors

4 Goals of the Work
Develop static analysis techniques to help with diagnosis
Model runtime error states to predict
  How faults propagate along paths
  How faults enable other faults
Automatic
Scalable: real programs
Precise: path-sensitive

5 Key Idea: Fault Correlation
Fault relationships based on their dynamic behavior: a fault "causes" another?
[Figure: diagram relating an integer fault and several buffer overflows]

6 Faults Considered
A program fault is an abnormal condition caused by the violation of a required property at a program point.
Examples: buffer overflow, integer faults, resource leaks

7 Define Fault Correlation
Suppose f1 and f2 are two program faults.
Definition: f1 and f2 are correlated if the occurrence of f2 along path p is dependent on the error state of f1.
If f2 only occurs with f1, f2 is uniquely correlated with f1.

8 Error State
Fault Type | Code Signature | Error State
buffer overflow | strcpy(a,b) | len(a) > size(a)
integer overflow | unsigned i = a+b | value(i) == value(a)+value(b) - C
integer signedness | unsigned i; int j = i | value(j) < 0
integer truncation | unsigned i; uchar j = i | value(j) < value(i)
resource leak | Socket s = accept(); s = accept() | Avail(Socket) == Avail(Socket) - 1
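The error-state column above can be read as predicates over concrete values. The following is a minimal sketch, not the authors' model: it assumes "unsigned" is 32 bits wide (so the wrap-around constant C is 2^32) and "uchar" is 8 bits, and every function name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: "unsigned" is 32 bits wide, so the constant C in the table is 2^32. */
#define WRAP_C 4294967296ULL

/* Integer overflow: after i = a + b, value(i) == value(a) + value(b) - C. */
bool int_overflow_state(uint32_t a, uint32_t b)
{
    uint32_t i = (uint32_t)(a + b);        /* wraps modulo 2^32 */
    uint64_t true_sum = (uint64_t)a + b;   /* mathematically exact sum */
    return true_sum >= WRAP_C && (uint64_t)i == true_sum - WRAP_C;
}

/* Integer signedness: int j = i yields value(j) < 0 once i exceeds INT32_MAX. */
bool int_signedness_state(uint32_t i)
{
    return i > (uint32_t)INT32_MAX;
}

/* Integer truncation: uchar j = i yields value(j) < value(i) when i needs more than 8 bits. */
bool int_truncation_state(uint32_t i)
{
    uint8_t j = (uint8_t)i;
    return j < i;
}

/* Buffer overflow: len(a) > size(a), in the table's notation. */
bool buf_overflow_state(size_t len_a, size_t size_a)
{
    return len_a > size_a;
}
```

Each predicate only tests whether the error state holds for given values; it says nothing about which paths reach the faulty statement.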

9 Examples of Fault Correlations
Data dependency
  [1] char a[2]; char c[2];
  [2] strcpy(a, input);    [ len(a) > size(a) ]
  [3] strcpy(c, a);
Control dependency
  [1] unsigned i = 8 * strlen(input);    [ value(i) == 8*len(input) - C ]
  [2] if (i < size(a))
  [3]   strcpy(a, input);
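Both examples can be replayed on lengths alone, without touching memory. This is a hedged sketch under stated assumptions: "unsigned" is 32 bits, strcpy writes strlen + 1 bytes, input lengths stay below 2^61, and the type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Whether the first and the second fault of an example occur for a given input. */
typedef struct {
    bool first;
    bool second;
} fault_pair;

/* Data dependency: [2] strcpy(a, input); [3] strcpy(c, a). */
fault_pair data_dependency(size_t input_len, size_t size_a, size_t size_c)
{
    fault_pair r;
    r.first = input_len + 1 > size_a;    /* [2] overflows a */
    /* Error state of the first fault: a now carries all input_len characters. */
    r.second = input_len + 1 > size_c;   /* [3] copies that oversized string into c */
    return r;
}

/* Control dependency: [1] i = 8 * strlen(input); [2] if (i < size(a)); [3] strcpy(a, input). */
fault_pair control_dependency(uint64_t input_len, uint32_t size_a)
{
    fault_pair r;
    uint64_t exact = 8u * input_len;     /* exact product, assuming input_len < 2^61 */
    uint32_t i = (uint32_t)exact;        /* [1] wraps modulo 2^32 */
    r.first = exact > UINT32_MAX;        /* the integer overflow */
    bool guard = i < size_a;             /* [2] passes on the wrapped value */
    r.second = guard && input_len + 1 > size_a;  /* [3] overflows a */
    return r;
}
```

In the control-dependency model the buffer overflow can only occur when the integer overflow has occurred first, which matches the slide's notion of a fault enabled by another fault's error state.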

10 Value I: Impact and Propagation of a Fault
Adapted from ffmpeg; variables: int i, char* input
  1 p = NULL
  2 i = strlen(input)
  3 i > 0 (branches: yes to node 4; no, marked "inf")
  4 p = malloc(8*i+8)
  5 p[i] = '\0'
Input: xyzabc…d, with length > max_int
After node 2: i < 0
Result at node 5: null-pointer dereference!

11 Value II: Group Faults
Polymorph-0.4.0
  char filename[];
  strcpy(filename, FileData.cFileName);
  convert_fileName(filename);

  void convert_filename(char *original) {
    char newname[];
    char *bslash = NULL;
    ...
    if (does_nameHaveUppers(original)) {
      for (i = 0; i < strlen(original); i++) {
        if (isupper(original[i])) {
          newname[i] = tolower(original[i]);
          continue;
        }
        newname[i] = original[i];
      }
      newname[i] = '\0';
    }
    else strcpy(newname, original);
    if (clean) {
      bslash = strrchr(newname, '\\');
      if (bslash != NULL)
        strcpy(newname, &bslash[1]);
    }
    ...
    strcpy(original, newname);
  }
Correlation Graph nodes: 2,buf,filename; 10,buf,newname; 12,buf,newname; 14,buf,newname; 16,buf,newname; 19,buf,newname; 21,buf,original
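One way to use a correlation graph for grouping is to merge correlated faults into connected components, so that each group can be diagnosed together. The sketch below uses a union-find structure for this; it is an illustration under assumptions, not the authors' implementation. Fault ids, MAX_FAULTS, and all function names are hypothetical, and the edges fed to it would come from the correlation analysis.

```c
#include <assert.h>

#define MAX_FAULTS 64   /* hypothetical upper bound on detected faults */

/* Union-find over fault ids 0..count-1; each set is one group of correlated faults. */
typedef struct {
    int parent[MAX_FAULTS];
    int count;
} fault_groups;

/* Start with every fault in its own group. */
void groups_init(fault_groups *g, int n)
{
    assert(n >= 0 && n <= MAX_FAULTS);
    g->count = n;
    for (int i = 0; i < n; i++)
        g->parent[i] = i;
}

/* Return the representative of the group containing fault f. */
int groups_find(fault_groups *g, int f)
{
    while (g->parent[f] != f) {
        g->parent[f] = g->parent[g->parent[f]];   /* path halving */
        f = g->parent[f];
    }
    return f;
}

/* Record one correlation edge between f1 and f2 by merging their groups. */
void groups_correlate(fault_groups *g, int f1, int f2)
{
    int r1 = groups_find(g, f1);
    int r2 = groups_find(g, f2);
    if (r1 != r2)
        g->parent[r2] = r1;
}

/* Number of groups left to diagnose. */
int groups_count(fault_groups *g)
{
    int n = 0;
    for (int i = 0; i < g->count; i++)
        if (groups_find(g, i) == i)
            n++;
    return n;
}
```

If seven faults were linked by six correlation edges into a single chain, groups_count would report one group instead of seven separate reports.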

12 Does Correlation Exist in General?
Approach
  Manually analyze 300 vulnerability reports from 2009 CVE
Discoveries
  Fault correlation widely exists
  Fault correlations occur along different paths and between different types of faults
  Code inspectors manually correlate the faults

13 Types of Correlated Faults Found in CVE
[Table: correlation matrix over fault types int, buf, nullptr, free, leak, loop, race, privilege]
Legend: *: unique correlation; ×: not unique; #: correlated with the same fault

14 Computation: Challenges
Compute path information for faults: expensive
Detect multiple types of faults on the same framework
Model the error state and incorporate it in the static analysis

15 Computation: an Overview
Given two statically detected faults, we determine whether they are correlated or uniquely correlated
Given a statically detected fault, we determine whether it correlates with a fault not yet detected
Key: error states propagate and create impact:
  Directly impact the determination of faults
  First change path feasibility, through which the error states indirectly impact faults

16 Computation: Two Steps
Phase 1: Fault Detection (input: Program; output: Faults, Path Segments of a Fault)
  1 Construct Query
  2 Propagate Query
  3 Resolve Query (yes/no)
  4 Identify Paths
Phase 2: Fault Correlation (input: Faults; output: Correlations)
  1 Model Error State
  2 Impact on Feasibility
  3 Impact on Faults
  4 Determine Corr-Types

17 Computation: Case I
Error states directly impact on faults
Variables: unsigned i, unsigned x
  1 char a[128]
  2 flag (branches: y, n, to nodes 3 and 4)
  3 i = 128
  4 scanf("%d", &i)
  5 x = 8*i
  6 p = malloc(x)
  7 memcpy(p, y, i)
Fault Detection panel: query Q5 (node 5), result: int fault
After node 5: value(x) == 8*value(i) - C
Fault Correlation panel: query Q7 (node 7), result: buf safe – buf fault
Faults: F5, F8 (new)
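Case I can be replayed on concrete values to see how the error state of the integer fault at node 5 decides the buffer fault at node 7. This is a minimal sketch, assuming "unsigned" is 32 bits and that malloc returns a block of exactly x bytes; the function names are hypothetical and the sketch is not the Marple analysis itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Node 5: x = 8 * i in 32-bit unsigned arithmetic (assumed width), wrapping modulo 2^32. */
uint32_t node5_x(uint32_t i)
{
    return (uint32_t)(8u * (uint64_t)i);
}

/* Integer fault at node 5: the exact product 8 * i does not fit in 32 bits. */
bool node5_int_fault(uint32_t i)
{
    return 8u * (uint64_t)i > UINT32_MAX;
}

/* Node 7: memcpy(p, y, i) writes i bytes into p, whose size is x from node 6. */
bool node7_buf_fault(uint32_t i)
{
    return i > node5_x(i);
}
```

Without the wrap at node 5, x is 8*i and can never be smaller than i, so in this model the buffer fault at node 7 occurs only on inputs that also trigger the integer fault.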

18 Computation: Case II
Error states first impact feasibility and then faults
  1 p = NULL
  2 scanf("%s", x)
  3 int i = strlen(x)+1
  4 i > 0 (branches: y, n)
  5 p = malloc(8*i)
  6 p[i] = '\0'
Fault Detection panel: query Q3 (node 3), result: int fault
After node 3: value(i) < 0
Fault Correlation panel: Q3 feasibility, then query Q6 (node 6), result: ptr inf – ptr fault
Faults: F3, F6 (new)
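Case II can be replayed the same way: the integer fault at node 3 first flips the branch at node 4, and only then exposes the null-pointer dereference at node 6. The sketch assumes a 32-bit two's-complement int, a string length below 2^64 - 1, and a malloc that always succeeds; all names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Node 3: int i = strlen(x) + 1, modeled as wrapping into a 32-bit signed value. */
int32_t node3_i(uint64_t len_x)
{
    uint32_t low = (uint32_t)(len_x + 1);   /* keep the low 32 bits */
    /* Reinterpret as signed without relying on an implementation-defined cast. */
    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;
    return (int32_t)((int64_t)low - 4294967296LL);
}

/* Integer fault at node 3: the exact value strlen(x) + 1 does not fit in int. */
bool node3_int_fault(uint64_t len_x)
{
    return len_x + 1 > (uint64_t)INT32_MAX;
}

/* Nodes 1, 4, 5, 6: p stays NULL unless the branch i > 0 is taken. */
bool node6_null_deref(uint64_t len_x)
{
    bool p_is_null = true;          /* node 1: p = NULL */
    int32_t i = node3_i(len_x);
    if (i > 0)                      /* node 4 */
        p_is_null = false;          /* node 5: p = malloc(8*i), assumed to succeed */
    return p_is_null;               /* node 6: p[i] = '\0' dereferences p */
}
```

Because strlen(x) + 1 is at least 1 when no wrap occurs, the "n" branch at node 4 is infeasible without the integer fault, so in this model the dereference of NULL occurs only together with that fault.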

19 Experiments: Goals and Setup
Implementation: Marple
  Phoenix and Disolver
Faults: buffer bounds error, integer faults, null-pointer dereference
Benchmarks:
  5 from bugbench and buffer overflow benchmarks [Lu05, zitser04]
  4 from real-world programs: putty, apache, ffmpeg
Goals:
  Automatically find correlations
  Demonstrate the properties and usefulness of fault correlations

20 Results: Fault Correlations
Columns: Benchmark; int,buf,ptr; Corr-Faults; Types; New Faults from Correlation; Groups
wu-ftp1 4 100% buf-buf 1
sendmail-6 3 33.3% int-buf 1 (buf)
sendmail-2 5 80% 2
polymorph 8
gzip 24 66.7% int-buf, buf-buf, buf-int 9
ffmpeg 7 28.5% int-buf, int-ptr 11 (1 ptr, 10 buf)
tightvnc 11 72.7% int-int, int-buf 7 (2 int, 5 buf) 6
putty int-int, int-buf, int-int 5 (3 int, 2 buf)
apache -
Prioritize Group

21 Results: Properties of Fault Correlations
Columns: Benchmark; Size (kloc); Total pair; Unique/Not; Dir/Indir; Inter/Intra; Corr-Proc; Analysis Cost (detect/corr)
wu-ftp1 0.4 7 4/3 7/0 1-10 3.9 m/43.2 s
sendmail-6 1 1/0 0/1 1-1 108.0 m/5.6 s
sendmail-2 0.7 3 0/3 3/0 10.8 s/3.7 s
polymorph 1.7 13 11/2 13/0 8/5 1-3 39.4 s/9.3 s
gzip 8.2 22 12/10 21/1 15/7 1-19 29.3 m/90.0 m
ffmpeg 39.8 11 11/0 1/10 0/11 114.2 m/3.4 m
putty 66.5 17 10/0 2/8 62.8 m/1.2 m
tightvnc 78.9 10 14/3 17/0 16/1 1-2 60.3 m/2.4 m
apache 418.9 - 217.8 m/2.1 s

22 False Positives and False Negatives
The fault is a false positive: 23
The relationship is a false positive: 2
False positives can be grouped
False negatives: 2
  Buffer read overflow (not modeled): 1
  Don't-know paths: 1

23 Related Work
Fault propagation
  Understand vulnerability and exploits [chen03, Ghosh98]
  Debugging: isolate inputs that produce the faults [clause09]
  Effective testing [Goradia93, Wu93]
Fault ranking and localization
  Z-statistics [kremenek02, kremenek04]
  Aware [Heckman09]
  Logistic regression model to coordinate factors [ruthruff08]
Other types of correlations such as branch correlations [esp02, bodik07]

24 Conclusions
Developed the concept of fault correlation
  Formally defined fault correlations
  Identified the values of fault correlations in fault diagnosis
    Defined correlation graph
    Reduce number of faults to be diagnosed
Developed a scalable, precise (path-sensitive) static technique to automatically detect fault correlations
Experimentally demonstrated that fault correlations commonly exist and that we can find them

25 Questions?

