Published by Angel Barton. Modified over 9 years ago.
1
Program Analysis and Design Conformance
  Martin Rinard
  Laboratory for Computer Science
  Massachusetts Institute of Technology
2
Research Overview: Program Analysis
  Commutativity Analysis for C++ Programs [PLDI96]
  Memory Disambiguation for Multithreaded C Programs
    Pointer Analysis [PLDI99]
    Region Analysis [PPoPP99, PLDI00]
  Pointer and Escape Analysis for Multithreaded Java Programs [OOPSLA99, PLDI01, PPoPP01]
3
Research Overview: Transformations
  Automatic Parallelization
    Object-Oriented Programs with Linked Data Structures [PLDI96]
    Divide and Conquer Programs [PPoPP99, PLDI00]
  Synchronization Optimizations
    Lock Coarsening [POPL97, PLDI98]
    Synchronization Elimination [OOPSLA99]
    Optimistic Synchronization Primitives [PPoPP97]
  Memory Management Optimizations
    Stack Allocation [OOPSLA99, PLDI01]
    Per-Thread Heap Allocation
4
Research Overview: Verifications of Safety Properties
  Data Race Freedom [PLDI00]
  Array Bounds Checks [PLDI00]
  Correctness of Region-Based Allocation [PPoPP01]
  Credible Compilation [RTRV99]
    Correctness of Dataflow Analysis Results
    Correctness of Standard Compiler Optimizations
5
Talk Overview
  Memory Disambiguation
    Goal: Verify Data Race Freedom for Multithreaded Divide and Conquer Programs
    Analyses: Pointer Analysis, Accessed Region Analysis
    Experience integrating information from the developer into the memory disambiguation analysis
  Role Verification
  Design Conformance
6
Basic Memory Disambiguation Problem
  *p = v; (write v into the memory location that p points to)
  What memory locations may *p = v access?
  Without any analysis: *p = v may access any location
7
Basic Memory Disambiguation Problem
  *p = v; (write v into the memory location that p points to)
  What memory locations may *p = v access?
  With analysis: *p = v may access only the location marked in the figure; it does not access the other memory locations
8
Static Memory Disambiguation Analyze the program to characterize the memory locations that statements in the program read and write Fundamental problem in program analysis with many applications
9
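Application: Verify Data Race Freedom
  Program: *p = v1 || *q = v2 (the two writes execute in parallel)
  [Figure: two behaviors of the parallel writes *p = v1 and *q = v2, one labeled "Does This" and the other labeled "NOT This"]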
10
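Example - Divide and Conquer Sort
  [Figure: an 8-element array holding the values 1 through 8 in unsorted order]
11
Example - Divide and Conquer Sort
  Divide
  [Figure: the input array split into subarrays]
12
Example - Divide and Conquer Sort
  Divide, Conquer
  [Figure: each subarray sorted independently]
13
Example - Divide and Conquer Sort
  Divide, Conquer, Combine
  [Figure: sorted subarrays merged into larger sorted subarrays]
14
Example - Divide and Conquer Sort
  Divide, Conquer, Combine
  [Figure: the final merge produces the fully sorted array]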
15
Divide and Conquer Algorithms Lots of Generated Concurrency Solve Subproblems in Parallel
16
Divide and Conquer Algorithms Lots of Recursively Generated Concurrency Recursively Solve Subproblems in Parallel
17
Divide and Conquer Algorithms Lots of Recursively Generated Concurrency Recursively Solve Subproblems in Parallel Combine Results in Parallel
18
“Sort n Items in d, Using t as Temporary Storage”
  void sort(int *d, int *t, int n) {
    if (n > CUTOFF) {
      spawn sort(d, t, n/4);
      spawn sort(d+n/4, t+n/4, n/4);
      spawn sort(d+n/2, t+n/2, n/4);
      spawn sort(d+3*(n/4), t+3*(n/4), n-3*(n/4));
      sync;
      spawn merge(d, d+n/4, d+n/2, t);
      spawn merge(d+n/2, d+3*(n/4), d+n, t+n/2);
      sync;
      merge(t, t+n/2, t+n, d);
    } else insertionSort(d, d+n);
  }
19
“Sort n Items in d, Using t as Temporary Storage” (code as on slide 18)
  The four spawn sort(...) calls: divide array into subarrays and recursively sort subarrays in parallel
20
“Sort n Items in d, Using t as Temporary Storage” (code as on slide 18)
  Subproblems Identified Using Pointers Into Middle of Array
  [Figure: the unsorted array with pointers d, d+n/4, d+n/2, d+3*(n/4) marking the four subarrays]
21
“Sort n Items in d, Using t as Temporary Storage” (code as on slide 18)
  Sorted Results Written Back Into Input Array
  [Figure: the array with each quarter sorted, at d, d+n/4, d+n/2, d+3*(n/4)]
22
“Merge Sorted Quarters of d Into Halves of t” (code as on slide 18)
  The two spawn merge(...) calls
  [Figure: sorted quarters of d merged into the halves of t starting at t and t+n/2]
23
“Merge Sorted Halves of t Back Into d” (code as on slide 18)
  The final merge(t, t+n/2, t+n, d) call
  [Figure: sorted halves of t, at t and t+n/2, merged into the fully sorted d]
24
“Use a Simple Sort for Small Problem Sizes” (code as on slide 18)
  The insertionSort(d, d+n) call
  [Figure: unsorted array between d and d+n]
25
“Use a Simple Sort for Small Problem Sizes” (code as on slide 18)
  The insertionSort(d, d+n) call
  [Figure: the array between d and d+n after insertion sort]
26
What Do You Need To Know To Verify Data Race Freedom? Points-to Information (data blocks that pointers point into) Region Information (accessed regions within data blocks)
27
Information Needed To Verify Race Freedom
  sort(d,t,n/4); sort(d+n/4,t+n/4,n/4); sort(d+n/2,t+n/2,n/4); sort(d+3*(n/4),t+3*(n/4),n-3*(n/4));
  d and t point to different memory blocks
  Calls to sort access disjoint parts of d and t
  Together, calls access [d,d+n-1] and [t,t+n-1]
  [Figure: for each call, the part of [d,d+n-1] and [t,t+n-1] that it accesses]
28
Information Needed To Verify Race Freedom
  merge(d,d+n/4,d+n/2,t); merge(d+n/2,d+3*(n/4),d+n,t+n/2); merge(t,t+n/2,t+n,d);
  d and t point to different memory blocks
  First two calls to merge access disjoint parts of d and t
  Together, calls access [d,d+n-1] and [t,t+n-1]
  [Figure: for each call, the part of [d,d+n-1] and [t,t+n-1] that it accesses]
29
Information Needed To Verify Race Freedom
  insertionSort(d,d+n);
  Calls to insertionSort access [d,d+n-1]
  [Figure: the block from d to d+n-1]
30
What Do You Need To Know To Verify Data Race Freedom? Points-to Information (d and t point to different data blocks) Symbolic Region Information (accessed regions within d and t blocks)
31
How Hard Is It To Figure These Things Out?
32
How Hard Is It For the Program Analysis To Figure These Things Out?
  Challenging
33
  void insertionSort(int *l, int *h) {
    int *p, *q, k;
    for (p = l+1; p < h; p++) {
      for (k = *p, q = p-1; l <= q && k < *q; q--)
        *(q+1) = *q;
      *(q+1) = k;
    }
  }
  Not immediately obvious that insertionSort(l,h) accesses [l,h-1]
34
How Hard Is It For the Program Analysis To Figure These Things Out?
  void merge(int *l1, int *m, int *h2, int *d) {
    int *h1 = m;
    int *l2 = m;
    while ((l1 < h1) && (l2 < h2))
      if (*l1 < *l2) *d++ = *l1++;
      else *d++ = *l2++;
    while (l1 < h1) *d++ = *l1++;
    while (l2 < h2) *d++ = *l2++;
  }
  Not immediately obvious that merge(l,m,h,d) accesses [l,h-1] and [d,d+(h-l)-1]
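The three procedures can be tried out sequentially by dropping the spawn and sync keywords (the serial elision of the Cilk code). The sketch below does that under several assumptions that are not in the slides: CUTOFF is set to 4, n is a power of two so that n/4 and n/2 divide exactly at every level, and the inner loop of insertionSort is restructured so that q never moves below l.

```c
#include <assert.h>

#define CUTOFF 4 /* assumed value; the slides do not give one */

/* Sorts [l, h-1] in place and touches nothing outside that range. */
static void insertionSort(int *l, int *h) {
    if (h - l < 2) return;
    for (int *p = l + 1; p < h; p++) {
        int k = *p;
        int *q = p;
        /* Shift larger elements one slot right; q stays within [l, p]. */
        while (q > l && k < *(q - 1)) {
            *q = *(q - 1);
            q--;
        }
        *q = k;
    }
}

/* Merges sorted [l1, m-1] and [m, h2-1] into [d, d+(h2-l1)-1]. */
static void merge(const int *l1, const int *m, const int *h2, int *d) {
    const int *h1 = m, *l2 = m;
    while (l1 < h1 && l2 < h2)
        *d++ = (*l1 < *l2) ? *l1++ : *l2++;
    while (l1 < h1) *d++ = *l1++;
    while (l2 < h2) *d++ = *l2++;
}

/* Serial elision of the slide's sort: spawn and sync removed. */
void sort(int *d, int *t, int n) {
    assert(n >= 0);
    if (n > CUTOFF) {
        sort(d, t, n / 4);
        sort(d + n / 4, t + n / 4, n / 4);
        sort(d + n / 2, t + n / 2, n / 4);
        sort(d + 3 * (n / 4), t + 3 * (n / 4), n - 3 * (n / 4));
        merge(d, d + n / 4, d + n / 2, t);
        merge(d + n / 2, d + 3 * (n / 4), d + n, t + n / 2);
        merge(t, t + n / 2, t + n, d);
    } else {
        insertionSort(d, d + n);
    }
}
```

A caller passes the data array, a scratch array of the same length, and the element count, for example sort(d, t, 16).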
35
Issues
  Heavy Use of Pointers
    Pointers into Middle of Arrays
    Pointer Arithmetic
    Pointer Comparison
  Multiple Procedures
    sort(int *d, int *t, int n)
    insertionSort(int *l, int *h)
    merge(int *l, int *m, int *h, int *t)
  Recursion
  Multithreading
36
Pointer Analysis
  For each program point, computes where each pointer may point
    e.g. “p → x before statement *p = 1”
  Complications
    1. Statically unbounded number of locations
       recursive data structures (lists, trees)
       dynamically allocated arrays
    2. Multiple possible executions of the program may create different dynamic data structures
37
Memory Abstraction
  [Figure: physical memory (stack and heap, with variables p, q, r, i, j, v and a list starting at head) mapped to abstract memory]
  Allocation block for each variable declaration
  Allocation block for each memory allocation site
39
Pointer Analysis Summary Key Challenge for Multithreaded Programs: Analyzing interactions between threads Solution: Interference Edges Record edges generated by each thread Captures effect of parallel threads on points-to information of other threads
40
What Pointer Analysis Gives Us
  Disambiguation of Memory Accesses Via Pointers
    Pointer-based loads and stores: use pointer analysis results to derive the allocation block that each pointer-based load or store statement accesses
  MOD-REF or READ-WRITE SETS Analysis
    All loads and stores
    Procedures: use the memory access information for loads and stores to compute the allocation blocks that each procedure accesses
41
Is This Information Enough?
42
NO Necessary but not Sufficient Parallel Tasks Access (Disjoint) Regions of Same Allocated Block of Memory
43
Structure of Analysis
  Pointer Analysis: Disambiguate Memory at the Granularity of Allocation Blocks
  Bounds Analysis: Symbolic Upper and Lower Bounds for Each Memory Access in Each Procedure
  Region Analysis: Symbolic Regions Accessed By Execution of Each Procedure
  Data Race Freedom: Check that Parallel Threads Are Independent
44
Running Example – Array Increment
  void f(char *p, int n) {
    if (n > CUTOFF) {
      spawn f(p, n/2);      /* increment first half */
      spawn f(p+n/2, n/2);  /* increment second half */
      sync;
    } else {
      /* base case: increment small array */
      int i = 0;
      while (i < n) { *(p+i) += 1; i++; }
    }
  }
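To experiment with the running example in plain C, the spawn and sync keywords can simply be dropped. The sketch below is that serial elision, with CUTOFF set to 4 as an assumption (the slides do not give a value). As in the slide's code, an odd n leaves the last element of the range untouched, because both recursive calls receive n/2 elements.

```c
#include <assert.h>

#define CUTOFF 4 /* assumed value; the slides do not give one */

/* Serial elision of f: increments elements of [p, p+n-1]. */
void f(char *p, int n) {
    assert(n >= 0);
    if (n > CUTOFF) {
        f(p, n / 2);         /* first half */
        f(p + n / 2, n / 2); /* second half */
    } else {
        /* base case: increment small array */
        int i = 0;
        while (i < n) {
            *(p + i) += 1;
            i++;
        }
    }
}
```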
45
Intraprocedural Bounds Analysis
  Stages: Pointer Analysis, Bounds Analysis, Region Analysis, Data Race Detection
  Current stage: Bounds Analysis (Symbolic Upper and Lower Bounds for Each Memory Access in Each Procedure)
46
Intraprocedural Bounds Analysis
  GOAL: For each pointer and array index variable at each program point, derive lower and upper bounds
    E.g. “$0 \le i \le n-1$ at statement *(p+i) += 1”
  Bounds are symbolic expressions
    variables represent initial values of parameters of enclosing procedure
    bounds are combinations of variables
    example expression for f(p,n): p+(n/2)-1
47
Intraprocedural Bounds Analysis
  What are upper and lower bounds for i at each program point in base case?
  int i = 0;
  while (i < n) { *(p+i) += 1; i++; }
48
Bounds Analysis, Step 1: Build control flow graph
  Block 1: i = 0
  Block 2: i < n
  Block 3: *(p+i) += 1; i = i+1
49
Bounds Analysis, Step 2: Set up bounds at beginning of basic blocks
  Before i = 0: $l_1 \le i \le u_1$
  Before i < n: $l_2 \le i \le u_2$
  Before *(p+i) += 1; i = i+1: $l_3 \le i \le u_3$
50
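Bounds Analysis, Step 3: Compute transfer functions
  Before i = 0: $l_1 \le i \le u_1$; after i = 0: $0 \le i \le 0$
  Before i < n: $l_2 \le i \le u_2$
  At *(p+i) += 1: $l_3 \le i \le u_3$; after i = i+1: $l_3+1 \le i \le u_3+1$
51
Bounds Analysis, Step 3: Compute transfer functions
  Before i = 0: $l_1 \le i \le u_1$; after i = 0: $0 \le i \le 0$
  Before i < n: $l_2 \le i \le u_2$; true branch: $l_2 \le i \le n-1$; false branch: $n \le i \le u_2$
  At *(p+i) += 1: $l_3 \le i \le u_3$; after i = i+1: $l_3+1 \le i \le u_3+1$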
52
Bounds Analysis, Step 4: Key Step: set up constraints for bounds
  Before i = 0: $l_1 \le i \le u_1$; after i = 0: $0 \le i \le 0$
  Before i < n: $l_2 \le i \le u_2$; true branch: $l_2 \le i \le n-1$; false branch: $n \le i \le u_2$
  At *(p+i) += 1: $l_3 \le i \le u_3$; after i = i+1: $l_3+1 \le i \le u_3+1$
  Build Region Constraints
    $[0,\ 0] \subseteq [l_2,\ u_2]$
    $[l_3+1,\ u_3+1] \subseteq [l_2,\ u_2]$
    $[l_2,\ n-1] \subseteq [l_3,\ u_3]$
55
Bounds Analysis, Step 4: Key Step: set up constraints for bounds
  Same flow graph and region constraints; the bounds before i = 0 become $-\infty \le i \le +\infty$
57
Bounds Analysis, Step 4: Key Step: set up constraints for bounds
  Region Constraints
    $[0,\ 0] \subseteq [l_2,\ u_2]$
    $[l_3+1,\ u_3+1] \subseteq [l_2,\ u_2]$
    $[l_2,\ n-1] \subseteq [l_3,\ u_3]$
  Inequality Constraints
    Lower bounds: $l_2 \le 0$, $l_2 \le l_3+1$, $l_3 \le l_2$
    Upper bounds: $0 \le u_2$, $u_3+1 \le u_2$, $n-1 \le u_3$
58
Bounds Analysis, Step 5: Generate symbolic expressions for bounds
  Goal: express bounds in terms of parameters
    $l_2 = c_1 p + c_2 n + c_3$
    $l_3 = c_4 p + c_5 n + c_6$
    $u_2 = c_7 p + c_8 n + c_9$
    $u_3 = c_{10} p + c_{11} n + c_{12}$
59
Bounds Analysis, Step 5: Generate symbolic expressions for bounds
  Goal: express bounds in terms of parameters
    $l_2 = c_1 p + c_2 n + c_3$
    $l_3 = c_4 p + c_5 n + c_6$
    $u_2 = c_7 p + c_8 n + c_9$
    $u_3 = c_{10} p + c_{11} n + c_{12}$
  Inequality constraints to satisfy
    Lower bounds: $l_2 \le 0$, $l_2 \le l_3+1$, $l_3 \le l_2$
    Upper bounds: $0 \le u_2$, $u_3+1 \le u_2$, $n-1 \le u_3$
60
Bounds Analysis, Step 6: Substitute expressions into constraints
  Lower bounds
    $c_1 p + c_2 n + c_3 \le 0$
    $c_1 p + c_2 n + c_3 \le c_4 p + c_5 n + c_6 + 1$
    $c_4 p + c_5 n + c_6 \le c_1 p + c_2 n + c_3$
  Upper bounds
    $0 \le c_7 p + c_8 n + c_9$
    $c_{10} p + c_{11} n + c_{12} + 1 \le c_7 p + c_8 n + c_9$
    $n - 1 \le c_{10} p + c_{11} n + c_{12}$
61
Bounds Analysis, Step 7: Reduce symbolic inequalities to linear inequalities
  $c_1 p + c_2 n + c_3 \le c_4 p + c_5 n + c_6$ if $c_1 \le c_4$, $c_2 \le c_5$, and $c_3 \le c_6$
  (a sufficient condition when p and n are non-negative)
62
Bounds Analysis, Step 8: Apply reduction and generate a linear program
  Lower bounds
    $c_1 \le 0$, $c_2 \le 0$, $c_3 \le 0$
    $c_1 \le c_4$, $c_2 \le c_5$, $c_3 \le c_6 + 1$
    $c_4 \le c_1$, $c_5 \le c_2$, $c_6 \le c_3$
  Upper bounds
    $0 \le c_7$, $0 \le c_8$, $0 \le c_9$
    $c_{10} \le c_7$, $c_{11} \le c_8$, $c_{12} + 1 \le c_9$
    $0 \le c_{10}$, $1 \le c_{11}$, $-1 \le c_{12}$
63
Bounds Analysis, Step 8: Apply reduction and generate a linear program
  Constraints: the lower-bound and upper-bound inequalities of slide 62
  Objective Function: max: $(c_1 + \dots + c_6) - (c_7 + \dots + c_{12})$
  (maximize the lower bounds and minimize the upper bounds)
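The coefficient-wise reduction of Step 7 is easy to check mechanically. The sketch below does not solve the linear program; it only tests whether a candidate choice of bounds satisfies the six inequality constraints of the loop example (l2 <= 0, l2 <= l3+1, l3 <= l2, 0 <= u2, u3+1 <= u2, n-1 <= u3). The type Affine and the function names are hypothetical, introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Affine form cp*p + cn*n + c0 over the procedure parameters p and n. */
typedef struct { int cp, cn, c0; } Affine;

/* Sufficient test for a <= b when p, n >= 0: compare coefficient by coefficient. */
static bool affine_le(Affine a, Affine b) {
    return a.cp <= b.cp && a.cn <= b.cn && a.c0 <= b.c0;
}

/* Returns a + 1. */
static Affine plus1(Affine a) {
    a.c0 += 1;
    return a;
}

/* Checks the six inequality constraints of the loop example. */
bool bounds_ok(Affine l2, Affine l3, Affine u2, Affine u3) {
    const Affine zero = {0, 0, 0};
    const Affine n_minus_1 = {0, 1, -1};
    return affine_le(l2, zero) && affine_le(l2, plus1(l3)) && affine_le(l3, l2)
        && affine_le(zero, u2) && affine_le(plus1(u3), u2)
        && affine_le(n_minus_1, u3);
}
```

With l2 = l3 = 0, u2 = n, and u3 = n-1 the check succeeds; tightening u2 to n-1 makes it fail.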
64
Bounds Analysis, Step 9: Solve linear program to extract bounds
  Solution
    $c_1 = 0$, $c_2 = 0$, $c_3 = 0$
    $c_4 = 0$, $c_5 = 0$, $c_6 = 0$
    $c_7 = 0$, $c_8 = 1$, $c_9 = 0$
    $c_{10} = 0$, $c_{11} = 1$, $c_{12} = -1$
65
Bounds Analysis, Step 9: Solve linear program to extract bounds
  Solution
    $c_1 = 0$, $c_2 = 0$, $c_3 = 0$
    $c_4 = 0$, $c_5 = 0$, $c_6 = 0$
    $c_7 = 0$, $c_8 = 1$, $c_9 = 0$
    $c_{10} = 0$, $c_{11} = 1$, $c_{12} = -1$
  Symbolic Bounds
    $l_2 = 0$, $l_3 = 0$, $u_2 = n$, $u_3 = n-1$
66
Bounds Analysis, Step 10: Substitute bounds at each program point
  Symbolic Bounds: $l_2 = 0$, $l_3 = 0$, $u_2 = n$, $u_3 = n-1$
  Before i = 0: $-\infty \le i \le +\infty$; after i = 0: $0 \le i \le 0$
  Before i < n: $0 \le i \le n$; true branch: $0 \le i \le n-1$; false branch: $n \le i \le n$
  At *(p+i) += 1: $0 \le i \le n-1$; after i = i+1: $1 \le i \le n$
67
Compute access regions at each load or store
  At *(p+i) += 1: $0 \le i \le n-1$
  Access Regions: [p, p+n-1]
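The statically derived region [p, p+n-1] can be compared with what the base case actually touches at run time. The sketch below instruments the loop to record the smallest and largest offset from p that is accessed; the type Region and the function name are hypothetical, and this dynamic check only illustrates the result, it is not part of the analysis.

```c
#include <assert.h>
#include <stddef.h>

/* Observed range of offsets from p; touched is 0 if nothing was accessed. */
typedef struct { ptrdiff_t lo, hi; int touched; } Region;

/* Base case of f, instrumented to record which offsets it accesses. */
Region base_case_region(char *p, int n) {
    Region r = {0, 0, 0};
    int i = 0;
    assert(n >= 0);
    while (i < n) {
        ptrdiff_t off = (p + i) - p;
        if (!r.touched) {
            r.lo = r.hi = off;
            r.touched = 1;
        } else {
            if (off < r.lo) r.lo = off;
            if (off > r.hi) r.hi = off;
        }
        *(p + i) += 1;
        i++;
    }
    return r;
}
```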
68
Interprocedural Region Analysis
  Stages: Pointer Analysis, Bounds Analysis, Region Analysis, Data Race Detection
  Current stage: Region Analysis (Symbolic Regions Accessed By Execution of Each Procedure)
69
Interprocedural Region Analysis
  GOAL: Compute accessed regions of memory for each procedure
    E.g. “f(p,n) accesses [p, p+n-1]”
  Same Approach
    Set up target bounds of accessed regions
    Build a constraint system to compute these bounds
  Constraint System
    Accessed regions for a procedure must include:
      1. Regions accessed by statements in the procedure
      2. Regions accessed by invoked procedures
70
Region Analysis in Example
  void f(char *p, int n) {
    if (n > CUTOFF) {
      spawn f(p, n/2);
      spawn f(p+n/2, n/2);
      sync;
    } else {
      int i = 0;
      while (i < n) { *(p+i) += 1; i++; }
    }
  }
  Base case loop accesses [p, p+n-1]
71
Region Analysis in Example (code as on slide 70)
  f(p,n) accesses [l(p,n), u(p,n)]
  Base case loop accesses [p, p+n-1]
72
Region Analysis in Example (code as on slide 70)
  f(p,n) accesses [l(p,n), u(p,n)]
  spawn f(p, n/2) accesses [l(p,n/2), u(p,n/2)]
  spawn f(p+n/2, n/2) accesses [l(p+n/2,n/2), u(p+n/2,n/2)]
  Base case loop accesses [p, p+n-1]
73
Derive Constraint System
  Region constraints
    $[l(p,n/2),\ u(p,n/2)] \subseteq [l(p,n),\ u(p,n)]$
    $[l(p+n/2,n/2),\ u(p+n/2,n/2)] \subseteq [l(p,n),\ u(p,n)]$
    $[p,\ p+n-1] \subseteq [l(p,n),\ u(p,n)]$
  Reduce to inequalities between lower/upper bounds
  Further reduce to a linear program and solve: l(p,n) = p, u(p,n) = p+n-1
  Access region for f(p,n): [p, p+n-1]
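One way to convince oneself that l(p,n) = p and u(p,n) = p+n-1 solve the region constraints is to plug concrete numbers in. The sketch below models p as an integer offset and checks the three containments for a given p and n, using C integer division for n/2; the names Interval, f_region, and region_constraints_hold are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { long lo, hi; } Interval;

/* Candidate solution from the slide: f(p,n) accesses [p, p+n-1]. */
static Interval f_region(long p, long n) {
    Interval r = {p, p + n - 1};
    return r;
}

/* True if a is contained in b; an empty a (lo > hi) is contained in anything. */
static bool subset(Interval a, Interval b) {
    return a.lo > a.hi || (b.lo <= a.lo && a.hi <= b.hi);
}

/* Checks the three region constraints for f at concrete p and n. */
bool region_constraints_hold(long p, long n) {
    Interval whole = f_region(p, n);
    Interval base = {p, p + n - 1}; /* region of the base-case loop */
    assert(n >= 0);
    return subset(f_region(p, n / 2), whole)
        && subset(f_region(p + n / 2, n / 2), whole)
        && subset(base, whole);
}
```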
74
Data Race Freedom
  Stages: Pointer Analysis, Bounds Analysis, Region Analysis, Data Race Freedom
  Current stage: Data Race Freedom (Check that Parallel Threads Are Independent)
75
Data Race Freedom
  Dependence testing of two statements
    Do accessed regions intersect?
    Based on comparing upper and lower bounds of accessed regions
  Absence of data races
    Check that all the statements that execute in parallel are independent
76
Data Race Freedom
  void f(char *p, int n) {
    if (n > CUTOFF) {
      spawn f(p, n/2);
      spawn f(p+n/2, n/2);
      sync;
    } else {
      int i = 0;
      while (i < n) { *(p+i) += 1; i++; }
    }
  }
  f(p,n) accesses [p, p+n-1]
77
Data Race Freedom (code as on slide 76)
  f(p,n) accesses [p, p+n-1]
  spawn f(p, n/2) accesses [p, p+n/2-1]
  spawn f(p+n/2, n/2) accesses [p+n/2, p+n-1]
78
Data Race Freedom (code as on slide 76)
  No data races!
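The independence test for the two spawned calls comes down to comparing interval bounds. The sketch below models p as an integer offset and uses the regions stated on the slides, [p, p+n/2-1] and [p+n/2, p+n-1]; the names Interval, intersects, and spawned_calls_independent are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { long lo, hi; } Interval;

/* Two closed intervals intersect unless one is empty or ends before the other starts. */
static bool intersects(Interval a, Interval b) {
    if (a.lo > a.hi || b.lo > b.hi) return false; /* an empty region touches nothing */
    return a.lo <= b.hi && b.lo <= a.hi;
}

/* Regions of the two parallel calls in f(p,n), as given on the slides. */
bool spawned_calls_independent(long p, long n) {
    Interval first = {p, p + n / 2 - 1};
    Interval second = {p + n / 2, p + n - 1};
    assert(n >= 0);
    return !intersects(first, second);
}
```

Because the first region ends at p+n/2-1 and the second starts at p+n/2, the check succeeds for every p and n.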
79
Fundamental Property of the Analysis: No Fixed Point Computations
  The analysis does not use fixed-point computations:
    The problem is reduced to a linear program
    The solution to the linear program directly gives the symbolic lower and upper bounds
  Fixed-point approaches:
    Termination is not guaranteed: analysis domain of symbolic expressions has infinite ascending chains
    Use imprecise techniques to ensure termination:
      Artificially truncate number of iterations
      Use imprecise widening operators
80
Experience
  Set of benchmark programs
  Two versions of each benchmark
    Sequential version written in C
    Multithreaded version written in Cilk
  Experiments:
    1. Data Race Freedom for the multithreaded versions
    2. Array Bounds Violation Detection for both sequential and multithreaded versions
    3. Automatic Parallelization for the sequential version
81
Data Races and Array Bounds Violations
  Columns: Application; Data races (multithreaded); Array Bounds Violations (multithreaded); Array Bounds Violations (sequential)
  QuickSort: NO
  MergeSort: NO
  BlockMul: NO
  NoTempMul: NO
  LU: NO
  Knapsack: YES, NO
  Heat: NO
82
Parallel Performance
  [Figure: parallel performance plots for Quicksort, Mergesort, Heat, BlockMul, NoTempMul, and LU]
83
Summary
  Sophisticated Memory Disambiguation Analysis
    Points-to Information
    Accessed Region Information
  Automatic
  Interprocedural
  Handles Multithreaded Programs
  Other Uses Besides Data Race Freedom
    Bitwidth Analysis
    Array-Bounds Check Elimination
    Buffer Overrun Detection
84
Bigger Picture Analysis has a very specific goal Developer understands and cares about results Points-to and region information is (implicitly) part of the interface of each procedure Developer understands interfaces Developer has expectations about analysis results Analysis can identify serious programming errors Developer expectations are implicit
85
Idea Enhance procedure interface to make points-to and region information explicit Points-to language Points-to graphs at entry and exit Effect on points-to relationships Region language Symbolic specification of accessed regions Developer provides information Analysis verifies that it is correct, and that correctness implies data race freedom
86
Points-to Language
  f(p, q, n) {
    context {
      entry: p->_a, q->_b;
      exit: p->_a, _a->_c, q->_b, _b->_d;
    }
    context {
      entry: p->_a, q->_a;
      exit: p->_a, _a->_c, q->_a;
    }
  }
87
Points-to Language (declaration as on slide 86)
  Contexts for f(p,q,n)
  [Figure: entry and exit points-to graphs over p and q for each of the two contexts]
88
Verifying Points-to Information
  One (flow sensitive) analysis per context
  [Figure: procedure f(p,q,n) { ... } beside the entry and exit points-to graphs of its two contexts]
89
Verifying Points-to Information
  Start with entry points-to graph (first context)
90
Verifying Points-to Information
  Analyze procedure
92
Verifying Points-to Information
  Check result against exit points-to graph
93
Verifying Points-to Information
  Similarly for other context
94
Verifying Points-to Information
  Start with entry points-to graph (second context)
95
Verifying Points-to Information
  Analyze procedure
96
Verifying Points-to Information
  Check result against exit points-to graph
97
Analysis of Call Statements
  g(r,n) { ... f(r,s,n); ... }
98
Analysis of Call Statements
  Analysis produces points-to graph before call
  [Figure: points-to graph over r and s before the call f(r,s,n)]
99
Analysis of Call Statements
  Retrieve declared contexts from callee
  [Figure: the entry and exit graphs of the contexts for f(p,q,n)]
100
Analysis of Call Statements
  Find context with matching entry graph
102
Analysis of Call Statements
  Apply corresponding exit points-to graph
  [Figure: points-to graph over r and s after the call]
103
Analysis of Call Statements
  Continue analysis after call
104
Analysis of Call Statements
  Result
    Points-to declarations separate analysis of multiple procedures
    Transformed global, whole-program analysis into local analysis that operates on each procedure independently
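The call-site steps (retrieve the declared contexts, find the one whose entry graph matches, apply its exit graph) can be sketched in a few lines. The sketch below is heavily simplified and is an assumption for illustration: each context of f(p,q,n) is reduced to the single fact that distinguishes the two entry graphs, namely whether p and q point to the same allocation block, and exit graphs are kept as text. The names Context, f_contexts, and match_context are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One declared context of f(p,q,n), in reduced form. */
typedef struct {
    bool entry_p_q_alias;   /* entry graph: do p and q point to the same block? */
    const char *exit_graph; /* exit points-to edges, kept as text in this sketch */
} Context;

static const Context f_contexts[] = {
    { false, "p->_a, _a->_c, q->_b, _b->_d" },
    { true,  "p->_a, _a->_c, q->_a" },
};

/* block_r and block_s identify the allocation blocks that the actual
   arguments point to before the call. Returns the declared context whose
   entry graph matches, or NULL if none does. */
const Context *match_context(int block_r, int block_s) {
    bool alias = (block_r == block_s);
    for (size_t i = 0; i < sizeof f_contexts / sizeof f_contexts[0]; i++) {
        if (f_contexts[i].entry_p_q_alias == alias)
            return &f_contexts[i];
    }
    return NULL;
}
```

The caller's analysis would then continue from the exit graph of the returned context, without looking at the body of f.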
105
Experience
  Implemented points-to and region languages
  Integrated with points-to and region analyses
  Divide and Conquer Benchmarks
    Sorting Programs: Quicksort (QS), Mergesort (MS)
    Dense Matrix Computations: Matrix multiply (MM), LU decomposition (LU)
    Scientific Computation: Heat (H)
  We added points-to and region information
106
Programming Overhead
  [Figure: bar chart, on a scale from 0.00 to 1.00, of the proportion of C Code, Region Declarations, and Points-to Declarations for QS, MS, MM, LU, and H]
107
Evaluation How difficult is it to provide declarations? Not that difficult. Have to write comparatively little code Must know information anyway How much benefit does analysis obtain? Substantial benefit. Simpler analysis software (no complex interprocedural analysis) More scalable, precise analysis
108
Evaluation Software Engineering Benefits of Points-to and Region Declarations Improved communication between developer and analysis Analysis reflects developer’s expectations Enhanced code reliability Enhanced interface information Analyze incomplete programs Programs that use libraries Programs under development
109
Evaluation Drawbacks of Points-to and Region Declarations Have to learn new language Have to integrate into development process Legacy software issues (programmer may not know points-to and region information)
110
Steps to Design Conformance Verify that Program Correctly Implements Key Design Properties as Expressed by Developer or Designer Role Verification Design Conformance for Object Models (joint with Daniel Jackson, MIT LCS) Context: Air Traffic Control Software MIT LCS (Daniel Jackson, Martin Rinard) MIT Aero-Astro Department (R. John Hansman) NASA Ames Research Center (Michelle Eshow) Kansas State University CS Dept. (David Schmidt) CTAS (Center/TRACON Automation System)
111
Role Verification Objects play different roles during their lifetime in computation Parked Aircraft, Taxiing Aircraft, Cleared for Takeoff Aircraft, In Flight Aircraft Roles reflect constraints on activities of object System actions must respect role constraints Parked Aircraft can’t take off Action violations indicate system confusion Goals Obtain role information from developer Check that program uses roles correctly
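The idea that system actions must respect role constraints can be illustrated with a small role table. The sketch below is a run-time check, whereas the talk is about verifying such constraints statically; the particular transitions (parked to taxiing to cleared to in flight) and all names are assumptions made for illustration, since the slides only state that a parked aircraft cannot take off.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { PARKED, TAXIING, CLEARED_FOR_TAKEOFF, IN_FLIGHT } Role;
typedef enum { START_TAXI, CLEAR, TAKE_OFF } Action;

/* Applies the action if the current role permits it and updates *role.
   Returns false, leaving *role unchanged, on a role violation. */
bool apply_action(Role *role, Action a) {
    assert(role != NULL);
    switch (a) {
    case START_TAXI:
        if (*role != PARKED) return false;
        *role = TAXIING;
        return true;
    case CLEAR:
        if (*role != TAXIING) return false;
        *role = CLEARED_FOR_TAKEOFF;
        return true;
    case TAKE_OFF:
        if (*role != CLEARED_FOR_TAKEOFF) return false;
        *role = IN_FLIGHT;
        return true;
    }
    return false;
}
```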
112
Role Classification Two General Kinds of Classification Content-based (predicate on object fields determines role) Relative (points-to relationships determine role) Role Classification is Application Dependent Aircraft Flying Aircraft Parked Aircraft Taxiing Aircraft Cleared Aircraft Class Roles
113
Standard View of Object Fields Outgoing References List of Meter Fixes Sequence Of Points String Runway Object Gate Object Incoming References Flight Plan Trajectory Flight Name Runway Gate
114
Relative Role Classification Points-to relationships define roles Specify sources of incoming edges Field of an object playing a given role Global or local variable Specify target of outgoing edges Specify available fields in each role
115
Example Roles Gate Object Aircraft Parked Aircraft Flight Plan Trajectory Flight Name Runway Gate
116
Trajectory Gate Example Roles Runway Object Aircraft Cleared for Takeoff Aircraft Flight Plan Runway Flight Name List of Meter Fixes String
117
Role Verification Analysis Obtains Role Definitions Method Information Roles of parameters and globals on entry Role changes that method performs Role of return value Intraprocedural Analysis Simulates potential executions of method Precise abstraction of heap Use role information for invoked methods Verify correctness of role information
118
Benefits of Roles Software Engineering Benefits Safety checks that take application semantics into account Enhanced implementation transparency Transformations Enabled By Precise Referencing Behavior Safe real-time memory management Parallelization and race detection for Programs with linked data structures Optimized Atomic Transactions
119
Key Issue: Obtaining Role Information Range of Developer and Designer Involvement Some Involvement Reasonable and Necessary: Roles Reflect Application-Specific Properties Primary Focus: Role Definitions Determine analysis distinctions Relevance of extracted information Secondary Focus: Method Specifications Developer specifies roles of parameters Analysis extracts role changes
120
Design Conformance Software Development Activities Requirements Design Implementation Design is Partial Focus on Important Aspects Omit Many Low-Level Details Design and Implementation are Disconnected No guarantee that code conforms to design
121
Goal of Design Conformance Establish and mechanically check conformance Use specific design formalism (object models) Boxes (objects) and Arrows (relations between objects) Aircraft Flying Aircraft Parked Aircraft Taxiing Aircraft Cleared Aircraft Meter Fix Flight Plan ++
122
Key Issue Establishing correspondence between object model and implementation Object models usually at a higher level of abstraction Many relations in object model realized as group of objects and references Object model may entirely omit some objects or references Enables designer to focus on important aspects But complicates path to conformance analysis
123
Aircraft Flying Aircraft Parked Aircraft Taxiing Aircraft Cleared Aircraft Meter Fix Flight Plan ++ Gate Object Aircraft Flight Plan Trajectory Flight Name Runway Gate Trajectory Gate Runway Object Aircraft Flight Plan Runway Flight Name List of Meter Fixes String Aircraft Meter Fix Flight Plan + Abstract Object Model Concrete Object Model Intermediate Object Model Roles
124
Concretization Specifications Maps Between Object Models Enables Designer/Developer to Establish Correspondence Between Object Models Specify how Object Model is Realized in Code Foundation for design conformance analysis Guides implementation of object model Implementation patterns for object models
125
Design Conformance Benefits Higher Confidence in Software Promote clean implementation of design Guarantee important design properties Design becomes useful throughout entire development cycle Updated as implementation changes Reliable source of information Enables more precise, relevant analysis
126
Related Work Pointer Analysis Landi, Ryder, Zhang – PLDI93 Emami, Ghiya, Hendren – PLDI94 Wilson, Lam – PLDI96 Rugina, Rinard – PLDI99 Rountev, Ryder – CC01 Salcianu, Rinard – PPoPP01 Region Analysis Triolet, Irigoin, Feautrier- PLDI86 Havlak, Kennedy – IEEE TPDS91 Rugina, Rinard – PLDI00 Pointer Specifications Hendren, Hummel, Nicolau – PLDI92 Guyer, Lin – LCPC00
127
Related Work Shape Analysis [CWZ90,GH96,FL97,SRW99,MS01] Extended Type Systems FX/87 [GJLS87] Dependent Types [XF99] Program Verification ESC [DLNS98] PVS [ORRSS96] Implementations of Object Models [HBR00]
128
Conclusion Developer and Designer Interact with Analysis Benefits More precise, relevant analysis Verify key safety and design properties Enhance utility of design Enable powerful transformations Key Issue: Determining appropriate abstractions to leverage Access regions, roles, object models Abstractions Share Several Features Identify important properties of data Relate properties of data to behavior of computation