1
Design-Driven Compilation
Radu Rugina and Martin Rinard
Laboratory for Computer Science, Massachusetts Institute of Technology
2
Overview + Computation
Goal: Parallelization
Analysis Problems: Points-to Analysis, Region Analysis
Two Potential Solutions: Fully Automatic, Design Driven
Evaluation
3
Example - Divide and Conquer Sort
7 4 6 1 3 5 8 2
7
Example - Divide and Conquer Sort
Input:    7 4 6 1 3 5 8 2
Divide:   7 4 | 6 1 | 3 5 | 8 2
Conquer:  4 7 | 1 6 | 3 5 | 2 8
Combine:  1 4 6 7 | 2 3 5 8
Result:   1 2 3 4 5 6 7 8
10
Divide and Conquer Algorithms
Lots of Recursively Generated Concurrency
Recursively Solve Subproblems in Parallel
Combine Results in Parallel
11
“Sort n Items in d, Using t as Temporary Storage”
void sort(int *d, int *t, int n) {
  if (n > CUTOFF) {
    sort(d,t,n/4);
    sort(d+n/4,t+n/4,n/4);
    sort(d+2*(n/4),t+2*(n/4),n/4);
    sort(d+3*(n/4),t+3*(n/4),n-3*(n/4));
    merge(d,d+n/4,d+n/2,t);
    merge(d+n/2,d+3*(n/4),d+n,t+n/2);
    merge(t,t+n/2,t+n,d);
  } else insertionSort(d,d+n);
}
12
“Recursively Sort Four Quarters of d”
Divide array into subarrays and recursively sort subarrays
  sort(d,t,n/4);
  sort(d+n/4,t+n/4,n/4);
  sort(d+2*(n/4),t+2*(n/4),n/4);
  sort(d+3*(n/4),t+3*(n/4),n-3*(n/4));
13
“Recursively Sort Four Quarters of d”
Subproblems Identified Using Pointers Into Middle of Array
  d          -> 7 4
  d+n/4      -> 6 1
  d+n/2      -> 3 5
  d+3*(n/4)  -> 8 2
15
“Recursively Sort Four Quarters of d”
Sorted Results Written Back Into Input Array
  d          -> 4 7
  d+n/4      -> 1 6
  d+n/2      -> 3 5
  d+3*(n/4)  -> 2 8
16
“Merge Sorted Quarters of d Into Halves of t”
  merge(d,d+n/4,d+n/2,t);
  merge(d+n/2,d+3*(n/4),d+n,t+n/2);
d: 4 7 | 1 6 | 3 5 | 2 8
t: 1 4 6 7 (at t) | 2 3 5 8 (at t+n/2)
17
“Merge Sorted Halves of t Back Into d”
  merge(t,t+n/2,t+n,d);
t: 1 4 6 7 (at t) | 2 3 5 8 (at t+n/2)
d: 1 2 3 4 5 6 7 8
18
“Use a Simple Sort for Small Problem Sizes”
  else insertionSort(d,d+n);
Range d to d+n: 4 7 6 1 5 3 8 2
19
“Use a Simple Sort for Small Problem Sizes”
  else insertionSort(d,d+n);
Range d to d+n: 4 7 1 6 5 3 8 2
20
Parallel Sort
void sort(int *d, int *t, int n) {
  if (n > CUTOFF) {
    spawn sort(d,t,n/4);
    spawn sort(d+n/4,t+n/4,n/4);
    spawn sort(d+2*(n/4),t+2*(n/4),n/4);
    spawn sort(d+3*(n/4),t+3*(n/4),n-3*(n/4));
    sync;
    spawn merge(d,d+n/4,d+n/2,t);
    spawn merge(d+n/2,d+3*(n/4),d+n,t+n/2);
    sync;
    merge(t,t+n/2,t+n,d);
  } else insertionSort(d,d+n);
}
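The structure of the parallel sort can be tried out in plain C. The following is a minimal sequential sketch, not the authors' implementation: the `spawn` and `sync` points are kept only as comments, `CUTOFF` is set to an assumed value of 4, and the quarter boundaries are computed as q, 2q, 3q with q = n/4 so that the sketch also works when n is not a multiple of 4 (an adjustment made here, not taken from the slides). The inner loop of `insertionSort` is rewritten so that no pointer is formed before the start of the array.

```c
#include <stddef.h>

#define CUTOFF 4  /* assumed threshold; the slides do not give a value */

/* Sorts [l, h) in place. */
static void insertionSort(int *l, int *h) {
    if (h - l < 2) return;
    for (int *p = l + 1; p < h; p++) {
        int k = *p;
        int *q = p;
        while (q > l && k < *(q - 1)) {
            *q = *(q - 1);
            q--;
        }
        *q = k;
    }
}

/* Merges sorted runs [l1, m) and [m, h2) into the buffer starting at d. */
static void merge(int *l1, int *m, int *h2, int *d) {
    int *h1 = m;
    int *l2 = m;
    while (l1 < h1 && l2 < h2) {
        if (*l1 < *l2) *d++ = *l1++;
        else           *d++ = *l2++;
    }
    while (l1 < h1) *d++ = *l1++;
    while (l2 < h2) *d++ = *l2++;
}

/* Sorts n items in d, using t as temporary storage. */
void sort(int *d, int *t, int n) {
    if (n > CUTOFF) {
        int q = n / 4;  /* quarter length; the last quarter takes the remainder */
        /* spawn */ sort(d,         t,         q);
        /* spawn */ sort(d + q,     t + q,     q);
        /* spawn */ sort(d + 2 * q, t + 2 * q, q);
        /* spawn */ sort(d + 3 * q, t + 3 * q, n - 3 * q);
        /* sync: all four quarters must be sorted before merging */
        /* spawn */ merge(d,         d + q,     d + 2 * q, t);
        /* spawn */ merge(d + 2 * q, d + 3 * q, d + n,     t + 2 * q);
        /* sync: both halves of t must be complete before the final merge */
        merge(t, t + 2 * q, t + n, d);
    } else {
        insertionSort(d, d + n);
    }
}
```

Calling `sort(a, tmp, n)` with `tmp` a scratch buffer of at least n elements leaves the sorted result in `a`. The calls marked `spawn` touch pairwise disjoint parts of `d` and `t`, which is exactly the property the analysis has to establish before they can run concurrently.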
21
What Do You Need To Know To Exploit This Form of Parallelism?
Points-to Information (data blocks that pointers point to)
Region Information (accessed regions within data blocks)
22
Information Needed To Exploit Parallelism
d and t point to different memory blocks
Calls to sort access disjoint parts of d and t
Together, calls access [d,d+n-1] and [t,t+n-1]
  sort(d,t,n/4);
  sort(d+n/4,t+n/4,n/4);
  sort(d+n/2,t+n/2,n/4);
  sort(d+3*(n/4),t+3*(n/4),n-3*(n/4));
[Figure: for each call, the accessed part of the blocks d..d+n-1 and t..t+n-1]
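The disjointness and coverage claims can be checked for a concrete n with a few lines of C. This is only an illustrative sketch: the `Region` type and the helper names are hypothetical, regions are closed intervals of element offsets from d, and the third offset is taken as 2q with q = n/4 (the slide writes d+n/2, which coincides when n is a multiple of 4). The analysis in the talk reasons symbolically for all n; this sketch only tests one value at a time.

```c
#include <stdbool.h>

/* A closed interval of element offsets [lo, hi] within one memory block. */
typedef struct { int lo, hi; } Region;

/* Fills r[0..3] with the parts of d touched by the four recursive calls
   sort(d + off, t + off, len), using offsets 0, q, 2q, 3q with q = n/4. */
void sort_call_regions(int n, Region r[4]) {
    int q = n / 4;
    int off[4] = {0, q, 2 * q, 3 * q};
    int len[4] = {q, q, q, n - 3 * q};
    for (int i = 0; i < 4; i++) {
        r[i].lo = off[i];
        r[i].hi = off[i] + len[i] - 1;
    }
}

/* True if no two intervals share an element. */
bool pairwise_disjoint(const Region *r, int count) {
    for (int i = 0; i < count; i++)
        for (int j = i + 1; j < count; j++)
            if (r[i].lo <= r[j].hi && r[j].lo <= r[i].hi)
                return false;
    return true;
}

/* True if consecutive intervals tile [0, n-1] exactly. */
bool covers_block(const Region *r, int count, int n) {
    int next = 0;
    for (int i = 0; i < count; i++) {
        if (r[i].lo != next) return false;
        next = r[i].hi + 1;
    }
    return next == n;
}
```

For n = 10 the four regions come out as [0,1], [2,3], [4,5] and [6,9]. The same regions apply to t, since every call offsets d and t by the same amount.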
23
Information Needed To Exploit Parallelism
d and t point to different memory blocks
First two calls to merge access disjoint parts of d and t
Together, calls access [d,d+n-1] and [t,t+n-1]
  merge(d,d+n/4,d+n/2,t);
  merge(d+n/2,d+3*(n/4),d+n,t+n/2);
  merge(t,t+n/2,t+n,d);
[Figure: for each call, the accessed part of the blocks d..d+n-1 and t..t+n-1]
24
Information Needed To Exploit Parallelism
Calls to insertionSort access [d,d+n-1]
  insertionSort(d,d+n);
25
What Do You Need To Know To Exploit This Form of Parallelism?
Points-to Information (d and t point to different data blocks)
Symbolic Region Information (accessed regions within d and t blocks)
27
How Hard Is It To Figure These Things Out?
Challenging
28
How Hard Is It To Figure These Things Out?
void insertionSort(int *l, int *h) {
  int *p, *q, k;
  for (p = l+1; p < h; p++) {
    for (k = *p, q = p-1; l <= q && k < *q; q--)
      *(q+1) = *q;
    *(q+1) = k;
  }
}
Not immediately obvious that insertionSort(l,h) accesses [l,h-1]
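One way to build confidence in the [l,h-1] claim is to instrument the routine and record which offsets it touches. The sketch below is a test aid under stated assumptions, not the region analysis from the talk: `Access`, `touch` and `insertionSortTraced` are hypothetical names, and the loops are rewritten with integer offsets so that every read and write of `l[...]` can be logged.

```c
#include <stddef.h>

/* Lowest and highest element offsets (relative to l) touched so far. */
typedef struct { ptrdiff_t min, max; } Access;

static void touch(Access *a, ptrdiff_t off) {
    if (off < a->min) a->min = off;
    if (off > a->max) a->max = off;
}

/* Same algorithm as insertionSort on the slide, expressed with offsets.
   Sorts [l, h) in place and records every accessed offset in *a. */
void insertionSortTraced(int *l, int *h, Access *a) {
    ptrdiff_t n = h - l;
    a->min = n;   /* sentinels meaning "nothing accessed yet" */
    a->max = -1;
    for (ptrdiff_t p = 1; p < n; p++) {
        touch(a, p);
        int k = l[p];
        ptrdiff_t q = p - 1;
        while (q >= 0) {
            touch(a, q);
            if (!(k < l[q])) break;
            touch(a, q + 1);
            l[q + 1] = l[q];
            q--;
        }
        touch(a, q + 1);
        l[q + 1] = k;
    }
}
```

After a call on a range of two or more elements, `a.min >= 0` and `a.max <= (h - l) - 1` should hold. A run like this only checks one input; the point of the analysis is to prove the bound for every input.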
29
How Hard Is It To Figure These Things Out?
void merge(int *l1, int *m, int *h2, int *d) {
  int *h1 = m;
  int *l2 = m;
  while ((l1 < h1) && (l2 < h2))
    if (*l1 < *l2) *d++ = *l1++;
    else *d++ = *l2++;
  while (l1 < h1) *d++ = *l1++;
  while (l2 < h2) *d++ = *l2++;
}
Not immediately obvious that merge(l,m,h,d) accesses [l,h-1] and [d,d+(h-l)-1]
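The output region [d,d+(h-l)-1] can be observed directly by counting how many elements merge writes. The variant below, with the hypothetical name `mergeCounted`, is the slide's merge changed only to return that count; it is a sketch for experimentation rather than part of the original code.

```c
#include <stddef.h>

/* merge from the slide, changed only to return the number of elements
   written to d, so a caller can compare it against h2 - l1. */
ptrdiff_t mergeCounted(int *l1, int *m, int *h2, int *d) {
    int *start = d;
    int *h1 = m;   /* first sorted run is  [l1, m)  */
    int *l2 = m;   /* second sorted run is [m, h2)  */
    while (l1 < h1 && l2 < h2) {
        if (*l1 < *l2) *d++ = *l1++;
        else           *d++ = *l2++;
    }
    while (l1 < h1) *d++ = *l1++;
    while (l2 < h2) *d++ = *l2++;
    return d - start;
}
```

Each loop iteration copies exactly one input element and advances d by one, so the return value equals the input length h2 - l1, and the last written location is d+(h2-l1)-1.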
30
Issues
Heavy Use of Pointers
  Pointers into Middle of Arrays
  Pointer Arithmetic
  Pointer Comparison
Multiple Procedures
  sort(int *d, int *t, int n)
  insertionSort(int *l, int *h)
  merge(int *l, int *m, int *h, int *t)
Recursion
31
Fully Automatic Solution
Whole-program pointer analysis
  Context-sensitive, flow-sensitive
  Rugina and Rinard, PLDI 1999
Whole-program region analysis
  Symbolic constraint systems
  Solve by reducing to linear programs
  Rugina and Rinard, PLDI 2000
32
Key Complication
Need for sophisticated interprocedural analyses
Pointer analysis
  Propagate analysis results through call graph
  Fixed-point algorithm for recursive programs
Region analysis
  Formulation avoids fixed-point algorithms
  Single constraint system for each strongly connected component
Need to have whole program in analyzable form
33
Bigger Picture
Points-to and region information is (implicitly) part of the interface of each procedure
Programmer understands procedure interfaces
Programmer knows
  Points-to relationships on entry
  Effect of procedure on points-to relationships
  Regions of memory blocks that procedure accesses
34
Idea
Enhance procedure interface to make points-to and region information explicit
Points-to language
  Points-to graphs at entry and exit
  Effect on points-to relationships
Region language
  Symbolic specification of accessed regions
Programmer provides information
Analysis verifies that it is correct
35
Points-to Language
f(p, q, n) {
  context {
    entry: p->_a, q->_b;
    exit: p->_a, _a->_c, q->_b, _b->_d;
  }
  entry: p->_a, q->_a;
  q->_a;
36
Points-to Language
[Figure: contexts for f(p,q,n), drawn as entry and exit points-to graphs over p and q]
37
Verifying Points-to Information
f(p,q,n) { . }
One (flow-sensitive) analysis per context
  Start with entry points-to graph
  Analyze procedure
  Check result against exit points-to graph
Similarly for other context
[Figure: contexts for f(p,q,n), each with an entry and an exit points-to graph over p and q]
46
Analysis of Call Statements
g(r,n) {
  .
  f(r,s,n);
}
Analysis produces points-to graph before call
Retrieve declared contexts from callee
Find context with matching entry graph
Apply corresponding exit points-to graph
Continue analysis after call
[Figure: points-to graph over r and s at the call, next to the entry and exit graphs of the contexts for f(p,q,n)]
53
Analysis of Call Statements
Result
  Points-to declarations separate analysis of multiple procedures
  Transformed global, whole-program analysis into local analysis that operates on each procedure independently
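The two uses of a declared context (verifying the callee body, and looking up a context at a call site) can be pictured with a very small data structure. The sketch below is hypothetical throughout: a points-to graph is reduced to a bit set over a fixed list of edges, the caller's graph is assumed to be already renamed to the callee's formals, and the exit check is assumed to be "computed edges are a subset of declared edges". The slides do not specify any of these details.

```c
#include <stddef.h>

/* Each bit stands for one points-to edge over f's formals and named blocks. */
enum {
    P_A = 1u << 0,  /* p  -> _a */
    Q_A = 1u << 1,  /* q  -> _a */
    Q_B = 1u << 2,  /* q  -> _b */
    A_C = 1u << 3,  /* _a -> _c */
    B_D = 1u << 4   /* _b -> _d */
};

typedef struct {
    unsigned entry;  /* declared entry points-to graph */
    unsigned exit;   /* declared exit points-to graph  */
} Context;

/* Verification step: the graph computed at the end of the procedure body
   must not contain any edge that the declared exit graph lacks. */
int exit_graph_ok(const Context *c, unsigned computed_exit) {
    return (computed_exit & ~c->exit) == 0;
}

/* Call-site step: returns the index of the context whose entry graph
   matches the graph at the call, or -1 if no declared context applies. */
int find_context(const Context *ctx, size_t count, unsigned at_call) {
    for (size_t i = 0; i < count; i++)
        if (ctx[i].entry == at_call)
            return (int)i;
    return -1;
}
```

With a first context `{ P_A | Q_B, P_A | A_C | Q_B | B_D }`, taken from the points-to language slide, a call where p and q point to different blocks selects index 0, and the caller continues with that context's exit graph. Because the lookup needs only the declarations, the caller can be analyzed without looking inside f.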
54
Region Language
h(p,n) {
  reads [p,p+n-1];
  writes [p,p+n-1];
}
56
Verifying Region Information
Two region containment requirements
  Direct Accesses: Locations directly accessed by procedure must be contained in declared regions
  Callees: Regions accessed by callees must be contained in declared regions of caller
57
Verifying Region Information
h(p,n) {
  if (n < k)
    for (i=0;i<n;i++) p[i] = p[i]-1;
  else {
    h(p,n/2);
    h(p+n/2,n-n/2);
  }
}
Declared regions for h(p,n): reads [p,p+n-1], writes [p,p+n-1]
Extract directly accessed regions: reads [p,p+n-1], writes [p,p+n-1]
Check inclusion within declared regions
Check inclusion for accesses of callees
  Start with call to h(p,n/2)
  Extract and translate regions for h(p,n/2): reads and writes [p,p+n/2-1]
  Check inclusion in declared regions of caller
  Similarly for call h(p+n/2,n-n/2): translated regions are reads and writes [p+n/2,p+n-1]
  Check inclusion in declared regions of caller
66
Verifying Region Information
Result
  Region declarations separate analysis of multiple procedures
  Transformed global, whole-program analysis into local analysis that operates on each procedure independently
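The two containment requirements for h(p,n) can be replayed for concrete values of n. The sketch below uses hypothetical helper names and represents each region as a closed interval of offsets from p; translating a callee's declared region means substituting the actual arguments into [p,p+n-1]. It checks one n at a time, whereas the verifier in the talk works on the symbolic bounds.

```c
#include <stdbool.h>

typedef struct { int lo, hi; } Region;  /* closed interval of offsets from p */

/* Declared reads/writes region of h, instantiated for a call h(p + off, n). */
static Region declared_region(int off, int n) {
    Region r = { off, off + n - 1 };
    return r;
}

/* True if inner is empty or lies entirely within outer. */
static bool contained(Region inner, Region outer) {
    return inner.lo > inner.hi
        || (outer.lo <= inner.lo && inner.hi <= outer.hi);
}

/* Checks both containment requirements for h(p, n) at one concrete n. */
bool verify_h(int n) {
    Region declared = declared_region(0, n);
    Region direct   = { 0, n - 1 };                        /* loop touches p[0..n-1] */
    Region call1    = declared_region(0, n / 2);           /* h(p, n/2)              */
    Region call2    = declared_region(n / 2, n - n / 2);   /* h(p+n/2, n-n/2)        */
    return contained(direct, declared)
        && contained(call1, declared)
        && contained(call2, declared);
}
```

The second call is the interesting one: its region ends at n/2 + (n - n/2) - 1 = n - 1, so it stays inside the declared region for every n, including odd values where n/2 rounds down.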
68
Experience
Implemented points-to and region languages
Integrated with points-to and region analyses
Obtained Divide and Conquer Benchmarks
  Sorting Programs: Quicksort (QS), Mergesort (MS)
  Dense Matrix Computations: Matrix multiply (MM), LU decomposition (LU)
  Scientific Computation: Heat (H)
Written in C
We added points-to and region information
69
Results
With points-to and region information, could parallelize all benchmarks
Points-to information speeds up points-to analysis significantly (up to a factor of two)
Region information has no significant effect on how fast region analysis runs
70
Programming Overhead
[Figure: bar chart for QS, MS, MM, LU, H showing the proportion (0.00 to 1.00) of C Code, Region Declarations, and Points-to Declarations]
71
Evaluation
How difficult is it to provide declarations?
  Not that difficult.
  Have to write comparatively little code
  Must know information anyway
How much benefit does compiler obtain?
  Substantial benefit.
  Simpler analysis software (no complex interprocedural analysis)
  More scalable, precise analysis
72
Evaluation
Software Engineering Benefits of Points-to and Region Declarations
  Analysis reflects programmer's intention
  Enhanced code reliability
  Enhanced interface information
  Analyze incomplete programs
    Programs that use libraries
    Programs under development
73
Evaluation
Drawbacks of Points-to and Region Declarations
  Have to learn new language
  Have to integrate into development process
  Legacy software issues (programmer may not know points-to and region information)
74
Related Work
Extended Type Systems
  FX/87 [GJLS87]
  Dependent Types [XF99]
Issue: where to put extended type information?
  Integrated with rest of program
  Separated from rest of program
Program Verification
  ESC [DLNS98]
  PVS [ORRSS96]
75
Related Work
Pointer Analysis
  Landi, Ryder, Zhang (PLDI93)
  Emami, Ghiya, Hendren (PLDI94)
  Wilson, Lam (PLDI96)
  Rugina, Rinard (PLDI99)
  Rountev, Ryder (CC01)
Region Analysis
  Triolet, Irigoin, Feautrier (PLDI86)
  Havlak, Kennedy (IEEE TPDS91)
  Rugina, Rinard (PLDI00)
Pointer Specifications
  Hendren, Hummel, Nicolau (PLDI92)
  Guyer, Lin (LCPC00)
76
Conclusion
Basic idea: Programmer provides
  Points-to information
  Region information
Analysis
  Verifies correctness
  Uses information to enable further analyses and transformations
Lots of benefits to compiler and programmer