Points-to and Side-effect Analyses for Programs Built with Precompiled Libraries
Atanas (Nasko) Rountev, Barbara G. Ryder
Rutgers University
April 2, 2001, CC 2001
Whole-program Analysis
  - Static analysis
      - Optimizing compilers and SE tools
  - Whole-program analysis
      - Complex “global” properties
      - Aggressive optimizations
      - Helps humans
  - Large body of work
Weaknesses
  - Need the whole program
  - Incomplete programs
      - Parts not implemented yet
  - Separate compilation
      - Precompiled components, no source
  - Limited applicability
Our Target Problem
  - Goal: handle incompleteness and separation
  - Specific instance
      - C programs with precompiled libraries
      - Points-to analysis and side-effect analysis
  - Importance
      - Points-to and side-effect info is necessary
      - Techniques useful for similar problems
Overview
  - Introduction
  - Analysis for Programs with Libraries
  - Analysis of Libraries
  - Library Summary Information
  - Analysis of Library Clients
  - Empirical Results
  - Summary
Programs Built with Libraries
  - Library module
      - Write the library
      - Create library binary
  - Client module
      - Write the client
      - Compile and link with library binary
      - No library source
  - Need separate analyses
Separate Analysis of a Library Module
  - Unknown client modules
  - Simulated by an abstract client
  - Diagram: Abstract Client + Library Module -> Whole-program Analyzer -> Library Solution
Separate Analysis of a Client Module
  - Use precomputed summary information
  - Enables precise analysis
  - Diagram: Client Module + Library Summary -> Whole-program Analyzer -> Client Solution
Library Summary Information
  - Used later to analyze client modules
  - Enclosed with library binary, reusable
  - Diagram: Abstract Client + Library Module -> Whole-program Analyzer -> Solution -> Summary Generator -> Library Summary
Points-to and Side-effect Analyses
  - What variables may *p refer to?
      - Andersen’s analysis
  - What variables may be modified?
      - MOD set for each statement
      - Pointer resolution: *p=0
      - Procedure calls: backward propagation
  - Wide range of applications
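To make the two questions on this slide concrete, here is a minimal sketch of an inclusion-based (Andersen-style) points-to solver, followed by pointer resolution for a store such as `*p=0`. The tuple encoding of statements and the names `andersen` and `mod_of_store` are assumptions made for illustration. Procedure calls and the backward propagation of MOD sets are not modeled.

```python
from collections import defaultdict

def andersen(stmts):
    """Flow- and context-insensitive, inclusion-based points-to analysis.

    stmts is a list of (kind, lhs, rhs) tuples (a hypothetical encoding):
      "addr"  : lhs = &rhs
      "copy"  : lhs = rhs
      "load"  : lhs = *rhs
      "store" : *lhs = rhs
    Returns a dict mapping each variable to its points-to set.
    """
    pts = defaultdict(set)
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for kind, lhs, rhs in stmts:
            if kind == "addr":
                targets, new = [lhs], {rhs}
            elif kind == "copy":
                targets, new = [lhs], set(pts[rhs])
            elif kind == "load":
                new = set()
                for v in list(pts[rhs]):
                    new |= pts[v]
                targets = [lhs]
            elif kind == "store":
                targets, new = list(pts[lhs]), set(pts[rhs])
            else:
                raise ValueError(f"unknown statement kind: {kind}")
            for t in targets:
                if not new <= pts[t]:
                    pts[t] |= new
                    changed = True
    return pts

def mod_of_store(pts, p):
    """MOD set of a statement '*p = ...': every variable p may point to."""
    return set(pts[p])

# p = &x; q = p; r = &y; *q = r;
program = [
    ("addr", "p", "x"),
    ("copy", "q", "p"),
    ("addr", "r", "y"),
    ("store", "q", "r"),
]
solution = andersen(program)
store_mod = mod_of_store(solution, "q")
```

In this example `q` may point only to `x`, so the store `*q = r` may modify `x`, and `x` in turn may point to `y`.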
Separate Analysis of Libraries
  - Unknown clients
      - Access through library interface
  - Abstract client
      - Placeholder variable v0
      - Placeholder procedure p0
  - Apply whole-program analyses
      - Reuse of implementation
Abstract Client
    proc p0(v1,...,vn) returns ret0 {
      v0 = vi;
      v0 = &v;    (v ∈ interface)
      v0 = &v0;
      v0 = &p0;
      v0 = *v0;
      *v0 = v0;
      v0 = (*v0)(v0,...,v0);
      ret0 = v0;
    }
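The abstract client follows a fixed template, so it can be generated mechanically from the library interface. The sketch below emits the template as text. The function name `abstract_client`, the way the number of formals is chosen, and the example interface `["g", "p1"]` are assumptions for illustration, not details given on the slide.

```python
def abstract_client(interface, n_formals):
    """Emit the abstract client p0 for a library, as pseudo-C text.

    interface : names of library variables and procedures visible to clients
    n_formals : number of formals v1..vn given to p0 (an assumed parameter)
    """
    formals = [f"v{i}" for i in range(1, n_formals + 1)]
    body = [f"v0 = {f};" for f in formals]          # v0 = vi
    body += [f"v0 = &{v};" for v in interface]      # v0 = &v, v in interface
    body += ["v0 = &v0;", "v0 = &p0;", "v0 = *v0;", "*v0 = v0;"]
    args = ",".join(["v0"] * n_formals)
    body += [f"v0 = (*v0)({args});", "ret0 = v0;"]
    header = f"proc p0({','.join(formals)}) returns ret0 {{"
    return "\n".join([header] + ["  " + s for s in body] + ["}"])

# Hypothetical interface: one global and one exported procedure.
client_text = abstract_client(["g", "p1"], n_formals=2)
print(client_text)
```

Because v0 stands for every client variable, a single placeholder keeps the simulated client small while still covering any sequence of client operations through the interface.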
Library Example
    global g;
    proc p1(p,fp) {
      local s, t, q;
      t = p;
      q = &s;
      p2(q);
      *t = s;
      (*fp)(t,g);
    }
    proc p2(r) {
      *r = -(*r);
    }
  - Points-to graph (figure) over nodes fp, p0, p1, t, p, g, v0
  - s1: MOD(s1) = {v0,g}
Library Summary Information
  - Used for separate analysis of clients
      - Enclosed with library binary, reusable
      - Input to whole-program analyzers
      - Best possible precision
  - Basic summary
  - Precision-preserving optimizations
      - Reduce summary size
      - Reduce cost of separate analysis of clients
Basic Summary
  - Library source
        global g;
        proc p1(p,fp) {
          local s=1, t, q;
          t = p;
          q = &s;
          p2(q);
          *t = s;
          (*fp)(t,g);
        }
        proc p2(r) {
          *r = -(*r);
        }
  - Summary
        proc p1(p,fp)
          t=p
          q=&s
          p2(q)
          *t=s
          (*fp)(t,g)
        proc p2(r)
        Mod(p1): (t,I)    Call(p1): (p2,D) (fp,I)
        Mod(p2): (r,I)
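The Mod entries in the summary are placeholders that a client analysis can resolve once it has its own points-to solution. The sketch below assumes that the tag D marks a direct modification of the named variable and I marks an indirect one through a dereference of it. The slide does not define the tags, so this reading, the function name `resolve_mod`, and the client points-to sets are all assumptions.

```python
def resolve_mod(mod_summary, pts):
    """Expand summary Mod entries into concrete variables.

    mod_summary : list of (variable, tag) pairs, tag "D" or "I"
    pts         : dict mapping variables to points-to sets (client solution)
    """
    resolved = set()
    for var, tag in mod_summary:
        if tag == "D":
            # Assumed meaning: the variable itself is modified.
            resolved.add(var)
        elif tag == "I":
            # Assumed meaning: whatever the variable points to is modified.
            resolved |= pts.get(var, set())
        else:
            raise ValueError(f"unknown tag: {tag}")
    return resolved

# Mod(p1): (t,I) from the summary, with a hypothetical client solution
# in which t may point to client variables x and y.
mod_p1 = [("t", "I")]
client_pts = {"t": {"x", "y"}}
modified_by_p1 = resolve_mod(mod_p1, client_pts)
```

Keeping the entry symbolic in the summary is what lets the same library summary be reused across clients with different points-to solutions.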
Summary Optimizations
  - Replace sets of equivalent variables
      - Have the same points-to sets
      - Replace with a set representative
  - Remove irrelevant summary elements
      - Smaller points-to and MOD sets in clients
      - Inactive variables
      - Inaccessible variables
  - Use solution from separate library analysis
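The first optimization can be illustrated with a small sketch that groups variables by their points-to sets and rewrites copy statements in terms of a representative. It uses equality of points-to sets in a given solution as a stand-in for equivalence. The precise criterion used by the authors may be stricter, and the names `representatives`, `rewrite_copies`, and the sample points-to sets are assumptions.

```python
from collections import defaultdict

def representatives(pts, candidates):
    """Map each candidate variable to a representative name.

    Variables with identical points-to sets share a fresh name rep1, rep2, ...
    Variables that are alone in their class keep their own name.
    """
    groups = defaultdict(list)
    for v in candidates:
        groups[frozenset(pts.get(v, ()))].append(v)
    rep = {}
    counter = 0
    for members in groups.values():
        if len(members) > 1:
            counter += 1
            for v in members:
                rep[v] = f"rep{counter}"
        else:
            rep[members[0]] = members[0]
    return rep

def rewrite_copies(copies, rep):
    """Substitute representatives in (lhs, rhs) copies; drop trivial ones."""
    out = []
    for lhs, rhs in copies:
        new_lhs, new_rhs = rep.get(lhs, lhs), rep.get(rhs, rhs)
        if new_lhs != new_rhs:
            out.append((new_lhs, new_rhs))
    return out

# Hypothetical library solution in which p and t have the same points-to set.
library_pts = {"p": {"v0", "g"}, "t": {"v0", "g"}, "fp": {"p0"}}
rep = representatives(library_pts, ["p", "t", "fp"])
remaining = rewrite_copies([("t", "p")], rep)
```

After substitution the copy `t=p` becomes a self-assignment and is dropped, which is one way a summary shrinks without losing precision.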
Optimized Summary
  - Basic summary
        proc p1(p,fp)
          t=p
          q=&s
          p2(q)
          *t=s
          (*fp)(t,g)
        proc p2(r)
        Mod(p1): (t,I)    Call(p1): (p2,D) (fp,I)
        Mod(p2): (r,I)
  - Optimized summary
        proc p1(rep1,fp)
          (*fp)(rep1,g)
        Mod(p1): (rep1,I)    Call(p1): (fp,I)
Experiments
  - Library and client points-to analyses
      - BANE toolkit (UC Berkeley)
  - Summary generation and optimization
  - Publicly available C programs
      - Include general-purpose libraries
  - Sun Ultra-60, 360MHz, 512MB memory
Data Programs (table or chart not preserved in this text)
Cost of Library Analysis (table or chart not preserved in this text)
Cost of Client Analysis (table or chart not preserved in this text)
Cost Reduction for Client Analysis (table or chart not preserved in this text)
Summary
  - Challenges for whole-program analysis
  - Programs with precompiled libraries
      - Points-to and side-effect analyses
      - Separate analysis of libraries and clients
      - Summary construction
  - Future work
      - Other points-to analyses
      - Client analyses