Jaeho Shin ROPAS Show & Tell


1 Set-Based Analysis
Jaeho Shin <netj@ropas.snu.ac.kr>, 2004-11-01, ROPAS Show & Tell

2 Overview
Treating program variables as sets of values is simple and intuitive, and requires no abstract domain (if no further approximation is used).
It ignores dependencies between different variables, between different occurrences of the same variable, and between the domain and codomain of functions.
Set-based analysis (especially in [He1994]) makes no a priori requirement that sets be finitely presentable, and it represents an upper bound on the accuracy of systems that ignore dependencies between variables.

3 Inter-Variable Dependencies
{u ↦ 1, v ↦ 2}   {u ↦ 3, v ↦ 4}
{x ↦ 1, ran(f) ↦ [1,1]}   {x ↦ 2, ran(f) ↦ [2,2]}
{dom(g) ↦ 1, ran(g) ↦ 2}   {dom(g) ↦ 2, ran(g) ↦ 3}

4 Ignoring Inter-Variable Dependencies
{ u ↦ {1, 3}, v ↦ {2, 4} }
{ x ↦ {1, 2}, ran(f) ↦ {[1,1],[1,2],[2,1],[2,2]} }
{ dom(g) ↦ {1, 2}, ran(g) ↦ {2, 3} }
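
The loss of precision can be made concrete with a small sketch. It assumes environments are plain Python dicts; the helper name collapse is hypothetical and not part of the slides. It merges the two environments for u and v into one set environment and lists the combinations that the merged view admits but that never occur together.

```python
from itertools import product

def collapse(envs):
    """Merge a list of environments (dicts) into one set environment."""
    set_env = {}
    for env in envs:
        for name, value in env.items():
            set_env.setdefault(name, set()).add(value)
    return set_env

# Two concrete environments for u and v (illustrative encoding).
envs = [{"u": 1, "v": 2}, {"u": 3, "v": 4}]
set_env = collapse(envs)

# Every (u, v) combination the set environment admits.
names = sorted(set_env)
admitted = set(product(*(sorted(set_env[n]) for n in names)))

# The combinations that really occur in some environment.
actual = {tuple(env[n] for n in names) for env in envs}

# Combinations introduced only by ignoring the dependency between u and v.
spurious = admitted - actual
print(set_env)
print(sorted(spurious))
```

The spurious pairs are exactly the price of forgetting that u and v vary together.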

5 Target Language
An ML-like, simple call-by-value functional language.

6 Operational Semantics

7 Set-Based Operational Semantics
Approximates execution by collapsing all environments into one single set environment.

8 Set-Based Approximation
Local safety conditions are needed for a safe approximation: the set-based semantics defined here is non-deterministic, and it may lead to an unsound approximation.
The set-based approximation of term e0, written sba(e0), is the set of values derived from the safe and minimal set environment Emin.

9 Algorithm for Computing sba(e0)
Representation of values: used to forget the environment part of closures.
The algorithm in [He1994] computes the representation of sba(e0) in basically two steps:
1. Construct set constraints from the given term.
2. Simplify the constructed set constraints.

10 Set Constraint Set Variable Set Expression

11 Constructing Constraints

12 Meaning of Constraints
Interpretation I from set expressions to sets of set constraint values

13 Correspondence of C with sba(e0)
An interpretation I is a model of the conjunction of constraints C if, for each constraint X ⊇ se in C, I(se) is defined and I(X) ⊇ I(se).
Interpretations are ordered pointwise: I1 ⊇ I2 if I1(X) ⊇ I2(X) for all X. Under this order there is a least model lm(C) of C.
It can be proved that if e0 ▷ (X, C) and Ilm = lm(C), then Ilm(X) = ||sba(e0)||.
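
As an illustration of the least model, here is a minimal sketch for a deliberately tiny constraint language. It assumes set expressions are only constants or set variables, which is far less than the constraint language of [He1994]; the tuple encoding and the name least_model are hypothetical. Starting from empty sets and adding only what some constraint X ⊇ se forces yields the smallest interpretation that satisfies every constraint.

```python
def least_model(constraints):
    """Compute the least model of constraints of the form X ⊇ se.

    Each constraint is (X, se), where se is ("const", c) or ("var", Y).
    """
    variables = {x for x, _ in constraints}
    variables |= {se[1] for _, se in constraints if se[0] == "var"}
    model = {x: set() for x in variables}

    changed = True
    while changed:
        changed = False
        for x, (kind, payload) in constraints:
            # I(se) for the two kinds of set expression handled here.
            values = {payload} if kind == "const" else model[payload]
            if not values <= model[x]:
                model[x] |= values
                changed = True
    return model

# X ⊇ 1, Y ⊇ X, Y ⊇ 2, Z ⊇ Y
C = [
    ("X", ("const", 1)),
    ("Y", ("var", "X")),
    ("Y", ("const", 2)),
    ("Z", ("var", "Y")),
]
lm = least_model(C)
print(lm)
```

Any other model of C must contain at least these values for each variable, since every value in lm was forced by some constraint.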

14 Simplifying Constraints

15 Remarks on the Algorithm
The simplification algorithm outputs the explicit form of C.
The explicit form contains only constraints with atomic expressions, where an atomic expression is an abstraction or a constant with all subparts atomic.
The explicit form represents a regular grammar for the possible values.
Time complexity is O(n^3):
  Construction of constraints is linear in the size of e0.
  At most O(n^2) new constraints can be added by the simplification.
  Determining what other new constraints need to be added, when adding each new constraint, can be bounded by O(n).
Space complexity is O(n^2).
The algorithm also computes the least set environment safe w.r.t. e0.
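
The flavor of the simplification can be sketched as a naive fixpoint. This is not the rule set of [He1994]: it assumes only three hypothetical constraint shapes (X ⊇ atom, X ⊇ Y, and X ⊇ apply(F, A)), encodes an abstraction as ("lam", param_var, body_var), and makes no attempt to meet the O(n^3) bound. The result maps each set variable to its atomic expressions, in the spirit of the explicit form.

```python
def explicit_form(constraints):
    """Close a small constraint set and return {set variable: atomic expressions}."""
    atoms = {}     # X -> set of atomic expressions a with X ⊇ a
    edges = set()  # (X, Y) meaning X ⊇ Y
    apps = []      # (X, F, A) meaning X ⊇ apply(F, A)
    for c in constraints:
        if c[0] == "atom":
            atoms.setdefault(c[1], set()).add(c[2])
        elif c[0] == "sub":
            edges.add((c[1], c[2]))
        else:
            apps.append(c[1:])

    changed = True
    while changed:
        changed = False
        # X ⊇ apply(F, A) and F ⊇ lam(p, b) give p ⊇ A and X ⊇ b.
        for x, f, a in apps:
            for atom in list(atoms.get(f, ())):
                if atom[0] == "lam":
                    _, param, body = atom
                    for edge in ((param, a), (x, body)):
                        if edge not in edges:
                            edges.add(edge)
                            changed = True
        # X ⊇ Y and Y ⊇ atom give X ⊇ atom.
        for x, y in list(edges):
            new = atoms.get(y, set()) - atoms.get(x, set())
            if new:
                atoms.setdefault(x, set()).update(new)
                changed = True
    return atoms

# The identity function applied at two call sites, to 1 and to 2.
constraints = [
    ("atom", "id", ("lam", "x", "x")),  # body of the abstraction is x itself
    ("atom", "a1", ("const", 1)),
    ("atom", "a2", ("const", 2)),
    ("app", "r1", "id", "a1"),
    ("app", "r2", "id", "a2"),
]
atoms = explicit_form(constraints)
print(atoms["r1"], atoms["r2"])
```

Both results receive both constants, because the single set variable for x merges the two call sites.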

16 Application: Finding Links in Web Pages
Goal: find all possible links (URLs) from a given web page written in HTML and JavaScript.
Observation: URLs in HTML can be found trivially. For JavaScript, strings assigned to variables named *.href or *.src are the URLs.
Solution:
1. Transform the given web page into an intermediate representation.
2. Construct set constraints from the intermediate program.
3. Simplify the constraints.
4. Gather all strings that may be assigned to variables named *.href or *.src.
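
The last two steps of the solution can be sketched on a toy intermediate program. Everything here is an assumption made for illustration: the program is a flat list of assignments, each right-hand side is a concatenation of string literals and variables, control flow is ignored, and the names solve and links as well as the example URLs are hypothetical. The actual intermediate language and constraints are on the following slides.

```python
from itertools import product

def solve(assignments, max_rounds=100):
    """Collect, per variable, every string it may hold (flow-insensitive)."""
    values = {}
    for _ in range(max_rounds):
        changed = False
        for target, parts in assignments:
            # Each part contributes either one literal or all known values of a variable.
            choices = [
                [p[1]] if p[0] == "str" else sorted(values.get(p[1], ()))
                for p in parts
            ]
            for combo in product(*choices):
                s = "".join(combo)
                if s not in values.setdefault(target, set()):
                    values[target].add(s)
                    changed = True
        if not changed:
            return values
    # A variable concatenated onto itself would grow forever; this sketch gives up.
    raise RuntimeError("no fixpoint within max_rounds (cyclic concatenation?)")

def links(values):
    """Gather strings that may be assigned to variables named *.href or *.src."""
    return {
        s
        for name, strs in values.items()
        if name.endswith((".href", ".src"))
        for s in strs
    }

# Toy program: page is assigned in two branches, then used to build a link.
program = [
    ("base", [("str", "http://example.com/")]),
    ("page", [("str", "a.html")]),
    ("page", [("str", "b.html")]),
    ("location.href", [("var", "base"), ("var", "page")]),
    ("img.src", [("str", "logo.png")]),
]
found = links(solve(program))
print(sorted(found))
```

Both branch values of page reach location.href, so both URLs are reported as possible links.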

17 Finding Links in Web Pages: Transforming HTML + JavaScript

18 Finding Links in Web Pages: Intermediate Language

19 Finding Links in Web Pages: Set Constraints

20 Finding Links in Web Pages: Constructing Constraints 1/2

21 Finding Links in Web Pages: Constructing Constraints 2/2

22 Finding Links in Web Pages: Simplifying Constraints

23 Finding Links in Web Pages: Concretizing Values 1/2

24 Finding Links in Web Pages: Concretizing Values 2/2

25 Finding Links in Web Pages: Future Work
Demand-driven analysis: analyze only the variables named *.href or *.src, using the idea in [ChYi2002].
Increase precision:
  Process undeclared global variables and nested functions.
  Distinguish different occurrences of the same variable.
  Handle arithmetic in a more sophisticated way.
Consider using regular expressions instead of strings with *'s for the final concrete output.
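
One way to read the last item is sketched below. It assumes that a * in an output string stands for an arbitrary unknown substring, which the slides do not define precisely; the function name star_pattern_to_regex and the example URLs are hypothetical.

```python
import re

def star_pattern_to_regex(pattern):
    """Turn a string with *'s into a compiled regular expression.

    Literal pieces are escaped; each * becomes '.*' (any substring).
    """
    pieces = pattern.split("*")
    return re.compile(".*".join(re.escape(p) for p in pieces), re.DOTALL)

rx = star_pattern_to_regex("http://example.com/*.html")
matches_same_site = rx.fullmatch("http://example.com/news/a.html") is not None
matches_other_site = rx.fullmatch("http://example.org/a.html") is not None
print(matches_same_site, matches_other_site)
```

A regular expression output of this kind could be refined later, for example by restricting what a * may match, without changing how results are consumed.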

26 References
[He1994] Nevin Heintze, “Set-Based Analysis of ML Programs”, In Proceedings of the SIGPLAN Conference on Lisp and Functional Programming, 1994.
[ChYi2002] Woongshik Choi and Kwang Yi, “Demand-driven Set-Based Analysis”, Tech. Memo. ROPAS , Research On Program Analysis System, Korea Advanced Institute of Science and Technology, October

