1
Recap Mooly Sagiv
2
Outline
Subjects Studied
Questions & Answers
3
Lexical Analysis (Scanning)
input
–program text (file)
output
–sequence of tokens
Read input file
Identify language keywords and standard identifiers
Handle include files and macros
Count line numbers
Remove whitespace
Report illegal symbols
[Produce symbol table]
4
The Lexical Analysis Problem
Given
–A set of token descriptions
–An input string
Partition the string into tokens (class, value)
Ambiguity resolution
–The longest matching token
–Between two equal length tokens select the first
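The two ambiguity rules above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the token classes, the three hand-written matchers and the function `next_token` are hypothetical stand-ins for what a generated scanner would do with regular expressions.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token classes, used only for this sketch. */
typedef enum { TOK_IF, TOK_IDENT, TOK_NUMBER, TOK_ERROR } TokenClass;

/* Each matcher returns the length of the longest prefix of s it accepts
   (0 if it accepts nothing). */
static size_t match_if(const char *s) {
    return strncmp(s, "if", 2) == 0 ? 2 : 0;
}

static size_t match_ident(const char *s) {
    size_t n = 0;
    if (isalpha((unsigned char)s[0]))
        while (isalnum((unsigned char)s[n])) n++;
    return n;
}

static size_t match_number(const char *s) {
    size_t n = 0;
    while (isdigit((unsigned char)s[n])) n++;
    return n;
}

/* Token descriptions in priority order: earlier entries win ties. */
static const struct {
    TokenClass cls;
    size_t (*match)(const char *);
} rules[] = {
    { TOK_IF, match_if },
    { TOK_IDENT, match_ident },
    { TOK_NUMBER, match_number },
};

/* Classify the token at the start of s and store its length in *len. */
TokenClass next_token(const char *s, size_t *len) {
    assert(s != NULL && len != NULL);
    TokenClass best = TOK_ERROR;
    *len = 0;
    for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        size_t n = rules[i].match(s);
        if (n > *len) {   /* strictly longer only, so the first rule keeps ties */
            *len = n;
            best = rules[i].cls;
        }
    }
    return best;
}
```

With these rules, `if` is classified as the keyword (a tie of length 2, won by the first rule), while `iffy` is an identifier because the longer match wins.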
5
Jlex
Input
–regular expressions and actions (Java code)
Output
–A scanner program that reads the input and applies actions when an input regular expression is matched
[Figure: Jlex turns regular expressions into a scanner; the scanner turns the input program into tokens]
6
Summary
For most programming languages lexical analyzers can be easily constructed automatically
Exceptions:
–Fortran
–PL/1
Lex/Flex/Jlex are useful beyond compilers
7
Syntax Analysis (Parsing)
input
–Sequence of tokens
output
–Abstract Syntax Tree
Report syntax errors
–unbalanced parentheses
[Create “symbol-table”]
[Create pretty-printed version of the program]
In some cases the tree need not be generated (one-pass compilers)
8
Pushdown Automaton
[Figure: control, parser-table, input (u t w $), stack (V $)]
9
Efficient Parsers
Pushdown automata
Deterministic
Report an error as soon as the input is not a prefix of a valid program
Not usable for all context free grammars
[Figure: CUP takes a context free grammar and produces a parser or reports “Ambiguity errors”; the parser turns tokens into a parse tree]
10
Kinds of Parsers
Top-Down (Predictive Parsing) LL
–Construct parse tree in a top-down manner
–Find the leftmost derivation
–For every non-terminal and token predict the next production
–Preorder tree traversal
Bottom-Up LR
–Construct parse tree in a bottom-up manner
–Find the rightmost derivation in reverse order
–For every potential right hand side and token decide when a production is found
–Postorder tree traversal
11
Top-Down Parsing
[Figure: parse tree above the input tokens t1 t2 …, with nodes numbered 1 to 5]
12
Bottom-Up Parsing
[Figure: parse tree above the input tokens t1 t2 t4 t5 t6 t7 t8, with nodes numbered 1 to 3]
13
Example Grammar for Predictive LL Top-Down Parsing
expression → digit | '(' expression operator expression ')'
operator → '+' | '*'
digit → '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
15
static int Parse_Expression(Expression **expr_p) {
    Expression *expr = *expr_p = new_expression();

    /* try to parse a digit */
    if (Token.class == DIGIT) {
        expr->type = 'D';
        expr->value = Token.repr - '0';
        get_next_token();
        return 1;
    }

    /* try to parse a parenthesized expression */
    if (Token.class == '(') {
        expr->type = 'P';
        get_next_token();
        if (!Parse_Expression(&expr->left)) Error("missing expression");
        if (!Parse_Operator(&expr->oper)) Error("missing operator");
        /* second operand required by the grammar; the field name
           `right` is assumed by analogy with `left` */
        if (!Parse_Expression(&expr->right)) Error("missing expression");
        if (Token.class != ')') Error("missing )");
        get_next_token();
        return 1;
    }

    /* neither alternative applies */
    return 0;
}
16
Parsing Expressions
Try every alternative production
–For P → A1 A2 … An | B1 B2 … Bm
–If A1 succeeds
  Call A2
  If A2 succeeds
  –Call A3
  If A2 fails report an error
–Otherwise try B1
Recursive descent parsing
Can be applied for certain grammars
Generalization: LL(1) parsing
17
int P(...) {
    /* try to parse the alternative P → A1 A2 ... An */
    if (A1(...)) {
        if (!A2()) Error("Missing A2");
        if (!A3()) Error("Missing A3");
        ..
        if (!An()) Error("Missing An");
        return 1;
    }
    /* try to parse the alternative P → B1 B2 ... Bm */
    if (B1(...)) {
        if (!B2()) Error("Missing B2");
        if (!B3()) Error("Missing B3");
        ..
        if (!Bm()) Error("Missing Bm");
        return 1;
    }
    return 0;
}
18
Predictive Parser for Arithmetic Expressions
Grammar
1 E → E + T
2 E → T
3 T → T * F
4 T → F
5 F → id
6 F → (E)
C-code?
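One possible answer to the "C-code?" question, as a sketch rather than the course's reference solution. The grammar as written is left-recursive, which a predictive parser cannot handle directly, so the sketch assumes the usual rewrite E → T { + T } and T → F { * F }. It also assumes one-character tokens, with any lowercase letter standing for `id`; the names `cursor`, `parse_E`, `parse_T`, `parse_F` and `parse` are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Next unread input character (assumption: every token is one character). */
static const char *cursor;

static int parse_E(void);

/* F -> id | ( E ) */
static int parse_F(void) {
    if (islower((unsigned char)*cursor)) {   /* id */
        cursor++;
        return 1;
    }
    if (*cursor == '(') {
        cursor++;
        if (!parse_E()) return 0;
        if (*cursor != ')') return 0;        /* missing ')' */
        cursor++;
        return 1;
    }
    return 0;
}

/* T -> F { * F }   (left recursion of T -> T * F replaced by a loop) */
static int parse_T(void) {
    if (!parse_F()) return 0;
    while (*cursor == '*') {
        cursor++;
        if (!parse_F()) return 0;
    }
    return 1;
}

/* E -> T { + T }   (left recursion of E -> E + T replaced by a loop) */
static int parse_E(void) {
    if (!parse_T()) return 0;
    while (*cursor == '+') {
        cursor++;
        if (!parse_T()) return 0;
    }
    return 1;
}

/* Returns 1 if the whole input is an expression, 0 otherwise. */
int parse(const char *input) {
    assert(input != NULL);
    cursor = input;
    return parse_E() && *cursor == '\0';
}
```

The sketch only recognizes input; building a tree would follow the pattern of `Parse_Expression`, with each function filling in a node.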
19
Bottom-Up Syntax Analysis
Input
–A context free grammar
–A stream of tokens
Output
–A syntax tree or error
Method
–Construct parse tree in a bottom-up manner
–Find the rightmost derivation in reverse order
–For every potential right hand side and token decide when a production is found
–Report an error as soon as the input is not a prefix of a valid program
20
Constructing LR(0) parsing table
Add a production S' → S$
Construct a finite automaton accepting “valid stack symbols”
States are sets of items A → α•β
–The states of the automaton become the states of the parsing-table
–Determine shift operations
–Determine goto operations
–Determine reduce operations
–Report an error when conflicts arise
21
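[Figure: LR(0) automaton; • marks the parser position inside each item]
Start state: 1: S → •E$, 4: E → •T, 6: E → •E + T, 10: T → •i, 12: T → •(E)
On T: 5: E → T•
On i: 11: T → i•
On E: 2: S → E•$, 7: E → E•+ T
On (: 13: T → (•E), 4: E → •T, 6: E → •E + T, 10: T → •i, 12: T → •(E)
From the ( state on E: 14: T → (E•), 7: E → E•+ T
From there on ): 15: T → (E)•
On +: 7: E → E +•T, 10: T → •i, 12: T → •(E)
From the + state on T: 8: E → E + T•
On $: 2: S → E$•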
22
Parsing “(i)$”
[Figure: the LR(0) automaton of the previous slide, used to parse the input (i)$]
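The parse of “(i)$” can be replayed with a small table-driven shift-reduce parser. This is a sketch under assumptions: the grammar is taken to be S → E$, E → T | E + T, T → i | (E), the states are numbered 0 to 9 by hand (the numbering is not the slide's item numbering), and all names such as `next_state`, `reduce` and `lr0_parse` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { SYM_I, SYM_PLUS, SYM_LPAR, SYM_RPAR, SYM_END, SYM_E, SYM_T, NSYM };
enum { NSTATES = 10, ACCEPT_STATE = 9, STACK_MAX = 256 };

/* Shift/goto table: next_state[s][X] is the successor of state s on symbol X.
   0 means "no transition"; state 0 is never the target of a transition. */
static const int next_state[NSTATES][NSYM] = {
    [0] = { [SYM_T] = 1, [SYM_I] = 2, [SYM_E] = 3, [SYM_LPAR] = 4 },
    [3] = { [SYM_PLUS] = 5, [SYM_END] = 9 },
    [4] = { [SYM_T] = 1, [SYM_I] = 2, [SYM_LPAR] = 4, [SYM_E] = 6 },
    [5] = { [SYM_I] = 2, [SYM_LPAR] = 4, [SYM_T] = 8 },
    [6] = { [SYM_PLUS] = 5, [SYM_RPAR] = 7 },
};

/* Reduce table: states that hold one completed item.
   len is the length of the right hand side; len == 0 means "not a reduce state". */
static const struct { int len, lhs; } reduce[NSTATES] = {
    [1] = { 1, SYM_E },   /* E -> T .     */
    [2] = { 1, SYM_T },   /* T -> i .     */
    [7] = { 3, SYM_T },   /* T -> ( E ) . */
    [8] = { 3, SYM_E },   /* E -> E + T . */
};

static int symbol_of(char c) {
    switch (c) {
    case 'i': return SYM_I;
    case '+': return SYM_PLUS;
    case '(': return SYM_LPAR;
    case ')': return SYM_RPAR;
    case '$': return SYM_END;
    default:  return -1;      /* illegal symbol, including end of string */
    }
}

/* Returns 1 if input (terminated by '$') is accepted, 0 on a syntax error. */
int lr0_parse(const char *input) {
    assert(input != NULL);
    int stack[STACK_MAX];     /* stack of states */
    int top = 0;
    size_t pos = 0;
    stack[0] = 0;
    for (;;) {
        int state = stack[top];
        if (state == ACCEPT_STATE)            /* S -> E $ . */
            return input[pos] == '\0';
        if (reduce[state].len > 0) {          /* reduce: pop the right hand side, */
            top -= reduce[state].len;         /* then take the goto on the lhs    */
            int target = next_state[stack[top]][reduce[state].lhs];
            if (target == 0) return 0;
            stack[++top] = target;
            continue;
        }
        int sym = symbol_of(input[pos]);      /* shift */
        if (sym < 0) return 0;
        int target = next_state[state][sym];
        if (target == 0 || top + 1 >= STACK_MAX) return 0;
        stack[++top] = target;
        pos++;
    }
}
```

For `(i)$` the state stack evolves as 0, 0 4, 0 4 2, 0 4 1, 0 4 6, 0 4 6 7, 0 1, 0 3, 0 3 9. Because the table is LR(0), reduce states ignore the next token; an error is detected when a shift state has no transition for it.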
23
Summary (Bottom-Up)
LR is a powerful technique
Generates efficient parsers
Generation tools exist for LALR(1)
–Bison, yacc, CUP
But some grammars need to be tuned
–Shift/Reduce conflicts
–Reduce/Reduce conflicts
–Efficiency of the generated parser
24
Summary (Parsing)
Context free grammars provide a natural way to define the syntax of programming languages
Ambiguity may be resolved
Predictive parsing is natural
–Good error messages
–Natural error recovery
–But not expressive enough
But LR bottom-up parsing is more expressive
25
Abstract Syntax
Intermediate program representation
Defines a tree
–Preserves program hierarchy
Generated by the parser
Declared using an (ambiguous) context free grammar (relatively flat)
–Not meant for parsing
Keywords and punctuation symbols are not stored
–Not relevant once the tree exists
Big programs can also be handled (possibly via virtual memory)
26
Semantic Analysis
Requirements related to the “context” in which a construct occurs
Examples
–Name resolution
–Scoping
–Type checking
–Escape
Implemented via AST traversals
Guides subsequent compiler phases
27
Abstract Interpretation
Static analysis
Automatically identify program properties
–No user provided loop invariants
Sound but incomplete methods
–But can be rather precise
Non-standard interpretation of the program operational semantics
Applications
–Compiler optimization
–Code quality tools
  Identify potential bugs
  Prove the absence of runtime errors
  Partial correctness
28
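Constant Propagation
z = 3
while (x > 0)
    if (x = 1) y = 7
    else y = z + 4
    assert y == 7
[Figure: control flow graph annotated with the abstract states]
[x ↦ ?, y ↦ ?, z ↦ ?]
[x ↦ ?, y ↦ ?, z ↦ 3]
[x ↦ 1, y ↦ ?, z ↦ 3]
[x ↦ 1, y ↦ 7, z ↦ 3]
[x ↦ ?, y ↦ 7, z ↦ 3]
[x ↦ ?, y ↦ ?, z ↦ 3]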
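The abstract states of this example can be reproduced with a very small abstract domain. The sketch below is a simplification under stated assumptions: a variable is either a known constant or “?”, there is no separate bottom element for unreachable code, the loop condition x > 0 is ignored, and the test in the `if` is read as establishing x = 1 on the true branch. All type and function names are hypothetical.

```c
#include <assert.h>

/* Abstract value: a known constant, or "?" when known == 0. */
typedef struct { int known; int value; } AbsVal;
typedef struct { AbsVal x, y, z; } State;
typedef struct { State loop_head, after_if; } Result;

static AbsVal unknown(void)   { AbsVal v = { 0, 0 }; return v; }
static AbsVal constant(int c) { AbsVal v = { 1, c }; return v; }

/* Merge of two paths: a constant survives only if both paths agree on it. */
static AbsVal join(AbsVal a, AbsVal b) {
    return (a.known && b.known && a.value == b.value) ? a : unknown();
}

static State join_state(State a, State b) {
    State s;
    s.x = join(a.x, b.x);
    s.y = join(a.y, b.y);
    s.z = join(a.z, b.z);
    return s;
}

static int same(AbsVal a, AbsVal b) {
    return a.known == b.known && (!a.known || a.value == b.value);
}

static int same_state(State a, State b) {
    return same(a.x, b.x) && same(a.y, b.y) && same(a.z, b.z);
}

/* Abstract addition: constant only if both operands are constant. */
static AbsVal add(AbsVal a, AbsVal b) {
    return (a.known && b.known) ? constant(a.value + b.value) : unknown();
}

/* Analyze: z = 3; while (x > 0) { if (x = 1) y = 7 else y = z + 4; assert y == 7 } */
Result analyze_example(void) {
    State entry;
    entry.x = unknown();
    entry.y = unknown();
    entry.z = constant(3);                      /* after z = 3 */

    Result r;
    r.loop_head = entry;
    int rounds = 0;
    for (;;) {
        assert(rounds++ < 8);                   /* finite domain: must converge quickly */
        State then_s = r.loop_head;             /* true branch: x is 1, then y = 7 */
        then_s.x = constant(1);
        then_s.y = constant(7);
        State else_s = r.loop_head;             /* false branch: y = z + 4 */
        else_s.y = add(else_s.z, constant(4));
        r.after_if = join_state(then_s, else_s);
        State next = join_state(entry, r.after_if);   /* entry edge joined with back edge */
        if (same_state(next, r.loop_head)) return r;
        r.loop_head = next;
    }
}
```

After the `if`, both branches agree on y = 7, so the assertion can be proved even though x is unknown there; at the loop head y falls back to “?” because the entry path knows nothing about y.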
29
             /* c */
L0: a := 0
             /* ac */
L1: b := a + 1
             /* bc */
    c := c + b
             /* bc */
    a := b * 2
             /* ac */
    if c < N goto L1
             /* c */
    return c
[Figure: control flow graph of the same program]
30
[Figure: control flow graph of a := 0; b := a + 1; c := c + b; a := b * 2; if c < N goto L1; return c]
31
[Figure: control flow graph of a := 0; b := a + 1; c := c + b; a := b * 2; if c < N goto L1; return c, annotated with the live set {c}]
33
[Figure: control flow graph of a := 0; b := a + 1; c := c + b; a := b * 2; if c < N goto L1; return c, annotated with the live sets {c}, {c, b}]
35
[Figure: control flow graph of a := 0; b := a + 1; c := c + b; a := b * 2; if c < N goto L1; return c, annotated with the live sets {c}, {c, b}, {c, a}, {c, b}]
36
[Figure: control flow graph of a := 0; b := a + 1; c := c + b; a := b * 2; if c < N goto L1; return c, annotated with the live sets {c, a}, {c, b}, {c, a}, {c, b}]
37
Summary Iterative Procedure
Analyze one procedure at a time
–More precise solutions exist
Construct a control flow graph for the procedure
Initialize the values at every node to the most optimistic value
Iterate until convergence
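The iterative procedure can be sketched for live-variable analysis of the program a := 0; L1: b := a + 1; c := c + b; a := b * 2; if c < N goto L1; return c. The sketch makes some assumptions: one control flow graph node per statement, N treated as a constant rather than a variable, the empty set as the optimistic initial value, and hypothetical names (`cfg`, `liveness`, the bit masks `A`, `B`, `C`).

```c
#include <assert.h>

enum { A = 1 << 0, B = 1 << 1, C = 1 << 2 };   /* one bit per variable */
enum { NODES = 6 };

/* A control flow graph node: variables used, variables defined, successors. */
typedef struct {
    unsigned use, def;
    int succ[2];                                /* -1 means "no successor" */
} Node;

static const Node cfg[NODES] = {
    { 0,     A, {  1, -1 } },   /* 0: a := 0                */
    { A,     B, {  2, -1 } },   /* 1: L1: b := a + 1        */
    { B | C, C, {  3, -1 } },   /* 2: c := c + b            */
    { B,     A, {  4, -1 } },   /* 3: a := b * 2            */
    { C,     0, {  1,  5 } },   /* 4: if c < N goto L1      */
    { C,     0, { -1, -1 } },   /* 5: return c              */
};

/* Compute the variables live on entry to and on exit from every node. */
void liveness(unsigned live_in[NODES], unsigned live_out[NODES]) {
    for (int n = 0; n < NODES; n++)
        live_in[n] = live_out[n] = 0;           /* most optimistic value: nothing is live */

    int changed = 1;
    while (changed) {                           /* iterate until convergence */
        changed = 0;
        for (int n = NODES - 1; n >= 0; n--) {  /* backward problem: visit in reverse */
            unsigned out = 0;
            for (int k = 0; k < 2; k++) {
                int s = cfg[n].succ[k];
                assert(s < NODES);
                if (s >= 0) out |= live_in[s];  /* out[n] = union of in[successors] */
            }
            unsigned in = cfg[n].use | (out & ~cfg[n].def);   /* in[n] = use + (out - def) */
            if (in != live_in[n] || out != live_out[n]) {
                live_in[n] = in;
                live_out[n] = out;
                changed = 1;
            }
        }
    }
}
```

The result agrees with the comments in the annotated listing: {c} before a := 0, {a, c} before L1, {b, c} before each of the next two statements, {a, c} before the test, and {c} before the return.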
38
Basic Compiler Phases
39
Overall Structure
40
Techniques Studied
Simple code generation
Basic blocks
Global register allocation
Activation records
Object Oriented
Assembler/Linker/Loader
41
Heap Memory Management
Part of the runtime system
Utilities for dynamic memory allocation
Utilities for automatic memory reclamation
–Garbage Collection
42
Garbage Collection
Techniques
–Mark and sweep
–Copying collection
–Reference counting
Modes
–Generational
–Incremental vs. Stop the world