Abstract Syntax
Prabhaker Mateti
Different Levels of Syntax
  Lexical syntax
    Basic symbols (names, values, operators, etc.)
  Concrete syntax
    Rules for writing expressions, statements, programs
    Input to compilers/interpreters
  Abstract syntax
    "Internal" representation
    Captures semantics
CS784(PM)
Overview
  [Diagram: Concrete Syntax -> parse-expression -> Abstract Syntax -> Interpreter -> Results;
            Abstract Syntax -> unparse-expression -> Concrete Syntax]
  Potentially one can have several concrete syntaxes.
  Specifically, the text uses two concrete syntaxes in examples:
    Scheme symbolic expressions
    Scheme character strings
Concrete vs. Abstract Syntax
  Expressions with a common meaning (should) have the same abstract syntax.
    C: a+b*c
      Assumes a certain operator precedence (why?)
    Forth: bc*a+ (reverse Polish)
      abc*+ (or is this it?)
  [Figure: expression tree]
    This expression tree represents the meaning of the expression.
    Not the same as the parse tree (why?)
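The relation between the two reverse Polish strings and their trees can be sketched in plain Scheme. This is an illustration only, not code from the slides: the names `operator?` and `postfix->tree` are hypothetical, and the sketch assumes that `+` and `*` are the only operators and that both are binary.

```scheme
;; Hypothetical helper: is this token one of the two binary operators?
(define (operator? tok) (memq tok '(+ *)))

;; Build a prefix expression tree from a list of postfix tokens,
;; using an explicit stack of partial trees.
(define (postfix->tree tokens)
  (let loop ((toks tokens) (stack '()))
    (cond
      ((null? toks) (car stack))            ; finished: the tree is on top
      ((operator? (car toks))
       ;; pop the right operand, then the left, and push the new node
       (let ((right (car stack))
             (left (cadr stack)))
         (loop (cdr toks)
               (cons (list (car toks) left right) (cddr stack)))))
      (else (loop (cdr toks) (cons (car toks) stack))))))

(postfix->tree '(b c * a +))   ; => (+ (* b c) a)
(postfix->tree '(a b c * +))   ; => (+ a (* b c))
```

The two trees differ only in the order of the operands of `+`. Whether they count as the same abstract syntax depends on whether the design treats `+` as commutative.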
Parse tree vs. AST
  [Figure: a parse tree, built from expr nodes over a parenthesized sum of 1, 2 and 3, beside the corresponding AST, whose + nodes sit over the leaves 1, 2, 3]
Abstract Syntax Tree
  More useful representation of the syntax tree
    Less clutter
    Actual level of detail depends on your design
  Basis for semantic analysis
  Later annotated with various information
    Type information
    Computed values
Compilation in a Nutshell
  Source code (character stream)
      if (b == 0) a = b;
  Lexical analysis
  Token stream
      if  (  b  ==  0  )  a  =  b  ;
  Parsing
  Abstract syntax tree (AST)
      [Figure: an if node with two children: == over b and 0, and = over a and b]
  Semantic analysis
  Decorated AST
      [Figure: the same tree with types attached: == is boolean, = is int, b is int, 0 is int, a is an int lvalue]
λ-expressions
  <exp> ::= <identifier>
          | (lambda (<identifier>) <exp>)
          | (<exp> <exp>)
  Compare with Scheme S-expressions.
  EOPL3 p52: Lc-exp
Lc-exp ::= Identifier
       ::= (lambda (Identifier) Lc-exp)
       ::= (Lc-exp Lc-exp)
Abstract syntax
  Identifier                      var-exp (var)
  (lambda (Identifier) Lc-exp)    lambda-exp (bound-var body)
  (Lc-exp Lc-exp)                 app-exp (rator rand)
EOPL3 Scheme: define-datatype
  Syntax definition:
    (define-datatype type-name type-predicate-name
      {(variant-name {(field-name predicate)}*)}+)
  An example:
    (define-datatype environment environment?
      (empty-env-record)
      (extended-env-record
        (syms (list-of symbol?))
        (vals (list-of scheme-value?))
        (env environment?)))
  Data types built by define-datatype may be mutually recursive.
Syntax Driven Representation
    (define-datatype expression expression?
      (var-exp
        (id symbol?))
      (lambda-exp
        (id symbol?)
        (body expression?))
      (app-exp
        (rator expression?)
        (rand expression?)))
  Resembles signature declarations
    var-exp    : symbol? -> expression?
    lambda-exp : symbol? x expression? -> expression?
    app-exp    : expression? x expression? -> expression?
  List accessors used by the parser
    (caadr X) = (car (car (cdr X)))
    caadr = first of second
    caddr = third
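The accessor shorthand can be checked on a concrete lambda expression. This is a small illustration in plain Scheme; the variable name `datum` is chosen here only for the example.

```scheme
;; A concrete-syntax lambda expression, as a quoted S-expression.
(define datum '(lambda (x) (f (f x))))

(cadr datum)    ; => (x)          second element: the parameter list
(caadr datum)   ; => x            first of second: the bound variable
(caddr datum)   ; => (f (f x))    third element: the body
```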
Figure 2.2: abstract syntax tree for (lambda (x) (f (f x)))
Concrete to Abstract Syntax
    (define parse-expression
      (lambda (datum)
        (cond
          ((symbol? datum) (var-exp datum))
          ((pair? datum)
           (if (eqv? (car datum) 'lambda)
               (lambda-exp (caadr datum)
                           (parse-expression (caddr datum)))
               (app-exp (parse-expression (car datum))
                        (parse-expression (cadr datum)))))
          (else (eopl:error 'parse-expression
                            "Invalid concrete syntax ~s" datum)))))
DrRacket
Unparse: Abstract to Concrete Syntax
    (define unparse-expression
      (lambda (exp)
        (cases expression exp
          (var-exp (id) id)
          (lambda-exp (id body)
            (list 'lambda (list id)
                  (unparse-expression body)))
          (app-exp (rator rand)
            (list (unparse-expression rator)
                  (unparse-expression rand))))))
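Parsing and unparsing should be inverses on well-formed input. The sketch below checks this round trip in plain Scheme, without the EOPL library. It rests on two assumptions: tagged lists stand in for the records that define-datatype would build, and `error` stands in for eopl:error. The short names `parse` and `unparse` are used only to keep the sketch apart from the slide definitions.

```scheme
;; Stand-ins for the define-datatype constructors (assumption: tagged lists).
(define (var-exp id) (list 'var-exp id))
(define (lambda-exp id body) (list 'lambda-exp id body))
(define (app-exp rator rand) (list 'app-exp rator rand))

;; Concrete syntax (S-expression) to abstract syntax.
(define (parse datum)
  (cond
    ((symbol? datum) (var-exp datum))
    ((and (pair? datum) (eqv? (car datum) 'lambda))
     (lambda-exp (caadr datum) (parse (caddr datum))))
    ((pair? datum)
     (app-exp (parse (car datum)) (parse (cadr datum))))
    (else (error "Invalid concrete syntax" datum))))

;; Abstract syntax back to concrete syntax, dispatching on the tag.
(define (unparse exp)
  (case (car exp)
    ((var-exp) (cadr exp))
    ((lambda-exp)
     (list 'lambda (list (cadr exp)) (unparse (caddr exp))))
    ((app-exp)
     (list (unparse (cadr exp)) (unparse (caddr exp))))
    (else (error "Not an expression" exp))))

(define sample '(lambda (x) (f (f x))))

(parse sample)
;; => (lambda-exp x (app-exp (var-exp f) (app-exp (var-exp f) (var-exp x))))

(equal? (unparse (parse sample)) sample)   ; => #t
```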
Role of Induction and Recursion
  Define data structures (infinite sets of values) by induction.
    Seed elements.
    Closure operations.
  Define functions (operations) by recursion.
    Boundary/basis case.
    Composite/recursive case.
  Prove properties using structural induction.
    Basis case.
    Inductive step.
  Example: natural numbers
    Constructors: zero, succ
    Operations: add, mul
    Properties: identity, commutativity
  Chapter 1 + Scoping
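The natural number example can be sketched in plain Scheme. The representation is an assumption made for illustration: zero is the empty list and succ conses one mark onto its argument (a unary representation). The names `is-zero?` and `pred` are hypothetical helpers; `length` is used only to read a result back as an ordinary number.

```scheme
;; Inductive definition: a seed element and a closure operation.
(define (zero) '())
(define (succ n) (cons #t n))

;; Observers (hypothetical helper names).
(define (is-zero? n) (null? n))
(define (pred n) (cdr n))

;; add by recursion on m: basis case m = zero, recursive case m = (succ m').
(define (add m n)
  (if (is-zero? m)
      n
      (succ (add (pred m) n))))

;; mul by recursion on m, defined in terms of add.
(define (mul m n)
  (if (is-zero? m)
      (zero)
      (add n (mul (pred m) n))))

(define two (succ (succ (zero))))
(define three (succ two))

(length (add two three))                   ; => 5
(length (mul two three))                   ; => 6
(equal? (add two three) (add three two))   ; => #t
```

The last line checks one instance of commutativity. A proof for all naturals would proceed by structural induction on the first argument.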
The Environment
  Environment: a table of variable-value pairs
    A chronologically later var-value pair overrides an earlier one
  Constructors
    empty-env
    extend-env
  Observer
    apply-env
  The constructors (empty-env, extend-env) and the lookup (apply-env) together capture table behavior.
  Different implementations apportion the overall workload among them differently.
The Environment Spec
  (empty-env) = ⌈∅⌉
  (apply-env ⌈f⌉ var) = f(var)
  (extend-env var v ⌈f⌉) = ⌈g⌉,
      where g(var1) = v          if var1 = var
                    = f(var1)    otherwise
  Here ⌈f⌉ denotes a representation of the environment f.
  from EOPL3 p36
An Example Env e
    (define e
      (extend-env 'd 6
        (extend-env 'y 8
          (extend-env 'x 7
            (extend-env 'y 14
              (empty-env))))))
  e(d)=6, e(x)=7, e(y)=8
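The values e(d)=6, e(x)=7, e(y)=8 can be checked with any implementation that meets the spec. A minimal sketch follows, using the single-binding interface and representing an environment as a procedure. The parameter name `search-var` is chosen for the example, and `error` stands in for eopl:error as an assumption so that the code runs without the EOPL library.

```scheme
;; The empty environment has no bindings: every lookup fails.
(define (empty-env)
  (lambda (search-var)
    (error "No binding for" search-var)))

;; The extended environment answers for var and defers to env otherwise.
(define (extend-env var val env)
  (lambda (search-var)
    (if (eqv? search-var var)
        val
        (apply-env env search-var))))

;; Lookup applies the environment to the variable.
(define (apply-env env search-var)
  (env search-var))

(define e
  (extend-env 'd 6
    (extend-env 'y 8
      (extend-env 'x 7
        (extend-env 'y 14
          (empty-env))))))

(apply-env e 'd)   ; => 6
(apply-env e 'x)   ; => 7
(apply-env e 'y)   ; => 8, the later binding of y overrides 14
```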
Representing Environment
  Many representations are possible
    Speedy access
    Memory frugal
  Change in the interface: Syms x Vals
    (define extend-env
      (lambda (syms vals env) … ))
  With this interface, extend-env introduces a collection of bindings simultaneously.
Alt-1: env is a List of Ribs
  [Figure: ribcage]
    left rib: list of variables
    right rib: corresponding list of values
  Exercise 2.11 EOPL3 (Ribcage)
Alt-1: List of Ribs (Ribcage)
    (define empty-env
      (lambda () '()))

    (define extend-env
      (lambda (syms vals env)
        (cons (list syms vals) env)))

    (define apply-env
      (lambda (env sym)
        (if (null? env)
            (eopl:error 'apply-env "No binding for ~s" sym)
            (let ((syms (car (car env)))      ; left rib of the first entry
                  (vals (cadr (car env)))     ; right rib of the first entry
                  (env (cdr env)))            ; remaining ribs
              (let ((pos (rib-find-position sym syms)))
                (if (number? pos)
                    (list-ref vals pos)
                    (apply-env env sym)))))))
  Ribcage: the environment is a list of ribs; each rib is a list of symbols paired with the list of their values.
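The slide uses rib-find-position without defining it. One plausible definition, offered as an assumption rather than as the course's own code, returns the zero-based index of the symbol in the rib, or #f when it is absent. That matches how apply-env tests the result with number? and then calls list-ref.

```scheme
;; Hypothetical definition: zero-based position of sym in syms, or #f.
(define (rib-find-position sym syms)
  (let loop ((rest syms) (pos 0))
    (cond
      ((null? rest) #f)                  ; ran off the end: not in this rib
      ((eqv? (car rest) sym) pos)        ; found at the current position
      (else (loop (cdr rest) (+ pos 1))))))

(rib-find-position 'y '(x y z))   ; => 1
(rib-find-position 'w '(x y z))   ; => #f

;; The position indexes the matching value in the right rib.
(list-ref '(7 8 9) (rib-find-position 'y '(x y z)))   ; => 8
```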
Alt-2: env is a Unary Function
    (define empty-env
      (lambda ()
        (lambda (sym)
          (eopl:error 'apply-env "No binding for ~s" sym))))

    (define extend-env
      (lambda (syms vals env)
        (lambda (sym)                         ; the new environment
          (let ((pos (list-find-position sym syms)))
            (if (number? pos)
                (list-ref vals pos)
                (apply-env env sym))))))

    (define apply-env
      (lambda (env sym)
        (env sym)))
  Env value: a unary function (lambda (sym) …)
Alt-3: Tagged Records
    (define-datatype environment environment?
      (empty-env-record)
      (extended-env-record
        (syms (list-of symbol?))
        (vals (list-of scheme-value?))
        (env environment?)))

    (define scheme-value? (lambda (v) #t))

    (define empty-env
      (lambda ()
        (empty-env-record)))

    (define extend-env
      (lambda (syms vals env)
        (extended-env-record syms vals env)))
Alt-3: Tagged records
  Ordinary tagged record implementation
    (define apply-env
      (lambda (env sym)
        (cases environment env
          (empty-env-record ()
            (eopl:error 'apply-env "No binding for ~s" sym))
          (extended-env-record (syms vals env)
            (let ((pos (list-find-position sym syms)))
              (if (number? pos)
                  (list-ref vals pos)
                  (apply-env env sym)))))))
Queue
    (define reset   (lambda (q) (vector-ref q 0)))
    (define empty?  (lambda (q) (vector-ref q 1)))
    (define enqueue (lambda (q) (vector-ref q 2)))
    (define dequeue (lambda (q) (vector-ref q 3)))

    (define Q (create-queue))
    ((enqueue Q) 55)
    ((empty? Q))
    ((dequeue Q))
    ((reset Q))
  The queue is represented as a pair of lists, with the second list in reverse order, which gives constant-time access to both ends in practice (amortized).
  When the first list is empty, the second list is reversed to refill it: the occasional "inefficient" dequeue.
  (Cf. CS776 Queue) (Data invariant: normal form)
  Uses assignment in Scheme (set!) + encapsulation (static scoping) + infinite lifetime of the captured variables.
    (define create-queue
      (lambda ()
        (let ((q-in '())       ; newest element first
              (q-out '()))     ; oldest element first
          (letrec
            ((reset-queue
               (lambda ()
                 (set! q-in '())
                 (set! q-out '())))
             (empty-queue?
               (lambda ()
                 (and (null? q-in) (null? q-out))))
             (enqueue
               (lambda (x)
                 (set! q-in (cons x q-in))))
             (dequeue
               (lambda ()
                 (if (empty-queue?)
                     (eopl:error 'dequeue "Not on an empty queue")
                     (begin
                       ;; refill q-out from q-in only when q-out is exhausted
                       (if (null? q-out)
                           (begin
                             (set! q-out (reverse q-in))
                             (set! q-in '())))
                       (let ((ans (car q-out)))
                         (set! q-out (cdr q-out))
                         ans))))))
            (vector reset-queue empty-queue? enqueue dequeue)))))