Design Patterns for Recursive Descent Parsing

SIGCSE 2005, St. Louis, MO Design Patterns for Recursive Descent Parsing Dung Nguyen, Mathias Ricken & Stephen Wong Rice University.
Presentation transcript:

Design Patterns for Recursive Descent Parsing Dung Nguyen, Mathias Ricken & Stephen Wong Rice University

RDP in CS2?
- Context: an objects-first intro curriculum that already covers polymorphism, recursion, design patterns (visitors, factories, etc.), and OOD principles
- Want a good OOP/D example
- Want a relevant CS topic
- Recursive descent parsing: smooth transitions from simple to complex examples while developing an abstract model
- Δ change in grammar ⇒ Δ change in code

The Problem of Teaching RDP
- Common perception: “A complex, isolated, advanced topic for upper division only”
- Mutual recursion!
- [Diagram: new grammar → (global analysis? parser generator?) → new code]
- High-level topic only
- Complex
- Non-modular
- Difficult to extract an overall abstraction
- Scaling to generators is problematic
- Less useful from a pedagogical standpoint: a difficult example to learn recursion with
- The path from grammar to parser is easy for a computer but hard for humans

Object-Oriented Approach
- The grammar must drive any processing related to it, e.g. parsing
- Model the grammar first:
  - Terminal symbols (tokens)
  - Non-terminal symbols (incl. start symbol)
  - Rules
- Then parsing will come!
- Driving forces:
  - Decouple intelligent tokens from rules ⇒ visitors to tokens
  - Extensible system: open-ended number of tokens ⇒ extended visitors
- Intelligent tokens vs. switching on dumb tokens
- Rules as visitors vs. interpreter pattern on tokens
- Localized decisions
- Express the overall abstraction
- Pedagogical aspects:
  - Tangibility of objects makes the recursion easier to understand
  - Easier to see how the grammar creates the parser
  - Fits in with the OO curriculum: no new concepts to master

Representing Tokens
- Intelligent tokens ⇒ no type checking!
- Decoupled from processing ⇒ visitor pattern
- For LL(1) grammars, in any given situation, the token determines the parsing action taken ⇒ parsing is done by visitors to tokens
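As a minimal sketch of what an "intelligent token" could look like (the talk's own implementation targets Java; the class and method names below, such as `execute`, `num_case`, and `Describe`, are illustrative assumptions), each token calls its own case on the visitor, so no code ever inspects a token's type:

```python
from abc import ABC, abstractmethod


class TokenVisitor(ABC):
    """One case per token kind (hypothetical interface for this sketch)."""

    @abstractmethod
    def num_case(self, token): ...

    @abstractmethod
    def id_case(self, token): ...


class Token(ABC):
    def __init__(self, lexeme):
        self.lexeme = lexeme

    @abstractmethod
    def execute(self, visitor):
        """Each concrete token picks the visitor case that matches itself."""


class NumToken(Token):
    def execute(self, visitor):
        return visitor.num_case(self)


class IdToken(Token):
    def execute(self, visitor):
        return visitor.id_case(self)


class Describe(TokenVisitor):
    """Example processing, kept outside the token classes."""

    def num_case(self, token):
        return f"number {token.lexeme}"

    def id_case(self, token):
        return f"identifier {token.lexeme}"


# No isinstance checks and no switch statement: the token decides.
descriptions = [t.execute(Describe()) for t in (NumToken("42"), IdToken("x"))]
```

The processing lives in `Describe`, not in the tokens, which is the decoupling the slide refers to.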

Processing Tokens with Visitors
- Standard visitor pattern: a Visitor declares caseA and caseB. When the Visitor visits Token A, the token calls caseA; when it visits Token B, the token calls caseB.
- But we want to be able to add an unbounded number of tokens!

Processing Tokens with Visitors
- Visitor pattern modified with chain of responsibility:
  - VisitorA has caseA and defaultCase; VisitorB has caseB and defaultCase
  - When VisitorA visits Token A, the token calls caseA
  - When VisitorA visits Token B, the token calls defaultCase, which delegates to the next visitor in the chain, VisitorB; VisitorB then visits Token B, which calls caseB
- Handles any type of token!
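The chained variant can be sketched as follows. This is an illustration under assumptions, not the authors' code: the names `case_name`, `default_case`, `ChainVisitor`, and `FailVisitor` are invented here, and Python's `getattr` stands in for whatever mechanism the original uses to detect that a visitor lacks a case.

```python
class Token:
    case_name = None  # name of the visitor method this token asks for

    def __init__(self, lexeme):
        self.lexeme = lexeme

    def execute(self, visitor):
        # Call the token's own case if the visitor has it, else defaultCase.
        handler = getattr(visitor, self.case_name, None)
        if handler is None:
            return visitor.default_case(self)
        return handler(self)


class NumToken(Token):
    case_name = "num_case"


class IdToken(Token):
    case_name = "id_case"


class PlusToken(Token):
    case_name = "plus_case"


class FailVisitor:
    """End of the chain: no visitor could handle the token."""

    def default_case(self, token):
        raise SyntaxError(f"unexpected token {token.lexeme!r}")


class ChainVisitor:
    """Delegates unknown tokens to the next visitor in the chain."""

    def __init__(self, successor):
        self.successor = successor

    def default_case(self, token):
        return token.execute(self.successor)


class NumVisitor(ChainVisitor):
    def num_case(self, token):
        return ("num", token.lexeme)


class IdVisitor(ChainVisitor):
    def id_case(self, token):
        return ("id", token.lexeme)


chain = NumVisitor(IdVisitor(FailVisitor()))
results = [NumToken("7").execute(chain), IdToken("x").execute(chain)]

try:
    PlusToken("+").execute(chain)  # no visitor in the chain has plus_case
    rejected = False
except SyntaxError:
    rejected = True
```

Adding a new token kind means adding one token class and one visitor to the chain; existing visitors are untouched.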

Modeling an LL(1) Grammar
- Preparing the LL(1) grammar: left-factoring makes the grammar predictively parsable
- Original grammar:
    E → F | F + E
    F → num | id
- Left-factored grammar:
    E → F E1
    E1 → empty | + E
    F → num | id
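The left-factoring step can be illustrated with a small helper. This is a simplified sketch: the function names are hypothetical, a rule is represented as a list of alternatives (each a list of symbols, with `[]` standing for empty), and only the case where all alternatives share a common prefix is handled.

```python
def common_prefix(alternatives):
    """Longest run of leading symbols shared by every alternative."""
    prefix = []
    for symbols in zip(*alternatives):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return prefix


def left_factor(name, alternatives):
    """Factor a shared prefix out of `name`'s alternatives.

    Returns a dict mapping each non-terminal to its alternatives.
    The new non-terminal is named by appending "1" (E becomes E1).
    """
    prefix = common_prefix(alternatives)
    if len(alternatives) < 2 or not prefix:
        return {name: alternatives}  # nothing to factor
    tail = name + "1"
    tails = [alt[len(prefix):] for alt in alternatives]  # [] means empty
    return {name: [prefix + [tail]], tail: tails}


# E -> F | F + E   becomes   E -> F E1,  E1 -> empty | + E
factored = left_factor("E", [["F"], ["F", "+", "E"]])
```

A general left-factoring procedure would also group alternatives that share a prefix only with some of their siblings; that is omitted here.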

Modeling an LL(1) Grammar
- Preparing the LL(1) grammar: in rules with multiple branches, replace sequences and tokens with unique non-terminal symbols
  - Sequences: associate each with a unique symbol, which separates sequences from branches
  - Terminals: wrap each in a non-terminal
  - Branches then contain only non-terminals
- E → F E1 (unchanged)
- E1 → empty | + E becomes E1 → empty | E1a and E1a → + E
- F → num | id becomes F → F1 | F2, F1 → num, and F2 → id

Modeling an LL(1) Grammar
- For non-terminals, a local view only
- Branches are modeled by inheritance (“is-a”): A → B | C
  - Multiple rules are represented with inheritance (union): F1 is an F and F2 is an F ⇒ union
- Sequences are modeled by composition (“has-a”): S → X Y
  - A sequence is represented with composition: E1a has a + and E1a has an E ⇒ composite with sequential processing

Object Model of Grammar
- Grammar structure = class structure
    E → F E1
    E1 → empty | E1a
    E1a → + E
    F → F1 | F2
    F1 → num
    F2 → id
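One way to picture "grammar structure = class structure" is the sketch below, which mirrors the six rules with one class per non-terminal. The field names and the `leaves` helper are assumptions made for illustration; the original slides show this as a class diagram.

```python
from dataclasses import dataclass, fields


class F:                      # F -> F1 | F2  (branch: inheritance)
    pass


@dataclass
class F1(F):                  # F1 -> num
    num: str


@dataclass
class F2(F):                  # F2 -> id
    id: str


class E1:                     # E1 -> empty | E1a  (branch: inheritance)
    pass


@dataclass
class Empty(E1):              # the empty alternative
    pass


@dataclass
class E1a(E1):                # E1a -> + E  (sequence: composition)
    plus: str
    e: "E"


@dataclass
class E:                      # E -> F E1  (sequence: composition)
    f: F
    e1: E1


def leaves(node):
    """Collect the terminal lexemes of a tree, left to right."""
    if isinstance(node, str):
        return [node]
    return [leaf for fld in fields(node) for leaf in leaves(getattr(node, fld.name))]


# The tree for the input "a + 3"
tree = E(F2("a"), E1a("+", E(F1("3"), Empty())))
```

Each class knows only its immediate parts, which is the "local view" the previous slide describes.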

Modeling an LL(1) Grammar
- No predictive parsing table!
- Declarative, not procedural
- Model the grammar, not the parsing!

Detailed and Global Analysis ⇒ Abstract and Local Analysis!
Grammar:
    E → F E1
    E1 → empty | E1a
    E1a → + E
    F → F1 | F2
    F1 → num
    F2 → id
Detailed and global analysis:
- To process E, we must first know about F and E1…
- But to process F, we must first know about F1 and F2…
- But to process F1, we must first know about num!
- The processing of one rule requires deep knowledge of the whole grammar! Or does it?
- Interdependence between rules: one rule needs the functionality of another rule (circular relationship problem)
Abstract and local analysis:
- To process E, we must have the ability to process F and E1, independent of how either F or E1 is processed!
- Since parsing is done with visitors to tokens, all we need to parse E are the visitors to parse F and E1.
- Delegation model: visitors to tokens determine the parsing that occurs due to the grammar rules; this replaces switch statements
- With visitors, we don’t need to know either which token or what rules to follow ⇒ we can think in terms of abstract behaviors
- Look at abstract behavior to decouple: abstract behaviors ⇒ abstract construction
- But E doesn’t know what it takes to make the F and E1 parsing visitors… We need abstract construction of the visitors…
- Solution using factories: abstract factories create concrete instances of the abstract behaviors (branching factory, sequence factory)
- Abstract factories decouple rules

Factory Model of Parser
- Parser structure = factory structure
- Grammar represented purely with composition
    E → F E1
    E1 → empty | E1a
    E1a → + E
    F → F1 | F2
    F1 → num
    F2 → id
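The factory structure can be sketched roughly as follows. Everything here is an assumption made for illustration rather than the authors' design: the factory names (`TerminalFac`, `SequenceFac`, `ChoiceFac`, `OptionalFac`, `ProxyFac`), the peek/advance token stream, and the use of a dictionary of cases in place of a true visitor chain. The proxy factory is what lets E1a refer to E before E has been built.

```python
class Token:
    case_name = "end"

    def __init__(self, lexeme=""):
        self.lexeme = lexeme

    def execute(self, visitor):
        # The token selects its own case; unknown tokens hit the default.
        return visitor.cases.get(self.case_name, visitor.default)(self)

class Num(Token): case_name = "num"
class Id(Token): case_name = "id"
class Plus(Token): case_name = "plus"
class End(Token): case_name = "end"

class Tokens:
    def __init__(self, toks):
        self.toks, self.pos = list(toks), 0
    def peek(self):
        return self.toks[self.pos]
    def advance(self):
        self.pos += 1

def fail(token):
    raise SyntaxError(f"unexpected token {token.lexeme!r}")

class Visitor:
    def __init__(self, cases, default=fail):
        self.cases, self.default = cases, default

class TerminalFac:                       # e.g. F1 -> num
    def __init__(self, case_name, tokens):
        self.case_name, self.tokens = case_name, tokens
    def make_visitor(self):
        def on_match(token):
            self.tokens.advance()        # consume the matched token
            return token.lexeme
        return Visitor({self.case_name: on_match})

class SequenceFac:                       # A -> X Y
    def __init__(self, first, second, tokens):
        self.first, self.second, self.tokens = first, second, tokens
    def make_visitor(self):
        def wrap(handler):
            def on_match(token):
                left = handler(token)
                # Y's visitor is created only when needed (breaks cycles).
                right = self.tokens.peek().execute(self.second.make_visitor())
                return (left, right)
            return on_match
        cases = self.first.make_visitor().cases
        return Visitor({name: wrap(h) for name, h in cases.items()})

class ChoiceFac:                         # A -> B | C
    def __init__(self, *options):
        self.options = options
    def make_visitor(self):
        cases = {}
        for option in self.options:
            cases.update(option.make_visitor().cases)
        return Visitor(cases)

class OptionalFac:                       # A -> empty | B
    def __init__(self, fac):
        self.fac = fac
    def make_visitor(self):
        # Default case: match empty without consuming the token.
        return Visitor(self.fac.make_visitor().cases, default=lambda token: None)

class ProxyFac:                          # stand-in for a rule defined later
    target = None
    def make_visitor(self):
        return self.target.make_visitor()

def parse(token_list):
    tokens = Tokens(token_list + [End()])
    e = ProxyFac()
    f = ChoiceFac(TerminalFac("num", tokens), TerminalFac("id", tokens))
    e1 = OptionalFac(SequenceFac(TerminalFac("plus", tokens), e, tokens))
    e.target = SequenceFac(f, e1, tokens)                    # E -> F E1
    tree = tokens.peek().execute(e.make_visitor())
    tokens.peek().execute(Visitor({"end": lambda t: None}))  # require end of input
    return tree

tree = parse([Id("a"), Plus("+"), Num("3")])
```

In this sketch a parse result is a nested tuple, with `None` marking the empty alternative. Each factory is built only from the factories of the symbols on its right-hand side, so adding a rule means adding a factory without touching the others.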

Extending the Grammar
- Adding new tokens and rules
- Highly localized impact on code
- No re-computing of prediction tables

E  F E1 E1  empty | E1a E1a  + E F  F1 | F2 F1  num F2  id E  S E1 E1  empty | E1a E1a  + E S  P | T P  (E) T  F T1 T1  empty | T1a T1a  * S F  F1 | F2 F1  num F2  id

Parser Demo (if time permits)
- We change your grammar in two minutes while you wait!

Automatic Parser Generator
- No additional theory needed for generalization
- No fixed points, no FIRST and FOLLOW sets
- Kooprey parser generator: BNF → Java
  - kou·prey (noun): “a rare short-haired ox (Bos sauveli) of forests of Indochina […]” (Merriam-Webster Online)
- Extensions: skip generation of source, create parser at runtime

Conclusion
- Simple enough to introduce in a CS2 course (at Rice: near the end of CS2)
- Teaches an abstraction of grammars and parsing
- Reinforces foundational OO principles:
  - Abstract representations
  - Abstract construction
  - Decoupled systems
  - Recursion
- http://www.exciton.cs.rice.edu/research/sigcse05