SIGCSE 2005, St. Louis, MO Design Patterns for Recursive Descent Parsing Dung Nguyen, Mathias Ricken & Stephen Wong Rice University.


1 SIGCSE 2005, St. Louis, MO Design Patterns for Recursive Descent Parsing Dung Nguyen, Mathias Ricken & Stephen Wong Rice University

2 RDP in CS2? Context: an objects-first intro curriculum that already covers polymorphism, recursion, design patterns (visitors, factories, etc.) and OOD principles. Want a good OOP/D example. Want a relevant CS topic. Recursive descent parsing: smooth transitions from simple to complex examples, developing an abstract model. ∆ change in grammar → ∆ change in code.

3 The Problem of Teaching RDP “A complex, isolated, advanced topic for upper division only.” Parser generator?

4 Object-Oriented Approach The grammar must drive any processing related to it, e.g. parsing → model the grammar first: terminal symbols (tokens), non-terminal symbols (incl. the start symbol), and rules. Driving forces: decouple intelligent tokens from rules → visitors to tokens; extensible system with an open-ended number of tokens → extended visitors. Then parsing will come!

5 Representing Tokens Intelligent tokens → no type checking! Decoupled from processing → Visitor pattern. For LL(1) grammars, in any given situation, the token determines the parsing action taken → parsing is done by visitors to tokens.

6 Processing Tokens with Visitors Standard Visitor pattern: [Diagram: a Visitor with caseA and caseB visits Token A and Token B; each token calls its own case on the visitor.]
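The standard Visitor pattern named on this slide can be sketched in Java roughly as follows. This is a minimal illustration, not the authors' code: only the labels Token A, Token B, Visitor, caseA and caseB come from the slide, while `execute`, `lexeme` and the `Describe` visitor are assumed names.

```java
// Hypothetical sketch: only Token A, Token B, Visitor, caseA and caseB
// appear on the slide; every other name is assumed for illustration.
interface Visitor<R> {
    R caseA(TokenA host);
    R caseB(TokenB host);
}

abstract class Token {
    private final String lexeme;
    Token(String lexeme) { this.lexeme = lexeme; }
    String lexeme() { return lexeme; }
    // Each concrete token calls the visitor case that matches its own type.
    abstract <R> R execute(Visitor<R> visitor);
}

final class TokenA extends Token {
    TokenA(String lexeme) { super(lexeme); }
    @Override <R> R execute(Visitor<R> visitor) { return visitor.caseA(this); }
}

final class TokenB extends Token {
    TokenB(String lexeme) { super(lexeme); }
    @Override <R> R execute(Visitor<R> visitor) { return visitor.caseB(this); }
}

// One algorithm over all token types, written without instanceof or switch.
final class Describe implements Visitor<String> {
    @Override public String caseA(TokenA host) { return "A(" + host.lexeme() + ")"; }
    @Override public String caseB(TokenB host) { return "B(" + host.lexeme() + ")"; }
}
```

Calling `new TokenB("x").execute(new Describe())` dispatches to `caseB`, so the caller never inspects the token's type. The cost of this form is that the `Visitor` interface must list every token up front, which is what the next slide relaxes.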

7 Processing Tokens with Visitors Visitor pattern modified with Chain-of-Responsibility: [Diagram: VisitorA (caseA, defaultCase) and VisitorB (caseB, defaultCase) are linked in a chain and visit Token A and Token B; a token calls its matching case, and defaultCase delegates to the next visitor in the chain.]
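One way to read the chained variant is sketched below, under assumptions: each token type brings its own small visitor interface, every visitor has a `defaultCase`, and a visitor that does not handle a token passes it to the next visitor in the chain. The names `HandleA`, `HandleB`, `Reject` and `TokenC` are hypothetical, and the capability check inside `execute` is one possible way to do the dispatch, not necessarily the one used in the talk.

```java
// Hypothetical sketch of Visitor combined with Chain-of-Responsibility.
interface Visitor<R> {
    R defaultCase(Token host);            // called when the visitor has no case for the host
}
interface VisitorA<R> extends Visitor<R> { R caseA(TokenA host); }
interface VisitorB<R> extends Visitor<R> { R caseB(TokenB host); }
interface VisitorC<R> extends Visitor<R> { R caseC(TokenC host); }

abstract class Token {
    final String lexeme;
    Token(String lexeme) { this.lexeme = lexeme; }
    abstract <R> R execute(Visitor<R> visitor);
}

final class TokenA extends Token {
    TokenA(String lexeme) { super(lexeme); }
    @Override <R> R execute(Visitor<R> v) {
        // The token asks whether the visitor knows about it; if not, it falls back.
        return (v instanceof VisitorA) ? ((VisitorA<R>) v).caseA(this) : v.defaultCase(this);
    }
}

final class TokenB extends Token {
    TokenB(String lexeme) { super(lexeme); }
    @Override <R> R execute(Visitor<R> v) {
        return (v instanceof VisitorB) ? ((VisitorB<R>) v).caseB(this) : v.defaultCase(this);
    }
}

// A token added later: no existing token or visitor has to change.
final class TokenC extends Token {
    TokenC(String lexeme) { super(lexeme); }
    @Override <R> R execute(Visitor<R> v) {
        return (v instanceof VisitorC) ? ((VisitorC<R>) v).caseC(this) : v.defaultCase(this);
    }
}

final class HandleA implements VisitorA<String> {
    private final Visitor<String> next;
    HandleA(Visitor<String> next) { this.next = next; }
    @Override public String caseA(TokenA host) { return "A:" + host.lexeme; }
    @Override public String defaultCase(Token host) { return host.execute(next); } // delegate
}

final class HandleB implements VisitorB<String> {
    private final Visitor<String> next;
    HandleB(Visitor<String> next) { this.next = next; }
    @Override public String caseB(TokenB host) { return "B:" + host.lexeme; }
    @Override public String defaultCase(Token host) { return host.execute(next); } // delegate
}

// End of the chain: nothing handled the token.
final class Reject implements Visitor<String> {
    @Override public String defaultCase(Token host) { return "unhandled:" + host.lexeme; }
}
```

With the chain `new HandleA(new HandleB(new Reject()))`, a `TokenB` skips `HandleA` through `defaultCase` and is handled by `HandleB`, while a `TokenC` falls through to `Reject`. Supporting `TokenC` would mean adding one more link to the chain rather than editing a shared visitor interface.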

8 Modeling an LL(1) Grammar Left-factoring: make the grammar predictively parsable. E → F E1; E1 → empty | + E; F → num | id

9 FF Modeling an LL(1) Grammar In multiple rules (branches), replace sequences and tokens with unique non- terminal symbols Branches only contain non-terminals EE E1  F FF empty | E1 + E numid |E1a  F1  F2  E1a F1 F2 numid |

10 Modeling an LL(1) Grammar Branches are modeled by inheritance (“is-a”): A → B | C. Sequences are modeled by composition (“has-a”): S → X Y.

11 Object Model of Grammar E → F E1; E1 → empty | E1a; E1a → + E; F → F1 | F2; F1 → num; F2 → id. Grammar structure = class structure.
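As a rough Java sketch of "grammar structure = class structure", each non-terminal below becomes a class, each branch becomes a subclass, and each sequence becomes a set of fields. The class names mirror the non-terminals of the grammar; `Empty` and the `show()` method are assumptions added only so the structure can be printed.

```java
// Hypothetical object model of the grammar; show() exists only for illustration.
abstract class F { abstract String show(); }      // F -> F1 | F2   (branch: "is-a")

final class F1 extends F {                        // F1 -> num
    private final String num;
    F1(String num) { this.num = num; }
    @Override String show() { return num; }
}

final class F2 extends F {                        // F2 -> id
    private final String id;
    F2(String id) { this.id = id; }
    @Override String show() { return id; }
}

abstract class E1 { abstract String show(); }     // E1 -> empty | E1a (branch: "is-a")

final class Empty extends E1 {                    // the empty alternative
    @Override String show() { return ""; }
}

final class E1a extends E1 {                      // E1a -> + E  (sequence: "has-a" E)
    private final E e;
    E1a(E e) { this.e = e; }
    @Override String show() { return " + " + e.show(); }
}

final class E {                                   // E -> F E1  (sequence: "has-a" F and E1)
    private final F f;
    private final E1 e1;
    E(F f, E1 e1) { this.f = f; this.e1 = e1; }
    String show() { return f.show() + e1.show(); }
}
```

For example, the input `a + 3` corresponds to `new E(new F2("a"), new E1a(new E(new F1("3"), new Empty())))`, and calling `show()` on it rebuilds the text `a + 3`.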

12 Modeling an LL(1) Grammar

13 Detailed and Global Analysis Grammar: E → F E1; E1 → empty | E1a; E1a → + E; F → F1 | F2; F1 → num; F2 → id. To process E, we must first know about F and E1. But to process F, we must first know about F1 and F2, and to process F1, we must first know about num! The processing of one rule requires deep knowledge of the whole grammar! Or does it? Abstract and Local Analysis! To process E, we must have the ability to process F and E1, independent of how either F or E1 is processed. Since parsing is done with visitors to tokens, all we need to parse E are the visitors that parse F and E1. But E doesn’t know what it takes to make the F and E1 parsing visitors. We need abstract construction of the visitors.

14 Factory Model of Parser E → F E1; E1 → empty | E1a; E1a → + E; F → F1 | F2; F1 → num; F2 → id. Parser structure = factory structure. Grammar represented purely with composition.
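A compact Java sketch of the factory idea follows, with several simplifications that are assumptions rather than the authors' design: one factory per rule is stored as a field of a hypothetical `ParserSketch` class, each factory builds the token visitor that parses its rule, the tokenizer is omitted (tokens are supplied in a `Deque`), and the result is a parenthesized string instead of grammar objects. All names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;

interface TokVisitor { String defaultCase(Tok host); }
interface NumVisitor extends TokVisitor { String caseNum(Tok host); }
interface IdVisitor extends TokVisitor { String caseId(Tok host); }
interface PlusVisitor extends TokVisitor { String casePlus(Tok host); }
interface EofVisitor extends TokVisitor { String caseEof(Tok host); }

abstract class Tok {
    final String text;
    Tok(String text) { this.text = text; }
    abstract String execute(TokVisitor v);
}
final class NumTok extends Tok {
    NumTok(String text) { super(text); }
    @Override String execute(TokVisitor v) {
        return v instanceof NumVisitor ? ((NumVisitor) v).caseNum(this) : v.defaultCase(this);
    }
}
final class IdTok extends Tok {
    IdTok(String text) { super(text); }
    @Override String execute(TokVisitor v) {
        return v instanceof IdVisitor ? ((IdVisitor) v).caseId(this) : v.defaultCase(this);
    }
}
final class PlusTok extends Tok {
    PlusTok() { super("+"); }
    @Override String execute(TokVisitor v) {
        return v instanceof PlusVisitor ? ((PlusVisitor) v).casePlus(this) : v.defaultCase(this);
    }
}
final class EofTok extends Tok {
    EofTok() { super("<eof>"); }
    @Override String execute(TokVisitor v) {
        return v instanceof EofVisitor ? ((EofVisitor) v).caseEof(this) : v.defaultCase(this);
    }
}

// A factory builds the visitor that parses one rule; tokens the rule cannot
// start with are passed on to `otherwise`.
interface VisitorFactory { TokVisitor make(TokVisitor otherwise); }

final class ParserSketch {
    private static final TokVisitor FAIL = host -> {
        throw new IllegalStateException("unexpected token: " + host.text);
    };
    private final Deque<Tok> input;
    private VisitorFactory f1, f2, f, e1a, e1, e;   // one factory per grammar rule

    ParserSketch(Deque<Tok> input) {
        this.input = input;
        f1 = otherwise -> new NumVisitor() {                    // F1 -> num
            @Override public String caseNum(Tok host) { return host.text; }
            @Override public String defaultCase(Tok host) { return host.execute(otherwise); }
        };
        f2 = otherwise -> new IdVisitor() {                     // F2 -> id
            @Override public String caseId(Tok host) { return host.text; }
            @Override public String defaultCase(Tok host) { return host.execute(otherwise); }
        };
        f = otherwise -> f1.make(f2.make(otherwise));           // F -> F1 | F2: chain the branches
        e1a = otherwise -> new PlusVisitor() {                  // E1a -> + E
            @Override public String casePlus(Tok host) {
                return " + " + input.pop().execute(e.make(FAIL));
            }
            @Override public String defaultCase(Tok host) { return host.execute(otherwise); }
        };
        // E1 -> empty | E1a: the empty branch puts the token back and matches nothing.
        e1 = unused -> e1a.make(host -> { input.push(host); return ""; });
        // E -> F E1: a sequence. Simplification: it never delegates to a fallback.
        e = unused -> host -> {
            String left = host.execute(f.make(FAIL));
            String rest = input.pop().execute(e1.make(FAIL));
            return rest.isEmpty() ? left : "(" + left + rest + ")";
        };
    }

    String parse() {
        String tree = input.pop().execute(e.make(FAIL));
        input.pop().execute(new EofVisitor() {                  // require end of input
            @Override public String caseEof(Tok host) { return ""; }
            @Override public String defaultCase(Tok host) { return FAIL.defaultCase(host); }
        });
        return tree;
    }
}
```

Each factory refers only to the factories of the symbols on its rule's right-hand side, so the rule for E never needs to know which tokens F starts with. Feeding the tokens for `a + 3 + b` followed by an `EofTok` through `new ParserSketch(tokens).parse()` yields a right-nested string, and an input that ends after a `+` is rejected with an `IllegalStateException`.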

15 Extending the Grammar Adding new tokens and rules has a highly localized impact on the code. No re-computing of prediction tables.

16 Extended grammar: E → S E1; E1 → empty | E1a; E1a → + E; S → P | T; P → (E); T → F T1; T1 → empty | T1a; T1a → * S; F → F1 | F2; F1 → num; F2 → id. Original grammar: E → F E1; E1 → empty | E1a; E1a → + E; F → F1 | F2; F1 → num; F2 → id.

17 Parser Demo (If time permits) We change your grammar in two minutes while you wait! gram

18 Automatic Parser Generator No additional theory needed for generalization: no fixed points, no FIRST and FOLLOW sets. Kooprey parser generator: BNF → Java. kou·prey (noun): “a rare short-haired ox (Bos sauveli) of forests of Indochina […]” (Merriam-Webster Online). Extensions: skip generation of source, create the parser at runtime.

19 Conclusion Simple enough to introduce in a CS2 course (at Rice: near the end of CS2). Teaches an abstraction of grammars and parsing. Reinforces foundational OO principles: abstract representations, abstract construction, decoupled systems, recursion. http://www.exciton.cs.rice.edu/research/sigcse05

