Streaming Tree Transducers Loris D'Antoni University of Pennsylvania Joint work with Rajeev Alur 1.

Slides:



Advertisements
Similar presentations
Compiler Construction
Advertisements

Chapter 5 Pushdown Automata
Pushdown Automata Section 2.2 CSC 4170 Theory of Computation.
Equivalence of Extended Symbolic Finite Transducers Presented By: Loris D’Antoni Joint work with: Margus Veanes.
C O N T E X T - F R E E LANGUAGES ( use a grammar to describe a language) 1.
Introduction to Computability Theory
1 Introduction to Computability Theory Lecture7: PushDown Automata (Part 1) Prof. Amos Israeli.
Costas Busch - RPI1 Pushdown Automata PDAs. Costas Busch - RPI2 Pushdown Automaton -- PDA Input String Stack States.
Courtesy Costas Busch - RPI1 Pushdown Automata PDAs.
Courtesy Costas Busch - RPI1 NPDAs Accept Context-Free Languages.
1 Normal Forms for Context-free Grammars. 2 Chomsky Normal Form All productions have form: variable and terminal.
Validating Streaming XML Documents Luc Segoufin & Victor Vianu Presented by Harel Paz.
Fall 2004COMP 3351 Pushdown Automata PDAs. Fall 2004COMP 3352 Pushdown Automaton -- PDA Input String Stack States.
Fall 2006Costas Busch - RPI1 Pushdown Automata PDAs.
January 14, 2015CS21 Lecture 51 CS21 Decidability and Tractability Lecture 5 January 14, 2015.
Table-driven parsing Parsing performed by a finite state machine. Parsing algorithm is language-independent. FSM driven by table (s) generated automatically.
Chapter 2 A Simple Compiler
Prof. Busch - LSU1 Pushdown Automata PDAs. Prof. Busch - LSU2 Pushdown Automaton -- PDA Input String Stack States.
Fall 2005Costas Busch - RPI1 Pushdown Automata PDAs.
Model Checking Lecture 5. Outline 1 Specifications: logic vs. automata, linear vs. branching, safety vs. liveness 2 Graph algorithms for model checking.
Finite State Machines Data Structures and Algorithms for Information Processing 1.
Bottom-up parsing Goal of parser : build a derivation
More Trees COL 106 Amit Kumar and Shweta Agrawal Most slides courtesy : Douglas Wilhelm Harder, MMath, UWaterloo
Streaming Tree Transducers Loris D'Antoni University of Pennsylvania Joint work with Rajeev Alur 1.
1 CD5560 FABER Formal Languages, Automata and Models of Computation Lecture 8 Mälardalen University 2010.
FAST : a Transducer Based Language for Manipulating Trees Presented By: Loris D’Antoni Joint work with: Margus Veanes, Ben Livshits, David Molnar.
Pushdown Automata CS 130: Theory of Computation HMU textbook, Chap 6.
Design contex-free grammars that generate: L 1 = { u v : u ∈ {a,b}*, v ∈ {a, c}*, and |u| ≤ |v| ≤ 3 |u| }. L 2 = { a p b q c p a r b 2r : p, q, r ≥ 0 }
Managing XML and Semistructured Data Lecture 13: XDuce and Regular Tree Languages Prof. Dan Suciu Spring 2001.
P p Chapter 10 has several programming projects, including a project that uses heaps. p p This presentation shows you what a heap is, and demonstrates.
Management of XML and Semistructured Data Lecture 11: Schemas Wednesday, May 2nd, 2001.
1Computer Sciences Department. Book: INTRODUCTION TO THE THEORY OF COMPUTATION, SECOND EDITION, by: MICHAEL SIPSER Reference 3Computer Sciences Department.
Top-Down Parsing CS 671 January 29, CS 671 – Spring Where Are We? Source code: if (b==0) a = “Hi”; Token Stream: if (b == 0) a = “Hi”; Abstract.
Foundations of (Theoretical) Computer Science Chapter 2 Lecture Notes (Section 2.2: Pushdown Automata) Prof. Karen Daniels, Fall 2010 with acknowledgement.
11 CDT314 FABER Formal Languages, Automata and Models of Computation Lecture 7 School of Innovation, Design and Engineering Mälardalen University 2012.
CSCI 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Pushdown.
Pushdown Automata Hopcroft, Motawi, Ullman, Chap 6.
Grammar Set of variables Set of terminal symbols Start variable Set of Production rules.
Pushdown Automata Chapter 12. Recognizing Context-Free Languages Two notions of recognition: (1) Say yes or no, just like with FSMs (2) Say yes or no,
1 Chapter Pushdown Automata. 2 Section 12.2 Pushdown Automata A pushdown automaton (PDA) is a finite automaton with a stack that has stack operations.
CS 154 Formal Languages and Computability March 22 Class Meeting Department of Computer Science San Jose State University Spring 2016 Instructor: Ron Mak.
Formal Languages, Automata and Models of Computation
G. Pullaiah College of Engineering and Technology
CSE 105 theory of computation
Table-driven parsing Parsing performed by a finite state machine.
Pushdown automata Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section
Pushdown Automata PDAs
Pushdown Automata PDAs
Pushdown Automata PDAs
PDAs Accept Context-Free Languages
Pushdown Automata PDAs
Chapter 7 PUSHDOWN AUTOMATA.
NPDAs Accept Context-Free Languages
Principles of Computing – UFCFA3-30-1
Intro to Data Structures
Finite Automata.
CSE 105 theory of computation
MA/CSSE 474 Theory of Computation Nondeterminism NFSMs.
Pushdown automata Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section
… NPDAs continued.
Pushdown automata Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section
Pushdown automata The Chinese University of Hong Kong Fall 2011
Pushdown automata Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section
Sub: Theoretical Foundations of Computer Sciences
CSE 105 theory of computation
Pushdown automata Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section
Normal Forms for Context-free Grammars
Pushdown automata Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section
CSE 105 theory of computation
Presentation transcript:

Streaming Tree Transducers Loris D'Antoni University of Pennsylvania Joint work with Rajeev Alur 1

Outline 1.Deterministic bottom-up MSO equivalent model for ranked tree transformations 2.Deterministic left-to-right MSO equivalent model for tree transformations 2

Motivations A tree transducer maps a tree over an input alphabet to a tree over an output alphabet Desirable properties of a class of transducers C – Closure properties: Composition: given T 1, T 2 in C, their composition T 1 oT 2 belongs to C (for free if MSO equivalence); Regular look-ahead: ability to ask question about the remaining input, without needing to read it. – Fast Execution: single pass over the input tree deterministic – Expressiveness: possibly MSO equivalent – Fast algorithms: equivalence, type checking… 3

Example of Transformations Insert/delete nodes Copy a sub-tree K times Swap sub-trees based on some regular pattern – Given an address book, where each entry has a tag that denotes whether the entry is “private” or “public”, sort the address book based on this tag: all private entries should appear before public entries NO actual sorting: – we want to be MSO equivalent 4

Bottom-up Ranked Tree Transducers When processing a tree a(x 1,x 2 ) the transducer – reads the state q i reached by each child x i (while going bottom-up) – reads the symbol a of the current node – Uses the transformations t 1, t 2 computed by the x 1, x 2 to produce a new output – Updates the state to q a t1t1 t2t2 q1q1q1q1 q2q2q2q2 c t1t1 t2t2 q 5

Multiple Variables Needed If the root is labeled with a – compute the identity function, – otherwise replace each a with b and each b with an a a ILIL qLqLqLqL qRqRqRqR a qaqaqaqa SLSL IRIR SRSR ILIL IRIR b SLSL SRSR Each tree must be able to compute more than one possible transformation VARIABLES Each tree must be able to compute more than one possible transformation VARIABLES 6

Holes in Variables Needed 1/3 Tree Swap:swapsub-trees bTree Swap: swap the first two sub-trees with root labeled with a b (in-order traversal) b a c a b t1t1 t2t2 b a c a b t2t2 t1t1 7

Holes in Variables Needed 2/3 i b q i means that so far we saw i top level b y i ib y i contains the i-th b-rooted sub-tree x b x contains the tree processed so far but has i holes in place of the top-level b-rooted sub-trees a q1q1q1q1 q1q1q1q1 a q2q2q2q2 y2Ly2L y1Ly1L xLxL y2Ry2R y1Ry1R xRxR xLxL xRxR y1Ly1L y1Ry1R 8

Holes in Variables Needed 2/3 i b q i means that so far we saw i top level b y i ib y i contains the i-th b-rooted sub-tree x holesb x contains the tree processed so far but has i holes in place of the top-level b-rooted sub-trees b q1q1q1q1 q1q1q1q1 b y2Ly2L y1Ly1L xLxL y2Ry2R y1Ry1R xRxR xLxL xRxR ε ? y1Ly1L y1Ry1R Hole Empty tree b q1q1q1q1 9

Conflict Relation 1/3 Recursive swap: – f(a(x,y)) = a(f(y),f(x)) – f(b(x,y)) = b(x,y) Easy to compute top-down Bottom-up it needs two variables 10

Conflict Relation 2/3 a ILIL q q a qaqaqaqa SLSL IRIR SRSR a ILIL IRIR SLSL SRSR Two variables – I computes the identity: case in which we have not hit the last b yet – S computes the swap: case in which we have hit the last b f(a(x,y)) = a(f(y),f(x)) f(b(x,y)) = b(x,y) f(a(x,y)) = a(f(y),f(x)) f(b(x,y)) = b(x,y) 11

Conflict Relation 3/3 The variable I is used twice – This could cause the output tree to be of exponential size in the size of the input tree (NO MSO) – We need the ability of copying but we need to limit it – INTUITION: only one of the two trees we are computing will appear in the final output (will explain later) b ILIL q q b qaqaqaqa SLSL IRIR SRSR b ILIL IRIR ILIL IRIR f(a(x,y)) = a(f(y),f(x)) f(b(x,y)) = b(x,y) f(a(x,y)) = a(f(y),f(x)) f(b(x,y)) = b(x,y) 12

Streaming Tree Transducers: Design Principles Execution: single left-to-right pass in linear time Key to expressiveness: – multiple variables – variables can be stored on stack – explicit way of combining sub-trees in the assignments of variables (hole substitution) Key to analyzability: – single-use restricted updates – write-only output – Can compute multiple possible partial outputs 13

Streaming Tree Transducers 1/3 The input and output trees are represented as nested words Each node is represented by an open tag This requires a stack to model the current depth in the input tree (pushdown machine) Enables uniform representation of string, ranked trees, unranked trees, and forests cb a a> a> 14

Streaming Tree Transducers 2/3 STT from Σ to Γ: Q : set of states P : set of stack states X : set of variables ~ : conflict relation over X Variables can contain a hole ? δ : transition function. Updates state when reading input symbol in a given state U : variable update function. Updates variable values when reading an input symbol in a given state. O : output function for combining variables and producing final output The limited form of coyping 15

Streaming Tree transducers 3/3 Transition function δ: Open Tags: – δ(q,<a) → (q',p) (push state p on the stack) – x := ? – x p := (x stored on the stack as x p ) Close tags: – δ(q,a>,p) → q’ – x := (x p popped from the stack) Internal: – δ(q,a) → q’ – x := 16

The Conflict Relation We want to be able to express the assignment However x and y must not be combined later we can create an output of size exponential in the input SOLUTION: Conflict relation: x ~ y x and y can never appear on the RHS of the same variable assignment Example: z:=a(x,y) is not allowed X:=X Y:=X X:=X Y:=X 17

STT Properties MSO equivalent (closure under composition and regular lookahead) Output computed in single left-to-right linear time pass over the input Functional equivalence decidable in NExpTime: – compute a exponential size PDA over {0,1} that accepts a string with same number of 0s and 1s iff two STTs are not equivalent. Use Parikh Image Type checking decidable in ExpTime: – given two tree language I and O and an STT S check whether S(I) is included in O 18

Loris D’Antoni University of Pennsylvania Thank you! Questions? 19