Coalgebra, Automata, and Document Synchronization

Slides:



Advertisements
Similar presentations
Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising.
Advertisements

Pushdown Automata Chapter 12. Recognizing Context-Free Languages Two notions of recognition: (1) Say yes or no, just like with FSMs (2) Say yes or no,
Determinization of Büchi Automata
Pushdown Automata Part II: PDAs and CFG Chapter 12.
1 Introduction to Computability Theory Lecture12: Decidable Languages Prof. Amos Israeli.
Introduction to Computability Theory
1 Introduction to Computability Theory Lecture7: PushDown Automata (Part 1) Prof. Amos Israeli.
Introduction to Computability Theory
1 Introduction to Computability Theory Discussion1: Non-Deterministic Finite Automatons Prof. Amos Israeli.
CS5371 Theory of Computation
61 Nondeterminism and Nodeterministic Automata. 62 The computational machine models that we learned in the class are deterministic in the sense that the.
Lecture 3 Goals: Formal definition of NFA, acceptance of a string by an NFA, computation tree associated with a string. Algorithm to convert an NFA to.
Validating Streaming XML Documents Luc Segoufin & Victor Vianu Presented by Harel Paz.
Foundations of (Theoretical) Computer Science Chapter 2 Lecture Notes (Section 2.2: Pushdown Automata) Prof. Karen Daniels, Fall 2009 with acknowledgement.
January 14, 2015CS21 Lecture 51 CS21 Decidability and Tractability Lecture 5 January 14, 2015.
Lecture 3 Goals: Formal definition of NFA, acceptance of a string by an NFA, computation tree associated with a string. Algorithm to convert an NFA to.
CS5371 Theory of Computation Lecture 4: Automata Theory II (DFA = NFA, Regular Language)
CS5371 Theory of Computation Lecture 8: Automata Theory VI (PDA, PDA = CFG)
FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY
Regular Expressions (RE) Empty set Φ A RE denotes the empty set Empty string λ A RE denotes the set {λ} Symbol a A RE denotes the set {a} Alternation M.
CSCI 2670 Introduction to Theory of Computing September 21, 2005.
DECIDABILITY OF PRESBURGER ARITHMETIC USING FINITE AUTOMATA Presented by : Shubha Jain Reference : Paper by Alexandre Boudet and Hubert Comon.
Theory of Computation, Feodor F. Dragan, Kent State University 1 Regular expressions: definition An algebraic equivalent to finite automata. We can build.
Pushdown Automata CS 130: Theory of Computation HMU textbook, Chap 6.
Pushdown Automata (PDAs)
XML Data Management 10. Deterministic DTDs and Schemas Werner Nutt.
Lecture 05: Theory of Automata:08 Kleene’s Theorem and NFA.
Managing XML and Semistructured Data Lecture 13: XDuce and Regular Tree Languages Prof. Dan Suciu Spring 2001.
2. Regular Expressions and Automata 2007 년 3 월 31 일 인공지능 연구실 이경택 Text: Speech and Language Processing Page.33 ~ 56.
Pushdown Automata Chapters Generators vs. Recognizers For Regular Languages: –regular expressions are generators –FAs are recognizers For Context-free.
Copyright © Curt Hill Finite State Automata Again This Time No Output.
Chapter 3 Regular Expressions, Nondeterminism, and Kleene’s Theorem Copyright © 2011 The McGraw-Hill Companies, Inc. Permission required for reproduction.
Recognising Languages We will tackle the problem of defining languages by considering how we could recognise them. Problem: Is there a method of recognising.
Foundations of (Theoretical) Computer Science Chapter 2 Lecture Notes (Section 2.2: Pushdown Automata) Prof. Karen Daniels, Fall 2010 with acknowledgement.
UNIT - I Formal Language and Regular Expressions: Languages Definition regular expressions Regular sets identity rules. Finite Automata: DFA NFA NFA with.
CSCI 3130: Formal languages and automata theory Tutorial 1 Lee Chin Ho.
Lecture Notes 
Donghyun (David) Kim Department of Mathematics and Physics North Carolina Central University 1 Chapter 2 Context-Free Languages Some slides are in courtesy.
Pushdown Automata Hopcroft, Motawi, Ullman, Chap 6.
Tree Automata First: A reminder on Automata on words Typing semistructured data.
Pushdown Automata Chapter 12. Recognizing Context-Free Languages Two notions of recognition: (1) Say yes or no, just like with FSMs (2) Say yes or no,
Theory of Computation Automata Theory Dr. Ayman Srour.
1 Chapter Pushdown Automata. 2 Section 12.2 Pushdown Automata A pushdown automaton (PDA) is a finite automaton with a stack that has stack operations.
1/29/02CSE460 - MSU1 Nondeterminism-NFA Section 4.1 of Martin Textbook CSE460 – Computability & Formal Language Theory Comp. Science & Engineering Michigan.
Theory of Computation Automata Theory Dr. Ayman Srour.
WELCOME TO A JOURNEY TO CS419 Dr. Hussien Sharaf Dr. Mohammad Nassef Department of Computer Science, Faculty of Computers and Information, Cairo University.
Context-free grammars
Pushdown Automata.
Theory of Computation Pushdown Automata pda Lecture #10.
Lexical analysis Finite Automata
Deterministic FA/ PDA Sequential Machine Theory Prof. K. J. Hintz
FORMAL LANGUAGES AND AUTOMATA THEORY
CIS Automata and Formal Languages – Pei Wang
Push-down Automata Section 3.3 Wed, Oct 27, 2004.
Chapter 7 PUSHDOWN AUTOMATA.
Two issues in lexical analysis
Chapter 2 FINITE AUTOMATA.
Context-Free Languages
Alternating tree Automata and Parity games
4. Properties of Regular Languages
Nondeterministic Finite Automata
Context-Free Grammars
Chapter Nine: Advanced Topics in Regular Languages
Finite Automata Reading: Chapter 2.
4b Lexical analysis Finite Automata
Discrete Math II Howon Kim
Instructor: Aaron Roth
CSCI 2670 Introduction to Theory of Computing
CSCI 2670 Introduction to Theory of Computing
What is it? The term "Automata" is derived from the Greek word "αὐτόματα" which means "self-acting". An automaton (Automata in plural) is an abstract self-propelled.
Presentation transcript:

Coalgebra, Automata, and Document Synchronization Kazuhiro Inaba Reading “Merging Hierarchically-Structured Documents in Workflow Systems” [E. Badouel and M. T. Tchendji, CMCS’08] IPLAS 2009/06/09

((All pictures are cited from the authors presentation slide: http://old-www.cwi.nl/projects/cmcs08/slides/index.html thanks.))

≪Problem≫ Partial views of a (big) document Concurrently updated How to merge them? Anyhow, the problem the authors want to address is this. There is a big document tree, and many people are working on its partial views concurrently. The problem is, how to merge them consitently?

Approach of this paper Use coalgebra (= tree automaton)! Solution is, to use a certain kind of coalgebra, namely, tree automaton,

Represented by an automaton doc u2 u1 view2 view1 Represented by an automaton Set of all doc’s s.t. view2(doc’) = u2 Set of all doc’s s.t. view1(doc’) = u1 Here is the one picture summary of the work. We have an original document ‘doc’, two view functions view1 and view2, and viewed document u1 and u2. They at updated independently To merge them, in the coalgebraic approach, we first compute the set of inverse images. They can be infinite inter- section !!

Basic Notions Document View Grammar Projection Expansion Now let us go into the details. First I introduce some notions, in other words, I’ll exapling how the documents or views are modeled in this work.

(Untyped) Document Let S be a finite alphabet E.g. S = {A,B,C} A “document” (over S) is an unranked tree over S First, a document they consider is a tree, an unranked tree. Unranked means that the nodes with same label may have different number of children

(Untyped) View A “view” is a subset of S The “projection associated with a view” is the function (of type: tree  forest) that erases all symbols not in the view View on a document is defined very simply.

(Untyped) Expansion The “expansion associated with a view” is the function (of type: forest  2tree) which is the inverse of the projection Note: the output set is regular!  they’re represented as a tree automaton How precisely? Wait a moment…

Grammar In the paper, the authors consider only “typed” documents. Typed = Conformance to a grammar A “Grammar” over S is a triple (S , A, P): A ∈ S axiom (initial symbol) P ⊆ S ×S * set of productions

(Typed) Document A grammar G = (S , A, P) corresponds to a set of trees called “derivation trees”, defined for each X∈S as follows: Der(G, X) = {X(t1,…, tn) | ∃XX1…Xn ∈ P: ∀i: ti∈Der(G,Xi)} The members of Der(G, A) are the document confirming the grammar G From now on, we deal with such docs only

Example of a Grammar & a Doc.

(Typed) View Same as before “expansion” takes two more parameters A “view” ⊆ S, “projection” is an erasure “expansion” takes two more parameters G = (S, A, P) : Grammar X ∈ S : Axiom expansion(V, ts, G, X) returns the set of trees in Der(G,X) whose projection with V are equal to ts

Example of an Expansion

Top-Down Nodeterministic Tree Automata Tree Automaton A is a tuple <S, Q, δ>: S : node labels Q : set of states δ : Q  2S ×Q* : transition relation Grammars are straightforwardly converted to tree automata: G = (S , A, P) ~~> A = (S, Q, δ) where Q = S δ( q ) = {(q, q1 … qn) | qq1…qn ∈ P}

Example A = ({A,B}, {q1,q2}, δ) q1 q2 q2 q1 q2 q2 δ(q1) = { (A, q1q2), (A, q2q2) } δ(q2) = { (B, ), (B, q1) } q1 This tree is in the set of trees repr’d by (A, q1)! A A q2 q2 B B B A q1 A B B This tree is not! (no possible assignment) q2 q2 B B

Representing Expansions by TA Recall: expansion(V, ts, G, X) returns the set of trees in Der(G,X) whose projection by V are equal to ts. V ⊆ S G = (S , P)

Input: V, G=(S, P), ts Output: the corresponding automaton Corresponding automaton is A = (S, Q, δ): Q = S ×T T is the list of subtrees of ts δ( <s, t> ) = Φ if s∈V and t≠[s(…)] { (s, <s1, t1><s2, t2>…<sn, tn>) | ss1…sn ∈ P, t1’ … tn’ = t’ } if s ∈V and t=[s(t’)] or s ∉ V and t=t’

Proposition (Correctness) Member of expansion(V, ts, G, X) iff Accepted by the automaton A from the state <X, ts>

Example S = {A,B,C} G = (S, A, P) where P is: V = {A, B} ts =

δ( <A, A(A,B(A,A))> ) = { (A, <C, A,B(A,A))><B,ε>), (A, <C, A><B, B(A,A))>), (A, <C, ε><B, A,B(A,A)) >) } δ( <C, A,B(A,A))> ) = { (C, <A, A,B(A,A)><C, ε>), (C, <A, A><C, B(A,A)>), (C, <A, ε><C, A,B(A,A)>), (C, <C, A,B(A,A)><C, ε>), (C, <C, A><C, B(A,A)>), (C, <C, ε><C, A,B(A,A)>) } …

Now, On Tree Automata… We can compute Emptiness of the represented set Coinductively Intersection between two sets By Product Construction (Q, δ) ∩ (P, γ) = (Q×P, β) where β( (q, p) ) = { (σ, (q1,p1)…(qn,pn)) | (σ,q1…qn)∈δ(q) and (σ,p1…pn)∈γ(p) }

Algorithm: “Coherence” check Given Grammar G = (S, A, P) View V1 and Forest ts1 View V2 and Forest ts2 Are these two views coherent? (i.e., can we “merge” them into a single document that generates the two views simultaneously?) expansion(V1,ts1,G,A) ∩ expansion(V2,ts2,G,A) ≠ Φ?

Algorithm: “Synchronization” Given Grammar G = (S, A, P) View V1 and Forest ts1 View V2 and Forest ts2 How can we get the merged document?  If is a singleton set, that’s it! Otherwise, ambiguous  error expansion(V1,ts1,G,A) ∩ expansion(V2,ts2,G,A)

Singleton check (Not in the paper…) Step 1 (Cleaning): A ~~> Acl Eliminate all “failure” states and transitions Step 2 (Thinning): Acl ~~> Acl,th Determinize the automaton by dropping nondeterministic rules Step 3 (Equivalence Check) Acl =? Acl,th Check Bisimilarity

Example 1 ADDRESSBOOK  @ @  ε | PERSON @ PERSON  NAME ADDRESS TEL V1 = {NAME} V2 = {TEL}

Example of a document: ADDRESSBOOK[@[ PERSON[ NAME[…] ADDRESS[…] TEL[…] ]@[ ]@[]]]] expansion(V1, NAME[…]NAME[…]) is consistent with expansion(V2, TEL[…]TEL[…]) but not with expansion(V2, TEL[…]TEL[…]TEL[…])

Example 2 ADDRESSBOOK  @ @  ε | PERSON @ PERSON  NAME ADDRESS TEL # V1 = {NAME} V2 = {TEL}

Example of a document: ADDRESSBOOK[@[ PERSON[ NAME[…] ADDRESS[…] TEL[…] #[] ]@[ NAME[…] ADDRESS[…] TEL[…] #[TEL[…] #[]] ]@[]]]] expansion(V1, NAME[…]NAME[…]) is consistent with expansion(V2, TEL[…]TEL[…]) and also with expansion(V2, TEL[…]TEL[…]TEL[…])

Ambiguity Example of a document: ADDRESSBOOK[@[ PERSON[ NAME[…] ADDRESS[…] #[TEL[…] #[]] ]@[ NAME[…] ADDRESS[…] #[TEL[…] #[TEL[…] #[]] ]@[]]]] Ambiguity in the synchronization of expansion(V1, NAME[…]NAME[…]) with expansion(V2, TEL[…]TEL[…]TEL[…])

How to resolve ambiguity Be more careful in choosing views Set of views that “covers” the whole structure Such as V1 = {NAME} V2 = {PERSON, TEL} or V1 = {PERSON, NAME} V2 = {TEL, #} etc.

“Static” Singleton Checking? How can the merging system support users to choose appropriate views? For example, Given a set of views, can we check whether they cause ambiguity or not?  Undecidable Problem (reduces to the ambiguity of CFG) (No clear solution is given in the paper…)

Summary Synchronization of multiple (edited) views can be computed by using tree automaton Compute inverse-image of the view Compute intersection, emptiness, & singularity Further topic (in the paper, but not in this presentation) “To-be-written” node For each symbol X in the grammar, add X and Xε Synchronizer allows X in u2 to occur at X position in u1, etc…

Appendix: Automata as Coalgebra J.J.M.M. Rutten, “Automata and Coinduction (An Exercise in Coalgebra)”, CONCUR 1998 The classical theory of deterministic automata is presented in terms of the notions of homomorphism and bisimulation, which are the cornerstones of the theory of (universal) coalgebra. This leads to a transparent and uniform presentation of automata theory and yields some new insights, amongst which coinduction proof methods for language equality and language inclusion. At the same time, the present treatment of automata theory may serve as an introduction to coalgebra.