Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 Web Data Management Path Expressions. 2 In this lecture Path expressions Regular path expressions Evaluation techniques Resources: Data on the Web Abiteboul,

Similar presentations


Presentation on theme: "1 Web Data Management Path Expressions. 2 In this lecture Path expressions Regular path expressions Evaluation techniques Resources: Data on the Web Abiteboul,"— Presentation transcript:

1 1 Web Data Management Path Expressions

2 2 In this lecture Path expressions Regular path expressions Evaluation techniques Resources: Data on the Web Abiteboul, Buneman, Suciu : section 4.1

3 3 Path Expressions Examples: Bib.paper Bib.book.publisher Bib.paper.author.lastname Given an OEM instance, the answer of a path expression p is a set of objects

4 4 Path Expressions Examples: DB = &o1 &o12&o24&o29 &o43 &o70&o71 &96 &243 &206 &25 “Serge” “Abiteboul” 1997 “Victor” “Vianu” 122133 paper book paper references author title year http author title publisher author title page firstname lastname firstnamelastnamefirst last Bib &o44&o45&o46 &o47&o48 &o49 &o50 &o51 &o52 Bib.paper={&o12,&o29} Bib.book.publisher={&o51} Bib.paper.author.lastname={&o71,&206} Bib.paper={&o12,&o29} Bib.book.publisher={&o51} Bib.paper.author.lastname={&o71,&206}

5 5 Answer of a Path Expression Simple evaluation algorithms for Answer(P,DB): Runs in PTIME in size(P), size(db): –PTIME complexity Answer(P, DB) = f(P, root(DB)) Where: f( , x) = {x} f(L.P, x) =  {f(P,y) |  (x,L,y)  edges(DB)} Answer(P, DB) = f(P, root(DB)) Where: f( , x) = {x} f(L.P, x) =  {f(P,y) |  (x,L,y)  edges(DB)}

6 6 Regular Path Expressions R ::= label | _ | R.R | (R|R) | R* | R+ | R? Examples: Bib.(paper|book).author Bib.book.author.lastname? Bib.book.(references)*.author Bib.(_)*.zip

7 7 Applications of Regular Path Expressions Navigating uncertain structure: –Bib.book.author.lastname? Syntactic substitution for inheritance: –Bib.(paper|book).author –Better: Bib.publication.author, but we don’t have inheritance

8 8 Applications of Regular Path Expressions Computing transitive closure: –Bib.(_)*.zip = everything accessible –Bib.book.(references)*.author = everything accessible via references Some regular expressions of doubtful practical use: –(references.references)* = a path with an even number of references –(_._)* = paths of even length –(_._._.(_)?)* = paths of length (3m + 4n) for some m,n But make great examples for illustration

9 9 Answer of a Regular Path Expression Recall: –Lang(R) = the set of words P generated by R Answer of regular path expressions: –Answer(R,DB) =  {Answer(P,DB) | P  Lang(R)} Need an evaluation algorithm that copes with cycles

10 10 Regular Path Expressions Recall: each regular expression  NDFA Example: R = (a.a)*.a.b A = s1s2 s3 s4 a a a b states(A) = {s1,s2,s3,s4} initial(A) = s1 terminal(A) = {s4}

11 11 Regular Path Expressions Canonical Evaluation Algorithm Answer(R,DB): 1.construct A from R 2.construct product automaton G = A x DB: –nodes(G) = states(A) x nodes(db) –edges(G) = {((s,x),L,(s’,x’) | (s,L,s’)  edges(A), (x,L,x’)  edges(DB)} –root(G) = (initial(A), root(DB)) 3.compute G acc = set of nodes accessible from root(G) 4.return {x | s  terminal(A) s.t. (s,x)  G acc } Answer(R,DB): 1.construct A from R 2.construct product automaton G = A x DB: –nodes(G) = states(A) x nodes(db) –edges(G) = {((s,x),L,(s’,x’) | (s,L,s’)  edges(A), (x,L,x’)  edges(DB)} –root(G) = (initial(A), root(DB)) 3.compute G acc = set of nodes accessible from root(G) 4.return {x | s  terminal(A) s.t. (s,x)  G acc }

12 12 Regular Path Expressions Example: R = _.(_._)*.a A = DB = &o1 &o2 &o3&o4 a a b a s1s2 s3 _ _ a Answer of R on DB = { &o2, &o3}

13 13 Compute Product Automaton G s1,&o1 s1,&o2 s1,&o3s1,&o4 a a b a s2,&o1 s2,&o2 s2,&o3s2,&o4 a a b a s3,&o1 s3,&o2 s3,&o3s3,&o4 a a b a _ _ a

14 14 Compute Accessible Part G acc s1,&o1 s1,&o2 s1,&o3s1,&o4 a a b a s2,&o1 s2,&o2 s2,&o3s2,&o4 a a b a s3,&o1 s3,&o2 s3,&o3s3,&o4 a a b a _ _ a Answer(R,DB) = {&o2, &o3}

15 15 Complexity of Regular Path Expressions The evaluation algorithm runs in PTIME in size(R), size(DB) Even when there are cycles in DB


Download ppt "1 Web Data Management Path Expressions. 2 In this lecture Path expressions Regular path expressions Evaluation techniques Resources: Data on the Web Abiteboul,"

Similar presentations


Ads by Google