
1 Towards Automatic Document Migration: Semantic Preservation of Embedded Queries
Thomas Triebsees, University of the German Federal Armed Forces Munich, Department of Computer Science
Thomas.Triebsees@unibw.de
Winnipeg, 31st August 2007

2 Agenda
I. Research Context and Motivation
II. Our Approach
  1. Property Specification and Tracing
  2. Automated Query Evaluation and Construction
III. Results
IV. Conclusions

3 I. Research Context and Motivation

4 Research Context
Task: Semantic preservation
- high degree of process reliability necessary (trustworthiness)
- amount of documents requires automation
- document representations (formats) change
- still: most QA is hand-crafted

5 Example Property – Link Consistency
Aim: improve portability
[Diagram: the website on 137.193.60.82 (labels: Website, Calculation, source, calc05, calc.pdf, start.html, style.css) is harvested from the WWW and stored as "Calculation documents" on 137.193.60.99.]

6 Example Property – Link Consistency
[Diagram: the harvested website on 137.193.60.82 (labels: Website, Calculation, source, calc05, calc.pdf, start.html, style.css) next to the stored "Calculation documents" on 137.193.60.99 (labels: Calculation, html, index.html, resources, calc05, calc.pdf).]
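
The link-consistency example can be read as a small path computation. The sketch below is only an illustration and not the automaton-based method of the talk: it uses Python's standard `posixpath` module to turn an absolute link into a relative one and to check that the new link still resolves to the migrated target. The source URLs and archive paths in `MIGRATION` are assumptions made for the example; the slides only show the host addresses and file names.

```python
import posixpath

# Hypothetical migration map (source URL -> path in the archive).
# The concrete URLs and paths are assumed; the slides do not spell them out.
MIGRATION = {
    "http://137.193.60.82/Website/Calculation/start.html":
        "/Calculation/html/index.html",
    "http://137.193.60.82/Website/Calculation/source/calc05/calc.pdf":
        "/Calculation/resources/calc05/calc.pdf",
}


def relative_link(source_url, target_url, migration=MIGRATION):
    """Build a relative link between the migrated source and target."""
    new_source = migration[source_url]
    new_target = migration[target_url]
    return posixpath.relpath(new_target, start=posixpath.dirname(new_source))


def resolve(document_path, link):
    """Evaluate a relative link against the document that contains it."""
    base = posixpath.dirname(document_path)
    return posixpath.normpath(posixpath.join(base, link))


def link_is_consistent(source_url, target_url, link, migration=MIGRATION):
    """The link is consistent if it resolves to the migrated target."""
    return resolve(migration[source_url], link) == migration[target_url]
```

Because the constructed link is relative, it stays valid when the whole archive is moved to another host, which is the portability aim named on the slide.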

7 Semantic Queries
Queries embedded in documents. Formalize semantic preservation:
- evaluation
- construction?
Examples:
- URLs query the server/directory structure
- style sheets (CSS) query XML/HTML documents
- XPath expressions query XML documents
- …
[Diagram: "Calculation documents" on 137.193.60.99; labels: Calculation, html, index.html, resources, calc05, calc.pdf, style.css]

8 II. Our Approach – Semantic Evaluation and Construction of Embedded Queries

9 Our Approach
(1) What are the relevant properties? What are the different representation forms?
(2) What is to be preserved?
(3) Implement transformation: notify the system of transformation steps.
(4) Trace relevant object histories. Verify preservation requirements w.r.t. source and target objects.
[Diagram labels: migration process, source documents, target documents, property specifications, preservation requirements, Framework, tracing, property matching, automated verification, notification]

10 (1) Property Specification
- define role names for the property
- assign roles in different implementations
[Diagram: concept + interface LinksTo with roles link_source, link_anchor, link_target; contexts LinkAbs and LinkRel; the website on 137.193.60.82 and the stored "Calculation documents" on 137.193.60.99 as the two implementations]

11 (2) Expressing Preservation Requirements
Requirement: When transforming a website, translate all absolute links to relative links while preserving link consistency.
Expressed semi-formally using concepts and contexts: When transforming a link source, a link anchor, and a link target to a new representation, preserve the concept LinksTo for these objects in the context LinkRel.
Expressed formally:
pres_K({s → link_source, a → link_anchor, t → link_target}, LinksTo(s, a, t), {LinkAbs, LinkRel}, {LinkRel})
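
The formal requirement has four parts: a role assignment, a concept, a set of source contexts, and a set of target contexts. A minimal sketch of how such a requirement could be represented and checked is given below. The class name, the `holds` evaluator, and the `trace` mapping are hypothetical, and the checking rule (if the concept holds in any source context, it must hold in every target context for the traced objects) is an assumed reading of the slide, not the framework's definition.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


@dataclass(frozen=True)
class PreservationRequirement:
    roles: Dict[str, str]            # variable -> role name
    concept: str                     # concept to preserve, e.g. "LinksTo"
    source_contexts: FrozenSet[str]  # contexts the source may be in
    target_contexts: FrozenSet[str]  # contexts required after migration


# The requirement from the slide, written in this assumed representation.
LINK_REQUIREMENT = PreservationRequirement(
    roles={"s": "link_source", "a": "link_anchor", "t": "link_target"},
    concept="LinksTo",
    source_contexts=frozenset({"LinkAbs", "LinkRel"}),
    target_contexts=frozenset({"LinkRel"}),
)


def is_preserved(
    req: PreservationRequirement,
    holds: Callable[[str, str, Dict[str, str]], bool],
    source_binding: Dict[str, str],
    trace: Dict[str, str],
) -> bool:
    """Check one binding of source objects against the requirement.

    holds(context, concept, binding) is a hypothetical evaluator;
    trace maps each source object to its migrated counterpart.
    """
    held_before = any(
        holds(ctx, req.concept, source_binding) for ctx in req.source_contexts
    )
    if not held_before:
        return True  # nothing to preserve for this binding
    target_binding = {var: trace[obj] for var, obj in source_binding.items()}
    return all(
        holds(ctx, req.concept, target_binding) for ctx in req.target_contexts
    )
```

The `trace` argument stands in for the object histories that the framework records during migration.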

12 (3) Tracing Semantic Properties - Preservation
pres_K({s → link_source, a → link_anchor, t → link_target}, LinksTo(s, a, t), {LinkAbs, LinkRel}, {LinkRel})
[Diagram: concept LinksTo with roles link_source, link_anchor, link_target, traced from the website on 137.193.60.82 (context LinkAbs) to the stored "Calculation documents" on 137.193.60.99 (context LinkRel)]

13 Preservation of Embedded Queries
Integrating embedded queries
Targets: Semantic preservation of link consistency
- links can be evaluated semantically
- only valid URLs are accepted as links
- links can be constructed automatically
- only valid URLs are constructed
- constructions allow for formal proofs w.r.t. the preservation requirement
Tools:
- Automata Theory (Finite State Automata, FSA)
- Graph Theory
Steps:
(1) Formalize queried structure for link evaluation and construction
(2) Formalize syntactically valid URLs
(3) Combine both
Can be generalized to other applications

14 Specification of Queried Structure
(1) Formalize queried structure
- vertices (objects) yield query semantics
- labels carry URL substrings
- generate finite state automaton
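
The three points above can be sketched as follows: a directory tree is turned into a labelled graph whose vertices are the stored objects and whose edge labels are URL path segments, and that graph is then run as a finite state automaton. This is a minimal sketch under assumptions: the layout of `TREE` reuses names from the slides but its exact shape is assumed, and the function names and the `".."` back-edges are choices made for this example.

```python
# Assumed layout of the stored documents; None marks a file.
TREE = {
    "Calculation": {
        "html": {"index.html": None},
        "resources": {"calc05": {"calc.pdf": None}, "style.css": None},
    }
}


def build_structure_automaton(tree, root="/"):
    """Return transitions {(state, label): state} for a directory tree.

    States are object paths; labels are URL path segments. Each
    directory also gets a ".." transition back to its parent.
    """
    transitions = {}

    def visit(node, state):
        for name, child in node.items():
            child_state = state.rstrip("/") + "/" + name
            transitions[(state, name)] = child_state
            if isinstance(child, dict):  # directories can be left via ".."
                transitions[(child_state, "..")] = state
                visit(child, child_state)

    visit(tree, root)
    return transitions


def run(transitions, start, labels):
    """Follow the labels from start; return the reached object or None."""
    state = start
    for label in labels:
        state = transitions.get((state, label))
        if state is None:
            return None
    return state
```

The state reached by `run` is the object a query denotes, which is how the vertices yield the query semantics.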

15 Specification of Queried Structure

16 Specification of Syntactically Valid URLs
(2) Formalize syntactically valid URLs
- reduce URI-reference grammar
- construct query automaton
[Figure: grammar for URI-references]
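
To make the idea of a reduced grammar concrete, the sketch below accepts only relative path references of the form `segment ( "/" segment )*` with an optional trailing slash. This is a far smaller language than the full URI-reference grammar and is an assumption of this example, not the reduction used in the talk: schemes, authorities, queries, fragments, and percent-encoding are all left out, and segments are limited to the unreserved characters of RFC 3986.

```python
import re

# Segment characters: the "unreserved" set of RFC 3986 only (a simplification).
SEGMENT = re.compile(r"[A-Za-z0-9._~-]+")

# Query automaton over two token classes, "segment" and "slash".
TRANSITIONS = {
    ("start", "segment"): "in_segment",
    ("in_segment", "slash"): "after_slash",
    ("after_slash", "segment"): "in_segment",
}
ACCEPTING = {"in_segment", "after_slash"}


def classify(token):
    """Map a token to its class, or None if it is not allowed."""
    if token == "/":
        return "slash"
    if SEGMENT.fullmatch(token):
        return "segment"
    return None


def accepts(reference):
    """Return True if the reference is in the reduced language."""
    state = "start"
    for token in re.findall(r"/|[^/]+", reference):
        state = TRANSITIONS.get((state, classify(token)))
        if state is None:
            return False
    return state in ACCEPTING
```

The `segment` transitions are the ones that have to be matched against the queried structure when the two automata are combined.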

17 Specification of Syntactically Valid URLs
Construction of query automaton

18 Combine both – Full link automaton
(3) Combine both
- basically: let both automata run in parallel
- match non-terminal transitions of the URL automaton with appropriate transitions of the structure automaton
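
Running both automata in parallel can be sketched as a single loop that advances a syntax state and a structure state together: a reference is accepted only if it is well formed and every segment matches an edge of the structure. The sketch also constructs a link by breadth-first search over the structure. The transition table, the state names, and the reduced URL syntax are assumptions of this example, not the construction from the talk.

```python
import re
from collections import deque

SEGMENT = re.compile(r"[A-Za-z0-9._~-]+")

# Assumed structure automaton: (object, segment) -> object.
STRUCTURE = {
    ("/", "Calculation"): "/Calculation",
    ("/Calculation", "html"): "/Calculation/html",
    ("/Calculation/html", ".."): "/Calculation",
    ("/Calculation/html", "index.html"): "/Calculation/html/index.html",
    ("/Calculation", "resources"): "/Calculation/resources",
    ("/Calculation/resources", ".."): "/Calculation",
    ("/Calculation/resources", "calc05"): "/Calculation/resources/calc05",
    ("/Calculation/resources/calc05", ".."): "/Calculation/resources",
    ("/Calculation/resources/calc05", "calc.pdf"):
        "/Calculation/resources/calc05/calc.pdf",
}


def evaluate(reference, start, structure=STRUCTURE):
    """Run syntax and structure automata in lockstep.

    Returns the object the reference denotes, or None if the
    reference is malformed or does not match the structure.
    """
    url_state, obj = "start", start
    for token in re.findall(r"/|[^/]+", reference):
        if token == "/":
            if url_state != "segment":
                return None  # leading or doubled slash
            url_state = "slash"
        else:
            if not SEGMENT.fullmatch(token):
                return None  # character outside the reduced syntax
            obj = structure.get((obj, token))
            if obj is None:
                return None  # no matching edge in the structure
            url_state = "segment"
    return obj if url_state == "segment" else None


def construct(start, target, structure=STRUCTURE):
    """Breadth-first search for a shortest label path from start to target.

    Returns the labels joined by "/" (an empty string if start is the
    target), or None if the target is unreachable.
    """
    queue, seen = deque([(start, [])]), {start}
    while queue:
        state, labels = queue.popleft()
        if state == target:
            return "/".join(labels)
        for (src, label), dst in structure.items():
            if src == state and dst not in seen:
                seen.add(dst)
                queue.append((dst, labels + [label]))
    return None
```

In this sketch every link returned by `construct` for two distinct objects is accepted by `evaluate` and denotes the intended target, which is the round-trip property a preservation proof would rely on.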

19 Integration and Benefit
Evaluation and construction: working, provably correct
[Diagram: concept LinksTo with roles link_source, link_anchor, link_target in the contexts LinkAbs and LinkRel; links are evaluated on the website on 137.193.60.82 and constructed for the stored "Calculation documents" on 137.193.60.99]

20 III. Results

21 [No text recovered for this slide]

22 IV. Conclusions and Outlook

23 Conclusions and Outlook
I. Automated evaluation and construction of embedded queries
II. Based on formal, automata-theoretic constructions -> provable correctness
III. Integration into framework for semantic preservation
IV. Future work:
- Computing structures on demand
- Regular expressions as queries
- Include extensions like CSS or XPath predicates

24 Subject to your questions…
Thomas Triebsees
Universität der Bundeswehr München, Department of Computer Science
www.unibw.de/Thomas.Triebsees
Thomas.Triebsees@unibw.de

