Semantics for Valid XML Documents Harold Boley Dagstuhl Seminar 01021 Semantics in Databases Jan. 7-12, 2001.


2 Semantics for Valid XML Documents Harold Boley Dagstuhl Seminar 01021 Semantics in Databases Jan. 7-12, 2001

3 Dagstuhl 01021 Semantics for Valid XML Documents 1 Overview
Introduction of valid XML documents to establish a ‘grammar-typed’ syntax for Web data
Survey of some practical (Web) aspects of three semantics:
– Transformational semantics (incl. proof-theoretic semantics)
– Model-theoretic semantics
– Metadata semantics
Study of (Web) applicability and combination of the three semantics
Systems using or implementing such semantics
Running example: address-document processing

4 Address Example: HTML to XML
HTML Markup: Xaver M. Linde Wikingerufer 7 10555 Berlin
XML tags are chosen for content-structuring needs, while not conveying any formal semantics
XML Markup: Xaver M. Linde Wikingerufer 7 10555 Berlin

5 Address Example: XML to XML
XML stylesheets are usable to transform XML elements, e.g., for data interoperation:
XML Markup 1: Xaver M. Linde Wikingerufer 7 10555 Berlin
XML Markup 2: Xaver M. Linde Wikingerufer 7 10555 Berlin

6 Address Example: XML to XML
XML stylesheets are usable to transform XML elements, e.g., for a kind of normalization:
XML Markup 1: Xaver M. Linde Wikingerufer 7 10555 Berlin
XML Markup 2: Xaver M. Linde Wikingerufer 7 10555 Berlin

7 Address Example: XML Queries
XML queries can select subelements of XML elements
XML Markup (element): Xaver M. Linde Wikingerufer 7 10555 Berlin
XML Query (XML-QL): WHERE Xaver M. Linde $s $t CONSTRUCT $s $t
Selected subelements: Wikingerufer 7 10555 Berlin
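
A minimal Python sketch of the same selection, assuming the address markup uses address, name, street, and town elements (tag names taken from the Prolog term and DTD slides). The function name query_address is illustrative only. This is a stand-in for the XML-QL query, not XML-QL itself.

```python
import xml.etree.ElementTree as ET

# Assumed markup of the address element (tag names follow the DTD slide).
ADDRESS_XML = """
<address>
  <name>Xaver M. Linde</name>
  <street>Wikingerufer 7</street>
  <town>10555 Berlin</town>
</address>
"""


def query_address(xml_text, wanted_name):
    """Return (street, town) if the address has the wanted name, else None."""
    address = ET.fromstring(xml_text)
    if address.tag != "address" or address.findtext("name") != wanted_name:
        return None
    # The two selected subelements play the role of $s and $t.
    return address.findtext("street"), address.findtext("town")


result = query_address(ADDRESS_XML, "Xaver M. Linde")
```

The WHERE part corresponds to the name test, the CONSTRUCT part to the returned pair.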

8 Address Example: Prolog Queries
Prolog queries can select substructures of Prolog structures
Prolog Term (structure): address( name("Xaver M. Linde"), street("Wikingerufer 7"), town("10555 Berlin") )
Prolog Query: address( name("Xaver M. Linde"), street(S), town(T) )
Selected substructures: S = "Wikingerufer 7" T = "10555 Berlin"
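
How the query binds S and T can be sketched in Python with one-way pattern matching over nested tuples. This is a simplification of Prolog unification, since variables occur only in the query. The Var class and the match function are hypothetical helpers for this sketch.

```python
class Var:
    """A query variable such as S or T."""

    def __init__(self, name):
        self.name = name


def match(pattern, term, bindings):
    """Match a pattern against a ground term, recording variable bindings."""
    if isinstance(pattern, Var):
        if pattern.name in bindings:
            return bindings[pattern.name] == term
        bindings[pattern.name] = term
        return True
    if isinstance(pattern, tuple) and isinstance(term, tuple):
        return len(pattern) == len(term) and all(
            match(p, t, bindings) for p, t in zip(pattern, term)
        )
    return pattern == term


# Structures are tuples whose first item is the functor name.
term = ("address", ("name", "Xaver M. Linde"),
        ("street", "Wikingerufer 7"), ("town", "10555 Berlin"))
query = ("address", ("name", "Xaver M. Linde"),
         ("street", Var("S")), ("town", Var("T")))

bindings = {}
matched = match(query, term, bindings)
```

After the call, bindings holds the selected substructures under the keys "S" and "T".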

9 Address Example: The Element Tree
Prolog Term (structure with substructures): address( name("Xaver M. Linde"), street("Wikingerufer 7"), town("10555 Berlin") )
XML Markup (element with subelements): Xaver M. Linde Wikingerufer 7 10555 Berlin
Node-Labeled, (Left-to-Right-)Ordered Element Tree (tree with subtrees): root address with subtrees name (Xaver M. Linde), street (Wikingerufer 7), and town (10555 Berlin)
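
The correspondence between element tree and Prolog term can be sketched as a short recursive walk. The XML tag names are assumed from the tree's node labels, and to_term is an illustrative name. The sketch ignores attributes and mixed content.

```python
import xml.etree.ElementTree as ET


def to_term(element):
    """Render an element tree as a Prolog-style term string."""
    children = list(element)
    if children:
        # Inner node: subelements become the arguments, in document order.
        args = ", ".join(to_term(child) for child in children)
    else:
        # Leaf node: the character data becomes a quoted string argument.
        args = '"' + (element.text or "").strip() + '"'
    return element.tag + "(" + args + ")"


tree = ET.fromstring(
    "<address><name>Xaver M. Linde</name>"
    "<street>Wikingerufer 7</street><town>10555 Berlin</town></address>"
)
term = to_term(tree)
```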

10 Address Example: Document Type Definition and Tree (1)
Document Type Definition (DTD), shown in Extended Backus-Naur Form (EBNF):
address ::= name street town
name ::= PCDATA
street ::= PCDATA
town ::= PCDATA
Document Type Tree: root address with children name, street, and town, each containing PCDATA

11 Address Example: Document Type Definition and Tree (2)
Document Type Definition (DTD) and Document Type Tree: root address with children name and place; place has children street and town; name, street, and town each contain PCDATA
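
A rough Python sketch of checking a document against the two document types. The dictionaries FLAT and NESTED encode the grammars as read off the two trees, and is_valid is a hypothetical helper. Only fixed child sequences and PCDATA leaves are handled, which is far less than a real DTD validator covers.

```python
import xml.etree.ElementTree as ET

# Content models: a list is a fixed child sequence, "PCDATA" is a text leaf.
FLAT = {"address": ["name", "street", "town"],
        "name": "PCDATA", "street": "PCDATA", "town": "PCDATA"}
NESTED = {"address": ["name", "place"], "place": ["street", "town"],
          "name": "PCDATA", "street": "PCDATA", "town": "PCDATA"}


def is_valid(element, grammar):
    """Check that the element tree can be derived from the grammar."""
    rule = grammar.get(element.tag)
    if rule is None:
        return False  # undeclared element type
    children = list(element)
    if rule == "PCDATA":
        return not children
    return ([child.tag for child in children] == rule
            and all(is_valid(child, grammar) for child in children))


flat_doc = ET.fromstring(
    "<address><name>Xaver M. Linde</name>"
    "<street>Wikingerufer 7</street><town>10555 Berlin</town></address>"
)
nested_doc = ET.fromstring(
    "<address><name>Xaver M. Linde</name><place>"
    "<street>Wikingerufer 7</street><town>10555 Berlin</town></place></address>"
)
```

Each document is valid only with respect to its own document type.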

12 Well-Formedness and Validity
XML principles for a document being well-formed:
Open and close all tags
Empty tags end with />
There is a unique root element
Elements may not overlap
Attribute values are quoted
< and & are only used to start tags and entities
Only the five predefined entity references are used
XML principle for a document being valid with respect to a DTD:
Matches the type-like constraints listed in the DTD (or, can be generated from DTD as linearized CF grammar-derivation tree)
Checked by validators such as http://www.stg.brown.edu/service/xmlvalid/
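
Well-formedness (not validity) can be probed with any XML parser. A minimal sketch using Python's standard xml.etree.ElementTree, which raises ParseError on ill-formed input; the helper name is_well_formed is illustrative.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


checks = {
    "nested elements": is_well_formed("<a><b></b></a>"),
    "overlapping elements": is_well_formed("<a><b></a></b>"),
    "unquoted attribute": is_well_formed("<a x=1/>"),
    "two root elements": is_well_formed("<a/><b/>"),
    "bare < in text": is_well_formed("<a>1 < 2</a>"),
    "predefined entity": is_well_formed("<a>1 &lt; 2</a>"),
    "undeclared entity": is_well_formed("<a>&nbsp;</a>"),
}
```

Only the "nested elements" and "predefined entity" cases pass; this parser does not check a document against a DTD.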

13 Practical Semantics Need: Web(-Page) Transformations, Models, and Metadata
Up to now: XML with Document Type Definitions (DTDs) or XML Schemas as the syntactic basis
Practical need for Web semantics:
1) Getting meaning from XML Web pages through translation results
2) Modeling formal XML elements by constructing their extensions (finite or infinite sets)
3) Annotating arbitrary Web objects in RDF/XML for semantic retrieval

14 Practical Semantics Techniques: Web Transformations, Models, and Metadata
Corresponding semantic techniques:
1) Transformational semantics translates XML into other XML or HTML documents via XSLT stylesheets (e.g. using the Cocoon engine)
2) Model-theoretic semantics explicates rule consequences by generating Herbrand models for XML knowledge bases of relations and functions
3) Metadata semantics in XML-based RDF (Resource Description Framework) and RDF Schema enables high-precision search engines for Berners-Lee’s "Semantic Web"

15 Address Document: Transformational Semantics via an XSLT Stylesheet
Me2XML 96 Hyper Road Boston RDF4All 2001 Broadway New York XML4You 96 Hyper Road Boston
XSLT template
Me2XML 96 Hyper Road Boston RDF4All 2001 Broadway New York XML4You 96 Hyper Road Boston
% start fact base for addresses
address( name("Me2XML"), place( street("96 Hyper Road"), town("Boston") ) ).
address( name("RDF4All"), place( street("2001 Broadway"), town("New York") ) ).
address( name("XML4You"), place( street("96 Hyper Road"), town("Boston") ) ).
% end fact base for addresses

16 Address Document: XSLT Stylesheet Template as a Tree-Transforming Rule
Source tree: address with children name, street, and town, each containing PCDATA
Target tree: address with children name and place; place has children street and town; name, street, and town each contain PCDATA
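
The tree-transforming rule can be imitated in Python to show its effect. This is a stand-in for the XSLT template, whose text is not reproduced here. The tag names are assumed from the two trees, and nest_address is an illustrative name.

```python
import xml.etree.ElementTree as ET


def nest_address(flat):
    """Rebuild a flat address element with street and town under place."""
    nested = ET.Element("address")
    ET.SubElement(nested, "name").text = flat.findtext("name")
    place = ET.SubElement(nested, "place")
    ET.SubElement(place, "street").text = flat.findtext("street")
    ET.SubElement(place, "town").text = flat.findtext("town")
    return nested


flat = ET.fromstring(
    "<address><name>Me2XML</name>"
    "<street>96 Hyper Road</street><town>Boston</town></address>"
)
nested_xml = ET.tostring(nest_address(flat), encoding="unicode")
```

The result keeps all character data and only changes the shape of the tree.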

17 Colocation Rule: Model-Theoretic Semantics via Consequence Generation
% start fact base for addresses
address( name("Me2XML"), place( street("96 Hyper Road"), town("Boston") ) ).
address( name("RDF4All"), place( street("2001 Broadway"), town("New York") ) ).
address( name("XML4You"), place( street("96 Hyper Road"), town("Boston") ) ).
% end fact base for addresses
Horn rule
N1 N2...
% start rule base for colocated
colocated(name( N1 ),name( N2 )) :- address(name( N1 ),place( P )), address(name( N2 ),place( P )), lexiless( N1,N2 ).
% end rule base for colocated
Me2XML XML4You
% start fact base for colocated
colocated( name( "Me2XML" ), name( "XML4You") ).
% end fact base for colocated
The Herbrand model of the rule and addresses is the set of the colocated and address ground facts
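
The consequence generation can be sketched in Python by applying the rule once to all pairs of address facts. The sketch assumes lexiless means lexicographically less, modelled here by Python's string comparison. Because the rule is not recursive, one pass already yields every colocated fact.

```python
# Address facts as (name, place) pairs, with place = (street, town).
ADDRESSES = [
    ("Me2XML", ("96 Hyper Road", "Boston")),
    ("RDF4All", ("2001 Broadway", "New York")),
    ("XML4You", ("96 Hyper Road", "Boston")),
]


def colocated(addresses):
    """Derive all colocated(N1, N2) facts: same place P, N1 before N2."""
    return {
        (n1, n2)
        for n1, p1 in addresses
        for n2, p2 in addresses
        if p1 == p2 and n1 < n2  # '<' stands in for lexiless (assumption)
    }


derived = colocated(ADDRESSES)
# The finite model is the union of the given and the derived ground facts.
model = ({("address", name, place) for name, place in ADDRESSES}
         | {("colocated", n1, n2) for n1, n2 in derived})
```

The ordering condition keeps the symmetric pair and the trivial self-pairs out of the model.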

18 Linked Address Documents: Metadata Semantics via RDF Annotations
http://addr.flat.com: Shape flat; <ConvertsTo resource="http://addr.nest.com"/>; Me2XML 96 Hyper Road Boston RDF4All 2001 Broadway New York...
http://addr.nest.com: Shape nested; <ConvertsTo resource="http://addr.flat.com"/>; Me2XML 96 Hyper Road Boston...

19 Practical Semantics Combination: Metadata, Transformation, Model
Consider the following problem of (inferential, XML) data mining with report generation for findings:
Generate the finite model containing all colocated facts derivable from given flat-address base facts, with inference rules available only for nested facts
This problem can be divided into three subproblems:
(1) Navigate metadata, starting from flat-address URL, for available nested-address version (alternatively, use a semantic search engine with Shape = nested)
(2) If none available, transform flat-address facts into nested addresses via the URL’s XSLT stylesheet
(3) Apply colocated rule to nested-address base facts to generate finite model of colocated facts
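
The three subproblems can be strung together in a small Python sketch. Everything here is an assumption made for illustration: the METADATA triples, the in-memory DOCUMENTS store (with the nested version deliberately missing), the addresses wrapper element, and all function names. Python code stands in for the XSLT stylesheet, and '<' stands in for lexiless.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata triples (subject, property, object).
METADATA = {
    ("http://addr.flat.com", "Shape", "flat"),
    ("http://addr.flat.com", "ConvertsTo", "http://addr.nest.com"),
    ("http://addr.nest.com", "Shape", "nested"),
}

# Hypothetical document store: only the flat version is retrievable.
DOCUMENTS = {
    "http://addr.flat.com": (
        "<addresses>"
        "<address><name>Me2XML</name>"
        "<street>96 Hyper Road</street><town>Boston</town></address>"
        "<address><name>RDF4All</name>"
        "<street>2001 Broadway</street><town>New York</town></address>"
        "<address><name>XML4You</name>"
        "<street>96 Hyper Road</street><town>Boston</town></address>"
        "</addresses>"
    ),
}


def find_nested_version(url):
    """(1) Follow ConvertsTo to a retrievable document with Shape = nested."""
    for subject, prop, target in METADATA:
        if (subject == url and prop == "ConvertsTo"
                and (target, "Shape", "nested") in METADATA
                and target in DOCUMENTS):
            return target
    return None


def nested_facts(url):
    """Return facts (name, (street, town)), transforming flat data if needed."""
    target = find_nested_version(url)
    if target is not None:
        root = ET.fromstring(DOCUMENTS[target])
        return [(a.findtext("name"),
                 (a.findtext("place/street"), a.findtext("place/town")))
                for a in root]
    # (2) No nested version available: restructure the flat addresses.
    root = ET.fromstring(DOCUMENTS[url])
    return [(a.findtext("name"), (a.findtext("street"), a.findtext("town")))
            for a in root]


def colocated(facts):
    """(3) Apply the colocated rule to the nested facts."""
    return {(n1, n2) for n1, p1 in facts for n2, p2 in facts
            if p1 == p2 and n1 < n2}


findings = colocated(nested_facts("http://addr.flat.com"))
```

With the nested document missing from the store, step (1) finds nothing and step (2) supplies the nested facts.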

20 (1) Check Metadata via RDF Annotations
http://addr.flat.com: Shape flat; <ConvertsTo resource="http://addr.nest.com"/>; Me2XML 96 Hyper Road Boston RDF4All 2001 Broadway New York...
http://addr.nest.com (Shape nested): Not Found. The requested URL was not found on this server. It might be, that this is by the fact, that this server will be configured this day.

21 (2) Transform via the XSLT Stylesheet
Me2XML 96 Hyper Road Boston RDF4All 2001 Broadway New York XML4You 96 Hyper Road Boston
XSLT template
Source tree: address with children name, street, and town, each containing PCDATA
Target tree: address with children name and place; place has children street and town; name, street, and town each contain PCDATA
% start fact base for addresses
address( name("Me2XML"), place( street("96 Hyper Road"), town("Boston") ) ).
address( name("RDF4All"), place( street("2001 Broadway"), town("New York") ) ).
address( name("XML4You"), place( street("96 Hyper Road"), town("Boston") ) ).
% end fact base for addresses

22 (3) Generate Model as Rule Consequences
% start fact base for addresses
address( name("Me2XML"), place( street("96 Hyper Road"), town("Boston") ) ).
address( name("RDF4All"), place( street("2001 Broadway"), town("New York") ) ).
address( name("XML4You"), place( street("96 Hyper Road"), town("Boston") ) ).
% end fact base for addresses
Horn rule
% start rule base for colocated
colocated(name( N1 ),name( N2 )) :- address(name( N1 ),place( P )), address(name( N2 ),place( P )), lexiless( N1,N2 ).
% end rule base for colocated
Me2XML XML4You
Data findings report: Me2XML and XML4You might be the same organization

23 Model-Theoretic Semantics
Practically usable only for finite models (such as in the address example)
Still theoretically interesting to formalize semantics of XML-based inference systems (such as RFML, RuleML, DAML, or OIL)
Even when finite, not practical for highly distributed and highly dynamic fact bases (such as the ever-changing geographic data scattered over the Web)
Perhaps to be replaced/augmented by semantics characterizing new logic for the Web, which is open, uncertain, and paraconsistent

24 Transformational Semantics
Practically usable for all declarative programs, e.g. for normalization or interoperation (such as in the address example)
XSLT stylesheet engines flourish, e.g. Cocoon, and will probably be built directly into most Web browsers
XSLT with variables and parameter passing recently shown to be relationally complete
The emerging XML query algebra permits similar transformations, incl. certain functional programs
The Rule Markup Initiative will provide a lattice of XML DTDs (Schemas) for RuleML subsets containing inference and/or transformation rules

25 Metadata Semantics
Practically usable for describing/localizing all possible (Web) objects, whose internals need not be accessible (unlike in the address example)
RDF can be formalized logically and its expressive power may be generalized via logic programming (and hypergraphs): metadata combined with rules
Metadata complemented by subsumption semantics for XML tags to better integrate XML and RDF
RDF extensible by subClassOf/subPropertyOf vocabularies (cf. sorted logics), as in RDF Schema, and, further, by full ontologies, as in DAML or OIL

26 SubPropertyOf Example: An Illustrative Hierarchy of Properties
Properties in the hierarchy: Color, Shape, Texture, Surface, Composition, Density, Hardness, Body
Values shown: nested, flat, ...
Property inheritance handled as, e.g., for description logic roles: RDF Schema, OIL, and DAML
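
Property inheritance along subPropertyOf can be sketched in Python. The arrangement of the hierarchy below is an assumption, since the diagram's arcs are not recoverable from the transcript: Color, Shape, and Texture are placed under Surface, Density and Hardness under Composition, and both of those under Body. The function names are illustrative.

```python
# ASSUMED arrangement of the property hierarchy (child -> parent).
SUB_PROPERTY_OF = {
    "Color": "Surface", "Shape": "Surface", "Texture": "Surface",
    "Density": "Composition", "Hardness": "Composition",
    "Surface": "Body", "Composition": "Body",
}


def super_properties(prop):
    """Return all properties above prop, nearest first."""
    chain = []
    while prop in SUB_PROPERTY_OF:
        prop = SUB_PROPERTY_OF[prop]
        chain.append(prop)
    return chain


def entailed(triples):
    """Add, for each statement, the statements for its super-properties."""
    result = set(triples)
    for subject, prop, value in triples:
        for parent in super_properties(prop):
            result.add((subject, parent, value))
    return result


statements = {("http://addr.nest.com", "Shape", "nested")}
closure = entailed(statements)
```

Under this assumed hierarchy, a search for any Surface or Body property of a resource would also find its Shape statement.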

27 Conclusions
Identified and exemplified three complementary semantics for XML data:
– Transformations
– Models
– Metadata
For data distributed in the Web, models are of limited use, while transformations and metadata are being widely applied
Further semantics will be needed for the Web, e.g.:
– SQL semantics: Contributions to this Dagstuhl Seminar
– “URI-deictic” logics: Berners-Lee’s “pointing as proving”
Already the different usage of transformations and metadata would suggest: Let many flowers bloom!

