Qualified Dublin Core Using RDF for Sci-Tech Journal Articles DC-2001 International Conference on Dublin Core and Metadata Applications, October 22-26,

1 Qualified Dublin Core Using RDF for Sci-Tech Journal Articles
DC-2001 International Conference on Dublin Core and Metadata Applications, October 22-26, 2001, National Institute of Informatics, Tokyo, Japan. Thomas G. Habing, Timothy W. Cole, William H. Mischo, University of Illinois at Urbana-Champaign.

2 History and Objectives of the Testbed
Funded under DLI-I (NSF/NASA/DARPA). Continued under CNRI’s D-Lib Test Suite. Construct large-scale, multi-publisher, markup-based full-text journal testbed. Investigate processing, indexing, normalization, retrieval, rendering and linking. Study end-user searching behavior and needs.

3 Description of Testbed
Testbed contains 65,000 articles from 50 journals. Received from publishers as SGML (various DTDs). Converted to well-formed XML. Content & support from AIP, APS, ASCE, IEE, ASM, ACM, Elsevier. Additional support from IEEE, NRL, NTT Learning Systems.

4 Usage of Metadata in Illinois Testbed
Facilitate resource discovery across heterogeneous sources through normalization. Common, easily displayable search results. Add value to the original object: reference linking, links to alternate formats and A&I services. Data exchange, as with the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH).

7 Metadata Extraction Process
Metadata is derived from the full text using XSLT. One-to-one mappings: select="//titlegrp/title" maps to <dc:title>. Complex mappings: tables of contents, literal markup such as MathML. Advanced XSLT techniques: JavaScript functions are used for some formatting; the document(url) function is used to merge XML from other sources, such as CrossRef, into the metadata. See paper for sample XSLT code.
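The one-to-one mapping on this slide can be illustrated outside XSLT. The following is a minimal Python sketch, not the testbed's stylesheet: the article fragment and the function name extract_dc_title are hypothetical, and only the titlegrp/title path and the <dc:title> target come from the slide.

```python
import xml.etree.ElementTree as ET

# Namespace of the unqualified Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical article fragment; only the titlegrp/title path is from the slide.
ARTICLE_XML = """
<article>
  <front>
    <titlegrp><title>A Title</title></titlegrp>
  </front>
</article>
"""


def extract_dc_title(article_xml: str) -> ET.Element:
    """Copy the text of the first titlegrp/title element into a dc:title element."""
    root = ET.fromstring(article_xml)
    # ".//titlegrp/title" is ElementTree's spelling of the slide's //titlegrp/title.
    source = root.find(".//titlegrp/title")
    dc_title = ET.Element(f"{{{DC_NS}}}title")
    # itertext() flattens any inline markup inside the title to plain text.
    dc_title.text = "".join(source.itertext()).strip() if source is not None else ""
    return dc_title


ET.register_namespace("dc", DC_NS)
title_element = extract_dc_title(ARTICLE_XML)
print(ET.tostring(title_element, encoding="unicode"))
```

Flattening inline markup is a simplification; the slide notes that literal markup such as MathML needs a more complex mapping.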

8 Other uses of XSLT
‘Dumb-down’ to unqualified DC. Transform metadata to HTML for display. Generate RDF triples for use in an RDBMS.
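Storing RDF triples in a relational database can be sketched as a single three-column table. The example below is an assumption-laden illustration using Python's sqlite3 module: the table name triple, the subject identifier #article-1, and the helper objects_for are hypothetical, and the slides do not say which RDBMS or schema the testbed used.

```python
import sqlite3

DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical triples: (subject, predicate, object).
triples = [
    ("#article-1", DC + "title", "A Title"),
    ("#article-1", DC + "creator", "Author, A. N."),
]

# An in-memory database keeps the sketch self-contained.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE triple (subject TEXT, predicate TEXT, object TEXT)"
)
connection.executemany("INSERT INTO triple VALUES (?, ?, ?)", triples)


def objects_for(subject: str, predicate: str) -> list:
    """Return every object stored for a subject/predicate pair, in insert order."""
    rows = connection.execute(
        "SELECT object FROM triple "
        "WHERE subject = ? AND predicate = ? ORDER BY rowid",
        (subject, predicate),
    )
    return [row[0] for row in rows]


titles = objects_for("#article-1", DC + "title")
print(titles)
```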

9 ‘Dumb-down’ XSLT
<xsl:variable name="DCQ" select="document('dcq.rdfs')"/>
…
<xsl:template name="dumb_down_dcq">
  <xsl:variable name="SubPropertyOf" …
  <xsl:variable name="DCTag" select="substring-after($SubPropertyOf,'&xmlns_dc;')"/>
  <xsl:if test="$DCTag">
    <xsl:call-template name="dumb_down">
      <xsl:with-param name="Tag" select="$DCTag"/>
      …
Variable $DCQ contains the complete DOM tree for the DCQ RDF Schema. Variable $SubPropertyOf contains the URL representing the parent property of the current node. Variable $DCTag contains the DC tag name for the parent property. The dumb_down template creates a node with the $DCTag name, containing the text in the current node.
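The dumb-down logic described on this slide (look up the parent property in the schema, strip the DC namespace to get a tag name, emit a new node with the same text) can be sketched in Python. This is an illustration under stated assumptions, not the authors' stylesheet: the DCQ namespace URI is a placeholder, the inline schema is a two-property stand-in for dcq.rdfs, and the function names load_parents and dumb_down are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
DC = "http://purl.org/dc/elements/1.1/"
DCQ = "http://example.org/dcq#"  # placeholder, not the real DCQ namespace

# Tiny stand-in for dcq.rdfs: each qualifier names its parent DC property.
SCHEMA_XML = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}">
  <rdf:Property rdf:about="{DCQ}isPartOf">
    <rdfs:subPropertyOf rdf:resource="{DC}relation"/>
  </rdf:Property>
  <rdf:Property rdf:about="{DCQ}abstract">
    <rdfs:subPropertyOf rdf:resource="{DC}description"/>
  </rdf:Property>
</rdf:RDF>
"""


def load_parents(schema_xml: str) -> dict:
    """Map each property URI in the schema to its rdfs:subPropertyOf target."""
    parents = {}
    for prop in ET.fromstring(schema_xml).iter(f"{{{RDF}}}Property"):
        parent = prop.find(f"{{{RDFS}}}subPropertyOf")
        if parent is not None:
            parents[prop.get(f"{{{RDF}}}about")] = parent.get(f"{{{RDF}}}resource")
    return parents


def dumb_down(element: ET.Element, parents: dict):
    """Return an unqualified DC element for a qualified one, or None if unmapped."""
    # ElementTree tags look like "{namespace}local"; rebuild the full URI.
    uri = element.tag[1:].replace("}", "", 1)
    parent_uri = parents.get(uri, "")
    if not parent_uri.startswith(DC):
        return None
    # Same idea as substring-after($SubPropertyOf, DC namespace) on the slide.
    dc_tag = parent_uri[len(DC):]
    result = ET.Element(f"{{{DC}}}{dc_tag}")
    result.text = "".join(element.itertext()).strip()
    return result


parents = load_parents(SCHEMA_XML)
qualified = ET.Element(f"{{{DCQ}}}abstract")
qualified.text = "We describe a testbed."
simple = dumb_down(qualified, parents)
print(simple.tag, simple.text)
```

Returning None for unmapped elements makes the caller decide what to do with non-DC tags, which the conclusions slide flags as a problem that cannot be ignored.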

10 Local Extensions to DCQ
Qualified DC was not adequate for our needs. Various DC working groups provided some guidance. We extended DCQ in three areas: Citation-related extensions. Agent-related (creator) extensions. Type and encoding scheme extensions.

11 Citation-related Extensions
<uiLib:citation>A. Author. "A Title" Some Jrnl. …
<dc:identifier>
  <uiLib:OpenURL-OBJECT-METADATA-ZONE>
    <rdf:value>genre=article&aulast=Author…
<dcq:isPartOf>
  <rdf:Description rdf:ID="JournalIssue">
    <dc:identifier><uiLib:ISSN> <rdf:value> </rdf:value> </uiLib:ISSN></dc:identifier>
    <dc:title>Some Journal</dc:title>…
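The OpenURL value on this slide is an ordinary key=value query string. The following Python sketch shows one way such a string could be assembled and escaped for embedding in XML; it uses only the two keys visible on the slide (genre and aulast), and the variable names are hypothetical. The slide's elided remainder is not reconstructed.

```python
from urllib.parse import parse_qs, urlencode
from xml.sax.saxutils import escape

# Only the two fields visible on the slide; the rest of the value is elided there.
fields = {"genre": "article", "aulast": "Author"}

# Join the fields into the key=value&key=value form shown on the slide.
metadata_zone = urlencode(fields)

# Inside an XML element such as <rdf:value>, a bare "&" must be escaped.
xml_text = escape(metadata_zone)

# Parsing the string back recovers the original fields (values come back as lists).
round_trip = parse_qs(metadata_zone)

print(metadata_zone)
print(xml_text)
```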

12 Agent-related Extensions
Based on DC Agent Qualifiers, Working Draft.
<dc:creator><rdf:Seq><rdf:li>
  <dca:Person rdf:ID="AUTHOR-1">
    <dca:agentname><dca:FNF> <rdf:value>Author, A. N.</rdf:value> </dca:FNF></dca:agentname>
    <dca:agentaffiliation>Big University</dca:agentaffiliation>
    <dca:agentidentifier

13 Type and Encoding Extensions
Extensions to DCMI Type Vocabulary.
<dc:type rdf:resource= "
<dc:type rdf:resource= "
Additional Encoding Schemes: PACS, ACMCCS, ISSN, CODEN, ACM_JRNL_CODE.

14 Conclusions
Using DCQ/RDF for sci-tech journal articles is viable. Steep learning curve for RDF. ‘Dumbing-down’ DCQ/RDF is complex. Cannot ignore non-DC tags; RDF Schema is required. DCQ is missing many properties and types required for complete serials descriptions. Utility of RDF remains uncertain.

