1
Using XML, XSLT, and CSS in a Digital Library
Markup Transformations
SGML to XML Conversions
Metadata Schema & Generation
Robert Ferrer
ASIS Annual Meeting 2000
2
SGML to XML Conversions - Modular
15 November 2000 ASIS Annual Meeting 2000
3
SGML to XML Conversions - Basic
Empty tags: <empty> to <… />
Processing instructions: <?Processing Instruction> to <?… ?>
CDATA to CDATA sections: <![CDATA[ … ]]>
Named entities remain unchanged: &alpha;
<!DOCTYPE ...> refers to an XML DTD containing only character entity definitions mapped to Unicode code points:
    <!ENTITY alpha "&#x3B1;">
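The empty-tag rule above can be sketched in a few lines of Python. This is only an illustration, not the converter used in the project: the set of EMPTY element names is hypothetical (in practice it would come from the source SGML DTD), and the regular expression assumes attribute values never contain '<' or '>'.

```python
import re

# Hypothetical set of elements declared EMPTY in the source SGML DTD.
EMPTY_ELEMENTS = {"citeref", "fig", "br"}

# Start tags that are not already self-closed (no "/" right before ">").
START_TAG = re.compile(r"<([A-Za-z][\w.-]*)((?:\s[^<>]*)?)(?<!/)>")


def close_empty_tags(sgml, empty_names=EMPTY_ELEMENTS):
    """Rewrite <name ...> as <name ... /> for elements known to be EMPTY."""
    def fix(match):
        name, attrs = match.group(1), match.group(2)
        if name.lower() in empty_names:
            return "<" + name + attrs.rstrip() + " />"
        return match.group(0)  # ordinary start tag: leave untouched
    return START_TAG.sub(fix, sgml)


converted = close_empty_tags('<p>See <CITEREF REFID="bib5"> and <br></p>')
print(converted)
```

A DTD-driven list is used because an SGML parser knows from the DTD which elements are empty, while an XML parser needs the explicit "/>" form.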
4
SGML to XML Conversions - Linking
Attributes to facilitate internal linking:
    <CITEREF REFID="bib5" idli_occurrence="3" />
External links represented as XLinks:
    <FIG NAME="F1" xlink:type="simple" xlink:href="fig1.jpg" xlink:show="new" xlink:actuate="user" />
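As a rough sketch of how such a simple XLink could be attached programmatically, the Python below builds the FIG element from the slide with the standard library. The helper name make_simple_link is hypothetical. The attribute values are copied from the slide; note that the final XLink 1.0 Recommendation spells the user-triggered value of xlink:actuate as "onRequest", so "user" reflects the draft in use at the time.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("xlink", XLINK_NS)


def make_simple_link(tag, name, href):
    """Build an element carrying a simple XLink (hypothetical helper)."""
    elem = ET.Element(tag, {"NAME": name})
    elem.set("{%s}type" % XLINK_NS, "simple")
    elem.set("{%s}href" % XLINK_NS, href)
    elem.set("{%s}show" % XLINK_NS, "new")
    # Value as shown on the slide; XLink 1.0 (2001) uses "onRequest".
    elem.set("{%s}actuate" % XLINK_NS, "user")
    return elem


fig = make_simple_link("FIG", "F1", "fig1.jpg")
print(ET.tostring(fig, encoding="unicode"))
```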
5
SGML to XML Conversions - Math
SGML Math converted to MathML
Presentational MathML:
    <math xmlns="…">
      <msubsup>
        <mrow><mi>α</mi></mrow>
        <mrow><mi>i</mi></mrow>
        <mrow><mo>-</mo><mn>2</mn></mrow>
      </msubsup>
    </math>
ISO Math:
    <dformula> <g>a</g> <sup>-2</sup> <inf>i</inf> </dformula>
Identify & translate mathematical character references
Identify & tokenize mathematical content
6
SGML to XML Conversions - Math
Recognize & transform mathematical markup
    <xsl:template match="dformula">
      :
      <xsl:when test="sup or inf">
        <xsl:for-each select="child::node()">
          <xsl:choose>
            <xsl:when test="name(self::node())='sup' and name(following-sibling::node()[1])='inf'">
              <xsl:element name="msubsup" namespace="…">
                <xsl:element name="mrow" namespace="…">
                  <xsl:apply-templates select="preceding-sibling::node()[1]"/>
                </xsl:element>
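The same recognize, translate, and tokenize steps can be sketched outside XSLT. The Python below is an illustration only, not the project's stylesheet: it handles just the base/sup/inf pattern from the previous slide, the GREEK table is a hypothetical one-entry stand-in for the real character mapping, and the MathML namespace is left off because the URI is cut off in the slides.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, partial mapping from ISO <g> (Greek) content to Unicode.
GREEK = {"a": "\u03b1"}


def add_tokens(row, text):
    """Tokenize text into <mn> (numbers), <mo> (signs) and <mi> (identifiers)."""
    for token in re.findall(r"\d+|\S", text):
        if token.isdigit():
            tag = "mn"
        elif token in ("+", "-"):
            tag = "mo"
        else:
            tag = "mi"
        ET.SubElement(row, tag).text = token


def dformula_to_mathml(dformula):
    """Map <g>, <sup>, <inf> children of <dformula> to a single <msubsup>."""
    base = dformula.findtext("g")
    base = GREEK.get(base, base)
    math = ET.Element("math")
    msubsup = ET.SubElement(math, "msubsup")
    # MathML child order for msubsup: base, subscript, superscript.
    for text in (base, dformula.findtext("inf"), dformula.findtext("sup")):
        add_tokens(ET.SubElement(msubsup, "mrow"), text)
    return math


source = ET.fromstring("<dformula><g>a</g><sup>-2</sup><inf>i</inf></dformula>")
mathml = ET.tostring(dformula_to_mathml(source), encoding="unicode")
print(mathml)
```

Note that the ISO markup lists the superscript before the subscript, while msubsup expects the subscript first, which is why the transform has to look at sibling order.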
7
SGML to XML Conversions - TeX
TeX converted to GIF images
    <FORM NOTATION="TEX" HIDE="TRUE"> $$ (j_0-a_2')\,{\rm mod}\,P $$</FORM><uie name="uie1" xlink:type="simple" xlink:href="fig1.gif" xlink:show="new" xlink:actuate="user" />
TeX converted into MathML
IBM TechExplorer
    $$ (j_0-a_2')\,{\rm mod}\,P
    <math><mo>(</mo><msub> <mrow><mi>j</mi></mrow> <mrow><mn>0</mn></mrow> </msub><mi>−</mi> <msubsup><mrow><mi>a</mi> </mrow><mrow><mn>2</mn>…..
8
SGML to XML Conversions - DTD
XML DTD does not permit inclusions and exclusions
    SGML: <!ELEMENT Article - - (front, body) +(%i.float;)>
    XML:  <!ELEMENT Article (front | body | %i.float;)*>
XML DTD does not permit the '&' connector
XML DTD permits mixed content only in the form (#PCDATA | a | b)*, so a model such as this is not allowed:
    <!ELEMENT Other ((author, journal) | (#PCDATA))>
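The inclusion rewrite shown for Article can be mimicked mechanically. The sketch below is an assumption about one way to do it, not the project's tool: it only handles a flat content model followed by a single +(…) inclusion, and, like the slide's XML version, it gives up the original ordering constraint by turning everything into a repeatable choice.

```python
import re

# Matches: <!ELEMENT name - - (flat model) +(inclusions)>
DECL = re.compile(
    r"<!ELEMENT\s+(\S+)\s+[-oO]\s+[-oO]\s+\((.*)\)\s*\+\((.*)\)\s*>"
)


def fold_inclusion(sgml_decl):
    """Fold an SGML inclusion into a repeatable XML choice group."""
    match = DECL.fullmatch(sgml_decl.strip())
    if match is None:
        raise ValueError("unsupported declaration: " + sgml_decl)
    name, model, included = match.groups()
    members = [part.strip() for part in re.split(r"[,|&]", model)]
    members += [part.strip() for part in included.split("|")]
    return "<!ELEMENT %s (%s)*>" % (name, " | ".join(members))


xml_decl = fold_inclusion("<!ELEMENT Article - - (front, body) +(%i.float;)>")
print(xml_decl)
```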
9
Metadata - Usage
Metadata Within the DLI Testbed
Normalize key fields from different publisher DTDs to facilitate searching
Provide common and easily displayable intermediate search results
Add value in the form of links to cited or citing articles within the Testbed, external abstracts and indexes, etc.
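Normalizing key fields across publisher DTDs amounts to mapping each publisher's element paths onto one common record. The Python below is a minimal sketch of that idea; the publisher names, element paths, and sample documents are all invented for illustration and do not come from the Testbed.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-publisher paths to the same logical fields.
FIELD_PATHS = {
    "publisher_a": {"title": "front/title", "author": "front/author"},
    "publisher_b": {"title": "header/atl", "author": "header/aug/au"},
}


def normalize(xml_text, publisher):
    """Return a common {field: [values]} record for one article."""
    root = ET.fromstring(xml_text)
    return {
        field: ["".join(elem.itertext()).strip() for elem in root.findall(path)]
        for field, path in FIELD_PATHS[publisher].items()
    }


# Made-up sample documents in two different markup vocabularies.
doc_a = "<article><front><title>Sample Title</title><author>A. Author</author></front></article>"
doc_b = "<art><header><atl>Sample Title</atl><aug><au>A. Author</au></aug></header></art>"

record_a = normalize(doc_a, "publisher_a")
record_b = normalize(doc_b, "publisher_b")
print(record_a == record_b)
```

Once both documents yield the same record shape, a single search index and a single display template can serve every publisher.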
10
Metadata - Schema
Resource Description Framework (RDF) provides a standardized way to represent metadata using XML
Encapsulates metadata elements
Provides varying levels of granularity
RDF container objects describe the relations between repeated metadata elements
11
Metadata - Schema
Dublin Core (DC) model is used to encapsulate all searchable metadata
Provides the semantic framework for describing each object in the collection

Content        Intellectual Property   Instantiation
Title          Creator                 Date
Subject        Publisher               Format
Description    Contributor             Identifier
Type           Rights                  Language
Source
Relation
Coverage
12
Metadata - Schema
Extensive custom IDLI tags are included
Offer a further level of granularity
    <DC:Description><idli:Abstract></DC:Description>
Search clients familiar with the IDLI schema can achieve much greater precision
Dublin Core Qualifiers (DCQ) substructure to replace many of the project-specific IDLI elements
    <DC:Description><DCQ:Abstract></DC:Description>
13
Metadata - Schema
    <rdf:seq>
      <rdf:li>
        <dc:Creator>
          <idli:author_name>Giust, G. K.</idli:author_name>
          <idli:organization_name>Department of Electrical Engineering, Arizona State University</idli:organization_name>
        </dc:Creator>
      </rdf:li>
      <rdf:li>
        <dc:Creator>
          <idli:author_name>Sigmon, T.W.</idli:author_name>
          <idli:organization_name>Department of Computer Science, Illinois State University</idli:organization_name>
        </dc:Creator>
      </rdf:li>
    </rdf:seq>
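A sequence like the one above could be generated as follows. This is a sketch under stated assumptions: the slide shows no namespace declarations, so the RDF and Dublin Core URIs are assumed and the idli URI is a pure placeholder. The RDF syntax capitalizes the ordered container as rdf:Seq, which the code follows; the other element names are copied from the slide.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"  # assumed
DC = "http://purl.org/dc/elements/1.1/"              # assumed
IDLI = "urn:example:idli"                            # hypothetical placeholder
for prefix, uri in (("rdf", RDF), ("dc", DC), ("idli", IDLI)):
    ET.register_namespace(prefix, uri)


def creators_seq(creators):
    """Build an ordered RDF container with one dc:Creator per author."""
    seq = ET.Element("{%s}Seq" % RDF)
    for author, organization in creators:
        item = ET.SubElement(seq, "{%s}li" % RDF)
        creator = ET.SubElement(item, "{%s}Creator" % DC)
        ET.SubElement(creator, "{%s}author_name" % IDLI).text = author
        ET.SubElement(creator, "{%s}organization_name" % IDLI).text = organization
    return seq


seq = creators_seq([
    ("Giust, G. K.",
     "Department of Electrical Engineering, Arizona State University"),
    ("Sigmon, T.W.",
     "Department of Computer Science, Illinois State University"),
])
names = [elem.text for elem in seq.iter("{%s}author_name" % IDLI)]
print(names)
```

The ordered container keeps each author paired with the right organization and preserves author order, which a flat list of repeated elements would not guarantee.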
14
Metadata - Extracting
Metadata is extracted from the 'base' XML files
Utilization of XML Header
    DTD is used to resolve entities
    XML-Stylesheet processing instruction
Visual Basic application serves as parser
    Document Object Model (DOM)
    XSLT Style Sheets
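Reading the XML-Stylesheet processing instruction through the DOM can be sketched as below. The project used a Visual Basic application; this Python version with the standard-library DOM is only an analogous illustration, and the sample document and stylesheet name are made up.

```python
import re
from xml.dom import Node, minidom


def stylesheet_hrefs(xml_text):
    """Return the href of each xml-stylesheet processing instruction."""
    doc = minidom.parseString(xml_text)
    hrefs = []
    for node in doc.childNodes:
        if (node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            # The PI data is a pseudo-attribute string, not real attributes.
            match = re.search(r'href="([^"]*)"', node.data)
            if match:
                hrefs.append(match.group(1))
    return hrefs


sample = '<?xml-stylesheet type="text/xsl" href="metadata.xsl"?><article/>'
found = stylesheet_hrefs(sample)
print(found)
```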
15
Metadata - Extracting
Utilization of XSLT Style Sheets
XSLT transformative features to generate base metadata file and forward citation fragment
XSLT scripting features to generate elements not directly expressed in the document
XSLT instantiation of ActiveX objects to test for links
16
Metadata - Extracting
Utilization of DOM
Insert pseudo elements (e.g. bibliographic data)
Search reference citations from the generated metadata object to insert forward references into other metadata files
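The forward-reference step can be sketched with the DOM as follows. The element and attribute names (citation, target, forward_reference, source) are hypothetical, since the slides do not show the metadata vocabulary used for citations; the point is only the pattern of walking one file's citations and appending a node to each cited file.

```python
from xml.dom import minidom


def add_forward_references(metadata):
    """metadata maps article id -> DOM document; cited docs gain back-links."""
    for citing_id, doc in metadata.items():
        for cite in doc.getElementsByTagName("citation"):
            cited_doc = metadata.get(cite.getAttribute("target"))
            if cited_doc is None:
                continue  # cited article is not held in the collection
            ref = cited_doc.createElement("forward_reference")
            ref.setAttribute("source", citing_id)
            cited_doc.documentElement.appendChild(ref)


# Made-up metadata files: a1 cites a2 and an article outside the collection.
docs = {
    "a1": minidom.parseString(
        '<metadata><citation target="a2"/><citation target="x9"/></metadata>'),
    "a2": minidom.parseString("<metadata/>"),
}
add_forward_references(docs)
print(docs["a2"].documentElement.toxml())
```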