Solutions for XML Document Navigation and Delivery
Lori Wong and T.R. Girill
Customer Services Group, Services and Development Division
Integrated Computing and Communications Department
Lawrence Livermore National Laboratory
UCRL-PRES
Nov. 3, 2003

Defining the problem
Replace a legacy SGML-based delivery system consisting of:
- A collection of 36 SGML documents (4000+ pages).
- A proprietary document delivery system that was awkward and costly to support.
Requirements
- Offer table of contents navigation with adjustable granularity.
- Display individual sections with forward and backward navigation within the document structure.
- Render link references appropriately for web or print.
- Support search capability within and across documents in the collection.

Forging a path from SGML to XML
Some initial steps were needed just to get to XML:
- Developed a script to resolve the syntax differences between SGML and XML.
  - entity substitution
  - closing tags
- Modified the DTD to conform to XML requirements.
- Expanded some of the element attributes where necessary to support XSLT.
  - a section level attribute assists with TOC generation
  - a keep attribute assists with pagination control
Once these steps were done, translating a document became trivial.
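
The two script tasks named above (entity substitution and closing tags) can be sketched in Java as follows. This is only an illustration, not the script the authors wrote: the entity table and the element names `xref` and `graphic` are hypothetical stand-ins for whatever the real DTD declares, and a regex pass like this assumes reasonably regular markup.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SgmlToXmlSketch {
    // Hypothetical entity table: SGML named entities mapped to
    // numeric character references that any XML parser accepts.
    static final Map<String, String> ENTITIES = new LinkedHashMap<>();
    static {
        ENTITIES.put("&mdash;", "&#8212;");
        ENTITIES.put("&nbsp;", "&#160;");
    }

    // Hypothetical names of elements declared EMPTY in the DTD. The
    // lookbehind (?<!/) skips tags that are already self-closed.
    static final Pattern EMPTY_TAG =
        Pattern.compile("<(xref|graphic)(\\s[^<>]*?)?(?<!/)>");

    static String toXml(String sgml) {
        String out = sgml;
        // Step 1: entity substitution.
        for (Map.Entry<String, String> e : ENTITIES.entrySet()) {
            out = out.replace(e.getKey(), e.getValue());
        }
        // Step 2: close empty elements, e.g. <xref id="s1"> becomes <xref id="s1"/>.
        Matcher m = EMPTY_TAG.matcher(out);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String attrs = m.group(2) == null ? "" : m.group(2);
            m.appendReplacement(sb,
                Matcher.quoteReplacement("<" + m.group(1) + attrs + "/>"));
        }
        m.appendTail(sb);
        return sb.toString();
    }
}
```

Converting named entities to numeric references is one option; declaring the entities in the XML DTD would be another.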

Using XSLT for web rendering
- A JSP was used to link the XML file with an XSLT to render the output as HTML.
- Unique ids for each section allowed for rendering of specific document sections.
- An expandable table of contents allows for greater ease of navigation.
- Adding search capabilities is the next step.

A JSP simplifies the rendering process
A JSP was used to link the XML file with an XSLT to render the output as HTML.
- Supports many browsers on PC, Mac, UNIX, and Linux platforms.
- Allows display of specific sections of a document.
- Server-side rendering ensures consistent display regardless of platform.
- The JSP is short and can be reused for all of the documents.

    <%@ page import="javax.xml.transform.*, javax.xml.transform.stream.*" %>
    <%
    // Scriptlet fragment: xmlFile and xslFile are set elsewhere in the page.
    Source xmlSource = new StreamSource(xmlFile);
    // Section to display; fall back to the preface when "show" is absent.
    String paramShow = request.getParameter("show");
    if (paramShow == null) { paramShow = "Preface"; }
    TransformerFactory tFactory = TransformerFactory.newInstance();
    Transformer transformer = tFactory.newTransformer(new StreamSource(xslFile));
    // Pass the section id to the stylesheet and write the HTML to the response.
    transformer.setParameter("show", paramShow);
    transformer.transform(xmlSource, new StreamResult(out));
    %>
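
To show how the "show" parameter might be consumed on the stylesheet side, here is a minimal standalone sketch using the same `javax.xml.transform` calls as the JSP. The sample document, the element names (`doc`, `section`, `head`, `p`), and the stylesheet are all assumptions made for illustration; the real DTD and XSLT are not shown in these slides.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ShowSectionSketch {
    // Hypothetical two-section document; each section carries a unique id.
    static final String XML =
        "<doc>"
      + "<section id='Preface'><head>Preface</head><p>Welcome.</p></section>"
      + "<section id='Intro'><head>Introduction</head><p>Overview.</p></section>"
      + "</doc>";

    // Hypothetical stylesheet: renders only the section whose id equals $show.
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html' indent='no'/>"
      + "<xsl:param name='show' select=\"'Preface'\"/>"
      + "<xsl:template match='/'>"
      + "<html><body>"
      + "<xsl:apply-templates select='//section[@id=$show]'/>"
      + "</body></html>"
      + "</xsl:template>"
      + "<xsl:template match='section'>"
      + "<h1><xsl:value-of select='head'/></h1>"
      + "<xsl:for-each select='p'><p><xsl:value-of select='.'/></p></xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Returns the HTML for one section; a null id falls back to the preface.
    static String render(String show) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            t.setParameter("show", show == null ? "Preface" : show);
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(XML)),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `ShowSectionSketch.render("Intro")` yields HTML for the introduction only, which mirrors what a request with `show=Intro` would produce in the JSP.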

Unique section IDs allow special rendering treatment
- Dynamic forward and backward navigation links were needed to provide continuity in the document delivery.
- The IDs also allow URLs to link directly to specific sections.
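
One way to derive the forward and backward links from section ids is to walk the sections in document order and pick the neighbors of the current one. The sketch below does this with the DOM API in Java rather than in XSLT, purely for illustration; the `section` element name and the `doc.jsp?show=` URL pattern are assumptions, not details taken from the original system.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class SectionNavSketch {
    // Returns {previousId, nextId} for the given section id; an entry is
    // null at the start or end of the document, or when the id is unknown.
    static String[] neighbors(String xml, String id) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            // getElementsByTagName lists elements in document (preorder)
            // order, so nested subsections follow their parent section.
            NodeList sections = doc.getElementsByTagName("section");
            for (int i = 0; i < sections.getLength(); i++) {
                Element current = (Element) sections.item(i);
                if (current.getAttribute("id").equals(id)) {
                    String prev = i > 0
                        ? ((Element) sections.item(i - 1)).getAttribute("id")
                        : null;
                    String next = i < sections.getLength() - 1
                        ? ((Element) sections.item(i + 1)).getAttribute("id")
                        : null;
                    return new String[] { prev, next };
                }
            }
            return new String[] { null, null };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical URL pattern; assumes ids contain only URL-safe characters.
    static String url(String id) {
        return id == null ? null : "doc.jsp?show=" + id;
    }
}
```

In an XSLT-only solution the same neighbors could be selected with the `preceding` and `following` axes.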

Expandable TOC allows for greater ease of navigation
- Simple JavaScript routines use the DOM to generate the expandable menu.
  - Based on domCollapse.
  - Modified to add arrow GIFs and to allow topics to be linkable.
- The state of the menu is not saved; the menu is expanded to the section that is being displayed.
- The XSLT was difficult to write because XSLT has no counter variable to identify which topic to expand.

Adding search capabilities is the next step
- XML provides us with document structure, which can be used to refine a search.
  - Searches can be limited to matches only in section heads, for example.
  - Results reported by title or section can help profile found information.
- XSLT can provide enhanced display features.
  - Matched text can be highlighted within the document section displayed.
  - Search results could be shown as a navigation menu, similar in feel to the TOC, but with links to the matched document sections.
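
As an illustration of limiting a search to section heads, the sketch below uses an XPath expression to select only `head` children of `section` elements and reports the ids of the sections whose head contains the search term. The element names are assumed, and the case-insensitive substring match is a simplification; the search feature described here was still planned, so nothing below reflects an actual implementation.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class HeadSearchSketch {
    // Returns the ids of sections whose <head> text contains the term.
    // Text in the section body is deliberately ignored.
    static List<String> searchHeads(String xml, String term) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // The document structure narrows the search: only heads that
            // are direct children of a section are examined.
            NodeList heads = (NodeList) xpath.evaluate(
                "//section/head", doc, XPathConstants.NODESET);
            String needle = term.toLowerCase(Locale.ROOT);
            List<String> hits = new ArrayList<>();
            for (int i = 0; i < heads.getLength(); i++) {
                Node head = heads.item(i);
                String text = head.getTextContent().toLowerCase(Locale.ROOT);
                if (text.contains(needle)) {
                    hits.add(((Element) head.getParentNode()).getAttribute("id"));
                }
            }
            return hits;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The returned ids could then feed a results menu whose entries link to the matched sections.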

Using XSLT for print rendering
- The XSLT produces an intermediate file in XSL-FO format.
- RenderX's XEP product is used to render the XSL-FO file into PDF.
- Apache's FOP was inadequate for the translations and formatting we needed.
  - It had deficiencies in generating a well-formatted table of contents.
  - Replacing link text to show link addresses explicitly ran into layout difficulties.
- The table of contents and links are two specific areas where the XSLT generates distinctly different results from online rendering.
- Page numbers and simple headers and footers were added.
- Page numbers needed to be generated for references to internal document sections where the web rendering would have had a link.

Print vs. web rendering – table of contents
TOC rendering is different.

Print vs. web rendering – link visibility
Link rendering differs: the print version shows the URL explicitly, while the web version hides it behind the link text.
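
One possible way to get both link renderings from a single stylesheet is a parameter that selects the target medium. The sketch below is an assumption-laden illustration: the `link` element, its `url` attribute, and the `media` parameter are hypothetical, and the original system may well have used separate web and print stylesheets instead.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class LinkRenderSketch {
    // Hypothetical source fragment with one external link.
    static final String XML =
        "<p>See <link url='http://www.example.com/'>the example site</link>.</p>";

    // Hypothetical stylesheet: $media chooses between a clickable anchor
    // (web) and link text followed by the visible address (print).
    static final String XSL =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:param name='media' select=\"'web'\"/>"
      + "<xsl:template match='p'><p><xsl:apply-templates/></p></xsl:template>"
      + "<xsl:template match='link'>"
      + "<xsl:choose>"
      + "<xsl:when test=\"$media='print'\">"
      + "<xsl:value-of select='.'/> (<xsl:value-of select='@url'/>)"
      + "</xsl:when>"
      + "<xsl:otherwise>"
      + "<a href='{@url}'><xsl:value-of select='.'/></a>"
      + "</xsl:otherwise>"
      + "</xsl:choose>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Renders the fragment for the given medium ("web" or "print").
    static String render(String media) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            t.setParameter("media", media);
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(XML)),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A real print stylesheet would emit XSL-FO elements rather than plain text, but the branching on the medium would look much the same.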

Print vs. web rendering – page referencing
References within the document differ: the print rendering cites the page number where the web rendering would show a link.

A scalable application for document delivery
- We have a reasonable and scalable way to deliver our online documents.
- XSLT provides a way to deliver the documents to different media without having to modify the documents themselves.
- We have a way to control presentation of the documents in different environments (for example, displaying link addresses where access to the WWW is unavailable).
- XPath allows us to develop more refined treatment by utilizing the document structure.
- We have the potential to build new pages by selecting or reusing specific sections from multiple document sources, thereby minimizing duplication of content.
- We have a workable model that can help in the development and design of other structured documents.