XML and XSL A report on the workshop given by Shaoping Moss on October 16, 2004 Presented by ASIS&T members Caryn Anderson, Prairie Clayton & Kara Schwartz.


XML and XSL
A report on the workshop given by Shaoping Moss on October 16, 2004
Presented by ASIS&T members Caryn Anderson, Prairie Clayton & Kara Schwartz at Simmons College, November 1, 2004
…with additional examples from a real-life project

Topics discussed
SGML, XML, and HTML
XML and XSL Basics
XML in Libraries and Academics
XML in Future Web Development
Slide content courtesy of Shaoping Moss.

Markup Languages
Address the structure of a document.
Identify different components of the document.
Convey information to software that will allow it to:
  – Index the data for searching.
  – Render the data.
  – Transform the data.
SGML, XML, and HTML are all markup languages.

Document, Structure, and Format
A document is:
  – “A record which contains information, originally an inscribed or written record but now considered to include any format in which information might be held (e.g. map, manuscript, tape, video, software).” (International Encyclopedia of Information and Library Science)
  – A collection of small elements, which can be headings, subheadings, paragraphs, quotations, etc.
Structure vs. Format
  – Structure is about the content of the document.
  – Format is about the way a document looks.

What is SGML?
Stands for Standard Generalized Markup Language.
Initiated by Charles Goldfarb at IBM in the 1960s.
Adopted as a standard of the International Organization for Standardization (ISO 8879) in …

SGML and Its Subdivisions
SGML is composed of tag-set building rules.
SGML has given birth to other sets of subdivisions:
  – HTML and XML.
  – CALS for defense.
  – BOEING for commercial airlines.
  – C-H for publishing.
  – OED for the Oxford English Dictionary.
  – TEI guidelines for the Text Encoding Initiative.
  – EAD for Encoded Archival Description.

HTML Development
HTML stands for Hypertext Markup Language.
HTML was developed by Tim Berners-Lee at a physics lab near Geneva, Switzerland in …
Its simplicity contributed to the rapid growth of the World Wide Web in the 1990s.
HTML version 4 came out in …
XHTML 1.0 is the latest HTML standard.

HTML Problems
HTML's lenient coding rules make pages easy to write but harder for browsers to handle.
Tags are predefined in HTML.
Format and content are mixed, so content is hard to reuse.

What is XML?
XML is a new Web standard developed by the World Wide Web Consortium in …
XML stands for eXtensible Markup Language.
XML was designed to describe data.
XML tags are not predefined; authors define their own.
XML separates format from content and semantic structure.
Data encoded in XML can function much like a traditional database.
XML content can be output in many formats, such as XHTML, text, Word documents, PDF, etc.

The Display of the Document
My First XML
Chapter 1: Introduction to XML
  What is HTML?
  What is XML?
Chapter 2: XML Syntax
  Elements must have a closing tag
  Elements must be properly nested

An HTML Document
An HTML document describes the book:
…
My First XML
Introduction to XML
What is HTML?
What is XML?
XML Syntax
Elements must have a closing tag.
Elements must be properly nested.
…

An XML Document
An XML document describes the book:
…
My First XML
Introduction to XML
What is HTML?
What is XML?
XML Syntax
Elements must have a closing tag.
Elements must be properly nested.
…

HTML Elements/Tags
An HTML document describes the book:
…
My First XML
Introduction to XML
What is HTML?
What is XML?
XML Syntax
Elements must have a closing tag.
Elements must be properly nested.
…
HTML elements/tags are:
  – defined by the HTML standard
  – always the same
  – usable in any order
Original slide content courtesy of Shaoping Moss.

XML Elements/Tags
An XML document describes the book:
…
My First XML
Introduction to XML
What is HTML?
What is XML?
XML Syntax
Elements must have a closing tag.
Elements must be properly nested.
…
XML elements/tags are:
  – defined by users/groups (DTD/Schema)
  – different for each DTD/Schema
  – hierarchical (tree structure)
Original slide content courtesy of Shaoping Moss.
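The tags on the slides above did not survive in this transcript, so here is a minimal sketch of what the book might look like as XML, and of the tree structure a parser sees. The tag names (book, title, chapter, section) are assumptions made up for illustration, not the workshop's actual markup; Python's standard library is used only to walk the tree.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the tag names below are illustrative guesses,
# not the tags used on the original slide.
BOOK_XML = """
<book>
  <title>My First XML</title>
  <chapter>
    <title>Introduction to XML</title>
    <section>What is HTML?</section>
    <section>What is XML?</section>
  </chapter>
  <chapter>
    <title>XML Syntax</title>
    <section>Elements must have a closing tag.</section>
    <section>Elements must be properly nested.</section>
  </chapter>
</book>
"""

def outline(element, depth=0):
    """Return one indented line per element, showing the hierarchy."""
    text = (element.text or "").strip()
    lines = ["  " * depth + element.tag + (": " + text if text else "")]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

root = ET.fromstring(BOOK_XML)
book_outline = outline(root)
print("\n".join(book_outline))
```

Each level of indentation in the printout is one level of the tree, which is the "hierarchical" point on the slide.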

XML is flexible and extensible
An XML document describes the book for a different user group:
…
My First XML
Introduction to XML
What is HTML?
What is XML?
XML Syntax
Element Rules
Elements must have a closing tag.
Elements must be properly nested.
…
Instead of “book”: extend to accommodate the greater detail of “part”, “section”, AND “paragraph”.
Original slide content courtesy of Shaoping Moss.

Differences between HTML and XML
XML is not a replacement for HTML.
XML and HTML were designed with different goals.
  - XML was designed to describe data and to focus on what data is.
  - HTML was designed to display data and to focus on how data looks.
HTML structure and tags are very loose, while XML structure and tags are strict:
  - XML documents must be well-formed.
  - XML elements must be properly nested.
  - All XML elements must be closed.
  - Tag names are case sensitive, so opening and closing tags must match in case.
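The strictness rules above are easy to see with any XML parser. The following is a small sketch using Python's standard library; the sample snippets and the helper name is_well_formed are invented for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Illustrative snippets, one per rule on the slide.
samples = {
    "properly nested and closed": "<book><title>My First XML</title></book>",
    "improperly nested": "<book><title>My First XML</book></title>",
    "missing closing tag": "<book><title>My First XML</book>",
    "case mismatch": "<book><Title>My First XML</title></book>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'rejected'}")
```

A browser would usually try to display the HTML equivalent of the three broken snippets; an XML parser refuses all of them.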

Differences

Content
  HTML: Held in generic containers.
  XML: Held in specific containers that describe what the data is.
Format
  HTML: In the default format of the content tag, OR as defined by a Cascading Style Sheet (internal or external).
  XML: XSLT files define the format of each section (i.e. font, color, size, etc.). Multiple XSLTs can be used for the same XML.
Selection & Organization
  HTML: All content is always included (no option to easily select or suppress content; you must manually change the document). Content is only displayed in the order written (to change the order you must manually change the document).
  XML: XSLT selects the content and determines its order of display. Multiple XSLTs can be used for the same XML (one to produce just a book title list, one to display full text, one for citations, etc.).

Differences

Analogy
  HTML: Address list in a plain Word document.
  XML: Address list in a database or mail-merge data file.
What you can get
  HTML: One document of your list of contacts, with all the information that you have for each person, in the order you typed it.
  XML: Friends & Family with full addresses for holiday cards; a list of just Professional contacts for announcing a new product; special formatting of the whole list for better display on a PDA; etc., all from the SAME XML document.
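The address-list analogy can be sketched in a few lines. In the workshop each view would be a separate XSL file; here two small Python functions stand in for those stylesheets, and the contact data, tag names, and group attribute are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical contact list; names, addresses, and tags are placeholders.
CONTACTS_XML = """
<contacts>
  <contact group="family">
    <name>Contact A</name>
    <address>1 Example Street</address>
  </contact>
  <contact group="professional">
    <name>Contact B</name>
    <address>2 Example Avenue</address>
  </contact>
  <contact group="friends">
    <name>Contact C</name>
    <address>3 Example Road</address>
  </contact>
</contacts>
"""

root = ET.fromstring(CONTACTS_XML)

def holiday_card_list(contacts):
    """View 1: friends and family, with full addresses."""
    return [
        f"{c.findtext('name')}, {c.findtext('address')}"
        for c in contacts.iter("contact")
        if c.get("group") in ("family", "friends")
    ]

def professional_names(contacts):
    """View 2: names of professional contacts only."""
    return [c.findtext("name")
            for c in contacts.findall("contact[@group='professional']")]

print(holiday_card_list(root))
print(professional_names(root))
```

Both lists come from the same XML; nothing in the data file changes when a new view is needed.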

How to Build an XML file family
1. Establish the Document Type Definition (DTD) or Schema.
2. Write a well-formed XML document that holds your data in the containers established by your DTD/Schema.
3. Validate your XML document to make sure you conformed to your DTD/Schema.
4. Build as many different XSL documents as you need to select data from your XML file, organize it the way you want it to appear, and format it so it looks the way you want.
Now you can link your XML file to whatever XSL you want to get the kind of display you want at any given time.

The XML family unit of files and languages

DTD or Schema
  The organizational chart for the data.
  File types: .dtd, .xml (schemas)
  Used for validation during creation.
XML
  Where the data is held.
  File type: .xml
XSL
  Instructions for using XML data and displaying it.
  File type: .xsl
  Languages used in XSL documents during creation:
    – XSLT to select data from the .xml file and format it
    – XPath to access certain spots in the .xml file
    – XSL-FO for specifying formatting semantics
  Uses HTML for formatting.
WEB PAGE
  1. Calls the .xml file
  2. Calls .xsl for display instructions
  3. Looks in .xml for content
  4. Returns content to .xsl
  5. Displays content to browser

The DTD or Schema
<!ELEMENT booktitle (#PCDATA)>
“+” means the element must appear at least once, and there can be as many of this element as you want.
The DTD establishes the hierarchy of elements/tags.
Original file content courtesy of Shaoping Moss.
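Most of the workshop's DTD is missing from this transcript, so the sketch below uses a hypothetical one to show what "establishing the hierarchy" means. Python's standard library does not validate against a DTD, so the DTD appears only as reference text and a hand-written function checks the same element counts. It does not check element order, and it is a stand-in for illustration, not a real validator.

```python
import xml.etree.ElementTree as ET

# Hypothetical DTD, shown for reference only; the code below never reads it.
# Only the booktitle declaration comes from the workshop slide.
HYPOTHETICAL_DTD = """
<!ELEMENT booklist (book+)>
<!ELEMENT book (booktitle, author+, publisher)>
<!ELEMENT booktitle (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT publisher (#PCDATA)>
"""

def check_booklist(root):
    """Hand-rolled count checks mirroring the hypothetical DTD above."""
    problems = []
    if root.tag != "booklist":
        problems.append("root element must be booklist")
    books = root.findall("book")
    if not books:
        problems.append("booklist needs at least one book")   # book+
    for number, book in enumerate(books, start=1):
        tags = [child.tag for child in book]
        if tags.count("booktitle") != 1:
            problems.append(f"book {number}: exactly one booktitle required")
        if tags.count("author") < 1:                           # author+
            problems.append(f"book {number}: at least one author required")
        if tags.count("publisher") != 1:
            problems.append(f"book {number}: exactly one publisher required")
    return problems

good = ET.fromstring(
    "<booklist><book>"
    "<booktitle>HTML and XHTML: the Definitive Guide</booktitle>"
    "<author>Chuck Musciano</author><author>Bill Kennedy</author>"
    "<publisher>O'Reilly</publisher>"
    "</book></booklist>"
)
bad = ET.fromstring(
    "<booklist><book>"
    "<booktitle>XHTML 1.0 Language Sourcebook</booktitle>"
    "<publisher>John Wiley and Sons</publisher>"
    "</book></booklist>"
)

print(check_booklist(good))
print(check_booklist(bad))
```

The second record fails only because its author was left out on purpose; a validating parser working from a real DTD would report the same kind of problem.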

The XML document
Book 1: HTML and XHTML: the Definitive Guide; Chuck Musciano; Bill Kennedy; USA; O’Reilly
Book 2: XHTML 1.0 Language Sourcebook; Ian S. Graham; USA; John Wiley and Sons
Annotations on the slide:
  – This is the DTD being used.
  – This is the XSL being used.
Original file content courtesy of Shaoping Moss.

The XSL document
Template output: the heading “My Book Collection” and a table with the columns Title, Author, Publisher, Country, Price.
Annotations on the slide:
  – “xsl:template” is XSLT for “use the template below”.
  – “match” is XPath for “link to” or “start with”, and “/” means the root element (“booklist” in this case).
  – “xsl:for-each” with the “select” instruction is XSLT for “select from each of the books in the booklist”.
  – “xsl:sort” with the “select” instruction is XSLT for “sort by publisher”.
  – “xsl:if” with the “test” instruction is XSLT for “only those books when the year is later than 1995”.
  – “xsl:value-of” with the “select” instruction is XSLT for “use the data from this element”.
  – The rest is basic HTML for the template.
  – You must close your XSLT commands.
  – You must close the HTML tags of your template.
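For readers who find the XSLT instructions easier to follow next to ordinary code, here is a rough Python equivalent of what the stylesheet does. This is not XSLT and not the workshop's file; the book records, years, and publisher names are placeholders invented for the sketch, and each step is commented with the XSLT instruction it imitates.

```python
import xml.etree.ElementTree as ET
from html import escape

# Placeholder records; titles, publishers, and years are invented.
BOOKLIST_XML = """
<booklist>
  <book>
    <booktitle>Book A</booktitle>
    <publisher>Publisher Z</publisher>
    <year>1999</year>
  </book>
  <book>
    <booktitle>Book B</booktitle>
    <publisher>Publisher M</publisher>
    <year>1994</year>
  </book>
  <book>
    <booktitle>Book C</booktitle>
    <publisher>Publisher A</publisher>
    <year>2001</year>
  </book>
</booklist>
"""

def render_book_table(xml_text):
    root = ET.fromstring(xml_text)                      # match="/": start at the root (booklist)
    books = root.findall("book")                        # xsl:for-each select="book"
    recent = [b for b in books
              if int(b.findtext("year")) > 1995]        # xsl:if test: year later than 1995
    recent.sort(key=lambda b: b.findtext("publisher"))  # xsl:sort select="publisher"
    lines = ["<table>", "<tr><th>Title</th><th>Publisher</th></tr>"]
    for book in recent:
        title = escape(book.findtext("booktitle"))      # xsl:value-of select="booktitle"
        publisher = escape(book.findtext("publisher"))  # xsl:value-of select="publisher"
        lines.append(f"<tr><td>{title}</td><td>{publisher}</td></tr>")
    lines.append("</table>")                            # close the HTML tags of the template
    return "\n".join(lines)

html_table = render_book_table(BOOKLIST_XML)
print(html_table)
```

The 1994 book is dropped by the year test, and the remaining two come out ordered by publisher rather than in the order they were typed.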

The Web Page Original file content courtesy of Shaoping Moss.

Done! – not so hard Logical Flexible Extensible Interoperable!!

XML in Libraries
Use XML to map MARC to MARC XML, HTML, or MODS formats (MARC XML Conversion Stylesheets).
Use XML to improve searching of archival finding aids and to catalog Web sites (Five College Archives & Manuscript Collections).
XML-based eScholarship.
Use XML for interlibrary loan.
XML-based database systems.

XML in Academics
Text Encoding Initiative (TEI)
  – Initially launched in 1987, TEI is an international and interdisciplinary standard for encoding, keeping, and analyzing the textual content and structure of digital texts.
  – This standard is designed for use with a broad range of text types, especially in the humanities. It is widely used in libraries, archives, and by publishers and researchers for online research and teaching and for the storage and exchange of large and small text collections.
  – Since 1987, TEI projects have mushroomed in all humanities disciplines, including language, literature, history, classics, social science and computer science.

TEI projects
Women Writers Project.
Perseus Digital Library.
Early American Fiction Collection.
American Memory Project: Historical Collections for the National Digital Library.
The Newton Papers Project.

XML is Going to Be Everywhere
TEI guidelines for the Text Encoding Initiative
EAD for Encoded Archival Description
The Dublin Core Metadata Initiative (DCMI)
MARC XML: MARC 21 XML Schema
MODS XML: Metadata Object Description Schema

XML is Going to Be Everywhere
Resource Description Framework (RDF)
Information and Content Exchange (ICE)
Online Information Exchange (ONIX)
Metadata for Images in XML (MIX)
XML/EDI (Electronic Data Interchange)
Bioinformatic Sequence Markup Language (BSML)
Mathematical Markup Language (MathML)

XML in Future Web Development
XML is a cross-platform, software- and hardware-independent tool for transmitting information.
XML will be as important to the future of the Web as HTML has been to the foundation of the Web.
XML will become the most common tool for all data manipulation and data transmission.
Every serious Web technology is now expected to define its relationship to XML.

XML in Future Web Development
“Every serious Web technology is now expected to define its relationship to XML.”
- Catherine Ebenezer in Trends in Integrated Library Systems.

Shaoping Moss
Information Technology Consultant
Research and Instructional Support
Mount Holyoke College
We are grateful to Shaoping Moss for being such an excellent instructor and giving us permission to use her slides and materials in this presentation.

So this XML stuff is rad and all but could I see why I’d want to learn it and not just an encoding set like EAD?

Well, suppose you’ve got a batch of metadata on your hands. Not just any metadata, but some weird set of information that can’t really be shoehorned into your pal MARC 21. You need some way of organizing the metadata. It would be nice if you could make the metadata look all pretty and whatnot, while you’re at it.

Here’s where XML comes in!
1. Get your metadata together, having done all the sexy stuff like data dictionary creation first.
2. Define labels for everything.
3. Match related terms, including subordinates.
4. Define your rules (Y can only appear after X, and if you have X and Y, you must have Z, but Q is optional, etc.).
5. You’ve pretty much just made up a schema right there.
6. Wait, what was that about making it pretty?

Oh, right, it should be attractive. Well, then you just start playing with XSL. Specifically, you tell the XSL to go look at the plain ol’ stylesheet you’ve adapted from a thousand other HTML pages.

So then you’ve got this.

Hey, wait. I thought you said this was all cross-platform and cross-browser. How come this isn’t parsing in my browser? And how do I search individual records? You mean I have to hand encode every record?

Well, yes. You can write your own parser, export encoded records from a database, or create a search engine if you like. You’ll just need more than a semester’s worth of practice to do it.
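To give a feel for the "export encoded records from a database" option, here is a minimal sketch. A plain list of dictionaries stands in for the database rows, and the field names (id, title, creator) are hypothetical; a real project would use whatever labels its own schema defines.

```python
import xml.etree.ElementTree as ET

# Stand-in for rows pulled from a database; all values are placeholders.
rows = [
    {"id": "1", "title": "Record One", "creator": "Creator A"},
    {"id": "2", "title": "Record Two", "creator": "Creator B & Co."},
]

def rows_to_xml(rows):
    """Build one <record> element per row and return the XML as text."""
    root = ET.Element("records")
    for row in rows:
        record = ET.SubElement(root, "record", id=row["id"])
        for field in ("title", "creator"):
            ET.SubElement(record, field).text = row[field]
    return ET.tostring(root, encoding="unicode")

xml_text = rows_to_xml(rows)
print(xml_text)

# Looking up an individual record in the exported XML by its id attribute.
found = ET.fromstring(xml_text).find("record[@id='2']/creator")
print(found.text)
```

The library escapes the ampersand on the way out and restores it on the way back in, which is one of the details that makes hand encoding tedious.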