Lucas Mak and Dao Rong Gong Michigan State University Millennium and XML: Repurposing and Customizing Metadata May 17 - 20, 2009.

Presentation transcript:

Lucas Mak and Dao Rong Gong, Michigan State University
Millennium and XML: Repurposing and Customizing Metadata
May 17-20, 2009

Today’s Outline
Overview of Metadata
Millennium system and XML
Overview of XSLT
Case Studies
1. Sunday School Books Collection
2. New Book List
Conclusions and Observations

Metadata
Structured data or information about an information resource.
Types of metadata:
–Descriptive
–Administrative/Rights
–Preservation
–Technical
–Structural

Descriptive Metadata
Popular descriptive metadata standards
–Dublin Core (Simple & Qualified)
–MODS
–MARCXML
–VRA Core
–IEEE LOM
–TEI Header
–EAD

Innovative XML
XML records from Millennium
Retrieved through HTTP query
Data arrangement based on MARC fields
–But a MARC field and its subfields are siblings
Optimized for WebPAC display
–Brief record (for search result index page display)
  Contains data from MARC 245, publication year, record ID
–Full record (for both public and staff MARC display of individual record)

Public display Staff MARC display

Millennium System and XML
[Diagram labels: Millennium; Delimited Text; MARC; XML; /xrecord; XML Server; OAI Harvester; Metadata Builder; Content Pro]

/xrecord

XML Server XML server query string (search for title “xslt”): txslt

OAI Harvester

MetaData Builder

Content Pro in Encore

XSLT
Extensible Stylesheet Language Transformations
Current version: 2.0
“Transformation” means:
–Manipulation of XML documents by creating a new document based on the original document
Usages in library context:
–Crosswalking
  Data selection and manipulation
–Web display
  Example: converting EAD into HTML for web display

XSLT
Uses XPath expressions to select/filter data nodes
–By name of “Element”
–By value of “Element” and/or “Attribute”
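The two selection styles on this slide can be sketched against MARC21XML. This is a minimal illustration, not the presenters' stylesheet: the `records`/`record` wrapper elements and the literal search string 'Sunday schools' are assumptions made for the example; the `marc:` and `dc:` namespaces are the standard MARC21 slim and Dublin Core ones.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:marc="http://www.loc.gov/MARC21/slim"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    exclude-result-prefixes="marc">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical wrapper so the output has a single root element -->
  <xsl:template match="/">
    <records>
      <xsl:apply-templates select="//marc:record"/>
    </records>
  </xsl:template>

  <xsl:template match="marc:record">
    <record>
      <!-- Select by element name plus attribute value: 245 subfield a -->
      <xsl:for-each select="marc:datafield[@tag='245']/marc:subfield[@code='a']">
        <dc:title>
          <xsl:value-of select="normalize-space(.)"/>
        </dc:title>
      </xsl:for-each>

      <!-- Select by element value: 650 subfield a containing an assumed string -->
      <xsl:for-each select="marc:datafield[@tag='650']/marc:subfield[@code='a'][contains(., 'Sunday schools')]">
        <dc:subject>
          <xsl:value-of select="normalize-space(.)"/>
        </dc:subject>
      </xsl:for-each>
    </record>
  </xsl:template>
</xsl:stylesheet>
```

Each bracketed predicate narrows the node set: first by attribute value (`@tag`, `@code`), then, in the second loop, by the text value of the element itself.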

Case Study One
Sunday School Books Collection
–19th-century publications by religious societies
–170 titles digitized and cataloged
Data conversion needs
–Source: Millennium
–Target: Content Pro
–Conversions in:
  Format: .marc to XML
  Schema and data structure: MARC to Qualified Dublin Core

Options for Data Migration
[Diagram labels: Millennium; HTTP Query; Innovative XML; Create Lists; MARC File; MarcEdit; MARC XML; XSLT; Content Pro (QDC)]

Segment of Innovative XML
–Siblings (MARC field and its subfields)
–MARC field/subfield as value of element
–Field indicator as value of element

Segment of MARC21XML
–Parent-child (MARC field and its subfields)
–MARC field/subfield as value of element attribute
–Field indicator as value of element attribute

Issues with Innovative XML for data conversion needs
–Data structured differently from MARC21XML
  Availability of existing “Innovative XML to DC/QDC” XSLT?
–Not optimized for data manipulation
  Complications in data selection
  »Selection of a data node by matching criteria against values in individual elements
  »A series of matches may be needed to select just one node
  Efficiency in processing
  »Multiple upward, downward, and lateral movements involved in data selection

Final Path of Data Migration
[Diagram labels: Millennium (.marc); Create Lists; MARC File; MarcEdit; MARC XML; XSLT; Content Pro (QDC)]

Design of XSLT
Based on LC’s “MARC To Simple DC” XSLT
–Customized mappings according to LC’s suggestions
–Crosswalking strategies
  Conditional processing (i.e. matching): boolean(), contains(), starts-with()
  String manipulation (used in both conditional processing and data selection for output): substring(), substring-before(), substring-after(), translate(), concat(), normalize-space()

Design of XSLT
Conditional Processing & String Manipulation in De-duplication
<xsl:if test="not(contains($dataField245Lower, translate(substring(normalize-space(.), 1, string-length() - 1), $upperCase, $lowerCase)))">
  <xsl:value-of select="normalize-space(substring(., 1, string-length() - 1))"/>
</xsl:if>
Annotations:
–Compares MARC 246 against MARC 245
–Converts 245 & 246 into lower case before comparing
–Chops the trailing period (.)
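For context, the de-duplication test can be placed in a complete stylesheet roughly as follows. This is a sketch under stated assumptions: the variable names `$upperCase`, `$lowerCase`, and `$dataField245Lower` come from the slide, but how they were defined is not shown there, so the definitions below are guesses. The `records`/`record` wrappers and the `dcterms:alternative` output element are also assumptions. Unlike the slide fragment, this version removes the last character only when it really is a period.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:marc="http://www.loc.gov/MARC21/slim"
    xmlns:dcterms="http://purl.org/dc/terms/"
    exclude-result-prefixes="marc">

  <xsl:output method="xml" indent="yes"/>

  <!-- XSLT 1.0 has no lower-case(); translate() with these two strings is the usual idiom -->
  <xsl:variable name="upperCase" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
  <xsl:variable name="lowerCase" select="'abcdefghijklmnopqrstuvwxyz'"/>

  <!-- Hypothetical wrapper so the output has a single root element -->
  <xsl:template match="/">
    <records>
      <xsl:apply-templates select="//marc:record"/>
    </records>
  </xsl:template>

  <xsl:template match="marc:record">
    <!-- Assumed definition: whole 245 field, whitespace-normalized and lower-cased -->
    <xsl:variable name="dataField245Lower"
        select="translate(normalize-space(marc:datafield[@tag='245']), $upperCase, $lowerCase)"/>
    <record>
      <xsl:for-each select="marc:datafield[@tag='246']/marc:subfield[@code='a']">
        <xsl:variable name="title246" select="normalize-space(.)"/>
        <!-- Drop one trailing period, if there is one -->
        <xsl:variable name="trimmed">
          <xsl:choose>
            <xsl:when test="substring($title246, string-length($title246)) = '.'">
              <xsl:value-of select="substring($title246, 1, string-length($title246) - 1)"/>
            </xsl:when>
            <xsl:otherwise>
              <xsl:value-of select="$title246"/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:variable>
        <!-- Emit the 246 title only when it is not already contained in the 245 -->
        <xsl:if test="not(contains($dataField245Lower, translate($trimmed, $upperCase, $lowerCase)))">
          <dcterms:alternative>
            <xsl:value-of select="$trimmed"/>
          </dcterms:alternative>
        </xsl:if>
      </xsl:for-each>
    </record>
  </xsl:template>
</xsl:stylesheet>
```

Computing the lower-cased 245 once per record keeps the comparison inside the loop short, which matters when a record carries several 246 fields.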

Design of XSLT No for MARC 246

Design of XSLT
Predicate
–Used for data selection and de-duplication
[XSLT fragment garbled in extraction: <xsl:for-each = preceding-sibling::marc: and]
Annotations:
–Selects LCSH only
–Selects unique 650$y only
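The original predicate did not survive extraction, so the following is a reconstruction of the idea rather than the presenters' code. It assumes LCSH headings are identified by second indicator 0 on the 650 field (the standard MARC convention) and that uniqueness is checked with the `preceding-sibling` axis, which the surviving fragment mentions. The `records`/`record` wrappers and the `dcterms:temporal` target element are assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:marc="http://www.loc.gov/MARC21/slim"
    xmlns:dcterms="http://purl.org/dc/terms/"
    exclude-result-prefixes="marc">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical wrapper so the output has a single root element -->
  <xsl:template match="/">
    <records>
      <xsl:apply-templates select="//marc:record"/>
    </records>
  </xsl:template>

  <xsl:template match="marc:record">
    <record>
      <!-- Predicates, in order:
           1. 650 fields only
           2. second indicator 0, i.e. LCSH
           3. subfield y only
           4. value not already seen earlier in the same field
              or in any earlier LCSH 650 field of this record -->
      <xsl:for-each select="marc:datafield[@tag='650'][@ind2='0']
                            /marc:subfield[@code='y']
                            [not(. = preceding-sibling::marc:subfield[@code='y'])
                             and not(. = ../preceding-sibling::marc:datafield[@tag='650'][@ind2='0']
                                         /marc:subfield[@code='y'])]">
        <dcterms:temporal>
          <xsl:value-of select="."/>
        </dcterms:temporal>
      </xsl:for-each>
    </record>
  </xsl:template>
</xsl:stylesheet>
```

The comparison is an exact string match, so "19th century" and "19th century." would count as different values; normalizing punctuation first, as in the 245/246 comparison, would tighten this.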

Design of XSLT
Hard-coding
–Inserted elements that are global to all records
–Example value: application/pdf
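Hard-coding in XSLT usually means writing a literal result element into the record template, so every output record receives the same value. The element names were lost from the slide; the sketch below assumes the value was carried in `dc:format`, and the `records`/`record` wrappers are likewise hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:marc="http://www.loc.gov/MARC21/slim"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    exclude-result-prefixes="marc">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical wrapper so the output has a single root element -->
  <xsl:template match="/">
    <records>
      <xsl:apply-templates select="//marc:record"/>
    </records>
  </xsl:template>

  <xsl:template match="marc:record">
    <record>
      <!-- Value taken from the source record -->
      <dc:title>
        <xsl:value-of select="normalize-space(marc:datafield[@tag='245']/marc:subfield[@code='a'])"/>
      </dc:title>
      <!-- Literal result element: identical for every record, independent of the source data -->
      <dc:format>application/pdf</dc:format>
    </record>
  </xsl:template>
</xsl:stylesheet>
```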

Segment of Source MARCXML

Segment of Output QDC XML

Case Study Two
Library’s book lists
Issues with featured list

Case Study Two
Existing New Book List
–Newly cataloged books for browse shelf
–New approach using XML and XSLT
New features design
–Sorting
–RSS feed
–Customization

New Book List Based on XML File
Millennium XML server outputs two files
–Entire new book list over a rolling period of time
–List of daily added books
New Book List program output
–Book list in HTML format
–RSS feed for daily added books
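A sortable HTML book list of this kind could be produced along the following lines. The transcript does not show the structure of the Millennium XML output, so the input format here (`newbooks` containing `book` elements with `title`, `author`, and `url` children) and the `sortBy` parameter are entirely hypothetical, chosen only to illustrate sorting and customization in XSLT 1.0.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" indent="yes"/>

  <!-- Hypothetical parameter: name of the child element to sort on -->
  <xsl:param name="sortBy" select="'title'"/>

  <!-- Assumes a hypothetical input of the form
       <newbooks><book><title/><author/><url/></book>...</newbooks> -->
  <xsl:template match="/newbooks">
    <html>
      <head>
        <title>New Book List</title>
      </head>
      <body>
        <ul>
          <xsl:for-each select="book">
            <!-- Sort on whichever child element the parameter names -->
            <xsl:sort select="*[name() = $sortBy]" data-type="text" order="ascending"/>
            <li>
              <a href="{url}">
                <xsl:value-of select="title"/>
              </a>
              <xsl:if test="author">
                <xsl:text>, </xsl:text>
                <xsl:value-of select="author"/>
              </xsl:if>
            </li>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

Passing a different value for `sortBy` from the calling program (for example with the `--stringparam` option of `xsltproc`) re-sorts the same XML by author without touching the stylesheet.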

Path of Data Processing
[Diagram labels: Millennium; EXPECT; XML output; XSLT; Web Server & PHP; Internet]

Design of XSLT

Putting It Together

Observations and Challenges
Millennium System and XML
–XSLT processor within Millennium and customizing Innovative XML output
Using XML as data source
–Large XML file size
XSLT and data processing
–XSLT data manipulation
–Lack of built-in functions for conditional data looping, etc.

Thank you!