A Lightweight Structured Data Implementation Using JSON-LD and Schema.org for Digital Repository

Presentation transcript:

A Lightweight Structured Data Implementation Using JSON-LD and Schema.org for Digital Repository
Lucas Mak, Lisa Lorenzo, Nicole Smeltekop
Michigan State University Libraries
ALCTS CaMMS Cataloging Norms Interest Group (ALA Midwinter 2017, Atlanta GA, January 21, 2017)

Background
Digital repository @ MSU
- Islandora repository
- Formats: text, audio, image, compound object
- Metadata: MarcXML, MODS, DC, ETD-MS
- Metadata stored as datastreams along with the digital objects
- Fedora back end with Drupal front end

Structured Data Markup
- “Describes things on the Web with their properties”*
- Typically uses the schema.org vocabulary
- Commonly in JSON-LD, RDFa, or microdata format

What others have done …

What we want …

Mapping
- First, map MODS elements to Schema.org elements
- Not all MODS elements are mapped

Choosing Markup Format
JSON-LD (JavaScript Object Notation for Linked Data)
- Embedded in the HTML <head> as a single block of code, rather than as attributes added to individual HTML tags -> less work
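As a rough illustration of the embedding approach on this slide, the Python sketch below serializes a description as JSON-LD and wraps it in a single script element meant for the HTML head. The helper name `jsonld_script_tag`, the sample record, and the choice of Schema.org properties are assumptions made for illustration; they are not taken from the presentation.

```python
import json


def jsonld_script_tag(data: dict) -> str:
    """Serialize a dict as JSON-LD and wrap it in a <script> element."""
    body = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a value containing "</script>" cannot close the
    # element early; "\/" is a valid JSON escape for "/".
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical record: the property choices are illustrative only.
record = {
    "@context": "http://schema.org",
    "@type": "CreativeWork",
    "name": "Sample item",
    "inLanguage": "http://id.loc.gov/vocabulary/iso639-2/eng",
}

print(jsonld_script_tag(record))
```

Because the markup lives in one block, the page template itself needs no per-element attributes, which is the "less work" trade-off the slide points to.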

Validation
Google Structured Data Testing Tool @ https://search.google.com/structured-data/testing-tool?url
- Created a sample record based on the mapping and validated it with Google's online validation tool
- The validation tool does not allow mixing and matching of vocabularies
  -> can’t mix dcterms with schema
  -> can’t mix properties from different schema types

Creating Transformation (XSLT)
- PHP can run XSLT 1.0 only

Implementation
[Diagram labels: MODS, XSLT, JSON-LD, PHP]
- Decided not to store the JSON-LD data as a datastream, to minimize maintenance
- When the item page loads, a PHP script grabs the MODS record, applies the XSLT to it, and embeds the output JSON-LD (wrapped in a <script> tag) in the HTML header
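The on-the-fly transformation described on this slide can be sketched roughly as follows. This is not the presenters' XSLT or PHP: it swaps in Python's standard-library ElementTree to show the same step (read MODS, emit JSON-LD). The sample MODS record, the function name `mods_to_jsonld`, and the element-to-property mapping (title to `name`, namePart to `creator`, topic to `about`) are all assumptions for illustration.

```python
import json
import xml.etree.ElementTree as ET

# MODS version 3 namespace.
NS = {"mods": "http://www.loc.gov/mods/v3"}

# Hypothetical, heavily simplified MODS record.
SAMPLE_MODS = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo><title>Sample item</title></titleInfo>
  <name><namePart>Doe, Jane</namePart></name>
  <subject>
    <topic valueURI="http://id.worldcat.org/fast/958834">Holocaust memorials</topic>
  </subject>
</mods>"""


def mods_to_jsonld(mods_xml: str) -> dict:
    """Map a few MODS elements to Schema.org properties (illustrative mapping)."""
    root = ET.fromstring(mods_xml)
    doc = {"@context": "http://schema.org", "@type": "CreativeWork"}

    title = root.findtext("mods:titleInfo/mods:title", namespaces=NS)
    if title:
        doc["name"] = title

    names = [e.text for e in root.findall("mods:name/mods:namePart", NS) if e.text]
    if names:
        doc["creator"] = [{"@type": "Person", "name": n} for n in names]

    about = []
    for topic in root.findall("mods:subject/mods:topic", NS):
        entry = {"@type": "Thing", "name": topic.text}
        uri = topic.get("valueURI")
        if uri:
            # A URI already present in MODS is carried over as the node identifier.
            entry["@id"] = uri
        about.append(entry)
    if about:
        doc["about"] = about
    return doc


print(json.dumps(mods_to_jsonld(SAMPLE_MODS), indent=2))
```

Generating the JSON-LD at page load, rather than storing it, means a change to the MODS record or to the mapping shows up immediately without regenerating any stored copy.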

Getting URIs into JSON-LD
Getting URIs into source data
- URIs are inserted by the authority vendor during authority processing (for records that originated from the catalog)
- URIs are inserted into $0 in MarcXML or @valueURI in MODS, manually or programmatically using a conversion table
- Possibly using MarcNext in MarcEdit, or querying the APIs of various linked data services, in the future
- URIs are built for certain elements during transformation from MarcXML to MODS, e.g. language code: eng -> http://id.loc.gov/vocabulary/iso639-2/eng
URIs in MODS get carried over to JSON-LD during the XSLT transformation
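Building a URI from a code, as in the language example on this slide, amounts to appending the code to a fixed base. A minimal Python sketch follows, assuming a three-letter ISO 639-2 code as input; the function name `language_uri` and the validation rule are illustrative assumptions, and the presenters did this step inside their MarcXML-to-MODS transformation rather than in Python.

```python
LANGUAGE_URI_BASE = "http://id.loc.gov/vocabulary/iso639-2/"


def language_uri(code: str) -> str:
    """Build an id.loc.gov URI from a three-letter ISO 639-2 language code."""
    code = code.strip().lower()
    # Reject anything that is not exactly three ASCII letters.
    if len(code) != 3 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"not a three-letter language code: {code!r}")
    return LANGUAGE_URI_BASE + code


print(language_uri("eng"))
```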

Getting URIs into JSON-LD
LCSH – the sticking point
- Not all possible LCSH strings have corresponding URIs
  Holocaust memorials -> http://id.loc.gov/authorities/subjects/sh88005153
  Poland -> http://id.loc.gov/authorities/names/n79131071
  Holocaust memorials -- Poland -> ??
- Pattern headings
  Personal narratives, American, [French, etc.] -> http://id.loc.gov/authorities/subjects/sh99001715
  Can we really use this URI?
FAST – the “solution”
  Holocaust memorials -> http://id.worldcat.org/fast/958834
  Poland -> http://id.worldcat.org/fast/1206891
  Personal narratives--American -> http://id.worldcat.org/fast/1424071

Getting URIs into JSON-LD
FAST – the “solution”
- Get FAST headings and IDs from LCSH using the OCLC FAST Converter @ https://fast.oclc.org/lcsh2fast/
- Build URIs from the FAST ID inserted in $0 during transformation into MODS, e.g. World War (1939-1945) (OCoLC)fst01180924 -> http://id.worldcat.org/fast/1180924
- Problem: lag in updates to the converter database
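The FAST example on this slide suggests a simple rule: drop the "(OCoLC)fst" prefix and any leading zeros, then append the remaining digits to the FAST base URI. The Python sketch below follows that rule; the function name `fast_uri` and the regular expression are assumptions based only on the single example shown, not the presenters' transformation code.

```python
import re

FAST_URI_BASE = "http://id.worldcat.org/fast/"

# "(OCoLC)fst" prefix, optional leading zeros, then the numeric FAST ID.
_FAST_ID = re.compile(r"^\(OCoLC\)fst0*(\d+)$")


def fast_uri(control_number: str) -> str:
    """Build a FAST URI from a $0 value such as "(OCoLC)fst01180924"."""
    match = _FAST_ID.match(control_number.strip())
    if match is None:
        raise ValueError(f"not a FAST control number: {control_number!r}")
    return FAST_URI_BASE + match.group(1)


print(fast_uri("(OCoLC)fst01180924"))
```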

Next Step
- Script to convert LCSH in MODS to FAST using the OCLC FAST API
- MarcXML is not available for all digital collections
- Moving away from creating MarcXML
- Should we just use FAST??
- Markup in vocabularies other than schema.org, e.g. dcterms

Thank You!
Lucas Mak makw@mail.lib.msu.edu
Lisa Lorenzo lorenzo7@mail.lib.msu.edu
Nicole Smeltekop nicole@mail.lib.msu.edu