OCLC Project Passage User Interface: Assisting the cataloging workflow


OCLC Project Passage User Interface: Assisting the cataloging workflow. 2019 LD4 Conference on Linked Data in Libraries, May 11, 2019. Bruce Washburn, Principal Engineer, OCLC Research.

About Project Passage: an OCLC pilot project, January–September 2018, with 16 OCLC member library participants. The pilot evaluated Wikibase as a cataloging platform for library linked data. The Passage Wikibase was initialized with ~1M entities, representing overlaps between Wikidata, VIAF, and WorldCat, and project participants used the platform for cataloging, focusing on selected material types. In this short presentation I will talk about OCLC’s Project Passage (https://www.oclc.org/research/themes/data-science/linkeddata/linked-data-prototype.html), a pilot project carried out in 2018 to evaluate a federated instance of the Wikibase platform for creating bibliographic linked data.

The Wikibase Cataloging Interface: mature and well-tested; used by thousands for editing Wikidata; the primary editing interface for Project Passage; requires some initial orientation and practice. The Wikibase software platform is mature and well-tested, having been in development and widely used since 2012. For example, there are about 20,000 active users editing Wikidata (which uses Wikibase) each month, working in a global, decentralized network. https://stats.wikimedia.org/wikispecial/EN/TablesWikipediaWIKIDATA.htm#editor_activity_levels We used the built-in web-based editor that is part of the Wikibase platform as the primary means of creating and updating data in Project Passage. In our experience working with the Wikibase platform and with the project participants, we noted that some initial orientation, training, and practice was needed, both to become acquainted with the features of the user interface and to better understand the data architecture of an entity-oriented, rather than record-oriented, cataloging environment.

The cataloging workflow is susceptible to interruption, depending on the underlying data. An example: editing a description of a collection of photographs that depict various places. If a description of a place doesn’t already exist in the system, the description of the work can stall until the place can be described in its own, separate entity description, after which the collection description can link to the place using the “depicts” property. Photo credit: “The Missing Piece” by J. Sibiga Photography, https://www.flickr.com/photos/picturepurrfect685/4628679591, licensed under CC BY-NC-ND 2.0 (Attribution-NonCommercial-NoDerivs 2.0 Generic).

Here’s an example. We’re looking at a Project Passage entity description of an archival collection of stereographs. The collection is the primary entity we’re describing, but the collection is related to other entities in the system, including events (the 1906 earthquake), organizations (the Bancroft Library at UC Berkeley, where the collection is held), and other forms of description (an online finding aid).

There is also a property that connects the collection to other entities that the stereographs in the collection depict, including the city of San Francisco, the event of the 1906 earthquake, and refugee camps that were built after the event. The archivists who maintain this collection have noted other places that are depicted, so this list can be extended in the Project Passage description.

One of those locations is Market Street in San Francisco. Using the Wikibase editing interface, a new “depicts” statement can easily be added, with the interface looking for and finding a separate entity description of Market Street in the Project Passage database.

Selecting the matching entity and clicking the “Save” link is all it takes to connect this collection to this place with a “depicts” relationship.

The collection also depicts Mission Street. In this case, the Project Passage database doesn’t have an entity description for this location, and here is where the workflow stalls. It’s interesting that we’re stalled, because that highlights an important distinction between the Wikibase framework and some of the record-oriented bibliographic frameworks we may be more accustomed to. In most cataloging or metadata documentation systems (using MARC, or EAD for finding aids, or CONTENTdm for digital content, for example) there would be nothing to stop us at this stage if we did not have an authoritative heading. We might not, in those systems, even have a way to express that the relationship between the work and the place is “depiction”. And if we wanted to tag the place we could add a MARC 651 field, or an EAD geogname element, or a Dublin Core Spatial Coverage element in CONTENTdm. But in the Wikibase environment, if we want to connect the collection to an entity for Mission Street, we need to either: (1) note the missing entity, continue editing the collection, and subsequently create the entity description, then return to the collection entity and update it with a link; or (2) put the collection editing on hold while, in a parallel process, we create at least a stub description of the location so that it can be linked to the collection, noting the new location entity as something to improve later. There isn’t a way to simply add the string to the “depicts” relationship, and doing so would undermine the subsequent interoperability and power of the entity description. In working with the Project Passage participants, we preferred the second option, but wanted to streamline the process of adding the missing entity, and to create more than just a stub description, in order to keep the cataloger’s attention on describing the work.
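To make the distinction concrete, here is a minimal sketch (Python, using the requests library against the standard Wikibase action API) of what adding a “depicts” statement involves. The endpoint URL, the collection and street item IDs, and the use of Wikidata’s P180 for “depicts” are illustrative placeholders rather than the actual Project Passage identifiers; the point is only that the value must be an existing item, not a free-text string.

```python
import json
import requests

# Illustrative values only: a real call also needs an authenticated session
# and a CSRF edit token, and Project Passage used its own property/item IDs.
WIKIBASE_API = "https://passage.example.org/w/api.php"   # hypothetical endpoint
COLLECTION_QID = "Q100"       # the stereograph collection (placeholder)
DEPICTS_PID = "P180"          # "depicts" (Wikidata's ID; Passage's may differ)
MISSION_STREET_QID = "Q200"   # must already exist as an entity (placeholder)

def add_depicts_statement(session, csrf_token):
    """Attach a 'depicts' statement whose value is an item, not a string."""
    value = {"entity-type": "item", "numeric-id": int(MISSION_STREET_QID[1:])}
    resp = session.post(WIKIBASE_API, data={
        "action": "wbcreateclaim",
        "entity": COLLECTION_QID,
        "property": DEPICTS_PID,
        "snaktype": "value",
        "value": json.dumps(value),   # an item reference; plain text is rejected
        "token": csrf_token,
        "format": "json",
    })
    return resp.json()
```

If Mission Street has no item yet, there is nothing valid to put in the value, which is exactly where the workflow stalls.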

The Project Passage Retriever: search for entity descriptions in Wikidata, VIAF, and FAST; align the classes and properties from these three resources with Project Passage; display the data in a simple editing interface with tools to review, correct, and supplement imported data; and load the data into Project Passage. Out of this need to streamline the editing workflow, these functional requirements emerged for a new application that would serve as a helper for creating new descriptions for entities not found in Project Passage but described in other resources, which we named the Project Passage Retriever. This was designed as a companion application for the Wikibase editor: imagine having two browser windows open, one for the Wikibase interface and another for the Passage Retriever. As the cataloger encountered the need to link a work description to another entity, like adding a “depicts” statement to the collection description for Mission Street, the Retriever would be asked to look for a description, map its data to the properties and classes defined in Project Passage, give the cataloger a way to review and amend the imported data, and provide a way to push the data into the Project Passage database.

Here’s a view of the Project Passage Retriever, with search results for the keyword search “mission street”. The Retriever uses the three different APIs that are available for Wikidata, VIAF, and FAST to look for matches. It also uses the identifiers associated with the matching results to check Project Passage, to make sure the entity isn’t already available there. There is a match for Mission Street in the Wikidata results, so we’ll select that.
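As an illustration of this kind of federated lookup, here is a minimal Python sketch that queries the public Wikidata entity search API alongside VIAF’s AutoSuggest and OCLC’s assignFAST suggest services. The Wikidata call is standard; the VIAF and FAST endpoints and response fields are hedged assumptions based on their public documentation, and none of this is the Retriever’s actual code.

```python
import requests

def search_wikidata(term):
    """Keyword search against the Wikidata entity search API."""
    resp = requests.get("https://www.wikidata.org/w/api.php", params={
        "action": "wbsearchentities", "search": term,
        "language": "en", "format": "json",
    })
    return [(hit["id"], hit.get("label", "")) for hit in resp.json().get("search", [])]

def search_viaf(term):
    """VIAF AutoSuggest lookup (endpoint and fields assumed from public docs)."""
    resp = requests.get("https://viaf.org/viaf/AutoSuggest", params={"query": term})
    return [(hit.get("viafid"), hit.get("term")) for hit in resp.json().get("result") or []]

def search_fast(term):
    """assignFAST suggest lookup (endpoint and fields assumed from public docs)."""
    resp = requests.get("https://fast.oclc.org/searchfast/fastsuggest", params={
        "query": term, "queryIndex": "suggestall",
        "queryReturn": "suggestall,idroot", "suggest": "autoSubject",
    })
    docs = resp.json().get("response", {}).get("docs", [])
    return [(doc.get("idroot"), doc.get("suggestall")) for doc in docs]

def federated_search(term):
    """Combine the three sources into one result set for the cataloger."""
    return {
        "wikidata": search_wikidata(term),
        "viaf": search_viaf(term),
        "fast": search_fast(term),
    }

print(federated_search("mission street"))
```

In the Retriever, the identifiers returned here (Wikidata Q-ids, VIAF IDs, FAST IDs) would also be checked against Project Passage so that already-imported entities are not duplicated.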

The Retriever: federated searching of Wikidata, VIAF, and FAST; built-in “crosswalks” to relate properties and classes in those systems and vocabularies to Project Passage; cross-pollination of source data (e.g., including additional aliases from VIAF for a Wikidata match); a form-based UI for validating and improving imported data; and a simple bot-based mechanism for adding data to the Passage Wikibase. The Project Passage Retriever uses the Wikidata API to retrieve data for the entity description, and using a built-in “crosswalk” it looks for Wikidata properties that have an equivalent property in Project Passage. Wikidata’s properties are far more extensive than those defined in Project Passage, so not all of the data from Wikidata will transfer, by design. The Retriever displays the entity data in a web form to allow for review and minor editing. For example, an incorrect or invalid statement can be deleted, or the descriptive “Fingerprint” data can be modified. In this case we’d like to improve the description, which doesn’t say what kind of thing “Mission Street” is and isn’t as accurate as it could be about the city where it is located. And to help with matching for searches or lookups that abbreviate the word “Street”, we can add an “also known as” string.
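A data model crosswalk of this kind can be as simple as a mapping table applied to the claims returned by the Wikidata API. The sketch below is a simplified assumption of how that filtering might look: the Passage property IDs on the right-hand side of the mapping are hypothetical placeholders, and only the claims whose Wikidata property has an equivalent are kept.

```python
import requests

# Hypothetical crosswalk: Wikidata property ID -> Project Passage property ID.
# The real Passage vocabulary was much smaller than Wikidata's, by design.
WIKIDATA_TO_PASSAGE = {
    "P31": "P1",    # instance of
    "P131": "P2",   # located in the administrative territorial entity
    "P625": "P3",   # coordinate location
}

def fetch_wikidata_entity(qid):
    """Retrieve a full entity document from the Wikidata API."""
    resp = requests.get("https://www.wikidata.org/w/api.php", params={
        "action": "wbgetentities", "ids": qid, "format": "json",
    })
    return resp.json()["entities"][qid]

def crosswalk_claims(entity):
    """Keep only claims whose property has a Passage equivalent."""
    mapped = {}
    for wd_prop, claims in entity.get("claims", {}).items():
        passage_prop = WIKIDATA_TO_PASSAGE.get(wd_prop)
        if passage_prop is None:
            continue  # no equivalent in Passage: dropped by design
        mapped[passage_prop] = [c["mainsnak"].get("datavalue") for c in claims]
    return mapped

entity = fetch_wikidata_entity("Q42")  # any Wikidata item ID; Q42 used purely as a test value
print(crosswalk_claims(entity))
```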

The new description says what kind of “thing” Mission Street is (a “thoroughfare”) and notes that the street extends from Daly City to San Francisco. And the abbreviated form of the name is added. (There is a missed opportunity here: the Retriever should also be able to push those improvements back to Wikidata.)
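For reference, the “Fingerprint” portion of a Wikibase entity (label, description, aliases) is just a small JSON structure, and the Wikibase wbeditentity action accepts it directly. The payload below is a hedged illustration of the edited fingerprint for Mission Street; the commented-out call again assumes an authenticated session, a CSRF token, and the hypothetical endpoint from the earlier sketch.

```python
import json

# The edited "Fingerprint" for the new Passage entity: label, description,
# and an abbreviated alias to help later searches and lookups.
fingerprint = {
    "labels": {"en": {"language": "en", "value": "Mission Street"}},
    "descriptions": {"en": {
        "language": "en",
        "value": "thoroughfare extending from Daly City to San Francisco, California",
    }},
    "aliases": {"en": [{"language": "en", "value": "Mission St."}]},
}

# With an authenticated session this could be sent via the action API, e.g.:
# session.post(WIKIBASE_API, data={
#     "action": "wbeditentity", "new": "item",
#     "data": json.dumps(fingerprint),
#     "token": csrf_token, "format": "json",
# })
```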

Scrolling down we can see the other statements that have been automatically fetched by the Retriever, including geo-location data and an identifier linking the Passage entity back to its Wikidata source.

After review and editing, we’re ready to create the entity in Project Passage. Click the “Create Entity” button in the upper left …

… and its status changes while the data is being pushed to Project Passage by the Retriever application …

… and is finally replaced with a success message and a link to the new identifier for the entity in Passage …

Which is immediately available in the Wikibase user interface …

… with the imported statements and data, including automatically generated references from the statements back to the Wikidata Item source.

Returning to our Wikibase editing session for the collection, when we now search for Mission St., we find it in Passage …

And can easily link it with a “depicts” statement.

The Project Passage Retriever: search for entity descriptions in Wikidata, VIAF, and FAST; align the classes and properties from these three resources with Project Passage; display the data in a simple editing interface with tools to review, correct, and supplement imported data; and load the data into Project Passage. Returning to the high-level functional requirements for the Passage Retriever, these can be thought of more generally …

The Project Passage Retriever: federated search interface; data model crosswalk; cataloging user interface; Wikibase data loader. With this perspective, the Retriever can be seen as more of a mix, including the features and behaviors of a federated search interface for finding matches in other authority systems, a crosswalk or mapping from external system vocabularies to our local vocabulary, a simple cataloging interface, and a batch data loader for Wikibase.

The Project Passage Retriever: federated search interface; data model crosswalk; cataloging user interface; Wikibase data loader. In building the Retriever application, I found that the second and third features, the data model crosswalk and the cataloging user interface, were the most challenging. We had software solutions available from prior work that helped speed up the step of building a federated search interface that worked with the Wikidata, VIAF, and FAST APIs, and had previously built Wikibase data-loading scripts using the Pywikibot library.
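For the data-loading step, a Pywikibot-based script along the following lines is a plausible minimal sketch, not the Retriever’s actual loader: it assumes a Pywikibot family/user-config already pointing at the Passage Wikibase, and the property and item IDs shown are placeholders rather than real Passage identifiers.

```python
import pywikibot

# Assumes user-config.py / a family file configured for the Passage Wikibase.
site = pywikibot.Site("passage", "passage")   # hypothetical family/code
repo = site.data_repository()

def create_entity(fingerprint, statements):
    """Create a new item from a reviewed fingerprint plus crosswalked statements."""
    item = pywikibot.ItemPage(repo)
    item.editLabels(fingerprint["labels"], summary="Retriever: set labels")
    item.editDescriptions(fingerprint["descriptions"], summary="Retriever: set descriptions")
    item.editAliases(fingerprint.get("aliases", {}), summary="Retriever: set aliases")

    for prop_id, target_qid in statements:
        claim = pywikibot.Claim(repo, prop_id)            # e.g. "P1" (placeholder)
        claim.setTarget(pywikibot.ItemPage(repo, target_qid))
        item.addClaim(claim, summary="Retriever: add statement")

        # Reference each statement back to its Wikidata source item.
        ref = pywikibot.Claim(repo, "P9")                 # "stated in" (placeholder ID)
        ref.setTarget(pywikibot.ItemPage(repo, "Q3"))     # item for Wikidata (placeholder)
        claim.addSources([ref], summary="Retriever: add source")

    return item

new_item = create_entity(
    {"labels": {"en": "Mission Street"},
     "descriptions": {"en": "thoroughfare from Daly City to San Francisco"},
     "aliases": {"en": ["Mission St."]}},
    [("P1", "Q5")],   # placeholder property/target pair
)
print(new_item.getID())
```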

The Project Passage Retriever: federated search interface; data model crosswalk (maintenance and synchronization challenges as external sources expand and data models change); cataloging user interface; Wikibase data loader. For both the data model crosswalk and the cataloging user interface, there was a balancing act. For the data model crosswalk: what are the best target systems to search, how many are too many to effectively manage (from the perspectives of performance and code complexity), and how do we keep the vocabulary mappings in sync as the external sources change over time?
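One small, hedged illustration of the synchronization concern: a maintenance script could periodically re-fetch the mapped Wikidata properties and flag any whose labels or datatypes have drifted since the crosswalk was built. The expected values below are examples chosen for illustration, not a record of the actual Passage crosswalk.

```python
import requests

# Snapshot of what the crosswalk expects; real values would be stored with the mapping.
EXPECTED = {
    "P31": {"label": "instance of", "datatype": "wikibase-item"},
    "P625": {"label": "coordinate location", "datatype": "globe-coordinate"},
}

def check_crosswalk_drift(expected):
    """Compare the current Wikidata property definitions against the snapshot."""
    resp = requests.get("https://www.wikidata.org/w/api.php", params={
        "action": "wbgetentities", "ids": "|".join(expected), "format": "json",
    })
    entities = resp.json()["entities"]
    drift = []
    for pid, snapshot in expected.items():
        current = entities.get(pid, {})
        label = current.get("labels", {}).get("en", {}).get("value")
        datatype = current.get("datatype")
        if label != snapshot["label"] or datatype != snapshot["datatype"]:
            drift.append((pid, label, datatype))
    return drift

print(check_crosswalk_drift(EXPECTED))
```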

The Project Passage Retriever: federated search interface; data model crosswalk; cataloging user interface (simplify entity creation while allowing data review and editing; import sufficient data for discovery and disambiguation while leveraging the network graph for a fuller description of the entity); Wikibase data loader. For the cataloging user interface, where the point of the application is to quickly create a description that is better than a stub, how much of the user’s time and attention do we want applied to the review and editing step (understanding that Wikibase offers a much fuller set of editing features once the description is entered)? And how much data needs to be imported to support subsequent searching, selection, disambiguation, and duplicate detection, with the understanding that for some external sources a subsequent federated query can retrieve properties that were not imported, and that any imported property will want to benefit from future updates and improvements made in its external source?

The Project Passage Retriever: would a “Retriever” UI and server-side application be generally useful for federated Wikibases? If so, how should the application be designed for optimal sharing and distribution? Can data model crosswalks for the application be community-managed?

Contact: Bruce Washburn, OCLC Research, bruce_washburn@oclc.org, @btwashburn