Metadata, Ingest, and Data Feeds

Presentation transcript:

Metadata, Ingest, and Data Feeds: What we do with your data and why. Nicole Lawrence, DLG; Mike Kanning, GALILEO. GALILEO Users Conference, July 12, 2018

Your presenters: Nicole Lawrence, Project Manager, Digital Library of Georgia (nicole.lawrence@uga.edu); Mike Kanning, Developer, GALILEO (mkanning@uga.edu)

What are we going to talk about? The DLG data supply chain, including: how we gather data; what we do with it; batch processing; spatial lookups and other improvements; public websites; the DPLA harvest process

The Data Supply Chain (diagram labels): OAI-PMH Harvest, Exported Data, Locally Created, DLG Processing, DLGAdmin, Georgia Portal, Civil Rights Digital Library, Civil War in the American South, DLG OAI-PMH Data Feed, Digital Public Library of America, EBSCO

How we gather data

The Data Supply Chain: How we gather data

How we gather data: OAI-PMH harvest
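
As a rough illustration of what an OAI-PMH harvest involves, the sketch below builds a ListRecords request URL and pulls the record identifiers and the resumption token out of a response. The base URL, the sample response, and the function names are hypothetical; only the verb, the request arguments, and the XML namespace come from the OAI-PMH 2.0 protocol. It is not a description of DLG's actual harvester.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for an OAI-PMH endpoint."""
    if resumption_token:
        # resumptionToken is an exclusive argument: it is sent without metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None) from a response."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# Hypothetical, heavily trimmed response used only to exercise the parser.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:item-1</identifier></header></record>
    <record><header><identifier>oai:example.org:item-2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("https://example.org/oai"))
    print(parse_list_records(SAMPLE))
```

A harvester would repeat the request with each returned resumption token until the response carries none.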

How we gather data: Exported data

How we gather data: Locally created

What we do with your data

The Data Supply Chain: What we do with it

Steps in DLG processing: 01 Normalize (system validation, faceting); 02 Enhance (missing fields, DLG-specific fields); 03 Map (crosswalk original scheme to DLG, ensure proper field headings and content); 04 Convert (native format to active XML)
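
The four steps can be pictured as a small pipeline. The sketch below is only an illustration under stated assumptions: the crosswalk table, the field names other than dc_date, the default values, and the output element names are all hypothetical, and the real DLG processing is certainly more involved.

```python
import xml.etree.ElementTree as ET

# Hypothetical crosswalk from a source scheme to DLG-style field names.
CROSSWALK = {"title": "dcterms_title", "date": "dc_date", "place": "dcterms_spatial"}


def normalize(record):
    """Step 1: collapse whitespace and drop empty values."""
    cleaned = {}
    for field, values in record.items():
        kept = [" ".join(v.split()) for v in values if v and v.strip()]
        if kept:
            cleaned[field] = kept
    return cleaned


def enhance(record, defaults):
    """Step 2: fill in fields the source record is missing."""
    out = dict(record)
    for field, values in defaults.items():
        out.setdefault(field, list(values))
    return out


def crosswalk(record, mapping):
    """Step 3: rename source fields to their target names."""
    return {mapping.get(field, field): values for field, values in record.items()}


def to_xml(record):
    """Step 4: serialize the mapped record as a flat XML element."""
    item = ET.Element("item")
    for field in sorted(record):
        for value in record[field]:
            ET.SubElement(item, field).text = value
    return ET.tostring(item, encoding="unicode")


def process(record, defaults):
    """Run the four steps in the order given on the slide."""
    return to_xml(crosswalk(enhance(normalize(record), defaults), CROSSWALK))


if __name__ == "__main__":
    source = {"title": ["  Map of   Savannah "], "date": ["1903-05"], "notes": [""]}
    print(process(source, {"rights": ["unknown"]}))
```

Keeping each step a separate function makes it possible to validate a record after any stage rather than only at the end.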

Steps in DLG processing: Normalizing

Steps in DLG processing: Enhancement

Steps in DLG processing: Crosswalk

Steps in DLG processing: Data verification

Steps in DLG processing: Convert

Batch processing

The Data Supply Chain: Ingesting

DLGAdmin’s Batch System (diagram labels: Batch Import, Batch, Batch Items, Commit)
The DLGAdmin batch system is used by DLG staff to ingest, improve, and validate new records, as well as to update existing records. Batches are created as units of work and, when complete, are “committed” to the public index. Batches also serve as an audit trail for records. Batch processing is complex and can take up a lot of system time, so batch jobs are queued and worked as background processes.
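
One way to picture a batch as a unit of work with an audit trail is sketched below. The class and attribute names (Batch, BatchItem, PublicIndex, slug) are hypothetical and chosen for illustration; they are not DLGAdmin's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class BatchItem:
    slug: str      # hypothetical record identifier
    fields: dict   # metadata to create or update


@dataclass
class Batch:
    name: str
    items: list = field(default_factory=list)
    committed: bool = False


class PublicIndex:
    """Stand-in for the public index that committed batches write to."""

    def __init__(self):
        self.records = {}   # slug -> current metadata
        self.history = {}   # slug -> names of batches that touched the record

    def commit(self, batch):
        """Create or update every item in the batch, recording the audit trail."""
        if batch.committed:
            raise ValueError("batch already committed")
        for item in batch.items:
            self.records[item.slug] = dict(item.fields)
            self.history.setdefault(item.slug, []).append(batch.name)
        batch.committed = True


if __name__ == "__main__":
    index = PublicIndex()
    index.commit(Batch("import-1", [BatchItem("photo-1", {"dc_date": "1934"})]))
    index.commit(Batch("fix-dates", [BatchItem("photo-1", {"dc_date": "approx. 1934"})]))
    print(index.history["photo-1"])
```

Because nothing reaches the index until the whole batch is committed, staff can revise items freely while the batch is still open.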

Populating Batches: Form, XML

Populating Batches: Search Results

A Populated Batch

Committing a Batch
Commit jobs are submitted to a worker queue and worked one at a time. Status notifications are available via a Slack integration. Completed commits show in the list of batches; in the event of an error, the user is given the opportunity to revise the batch and retry the commit job. Once a commit is complete, item records are either created or updated, and each new or changed record is added to the search index. The change is live on the public DLG site as soon as this happens. Viewing an item record shows the history of batch items that were used to create or update it.
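
The queue behavior described here (one job at a time, a notification per job, failed jobs left available for a retry) might look roughly like the sketch below. The function name and the notify callback are assumptions; in particular, notify merely stands in for a chat notification and does not call any real Slack API.

```python
from collections import deque


def run_commit_queue(jobs, commit, notify=print):
    """Work queued commit jobs one at a time; return the jobs that failed."""
    pending = deque(jobs)
    failed = []
    while pending:
        job = pending.popleft()
        try:
            commit(job)
        except Exception as exc:
            # Leave the failed job for the user to revise and resubmit.
            failed.append(job)
            notify(f"commit failed for {job}: {exc}")
        else:
            notify(f"commit complete for {job}")
    return failed


if __name__ == "__main__":
    def fake_commit(job):
        # Hypothetical commit function that rejects one batch.
        if job == "batch-2":
            raise RuntimeError("invalid record")

    to_retry = run_commit_queue(["batch-1", "batch-2", "batch-3"], fake_commit)
    print("needs revision:", to_retry)
```

A failure in one job does not block the jobs behind it; the failed job is simply returned so it can be fixed and queued again.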

Spatial lookups (and other improvements)

GeoJSON
On import, DLGAdmin generates and indexes a GeoJSON object for each record with spatial metadata. GeoJSON is a standard format used for plotting shapes on maps, like the one seen here. We hope to improve this process to introduce higher fidelity for object mapping (getting the pin closer to where a photo was actually taken, for example) and to support the lookup of coordinates for novel locations.
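
For readers unfamiliar with the format, a minimal sketch of a GeoJSON Point feature follows. The helper name, the property names, and the sample coordinates are illustrative assumptions; the structure itself (a Feature with a Point geometry and longitude-first coordinates) follows the GeoJSON specification, RFC 7946.

```python
import json


def point_feature(longitude, latitude, properties=None):
    """Build a GeoJSON Feature holding a single Point."""
    if not (-180 <= longitude <= 180 and -90 <= latitude <= 90):
        raise ValueError("coordinates out of range")
    return {
        "type": "Feature",
        # GeoJSON positions are ordered longitude first, then latitude.
        "geometry": {"type": "Point", "coordinates": [longitude, latitude]},
        "properties": properties or {},
    }


if __name__ == "__main__":
    # Approximate coordinates for Atlanta, Georgia, used only as sample data.
    feature = point_feature(-84.39, 33.75, {"placename": "Atlanta, Georgia"})
    print(json.dumps(feature, indent=2))
```

Swapping latitude and longitude is the most common mistake when producing GeoJSON by hand, which is why the range check is worth keeping.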

Indexing Dates
Example date formats: 1732-02-03/1732-03-24; 1732-06-09; 1732/1783; 0000/1885-06-10; 1/31/1991; 1903-05; approx. 1934; circa 1960-1969; July 1, 1997 - June 30, 1998; 1/21/1999-4/4/2012; 5/1986; 1776-7; 1920-00-00
On import, DLGAdmin also parses the dc_date field for year values that apply to the record. Ranges are parsed to include all years within the range. A variety of formats commonly found in the metadata are handled. We are working to improve this process to make it easier to return items from a user-provided range (e.g., 1900-1930) and to increase the fidelity of the indexed date values (e.g., indexing a date down to the individual day or month rather than just the year).
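
A simplified sketch of year extraction is shown below. It is not DLGAdmin's parser: it assumes that every four-digit number in the value is a year, treats 0000 as an unknown bound, and expands whatever it finds into an inclusive range. Abbreviated end years such as the "7" in 1776-7 are deliberately not handled.

```python
import re

# A four-digit run that is not part of a longer number.
YEAR = re.compile(r"(?<!\d)(\d{4})(?!\d)")


def years_from_date(value):
    """Return every year covered by a free-text date value."""
    years = [int(y) for y in YEAR.findall(value)]
    years = [y for y in years if y != 0]   # 0000 marks an unknown bound
    if not years:
        return []
    # Expand to an inclusive range so range queries can match any year in it.
    return list(range(min(years), max(years) + 1))


if __name__ == "__main__":
    for sample in ["1732/1783", "0000/1885-06-10", "circa 1960-1969",
                   "July 1, 1997 - June 30, 1998", "1920-00-00"]:
        found = years_from_date(sample)
        print(sample, "->", found[0], "to", found[-1])
```

Indexing every year in a range is what lets a search for 1900-1930 find an item dated 1895/1905, at the cost of losing the day and month detail.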

Public Websites

The Data Supply Chain: Public Access

DLG Public Website: dlg.usg.edu

Other Websites: CRDL and AMSO

DPLA harvest process

The Data Supply Chain: Harvesting

DLG’s OAI-PMH Feed

DPLA Metadata Application Profile

DLG in DPLA

Questions?