Combine_and_stir (Aleph data + RDF + Python + other things) IGeLU 2015 Developer’s Day Budapest, Hungary 2015-09-05 Laura Akerman.


Question: How can we use linked data with our library metadata to better support student and faculty research?
– Small-scale use cases that focus on a subset of our resources. (Why? Emory Libraries’ capacity to do new things is limited this year because of the Alma migration and the Hydra implementation.)
– Integration with Primo, including Vivo
– Large-scale integration of campus information, a la LD4L
Combine and Stir - Laura Akerman - IGeLU Developers Day 2015

Background/Inspiration
– Emory University has 3.5 million Aleph records.
– As a metadata specialist for digital projects, I became familiar with some faculty requests to use data from our catalog in new ways.
– IGELU-ELUNA Linked Open Data Interest Group, Use Cases: ses+and+scenarios
– Emory University’s linked data pilot project, “Connections”
– Bernardo Gomez’s work: scripts to extract data from Aleph

Bernardo’s Work
– Presented at ELUNA and in a use case call: Exposing Aleph Bibliographic and Authority records as RDF/XML and providing auto-discovery in Primo: a_webseminar.html
– He is planning to share his code; meanwhile, you can contact him.

A few details:
– A service retrieves the authority keys for the headings in an Aleph bib record.
– A service retrieves both bib and authority records, represents them in “MarcEdit” text format, and converts them to RDF/XML (using simple vocabularies for demonstration).
– These could be stored in a triplestore (Sesame), or fuller representations could be retrieved from OCLC for bib and authority records, or from VIAF for name authorities.
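The conversion step described above can be sketched in a few lines of standard-library Python. This is not Bernardo’s code: the record number, the `example.org` base URI, and the choice of Dublin Core terms are assumptions made for illustration, and the sketch emits N-Triples rather than RDF/XML to keep it short.

```python
# Sketch only: BASE and the field-to-term mapping are assumptions, not the
# vocabularies used in the actual service.
DCTERMS = "http://purl.org/dc/terms/"
BASE = "http://example.org/bib/"  # hypothetical namespace for local records


def parse_mnemonic(text):
    """Parse MarcEdit-style lines such as '=245  10$aTitle$cAuthor' into a
    list of (tag, subfields) pairs; subfields is a list of (code, value)."""
    fields = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("=") or len(line) < 6:
            continue
        tag, body = line[1:4], line[6:]
        if tag == "LDR" or tag < "010":
            # Leader and control fields have no indicators or subfields.
            fields.append((tag, [("", body)]))
            continue
        parts = body[2:].split("$")[1:]  # skip the two indicator characters
        fields.append((tag, [(p[0], p[1:].strip()) for p in parts if p]))
    return fields


def literal(value):
    """Escape a string as an N-Triples literal."""
    return '"%s"' % value.replace("\\", "\\\\").replace('"', '\\"')


def to_ntriples(fields):
    """Emit one triple per mapped subfield, using field 001 as the record ID."""
    record_id = next(subs[0][1] for tag, subs in fields if tag == "001")
    subject = "<%s%s>" % (BASE, record_id)
    mapping = {("245", "a"): "title", ("100", "a"): "creator"}
    lines = []
    for tag, subs in fields:
        for code, value in subs:
            term = mapping.get((tag, code))
            if term:
                # Trim trailing ISBD punctuation before writing the literal.
                cleaned = value.rstrip(" /,")
                lines.append("%s <%s%s> %s ." % (subject, DCTERMS, term, literal(cleaned)))
    return lines


sample = r"""=001  000123456
=100  1\$aTwain, Mark,$d1835-1910.
=245  10$aAdventures of Huckleberry Finn /$cMark Twain."""

triples = to_ntriples(parse_mnemonic(sample))
```

A real conversion would map many more fields and would link headings to authority URIs instead of writing plain literals.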

Bernardo’s use cases:
– Link to WorldCat Identities page for first author in Primo bib record

Adding JSON-LD to our “Full View”
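To make the idea concrete, here is a minimal sketch of what embedding JSON-LD in a record page might look like. The schema.org vocabulary, the property choices, and the function names are assumptions for illustration; the slide does not say which vocabulary the Primo “Full View” actually uses.

```python
import json


def bib_to_jsonld(title, author_name, permalink, author_uri=None):
    """Build a JSON-LD description of a bib record (assumed schema.org terms)."""
    author = {"@type": "Person", "name": author_name}
    if author_uri:
        # Link the local heading to an external identity, e.g. a VIAF URI.
        author["sameAs"] = author_uri
    return {
        "@context": "http://schema.org",
        "@type": "Book",
        "@id": permalink,
        "name": title,
        "author": author,
    }


def as_script_tag(doc):
    """Serialize the description for embedding in an HTML page."""
    payload = json.dumps(doc, sort_keys=True).replace("</", "<\\/")
    return '<script type="application/ld+json">%s</script>' % payload


doc = bib_to_jsonld(
    "Adventures of Huckleberry Finn",
    "Twain, Mark",
    "http://example.org/primo/permalink/ALEPH000123456",  # hypothetical permalink
)
tag = as_script_tag(doc)
```

The resulting `<script>` element can be dropped into the page so that crawlers and scripts can read the record’s identifiers without parsing the display HTML.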

My ideas: Small applications for specific purposes
– Class projects, faculty and grad student research projects.
– Marry RDF from Aleph (or Primo?) with external linked data.
– Create web interfaces depending on the use case: display, possibly search, visualization, map, timeline, etc.
– Write Python scripts/modules to do this: building blocks that can be modified for more projects, maybe by savvy faculty or students.
– Share with everyone on GitHub.

Why Aleph?
– We have Bernardo’s scripts NOW to get the authority IDs out of Aleph, get authority records, and use LCCN etc. to get to VIAF.
– In future, when we go to Alma, we will need to use the Alma APIs to get this data.
– If the Primo JSON-LD API could furnish identifiers (authority IDs), it could be useful here.

Why Python?
– I wanted to really learn it. (Newbie alert!)
– I thought lots of library programmers might use it too.
– IS THIS TRUE? IS PYTHON A GOOD LANGUAGE FOR SHARING SOMETHING LIKE THIS?

101 USE CASES?
1. Hurricane Katrina study class. An actual request some years ago to set up a database of our records for resources about the hurricane, to be augmented with other resources found on the web by students, and annotated.
Catalog record + student contributed data + annotation (+ Wikidata?) + ?

More
2. Famous author: catalog records for works by and about the person + VIAF + Wikidata + archive data
3. Group of musicians, artists, poets…: expand from #2.
* SEE ALSO Networking the Belfast Group, a much more extensive and cool project from Rebecca Koeser and others at Emory involving linked data: ect-networking-belfast.html

More…….
5. Class reading list: catalog records + Wikidata on the authors + ?
6. Unfolding event: RSS feed from Primo (limited to Aleph/Alma records?) + Wikidata successive harvests + Twitter? + ?
……YOUR IDEAS?

So far…
– Code to take a file of Aleph numbers and retrieve bibs as XML using Bernardo’s service
– A file of bib and auth numbers, associating the record IDs for the authorities in each bib with the bib ID
– Retrieve auth records as XML from Aleph
– Create Primo permalinks, and store the relationship between a “resource” and the permalink as RDF
– Working on code to harvest bibs and send them to Sesame
– Choice: do our own conversion on our bib/auth data, or harvest RDF from OCLC?
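The first and fourth steps above might look roughly like the following. This is a sketch under stated assumptions, not the code from the talk: the service URL, the permalink pattern, the resource URI pattern, and the use of `rdfs:seeAlso` as the linking predicate are all hypothetical placeholders.

```python
# All three URL patterns are hypothetical; substitute the local service,
# the local Primo permalink pattern, and the chosen resource namespace.
SERVICE_URL = "http://example.org/aleph/bib?id=%s"
PERMALINK = "http://example.org/primo/permalink/ALEPH%s"
RESOURCE = "http://example.org/resource/%s"
SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"  # assumed predicate


def read_aleph_numbers(lines):
    """Return zero-padded nine-digit Aleph numbers from an iterable of lines
    (an open file works), skipping blank lines and anything non-numeric."""
    numbers = []
    for line in lines:
        value = line.strip()
        if value.isdigit():
            numbers.append(value.zfill(9))
    return numbers


def bib_request_url(number):
    """URL to request one bib record as XML from the retrieval service."""
    return SERVICE_URL % number


def permalink_triple(number):
    """N-Triples statement relating a resource to its Primo permalink."""
    return "<%s> <%s> <%s> ." % (RESOURCE % number, SEE_ALSO, PERMALINK % number)


numbers = read_aleph_numbers(["123456\n", "\n", "# not a number\n", "000987654\n"])
statements = [permalink_triple(n) for n in numbers]
```

In practice the list of lines would come from `open("aleph_numbers.txt")` (a hypothetical file name), and the statements would be sent to the triplestore instead of being kept in a list.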

To do:
– Code to send chunks of RDF from bibs and auths to Sesame
– Extract the VIAF record ID from auths for persons or organizations. Get the record in RDF.
– Use links in VIAF records to Wikidata to retrieve RDF related to the person or organization.
– Select elements to be included in Sesame and send them to Sesame.
– Create a web display of information about the person or organization of interest, information about the related resources and other persons, and selected other data and links.
– Complete the route by coding the extraction of Aleph IDs from an RSS feed or eshelf folder.
– End result / first demo: enhanced info on an interesting person or organization (Zsa Zsa Gabor, Mark Twain, Dalai Lama, ...)
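One way to approach the VIAF step is to pull the LCCN out of the authority record and turn it into a VIAF lookup address. The sketch below uses only the standard library; the sample record is invented for illustration, and the `sourceID/LC|...` URL pattern is an assumption about how VIAF resolves Library of Congress identifiers that should be checked against the current VIAF documentation before relying on it.

```python
import xml.etree.ElementTree as ET

MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def extract_lccn(marcxml):
    """Return the LCCN from field 010 $a of a MARCXML record, with internal
    whitespace removed, or None when the field is absent."""
    root = ET.fromstring(marcxml)
    node = root.find(
        ".//marc:datafield[@tag='010']/marc:subfield[@code='a']", MARC_NS
    )
    if node is None or not node.text:
        return None
    return "".join(node.text.split())


def viaf_lookup_url(lccn):
    """Build a VIAF lookup URL from an LCCN (assumed URL pattern)."""
    return "https://viaf.org/viaf/sourceID/LC|%s" % lccn


# Invented sample authority record, trimmed to the fields used here.
sample_auth = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="010" ind1=" " ind2=" ">
    <subfield code="a">n  79021164 </subfield>
  </datafield>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Twain, Mark,</subfield>
    <subfield code="d">1835-1910</subfield>
  </datafield>
</record>"""

lccn = extract_lccn(sample_auth)
url = viaf_lookup_url(lccn)
```

Fetching that URL and following the Wikidata link in the VIAF record would be the next step; it is left out here because it depends on the response format VIAF returns.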

Challenges for me:
– Using pymarc to extract data from MARCXML: it does the job but is not well documented. Talking to Heidi Frank.
– XML handling modules and making REST API calls: differences between Python versions (I am using 2.7) mean that older examples found on the web didn’t work.
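A common workaround for the version differences is to try the Python 3 module names first and fall back to the Python 2.7 ones, so the same script runs on both. The sketch below shows that pattern for a REST call that returns XML; the function names and the example URL are hypothetical.

```python
try:
    # Python 3 locations
    from urllib.request import urlopen
    from urllib.parse import urlencode
except ImportError:
    # Python 2.7 locations
    from urllib2 import urlopen
    from urllib import urlencode

import xml.etree.ElementTree as ET


def build_url(base, params):
    """Append query parameters in a stable (sorted) order."""
    return base + "?" + urlencode(sorted(params.items()))


def fetch_xml(url, timeout=30):
    """Call a REST service and parse the response body as XML."""
    response = urlopen(url, timeout=timeout)
    try:
        return ET.fromstring(response.read())
    finally:
        response.close()


# Hypothetical service address, used only to show the URL that gets built.
request_url = build_url("http://example.org/aleph/bib", {"id": "000123456", "format": "marcxml"})
```

Keeping the version-specific imports in one place means the rest of the script can use `urlopen` and `urlencode` without caring which interpreter is running it.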

Questions: Are there gotchas in these scenarios, and if so, how could we work around them?
– Uncertainty about metadata rights?
– Desire to include Primo Central?
– Other data sources lacking “authority records”?
– External data that’s less open than we want?
– Need to establish mappings across vocabularies “by hand” (could this be shared?)

Your thoughts… Would it be worthwhile pursuing the development of tools like this, and would you contribute?

Thank you
Contact me if you have more ideas/interest: Laura Akerman
Latest working code will be up on GitHub in a week or so, as soon as I figure out how to resolve a “conflict”.