Taking Action: Linked Data for Digital Library Managers Silvia Southwick and Cory Lampert UNLV Digital Collections American Library Association Annual Meeting June 28, 2014 Las Vegas, NV
Agenda Motivation Environment UNLV Linked Data project Technologies used for transforming metadata into linked data Visualizations of linked data (demos) Next steps and Q & A
Linked Data Overview My collections are already visible through Google; so who cares? This is a topic for catalogers It’s too technical / complicated / boring Actually... Linked data is the future of the Web Data will no longer be trapped in silos imposed by systems, collections, or records Exposed open data presents new opportunities for users
What is Linked Data? Linked Data refers to a set of best practices for publishing and interlinking data on the Web Data needs to be machine-readable Linked data (Web of Data) is an expansion of the Web we know (Web of documents)
Current Practice Data (or metadata) encapsulated in records Records contained in collections Very few links are created within and/or across collections Links have to be manually created Existing links do not specify the nature of the relationships among records This structure hides potential links within and across collections
What we can do with linked data Free data from silos Expose relationships Powerful, seamless, interlinking of our data Users interact or query data in new ways Search results would be more precise Data can be easily repurposed
Why? Our data needs an upgrade.
The Linked Open Data Cloud
Making the Case for Linked Data Problem: – Rich metadata is being lost when adopting a standard that is designed for interoperability (Dublin Core) – Rationale for adopting linked data is being disseminated, but there is very little practical implementation to serve as reference; no “recipe” or uniform solution – Evolving beyond records takes resources and requires embracing an exciting but uncertain future
Example of a metadata record
How can we create linked data? Our metadata records are deconstructed into triples (statements) that are machine-readable Triples are expressed as: Subject – Predicate – Object For example: This book – has creator – Tom Heath This book – has title – “Linked Data: Evolving the…” Subjects, predicates and most objects should have unique identifiers (URIs), creating data that can be used in Web architecture (HTTP) These statements are expressed using the Resource Description Framework (RDF) Linked data can be queried using SPARQL
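A minimal sketch of the triple structure described above, using plain Python tuples rather than an RDF library. The subject URI under `example.org` is a hypothetical placeholder, and `statements_about` is an illustrative helper, not part of any tool named in this talk.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Hypothetical subject URI; the two predicates are Dublin Core element URIs.
BOOK = "http://example.org/book/1"
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"

triples = [
    Triple(BOOK, DC_CREATOR, "Tom Heath"),
    # The title is truncated on the slide, so it stays truncated here.
    Triple(BOOK, DC_TITLE, "Linked Data: Evolving the…"),
]


def statements_about(subject, data):
    """Return the (predicate, object) pairs recorded for one subject."""
    return [(t.predicate, t.obj) for t in data if t.subject == subject]
```

Calling `statements_about(BOOK, triples)` returns both statements about the book, which is the record "deconstructed" into independent facts.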
Expressing metadata as triples
Graphic Representation
Triples and RDF – Once we have triples we need to: – Assign URIs to each subject – URIs are required for subjects and may also identify objects. URIs are essential for constructing RDF statements These steps take the human-readable graph and make it machine-readable!
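One way to picture the "assign URIs" step is a small function that mints a stable URI from a local label. This is only a sketch: the `http://example.org/` namespace, the `kind/slug` pattern, and the name `mint_uri` are assumptions for illustration, not the scheme used in the UNLV project.

```python
from urllib.parse import quote

# Hypothetical base namespace; a real project would use a domain it controls.
BASE = "http://example.org/"


def mint_uri(kind, label):
    """Build a URI such as http://example.org/person/tom-heath from a label."""
    # Lowercase, collapse runs of whitespace, and join words with hyphens.
    slug = "-".join(label.lower().split())
    # Percent-encode anything that is not safe inside a URI path segment.
    return f"{BASE}{kind}/{quote(slug, safe='-')}"
```

The same label always yields the same URI, so every triple that mentions "Tom Heath" points at one identifier instead of a repeated string.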
Examples of records Showgirls Menus Dreaming the Skyline
How can I transform textual triples into machine-readable ones? We need vocabularies to express our triples Even better – a data model with these vocabularies The Europeana Data Model (EDM) gives us a framework to help organize, structure, and define which predicates we are going to use Adopting an existing model is preferable to creating your own (interoperability)
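As a rough illustration of choosing predicates from existing vocabularies, the sketch below maps local field labels to prefixed predicate names and expands them to full URIs. The three namespaces are the published Dublin Core, FOAF, and EDM namespaces; the field labels in `FIELD_TO_PREDICATE` are hypothetical, not the project's actual mapping.

```python
# Published namespace URIs for the vocabularies named on the slides.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "edm": "http://www.europeana.eu/schemas/edm/",
}

# Hypothetical local field labels mapped to vocabulary predicates.
FIELD_TO_PREDICATE = {
    "Title": "dc:title",
    "Creator": "dc:creator",
    "Depicts": "foaf:depicts",
    "Type": "edm:hasType",
}


def expand(prefixed_name):
    """Turn a prefixed name such as 'dc:creator' into a full URI."""
    prefix, local = prefixed_name.split(":", 1)
    return PREFIXES[prefix] + local


def predicate_for(field_label):
    """Look up the predicate URI chosen for a local metadata field."""
    return expand(FIELD_TO_PREDICATE[field_label])
```

Keeping the mapping in one table makes the modelling decisions explicit and easy to review against the data model.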
Triples with URIs & EDM model predicates (Local URI)
Machine-readable (example triples using foaf:, dc:creator, foaf:depicts, and edm:hasType)
“I’m a digital collections manager”… What is known? – lots of THEORY and lots of TECHNICAL information What is happening? – a move toward PRACTICE and APPLICATION in libraries by non-programmers Is there a “recipe” yet? - No. But, our staff CAN do significant work to prepare for linked data and to understand linked data principles, even if it isn’t realistic to run a parallel process.
UNLV Linked Data Project Goals: Study the feasibility of developing a common process that would allow the conversion of our collection records into linked data preserving their original expressivity and richness Publish data from our collections in the Linked Data Cloud to improve discoverability and connections with other related data sets on the Web
Actions and Technologies – CONTENTdm: Clean data, Export data – OpenRefine: Import data, Prepare data, Generate triples, Export RDF – Mulgara / Virtuoso: Import data, Publish
Phase 1 Clean data Export data
Clean / Export Data Technology: CONTENTdm Increase consistency across collections: – metadata element labels – use of well-known controlled vocabularies (CVs) – share local CVs – etc. Export data as a spreadsheet
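The label-consistency step can be sketched as a small routine that normalizes column headers when reading an exported spreadsheet. This assumes a tab-delimited export and made-up column names; `normalize_label` and `read_export` are illustrative helpers, not CONTENTdm functions.

```python
import csv
import io


def normalize_label(label):
    """Give element labels one spelling: 'graphic_elements' -> 'Graphic Elements'."""
    words = label.replace("_", " ").split()
    return " ".join(word.capitalize() for word in words)


def read_export(text, delimiter="\t"):
    """Read delimited export text into row dictionaries with normalized labels."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = [normalize_label(h) for h in next(reader)]
    return [dict(zip(header, row)) for row in reader]


# Assumed sample export with inconsistent header spellings.
sample = "graphic_elements\tTITLE\nfeathers\tShowgirl costume\n"
rows = read_export(sample)
```

After this pass, collections that spelled the same element differently share one column name, which simplifies every later step.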
Phase 2 Import to OpenRefine Prepare (Reconcile) Generate triples Export RDF files
OpenRefine Open source It is a server – can communicate with other datasets via HTTP OpenRefine and its RDF extension should be installed Screenshots to show some of the functions we have used
Import
Facets
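A text facet groups the values of one column and counts them, which makes inconsistent spellings easy to spot. The sketch below is a stand-in for that idea in plain Python, with assumed sample rows; it is not OpenRefine's implementation.

```python
from collections import Counter


def text_facet(rows, column):
    """Count how often each distinct value appears in one column."""
    return Counter(row.get(column, "").strip() for row in rows)


# Assumed sample rows; note the two spellings of the same value.
rows = [
    {"Graphic Elements": "feathers"},
    {"Graphic Elements": "feathers"},
    {"Graphic Elements": "Feathers"},
]
facet = text_facet(rows, "Graphic Elements")
```

Here the facet lists `feathers` and `Feathers` as separate values, flagging a cleanup candidate.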
Split multi-value cells
Facet view for Graphic Elements after splitting
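Splitting multi-value cells can be sketched as follows, assuming values are separated by semicolons (an assumption about the export, stated here rather than taken from the slides). For simplicity this version copies the other cells into each new row, whereas OpenRefine keeps the extra values as additional rows of the same record.

```python
from collections import Counter


def split_multi_value(rows, column, separator=";"):
    """Return one row per value found in a multi-value cell."""
    result = []
    for row in rows:
        values = [v.strip() for v in row.get(column, "").split(separator) if v.strip()]
        # Keep rows whose cell is empty so that no record is lost.
        for value in values or [""]:
            new_row = dict(row)
            new_row[column] = value
            result.append(new_row)
    return result


# Assumed sample rows with a semicolon-separated cell.
rows = [
    {"Title": "Costume A", "Graphic Elements": "feathers; sequins"},
    {"Title": "Costume B", "Graphic Elements": "feathers"},
]
split_rows = split_multi_value(rows, "Graphic Elements")
facet = Counter(row["Graphic Elements"] for row in split_rows)
```

After splitting, the facet counts individual terms (`feathers` twice, `sequins` once) instead of whole cell strings.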
Reconciliation
Specifying Reconciliation service
Activating Reconciliation
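Reconciliation matches local text values against an external authority and attaches the matching identifier. The sketch below imitates that with a small in-memory lookup and fuzzy matching from the standard library; the `AUTHORITY` entries and their `example.org` URIs are hypothetical, and a real reconciliation service works over HTTP against a much larger dataset.

```python
import difflib

# Hypothetical authority file: preferred label -> identifier.
AUTHORITY = {
    "Las Vegas (Nev.)": "http://example.org/authority/place/1",
    "Showgirls": "http://example.org/authority/subject/2",
}


def reconcile(value, authority=AUTHORITY, cutoff=0.8):
    """Return the URI of the closest authority label, or None if none is close."""
    by_lowercase = {label.lower(): label for label in authority}
    candidates = difflib.get_close_matches(
        value.strip().lower(), list(by_lowercase), n=1, cutoff=cutoff
    )
    if not candidates:
        return None
    return authority[by_lowercase[candidates[0]]]
```

Values that return `None` are the ones a person still needs to review, which mirrors the manual judging step in a reconciliation workflow.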
Creating the Mapping (Skeleton)
Exporting RDF files
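The mapping skeleton and export steps can be pictured as a table that says which column becomes which predicate, applied to each row to produce N-Triples lines. This is a simplified sketch with assumed column names (`Item URI`, `Title`, `Creator URI`); it is not the RDF extension's own skeleton format.

```python
# Assumed skeleton: column name -> (predicate URI, whether the value is a URI).
SKELETON = {
    "Title": ("http://purl.org/dc/elements/1.1/title", False),
    "Creator URI": ("http://purl.org/dc/elements/1.1/creator", True),
}


def row_to_ntriples(row, subject_column="Item URI", skeleton=SKELETON):
    """Apply the skeleton to one row and return N-Triples statements."""
    subject = row[subject_column]
    lines = []
    for column, (predicate, is_uri) in skeleton.items():
        value = row.get(column, "").strip()
        if not value:
            continue  # Skip empty cells rather than emit empty statements.
        if is_uri:
            obj = f"<{value}>"
        else:
            # Escape backslashes and double quotes inside literals.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{subject}> <{predicate}> {obj} .")
    return lines


# Assumed sample row with hypothetical URIs.
row = {
    "Item URI": "http://example.org/item/1",
    "Title": "Showgirl costume",
    "Creator URI": "http://example.org/person/tom-heath",
}
statements = row_to_ntriples(row)
```

Writing `statements` to a file, one per line, gives a plain-text RDF file of the kind a triple store can import.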
Actions and Technologies – CONTENTdm: Prepare data, Export data – OpenRefine: Import data, Prepare data, Generate triples, Export RDF – Mulgara / Virtuoso: Import data, Publish, Query
Phase 3 Import data Publish Query
Mulgara Triple Store: Import
Simple SPARQL Query: SELECT * WHERE { ?s ?p ?o } LIMIT 100
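The query above asks for every triple in the store, capped at 100 results. An in-memory analogue is sketched below to show what the pattern means; it is not how Mulgara or Virtuoso evaluate SPARQL, and the `ex:` identifiers are made up for the example.

```python
from itertools import islice


def select(triples, s=None, p=None, o=None, limit=100):
    """Return triples matching a pattern; None plays the role of a variable."""
    matches = (
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )
    return list(islice(matches, limit))


# Assumed sample data with made-up identifiers.
triples = [
    ("ex:item1", "dc:title", "Showgirl costume"),
    ("ex:item1", "dc:creator", "ex:person1"),
    ("ex:item2", "dc:title", "Menu"),
]
everything = select(triples)               # like { ?s ?p ?o }
titles = select(triples, p="dc:title")     # like { ?s dc:title ?o }
```

Fixing one position of the pattern, as in `titles`, is the first step from "show me everything" toward a targeted query.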
Visualization Open Source Tools OpenLink Virtuoso Pivot Viewer RelFinder UNLV Linked Data Blog with videos: presentations-project.html
OpenLink Pivot Viewer Good for displaying images Selection of images through SPARQL Queries Allows refinements using facets Allows creating dynamic “collections”
SPARQL Query Costume Design Drawings Showgirls
Video clip: Example of Pivot Viewer
RelFinder Good to show relationships: – Among people – Among “things” Shows the type of relationships Demos: – African American Experience in Las Vegas (Oral History): – Cross-collection relationships among people:
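To illustrate the kind of question a relationship finder answers, the sketch below searches a set of triples for the shortest chain of statements connecting two resources. It is a generic breadth-first search under assumed sample data with made-up `ex:` names, not a description of RelFinder's own algorithm.

```python
from collections import deque


def find_path(triples, start, goal):
    """Return the shortest chain of (node, predicate, node) steps, or None."""
    neighbours = {}
    for s, p, o in triples:
        # Treat each statement as traversable in both directions.
        neighbours.setdefault(s, []).append((p, o))
        neighbours.setdefault(o, []).append((p, s))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for predicate, nxt in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, predicate, nxt)]))
    return None


# Assumed sample data: two people connected through one interview.
triples = [
    ("ex:personA", "ex:interviewedIn", "ex:interview1"),
    ("ex:personB", "ex:mentionedIn", "ex:interview1"),
]
path = find_path(triples, "ex:personA", "ex:personB")
```

The returned path names the predicate on every step, which is what lets a visualization label the type of each relationship.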
Video clip: Examples of RelFinder
Next steps for the UNLV project Transform all digital collections into linked data (parallel structure) Publish our collections metadata as Linked Open Data Increase linkage with other datasets Design and assess user-friendly interfaces to access and display our data and related data from other datasets Produce a cost-benefit analysis to inform future plans for the development of digital collections
Our Experience Project led, implemented and managed by two busy faculty librarians No model to follow; our model was experimentation and research With interest and motivation, Linked Open Data is a feasible goal
Thank You! Questions? Cory Lampert Head, Digital Collections Silvia Southwick Digital Collections Metadata Librarian