Not Just For Data Geeks! A Practical Approach to Linked Data for Digital Library Managers Cory Lampert and Silvia Southwick Salt Lake City October 9, 2013.



Presenters
- Silvia Southwick, Digital Collections Metadata Librarian
- Cory Lampert, Head of the Digital Collections Department

Today’s Agenda
- Welcome
- Morning
  - Linked Data Basic Concepts
  - Creating Triples and applying EDM activity
  - UNLV Linked Data Project
- Lunch
- Afternoon
  - Phases of data transformation demo and activities
  - OpenRefine: data clean-up
  - Mulgara, SPARQL
- Discussion and Wrap-Up

Why all the fuss?
- My collections are already visible through Google, so who cares?
- This is a topic for catalogers
- It’s too technical / complicated / boring
Actually...
- Linked data is the future of the Web
- Data will no longer be in silos (catalog, CONTENTdm)
- Relationships are powerful and worth the effort

What do we mean by “linked data”?
- Linked Data refers to a set of best practices for publishing and interlinking data on the Web
- Data needs to be machine-readable
- Linked data (Web of Data) is an expansion of the Web we know (Web of documents)

What we do now produces:
- Data (or metadata) encapsulated in records
- Records contained in collections
- Very few links are created within and/or across collections
- Links have to be manually created
- Existing links do not specify the nature of the relationships among records
- This structure hides potential links within and across collections: DATA IS TRAPPED!

Where linked data can take us:
- Our records can be deconstructed and assigned identifiers, creating data that can be used in Web architecture (HTTP, URIs)
- Data can be expressed in triples: statements that are machine-readable when transformed into Resource Description Framework (RDF)
- Linked data can be queried using SPARQL (SPARQL Protocol and RDF Query Language) to retrieve and manipulate data stored in RDF
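As a rough illustration of what a query over triples does, the sketch below matches a single pattern against a small in-memory list. It is plain Python, not a SPARQL engine: the `match` helper is a hypothetical name introduced here, and the two sample triples are the ones used later in this deck. A comparable SPARQL query would look roughly like `SELECT ?o WHERE { :FrankSinatra :knows ?o }`.

```python
# Sample triples taken from the deck's own example; a real project would
# keep these in an RDF store rather than a Python list.
TRIPLES = [
    ("Frank Sinatra", "is an", "entertainer"),
    ("Frank Sinatra", "knows", "Jack Entratter"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern.

    A value of None plays the role of a SPARQL variable: it matches anything.
    (Hypothetical helper, for illustration only.)
    """
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


# "Whom does Frank Sinatra know?"
who_sinatra_knows = [o for (_, _, o) in match(TRIPLES, "Frank Sinatra", "knows")]
print(who_sinatra_knows)

# "Everything we know about Frank Sinatra"
about_sinatra = match(TRIPLES, subject="Frank Sinatra")
print(about_sinatra)
```

Leaving a position as `None` is what makes the pattern a question; filling in all three positions simply checks whether a statement exists.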

Concept: Graph
- A graph is a collection of objects (represented by "nodes"), any of which may be connected by links between them
- Graphs are human readable
- Graphs can represent a metadata record, showing what is known about the item and its relationships
- Triples are the simplest form of a graph

Concept: Triples
A triple is a statement consisting of three parts:
- two things (a "subject" and an "object")
- and a relationship between them (a verb, or "predicate").
The subject-predicate-object triple forms the smallest possible RDF graph (although most RDF graphs consist of many such statements).

Concept: URI
A Uniform Resource Identifier (URI) is simply a recognized standard for identifiers.
- URIs can be used to uniquely identify virtually anything
- URIs play a key role in enabling Linked Data because they represent the subjects, predicates, and objects of triples in a machine-readable form
- URIs are used in HTTP web architecture

Principles of Linked Data
1. Use URIs as names for things (people, organizations, artifacts, abstract concepts, etc.)
2. Use HTTP URIs so that people can look up those names
3. When someone looks up a URI, provide useful information, using the standards (RDF) to create statements
4. Beyond describing the item, include links to other URIs so that people can discover other related items
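The first two principles can be sketched as a small function that turns a human-readable name into an HTTP URI. This is only a minimal illustration: the `http://example.org/id/` namespace and the `mint_uri` helper are assumptions made for this example, not an actual UNLV URI scheme.

```python
import re

# Hypothetical namespace; a real project would use a domain it controls.
BASE = "http://example.org/id/"


def mint_uri(kind, label):
    """Build an HTTP URI from a category (e.g. 'person') and a display label.

    The label is lowercased and every run of characters other than a-z or 0-9
    is collapsed to a single hyphen. Note that this simple rule also replaces
    accented letters, which a production scheme would need to handle.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")
    return f"{BASE}{kind}/{slug}"


print(mint_uri("person", "Frank Sinatra"))
```

Because the same label always yields the same URI, every record that mentions the same person ends up pointing at the same identifier, which is what later makes linking possible.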

Where do we start?
We already have the information we need in our metadata records to create triples. We just need to think of it differently:
- Subjects – Predicates – Objects
- Each metadata field may contain one or several statements
- One metadata record can produce many, many triples
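To make "one record, many triples" concrete, here is a minimal sketch that breaks a record into statements. The record contents, the identifier `photo-001`, and the assumption that multi-valued fields are separated by semicolons are all hypothetical choices made for this example.

```python
def record_to_triples(record_id, record, separator=";"):
    """Turn one metadata record (field -> text) into (subject, predicate, object) tuples.

    Each field name becomes a predicate. A field holding several values,
    separated by `separator`, yields one triple per value.
    Limitation: a single value that itself contains the separator would be split.
    """
    triples = []
    for field, raw in record.items():
        for value in raw.split(separator):
            value = value.strip()
            if value:  # skip empty fragments such as a trailing separator
                triples.append((record_id, field, value))
    return triples


# Hypothetical record, for illustration only.
record = {
    "title": "Frank Sinatra on stage",
    "subject": "Entertainers; Singers",
    "type": "photograph",
}

triples = record_to_triples("photo-001", record)
for triple in triples:
    print(triple)
```

Three fields produce four triples here, because the subject field holds two separate statements.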

Expressing metadata as triples
What are possible triples for this “thing”?

Expressing records as triples: graph example
- Triples are expressed as: subject – predicate – object
- Examples:
  Frank Sinatra – is an – entertainer
  Frank Sinatra – knows – Jack Entratter
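The two example triples can be read as a tiny graph: each subject is a node and each predicate is a labeled edge to an object. The sketch below builds that structure in plain Python; `build_graph` is a hypothetical helper, not part of any particular library.

```python
from collections import defaultdict

triples = [
    ("Frank Sinatra", "is an", "entertainer"),
    ("Frank Sinatra", "knows", "Jack Entratter"),
]


def build_graph(triples):
    """Group triples by subject: node -> list of (edge label, target node)."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


graph = build_graph(triples)

# Print each edge the way it might be drawn on a whiteboard.
for node, edges in graph.items():
    for predicate, target in edges:
        print(f"{node} --{predicate}--> {target}")
```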

Triples and RDF
Once we have triples we need to:
- Assign URIs to each subject, object, and predicate
- Use URIs to form an RDF statement
These steps take the human-readable graph and make it machine-readable!
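A minimal sketch of these two steps follows, writing the result in N-Triples, one of the simplest RDF serializations. The FOAF and RDF predicate URIs are the published ones; the `http://example.org/id/` subject URIs are hypothetical placeholders, and typing Frank Sinatra as `foaf:Person` (rather than as an entertainer) is a simplification chosen for this example.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/id/"  # hypothetical namespace for our own "things"


def nt_term(value):
    """Format a value as an N-Triples term: <URI> or "literal"."""
    if value.startswith(("http://", "https://")):
        return f"<{value}>"
    escaped = (
        value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(
        f"{nt_term(s)} {nt_term(p)} {nt_term(o)} ." for s, p, o in triples
    )


triples = [
    (EX + "frank-sinatra", RDF_TYPE, FOAF + "Person"),
    (EX + "frank-sinatra", FOAF + "knows", EX + "jack-entratter"),
    (EX + "frank-sinatra", FOAF + "name", "Frank Sinatra"),
]

ntriples = to_ntriples(triples)
print(ntriples)
```

The third statement keeps the name as a plain text value (a literal), while the first two point at other URIs, which is where the "linked" in linked data comes from.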

Examples of records
- Showgirls
- Menus
- Dreaming the Skyline

Graphical representation of the photo triples

Adding triples from the other records What are the URIs for subjects, predicates and objects?

Triples: Text/Graph
- RDF
Source: Introduction to RDF at

ACTIVITY: Brainstorming Triples
- Look at the metadata record you brought and think about subjects, predicates, and objects
- Start listing some possible triples in text
- When you have several, try to graph the triples
- Break into groups of four and discuss

Getting From Triples to the next step
Once we understood triples we needed to answer some questions:
- Which triples to create? (literal, outgoing links, incoming links, triples that describe related resources, triples that link to descriptions, triples that indicate provenance of the data, etc.)
- Which vocabularies will be adopted for predicates and objects?
- How to specify URIs for new “things”?

A Little Help From EDM
- "Data model" is a boring way to say that we needed a way to bring some order to the chaos of all these triples
- The Europeana Data Model (EDM) gives us a framework to help organize, structure, and define how we create triples and express them in RDF
- It provides a mapping between our current expression of DC, the new triples, and where we want to go with linked data
- Adopting a current model is preferable to creating your own (interoperability)

Another layer of links: Vocabularies
In addition to the data model, we explored how data could be reconciled with existing linked data sets/vocabularies, learning from EDM.
- Thesaurus for Graphic Materials and LoC
- DCMI Type Vocabulary
- Friend of a Friend Vocabulary (FOAF)
- GeoNames
- Creative Commons Rights Expression vocabulary
- Schema.org
Many more at:
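Reconciliation can be pictured as a lookup from the local value in a record to a term in a shared vocabulary. The sketch below uses the DCMI Type Vocabulary, whose namespace and term names (`StillImage`, `Text`, `Sound`) are the published ones; the local values and the choice of which term each maps to are assumptions made for illustration.

```python
DCMITYPE = "http://purl.org/dc/dcmitype/"

# Hypothetical local "type" values mapped to DCMI Type terms.
LOCAL_TO_DCMI = {
    "photograph": "StillImage",
    "menu": "Text",
    "oral history": "Sound",
}


def reconcile_type(local_value):
    """Return the DCMI Type URI for a local value, or None if unmatched.

    Matching ignores case and extra whitespace, since hand-entered
    metadata is rarely consistent about either.
    """
    key = " ".join(local_value.lower().split())
    term = LOCAL_TO_DCMI.get(key)
    return DCMITYPE + term if term else None


for value in ["Photograph", "  oral   history ", "postcard"]:
    print(repr(value), "->", reconcile_type(value))
```

Values that come back as `None` are the ones that need a human decision: either extend the mapping or accept that the value stays a plain literal.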

ACTIVITY: Exploring EDM
- Please look at your triples and think about the subjects identified
- Browse through the EDM and select classes (classes have properties)
- Then identify several properties and values for that class that apply to your predicates/objects, including both DC and others (EDM, SKOS, etc.)
- Break into groups and discuss

Is this work worth it?
- We add value by creating rich metadata records at our institutions
- When these records are harvested as Dublin Core they lose some of that context
- When harvested metadata records are automatically transformed into linked data (OCLC) they lose even more
- You get “linked data” at a cost

How can we create rich linked data?
Create a complementary data structure that would allow dynamic interlinking among data.
How?
- Export records from the collections
- Deconstruct these records by extracting data from them
- Apply vocabularies
- Adopt a common model to express data
- Publish data in a data space (Linked Data Cloud) where links among data are created automatically
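The idea that links "are created automatically" can be sketched as follows: once records from different collections use the same URI for the same thing, finding the connections is a simple grouping step. The `http://example.org/` URIs and record identifiers are hypothetical; only the `dcterms:subject` predicate URI is a published one.

```python
from collections import defaultdict

DC_SUBJECT = "http://purl.org/dc/terms/subject"
EX = "http://example.org/"  # hypothetical namespace

# Hypothetical statements drawn from three separate collections.
triples = [
    (EX + "showgirls/1", DC_SUBJECT, EX + "person/frank-sinatra"),
    (EX + "menus/7", DC_SUBJECT, EX + "place/example-hotel"),
    (EX + "skyline/3", DC_SUBJECT, EX + "person/frank-sinatra"),
]


def shared_links(triples):
    """Map each object URI to the records that point at it, keeping only
    objects referenced by more than one record."""
    by_object = defaultdict(set)
    for subject, _predicate, obj in triples:
        by_object[obj].add(subject)
    return {
        obj: sorted(subjects)
        for obj, subjects in by_object.items()
        if len(subjects) > 1
    }


links = shared_links(triples)
for obj, records in links.items():
    print(obj, "connects", records)
```

Nobody had to create the link between the Showgirls and Dreaming the Skyline records by hand; it falls out of both records using one shared identifier.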

UNLV Linked Data Project
Goals:
- Study the feasibility of developing a common process that would allow the conversion of our collection records into linked data, preserving their original expressivity and richness
- Publish data from our collections in the Linked Data Cloud to improve discoverability and connections with other related data sets on the Web

How we started
- Created a study group in the Library (members from various areas of the library)
- Watched webinars on the topic and held discussions after the webinars
- Created an internal wiki with linked data resources
- Participated in linked data interest groups
- Followed the literature on this topic

Phases of the project
- Literature review
- Evaluating technologies
  - Research existing technologies and best practices
  - Develop small experiments with technologies
  - Decide which technologies to adopt, adapt, or develop
- Data preparation
  - Select and prepare records from digital collections to participate in the project
- Run process to generate data from the original records
- Publish on the Linked Data Cloud
- Assess results

[Diagram: linked data publishing workflow. Stages: Type of Data, Data Preparation, Data Storage, Data Publication. Elements shown: Structured Data (CONTENTdm), RDF-izers for Excel or XML, RDF Store, RDF Files, Web Server, Data Source API, Linked Data Wrapper, Linked Data Interface, Drupal DB, Drupal RDFa, Linked Data on the Web.]
Adapted from Linked Data: Evolving the Web into a Global Data Space by Heath and Bizer

The Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch.

Project Perspective: Sara
- Why we needed Sara’s help
- How she accelerated our learning
- What she has learned so far
- Her thoughts on linked data beyond digital collections

Project Demo

Wrap-up and Discussion: Challenges
- Developing a common process for transforming records into data, because digital collections adopt different metadata schemas
- Creating URIs for all our unique materials
- Finding ways to associate URIs with “things” in CONTENTdm
- Adopting linked data while it is still at an early stage of development