eXtensible Catalog: Tools for the creation and use of RDA, FRBRized and linked data
David Lindahl
eXtensible Catalog Organization
University of Rochester, River Campus Libraries, Rochester, NY
LITA National Forum, September 30, 2011
Funders and Sponsors
Major Funding
– Andrew W. Mellon Foundation
Sponsors
– Consortium of Academic and Research Libraries in Illinois (CARLI)
– Kyushu University
– University of North Carolina at Charlotte
– University of Rochester
User Research
Problem: User research is of limited value if a library doesn’t have control over its discovery environment
Our solution:
– Develop our own software (eXtensible Catalog)
– Offer a modular architecture (4 “toolkits”)
– Build in tons of configurability
– Use established standards and protocols
– Give it away (open source)
XC User Research Approach
What articles, books and other resources had researchers used most recently?
– How did they know the items existed?
– How did they obtain them?
– How did they use them?
How do they keep current in their fields?
User Research Findings
Users want to choose between versions of a resource and see relationships between resources
– Underlying XC metadata is based on the FRBR model: works, expressions, manifestations, etc.
– Uses some RDA data elements in the FRBR structure
– Metadata services aggregate/group FRBR entities in the User Interface
User Research Findings
Users have preferred material and format types, depending upon their projects
– Show online materials only
– Exclude microforms
Users want to know why items appear on a search result list
– Show keywords in context
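The two findings above map onto simple operations on search results. The following is a minimal sketch only, assuming a hypothetical record shape (dicts with `title`, `format` and `online` keys); the function names are illustrative and are not taken from XC.

```python
def filter_records(records, online_only=False, exclude_formats=()):
    """Keep records matching the user's material and format preferences."""
    excluded = {f.lower() for f in exclude_formats}
    kept = []
    for rec in records:
        # "Show online materials only"
        if online_only and not rec.get("online", False):
            continue
        # "Exclude microforms" (or any other unwanted format)
        if rec.get("format", "").lower() in excluded:
            continue
        kept.append(rec)
    return kept


def keyword_in_context(text, keyword, width=20):
    """Return a snippet of `text` around the first match of `keyword`, or None."""
    pos = text.lower().find(keyword.lower())
    if pos < 0:
        return None
    start = max(0, pos - width)
    end = min(len(text), pos + len(keyword) + width)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + text[start:end] + suffix
```

Showing the snippet next to each hit is one way to tell a user why an item appears on the result list.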
Acting on User Research Findings
XC: “Taking Control” of metadata
More Control over Metadata
More Options for Customizing the User Interface
XC Schema
Dublin Core terms (all)
RDA – subset of elements and role designators
XC elements (newly-defined) – when necessary to contain MARC vocabularies, linking fields, etc.
Discovery Interface Translating User Research Findings into XC Functionality
FRBR Structure - Pyramid
[Diagram labels: Work, Expression, Manifestation, Holdings]
FRBR Structure - Hourglass
[Diagram labels: Manifestation, Expression, Work, Holdings, Work, Expression]
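The two structures can be pictured as records that point "up" to the next FRBR level. This is a rough sketch under stated assumptions: the `Record` class and `works_for` helper are hypothetical, and reading the hourglass as one manifestation that carries more than one expression and work is an interpretation of the diagram labels, not something the slides spell out.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Record:
    record_id: str
    level: str  # "work", "expression", "manifestation" or "holdings"
    uplinks: List[str] = field(default_factory=list)  # ids one level up


def works_for(record_id: str, store: Dict[str, Record]) -> List[str]:
    """Return ids of all work records reachable by following uplinks."""
    found = []
    seen = set()
    stack = [record_id]
    while stack:
        rid = stack.pop()
        if rid in seen or rid not in store:
            continue
        seen.add(rid)
        rec = store[rid]
        if rec.level == "work":
            found.append(rid)
        stack.extend(rec.uplinks)
    return sorted(found)
```

In the pyramid case every record has a single uplink, so a holdings record resolves to exactly one work; in the assumed hourglass case a manifestation has several uplinks and resolves to several works.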
Software Overview Discovery, Metadata Management, and Connectivity
XC Software
OAI Toolkit (ILS Connectivity)
– Synchronize data with XC
NCIP Toolkit (ILS Connectivity)
– Circ. status
– Account info
MST Toolkit (Metadata Services)
– Cleanup
– Format Convert
Drupal Toolkit (User Interface)
– Search
– Browse
Each toolkit is eXtensible with add-on packages
– User Interface Features
– More Metadata Services
– ILS Export Scripts
– XSLT Scripts
– ILS connectors
XC Software
[Diagram labels: the four toolkits, with Voyager ILS, Metadata, Live Circ. Data, User Interface, Voyager “Driver”]
Drupal Toolkit
Drupal Toolkit Features
Search/Browse
Customization and theming
Platform for applications
– Library website
– Modules add functionality
Drupal Toolkit In Use Kyushu University
Drupal Toolkit In Use “Creating Denver Public Library
Metadata Services Toolkit (MST)
MST Features
Collect metadata from repositories
Process metadata with services:
– Normalize
– Convert
– Merge
– Add identifiers
Platform for building new services
MST In Use Demonstration Rochester
MST In Use Perseus Digital Tufts University (dev.)
MST In Use Union Ministerio de Cultura, Madrid, Spain
Metadata Services Toolkit
XC Metadata Services Toolkit services:
– MARC Normalization, DC Normalization (clean-up)
– MARC to XC Transformation, DC to XC Transformation (format conversion)
– XC Aggregation (merge)
– XC Authority (add identifiers)
MST decides which services and in which order to process incoming records
[Diagram labels: OAI-PMH; ILS, IR, Digital Repository, Discovery Service]
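The slide does not say how the MST chooses and orders services. One plausible approach, sketched below purely as an assumption, is to route each record by its current format: every service declares the format it accepts and the format it emits, and the chain is built by following those declarations. The format labels and the `plan_services` function are hypothetical.

```python
# (service name, input format, output format); the formats are illustrative labels
SERVICES = [
    ("MARC Normalization", "marcxml", "marcxml-normalized"),
    ("DC Normalization", "dc", "dc-normalized"),
    ("MARC to XC Transformation", "marcxml-normalized", "xc"),
    ("DC to XC Transformation", "dc-normalized", "xc"),
    ("XC Aggregation", "xc", "xc-aggregated"),
]


def plan_services(record_format, services=SERVICES):
    """Return the ordered list of service names an incoming record passes through."""
    plan = []
    current = record_format
    progressed = True
    while progressed:
        progressed = False
        for name, fmt_in, fmt_out in services:
            if fmt_in == current:
                plan.append(name)
                current = fmt_out  # the next service must accept this format
                progressed = True
                break
    return plan
```

With this table a MARCXML record is normalized, transformed and then aggregated, while a Dublin Core record takes the parallel DC path into the same aggregation step.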
Creating XC Schema data from MARC
Parse MARCXML records into linked FRBR-based records
Holdings can be separate or embedded
Manage uplinks
[Diagram: MARCXML Bibliographic → XC Work, XC Expression, XC Manifestation; MARCXML Holdings → XC Holdings. Uplinks: 004 “Uplink”, Manifestation Held, Expression Manifested, Work Expressed]
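As a rough illustration of splitting one MARCXML bibliographic record into linked work, expression and manifestation records, the sketch below uses Python's standard `xml.etree.ElementTree`. The field-to-entity mapping (100$a and 245$a to the work, 041$a to the expression, 260$b to the manifestation), the id scheme and the dict layout are simplifying assumptions for the example, not XC's actual mapping; only the uplink names follow the slide.

```python
import xml.etree.ElementTree as ET

NS = {"marc": "http://www.loc.gov/MARC21/slim"}  # MARCXML namespace


def subfield(record, tag, code):
    """Return the text of the first matching subfield, or None."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    el = record.find(path, NS)
    return el.text.strip() if el is not None and el.text else None


def frbrize(marcxml):
    """Split a single MARCXML <record> into linked FRBR-style records."""
    record = ET.fromstring(marcxml)
    control = record.find("marc:controlfield[@tag='001']", NS)
    if control is None or not control.text:
        raise ValueError("record has no 001 control number")
    base = control.text.strip()  # hypothetical basis for local ids

    work = {
        "id": f"{base}/work",
        "creator": subfield(record, "100", "a"),
        "title": subfield(record, "245", "a"),
    }
    expression = {
        "id": f"{base}/expression",
        "workExpressed": work["id"],  # uplink to the work
        "language": subfield(record, "041", "a"),
    }
    manifestation = {
        "id": f"{base}/manifestation",
        "expressionManifested": expression["id"],  # uplink to the expression
        "publisher": subfield(record, "260", "b"),
    }
    return {"work": work, "expression": expression, "manifestation": manifestation}
```

A holdings record would add one more level in the same way, carrying a "manifestation held" uplink derived from the MARC holdings 004 field.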
Following one MARC record through XC
Steps:
1. Convert from raw MARC to MARCXML (minor cleanup)
2. Normalize MARCXML (major cleanup)
3. Transform from MARCXML to XC (FRBRize)
4. Aggregate at each FRBR level (match and merge)
5. Index records / create WEMs (one for each unique Manifestation)
[Diagram: MARC → 1. Convert → MARCXML (dirty) → 2. Normalize → MARCXML (clean) → 3. Transform → XC W, E, M records → 4. Aggregate (match and merge with other XC records) → 5. Index → WEM index]
Data is ready for search and faceted browse
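The "match and merge" of the aggregation step can be sketched for work records as follows. The match rule used here (normalized creator plus title) is an assumed simplification chosen for the example; the slides do not specify XC's matching criteria, and the function names are hypothetical.

```python
def normalize_key(work):
    """Build a match key from creator and title (an assumed, simplified rule)."""
    parts = (work.get("creator") or "", work.get("title") or "")
    # Lowercase, collapse whitespace, and trim trailing MARC-style punctuation.
    return tuple(" ".join(p.lower().split()).strip(" .,:;/") for p in parts)


def aggregate_works(works):
    """Merge works sharing a match key.

    Returns (surviving works, mapping of every input id to its surviving id).
    """
    by_key = {}
    redirect = {}
    for work in works:
        key = normalize_key(work)
        survivor = by_key.setdefault(key, work)  # first record with a key wins
        redirect[work["id"]] = survivor["id"]
    return list(by_key.values()), redirect
```

The redirect mapping is what lets expression records that pointed at a merged-away work be re-pointed to the surviving one before indexing.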
Connectivity Tools
OAI Toolkit
– Synchronizes metadata with XC
– Cleans up MARC data
– Uses export scripts
NCIP 2 Toolkit
– Looks up circulation status
– Places requests (renew, hold)
– Retrieves user account information
– Enables resource sharing
Evergreen ILS
OCLC Worldcat Navigator
SirsiDynix Symphony
PALCI’s EZBorrow – Test bed available now!
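Synchronization over OAI-PMH typically means repeated `ListRecords` requests, using the protocol's `from` argument to fetch only records changed since the last harvest and `resumptionToken` to page through large result sets. The sketch below only builds the request URLs; the base URL and the `marc21` metadata prefix are placeholders, since prefixes are defined by each repository, and nothing here is taken from the OAI Toolkit's own code.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix, from_date=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # Per the protocol, resumptionToken is an exclusive argument.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date  # harvest only records changed since this date
    return f"{base_url}?{urlencode(params)}"
```

A harvester would call this once with the date of its previous run, then keep requesting with each returned resumption token until the repository stops issuing one.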
NCIP 2 Toolkit: Testbed
RDA and FRBR Helping libraries make the transition
U.S. RDA Test Coordinating Committee
Overall Recommendation: “…the Coordinating Committee recommends that RDA should be implemented by LC, NAL, and NLM no sooner than January 2013.”
Bottom line…by January 2013…
Libraries will be able to use RDA in MARC and RDA in a non-MARC environment at the same time.
XC provides one option for doing this.
U.S. RDA Test Coordinating Committee
Recommended Tasks and Action Item: “Solicit demonstrations of prototype input and discovery systems that use the RDA element set (including relationships)...”
Timeframe for completion: within 18 months.
Breaking down the Recommendation
“Solicit demonstrations of prototype input and discovery systems that use the RDA element set (including relationships)...”
What XC Provides
– prototype: XC is near production-ready
– input: MARC data (bulk)
– discovery: XC has a discovery interface
– RDA element set: Uses subsets of RDA elements and roles to date
– including relationships: Primary relationships between work, expression and item so far
XC: Facilitating the Transition
XC enables risk-free experimentation with RDA data while the library community develops a successor to MARC
XC can serve as a “bridge” between using RDA in MARC-based systems and in emerging applications
Linked Data in XC
Library of Congress statement, May 13, Transforming our Bibliographic Framework
“Experiment with Semantic Web and linked data technologies to see what benefits to the bibliographic framework they offer our community and how our current models need to be adjusted to take fuller advantage of these benefits.”
Semantic Web and Linked Data
The Semantic Web refers to a set of technologies that allow computers to understand the meaning of information on the web
Linked data is a mechanism for exposing, sharing and connecting data on the web
Semantic Web and Linked Data
If everything has a unique identifier, then information from one website can be related to information from another via a computer program
Everything includes people, places, things, vocabularies, metadata elements, web documents, …
Getting Started
To create Linked Data, we need:
– Software to transform legacy data
– Analysis: mapping of legacy metadata to Linked Data properties
Converting MARC to Linked Data
What XC software can do:
– Convert MARC codes to vocabulary values
– Remove extraneous data
– Normalize inconsistencies
– Map most MARC fields/subfields and parse to appropriate FRBR Group 1 entity records
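As one small example of converting a MARC code to a vocabulary value, the sketch below reads the three-letter language code from positions 35-37 of the MARC 008 field and looks it up in a table. The table is a tiny illustrative subset of the MARC language codes, and the function name is hypothetical; XC's own services are not shown here.

```python
# A few MARC language codes and their labels (illustrative subset only)
LANGUAGE_CODES = {
    "eng": "English",
    "fre": "French",
    "ger": "German",
    "spa": "Spanish",
}


def language_from_008(field_008):
    """Return the language label for the code in 008 positions 35-37, or None."""
    if len(field_008) < 38:
        return None  # field too short to contain a language code
    code = field_008[35:38]
    return LANGUAGE_CODES.get(code)
```

Unknown or blank codes return `None` rather than a guessed value, so they can be flagged for review instead of silently mapped.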
Best Practices for Linked Data
- Unique identifiers for XC metadata records
- Data elements from registered schemas
- Registered vocabularies
By attempting to follow best practices in XC for Linked Data, we hope to facilitate eventual output of XC metadata in RDF.
RDF Triple
Subject: This resource
Predicate: has subject
Object: Poets, American
URIs for each?
RDF Triple – Record identifiers
Subject: This resource → oai:mst.rochester.edu:MST/MARCToXCTransformation/
Predicate: has subject
Object: Poets, American
Identifiers for XC Schema records
[XC record excerpt: PS3505.U /.52 B Sawyer-Lauçanno, Christopher, E.E. Cummings : Cummings, E. E. (Edward Estlin), Poets, American-20th century-Biography.]
A persistent, globally unique identifier for each XC Schema record
RDF Triple - Registered Data Elements
Subject: This resource → oai:mst.rochester.edu:MST/MARCToXCTransformation/
Predicate: has subject → extensiblecatalog.info/Elements/subject
Object: Poets, American
XC Schema Elements
Dublin Core terms (DCMI) - all
RDA – subset of elements and role designators
XC elements (newly-defined) – when necessary to enable XC system functionality
XC Schema “work” record: data elements
[XC record excerpt: PS3505.U /.52 B Sawyer-Lauçanno, Christopher, E.E. Cummings : Cummings, E. E. (Edward Estlin), Poets, American-20th century-Biography.]
Data elements from registered namespaces for DC terms, RDA roles and vocab, and XC
RDF Triple - Registered Vocabularies
Subject: This resource → oai:mst.rochester.edu:MST/MARCToXCTransformation/
Predicate: has subject → extensiblecatalog.info/Elements/subject
Object: Poets, American → s/sh #concept
XC Work record with embedded URI for LCSH “Poets, American”
… Poets, American-20th century-Biography. Poets, American 20th century Biography …
RDF Triple
Subject: This resource → oai:mst.rochester.edu:MST/MARCToXCTransformation/
Predicate: has subject → extensiblecatalog.info/Elements/subject
Object: Poets, American → s/sh #concept
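Once subject, predicate and object each have an identifier, the statement can be written out as a single N-Triples line. The sketch below is illustrative only: the subject URI is an `example.org` placeholder because the record identifier on the slide is incomplete, the `http://` scheme on the XC element URI is an assumption, and the object is shown as a plain literal rather than the vocabulary URI.

```python
def term(value, literal=False):
    """Format one N-Triples term: a quoted literal or an <IRI>."""
    if literal:
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    return f"<{value}>"


def ntriple(subject, predicate, obj, literal_object=False):
    """Serialize a subject-predicate-object statement as one N-Triples line."""
    return f"{term(subject)} {term(predicate)} {term(obj, literal_object)} ."


# Placeholder subject; assumed scheme on the predicate.
line = ntriple(
    "http://example.org/record/1",
    "http://extensiblecatalog.info/Elements/subject",
    "Poets, American",
    literal_object=True,
)
```

Replacing the literal with the registered vocabulary URI for the heading turns the object into something other datasets can link to as well.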
XC Software is “Linked Data Ready”
Converts metadata to FRBR entities with RDA elements and roles
Adds identifiers for “things”
Provides a platform for service development
Synchronizes with existing tools
– Cataloging staff client
– Institutional repository
Download XC software at eXtensibleCatalog.org