International Bibliographic Standards, Linked Data, and the Impact on Library Cataloging Gordon Dunsire A NISO/DCMI Webinar 24 August 2011.

Presentation transcript:

International Bibliographic Standards, Linked Data, and the Impact on Library Cataloging Gordon Dunsire A NISO/DCMI Webinar 24 August 2011

Overview
- Bibliographic standards
  - International Federation of Library Associations and Institutions (IFLA)
  - Others
- Representation in Resource Description Framework (RDF)
- Creating triples from catalogue records
- Impact and implications for catalogues

IFLA standards
- RDF representations of standards for “universal” bibliographic control are being developed
- “FR” (Functional Requirements) family of models
  - For Bibliographic Records (FRBR)
  - For Authority Data (FRAD)
  - For Subject Authority Data (FRSAD)
- International Standard Bibliographic Description (ISBD)
  - Record structure and content standard for exchange of national metadata
- UNIMARC
  - Encoding for ISBD records (Bibliographic) and FRAD (Authorities)

Representation in RDF
- Entities => RDF classes
  - E.g. FRBR “Person”
- Attributes, tags, (sub)fields, relationships => RDF properties
  - E.g. ISBD “title proper”
  - E.g. UNIMARC “200 $a” (title proper)
  - E.g. FRBR “title of the manifestation”
- Controlled term values => SKOS vocabularies
  - E.g. ISBD Area 0 (content and media type)

FR family
- Each model has its own namespace
  - To reflect historical development
  - Each re-uses earlier RDF elements
- Consolidated model under development
  - Being informed by analysis of RDF representation
- FRBR RDF published
  - FRBRer (entity-relationship) ontology
    - Namespace elements plus OWL
  - FRBRoo (object-oriented)
    - Extension of CIDOC Conceptual Reference Model (for museums)
- FRAD and FRSAD imminent
  - Approved at IFLA 2011 conference

ISBD
- Element set and vocabularies for content and media types
  - Namespace now published
- DC Application Profile in development
  - Models the ISBD record
    - What properties (fields)
    - Mandatory? Repeatable?
    - Aggregated statements
    - Sub-elements and punctuation

ISBD AP snippet

UNIMARC
- Proposal for RDF representation made at IFLA 2011
  - papers/ifla77/187-dunsire-en.pdf
- Discussed with Permanent UNIMARC Committee
- Decision taken to proceed

Other library standards in RDF (1)
- RDA: resource description and access
  - Content standard based on FR models
  - Refines the FR properties
  - Many more controlled vocabularies than AACR
    - Anglo-American Cataloguing Rules
- MODS/MADS (Metadata Object/Authority Description Schema)
  - Metadata structure based on MARC21
  - Library of Congress Name Authority File in MADS RDF
  - RDF representation of MODS just beginning...

Other library standards in RDF (2)
- BIBO: Bibliographic Ontology
  - Classes and properties for citations and bibliographic references
- DCMI Metadata Terms (Dublin Core)
  - High-level common-denominator classes and properties for memory institution metadata
- Lots of controlled vocabularies
  - Library of Congress Subject Headings, Rameau (French subject headings), SWD (German subject headings), Dewey Decimal Classification, RDA vocabularies, etc.

From record to triples (in 9 stages)
- Very large numbers of records
  - Catalogue records, finding aids, etc.
  - 300 million; 1 billion?
- High quality metadata
  - In comparison with other communities
- Each record may generate many triples
  - 30 “raw” triples (no inferences) per MARC record?
- Very, very large numbers of triples
  - Billions? Trillions?
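A quick back-of-envelope check of the scale, using only the slide's own tentative figures (300 million or 1 billion records, roughly 30 raw triples per record). The function name is illustrative and not part of the presentation.

```python
# Rough estimate of raw triple counts from the slide's tentative figures.
RAW_TRIPLES_PER_RECORD = 30  # the slide's guess for one MARC record, no inferences


def estimate_triples(record_count: int, per_record: int = RAW_TRIPLES_PER_RECORD) -> int:
    """Return the number of raw (uninferred) triples for a given record count."""
    return record_count * per_record


for records in (300_000_000, 1_000_000_000):
    print(f"{records:,} records -> {estimate_triples(records):,} raw triples")
```

On these figures the raw total is 9 to 30 billion triples, before any inferred triples are counted.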

1. Take a record
Field/attribute | Value
Record ID | 54321
Title | Museum archives: an introduction
Author | Wythe, Deborah
Date | 2004
LCSH | Museum archives
Media/GMD | Electronic
Content form | Text

2. Disaggregate to single statements
Record | Attribute | Value
54321 | (has) title | Museum archives: an introduction
54321 | (has) author | Wythe, Deborah
54321 | (has) date | 2004
54321 | (has) LCSH | Museum archives
54321 | (has) media type | Electronic
54321 | (has) content form | Text
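Stages 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration only: the dictionary layout and the function name are assumptions, and the field labels are kept as they appear in the record (the slide relabels "Media/GMD" as "media type").

```python
# Stage 2 sketch: split one flat record into (record, attribute, value) statements.
record = {
    "Record ID": "54321",
    "Title": "Museum archives: an introduction",
    "Author": "Wythe, Deborah",
    "Date": "2004",
    "LCSH": "Museum archives",
    "Media/GMD": "Electronic",
    "Content form": "Text",
}


def disaggregate(rec: dict, id_field: str = "Record ID") -> list:
    """Return one (record_id, attribute, value) tuple per non-ID field."""
    record_id = rec[id_field]
    return [(record_id, field, value) for field, value in rec.items() if field != id_field]


statements = disaggregate(record)
for statement in statements:
    print(" | ".join(statement))
```

Each output line is a single statement that no longer depends on the rest of the record.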

3. Create URI for record
- Must be unique, so the record ID is no good on its own
- http URIs are a good thing (W3C)
- So add record ID to a unique http domain
  - E.g. (unique to the library)
  - (or
- This is not a URL!

4. Replace record ID with URI
URI | Attribute | Value
mlx:54321 | (has) title | Museum archives: an introduction
mlx:54321 | (has) author | Wythe, Deborah
mlx:54321 | (has) date | 2004
mlx:54321 | (has) LCSH | Museum archives
mlx:54321 | (has) media type | Electronic
mlx:54321 | (has) content form | Text
“mlx” = qname (xmlns) = shorthand for “
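A minimal sketch of stages 3 and 4: append the record ID to a base URI, then abbreviate the result as a qname. The base URI below is a made-up placeholder, since the library's real domain does not appear in this transcript; only the "mlx" prefix comes from the slide.

```python
# Stages 3-4 sketch: mint a URI for the record and abbreviate it as a qname.
# ASSUMPTION: BASE_URI is a hypothetical placeholder, not the library's real domain.
BASE_URI = "http://example.org/catalogue/"
PREFIX = "mlx"  # the qname prefix used on the slide


def mint_uri(record_id: str, base: str = BASE_URI) -> str:
    """Append the local record ID to a base URI that is unique to the library."""
    return base + record_id


def to_qname(uri: str, base: str = BASE_URI, prefix: str = PREFIX) -> str:
    """Abbreviate a full URI to prefix:local form when it starts with the base."""
    if not uri.startswith(base):
        raise ValueError(f"{uri!r} is not in the {prefix} namespace")
    return f"{prefix}:{uri[len(base):]}"


uri = mint_uri("54321")
print(uri)
print(to_qname(uri))
```

The record ID only needs to be unique within the library; the base URI supplies the global uniqueness.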

5. Find URIs for attributes
- Attributes are modelled as RDF properties (predicates) in “element set” namespaces
  - E.g. Dublin Core terms (dct); ISBD (isbd); FRBR (frbrer); RDA (rdaxxx); Bibliographic Ontology (bibo); etc.
- Choose a namespace, find property with same (or closest) “meaning” (e.g. definition) as attribute
  - Nearest property minimises loss of information
  - Get URI for property
- If no suitable property, choose another namespace
  - Properties do not have to come from single namespace
  - Match and mix!

5 (cont). Find URI for title
- (dct:title)
- (isbd:P1014)
  - hasTitleProper
- (rdaGr1:titleProper)

5 (cont). Find URI for author
- dct:creator
- rdarole:author
- (isbd does not cover “headings”)

5 (cont). Find URI for date
- dct:date
- isbd:P1018
  - hasDateOfPublicationProductionDistribution
- rdaGr1:dateOfPublication

5 (cont). Find URI for LCSH
- LCSH is a subject vocabulary
  - Controlled terms
- So attribute is really “subject”
  - And the term itself is the value
- dct:subject

5 (cont). Find URI for media type
- Assuming record uses new ISBD Area 0...
- isbd:P1003
  - hasMediaType

5 (cont). Find URI for content form
- Assuming record uses new ISBD Area 0...
- isbd:P1001
  - hasContentForm

6. Replace attributes with URIs
URI | Property | Value
mlx:54321 | isbd:P1014 | Museum archives: an introduction
mlx:54321 | rdarole:author | Wythe, Deborah
mlx:54321 | isbd:P1018 | 2004
mlx:54321 | dct:subject | Museum archives
mlx:54321 | isbd:P1003 | Electronic
mlx:54321 | isbd:P1001 | Text
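Stage 6 amounts to a lookup from attribute label to the property chosen in stage 5. The sketch below uses the property qnames shown on the slides; the dictionary and function names are illustrative, and a different mix of namespaces would work the same way.

```python
# Stage 6 sketch: swap attribute labels for the property qnames chosen in stage 5.
PROPERTY_FOR = {
    "title": "isbd:P1014",
    "author": "rdarole:author",
    "date": "isbd:P1018",
    "LCSH": "dct:subject",
    "media type": "isbd:P1003",
    "content form": "isbd:P1001",
}

statements = [
    ("mlx:54321", "title", "Museum archives: an introduction"),
    ("mlx:54321", "author", "Wythe, Deborah"),
    ("mlx:54321", "date", "2004"),
    ("mlx:54321", "LCSH", "Museum archives"),
    ("mlx:54321", "media type", "Electronic"),
    ("mlx:54321", "content form", "Text"),
]


def replace_attributes(stmts, mapping):
    """Return statements with each attribute replaced by its mapped property."""
    result = []
    for subject, attribute, value in stmts:
        if attribute not in mapping:
            raise KeyError(f"no property chosen for attribute {attribute!r}")
        result.append((subject, mapping[attribute], value))
    return result


with_properties = replace_attributes(statements, PROPERTY_FOR)
for row in with_properties:
    print(" | ".join(row))
```

Raising an error for an unmapped attribute makes the "choose another namespace" step explicit instead of silently dropping data.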

7. Find URIs for values
- If object of a triple is a URI, it can link to the subject of another triple with the same URI
  - Linked data!
- Values from controlled vocabularies may have URIs
  - Possible vocabularies: author, subject, ISBD Area 0
  - NOT: title, date
- For author: Virtual International Authority File (VIAF)
- For LCSH: Library of Congress Authorities & Vocabularies
- For ISBD Area 0: Open Metadata Registry

7 (cont). Find URI for author
- Author: Wythe, Deborah
- VIAF:
- viaf: /#Wythe,+Deborah

7 (cont). Find URI for subject (LCSH)
- LCSH: Museum archives
- LoC:
- lcsh:/sh #concept

7 (cont). Find URIs for ISBD Area 0
- Media type: Electronic
  - ISBD media type
  - isbdmt:T1002
- Content form: Text
  - ISBD Content form
  - isbdcf:T1009

8. Replace values with URIs
subject | predicate | object
mlx:54321 | isbd:P1014 | “Museum archives: an introduction”
mlx:54321 | rdarole:author | viaf: /#Wythe,+Deborah
mlx:54321 | isbd:P1018 | “2004”
mlx:54321 | dct:subject | lcsh:/sh #concept
mlx:54321 | isbd:P1003 | isbdmt:T1002
mlx:54321 | isbd:P1001 | isbdcf:T1009
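A sketch of stage 8 for the two ISBD Area 0 values, whose term qnames are given in full on the slides. The author and subject lookups are left out here because their VIAF and LCSH identifiers are incomplete in this transcript; a real lookup would query those services. Names in the code are illustrative.

```python
# Stage 8 sketch: replace controlled-vocabulary literals with term URIs (as qnames).
# Only the ISBD Area 0 terms are mapped; author and subject are deliberately omitted.
TERM_URI = {
    ("isbd:P1003", "Electronic"): "isbdmt:T1002",
    ("isbd:P1001", "Text"): "isbdcf:T1009",
}

triples = [
    ("mlx:54321", "isbd:P1014", "Museum archives: an introduction"),
    ("mlx:54321", "isbd:P1018", "2004"),
    ("mlx:54321", "isbd:P1003", "Electronic"),
    ("mlx:54321", "isbd:P1001", "Text"),
]


def replace_values(trps, term_uri):
    """Swap a literal for its term URI when one is known; otherwise keep the literal."""
    return [(s, p, term_uri.get((p, o), o)) for s, p, o in trps]


linked = replace_values(triples, TERM_URI)
for row in linked:
    print(" | ".join(row))
```

Keying the lookup on both property and value means the same string (say, "Text") is only replaced where it is used as a controlled term; the title and date stay as literals.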

9. Publish triples (linked data)
mlx:54321 | isbd:P1014 | “Museum archives: an introduction”
mlx:54321 | rdarole:author | viaf: /#Wythe,+Deborah
mlx:54321 | isbd:P1018 | “2004”
mlx:54321 | dct:subject | lcsh:/sh #concept
mlx:54321 | isbd:P1003 | isbdmt:T1002
mlx:54321 | isbd:P1001 | isbdcf:T1009

Linked data chains
mlx:54321 | dct:subject | lcsh:/sh #concept
lcsh:/sh #concept | skos:related | rameau:XXX
rameau:XXX | frbrer:isSubjectOf | mly:98765
rameau:XXX | skos:prefLabel | “archives du musée”
mly:98765 | rda:titleOfTheWork | “Managing archives in museums”
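The chain can be followed mechanically: whenever the object of one triple is also the subject of another, keep walking. The sketch below does a breadth-first walk over the slide's five triples. "lcsh:CONCEPT" is a placeholder standing in for the LCSH identifier, which is incomplete in this transcript, in the same spirit as the slide's own "rameau:XXX"; the function name is illustrative.

```python
from collections import defaultdict, deque

# ASSUMPTION: placeholder for the LCSH concept URI, which is truncated in the transcript.
LCSH = "lcsh:CONCEPT"

triples = [
    ("mlx:54321", "dct:subject", LCSH),
    (LCSH, "skos:related", "rameau:XXX"),
    ("rameau:XXX", "frbrer:isSubjectOf", "mly:98765"),
    ("rameau:XXX", "skos:prefLabel", "archives du musée"),
    ("mly:98765", "rda:titleOfTheWork", "Managing archives in museums"),
]


def follow_chain(start, trps):
    """Breadth-first walk: return every triple reachable from the start subject."""
    by_subject = defaultdict(list)
    for s, p, o in trps:
        by_subject[s].append((p, o))
    seen, queue, reached = {start}, deque([start]), []
    while queue:
        node = queue.popleft()
        for p, o in by_subject.get(node, []):
            reached.append((node, p, o))
            # Only objects that are themselves subjects can extend the chain.
            if o in by_subject and o not in seen:
                seen.add(o)
                queue.append(o)
    return reached


chain = follow_chain("mlx:54321", triples)
for row in chain:
    print(" | ".join(row))
```

Starting from one library's record, the walk reaches a French subject label and a work held by a second library (mly), which is the point of the slide.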

Linked data cluster = “record”
mlx:54321 | isbd:P1014 | “Museum archives: an introduction”
mlx:54321 | rdarole:author | viaf: /#Wythe,+Deborah
mlx:54321 | isbd:P1018 | “2004”
mlx:54321 | dct:subject | lcsh:/sh #concept
mlx:54321 | isbd:P1003 | isbdmt:T1002
mlx:54321 | isbd:P1001 | isbdcf:T1009

Duplication and legacy records
- Many copies of legacy records
  - Copied and amended for local use
  - Danger of minting multiple URIs for the same resource
- National bibliographic agencies have significant role to play
  - As memory/cultural institutions
  - The linked-data memory/culture of a nation
  - Proposal from IFLA Namespaces Task Group to IFLA Bibliography Section

FRBRization
- FRBR splits record into four functional parts
  - User-centred functions
- Subject of a FRBR triple is one of the parts, not the resource as a whole
  - But subject of ISBD triple is the resource as a whole
- Class collisions can be avoided by using unbounded (no domain or range) versions of properties

A short history of the evolution of the library catalogue record

In the beginning the catalogue card
Lee, T. B. Cataloguing has a future. - Audio disc (Spoken word). - Donated by the author. 1. Metadata

From flat-file record to relational record
Bibliographic description
- Author: Lee, T. B. (linked to Name authority: Name, Biography, ...)
- Title: Cataloguing has a future
- Content type: Spoken word
- Carrier type: Audio disc
- Subject: Metadata (linked to Subject authority: Term, Definition, ...)
- Provenance: Donated by the author

From flat-file description to FRBR record
Bibliographic description divided among Work, Expression, Manifestation and Item, with links to the Name authority (Name, Biography, ...) and Subject authority (Term, Definition, ...)
- Author: Lee, T. B.
- Title: Cataloguing has a future
- Content type: Spoken word
- Carrier type: Audio disc
- Subject: Metadata
- Provenance: Donated by the author

From FRBR record to extinction!
Work, Expression, Manifestation and Item statements, each value drawn from an external source
- Author: Lee, T. B. (Name authority)
- Subject: Metadata (Subject authority)
- Content type: Spoken word (RDA content type term)
- Carrier type: Audio disc (RDA carrier type term)
- Title: Cataloguing has a future (Amazon/Publisher)
- Provenance: Donated by the author (Donor)

Where is the record?
- Implicit, not explicit
  - Everywhere and nowhere
- A semantic Web will allow machines to create the record just-in-time
  - We will not have to maintain records just-in-case
- The user will have control over the presentation
  - I want to see an archive or library or museum or Amazon or Google or Flickr or ? display
- And by avoiding duplication, we can all get on with describing new stuff...

The hyperdimensional (Tardis) card
Lee, T. B. Cataloguing has a future. - Audio disc (Spoken word). - Donated by the author. 1. Metadata
Audio shop | Lee Museum | Spoken word archive | W3C Library
“TARDIS four port USB hub, for office-bound Time Lords: Open a time vortex on your desk” – Pocket-lint

Metadata focus
Shift of focus of metadata creation, maintenance, storage, preservation (by professionals, amateurs, machines)
From: Record
To: Statement(s) = triple(s)
But metadata display aggregates triples (from multiple sources) to create records on the fly
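Aggregating triples into a record on the fly can be sketched as grouping by subject across several sources. The split of the example triples into two sources is an assumption made purely for illustration, as are the variable and function names.

```python
from collections import defaultdict

# ASSUMPTION: the division into two sources is invented for this illustration.
source_a = [
    ("mlx:54321", "isbd:P1014", "Museum archives: an introduction"),
    ("mlx:54321", "isbd:P1018", "2004"),
]
source_b = [
    ("mlx:54321", "isbd:P1003", "isbdmt:T1002"),
    ("mly:98765", "rda:titleOfTheWork", "Managing archives in museums"),
]


def build_record(subject, *sources):
    """Collect predicate -> [objects] for one subject across all given sources."""
    record = defaultdict(list)
    for source in sources:
        for s, p, o in source:
            if s == subject:
                record[p].append(o)
    return dict(record)


record = build_record("mlx:54321", source_a, source_b)
for predicate, objects in record.items():
    print(predicate, "->", objects)
```

Nothing called a "record" is stored anywhere; it exists only as the result of the query, and a display layer is free to present it however the user prefers.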

Thank you!
- Open Metadata Registry
- IFLA Namespaces Task Group (needs updated)
- DCMI/RDA Task Group (in revision)