Metadata standards and interoperability 384C – Organizing Information Spring 2016 Karen Wickett School of Information University of Texas at Austin.


1 Metadata standards and interoperability 384C – Organizing Information Spring 2016 Karen Wickett School of Information University of Texas at Austin

2 The world of standards A standard is any agreed-upon means of doing something Standards can be formally created and adopted, or merely customary With standardization, products and processes have a level of consistency and predictability – This can make production and use more efficient

3 Goals of metadata standards Enable more reliable and consistent description within a system For example – If we agree to use separate fields for first and last names for authors, search results can be properly alphabetized and more easily read – This is an issue with how the data is structured and organized, regardless of how it is displayed Facilitate the sharing of data across different systems – i.e. interoperability
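
The point about separate name fields can be sketched in a few lines of Python. This is only an illustration: the field names first_name and last_name and the sample authors are assumptions, not part of any particular standard.

```python
# Hypothetical author records; the field names are illustrative only.
structured = [
    {"first_name": "Ursula", "last_name": "Le Guin"},
    {"first_name": "Octavia", "last_name": "Butler"},
    {"first_name": "Italo", "last_name": "Calvino"},
]
# The same authors stored as a single free-text field.
unstructured = ["Ursula Le Guin", "Octavia Butler", "Italo Calvino"]

# With separate fields, sorting by surname is a simple key lookup.
by_surname = sorted(structured, key=lambda a: (a["last_name"], a["first_name"]))
display = [f'{a["last_name"]}, {a["first_name"]}' for a in by_surname]

# With one free-text field, a plain sort orders by first name instead.
naive = sorted(unstructured)

print(display)
print(naive)
```

The structured records can be displayed in any form ("Butler, Octavia" or "Octavia Butler") without changing how they sort, which is the sense in which this is a data structure issue rather than a display issue.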

4 Interoperability Interoperable records facilitate information access and exchange across contexts – not just in cultural heritage informatics For example – Amazon.com sells products from many different providers. Getting product records from suppliers that match up in terms of content and structure makes their job easier.

5 Interoperability How interoperable are these descriptions? Astor Wines Wine.com Sherry-Lehmann 67 Wine

6 Types of standards From Elings and Waibel: Data structure (attributes, elements, or fields) – Dublin Core, CDWA, EAD, MARC Data content (values) – CCO, RDA, DACS Data format (record syntax) – XML, MARC21, EAD-XML, MARC-XML Data exchange (interchange syntax) – Z39.50, OAI-PMH

7 Types Elings and Waibel's categories may seem familiar – Recall Gilliland's similar categories These categories are useful for understanding the interlocking pieces of an info org system – doing information organization work involves working across these categories But standards may straddle categories, and the instances across categories are not always independent Data content standards are intended to be independent of encodings, but they tend to be written with particular applications or encodings in mind

8 Multiple standards at work A cataloger uses RDA to determine: – That a book's title should be part of its description – The wording, spelling, capitalization, and punctuation of the title The cataloger uses MARC to record title information in a consistent form that computers can process

9 Multiple standards at work Two computer networks can use Z39.50 to determine how to exchange their MARC catalog records Then a user at Library A can search Library B's catalog and not discern a difference in the way that information is structured and presented.

10 Multiple standards at work An archivist uses EAD to determine that their archival finding aid should include a scope and content note The archivist uses DACS for guidance on what information should appear in the scope and content and how to express that information

11 Multiple standards at work A museum curator is documenting a new acquisition in their database system The collection management system includes a field for the "Work Type" which is a core attribute from CDWA. Guidance for describing the work type is given in CCO The Art and Architecture Thesaurus (AAT) includes vocabulary terms that can be used to describe the work type.

12 Multiple standards at work Later, collection data is mapped from CDWA (data structure) to the Europeana Data Model (EDM) (data structure), for aggregation into Europeana and subsequent data reuse. In this mapping, the proprietary database format (data format) is translated to the EDM's RDF/XML schema (data format).

13 Developing and adopting standards Organizations adopt standards because the benefits of creating products or services that work together can be great However, developing standards and forging that agreement is a difficult process Metadata content standards can be complicated to use, and there is room for interpretive flexibility

14 Content standards: considerations Content standards tend to be voluminous and intricate. Why? – Because documents are so varied. Most content standards will try to implement a few basic guidelines supplemented by rules and options for special cases Ideally, the basic guidelines will be based on clearly articulated goals and principles.

15 Example: RDA goals RDA has articulated a concrete set of descriptive goals and principles, including: Enable description of any resource – not just printed materials Align with the FRBR conceptual model and its objectives Create content descriptions that can be used in multiple encodings and displays Retain backward compatibility with existing records

16 Example: RDA Principles One principle is that descriptions should reflect "the resource's representation of itself" Sound familiar? This is linked to the objective of finding known items – The catalog description should match how the item is known to others, which is most likely from the item itself

17 Example: RDA guidelines This principle of transcription underlies the basic guideline for RDA titles – the "title proper" or primary title should come from the preferred source of information – which for books, is the title page The wording comes from the title page, but the capitalization and punctuation are standardized for all titles.

18 Easy, right? In the catalog

19 Hmm... In the catalog

20 RDA special title cases What if... Some introductory words on the title page seem like they're not really part of the title – e.g. "Miyazaki's Spirited Away" The title is given in two languages There is a spelling mistake in the title The document is a manifestation of a commonly known work but has a slightly different title than most manifestations – e.g. William Shakespeare's Hamlet A subtitle appears under what seems to be the main title The title is over one paragraph long

21 Keeping standards relevant A published standard is immediately out of date Particular institutions will issue their own rules for interpreting standards, which smaller institutions may or may not choose to adopt.

22 Levels of interoperability Different kinds of standards enable different kinds of interoperability Let's say someone gives you a metadata record to incorporate into your database of records from your schema What can you do with it?

23 Levels of interoperability System interoperability – your computer can read the file Syntax interoperability – your database understands the file format Structural interoperability – The attributes match other records in the database Semantic interoperability – The values in the fields are consistent with other records in the database
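
The four levels can be read as a sequence of checks on an incoming record. The sketch below is a rough illustration under stated assumptions: the local database is assumed to accept JSON, and EXPECTED_FIELDS, WORK_TYPES, and interoperability_level are hypothetical names invented for this example.

```python
import json

# Hypothetical local schema: expected attributes and a controlled vocabulary.
EXPECTED_FIELDS = {"title", "creator", "work_type"}
WORK_TYPES = {"painting", "sculpture", "photograph"}


def interoperability_level(raw: bytes) -> str:
    """Return the highest level of interoperability the incoming record reaches."""
    # System: can we read the bytes at all?
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return "none"
    # Syntax: is it in a format our database understands (JSON in this sketch)?
    try:
        record = json.loads(text)
    except json.JSONDecodeError:
        return "system"
    if not isinstance(record, dict):
        return "system"
    # Structural: do the attributes match our schema?
    if set(record) != EXPECTED_FIELDS:
        return "syntax"
    # Semantic: are the values consistent with our other records?
    if record["work_type"] not in WORK_TYPES:
        return "structural"
    return "semantic"


incoming = b'{"title": "Water Lilies", "creator": "Monet", "work_type": "Oil on canvas"}'
print(interoperability_level(incoming))
```

The sample record parses and has the right attributes, but its work type is not drawn from the assumed vocabulary, so it stops at the structural level.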

24 Methods of interoperability What do we do when we have records that weren't created using the same schema?

25 Derivation New schemas are subsets, supersets, or direct translations of existing schemas. CDWA Lite is a subset of CDWA – removes some attributes French Dublin Core is a translated version of Dublin Core – same attributes, different labels Gateway to Educational Materials extends Dublin Core – adds elements
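
The three kinds of derivation map onto simple set operations. In this sketch the base schema is a small made-up element set, the French labels are illustrative rather than quoted from an official translation, and "audience" is a hypothetical added element.

```python
# A small illustrative base schema (not the full Dublin Core or CDWA element set).
base = {"title", "creator", "date", "subject"}

# Subset: drop some attributes (as CDWA Lite does relative to CDWA).
lite = base - {"subject"}

# Translation: same attributes, different labels.
french_labels = {
    "title": "titre",
    "creator": "créateur",
    "date": "date",
    "subject": "sujet",
}

# Superset: add elements (as Gateway to Educational Materials does
# relative to Dublin Core).
extended = base | {"audience"}

print(sorted(lite))
print(sorted(extended))
```

Because each derived schema keeps a known relationship to the base, records can be moved back to the base schema by dropping added elements or relabeling attributes.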

26 Application profiles Application profiles mix attributes from different existing schemas or mix usage rules from different existing schemas The DPLA MAP uses elements from – Dublin Core – The Europeana Data Model – A "Basic Geo" schema created by the W3C (wgs84) for simple geographic information – The DPLA itself (published separately from the profile)
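
An application profile can be pictured as a list of borrowed properties plus local usage rules. The sketch below is not the DPLA MAP: the prefixed property names and the required/optional rules are assumptions chosen to show the idea, and PROFILE, source_schemas, and missing_required are hypothetical names.

```python
# Illustrative profile: each property is borrowed from a named source schema,
# and the profile adds its own usage rule (required or optional).
PROFILE = {
    "dcterms:title": {"required": True},
    "dcterms:creator": {"required": False},
    "edm:rights": {"required": True},
    "wgs84:lat": {"required": False},
    "wgs84:long": {"required": False},
}


def source_schemas(profile):
    """List the schema prefixes the profile draws on."""
    return sorted({name.split(":", 1)[0] for name in profile})


def missing_required(record, profile):
    """Return the required properties that are absent from a record."""
    return sorted(
        prop for prop, rule in profile.items()
        if rule["required"] and prop not in record
    )


print(source_schemas(PROFILE))
print(missing_required({"dcterms:title": "Water Lilies"}, PROFILE))
```

Keeping the prefix on each property records where its definition comes from, so the profile itself only has to state how the property is used locally.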

27 Crosswalks Crosswalks are mappings between one schema and another For example – a crosswalk might specify that the Title element in CDWA should be mapped to the Title element in Dublin Core Crosswalks can map only schema elements that are semantically equivalent – or they can map semantically "close" elements to each other
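
A crosswalk can be represented as a lookup table from source elements to target elements. In this sketch only the Title to Title pairing comes from the slide; the other rows, the "exact"/"close" labels, and the function name apply_crosswalk are assumptions made for illustration.

```python
# Illustrative crosswalk from a CDWA-like record to Dublin Core-like elements.
# Each entry is (target element, how close the semantic match is).
CROSSWALK = {
    "Title": ("title", "exact"),
    "Creator Description": ("creator", "close"),
    "Work Type": ("type", "close"),
}


def apply_crosswalk(record, crosswalk, allow_close=True):
    """Map a source record to the target schema; return (mapped, unmapped)."""
    mapped, unmapped = {}, {}
    for field, value in record.items():
        target = crosswalk.get(field)
        # Fields with no mapping, or with only a "close" mapping when close
        # matches are not allowed, are set aside instead of being guessed at.
        if target is None or (target[1] == "close" and not allow_close):
            unmapped[field] = value
        else:
            mapped[target[0]] = value
    return mapped, unmapped


source = {"Title": "Water Lilies", "Work Type": "painting", "Dimensions": "200 x 180 cm"}
print(apply_crosswalk(source, CROSSWALK))
print(apply_crosswalk(source, CROSSWALK, allow_close=False))
```

The allow_close switch reflects the choice described above: a strict crosswalk loses more data but never changes meaning, while a looser one carries more data across at the cost of some precision.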

28 Switching languages Switches map multiple schemas to a single switching language For example – Multiple content schemas could all be mapped to Dublin Core – The Dublin Core content could then in turn be mapped to something else – This is more efficient than mapping each individual schema to the result Imagine a multilingual conversation in which everyone has a different native language but they all also speak French...
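
The efficiency argument can be made concrete with a small sketch. The two mappings and their field names are hypothetical, and the count assumes every schema needs to reach every other schema in both directions.

```python
def compose(to_hub, from_hub):
    """Build a source -> target mapping by routing each field through the hub."""
    return {src: from_hub[hub] for src, hub in to_hub.items() if hub in from_hub}


def mappings_needed(n_schemas):
    """Return (directed pairwise crosswalks, crosswalks to and from one hub)."""
    return n_schemas * (n_schemas - 1), 2 * n_schemas


# Hypothetical mappings into and out of a Dublin Core-like switching language.
schema_a_to_hub = {"Title": "title", "Maker": "creator"}
hub_to_schema_b = {"title": "name", "creator": "author"}

print(compose(schema_a_to_hub, hub_to_schema_b))
print(mappings_needed(6))
```

Under that assumption, six schemas need 30 direct crosswalks but only 12 through a hub, and the gap widens as more schemas are added. The cost is that anything the switching language cannot express is lost on the way through.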

29 Frameworks A basic set of concepts and specifications that are agreed upon by a particular group For example – The Warwick Framework is an early specification that designates the idea of a "container" as an aggregation of metadata sets, or "packages" Agreements on ideas like containers and packages facilitate the sharing of different sorts of units – The DPLA, for example, relies on 'service hubs' that aggregate metadata sets from individual contributing institutions

30 Registries Registries publish information about metadata schemas Registries constitute reference information that facilitates the development of new application profiles, crosswalks, etc. Open Metadata Registry

31 Aggregated infrastructures Europeana, the European cultural heritage data aggregation The Digital Public Library of America (DPLA) These systems are enabled by standards, crosswalking, and frameworks

