Thesauri, interoperability and the role of ISO 25964


1 Thesauri, interoperability and the role of ISO 25964
Stella G Dextre Clarke Project Leader, ISO NP 25964 Chair, ISKO UK

2 Summary
- Brief thesaurus chronology
- What role does the thesaurus have now?
- The demand for interoperability
- Highlights from ISO 25964

3 Thesauri – a brief chronology
- Once upon a time, thesauri were at the cutting edge of Information Retrieval (IR) technology
- Heyday in the 1960s and 1970s; after the mid-1980s, popularity declined
- ISO 2788 and ISO 5964 (for monolingual and multilingual thesauri respectively) came out
- The Internet and intranets in the 1990s brought resurgence and diversification (into other forms of controlled vocabulary, such as “taxonomies”)
- TREC (1992 onwards) has shown the dominance of statistical methods in IR. But stats alone are not enough!
- At the turn of the century, thesauri were back in fashion and work began on refurbishing the British and International standards
- Semantic Web and SKOS developments provide more incentive
- Today, even Google employs some “taxonomists”.

4 Slide unearthed from TR’01 (2001): The thesaurus coming back into fashion!

5

6

7

8 The role of controlled vocabularies today
- Needed where full text is not available, e.g. image libraries and audio resources
- Invaluable for crossing language barriers
- Especially useful in-house, where page rank algorithms are less effective
- Essential to access vast databases and catalogues of bibliographic data from decades past
- Provide added value in combination with other methods, often hidden behind the scenes
In all these contexts, interoperability is key.

9 Introducing ISO 25964
ISO 25964: Thesauri and interoperability with other vocabularies
- Part 1: Thesauri for information retrieval
- Part 2: Interoperability with other vocabularies
- It updates ISO 2788 and ISO 5964, based on BS 8723, with much reworking
- Part 1, published in August 2011, covers monolingual and multilingual thesauri
- Part 2, to be published in January 2013, covers mapping between thesauri and other types of vocabulary
- Information retrieval is seen as the main application, including indexing as well as searching

10 What does “interoperability” mean?
Definition: ability of two or more systems or components to exchange information and to use the information that has been exchanged.
In the case of thesauri and other KOS, broadly speaking, interoperability applies at more than one level:
- presenting data in a standard way to enable import and use in other systems (ISO 25964 Part 1)
- providing mappings between the terms/concepts of one KOS and those of another (ISO 25964 Part 2)
- plus any other type of exchange between one KOS and another (ISO 25964 Part 2)

11 Linked Data Cloud in 2011, by Richard Cyganiak and Anja Jentzsch; see http://lod-cloud.net/

12 A simplified view of interoperability
My thesaurus

13 Interoperability between vocabularies (see ISO 25964-2)
WordNet, GEMET, LCSH, My thesaurus, Your thesaurus, Dewey, AGROVOC

14 Interoperability between applications (see ISO 25964-1)
- Indexing/tagging software
- Vocabulary management software
- Search/browsing software

15 Content of ISO 25964-1, supporting interoperability between applications
- thesaurus content and construction, mono- or multi-lingual (i.e. a complete update of ISO 2788 and ISO 5964)
- guidance on applying facet analysis to thesauri
- guidance on managing thesaurus development and maintenance
- functional requirements for software to manage thesauri
- a data model and derived XML schema

16

17 Content of ISO 25964-2, supporting interoperability between vocabularies
- Models for mapping
- Guidelines for mapping
- Recommendations on mapping types
- How to handle pre-coordination
- Mapping to vocabularies other than thesauri:
  - classification schemes
  - file plans (classification schemes used for records management)
  - taxonomies
  - subject heading schemes
  - ontologies
  - terminologies
  - name authority lists
  - synonym rings
- Brief guidance on handling mappings data

18 Recommended “Models for mapping”
[Diagram of mapping models; only the node labels survive in this transcript: B, F, C, D, H, E, G, P, Q, R, S]

19 What does “mapping” mean?
Definition: process of establishing relationships between the concepts of one vocabulary and those of another.
- Recommended types of mapping are based on the standard internal relationship types, basically: equivalence, hierarchical and associative
- Greater differentiation of mapping types is allowed, but is optional, to avoid complexity in simple applications

20 Full range of ISO 25964-2 mapping types
Basic mapping types:
- Equivalence
  - Simple
  - Compound
    - Intersecting compound equivalence
    - Cumulative compound equivalence
- Hierarchical
  - Broader
  - Narrower
- Associative
Simple equivalence can be marked as “Exact” or “Inexact”.

21 Full range of ISO 25964-2 mapping types with examples
Basic mapping types:
- Equivalence
  - Simple: Laptop computers EQ Notebook computers
  - Compound
    - Intersecting compound equivalence: Women executives EQ Women + Executives
    - Cumulative compound equivalence: Inland waterways EQ Rivers | Canals
- Hierarchical
  - Broader: Streets BM Roads
  - Narrower: Roads NM Streets
- Associative: e-Learning RM Distance education
- Exact equivalence: Aubergines =EQ Egg-plants
- Inexact equivalence: Horticulture ~EQ Gardening
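The simple mapping types and tags on this slide can be held in a small data structure. The following is a minimal sketch only, not the ISO 25964 data model: the class names `MappingType` and `Mapping`, and the `inverse` helper, are hypothetical names introduced for illustration. It assumes that reading a simple mapping in the opposite direction swaps BM and NM (as the Streets/Roads pair on the slide suggests) and leaves the other tags unchanged; compound mappings are deliberately left out.

```python
from dataclasses import dataclass
from enum import Enum


class MappingType(Enum):
    """Mapping tags as written on the slide (illustrative names)."""
    EQ = "EQ"           # equivalence
    EXACT_EQ = "=EQ"    # exact simple equivalence
    INEXACT_EQ = "~EQ"  # inexact simple equivalence
    BM = "BM"           # broader mapping
    NM = "NM"           # narrower mapping
    RM = "RM"           # related (associative) mapping


# Assumption: only the hierarchical tags change when a simple mapping
# is read from target back to source.
INVERSE = {MappingType.BM: MappingType.NM, MappingType.NM: MappingType.BM}


@dataclass(frozen=True)
class Mapping:
    source: str
    kind: MappingType
    target: str

    def inverse(self) -> "Mapping":
        """Return the same mapping read in the opposite direction."""
        return Mapping(self.target, INVERSE.get(self.kind, self.kind), self.source)

    def __str__(self) -> str:
        return f"{self.source} {self.kind.value} {self.target}"


# The slide's own simple examples.
examples = [
    Mapping("Laptop computers", MappingType.EQ, "Notebook computers"),
    Mapping("Streets", MappingType.BM, "Roads"),
    Mapping("e-Learning", MappingType.RM, "Distance education"),
    Mapping("Aubergines", MappingType.EXACT_EQ, "Egg-plants"),
    Mapping("Horticulture", MappingType.INEXACT_EQ, "Gardening"),
]

for m in examples:
    print(m, "/", m.inverse())
```

Inverting the second example gives "Roads NM Streets", which matches the Narrower example on the slide.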

22 The joys of pre-coordination
Examples:
- (084.12) photographs of lions (from UDC)
- Automobiles--Air conditioning--Maintenance and repair (from LCSH)
Occurs characteristically in subject heading schemes, classification schemes, taxonomies and file plans.
Mapping obliges use of the more complicated mapping types, especially compound equivalence.
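As a rough illustration of why pre-coordination leads to compound mappings, the LCSH example above can be split into its component parts, each of which would then need a counterpart concept in the target thesaurus. This is a sketch under assumptions: the function name `split_heading` is hypothetical, the "--" separator is taken from the example as printed on the slide, and whether each part really has an equivalent concept remains an editorial decision that code cannot make.

```python
def split_heading(heading: str, separator: str = "--") -> list[str]:
    """Split a pre-coordinated heading into its component parts."""
    return [part.strip() for part in heading.split(separator) if part.strip()]


heading = "Automobiles--Air conditioning--Maintenance and repair"
parts = split_heading(heading)

# Candidate intersecting compound equivalence, written with the "+"
# notation used on the mapping-types slide. It is only a candidate:
# a human editor still has to confirm each target concept.
candidate = f"{heading} EQ " + " + ".join(parts)
print(candidate)
```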

23 Vocabularies other than thesauri
ISO 25964 is a standard for thesauri; it does not attempt to standardize other types of KOS. It guides only on interoperability between thesauri and other types of KOS.
The clause on each KOS type presents:
- Key characteristics of the KOS (non-normative)
- Semantic components/relationships (non-normative)
- Recommendations for interoperability between the KOS and a thesaurus, especially mapping (normative)

24 Vocabularies other than thesauri
The following are dealt with in ISO 25964:
- classification schemes
- file plans (classification schemes used for records management)
- taxonomies
- subject heading schemes
- name authority lists
- synonym rings
- terminologies
- ontologies

25 General prospects for mapping
- thesauri: mapping relatively straightforward
- classification schemes, file plans, taxonomies, subject heading schemes: concept mapping useful in IR, pre-coordination common
- name authority lists: mapping usually straightforward but common concepts few
- synonym rings, terminologies, ontologies: concept mapping rarely useful; complementary uses are a more likely prospect

26 Ontologies are special…
- Definition of ontology excludes “lightweight” examples such as thesauri and classification schemes
- The Gruber/Studer definition is adopted, and interpreted broadly enough to admit OWL-based examples such as ORE and FOAF
- Mapping between ontologies and thesauri is not recommended
- Interoperability recommendations focus on use cases such as reengineering a thesaurus as an ontology, and complementary use of a thesaurus with an ontology

27 Simple ontology illustration (credit: Jutta Lindenthal; see http://www

28 Structural comparison
The illustration is used in ISO 25964 to draw out key similarities and differences between ontologies and thesauri. The aim is to encourage emerging applications in which thesauri and ontologies can usefully interoperate.

29 Interoperability at the level of standards
ISO 2709, Z39.50, MARC 21, SPARQL, Z39.19, OWL, SKOS, Zthes, JSON, REST, ISO 25964, RDF, BS 8723, HTTP, SRU, XML

30 Stella G. Dextre Clarke and Marcia Lei Zeng, “From ISO 2788 to ISO 25964: the evolution of thesaurus standards towards interoperability and data modeling”, Information Standards Quarterly (Winter 2012, v.24, no. 1). Available at:

31 The thesaurus coming back into fashion…

32 …although often hidden behind the scenes

33 And interoperability makes new tricks easier…

34 Want a copy of the standards?
- Download Part 1 from ISO at
- Part 2 will be in the ISO catalogue next year
- Order from your national standards body (e.g. BSI, DIN, ANSI, AFNOR)
- Some public/academic reference libraries stock them
- ISO standards are not cheap to purchase
- However, the data model and XML schema for exchange of thesaurus data are available online without charge or password control. Go to

35 APPENDIX
Some extra slides with more detail

36 Who is involved in developing the standard?
- A Working Group (WG8), under the ISO subcommittee known as ISO TC46/SC9, has drafted the standard
- WG8 has members from 15 countries
- The WG8 Secretariat is provided by NISO in the USA
Currently active members of WG8 include:
- Johan De Smedt
- Marianne Lykke
- Stella Dextre Clarke (Leader)
- Esther Scheven
- Michèle Hudon
- Douglas Tudhope
- Daniel Kless
- Leonard Will
- Jutta Lindenthal
- Marcia Lei Zeng

37 Intersecting versus cumulative equivalence

38 Mapping example from a pre-coordinated concept: inland waterway transport
Inland waterway transport EQ transport + (rivers | canals)
[Image: The Rialto Bridge, Venice, by Michele Marieschi © Bridgeman Education]
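One way to see what this compound mapping does at search time is to treat "+" as intersection and "|" as union over the sets of documents indexed with each target concept. The sketch below rests on that reading of the slide's notation; the toy `index`, its document numbers, and the helper names `intersecting` and `cumulative` are all invented for illustration.

```python
# Hypothetical postings: target-vocabulary concept -> ids of documents
# indexed with that concept.
index = {
    "transport": {1, 2, 3, 5},
    "rivers": {2, 4},
    "canals": {3, 6},
}


def intersecting(*postings: set) -> set:
    """'+' in the slide notation: every concept must apply."""
    return set.intersection(*postings)


def cumulative(*postings: set) -> set:
    """'|' in the slide notation: any one of the concepts may apply."""
    return set.union(*postings)


# Inland waterway transport EQ transport + (rivers | canals)
hits = intersecting(
    index["transport"],
    cumulative(index["rivers"], index["canals"]),
)
print(sorted(hits))
```

With this toy index, documents 2 and 3 are retrieved: each is about transport and about either rivers or canals. Document 4 is about rivers but not transport, so it is excluded.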

