1 Introducing some Standards Paul Miller Interoperability Focus UK Office for Library & Information Networking (UKOLN)

Presentation transcript:

1 Introducing some Standards Paul Miller Interoperability Focus UK Office for Library & Information Networking (UKOLN) UKOLN is funded by Resource: the Council for Museums, Archives and Libraries, the Joint Information Systems Committee (JISC) of the Further and Higher Education Funding Councils, as well as by project funding from JISC and the EU. UKOLN also receives support from the Universities of Bath and Hull where staff are based.

2 So… why use standards? Benefit from the expertise of others Enforce rigour in internal practices Facilitate interoperability (and access) –Considered deployment of standard solutions makes access to your resources feasible for many.

3 What do standards do? Help identify what’s important –CIMI’s “Access Points” –Mandatory fields Allow for consistent use of terminology –Name Authority Files –Thesauri –Look–up tables Enable internal and external data exchange or access Reduce duplication of effort Minimise (hopefully!) wasted effort Reflect consensus.

4 What types of standard are there? Terminology –‘Roma’, not ‘Rome’ –‘Roma’ is preferred to ‘Rome’ Format –‘Miller, A.P. 1971–’, not ‘Paul Miller’ ‘Semantics’ –A gross simplification, and a very big bucket –‘Creator’, ‘Subject’, ‘Title’, ‘Description’… Syntax – Transfer –ftp://ftp.niso.org/ ….

5 Terminological Standards (Based upon an earlier presentation with Matthew Stiff of mda) See

6 The need for control… European Community E.E.C. Common Market European Union !

7 Without control of terms... Users are –incorrectly utilising search terms –failing to find significant resources –suffering from information overload –almost as well off using Google Creators are –cataloguing inconsistently –unable to convey hierarchical concepts –Scotland is in United Kingdom is in Europe is in... –perpetuating localised terminology –unable to assess, let alone undertake, integration projects.

8 With control... Users might –gain more effective access to a resource –gain far more effective access across resources –reduce the number of ‘false hits’ –find what they are looking for –even learn to think and express themselves in a structured manner. Creators might –produce more valuable resources –convey complex semantic and structural concepts –move towards disciplinary, national, international or global terminologies –effectively integrate both new and existing resources.

9 Controlled Vocabulary European Union ✓ E.E.C. ✗ Common Market ✗ European Community ... Etc. With a controlled vocabulary, one or more of these terms might be permitted. Use of the others for record creation or retrieval would be rejected by the system.

10 Thesaurus-based Control European Union [preferred term] E.E.C. [synonym] Common Market [synonym] European Community [synonym]... Etc. [synonyms] In a thesaurus, all of the terms might be considered equally valid, with one identified as the preferred term and the others as synonyms But... Are they really synonymous...?
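
The contrast between slides 9 and 10 can be sketched in a few lines of Python. This is only an illustration: the names `PERMITTED`, `SYNONYMS`, `check_controlled` and `resolve_thesaurus` are invented here, and treating 'European Union' as the single permitted term is an assumption taken from the slide example.

```python
# Assumption: 'European Union' is the one permitted / preferred term.
PERMITTED = {"European Union"}

# Thesaurus-style equivalence: non-preferred term -> preferred term.
SYNONYMS = {
    "E.E.C.": "European Union",
    "Common Market": "European Union",
    "European Community": "European Union",
}


def check_controlled(term):
    """Controlled vocabulary: accept permitted terms, reject everything else."""
    if term not in PERMITTED:
        raise ValueError(f"term not permitted: {term!r}")
    return term


def resolve_thesaurus(term):
    """Thesaurus: accept any known term and map it to the preferred term."""
    if term in PERMITTED:
        return term
    if term in SYNONYMS:
        return SYNONYMS[term]
    raise KeyError(f"unknown term: {term!r}")


print(resolve_thesaurus("Common Market"))
```

Whether the mapped terms really are synonymous, as slide 10 asks, is exactly the judgement this table hides: the lookup treats them as interchangeable.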

11 Thesauri A traditional thesaurus defines synonyms and, perhaps, antonyms for terms within a given language. E.g. –‘workshop’ atelier, factory, mill, plant, shop, studio, workroom...or... ? class, discussion group, seminar, study group.

12 Thesauri in Information Retrieval In the context of information retrieval, thesauri do more, facilitating the creation of hierarchies of meaning....

13 Hierarchies of Meaning ‘Glass’ ‘Beer Glass’ ‘Wine Glass’ ‘Red wine glass’ ‘White wine glass’

14 Thesaurus Components Most thesauri are constructed in a standard form, as defined by ISO 2788 and various national standards. –ISO 5964 extends discussion to multilingual issues Four basic relationships are fundamental in thesaurus construction and use... –Equivalence (preferred and non-preferred terms) –Hierarchy (‘glass’ is broader than ‘wine glass’) –Association (establishes non-hierarchical relationships) –Scope notes (provide guidance and clarification).

15 Equivalence As with the European Union example, there are often situations in which users or cataloguers wish to allow multiple synonyms for any one term. –In these cases, one term may be defined as a preferred term “Electricity Plant USE Power Station” –Here, ‘Power Station’ is the preferred term Example from RCHME Thesaurus of Monument Types, © RCHME 1995.

16 Hierarchy An important capability of thesauri is their ability to reflect hierarchies, whether conceptual, spatial, or whatever. –Individual thesaurus entries are linked to a class (CL), as well as to broader (BT) and narrower (NT) terms. “BAYONET CL Armour and Weapons BT Edged Weapon NT Plug Bayonet NT Socket Bayonet” Example from mda Archaeological Objects Thesaurus, © mda, English Heritage, RCHME 1997.

17 Association In any large thesaurus, a significant number of terms will mean similar things or cover related areas, without necessarily being synonyms or fitting into a defined hierarchy. –Related Terms (RT) can be used to show these links within the thesaurus. “CHURCH RT Churchyard RT Crypt RT Presbytery” Example from RCHME Thesaurus of Monument Types, © RCHME 1995.

18 Scope Notes Thesaurus entries can often be terse, and difficult to interpret for the non-expert. –Scope Notes (SN) serve to clarify entries and avoid possible confusion. They serve to embody the underlying concept, rather than the language-specific word. “CHITTING HOUSE SN A building in which potatoes can sprout and germinate” “FERRY SN Includes associated structures” Examples from RCHME Thesaurus of Monument Types, © RCHME 1995.

19 Putting it all together... “FERROUS METAL EXTRACTION SITE SN Includes preliminary processing CL Industrial BT Metal Industry Site NT Ironstone Mine NT Ironstone Pit NT Ironstone Workings RT Ironstone Workings” Example from RCHME Thesaurus of Monument Types, © RCHME 1995.
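
As a rough sketch, the entry on slide 19 could be held in a small record type whose fields mirror the SN, CL, BT, NT and RT codes. The `Entry` class and `render` function are hypothetical names made up for this illustration; they are not taken from any thesaurus software.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entry:
    """One thesaurus entry, using the field codes from the slides."""
    term: str
    scope_note: str = ""                                # SN
    term_class: str = ""                                # CL
    broader: List[str] = field(default_factory=list)    # BT
    narrower: List[str] = field(default_factory=list)   # NT
    related: List[str] = field(default_factory=list)    # RT


def render(entry):
    """Lay the entry out one relationship per line, in SN/CL/BT/NT/RT order."""
    lines = [entry.term.upper()]
    if entry.scope_note:
        lines.append(f"SN {entry.scope_note}")
    if entry.term_class:
        lines.append(f"CL {entry.term_class}")
    lines += [f"BT {t}" for t in entry.broader]
    lines += [f"NT {t}" for t in entry.narrower]
    lines += [f"RT {t}" for t in entry.related]
    return "\n".join(lines)


# Values copied from the slide 19 example.
site = Entry(
    term="Ferrous Metal Extraction Site",
    scope_note="Includes preliminary processing",
    term_class="Industrial",
    broader=["Metal Industry Site"],
    narrower=["Ironstone Mine", "Ironstone Pit", "Ironstone Workings"],
    related=["Ironstone Workings"],
)
print(render(site))
```

Equivalence (USE) is left out of this record on purpose, since a non-preferred term is better stored as a pointer to the preferred entry than as an entry of its own.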

20 Working with the tools Thesauri, controlled vocabulary lists, etc, are all useful, but they –often rely upon both cataloguers and users having direct access to these usually weighty tomes –require an awareness of cataloguing issues and practice to be used most effectively –have predominantly developed within –– rather than between –– communities, regions, etc. –rapidly become destabilised as distributed users add new terms in a non–complementary fashion

21 Effective distributed thesauri [1] In order for thesauri to be effective in the online environment, research and good practice need to address: –mapping between existing thesauri –technical mapping –semantic mapping are ‘E.E.C.’ and ‘Common Market’ synonymous? –restructuring one or both where necessary/possible –inter–disciplinary mapping the ‘God Problem’ –addressing legacy data

22 Effective distributed thesauri [2] –delivery of training to remote cataloguers –providing online access to more existing thesauri –development of cataloguing tools –capable of accessing various remote thesauri and selecting terms in an intuitive, timely, fashion Nordic Metadata Project Dublin Core tool –raising the profile of thesauri as “A Good Thing”! –Development of user interface tools –capable of integrating various remote thesauri into the search process without slowing it intolerably, losing contextual awareness or subjecting the browser to information overload.

23 Some links English Heritage Thesauri Getty Thesauri HASSET biron.essex.ac.uk/searching/zhasset.html HIgh Level Thesaurus Project (HILT) hilt.cdlr.strath.ac.uk/ Pan–Government Thesaurus Should be visible from eventually.

24 Metadata

25 What is ‘Metadata’? –meaningless jargon –or a fashionable, and terribly misused, term for what we’ve always done –or “a means of turning data into information” –and “data about data” –and the name of a person (‘Tony Blair’) –and the title of a book (‘The Name of the Rose’).

26 What is ‘Metadata’? Metadata exists for almost anything: People Places Objects Concepts Web pages Databases.

27 What is ‘Metadata’? Metadata fulfils three main functions: Description of resource content –“What is it?” Description of resource form –“How is it constructed?” Description of resource use –“Can I afford it?”.

28 Challenges Many flavours of metadata –which one do I use? Managing change –new varieties, and evolution of existing forms Tension between functionality and simplicity, extensibility and interoperability [Diagram labels: Functions, features, and cool stuff; Simplicity and interoperability; Opportunities]

29 Introducing the Dublin Core An attempt to improve resource discovery on the Web –now adopted more broadly Building an interdisciplinary consensus about a core element set for resource discovery –simple and intuitive –cross–disciplinary — not just libraries!! –international –open and consensual –flexible. See purl.org/dc/

30 Introducing the Dublin Core 15 elements of descriptive metadata All elements optional All elements repeatable The whole is extensible –offers a starting point for semantically richer descriptions.

31 Introducing the Dublin Core Title Creator Subject Description Publisher Contributor Date Type Format Identifier Source Language Relation Coverage Rights purl.org/dc/
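
The "all optional, all repeatable" rule from slide 30 is easy to see in code. The sketch below assumes a record is a plain dictionary mapping element names to lists of values, and renders it using the common `DC.`-prefixed HTML `<meta>` convention. `validate` and `to_meta_tags` are illustrative names, not part of any Dublin Core toolkit.

```python
from html import escape

# The fifteen elements listed on slide 31, in slide order.
DC_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def validate(record):
    """Every element is optional and repeatable, so only names and shapes are checked."""
    unknown = sorted(set(record) - set(DC_ELEMENTS))
    if unknown:
        raise ValueError(f"not Dublin Core elements: {unknown}")
    for name, values in record.items():
        if not isinstance(values, list):
            raise TypeError(f"{name} must be a list, because elements are repeatable")


def to_meta_tags(record):
    """Render the record as HTML meta tags, one tag per value."""
    validate(record)
    tags = []
    for name in DC_ELEMENTS:
        for value in record.get(name, []):
            content = escape(value, quote=True)
            tags.append(f'<meta name="DC.{name}" content="{content}">')
    return tags


record = {
    "title": ["Introducing some Standards"],
    "creator": ["Paul Miller"],
    "subject": ["Standards", "Metadata"],  # a repeated element
}
for tag in to_meta_tags(record):
    print(tag)
```

An empty dictionary is a valid record here, which is the literal consequence of every element being optional.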

32 Z39.50

33 What is Z39.50? ANSI/NISO Z39.50–1995, Information Retrieval (Z39.50): Application Service Definition and Protocol Specification ISO 23950:1998, Information and Documentation — Information Retrieval (Z39.50) — Application Service Definition and Protocol Specification. See lcweb.loc.gov/z3950/agency/1995doce.html

34 What is Z39.50? “This standard specifies a client/server based protocol for Information Retrieval. It specifies procedures and structures for a client to search a database provided by a server, retrieve database records identified by a search, scan a term list, and sort a result set. Access control, resource control, extended services, and a ‘help’ facility are also supported. The protocol addresses communication between corresponding information retrieval applications, the client and server (which may reside on different computers); it does not address interaction between the client and the end-user.” (Z39.50–1995, page 0). See lcweb.loc.gov/z3950/agency/1995doce.html

35 Some gory details… Z39.50 follows client/server model But calls them Origin and Target Client/origin Server/target

36 Client/Server architecture

37 Client/Server architecture

38 Some gory details… Z39.50–1995 is divided into eleven ‘Facilities’: Initialization, Search, Retrieval, Result–set–delete, Browse, Sort, Access Control, Accounting, Explain, Extended Services, Termination. See

39 Facilities and Services Each Facility comprises at least one Service A Service facilitates a particular interaction between Origin and Target The three key services are Init, Search, and Present. See

40 Init The only Service of the Initialization Facility Origin–initiated Used to start a ‘Z–association’ Origin requests a number of parameters under which the searches will be conducted Target responds, either accepting offered parameters or proposing others if necessary.

41 Search The only Service of the Search Facility Origin–initiated Used to actually conduct a search Origin specifies databases to be searched, attribute combinations, and query Target responds, identifying the number of matching results.

42 Present Main Service of the Retrieval Facility (along with Segment) Origin–initiated although Target can initiate a Segment request if the result set is very large Used to return records to the user.

43 Init for dummies Hello. Do you speak English? Hello. Yes, I do. Let’s talk.

44 Search for dummies Cool. Can I have anything you’ve got on a place called “London”? I’ve got 25 records matching your request, and here’s the first five. As you didn’t specify anything else, I’ve sent them to you in MARC, so I hope that’s OK.

45 Present for dummies 25, eh? Can I have the first ten, please? Oh, and I really don’t like MARC. If you can send Dublin Core that would be great, and if not I’ll settle for some SUTRS. DC:Creator – blah DC:Title – blah …
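
The three-step conversation on slides 43 to 45 can be mimicked with a toy, in-memory Target. This is emphatically not the Z39.50 wire protocol. `ToyTarget`, its methods and its sample records are all invented for illustration, and this toy is assumed to support only MARC and SUTRS so that the Origin's fallback from Dublin Core is visible.

```python
class ToyTarget:
    """In-memory stand-in for a Z39.50 Target (server); illustration only."""

    SUPPORTED_SYNTAXES = ("MARC", "SUTRS")  # assumed: no Dublin Core here

    def __init__(self, records):
        self.records = records
        self.result_set = []
        self.connected = False

    def init(self):
        """Init: open the association before anything else happens."""
        self.connected = True
        return True

    def search(self, term):
        """Search: build a result set and report only the number of hits."""
        if not self.connected:
            raise RuntimeError("Init must come before Search")
        self.result_set = [r for r in self.records if term.lower() in r.lower()]
        return len(self.result_set)

    def present(self, start, count, preferred_syntaxes):
        """Present: return up to `count` records from 1-based position `start`."""
        syntax = next(
            (s for s in preferred_syntaxes if s in self.SUPPORTED_SYNTAXES), None
        )
        if syntax is None:
            raise ValueError("no mutually supported record syntax")
        return syntax, self.result_set[start - 1:start - 1 + count]


target = ToyTarget(["London Bridge", "Tower of London", "Bath Abbey"])
target.init()
hits = target.search("London")
# Ask for Dublin Core first and settle for SUTRS, as on slide 45.
syntax, records = target.present(1, 10, ["DC", "SUTRS"])
print(hits, syntax, records)
```

Unlike the Target on slide 44, this toy does not send any records along with the Search response; it only reports the hit count and waits for Present.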

46 Now it gets hairy… To communicate successfully, Origin and Target need to use the same Attribute Set. An Attribute Set like Bib–1 defines six forms of Attribute — –Use –Relation –Truncation –Completeness –Position –Structure.

47 Use Attributes Define the ‘access points’ on which a search takes place Title, author, subject, etc. See lcweb.loc.gov/z3950/agency/defns/bib1.html

48 Relation Attributes Defines the relationship between the search term and values stored in the database/index Less than, greater than, equal to, phonetically matched, etc.

49 Truncation Attributes Defines which part of the stored value is to be searched on Beginning of any word, end of any word, etc. ‘Smith’ finds ‘Smithsonian’ and not ‘Wordsmith’, and vice versa.
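
The 'Smith' example can be written out as two small predicates. This is a simplified reading of truncation as prefix or suffix matching on a single word; the function names are made up for this sketch and real Targets may index and match differently.

```python
def matches_right_truncated(term, word):
    """Right truncation: the term must match the beginning of the word."""
    return word.lower().startswith(term.lower())


def matches_left_truncated(term, word):
    """Left truncation: the term must match the end of the word."""
    return word.lower().endswith(term.lower())


for word in ("Smithsonian", "Wordsmith"):
    print(word,
          matches_right_truncated("Smith", word),
          matches_left_truncated("Smith", word))
```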

50 Completeness Attributes Defines how much of the stored index term must be in the search term ‘Smith’ finds ‘Smith’, but not ‘Smithsonian’ or ‘the Smith’, etc.

51 Position Attributes Defines where in the index the search term should be located At the start of the field, anywhere, etc.

52 Structure Attributes Specifies the form to be searched for Word, phrase, date, etc.
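
To see how the six attribute types combine in a single query, the sketch below writes one search term out in Prefix Query Format (PQF), a text notation used by the YAZ toolkit rather than anything defined in these slides. The type numbers 1 to 6 follow the Bib-1 attribute set. The particular values chosen (title, equal, any position in field, word, do not truncate, incomplete subfield) are just one plausible combination, and `AttrType` and `to_pqf` are names invented for this example.

```python
from enum import IntEnum


class AttrType(IntEnum):
    """Bib-1 attribute types and their numeric type codes."""
    USE = 1
    RELATION = 2
    POSITION = 3
    STRUCTURE = 4
    TRUNCATION = 5
    COMPLETENESS = 6


def to_pqf(term, attrs):
    """Render a term and its attributes as a PQF string.

    Assumes the term contains no double quotes.
    """
    parts = [f"@attr {int(t)}={v}" for t, v in sorted(attrs.items())]
    parts.append(f'"{term}"')
    return " ".join(parts)


query = to_pqf("london", {
    AttrType.USE: 4,            # Title
    AttrType.RELATION: 3,       # Equal
    AttrType.POSITION: 3,       # Any position in field
    AttrType.STRUCTURE: 2,      # Word
    AttrType.TRUNCATION: 100,   # Do not truncate
    AttrType.COMPLETENESS: 1,   # Incomplete subfield
})
print(query)
```

Spelling out all six attributes, rather than relying on whatever a Target assumes for the missing ones, is the same idea as the "minimisation of defaults" that the Bath Profile slides come to later.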

53 Record Syntaxes Record Syntaxes define the structure in which results are returned to the Origin. This does not mean that Targets need to store data in these formats. MARC: UKMARC, USMARC/MARC21, DANMARC, MARB, UNIMARC… SUTRS: Simple Unstructured Text Record Syntax. GRS–1: Generic Record Syntax. XML.

54 Profiles Groupings of Attribute Sets, Record Syntaxes, etc. to meet specific needs Disciplinary –Cultural Heritage (CIMI) –Geospatial (GEO) Geographic/Cultural/National –Texas Profile –OPAC Network for Europe (ONE) –Conference of European National Librarians (CENL) Functional –Collections Profile Etc.

55 What’s wrong with Z39.50? Profiles for each discipline Defeats interoperability? Vendor interpretation of the standard Bib–1 bloat Largely invisible to the user Seen as complicated, expensive and old–fashioned Surely no match for XML/RDF/ whatever.

56 Some Joined up working: The Bath Profile Vendors and systems implement areas of the Z39.50 standard differently Regional, National, and disciplinary Profiles have appeared over previous years, many of which have basic functions in common Users wish to search across national/regional boundaries, and between vendors. See

57 Learning from the past The Bath Profile is heavily influenced by ATS–1 CENL DanZIG MODELS ONE Z Texas vCUC See

58 Learning from the past See

59 Doing the work ZIP–PIZ–L mailing list, hosted by National Library of Canada Meeting face–to–face JISC supported a face–to–face meeting in Bath (UK) over the summer of 1999 A draft was widely circulated for comment ISO accreditation process Resulting in Internationally Registered Profile status Ongoing Maintenance Agency activity. See

60 Doing the work Makx Dekkers (PricewaterhouseCoopers/EC), Janifer Gatenby (GEAC), Juha Hakala (National Library of Finland), Poul Henrik Jørgensen (Danish Library Centre), Carrol Lunau (National Library of Canada), Paul Miller (UKOLN), Slavko Manojlovich (SIRSI/Memorial University of Newfoundland), Bill Moen (University of North Texas), Judith Pearce (National Library of Australia), Joe Zeeman (CGI). See

61 What we proposed Minimisation of ‘defaults’ Where possible, every attribute is defined in the Profile (Use, Relation, Position, Structure, Truncation, Completeness) Three Functional Areas Basic Bibliographic Search & Retrieval Bibliographic Holdings Search & Retrieval Cross–Domain Search & Retrieval Three Levels of Conformance in each Area. See

62 What we proposed SUTRS or XML and UNIMARC or MARC21 for Bibliographic Search results SUTRS and Dublin Core (in XML) for Cross–Domain results Other record syntaxes also permitted, but conformant tools must support at least these. See

63 Making it work… Adopted already by Texas, Atlantic Canada, CIC (Big 10), CENL, etc. Interoperability suite MARC21 in Texas UNIMARC and cross–domain in Europe? Direct approaches to international vendors User testing in Europe and North America Addition of Functional Areas and Levels of Conformance as required Community Information? See

64 Standards… Technical standards make the job easier in the long run –for users, curators, and managers –but can make it harder to get started. There is rarely a ‘right’ standard for all situations, so –identify a need to do something, without being specific about how –know who your audience is, what you have to offer, and what your purpose/message is.