Dublin Core, OAI-PMH and the eBank UK schema Monica Duke UKOLN, University of Bath, UK UKOLN is supported by:


1 Dublin Core, OAI-PMH and the eBank UK schema. Monica Duke (m.duke@ukoln.ac.uk), UKOLN, University of Bath, UK, http://www.ukoln.ac.uk/. eBank UK workshop on Chemistry schemas, University of Bath, 18th February 2005.

2 Contents: Whirlwind guide to DC; DC abstract model; Encoding in XML; OAI-PMH and the eBank UK project; eBank UK XML schema definition; Crystallography suggestions. Note: you are going to see some angle-brackets.

3 Acknowledgement: Andy Powell (UKOLN), for donation of slides. Recommendation: Dublin Core Conference Tutorials (Oct 04), http://www.ukoln.ac.uk/metadata/presentations/ecdl-2004/dc-tutorial/

4 Bluffer's guide to DC: 1. DC is short for Dublin Core. 2. A simple metadata standard, supporting cross-domain resource discovery. 3. Original focus on Web resources, but that is no longer the case, e.g. usage to describe physical artefacts in museums. 4. Current usage across a wide range of sectors: academic, e-government, museums, libraries, business, semantic Web. http://dublincore.org/

5 Bluffer's guide to DC: 5. Simple DC provides 15 elements (metadata properties): dc:title, dc:creator, dc:subject, dc:description, dc:publisher, dc:contributor, dc:date, dc:type, dc:format, dc:identifier, dc:source, dc:language, dc:relation, dc:coverage, dc:rights. 6. Multiple encoding syntaxes, including HTML meta tags, XML and RDF/XML (XML schemas are available).

6 Bluffer's guide to DC: 7. Relatively slow programme of adding new terms to qualified DC: new elements (e.g. dcterms:audience), element refinements (e.g. dcterms:dateCopyrighted) and encoding schemes (e.g. dcterms:LCSH and dcterms:W3CDTF); 48 elements and 17 encoding schemes. http://dublincore.org/documents/dcmi-terms/

7 Bluffer's guide to DC: 8. DC can be embedded into HTML pages, but almost none of the big search engines will use it! Why? Lack of trust: meta-spam, meta-crap. However, embedding DC in HTML may be worthwhile if your own site search engine uses it. 9. However, simple DC forms the baseline metadata format for the OAI protocol.

8 Important DCMI documents: DCMI Abstract Model (DRAFT), http://www.ukoln.ac.uk/metadata/dcmi/abstract-model/; Expressing Dublin Core in HTML/XHTML meta and link elements, http://dublincore.org/documents/dcq-html/; Guidelines for implementing Dublin Core in XML, http://dublincore.org/documents/dc-xml-guidelines/; Expressing Simple Dublin Core in RDF/XML, http://dublincore.org/documents/dcmes-xml/; Expressing Qualified Dublin Core in RDF/XML, http://dublincore.org/documents/dcq-rdf-xml/; Namespace Policy for the DCMI, http://dublincore.org/documents/dcmi-namespace/; DCMI Metadata Terms, http://dublincore.org/documents/dcmi-terms/

9 Abstract models for DC

10 Why an abstract model? Before we start creating DCMI descriptions we need to understand: what kinds of things we want to say about resources; the DCMI view of the world/resources we want to describe (the DCMI resource model); the DCMI view of the descriptions we make about that world (the DCMI description model). Together these are known as the DCMI abstract model. A simplified view is presented here.

12 What is a resource? The W3C/IETF definition of a resource is: "…anything that has identity. Familiar examples include an electronic document, an image, a service (e.g., "today's weather report for Los Angeles"), and a collection of other resources. Not all resources are network "retrievable"; e.g., human beings, corporations, and bound books in a library can also be considered resources." I.e. a resource is anything: physical things (books, cars, people), digital things (Web pages, digital images), conceptual things (colours, points in time).

13 DC and resources: but this seems to be too wide for the things we can describe with DC! Can we really describe people using DC? Do people have titles and subjects? No. In general we only use DC to describe a sub-set of all resources: anything covered by the DCMIType list (Collection, Dataset, Event, Image (Still or Moving), Interactive Resource, Service, Software, Sound, Text, Physical Object).

14 DCMI resource model: each resource that we want to describe has zero or more properties; a property is a specific aspect, characteristic, attribute or relation used to describe a resource; each property has one or more values; each value is a resource (the physical or conceptual entity that is associated with a property when it is used to describe a resource).

15 DCMI description model: a description is made up of one or more statements (about one, and only one, resource) and zero or one resource URI (a URI reference that identifies the resource being described); each statement is made up of a property URI (that identifies a property), zero or one value URI (that identifies a value of the property), zero or one encoding scheme URI (that identifies the class of the value) and zero or more value representations of the value.

16 DCMI description model (2): each property is an attribute of the resource being described; each property URI may be repeated in multiple statements; the value representation may take the form of a value string, a rich value or a related description. A value string is a simple human-readable string and may have an associated language (e.g. en-GB).
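The description model above can be sketched as a small data structure. This is only an illustration, not part of the DCMI documents: the class and field names are invented here, and rich values and related descriptions are left out so that only value strings are modelled.

```python
from dataclasses import dataclass, field
from typing import List, Optional

DC = "http://purl.org/dc/elements/1.1/"  # namespace URI of the 15 DCMES elements


@dataclass
class ValueString:
    """A simple human-readable string with an optional language tag."""
    text: str
    language: Optional[str] = None  # e.g. "en-GB"


@dataclass
class Statement:
    """One property URI plus optional value URI, scheme URI and value strings."""
    property_uri: str
    value_uri: Optional[str] = None
    encoding_scheme_uri: Optional[str] = None
    value_strings: List[ValueString] = field(default_factory=list)


@dataclass
class Description:
    """One or more statements about one, and only one, resource."""
    statements: List[Statement]
    resource_uri: Optional[str] = None  # zero or one resource URI

    def __post_init__(self):
        if not self.statements:
            raise ValueError("a description needs at least one statement")


# The same property URI may be repeated in several statements.
description = Description(
    resource_uri="http://www.ukoln.ac.uk/",
    statements=[
        Statement(DC + "title", value_strings=[ValueString("First title", "en-GB")]),
        Statement(DC + "title", value_strings=[ValueString("Second title")]),
    ],
)
```

Keeping the resource URI on the description rather than on each statement mirrors the 1:1 principle: every statement in one description is about the same resource.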

17 The 1:1 principle: notice that the model indicates that each property used in a description must be an attribute of the resource being described; this is commonly referred to as the 1:1 principle, i.e. the principle that a DCMI metadata description describes one, and only one, resource. However…

18 Description sets: real-world metadata applications tend to be based on loosely grouped sets of descriptions (where the described resources are typically related in some way), known here as description sets; for example, a description set might comprise descriptions of both a painting and the artist.

19 DCMI records: description sets are instantiated, for the purposes of exchange between software applications, in the form of metadata records; each record conforms to one of the DCMI encoding guidelines (XHTML meta tags, XML, RDF/XML, etc.). (Diagram labels: "a document", "andy powell")

20 Model summary: record (encoded as XHTML, XML or RDF/XML); description set; description (about a resource (URI)); statement; property (URI); value (URI); representation: value string OR rich value OR related description; vocabulary encoding scheme (URI); syntax encoding scheme (URI); language (e.g. en-GB).

21 Simple and qualified: a simple DC record is a record that: conforms to the abstract model; comprises only a single description; uses only the 15 properties in the Dublin Core Metadata Element Set; makes no use of value URIs, encoding schemes, rich values or related descriptions.
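As a rough sketch, the simple DC conditions can be checked mechanically. The record layout used here (a list of descriptions, each a list of statement dictionaries with `property`, `value_string`, `value_uri` and `encoding_scheme` keys) is an assumption made for illustration, and conformance to the abstract model itself is not checked.

```python
# The 15 properties of the Dublin Core Metadata Element Set.
DCMES = {
    "dc:title", "dc:creator", "dc:subject", "dc:description", "dc:publisher",
    "dc:contributor", "dc:date", "dc:type", "dc:format", "dc:identifier",
    "dc:source", "dc:language", "dc:relation", "dc:coverage", "dc:rights",
}


def is_simple_dc(record):
    """Return True if the record meets the simple DC conditions listed above."""
    if len(record) != 1:  # only a single description
        return False
    for statement in record[0]:
        if statement["property"] not in DCMES:  # only the 15 DCMES properties
            return False
        if statement.get("value_uri") or statement.get("encoding_scheme"):
            return False  # no value URIs or encoding schemes
        if not isinstance(statement.get("value_string"), str):
            return False  # no rich values or related descriptions
    return True


simple_record = [[{"property": "dc:title", "value_string": "Crystal Structure"}]]
qualified_record = [[{"property": "dcterms:audience", "value_string": "chemists"}]]
```

Here `is_simple_dc(simple_record)` is true, while `qualified_record` fails because `dcterms:audience` is not one of the 15 elements.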

23 Qualified DC model: a qualified DC record is a record that: conforms to the DCMI abstract model; contains at least one property taken from the DCMI Metadata Terms recommendation.

24 A couple of notes: everything in DC is optional. Dumb-down: the process of translating a qualified DC metadata record into a simple DC metadata record; informed dumb-down; uninformed dumb-down; …
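A minimal sketch of dumb-down, assuming statements are held as (property, value string) pairs. The refinement table is a small hand-picked subset of DCMI Metadata Terms, and the function name and data layout are invented for illustration. Because it knows which element each refinement belongs to, it approximates the informed variant.

```python
# The 15 simple DC properties.
SIMPLE = {
    "dc:title", "dc:creator", "dc:subject", "dc:description", "dc:publisher",
    "dc:contributor", "dc:date", "dc:type", "dc:format", "dc:identifier",
    "dc:source", "dc:language", "dc:relation", "dc:coverage", "dc:rights",
}

# A few element refinements and the elements they refine (subset only).
REFINES = {
    "dcterms:dateCopyrighted": "dc:date",
    "dcterms:available": "dc:date",
    "dcterms:abstract": "dc:description",
    "dcterms:alternative": "dc:title",
}


def dumb_down(statements):
    """Translate qualified DC statements into simple DC statements."""
    simple = []
    for prop, value in statements:
        prop = REFINES.get(prop, prop)  # replace a refinement by its parent
        if prop in SIMPLE:              # drop anything outside the 15 elements
            simple.append((prop, value))
    return simple


qualified = [
    ("dc:title", "Crystal Structure"),
    ("dcterms:dateCopyrighted", "2004"),
    ("dcterms:audience", "chemists"),  # a new element with no simple DC parent
]
result = dumb_down(qualified)
```

The result keeps the title, turns the copyright date into a plain `dc:date`, and loses the audience statement, which is the information loss that dumb-down implies.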

25 Encoding DC in XML

26 DCMI recommendations: for the full details see Guidelines for implementing Dublin Core in XML, http://dublincore.org/documents/dc-xml-guidelines/, which gives nine recommendations for encoding DC in XML.

27 General recommendations: implementers should base their XML applications on XML Schemas rather than XML DTDs; pay attention to the use of upper and lower case in property names and encoding schemes, e.g. property names for the 15 DCMES elements should be lower-case.

28 Properties and values: implementers should encode properties as XML elements and values as the content of those elements; the name of the XML element should be an XML qualified name (QName) of the property (example value: "Dublin Core in XML"); do not use constructs like …

29 Repeating properties: multiple value strings should be encoded by repeating the XML element for that property (example values: "First title", "Second title").

30 Value string language: where the language of the value is indicated, it should be encoded using the xml:lang attribute (example values: "seafood", "fruits de mer").
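The angle-brackets for the last three slides did not survive in this transcript, so the following is a reconstruction sketch rather than the slides' own markup. It builds a record that follows the three recommendations (properties as elements, repetition for multiple values, xml:lang for language). The container name `metadata` and the use of `dc:subject` for the two language-tagged values are assumptions.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
XML_NS = "http://www.w3.org/XML/1998/namespace"
ET.register_namespace("dc", DC_NS)  # serialise the DC namespace with the dc: prefix


def add_property(parent, name, value, lang=None):
    """Encode a property as an element and its value string as element content."""
    element = ET.SubElement(parent, f"{{{DC_NS}}}{name}")
    element.text = value
    if lang:
        element.set(f"{{{XML_NS}}}lang", lang)  # serialised as xml:lang
    return element


record = ET.Element("metadata")  # hypothetical container element
add_property(record, "title", "First title")   # repeated property:
add_property(record, "title", "Second title")  # one element per value string
add_property(record, "subject", "seafood", lang="en")
add_property(record, "subject", "fruits de mer", lang="fr")

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Each value sits in the content of its own element, never in an attribute, which is the point of the properties-and-values recommendation.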

31 Container elements: note that it is anticipated that records will be encoded within one or more container XML element(s) of some kind; candidate container element names include …

32 Element refinements: element refinements should be treated in the same way as other properties; the name of the XML element should be an XML qualified name (QName) (example value: "2002-06"); do not use any of the following: … (three constructs, each with the value "2002-06").

33 Encoding schemes: encoding schemes should be implemented using the xsi:type attribute of the XML element for the property; the name of the encoding scheme should be given as the attribute value, and should be in the form of an XML qualified name (QName) (example value: "http://www.ukoln.ac.uk/").
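The example markup for element refinements and encoding schemes is also missing from the transcript. The sketch below shows one way the two recommendations could look, reusing the surviving values "2002-06" and "http://www.ukoln.ac.uk/". The choice of `dcterms:available`, `dc:identifier` and `dcterms:URI`, and the `metadata` container, are assumptions; they are real DCMI terms but may not be the ones the slides used.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# The refinement is an element in its own right, named by its QName;
# the encoding scheme is given as a QName in the xsi:type attribute.
XML_TEXT = """<metadata
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <dcterms:available>2002-06</dcterms:available>
  <dc:identifier xsi:type="dcterms:URI">http://www.ukoln.ac.uk/</dc:identifier>
</metadata>"""

root = ET.fromstring(XML_TEXT)
available = root.find("dcterms:available", NS)
identifier = root.find("dc:identifier", NS)
scheme = identifier.get(f"{{{XSI_NS}}}type")  # attribute value is a QName string

print(available.text, scheme, identifier.text)
```

Note that the `dcterms` prefix must be declared even if it only appears inside the `xsi:type` value, since that value is itself a qualified name.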

34 OAI-PMH

35 OAI-PMH: OAI Protocol for Metadata Harvesting; a simple protocol for sharing metadata records between applications; currently at version 2.0; based on HTTP, XML, XML Schema and XML namespaces; allows a harvester to ask a remote repository for some or all of its metadata records, where "some" is based on date-stamps, sets and metadata formats. http://www.openarchives.org/
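A harvester's request for "some" records can be sketched as a plain HTTP GET URL. The verb `ListRecords` and the arguments `metadataPrefix`, `from`, `until` and `set` come from OAI-PMH 2.0; the base URL, the function name and its parameter names are hypothetical.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc",
                     from_date=None, until=None, set_spec=None):
    """Build an OAI-PMH ListRecords request, optionally selective."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date   # selective harvesting by date-stamp
    if until:
        params["until"] = until
    if set_spec:
        params["set"] = set_spec     # selective harvesting by set
    return base_url + "?" + urlencode(params)


# Hypothetical repository base URL, used only for illustration.
url = list_records_url("http://repository.example.org/oai", from_date="2004-07-01")
print(url)
```

The `metadataPrefix` argument is how the harvester picks a metadata format; `oai_dc` is the prefix for the mandatory simple DC format.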

36 OAI-PMH (2): OAI-PMH carries only metadata; content (e.g. full-text or image) is made available separately, typically at a URL in the metadata; simple DC is the default (mandatory) record format; supports any record format provided it can be encoded using XML (e.g. DC, IMS, MARC, ODRL, …).

37 OAI-PMH model (diagram labels): OAI-PMH identifier = entry point to all records pertaining to the resource; resource; Dublin Core Metadata; item; records; MARC Metadata; Crystal Structure Report; Jump-off page (HTML). Model adapted from: http://www.dlib.org/dlib/december04/vandesompel/12vandesompel.html

38 Data flow in eBank UK (diagram labels): Submit; Store/link; Data files; Metadata; present HTML; Institutional repository; OAI-PMH Harvest (XML); Index and Search; present HTML; eBank aggregator; create.

39 OAI-PMH model (diagram labels): OAI-PMH identifier = entry point to all records pertaining to the resource; resource; Dublin Core Metadata; item; records; IMS Metadata; Crystal Structure Report; Jump-off page (HTML); Linking; Dublin Core Metadata; type; Date created; 1:1 principle.

40 OAI-PMH outline record (values only): oai:ecrystals.chem.soton.ac.uk:27; 2004-07-20; 7374617475733D707562; http://ecrystals.chem.soton.ac.uk/archive/00000027/
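The first three values on this slide look like the header of an OAI-PMH record. The sketch below assumes they are the `identifier`, `datestamp` and `setSpec` elements defined by OAI-PMH 2.0; the surrounding markup is a reconstruction, not the slide's own text. It also decodes the set name on the assumption that it is hex-encoded ASCII.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Reconstructed header; only the three values come from the slide.
RECORD = """<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header>
    <identifier>oai:ecrystals.chem.soton.ac.uk:27</identifier>
    <datestamp>2004-07-20</datestamp>
    <setSpec>7374617475733D707562</setSpec>
  </header>
</record>"""

header = ET.fromstring(RECORD).find("oai:header", OAI_NS)
identifier = header.findtext("oai:identifier", namespaces=OAI_NS)
datestamp = header.findtext("oai:datestamp", namespaces=OAI_NS)
set_spec = header.findtext("oai:setSpec", namespaces=OAI_NS)

# Assumption: the set name is hex-encoded ASCII text.
decoded_set = bytes.fromhex(set_spec).decode("ascii")

print(identifier, datestamp, decoded_set)
```

Under that assumption the set name decodes to `status=pub`, which suggests the repository groups records by publication status.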

41 OAI-PMH outline record (values only): http://ecrystals.chem.soton.ac.uk/archive/00000027/; http://ecrystals.chem.soton.ac.uk/archive/00000027/#cif

42 OAI-PMH outline record (values only): <!-- Need a wrapper here --> http://ecrystals.chem.soton.ac.uk/archive/00000027/; http://ecrystals.chem.soton.ac.uk/archive/00000027/#cif <!-- insert end wrapper here -->

43 Wrapper choices: invent our own, or re-use a packaging standard; choice from MPEG_DIDL, METS, IMS; METS in preliminary use (free; innate support for DC; dig-lib currency); increasing interest in packaging formats in the OAI-PMH community. For links see: http://www.dlib.org/dlib/december04/vandesompel/12vandesompel.html

44 Using packaging (values only): <!-- Need a wrapper here --> http://ecrystals.chem.soton.ac.uk/archive/00000027/

45 OAI-PMH model (diagram labels): OAI-PMH identifier = entry point to all records pertaining to the resource; resource; Dublin Core Metadata; item; records; METS Metadata; Crystal Structure Report; Jump-off page (HTML); Linking; Dublin Core Metadata (eBank_dc); DC; 1:1 principle; Crystal Structure; CIFDataset.

46 Anatomy of an eBank UK record

47 eBank_dc schema (example record, values only): http://ecrystals.chem.soton.ac.uk/archive/00000027/; Crystal Structure; Hursthouse, Michael B.; Coles, Simon J.; C14H22O6; (5,2 -Dimethyl-5 -oxo-octahydro-[2, 2 ]bifuranyl-5-yl)-hydroxy-acetic acid ethyl ester; HUZDEL; Organic; http://scripts.iucr.org/cgi-bin/getarticleid?issn=1600-5368&volume=59&fpage=o501&details=yes; 2004-05-23

48 eBank_dc schema (example record, values only): http://ecrystals.chem.soton.ac.uk/archive/00000027/; Crystal Structure; Hursthouse, Michael B.; Coles, Simon J.; ???????; C14H22O6; (5,2 -Dimethyl-5 -oxo-octahydro-[2, 2 ]bifuranyl-5-yl)-hydroxy-acetic acid ethyl ester; HUZDEL; Organic; http://scripts.iucr.org/cgi-bin/getarticleid?issn=1600-5368&volume=59&fpage=o501&details=yes; 2004-05-23

49 eBank_dc schema (example record, values only): http://ecrystals.chem.soton.ac.uk/archive/00000027/; Crystal Structure; Hursthouse, Michael B.; Coles, Simon J.; C14H22O6; (5,2 -Dimethyl-5 -oxo-octahydro-[2, 2 ]bifuranyl-5-yl)-hydroxy-acetic acid ethyl ester; HUZDEL; Organic; http://scripts.iucr.org/cgi-bin/getarticleid?issn=1600-5368&volume=59&fpage=o501&details=yes; 2004-05-23. Add general subject terms, e.g. Chemistry, Crystallography. Are there existing ontologies that can be re-used for general subject terms?

53 eBank_dc schema (cont., values only): http://ecrystals.chem.soton.ac.uk/archive/00000027/#cif; http://ecrystals.chem.soton.ac.uk/archive/00000027/#proc; http://ecrystals.chem.soton.ac.uk/archive/00000027/#rfne; http://ecrystals.chem.soton.ac.uk/archive/00000027/#soln; CIFDataset; http://ecrystals.chem.soton.ac.uk/archive/00000027/#cif. (Diagram labels: METS Metadata; Dublin Core Metadata (ebank_dc))

54 Suggestions: it is useful to design two sets of metadata: a core set expressible within the OAI-PMH Dublin Core manifestation, and an extended set specific to value-adding agents. Use the OAI-PMH friends facility. Define OAI-PMH sets for crystallography data.

55 Questions?

