An abstract model for DCMI metadata descriptions
Andy Powell, UKOLN, University of Bath, UK
DC Usage Board meeting at DC2003, Seattle, September/October 2003

DC Seattle, Sept/Oct
I am going to…
- assume people have read the current ‘Abstract Model’ working draft
- propose a revised (more generic) abstract model
- look at some of the issues that have been raised
- encourage discussion of the revised model and the issues
- consider what happens next with the abstract model document

Major issues
- why develop an abstract model?
- what is ‘qualified DC’? why limit to DCMI properties?
- what is a ‘record’?
- what is ‘simple DC’? why limit to DCMES?
- what is a ‘value’?
- where does DCSV fit in?
- relationship to ‘application profiles’?
- relationship to RDF?
- abstract model and dumb-down?

Why?
- non-syntax-based view of what constitutes a DC metadata description
- need to understand what kinds of descriptions we are trying to encode
  - best done without reference to any particular syntax
- allows us to compare and contrast the capabilities of different encodings
  - syntax X supports feature Y but syntax Z doesn’t
- supports better mappings between syntaxes

What is qualified DC?
- general feeling that restricting the abstract model for ‘qualified DC’ to DCMI properties is too limiting
- real-world applications typically go beyond this
- therefore, need to re-model at a more generic level
- DCMI Abstract Model
  - frankly my dear, I don’t give a DAM

DCMI abstract model
- a description is made up of one or more properties and their associated values
- each property is an attribute of the resource being described
- properties may be repeated
- a record is a set of descriptions about one or more related resources
- therefore… each description is about one, and only one, resource (the 1:1 principle)
- use of the word ‘record’ may be a problem?

DCMI abstract model (2)
- each value is a resource
- each value may be denoted by a value string
- each value string may have an associated encoding scheme
- each encoding scheme is identified by an encoding scheme URI
- each value string may have an associated language (e.g. en-GB)
- a value string is a ‘simple’, human-readable string

DCMI abstract model (3)
- each value may be identified by a value URI
- each value may have an associated rich value (some marked-up text, an image, a video, some audio, etc. or some combination thereof)
- each value may have some associated related metadata
- related metadata is a description of a related resource – e.g. metadata about the person who is the creator of a document…

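One way to read the three model slides above is as a small set of data structures. The following is only a sketch under stated assumptions: the class and field names (`Statement`, `Value`, `rich_value` and so on) and the example.org URI are hypothetical and are not taken from the working draft.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Value:
    """A value is always a resource; every way of pointing at it is optional."""
    value_uri: Optional[str] = None            # identifies the value
    value_string: Optional[str] = None         # simple, human-readable string
    language: Optional[str] = None             # language of the value string, e.g. "en-GB"
    encoding_scheme_uri: Optional[str] = None  # scheme the value string follows
    rich_value: Optional[bytes] = None         # marked-up text, image, audio, ...
    related_metadata: Optional["Description"] = None  # description of the value itself


@dataclass
class Statement:
    """One property and its associated value (hypothetical name)."""
    property_uri: str
    value: Value


@dataclass
class Description:
    """About one, and only one, resource (the 1:1 principle)."""
    resource_uri: str
    statements: List[Statement] = field(default_factory=list)


@dataclass
class Record:
    """A set of descriptions about one or more related resources."""
    descriptions: List[Description] = field(default_factory=list)


# Hypothetical example: a document whose creator is denoted by a value string.
doc = Description(
    "http://example.org/doc",
    [Statement("http://purl.org/dc/elements/1.1/creator",
               Value(value_string="Andy Powell", language="en-GB"))],
)
record = Record([doc])
```

Properties may be repeated simply by adding further `Statement` entries with the same `property_uri`.
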
What is a record?
- a record is a set of descriptions about one or more related resources, e.g.
  - a description of a resource and a description of its creator
  - a description of a resource, a rights statement about the resource and a description of the description
- note: a description is about a single resource and is made up of one or more properties and their associated values

What is a value?
- a value is the physical or conceptual entity that is associated with a property when it is used to describe a resource
  - a person (physical)
  - an organisation (physical)
  - a subject (conceptual)
  - a country (physical)
  - a type (conceptual)
  - etc.
- therefore, in the abstract model, a value is always a resource

A value is always a resource
- in the DCMI abstract model, a value is always a resource
- the value resource may
  - be identified by a value URI
  - be denoted by a value string and/or a rich value
  - have some associated related metadata
- …but the value is always a resource!
- I think this has an impact on the RDF encodings??

But some problems…
- some problems with the wording of existing DCMES definitions…
  - CCP (creator, contributor, publisher) element values defined to be a ‘…resource…’
  - relation, identifier and source defined to be a ‘…reference to a resource…’
  - rights defined to be either a ‘…resource…’ or a ‘link to a service that provides a resource…’
- problem: too much of the model is embedded into the definitions!

What is qualified DC?
- a ‘qualified DC record’ is … any record that
  - conforms to the DCMI abstract model
  - contains a description that uses at least one DCMI term
- however, this means that it is probably not possible to define a single XML schema for qualified DC records – but can provide a template XML schema

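The ‘at least one DCMI term’ condition above can be sketched as a simple check. This is a minimal sketch with two assumptions: a record is represented as a list of descriptions, each a list of (property URI, value) pairs, and ‘DCMI term’ is approximated by membership of the two DCMI property namespaces. Encoding schemes and conformance to the rest of the model are not checked.

```python
# DCMI property namespaces (dc: and dcterms:).
DCMI_NAMESPACES = (
    "http://purl.org/dc/elements/1.1/",
    "http://purl.org/dc/terms/",
)


def uses_dcmi_term(description):
    """True if any property in the description comes from a DCMI namespace."""
    return any(prop.startswith(DCMI_NAMESPACES) for prop, _value in description)


def is_qualified_dc(record):
    """True if at least one description in the record uses a DCMI term."""
    return any(uses_dcmi_term(description) for description in record)


# Hypothetical records: the 'my' namespace is invented for illustration.
mixed_record = [[
    ("http://example.org/my#shelfmark", "QA76"),
    ("http://purl.org/dc/terms/audience", "Postgraduate students"),
]]
local_only_record = [[("http://example.org/my#shelfmark", "QA76")]]
```

Under these assumptions `is_qualified_dc(mixed_record)` holds and `is_qualified_dc(local_only_record)` does not, which shows why no single closed XML schema can enumerate every permitted property.
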
What is simple DC?
- a ‘simple DC record’ is … any record that
  - conforms to the DCMI abstract model
  - comprises only a single description
  - uses only properties taken from DCMES
  - makes no use of value URIs, encoding schemes, rich values or related metadata

…or to put it differently
- a simple DC record is made up of a single description
- that description is made up of one or more properties and their associated values
- each property is an attribute of the resource being described
- each property must be one of the 15 DCMES elements
- properties may be repeated
- each value is denoted by a value string
- each value string may have an associated language (e.g. en-GB)

…or to put it differently
- simple DC is an ‘application profile’ that only uses terms taken from the DCMES

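The simple DC conditions listed above can be turned into a conformance check. The following is a sketch only: the dictionary keys (`property`, `value_string`, `language`, `value_uri`, and so on) are a hypothetical representation chosen for illustration, not part of any DCMI specification.

```python
DC = "http://purl.org/dc/elements/1.1/"

# The 15 DCMES elements.
DCMES = {DC + name for name in (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)}

# Features of the abstract model that simple DC does not use.
NOT_IN_SIMPLE_DC = ("value_uri", "encoding_scheme", "rich_value", "related_metadata")


def is_simple_dc(record):
    """Check a record (a list of descriptions) against the simple DC rules."""
    if len(record) != 1:            # exactly one description
        return False
    description = record[0]
    if not description:             # one or more properties
        return False
    for statement in description:
        if statement.get("property") not in DCMES:
            return False
        if statement.get("value_string") is None:
            return False            # each value is denoted by a value string
        if any(statement.get(key) is not None for key in NOT_IN_SIMPLE_DC):
            return False
    return True


simple = [[
    {"property": DC + "title", "value_string": "An abstract model", "language": "en-GB"},
    {"property": DC + "creator", "value_string": "Andy Powell"},
    {"property": DC + "creator", "value_string": "A. N. Other"},  # repeated property
]]
not_simple = [[
    {"property": DC + "subject", "value_string": "Metadata",
     "encoding_scheme": "http://purl.org/dc/terms/LCSH"},
]]
```
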
Simple DC and value URIs
- all values in simple DC are denoted using only a value string
- the value string can be a URI…
- …but there is nothing to formally indicate that the value string is a URI
- simple DC software applications may choose to guess which value strings are URIs and which aren’t

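The kind of guessing mentioned above might look like the following. This is a heuristic sketch, not a rule from the model: the list of schemes treated as ‘known’ is an assumption, and any such guess can be wrong in both directions.

```python
from urllib.parse import urlparse

# Assumed allow-list of URI schemes; an application would choose its own.
KNOWN_SCHEMES = {"http", "https", "ftp", "urn", "mailto", "info"}


def looks_like_uri(value_string):
    """Guess whether a simple DC value string is intended as a URI."""
    candidate = value_string.strip()
    # URIs contain no whitespace; most free-text values do.
    if not candidate or any(ch.isspace() for ch in candidate):
        return False
    try:
        scheme = urlparse(candidate).scheme
    except ValueError:
        return False
    return scheme.lower() in KNOWN_SCHEMES
```

A value such as `http://www.ukoln.ac.uk/` passes the check, while `Powell, Andy` or `2003-09-28` do not. A string such as `Note:draft` is rejected only because `note` is not in the assumed allow-list, which is exactly why simple DC cannot formally distinguish the two cases.
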
Simple DC and audience
- why isn’t dcterms:audience included in ‘simple DC’?
- because a single namespace is simpler than multiple namespaces
  - dc:xxx and dcterms:xxx
- because a static definition is simpler than one that grows over time
  - audience + … + …
- because, arguably, audience is not part of the ‘core’
  - the ‘t-shirt’ problem

Abstract model and DCSV?
- DCSV provides a mechanism for encoding ‘markup’ in a value string
- thus DCSV runs slightly counter to the abstract model
- DCSV better handled as ‘related metadata’
  - e.g. Period provides related metadata about a conceptual ‘period in time’
- impact? XML enc. good – string enc. bad?
- suggest no new proposals based on DCSV for the time being

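To illustrate why DCSV reads as ‘markup in a value string’, the sketch below unpacks a DCSV-style string into name/value components, which could then be treated as related metadata about the period. It is deliberately minimal and assumes the simple `name=value;` form only; backslash escaping of special characters is not handled, and the sample string is illustrative.

```python
def parse_dcsv(value_string):
    """Split a DCSV-style string into (name, value) components.

    Unnamed components are returned with a name of None.
    Assumption: no escaped ';' or '=' characters appear in the input.
    """
    components = []
    for part in value_string.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing ';'
        name, separator, value = part.partition("=")
        if separator:
            components.append((name.strip(), value.strip()))
        else:
            components.append((None, part))
    return components


period = parse_dcsv("name=The Great Depression; start=1929; end=1939;")
```

Each component here behaves like a property/value pair in a description of the ‘period in time’ resource, which is the sense in which DCSV fits more naturally as related metadata than as a value string.
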
What is a DCAP?
- a Dublin Core Application Profile (as currently defined) declares the properties and encoding schemes used to construct a description as used within a particular application
- problems…
  - DCAPs don’t currently cover the whole abstract model
  - DCAPs define what a description is – but most ‘applications’ need defining at the record level

RDF vs. abstract model
- what is the relationship between RDF and the abstract model?
- RDF currently provides the richest encoding syntax
  - full encoding of all features of the model
- but expect to see the model fully implemented in XML as well
- (expect HTML syntax to always be a partial implementation)

Dumb-down
- intelligent vs. dumb, element vs. value
- element dumb-down (dumb)
  - ignore anything that isn’t [DCMES/an element]
- element dumb-down (intelligent)
  - resolve sub-properties until you get to [DCMES/an element]
- value dumb-down (dumb)
  - use value URI or value string as value string
- value dumb-down (intelligent)
  - use knowledge of related metadata, or value string, to create new value string
  - resolve sub-classes/broader terms

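The dumb and intelligent element dumb-down rules above, plus the dumb value rule, can be sketched as follows. Assumptions are flagged in the comments: the sub-property table is a tiny illustrative subset, the example.org property is hypothetical, and preferring the value string over the value URI is a choice made here rather than something the slide specifies.

```python
DC = "http://purl.org/dc/elements/1.1/"
DCTERMS = "http://purl.org/dc/terms/"

DCMES = {DC + name for name in (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)}

# Illustrative sub-property relations; the last entry is hypothetical.
SUB_PROPERTY_OF = {
    DCTERMS + "abstract": DC + "description",
    DCTERMS + "created": DC + "date",
    "http://example.org/terms/dateDigitised": DCTERMS + "created",
}


def element_dumb_down(prop, intelligent=False):
    """Return the DCMES element for a property, or None if it is dropped."""
    if not intelligent:
        # Dumb: ignore anything that isn't a DCMES element.
        return prop if prop in DCMES else None
    # Intelligent: resolve sub-properties until a DCMES element is reached.
    seen = set()
    while prop is not None and prop not in DCMES and prop not in seen:
        seen.add(prop)  # guards against cycles in the table
        prop = SUB_PROPERTY_OF.get(prop)
    return prop if prop in DCMES else None


def value_dumb_down(value_string=None, value_uri=None):
    """Dumb value dumb-down: fall back to the value URI as the value string.

    Assumption: the value string wins when both are present.
    """
    return value_string if value_string is not None else value_uri
```

Intelligent value dumb-down is not sketched, since it depends on application knowledge of the related metadata and of the vocabulary’s broader terms.
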
Sub-properties and classes
- RDFS and human-readable declarations of DCMI terms refer to sub-properties and sub-classes
- however, these don’t formally appear in the abstract model (except as part of dumb-down)
- where do these fit into the model?
- I think they belong in the ‘grammatical principles’ document

Example 1 – dc:creator
[Figure sequence: an RDF/XML description using dc:creator and the graph it represents, built up over seven slides. The markup and diagram are not recoverable from the extracted text. Surviving labels: dc:creator, rdfs:label, my:name, my:affiliation, valueURI, relatedMetadataURI, rdfs:seeAlso, relatedMetadata.]
- Example RDF description using dc:creator…
- …and the RDF model it represents.
- But… we don’t want to embed all this information into every instance metadata record, do we?
- Need to separate part of the information out and store it in a single place – in this case in a directory service…
- To do this we need to assign a URI (the ‘valueURI’) to the anonymous ‘value’ node…
- The document containing this information is itself an RDF resource (the ‘relatedMetadata’) and has a URI
- Use rdfs:seeAlso to form linkage between description and relatedMetadata…

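The steps in Example 1 can be sketched with plain (subject, predicate, object) tuples rather than a real RDF library. Everything concrete below is an assumption made for illustration: the example.org URIs, the `my:` namespace URI, the literal values, and the choice to attach rdfs:seeAlso to the value node.

```python
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
DC = "http://purl.org/dc/elements/1.1/"
MY = "http://example.org/my#"  # hypothetical namespace for my:name etc.


def split_out_related_metadata(triples, value_node, value_uri,
                               related_metadata_uri, keep=(RDFS + "label",)):
    """Name the anonymous value node and move its details elsewhere.

    Returns (description, related_metadata) as two lists of triples.
    Properties of the value node listed in `keep` stay in the description.
    """
    description, related = [], []
    for s, p, o in triples:
        # Assign the valueURI to the anonymous value node.
        named = (value_uri if s == value_node else s,
                 p,
                 value_uri if o == value_node else o)
        if s == value_node and p not in keep:
            related.append(named)      # stored once, e.g. in a directory service
        else:
            description.append(named)  # stays in every instance record
    # Link the description to the related metadata document.
    description.append((value_uri, RDFS + "seeAlso", related_metadata_uri))
    return description, related


doc = "http://example.org/doc"
triples = [
    (doc, DC + "creator", "_:value"),
    ("_:value", RDFS + "label", "Andy Powell"),
    ("_:value", MY + "name", "Andy Powell"),
    ("_:value", MY + "affiliation", "UKOLN, University of Bath"),
]
description, related = split_out_related_metadata(
    triples, "_:value",
    "http://example.org/people/andy",
    "http://example.org/people/andy.rdf",
)
```

The description keeps only the dc:creator link, the label and the rdfs:seeAlso pointer, while the name and affiliation live in the related metadata document.
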
Example 2 – dc:subject
[Figure sequence: an RDF/XML description using dc:subject with a dcterms:MESH encoding scheme and the graph it represents, built up over seven slides. The markup and diagram are not recoverable from the extracted text. Surviving labels: dc:subject, rdf:type, dcterms:MESH, rdfs:label (‘Formate Dehydrogenase’), rdfs:value (‘D08.586…’), valueURI, relatedMetadataURI, rdfs:seeAlso, relatedMetadata.]
- Example RDF description using dc:subject (taken from Qualified DC in RDF recommendation…)
- …and the RDF model it represents.
- But… we don’t want to embed all this information into every instance metadata record, do we?
- Need to separate part of the information out and store it in a single place – in this case with the terminology owner…
- To do this we need to assign a URI (the ‘valueURI’) to the anonymous ‘value’ node…
- The document containing this information is itself an RDF resource (the ‘relatedMetadata’) and has a URI
- Use rdfs:seeAlso to form linkage between description and relatedMetadata…

Abstract DC model
[Figure: the Example 2 graph annotated with abstract model terms. The diagram is not recoverable from the extracted text.]
- In terms of the abstract DC model we now have: resource, property, valueURI, valueString (and valueStringLang), encodingScheme, relatedMetadata