1
Introduction to Persistent Identifiers
Summary Definition of persistent identifiers Which objects – physical / digital Proliferation of identifiers What do identifiers allow us to do The challenge of real persistence Social contracts Present, real examples Course on persistent identifiers, Madrid (Spain) Vocabularies and RDF Kevin Richards Software Developer Landcare Research February 8th, 2012
2
Introduction This presentation explains how to express your data using standard or common vocabularies. This session is presented by Kevin Richards, Software Developer at Landcare Research in New Zealand.
3
Summary Vocabularies and their expression in RDF Recommended vocabularies Introduction to RDF (syntax and semantics) Tools for working with RDF
4
Recommended Vocabularies
Areas of interest Taxa, Taxon Concepts Specimens, Observations, Surveys Collections Descriptions (of taxa) Publications Geospatial data Media People/Agents Basic metadata
5
Recommended Vocabularies
Taxa, Taxon Concepts Taxon Name, Taxon Concept, Concept Relationships TDWG TCS Schema Taxonomic Databases Working Group, Taxon Concept Schema, DwC Vocabulary Darwin Core vocabulary,
6
Recommended Vocabularies
Specimens, Observations, Collection Items TDWG ABCD Schema Taxonomic Databases Working Group Access to Biological Collections Data schema, DwC Vocabulary Darwin Core vocabulary,
7
Recommended Vocabularies
Natural Collections Describe a collection, e.g. a herbarium collection TDWG NCD schema Taxonomic Databases Working Group Natural Collections Description schema,
8
Recommended Vocabularies
Descriptions Circumscriptions and protologues for specimens and taxa TDWG SDD schema Taxonomic Databases Working Group Structured Descriptive Data schema, TDWG SPM (Species Profile Model)
9
Recommended Vocabularies
Literature A bit of a short-coming at present Can be done to a small degree with Darwin Core ( and other schemas such as RIS Or linked data ontologies, e.g. Open Citations Note problem is much bigger than our domain
10
Recommended Vocabularies
Geospatial Data E.g. locations, coordinates OGC standards, Some in DwC
11
Recommended Vocabularies
Media TDWG Audubon Core
12
Recommended Vocabularies
Other related vocabularies Generic vocabularies: People – Friend of a Friend (FOAF), Metadata - Dublin Core, Linked Data GeoNames, DBPedia, TaxonConcept.org, Flickr, Gene Ontology, PubMed
14
Introduction to RDF Resource Description Framework
Resource = thing or piece of data to describe Based on triples (Subject : Predicate : Object) Subject X has property value Y E.g. Specimen CHR1234 has CollectionDate January Object can be literal or another resource Has a defined XML representation
15
Introduction to RDF Local identifier: EC 12921 Triples:
E.g. Local identifier: EC 12921 Triples: EC is in collection EC EC has accession number EC 12921 EC collected on 1 August 2001 EC collected by B. Smith … Possible persistent identifier (URI):
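The triples listed on this slide can be sketched in Python as plain (subject, predicate, object) tuples. This is only an illustration: the `http://example.org/specimen/EC12921` URI and the predicate names are hypothetical placeholders, not identifiers taken from the presentation.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # the resource being described (a URI)
    predicate: str  # the property being stated
    obj: str        # a literal value or the URI of another resource


# Hypothetical persistent identifier for the specimen (an assumption).
SPECIMEN = "http://example.org/specimen/EC12921"

# Predicate names are illustrative only, not from a real vocabulary.
triples = [
    Triple(SPECIMEN, "inCollection", "EC"),
    Triple(SPECIMEN, "accessionNumber", "EC 12921"),
    Triple(SPECIMEN, "collectedOn", "2001-08-01"),
    Triple(SPECIMEN, "collectedBy", "B. Smith"),
]


def describe(subject, graph):
    """Collect every predicate/object pair recorded for one subject."""
    return {t.predicate: t.obj for t in graph if t.subject == subject}


description = describe(SPECIMEN, triples)
print(description["collectedBy"])
```

Because every statement carries the full subject URI, each triple stays meaningful on its own, which is what lets a persistent identifier tie together data from different sources.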
16
RDF formats Statement - “ has a creator whose value is John Smith” N3 notation – : dc:creator “John Smith”. Xml – <rdf:Description rdf:about=" <dc:creator>John Smith</dc:creator> </rdf:Description>
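To make the point that one statement can be written in several formats, here is a minimal sketch that writes the same "creator" statement as an N-Triples line (a simple subset of N3) and as RDF/XML. The resource URI `http://example.org/doc/1` is a hypothetical stand-in, since the URI on the slide is not shown; the sketch also does not escape special characters in literals.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical resource URI (an assumption; the slide's own URI is not shown).
RESOURCE = "http://example.org/doc/1"


def to_ntriples(subject, predicate, literal):
    """Write one statement as an N-Triples line (full URIs, quoted literal)."""
    return f'<{subject}> <{predicate}> "{literal}" .'


def to_rdfxml(subject, literal):
    """Write the same statement as RDF/XML using dc:creator."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": subject}
    )
    creator = ET.SubElement(desc, f"{{{DC_NS}}}creator")
    creator.text = literal
    return ET.tostring(root, encoding="unicode")


nt_line = to_ntriples(RESOURCE, DC_NS + "creator", "John Smith")
xml_text = to_rdfxml(RESOURCE, "John Smith")
print(nt_line)
print(xml_text)
```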
17
rdf:resource Eg <?xml version="1.0"?>
<rdf:RDF xmlns:rdf=" xmlns:dc=" <rdf:Description rdf:about=" <dc:language>en</dc:language> <dc:creator rdf:resource=" /> </rdf:Description> </rdf:RDF>
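The difference between a literal object and an `rdf:resource` object can be sketched by reading a small RDF/XML document with Python's standard library. The document below mirrors the shape of the slide's example, but both `example.org` URIs are hypothetical placeholders, and the reader only handles flat `rdf:Description` nodes rather than the full RDF/XML syntax.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The example.org URIs below are hypothetical placeholders (assumptions).
DOC = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/doc/1">
    <dc:language>en</dc:language>
    <dc:creator rdf:resource="http://example.org/people/john-smith"/>
  </rdf:Description>
</rdf:RDF>"""


def read_triples(xml_text):
    """Return (subject, predicate, object, kind) for flat rdf:Description nodes."""
    root = ET.fromstring(xml_text)
    out = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree tags look like "{namespace}local"; join them into one URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            resource = prop.get(f"{{{RDF_NS}}}resource")
            if resource is not None:
                out.append((subject, predicate, resource, "resource"))
            else:
                out.append((subject, predicate, (prop.text or "").strip(), "literal"))
    return out


parsed = read_triples(DOC)
for row in parsed:
    print(row)
```

Pointing `dc:creator` at a resource instead of the text "John Smith" means further triples (name, affiliation, and so on) can later be attached to that person.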
18
Basic RDF rdf:Description – basic description xml node
<?xml version="1.0"?> <rdf:RDF xmlns:rdf=" xmlns:dc=" <rdf:Description rdf:about=" <dc:language>en</dc:language> </rdf:Description> </rdf:RDF> rdf:Description – basic description xml node Every xml node must be namespaced and the namespace must be resolvable DublinCore – handy set of basic descriptive elements, such as title, creator, language, etc
19
RDF Triples are atomic May not have all triples at once e.g.
<?xml version="1.0"?> <rdf:RDF xmlns:rdf=" xmlns:dc=" <rdf:Description rdf:about=" <dc:language>en</dc:language> </rdf:Description> </rdf:RDF> <?xml version="1.0"?> <rdf:RDF xmlns:rdf=" xmlns:dc=" <rdf:Description rdf:about=" <dc:creator>John Smith</dc:creator> </rdf:Description> </rdf:RDF>
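Because triples are atomic, statements about the same resource that arrive in separate documents can simply be pooled. A minimal sketch, assuming each document has already been reduced to a set of (subject, predicate, object) tuples and using a hypothetical subject URI:

```python
# Hypothetical subject URI (an assumption; the slide's own URI is not shown).
DOC_URI = "http://example.org/doc/1"
DC = "http://purl.org/dc/elements/1.1/"

# Each document contributes only part of the description.
first_document = {(DOC_URI, DC + "language", "en")}
second_document = {(DOC_URI, DC + "creator", "John Smith")}


def merge(*graphs):
    """Union any number of triple sets; repeated triples collapse automatically."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


# Receiving the first document twice does not create duplicate statements.
combined = merge(first_document, second_document, first_document)
print(len(combined))
```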
20
Introduction to RDF Possible RDF XML representation (DwC):
<?xml version="1.0" encoding="UTF-8"?> <rdf:RDF …> <rdf:Description rdf:about=" <rdf:type rdf:resource=" <dwc:occurrenceID> <dwc:catalogNumber>EC 12921</dwc:catalogNumber> <dwc:recordedBy>B. Smith</dwc:recordedBy> <dwc:eventDate> </dwc:eventDate> <dwc:locationID rdf:resource=" <dwc:locality></dwc:locality> <dwc:country>New Zealand</dwc:country> <dwc:countryCode>NZL</dwc:countryCode> </rdf:Description> <rdf:Description rdf:about=" <rdf:type rdf:resource=" <dwc:identificationID> <dwc:dateIdentified> </dwc:dateIdentified> <dwc:identifiedBy>B. Smith</dwc:identifiedBy> <dwc:taxonID>urn:lsid:indexfungorum.org:names:494891</dwc:taxonID> <dwc:scientificName>Amanita alba Lam.</dwc:scientificName> </rdf:Description> </rdf:RDF> Notes: locationID points to a separate location resource (its RDF is shown elsewhere), while locality is the cached text of that location, kept for convenience; the same applies to taxonID and scientificName. For countryCode it is important to use standards.
21
Introduction to RDF Online resources very useful TDWG
Google sites for Darwin Core etc have a look at
22
Introduction to RDF RDF Schema Like XML schema
Not “nested” like XML, i.e. each nested item must be identifiable itself More object oriented Various ontology methodologies RDFS OWL OWL DL Reasoning, inference
23
RDF Types Eg example.com employee Jane Smith is of type Person
Define data types Improves control of input data ranges Good to show if you have typed your objects well Eg birthDate of a person <person:birthDate rdf:datatype= " Eg example.com employee Jane Smith is of type Person (n3 notation) _:jane exterms:mailbox . _:jane rdf:type exterms:Person . _:jane exterms:name "Jane Smith" . _:jane exterms:empID "23748" .
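How a datatype "improves control of input data ranges" can be sketched by checking a literal against its declared XML Schema datatype. This is a simplified illustration, not a full XSD implementation: only two datatypes are handled, time zone suffixes on dates are not supported, and the birth date value is made up.

```python
from datetime import date

XSD = "http://www.w3.org/2001/XMLSchema#"

# Map a few XSD datatype URIs to Python parsers (only these two are handled).
PARSERS = {
    XSD + "date": date.fromisoformat,
    XSD + "integer": int,
}


def parse_typed_literal(lexical, datatype):
    """Convert a literal's text to a Python value; raise ValueError if it is invalid."""
    parser = PARSERS.get(datatype)
    if parser is None:
        raise ValueError(f"unsupported datatype: {datatype}")
    return parser(lexical)


def is_valid(lexical, datatype):
    """Report whether the text is acceptable for the declared datatype."""
    try:
        parse_typed_literal(lexical, datatype)
        return True
    except ValueError:
        return False


# Made-up birth date, used only to exercise the check.
birth_date = parse_typed_literal("1980-05-17", XSD + "date")
print(birth_date.year)
print(is_valid("17 May 1980", XSD + "date"))
```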
24
Inference Person A Rdf triple – A knows B Person B
Inferred triple - A is two steps separated from C Rdf triple – B knows C Person C
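The inference on this slide can be sketched as a single rule applied to a set of triples: if X knows Y and Y knows Z, then X is two steps separated from Z. The predicate name `twoStepsFrom` is a hypothetical label for the inferred relation; a real system would express the rule in an ontology language and leave the work to a reasoner.

```python
KNOWS = "knows"
TWO_STEPS = "twoStepsFrom"  # hypothetical name for the inferred relation

# The two asserted triples from the slide.
asserted = {
    ("A", KNOWS, "B"),
    ("B", KNOWS, "C"),
}


def infer_two_steps(graph):
    """If X knows Y and Y knows Z (with X != Z), infer X twoStepsFrom Z."""
    inferred = set()
    for (x, p1, y) in graph:
        if p1 != KNOWS:
            continue
        for (y2, p2, z) in graph:
            if p2 == KNOWS and y2 == y and z != x:
                inferred.add((x, TWO_STEPS, z))
    return inferred


inferred = infer_two_steps(asserted)
print(inferred)
```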
25
Introduction to RDF SPARQL SPARQL Protocol and RDF Query Language
Similar to SQL and other query languages Allows querying of RDF triple stores
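The core idea behind a SPARQL query, matching a triple pattern with variables against stored triples, can be sketched in a few lines. This is not a SPARQL engine: it handles one pattern at a time over an in-memory list, and the `ex:` prefix, the second specimen and its collector are invented for the illustration.

```python
# Roughly the shape of the SPARQL query being imitated (prefix is hypothetical):
#   SELECT ?who WHERE { ?specimen ex:collectedBy ?who . }

# Toy triple store; ex:EC12922 and "J. Jones" are made-up test data.
graph = [
    ("ex:EC12921", "ex:collectedBy", "B. Smith"),
    ("ex:EC12921", "ex:collectedOn", "2001-08-01"),
    ("ex:EC12922", "ex:collectedBy", "J. Jones"),
]


def match(pattern, triples):
    """Match one triple pattern; terms starting with '?' are variables."""
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice must bind to the same value both times.
                if binding.get(term, value) != value:
                    break
                binding[term] = value
            elif term != value:
                break
        else:
            results.append(binding)
    return results


rows = match(("?specimen", "ex:collectedBy", "?who"), graph)
collectors = [row["?who"] for row in rows]
print(collectors)
```

A real triple store adds joins across several patterns, filters, and indexing, but each pattern is still resolved in essentially this way.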
26
Introduction to RDF Vocabularies and inference
Important to reuse existing vocabularies and ontologies if possible Important to define linkages with other vocabularies if not re-using Important to define linkages with other data objects if possible Allows reuse, integration, inference queries
27
Tools Useful tools to work with RDF Protégé Sesame Tabulator
Protégé: tool for creating RDF vocabularies, ontologies and documents. Sesame: web-based system for creating RDF triple stores and allowing SPARQL queries. Tabulator: browser plug-in to display user-friendly representations of RDF. More on W3 site,
28
Summary/conclusions Recommended vocabularies Introduction to RDF Tools
Taxa, taxon concepts, observations, literature, natural collections, descriptions, geospatial TDWG a good resource Introduction to RDF uses triples, xml representation, uses (requires) URIs and persistent identifiers, schema language with constructs and inferencing, query language Tools