Metadata-based Discovery: Experience in Crystallography. Monica Duke, UKOLN, University of Bath, UK. UKOLN: a centre of expertise in digital information management.
Discovery in the curation life-cycle: “Digital Curation itself is the active management of data over the life-cycle of scholarly and scientific interest; it is the key to reproducibility and re-use. Metadata for resource discovery and retrieval are important, with mark-up on time/place referencing as well as subject description and linkage to discipline based ontologies providing key research foci.” (Chris Rusbridge et al.)
Digital Library Infrastructures: Historically, cross-search and discovery protocols have been an area of interest and research. Z39.50 was perceived to have barriers and limitations. OAI-PMH was developed using a harvesting model.
The OAI-PMH (diagram): data providers; service providers; harvesting based on OAI-PMH.
The OAI-PMH: OAI Protocol for Metadata Harvesting. A simple protocol for sharing metadata records between applications; currently at version 2.0; based on HTTP, XML, XML Schema and XML namespaces. It allows a harvester to ask a remote repository for some or all of its metadata records, where ‘some’ is based on date-stamps, sets and metadata formats.
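As an illustration of asking a repository for 'some' of its records, the sketch below builds a selective-harvesting request. The `ListRecords` verb and the `metadataPrefix`, `from`, `until` and `set` arguments come from OAI-PMH 2.0; the base URL, the set name and the helper function name are assumptions made up for this example.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc",
                     from_date=None, until_date=None, set_spec=None):
    """Build an OAI-PMH 2.0 ListRecords URL (hypothetical helper).

    Only the selection arguments that are supplied are added, so calling
    it with just a base URL asks for all records in the given format.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date      # lower date-stamp bound
    if until_date:
        params["until"] = until_date    # upper date-stamp bound
    if set_spec:
        params["set"] = set_spec        # repository-defined set
    return base_url + "?" + urlencode(params)


# The base URL and set name below are placeholders, not a real repository.
url = list_records_url("http://repository.example.org/oai",
                       from_date="2006-01-01",
                       set_spec="crystallography")
print(url)
```

A real harvester would then fetch this URL over HTTP, parse the XML response, and follow any resumption token the repository returns for long result lists.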
Metadata in the eBank UK project. Simple Dublin Core: intended for resource discovery; compatible with OAI-PMH. Qualified Dublin Core: specifies ‘vocabularies’; refinements aid interpretation of an element value (e.g. seafood). The “dumbing-down” principle applies.
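The dumbing-down principle can be sketched as follows: a consumer that does not understand a qualifier ignores it and still gets a usable simple Dublin Core statement. The refinement table is a small illustrative subset of DCMI terms; the `hypothetical:ChemicalFormula` scheme name, the sample values and the function name are assumptions for this example, not the eBank UK schema.

```python
# Refinement -> the simple Dublin Core element it refines
# (a small illustrative subset, not a complete table).
REFINES = {
    "dcterms:created": "dc:date",
    "dcterms:modified": "dc:date",
    "dcterms:abstract": "dc:description",
    "dcterms:isPartOf": "dc:relation",
}


def dumb_down(statements):
    """Reduce qualified statements to simple Dublin Core.

    Each statement is a (term, scheme, value) tuple. A refinement falls
    back to the element it refines, the encoding scheme is dropped, and
    the value is kept unchanged.
    """
    simple = []
    for term, _scheme, value in statements:
        simple.append((REFINES.get(term, term), value))
    return simple


# Placeholder values; the scheme on dc:subject is hypothetical.
qualified = [
    ("dcterms:created", "dcterms:W3CDTF", "2005-06-14"),
    ("dc:subject", "hypothetical:ChemicalFormula", "C6 H6"),
    ("dc:creator", None, "A. Researcher"),
]
for element, value in dumb_down(qualified):
    print(element, "=", value)
```

The value survives but loses the context that made it precise, which is why a generic aggregator can still index such records for discovery.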
Metadata terms: Creator; Rights; Date; Type; Identifier; Subject (InChI, ChemicalFormula, Organic). Specified using XML Schema and documented using an Application Profile.
Information sources for Crystallography. Cross-discipline sources: OAIster, DAREnet. Discipline-specific sources: ChemRefer and Chemistry Central (texts/publications, chemistry general); Crystallography Open Database and Reciprocal Net (data, crystallography).
The discovery landscape: Some sources are within the OAI-PMH infrastructure (metadata-based). Variety of (human) search interfaces (simple to advanced). Well-established sources: Cambridge Structural Database, Protein Data Bank.
OAIster: An OAI-PMH aggregator. Wide-ranging and inclusive: any repository, all content types. Metadata from 675 institutions. Limit by resource type, including datasets (5 results). Pointers to collections of data records for ‘crystallography’. Results spread across several sources.
DAREnet: Worldwide access to Dutch academic research results. Simple search: “crystallography” (40 results). General advanced search (author, year).
ChemRefer: Access to full-text chemical and pharmaceutical literature. Index. Simple search interface.
ChemRefer display of results
Chemistry Central: No search feature (through BioMed Central).
Crystallography Open Database (COD): Promotes open data. Allows submission. ‘REF’ format also used. 40K entries.
Reciprocal Net: A distributed crystallography network for researchers, students and the general public. Search engine. Crystallography-specific search interface.
Reciprocal Net Search Interface
Dataset result in Reciprocal Net
Joining up the landscape: Technical infrastructure differences can be overcome: agreement on common APIs and metadata sets; hide API differences from the user. Survey in one application area: how similar are other disciplines?
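One way to picture hiding API differences from the user is a thin adapter per source that maps native records onto an agreed common shape. The sketch below is only that: the `Result` fields, the source names and the stub records are all hypothetical placeholders, and no real service API is called.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Result:
    """A common record shape agreed across sources (hypothetical)."""
    source: str
    title: str
    identifier: str


def make_stub_adapter(source: str, records: List[dict],
                      title_key: str, id_key: str) -> Callable[[str], List[Result]]:
    """Wrap a source whose records use their own field names.

    The returned function does a case-insensitive title match over the
    in-memory records and converts each hit into the common Result shape.
    """
    def search(query: str) -> List[Result]:
        q = query.lower()
        return [Result(source, rec[title_key], rec[id_key])
                for rec in records
                if q in rec[title_key].lower()]
    return search


def cross_search(query: str,
                 adapters: Dict[str, Callable[[str], List[Result]]]) -> List[Result]:
    """Send one query to every adapter and merge the results."""
    results: List[Result] = []
    for search in adapters.values():
        results.extend(search(query))
    return results


# Two stand-in sources with different native field names (placeholder data).
adapters = {
    "source_a": make_stub_adapter(
        "source_a",
        [{"name": "Placeholder crystallography dataset", "uri": "id:a1"}],
        "name", "uri"),
    "source_b": make_stub_adapter(
        "source_b",
        [{"title": "Placeholder chemistry article", "doi": "id:b1"}],
        "title", "doi"),
}
hits = cross_search("crystallography", adapters)
for hit in hits:
    print(hit.source, hit.title, hit.identifier)
```

The caller sees a single query function and a single result type; supporting another source means writing one more adapter rather than changing the search interface.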
Issues with cross-search. Audiences: Who are the user groups? What are their information needs? Selection: identifying subsets of interest. Human interface design: search options; presentation of heterogeneous information.