1
EPrints Workshop, January 2005
eBank UK: Dissemination of research data using EPrints
Simon Coles, School of Chemistry, University of Southampton
2
Overview
Scholarly communications in Chemistry
Data, information, workflows and provenance
The data publication bottleneck
e-Science and chemistry
eBank UK
Information architecture, data flow and interoperability
Challenges for the future
Expansion into other disciplines and data formats
3
Research & e-Science workflows
[Diagram: the scholarly knowledge cycle. Recoverable labels:]
– Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media
– Data analysis, transformation, mining, modelling
– Data curation: databases & databanks
– Repositories: institutional, e-prints, subject, data, learning objects
– Aggregator services: national, commercial
– Presentation services: subject, media-specific, data, commercial portals
– Peer-reviewed publications: journals, conference proceedings
– Flows between them: deposit / self-archiving; harvesting metadata; validation; publication; searching, harvesting, embedding; resource discovery, linking, embedding; linking
The scholarly knowledge cycle. Liz Lyon, eBank UK article. Ariadne, July 2003.
4
Learning & Teaching workflows; Research & e-Science workflows
[Diagram: the scholarly knowledge cycle extended with learning and teaching. Recoverable labels:]
– Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media
– Data analysis, transformation, mining, modelling
– Data curation: databases & databanks
– Repositories: institutional, e-prints, subject, data, learning objects
– Aggregator services: eBank UK
– Presentation services: subject, media-specific, data, commercial portals
– Institutional presentation services: portals, Learning Management Systems, u/g, p/g courses, modules
– Peer-reviewed publications: journals, conference proceedings
– Learning object creation, re-use
– Quality assurance bodies
– Flows between them: deposit / self-archiving; harvesting metadata; validation; publication; searching, harvesting, embedding; resource discovery, linking, embedding; linking
5
Current chemistry publishing protocols
Ideas and interpretations
Hooks into the literature
Results & derived data
Raw data!
7
Data Overload!
How do we disseminate?
EPSRC National Crystallography Service
The data deluge
8
CombeChem: eScience testbed
[Diagram. Recoverable labels: Properties; X-Ray e-Lab; Analysis; Properties e-Lab; Simulation; Video; Diffractometer; Grid Middleware; Structures Database]
9
Establishing common ground…
Understand the data creation process
Terminology and definitions
– Data
– Metadata
– Datafile
– Dataset
– Data holding
Different views
– Digital library researchers, computer scientists, chemists
– Generic vs specific
– Modeller vs practitioner
Aim for a common ontology
Modelling the domain
Creating a metadata schema
10
Crystallography workflow
Initialisation: mount new sample on diffractometer & set up data collection
Collection: collect data
Processing: process and correct images
Solution: solve structures
Refinement: refine structure
CIF: produce CIF (Crystallographic Information File format)
Report: generate Crystal Structure Report
RAW DATA | DERIVED DATA | RESULTS DATA
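The seven stages of the workflow can be sketched as an ordered enumeration. This is only an illustration: the slide names three data categories (raw, derived, results) but does not say where the boundaries fall, so the stage-to-category mapping below is an assumption, as are all Python names.

```python
from enum import Enum


class Stage(Enum):
    """Workflow stages in the order given on the slide."""
    INITIALISATION = "mount new sample on diffractometer & set up data collection"
    COLLECTION = "collect data"
    PROCESSING = "process and correct images"
    SOLUTION = "solve structures"
    REFINEMENT = "refine structure"
    CIF = "produce CIF"
    REPORT = "generate Crystal Structure Report"


# ASSUMPTION: the slide does not state which stage yields which category.
# This split (collection = raw; processing to refinement = derived;
# CIF and report = results) is a guess for illustration only.
DATA_CATEGORY = {
    Stage.COLLECTION: "raw",
    Stage.PROCESSING: "derived",
    Stage.SOLUTION: "derived",
    Stage.REFINEMENT: "derived",
    Stage.CIF: "results",
    Stage.REPORT: "results",
}


def next_stage(stage):
    """Return the stage that follows `stage`, or None after the last one."""
    order = list(Stage)  # Enum members iterate in definition order
    position = order.index(stage)
    return order[position + 1] if position + 1 < len(order) else None


if __name__ == "__main__":
    for stage in Stage:
        category = DATA_CATEGORY.get(stage, "none")
        print(f"{stage.name:<15} {category:<8} {stage.value}")
```

Recording the stage alongside each deposited file would be one way to keep provenance visible when the data reaches the archive.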
11
Deposition into the archive
12
An archive entry
ecrystals.chem.soton.ac.uk
13
Access to the underlying data
14
Some metadata issues
Using simple and qualified Dublin Core
Additional chemical information in schema for harvesting, e.g. empirical formula
Schema contains International Chemical Identifier (InChI)
Links to all datasets associated with an experiment
Links to individual datasets within an experiment
Links to EPrints (and other published literature) derived from the data
Using vocabularies specific to crystallography
Engaging the broader scientific community to ensure different schemas are compliant and standards can emerge
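As a rough sketch of what such a record might look like, the Python below builds a small XML record that mixes Dublin Core elements with chemistry-specific fields. The dc and dcterms namespace URIs are the standard DCMI ones. Everything else is an assumption for illustration: the `ebank` namespace URI, the element names `record`, `empiricalFormula` and `inchi`, the example.org URLs, and the use of dcterms:isReferencedBy to point at a derived eprint. The actual ebank_dc schema may differ.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"       # simple Dublin Core
DCTERMS = "http://purl.org/dc/terms/"         # qualified Dublin Core
EBANK = "http://example.org/ebank_dc/"        # HYPOTHETICAL namespace URI

for prefix, uri in (("dc", DC), ("dcterms", DCTERMS), ("ebank", EBANK)):
    ET.register_namespace(prefix, uri)


def q(namespace, tag):
    """Build a namespace-qualified tag in ElementTree's {uri}tag form."""
    return f"{{{namespace}}}{tag}"


def build_record(title, formula, inchi, dataset_urls, eprint_urls):
    """Assemble one illustrative metadata record (element names are assumed)."""
    record = ET.Element(q(EBANK, "record"))
    ET.SubElement(record, q(DC, "title")).text = title
    ET.SubElement(record, q(DC, "type")).text = "CrystalStructure"
    # Chemistry-specific additions (hypothetical element names).
    ET.SubElement(record, q(EBANK, "empiricalFormula")).text = formula
    ET.SubElement(record, q(EBANK, "inchi")).text = inchi
    # One link per dataset belonging to the experiment.
    for url in dataset_urls:
        ET.SubElement(record, q(DCTERMS, "references")).text = url
    # Links to literature derived from the data.
    for url in eprint_urls:
        ET.SubElement(record, q(DCTERMS, "isReferencedBy")).text = url
    return record


# Illustrative values only; the URLs are placeholders, not real archive entries.
record = build_record(
    title="Benzene (illustrative entry)",
    formula="C6H6",
    inchi="InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H",
    dataset_urls=["http://example.org/archive/1/raw", "http://example.org/archive/1/cif"],
    eprint_urls=["http://example.org/eprints/42"],
)

if __name__ == "__main__":
    print(ET.tostring(record, encoding="unicode"))
```

Keeping the chemical fields in their own namespace leaves the Dublin Core part readable by harvesters that know nothing about crystallography.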
15
Data flow in eBank
Model input: Andy Powell, UKOLN.
[Diagram. Recoverable labels:]
– Institutional repository (deposit): crystal structure (data holding); crystal structure report (HTML); datasets
– Records: ebank_dc record (XML) with dc:type=“CrystalStructure” and/or “Collection”; Eprint oai_dc record (XML) with dc:type=“Eprint” and/or “Text”
– Eprint “jump-off” page (HTML); Eprint manifestation (e.g. PDF)
– Harvesting: OAI-PMH ebank_dc; OAI-PMH oai_dc
– Services: eBank UK aggregator service; ePrint UK aggregator service; subject service
– Link labels: dc:identifier; dcterms:references; dcterms:isReferencedBy; linking
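The harvesting step uses OAI-PMH, where a harvester issues a ListRecords request and names the metadata format it wants. A minimal sketch of building such request URLs follows. The verb and the argument names (`metadataPrefix`, `set`, `resumptionToken`) come from the OAI-PMH 2.0 protocol; the base URL and the function name are placeholders, and no network call is made.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    In OAI-PMH, resumptionToken is an exclusive argument: when continuing
    a partial list, it is sent with the verb and nothing else.
    """
    if resumption_token is not None:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    base = "http://example.org/perl/oai2"  # PLACEHOLDER repository endpoint
    # Same repository, two metadata formats, as on the slide.
    print(list_records_url(base, "ebank_dc"))
    print(list_records_url(base, "oai_dc"))
    # Continuing a partial response with a token returned by the server.
    print(list_records_url(base, resumption_token="token-from-server"))
```

Because the format is chosen per request, one repository can serve a generic aggregator with oai_dc and a data-aware aggregator with ebank_dc from the same deposits.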
16
Harvesting: OAIster
17
Linking and aggregating
18
Embedded in a science portal
19
Current situation
Version 2.0 eBank metadata schema
Pilot institutional e-data repository for harvesting (raw, derived, results data) using EPrints.org software
Exports records as ebank_dc and oai_dc
Validation of schema & discussion with International Union of Crystallography for final developments and wider deployment
Pilot eBank UK aggregator service
Developing search interface Version 1.0
Testing with PSIgate physical sciences portal – embedding eBank UK
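On the aggregator side, a harvested oai_dc record has to be read and classified before it can be searched or linked. The sketch below parses one such record and checks its dc:type. The oai_dc and dc namespace URIs are the standard ones; the sample record, its values and the function names are made up for illustration and do not come from the pilot repository.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# MADE-UP sample record; identifiers are placeholders.
SAMPLE = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                       xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Example crystal structure</dc:title>
  <dc:type>CrystalStructure</dc:type>
  <dc:type>Collection</dc:type>
  <dc:identifier>http://example.org/archive/1</dc:identifier>
</oai_dc:dc>"""


def summarise(xml_text):
    """Extract the title, types and identifiers from one oai_dc record."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("dc:title", namespaces=NS),
        "types": [e.text for e in root.findall("dc:type", NS)],
        "identifiers": [e.text for e in root.findall("dc:identifier", NS)],
    }


def is_data_record(summary):
    """Treat records typed CrystalStructure as data rather than eprints."""
    return "CrystalStructure" in summary["types"]


if __name__ == "__main__":
    info = summarise(SAMPLE)
    kind = "data" if is_data_record(info) else "eprint or other"
    print(info["title"], "->", kind, info["identifiers"])
```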
20
What’s next?
Progress towards generic metadata schemas
Validation against other schemas (CCLRC Model)
EPrints.org software: allow for more generic scientific data and schemas?
Metadata enhancement: keywords based on knowledge of keywords in related publications?
Investigate identifiers: International Chemical Identifier
Explore context-sensitive linking
Full embedding into chemical and crystallographic research and publishing
e-Learning embedding and pedagogic evaluation
Feasibility study in related domains
21
Breakout Session?
Describing non ‘Dublin Core’ terms
Qualified Dublin Core
Complex object formats: METS vs MPEG-21 DIDL
Set & Friends containers
Compliance between schemas
One generic schema
Develop multiple schemas
Rights
Use / reuse
Publisher
Linking & aggregating
DOI
Keyword ontologies
Identifiers
Context-sensitive linking