Enhancing access to research data: the challenge of crystallography


Presentation transcript:

Enhancing access to research data: the challenge of crystallography
Rachel Heery, Monica Duke, Michael Day, UKOLN, University of Bath
Leslie Carr, Simon Coles, University of Southampton
JCDL 2005, June 7-11, Denver
UKOLN: a centre of expertise in digital information management

Enhancing access to research data: overview
Crystallography as an exemplar
Impact of digital technologies on scientific research process
Need new modes of data curation
eBank project: applying digital library techniques to support data curation
Next steps

Changes in scientific research process
Increasing data volumes from eScience / Grid-enabled / cyber-infrastructure applications, big science
Changing research methods: high-throughput technologies, automation, smart labs
Potential for re-use of data, new inter-disciplinary research
Different types of data (observational, experimental, computational) with different stewardship requirements

The data deluge: crystallography
EPSRC National Crystallography Service
Data overload! How do we disseminate?

Data overload & the publication bottleneck
[Figure. Surviving labels: 25,000,000 and two partly lost values, "2,000," and ",000".]

Current publishing process
Journal articles: aims, ideas, context, conclusions; only the most significant data
Raw & underlying data required by peers not readily available

Context: existing data repositories
National data archives:
  – UK Data Archive, Arts and Humanities Data Service, US National Archives and Records Administration (NARA), Atlas Datastore
Discipline specific archives:
  – GenBank, Protein Data Bank
Crystallography archives:
  – Cambridge Crystallographic Data Centre (Cambridge Structural Database), Indiana University Molecular Structure Center (Crystal Data Server, Reciprocal Net), FIZ Karlsruhe (Inorganic crystals), Toth Information Systems (CHRYSTMET)
Journals require deposit of data to support articles:
  – Typically deposit of summary data…. partial coverage

Crystallography workflow
Data categories: raw data, derived data, results data
Initialisation: mount new sample on diffractometer & set up data collection
Collection: collect data
Processing: process and correct images
Solution: solve structures
Refinement: refine structure
CIF: produce CIF (Crystallographic Information File)
Validation: chemical & crystallographic checks

eBank UK project overview
JISC funded in 2003, now in Phase 2 to 2006
Joint effort between crystallographers, computer scientists, digital library researchers
Investigating contribution of existing digital library technologies to enable publication at source
Partners have interest in dissemination of chemistry research data, open access, OAI, institutional repositories

eBank project team
University of Bath, UKOLN: Michael Day, Monica Duke, Rachel Heery, Liz Lyon, Traugott Koch
University of Southampton, School of Chemistry: Simon Coles, Jeremy Frey, Mike Hursthouse
University of Southampton, School of Electronics and Computer Science: Leslie Carr, Chris Gutteridge
University of Manchester, PSIgate: John Blunden-Ellis

eBank phase one: achievements
Gathered requirements from crystallographers
Established pilot institutional repository for crystallography data at Southampton with web interface
Developed a demonstrator aggregator service at UKOLN (CCDC exploring aggregation service)
Developed appropriate schema
Demonstrated a search interface as an embedded service at PSIgate portal
Demonstrated an added value service linking research data to papers (one-off)

Institutional repositories… publication at source
Institution establishes repository(s)
Institution pro-actively supports deposit process
OAI provides basis for interoperability
Potential for added value services
And/or… international subject based archives?

Crystallography: a good fit…
Crystallography has a well defined data creation workflow
Tradition of sharing using a standard file format, the Crystallographic Information File (CIF)
What about other chemistry sub-disciplines? Other scientific disciplines?

Data flow in eBank UK
[Diagram. Institutional repository: create, submit, store/link data files and metadata, present HTML. eBank aggregator: harvest (XML) via OAI-PMH, index and search, present HTML.]

Southampton digital repository

Access to ALL underlying data

OAI-PMH: harvesting and aggregating
eBank aggregator at UKOLN: demo/
Demonstrating potential for linking between data and journal article
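
As an illustration of the harvesting step, here is a minimal Python sketch of how an aggregator might build an OAI-PMH ListRecords request and read record identifiers and the resumption token from a response. The verb, the metadataPrefix and resumptionToken arguments and the response namespace come from the OAI-PMH 2.0 protocol. The base URL, the identifiers and the sample response are assumptions made up for this example; they are not the eBank service's actual endpoints or data.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH 2.0 response elements.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def build_list_records_url(base_url, metadata_prefix, resumption_token=None):
    """Build a ListRecords request URL.

    A resumptionToken is an exclusive argument in OAI-PMH, so when one is
    given it replaces metadataPrefix.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None)."""
    root = ET.fromstring(xml_text)
    identifiers = [
        header.findtext(OAI_NS + "identifier")
        for header in root.iter(OAI_NS + "header")
    ]
    token_el = root.find(f"{OAI_NS}ListRecords/{OAI_NS}resumptionToken")
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# Hypothetical response, trimmed to the parts this sketch reads.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:repository.example.org:1</identifier></header></record>
    <record><header><identifier>oai:repository.example.org:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>
"""

first_url = build_list_records_url("http://repository.example.org/oai", "ebank_dc")
record_ids, next_token = parse_list_records(SAMPLE_RESPONSE)
next_url = build_list_records_url(
    "http://repository.example.org/oai", "ebank_dc", next_token
)
```

A real harvester would fetch each URL over HTTP and repeat until no resumption token is returned; that loop is left out so the sketch runs without a network.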

Embedded search service at PSIgate
PSIgate subject gateway: service provider

Schema for records made available for harvesting
Data holding (collection of files associated with experiment)
Qualified Dublin Core data elements plus additional chemical properties:
  – Empirical formula
  – International Chemical Identifier (InChI)
  – Compound Class
Individual data files
Separate records for stage status of each file
Description set wrapped into one XML record using METS
Research metadata/data as a complex object
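
To make the record shape concrete, the following Python sketch builds one description of a data holding that combines Dublin Core elements with the three chemical properties listed above. The Dublin Core namespace is the standard one. The second namespace, its element names (empiricalFormula, inchi, compoundClass), the wrapper element and all sample values are hypothetical placeholders, since the slides do not give the actual ebank_dc element names. The METS wrapping is not shown.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
# Hypothetical namespace standing in for the project's chemical properties.
SKETCH_NS = "http://example.org/ebank-sketch/"

ET.register_namespace("dc", DC_NS)
ET.register_namespace("chem", SKETCH_NS)


def build_data_holding_record(identifier, title, creator,
                              formula, inchi, compound_class):
    """Return an Element describing one data holding."""
    record = ET.Element("record")  # placeholder wrapper element
    fields = [
        (DC_NS, "identifier", identifier),
        (DC_NS, "title", title),
        (DC_NS, "creator", creator),
        (DC_NS, "type", "CrystalStructure"),
        # Element names below are assumptions, not the real schema.
        (SKETCH_NS, "empiricalFormula", formula),
        (SKETCH_NS, "inchi", inchi),
        (SKETCH_NS, "compoundClass", compound_class),
    ]
    for namespace, tag, value in fields:
        ET.SubElement(record, f"{{{namespace}}}{tag}").text = value
    return record


# Illustrative values only (benzene); not taken from any repository.
record = build_data_holding_record(
    identifier="http://repository.example.org/holding/1",
    title="Benzene crystal structure (example)",
    creator="Example, A.",
    formula="C6H6",
    inchi="InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H",
    compound_class="organic",
)
xml_text = ET.tostring(record, encoding="unicode")
```

Keeping the chemical properties in their own namespace means a harvester that only understands Dublin Core can still read the core description and ignore the rest.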

eBank data model (model input: Andy Powell, UKOLN)
[Diagram. Deposit into institutional repositories.
Dataset side: crystal structure (data holding), crystal structure report (HTML), dataset; described by an ebank_dc record (XML) with dc:identifier, dc:type=CrystalStructure and a dcterms:references link.
Eprint side: eprint jump-off page (HTML), eprint manifestation (e.g. PDF); described by an oai_dc record (XML) with dc:identifier, dc:type=Eprint and/or Text and a dcterms:isReferencedBy link.
Harvesting over OAI-PMH (ebank_dc; oai_dc, ebank_dc; oai_dc) by the eBank UK aggregator service, the ePrint UK aggregator service, and other aggregators and services.]
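
The linking in this model can be sketched in a few lines of Python. The sketch assumes that a dataset record's dcterms:references value holds the identifier of an eprint, and that the eprint record's dcterms:isReferencedBy value holds the dataset identifier; that direction is read off the diagram labels rather than stated in the slides. Harvested records are represented as plain dictionaries, and the function name, keys and identifiers are all hypothetical.

```python
def link_datasets_to_eprints(records):
    """Pair each dataset record with the eprints it references.

    Returns (dataset_id, eprint_id, reciprocal) tuples, where reciprocal
    is True only if the eprint was harvested and points back at the
    dataset through isReferencedBy.
    """
    by_id = {record["identifier"]: record for record in records}
    links = []
    for record in records:
        if record["type"] != "CrystalStructure":
            continue
        for eprint_id in record.get("references", []):
            eprint = by_id.get(eprint_id)
            reciprocal = (
                eprint is not None
                and record["identifier"] in eprint.get("isReferencedBy", [])
            )
            links.append((record["identifier"], eprint_id, reciprocal))
    return links


# Hypothetical harvested records: one dataset, one eprint.
harvested = [
    {
        "identifier": "dataset-1",
        "type": "CrystalStructure",
        "references": ["eprint-1", "eprint-9"],
    },
    {
        "identifier": "eprint-1",
        "type": "Eprint",
        "isReferencedBy": ["dataset-1"],
    },
]

links = link_datasets_to_eprints(harvested)
```

Here the link to eprint-1 is confirmed from both sides, while eprint-9 is flagged as one-sided because no matching eprint record was harvested. An aggregator could use that flag to decide which links to show.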

Creating the metadata
Potential to embed deposit and dissemination into the chemist's workflow in an automated way

Data collection
[Flowchart. Surviving labels: Sample Tray, BruNo Mount, Setup via GUI, PreScans, Diffraction, Unit Cell, Strategy, Data Collection, Data Process, Success, BruNo Unmount, System Y; decision branches Yes / No.]

eBank phase two work areas
Sub-disciplines of chemistry and physical sciences
Pursue generic data model
Use of identifiers for citing datasets
Subject approach to discovering research data
Access to research data in teaching and learning context
Liaise with other digital repository initiatives

For the future…
Who provides added value services?
  – Authority files, automated subject indexing, annotation, data mining, visualisation
What are the preservation issues?
  – UK Digital Curation Centre
  – National Science Board draft report on long-lived data collections
How to manage complex object descriptions within OAI
Digital curation of research data presents new roles for scientists, computer scientists, data managers…. data scientists

Thank you. Comments, questions?
Acknowledgement to all project partners for their contributions to this presentation.