Adding Value to Data and Information: Moving towards a Science Commons?
Dr Liz Lyon, Director, UKOLN
Science Commons Workshop, Brussels, September
UKOLN: a centre of expertise in digital information management
This work is licensed under a Creative Commons Licence Attribution-ShareAlike 2.0
Scholarship today? OA landscape
Architecture of Participation? (15 September 2006)
Data-centric 2020 vision. Reference datasets as infrastructure?
(Very simple) e-Research Cycle
– Formulate hypothesis / ideas, test, experiment, observe: data creation, collection & capture
– Adding value: data linking, annotation, visualisation, simulation
– (New) knowledge extraction: data mining, modelling, analysis, synthesis
– Scholarly communications: data disclosure, publication, citation, discovery, re-use
– Data management, storage & validation: description, deposit, self-archiving, preservation, certification
– Data processing
e-Infrastructure; Open access; Collaboration
Understanding the research process: workflows
UK JISC-funded activity
Project StORe: Source-to-Output Repositories (Edinburgh)
RepoMMan: Repository Metadata and Management (Hull)
– Primary data: research publications
– Survey questionnaire, activity diagrams
e-Scientist desktop?
Slide: Carole Goble
Data capture
Deposit scenario (…part of….)
1. Produce strategy for synthesis (= idea)
2. Submit plan to SmartTea system (incl. identifiers)
3. Retrieve and follow instructions (sub-workflow?)
4. Experimental synthesis metadata automatically recorded on instruments (Smart Lab)
5. Create record for synthesised sample (+ proposed chemical identifier) in R4L laboratory data management system
6. Run spectral analyses on sample, capturing further analysis metadata (incl. time-stamp, analysis software version, researcher details etc.)
7. Save spectrum in native and common formats
8. Invoke R4L data capture service and deposit files + metadata in laboratory repository…
RAW DATA, DERIVED DATA, RESULTS DATA
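Steps 5 to 8 of the scenario above can be sketched in a few lines. This is only an illustrative sketch, not the R4L data capture service: `SampleRecord`, `LaboratoryRepository`, `analysis_metadata` and every field name are hypothetical, and the repository is an in-memory stand-in. It shows analysis metadata (time-stamp, software version, researcher) being captured and deposited together with the spectrum files, each file recorded with a checksum.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Data stages named on the slide; the lowercase labels are an assumption.
DATA_STAGES = ("raw", "derived", "results")


@dataclass
class SampleRecord:
    """Step 5: record for a synthesised sample (hypothetical structure)."""
    sample_id: str
    proposed_chemical_id: str
    analyses: list = field(default_factory=list)


def analysis_metadata(software_version, researcher):
    """Step 6: metadata captured alongside a spectral analysis."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "software_version": software_version,
        "researcher": researcher,
    }


class LaboratoryRepository:
    """Hypothetical in-memory stand-in for a laboratory repository."""

    def __init__(self):
        self.records = {}

    def deposit(self, record, files, metadata, stage):
        """Step 8: deposit files plus metadata against a sample record.

        `files` maps a filename to its content as bytes.
        """
        if stage not in DATA_STAGES:
            raise ValueError(f"unknown data stage: {stage}")
        entry = {
            "stage": stage,
            "metadata": dict(metadata),
            # Store a SHA-256 checksum per file so later changes are detectable.
            "files": {
                name: hashlib.sha256(content).hexdigest()
                for name, content in files.items()
            },
        }
        stored = self.records.setdefault(record.sample_id, record)
        stored.analyses.append(entry)
        return entry


# Usage: step 7 saves the spectrum in native and common formats (placeholder bytes).
repo = LaboratoryRepository()
sample = SampleRecord("sample-001", "proposed-id-placeholder")
entry = repo.deposit(
    sample,
    {"spectrum.native": b"native bytes", "spectrum.common": b"common bytes"},
    analysis_metadata("1.0", "A. Researcher"),
    "raw",
)
```

Recording the checksum at deposit time is a design choice made for the sketch; it is one simple way to support the later validation and preservation stages of the cycle.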
The R4L Repository: Deposit; Search / Browse; Create new compound; Add experiment data and metadata. Slide: Simon Coles
eBank UK Project
– Promoting open access data in an institutional repository
– Adding value through linking from data to derived publication
– Embedding data service in learning workflows: pedagogy
UKOLN (lead), University of Southampton, University of Manchester
e-Research workflows: e-Crystals Federation model (diagram)
Components: Data creation & capture in Smart lab; Laboratory repository; Institutional data repositories; Data curation & preservation: databases & databanks; Aggregator services; Presentation services: portals; Publishers: peer-review journals, conference proceedings; Chemistry Central; Data analysis, transformation, mining, modelling; Data discovery, linking, citation
Flow labels: Deposit; Validation; Harvest; Search, harvest; Publication; Linking, citation
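The harvest and linking flows in this model can be illustrated with a toy aggregator. This is a sketch under stated assumptions, not the eBank aggregator service: the `Aggregator` class, the record fields (`id`, `type`, `derived_from`) and the repository names are all hypothetical, and harvesting is simulated by passing in lists of records rather than calling any real harvesting protocol.

```python
from collections import defaultdict


class Aggregator:
    """Hypothetical aggregator: harvests records and links data to publications."""

    def __init__(self):
        self.datasets = {}                          # dataset id -> record
        self.publications_for = defaultdict(list)   # dataset id -> publication ids

    def harvest(self, repository_name, records):
        """Ingest records already exposed by one repository."""
        for rec in records:
            if rec["type"] == "dataset":
                # Remember which repository the dataset came from.
                self.datasets[rec["id"]] = {**rec, "source": repository_name}
            elif rec["type"] == "publication":
                # A publication declares the datasets it was derived from.
                for dataset_id in rec.get("derived_from", []):
                    self.publications_for[dataset_id].append(rec["id"])

    def links(self, dataset_id):
        """Return the publications derived from a given dataset."""
        return list(self.publications_for.get(dataset_id, []))


# Usage with made-up identifiers.
agg = Aggregator()
agg.harvest("institutional-repo-A", [{"id": "data:1", "type": "dataset"}])
agg.harvest(
    "publisher-B",
    [{"id": "pub:9", "type": "publication", "derived_from": ["data:1"]}],
)
linked = agg.links("data:1")
```

Keeping the link on the publication record and indexing it by dataset is one possible design; the reverse direction, from publication to data, would need a second index.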
Digital repositories, OA & preservation
– Long-term access: trust, responsibility, policy
– Trusted DR Audit Checklist for Certification (draft), Research Libraries Group-NARA Taskforce, 2005
– Self-certification: DINI-Zertifikat
– UK Digital Curation Centre: advice, tools & services; RepInfo Registry
– EU CASPAR Integrated Project
– Task Force on the Permanent Access to the Records of Science
Data, metadata and interdisciplinary discovery
Validation, publication & discovery of data models & schema
Metadata packaging standards
– METS, MPEG 21 DIDL
– Complex object model?
Semantic descriptions
– Formal high-level and domain ontologies
ePrints DC Application Profile (Eprints_Application_Profile)
eBank Application Profile: crystallography data (uk/schemas/)
UK Intute IR search service (eprints)
Informal social network approaches: folksonomies
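To make the idea of a Dublin Core based description concrete, here is a minimal sketch that serialises a dataset description as XML using simple Dublin Core elements. It does not reproduce the ePrints or eBank application profiles: the wrapper element name `record`, the chosen fields and the example values are assumptions for illustration. Only the Dublin Core element namespace is standard.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_dc_record(fields):
    """Build an XML element holding repeatable Dublin Core fields.

    `fields` maps a DC element name (e.g. "title") to a list of values.
    The wrapper element "record" is a hypothetical choice.
    """
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")
    for name, values in fields.items():
        for value in values:
            element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            element.text = value
    return root


# Usage with placeholder values (not taken from any real record).
record = build_dc_record(
    {
        "title": ["Example crystal structure dataset"],
        "creator": ["Researcher, A.", "Researcher, B."],
        "type": ["Dataset"],
        "identifier": ["hypothetical-identifier"],
    }
)
xml_text = ET.tostring(record, encoding="unicode")
```

A real application profile would constrain which elements are required and which vocabularies their values come from; the sketch leaves all of that open.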
Persistent identifiers for data citation
How will they be used? We need use cases: depositor, author, service provider, researcher, publisher?
Schemes: DOI, Handle, ARK, PURL
Publication & citation of scientific primary data project: National Library for Science & Technology (TIB), University of Hanover, Germany
– STD-DOI Project: DOI registry for datasets
– eBank exemplar DOIs from TIB
Data citation policy
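One service-provider use case for the schemes listed above is turning a stored identifier into an actionable link. The sketch below assumes today's public resolvers (doi.org for DOIs, hdl.handle.net for Handles, n2t.net for ARKs) and treats a PURL as a URL in its own right. The function name `resolver_url` and the example identifiers are hypothetical.

```python
# Public resolver prefixes; PURLs are already resolvable URLs.
RESOLVERS = {
    "doi": "https://doi.org/",
    "handle": "https://hdl.handle.net/",
    "ark": "https://n2t.net/",
}


def resolver_url(scheme, identifier):
    """Return a URL that should resolve the given persistent identifier."""
    scheme = scheme.lower()
    if scheme == "purl":
        return identifier  # a PURL is itself the actionable URL
    if scheme not in RESOLVERS:
        raise ValueError(f"unsupported identifier scheme: {scheme}")
    return RESOLVERS[scheme] + identifier


# Usage with made-up identifiers.
doi_link = resolver_url("DOI", "10.1234/example-dataset")
ark_link = resolver_url("ark", "ark:/12345/x6example")
```

Storing the bare identifier and deriving the link at display time keeps citations stable if a resolver address changes.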
Discovering data
Coles, S.J., Day, N.E., Murray-Rust, P., Rzepa, H.S., Zhang, Y., Org. Biomol. Chem., 2005, (10), DOI: /b502828k
Domain identifier: International Chemical Identifier (InChI) code
Google molecule using InChI
Slide from Simon Coles
Adding value: repository services
– Tools: for deposit, normalisation, manipulation, transformation…
– Linking, annotation, visualisation
– Aggregators: generic, (sub-)disciplinary
– Knowledge extraction:
  Mining (data, text, structures)
  Modelling (economic, climate, mathematical, biological…)
  Analysis (statistical, lexical, gene…)
Adding value: eBank linking data to publications
New forms of publication: integration of data and journals
Linking research to learning: embedding eBank aggregator service in a science portal for student learners
– MChem course
– Assess role in Undergraduate Chemical Informatics courses
– Pedagogic evaluation
– Report to be published.
Nature, 23 March 2006. OTMI: Open Text Mining Interface. NaCTeM. Emerging tools: TerMine, GENIA, Cafetiere
Avian flu outbreaks mashup - Nature, January 2006. Data from FAO, WHO… + Google Earth
Thank you. UKOLN receives core funding from the Joint Information Systems Committee (JISC) and the Museums, Libraries & Archives Council (MLA) and is based at the University of Bath, UK.