Science as an Open Enterprise: Open Data for Open Science Professor Brian Collins CB, FREng UCL, June 2012 Emerging conclusions from a Royal Society Policy Report

Presentation transcript:

Science as an Open Enterprise: Open Data for Open Science Professor Brian Collins CB, FREng UCL, June 2012 Emerging conclusions from a Royal Society Policy Report

Open data as the engine of the “scientific revolution”
Publish scientific theories – and the experimental and observational data on which they are based – to permit others to scrutinise them, to identify errors, to support, reject or refine theories and to reuse data for further understanding and knowledge.
Henry Oldenburg

Why is “open data” a big current issue?
The data deluge from powerful acquisition tools, coupled with powerful tools for storing, manipulating, analysing, displaying and transmitting data, and citizens’ interest in scrutinising scientific claims have created new challenges & new opportunities that require new forms of openness and novel social dynamics in science.

Challenges
  Maintaining scientific self-correction (closing the concept-data gap)
  Responding to citizens’ demands for evidence in “public interest science”
Opportunities
  Exploiting data-intensive science – a 4th paradigm?
  The potential of linked data
  “Data is the new raw material for business”
  Exposing malpractice and fraud
  Stimulating citizen science
Aspiration: all scientific literature online, all data online, and for them to interoperate

Openness of data per se has no value. Open science is more than disclosure.
For effective communication, we need intelligent openness. Data must be:
  Accessible
  Intelligible
  Assessable
  Re-usable
Only when these four criteria are fulfilled are data properly open.
Metadata must be audience-sensitive.
Scientific data rarely fits neatly into an EXCEL spreadsheet!
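
As an illustration only, the four criteria on the slide above can be treated as a checklist in which a dataset counts as "properly open" only when every item holds. The class name, method names and the one-line glosses in the comments are assumptions made for this sketch; the slide lists the criteria but does not define them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OpennessCheck:
    """Hypothetical checklist for the slide's four 'intelligent openness' criteria."""

    # The glosses below are assumptions for illustration, not from the slide.
    accessible: bool    # the data can be found and obtained
    intelligible: bool  # the intended audience can understand the data
    assessable: bool    # a reader can judge how reliable the data are
    reusable: bool      # the data carry enough context to be used again

    def failed_criteria(self) -> List[str]:
        """Return the names of the criteria that are not met, in field order."""
        return [name for name, ok in vars(self).items() if not ok]

    def is_intelligently_open(self) -> bool:
        """All four criteria must hold; disclosure alone is not enough."""
        return not self.failed_criteria()


# A dataset that is published and documented but cannot be quality-checked.
check = OpennessCheck(accessible=True, intelligible=True,
                      assessable=False, reusable=True)
print(check.is_intelligently_open(), check.failed_criteria())
```

The all-or-nothing test mirrors the slide's wording that data are properly open "only when these four criteria are fulfilled".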

Boundaries of openness?
  Legitimate commercial interests
  Privacy (complete anonymisation is impossible)
  Safety & Security
But the boundaries are fuzzy & complex

Benefits/costs of open data to the science process
Pathfinder disciplines where benefit is recognised and habits are changing
  Bioinformatics (-omics disciplines)
  Biological science
  Particle physics
  Nanotechnology
  Environmental science
  Longitudinal societal data
  Astronomy & space science
Costs
  Tier 1 – International databases – e.g. Worldwide Protein Databank: >65 staff; $6.5M pa; 1% of cost of collecting data
  Tier 3 – Institutional data management – UK 2011, average UK university repository FTE (managerial, administrative, technical)
e.g. Gene Expression Omnibus (GEO) – 2700 GEO uploads by non-contributors in 2000 led to 1150 papers (>1000 additional papers over the 16 that would be expected from investment of $400,000)
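
The figures quoted on this slide can be checked with a few lines of arithmetic. This is a sketch that uses only the numbers stated above. The variable names are illustrative, and reading "1% of cost of collecting data" as a ratio of annual costs is an assumption, since the slide does not say which period the 1% refers to.

```python
# Figures as quoted on the slide.
wwpdb_annual_cost_usd = 6_500_000   # "$6.5M pa"
curation_share_percent = 1          # "1% of cost of collecting data"
papers_observed = 1150              # papers from 2700 GEO uploads by non-contributors
papers_expected = 16                # papers expected from an investment of $400,000

# Assumption: the 1% compares annual curation cost with annual collection cost.
implied_collection_cost_usd = wwpdb_annual_cost_usd * 100 // curation_share_percent

# Papers beyond the expected baseline, and the ratio of observed to expected.
additional_papers = papers_observed - papers_expected
observed_to_expected = papers_observed / papers_expected

print(implied_collection_cost_usd)     # 650000000
print(additional_papers)               # 1134
print(round(observed_to_expected, 1))  # 71.9
```

The difference of 1134 papers is consistent with the slide's ">1000 additional papers".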

Levels of data curation
  Tier 1 – International databases
  Tier 2 – National (e.g. Research Councils)
  Tier 3 – Institutions (Universities & Institutes)
  Tier 4 – “Small science” researchers & research groups
Financial sustainability?
Upward data migration
Data loss

Priorities for action - 1
1) Change the mindset: publicly funded data is a public resource
2) Credit for useful data and productive, novel collaboration (the Tim Gowers phenomenon)
3) Mandatory access to data underlying publications
4) Common standards for communicating data
5) Sustainability (the power needs of current modes of data storage will outstrip the global electricity supply within the decade)

Priorities for action - 2
  R & D on software tools (enabling dynamic data; managing the data lifecycle; tracking provenance, citation, indexing and searching, standards & inter-operability, sustainability; note that the ICT industry is often way ahead, and the US prioritises investment here)
  Institutional responsibility for the knowledge they create (cumulative small science data > cumulative big science data)
  Data scientists (they are being trained, and the commercial demand is large)
  “Big Iron” is a national infrastructure priority
  “Big data” is a science priority – the big costs are people and software, not computers

Targets for recommendations
  Scientists – changing cultural assumptions
  Employers (universities/institutes) – data responsibilities; crediting researchers
  Funders of research – the cost of curation is a cost of research
  Learned societies – influencing their communities
  Publishers of research – mandatory open data
  Business – exploiting the opportunity; awareness & skills
  Government – efficiency of the science base; exploiting its data
  Governance processes for privacy, safety, security – proportionality