David Lynn, The Wellcome Trust
Data Matters
UK Research Data Service conference, 26 February 2009
The Wellcome Trust
- independent biomedical research charity established in 1936
- current spend of over £600m pa
- supports over 3,000 researchers in more than 50 countries, across six continents
- works to engage the public in research and to explore its societal implications
Mission and strategic aims
Our mission is to foster and promote research with the aim of improving human and animal health.
We have six strategic aims:
- advancing knowledge
- using knowledge
- engaging society
- developing people
- facilitating research
- developing our organisation
Facilitating research
- we strive to foster a research environment in which biomedical science can flourish
- we partner with others to develop key data resources:
  - Human Genome Project
  - Structural Genomics Consortium
  - UK Biobank
- we also fund key databases via:
  - Wellcome Trust Sanger Institute
  - European Bioinformatics Institute
  - additional grant funding
- we work to maximise access to research outputs (publications, data and collections)
Research is generating rapidly increasing volumes of data…
- DNA sequencing: total gigabases by week (80 gigabases per week is 130,000 bases per second)
- Structural biology: structures deposited in the Protein Data Bank*
*Graph extracted from the Interim Report of the Blue Ribbon Task Force on Sustainable Digital Preservation and Access (Dec 2008)
Courtesy of Julian Parkhill
…in a diverse range of formats
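The conversion quoted on the sequencing slide (80 gigabases per week is 130,000 bases per second) can be checked with a short calculation. This is only an illustrative sketch: the function name `bases_per_second` is made up for this example, and it assumes 1 gigabase = 10^9 bases and a steady rate across a seven-day week.

```python
# Sanity check of the slide's unit conversion.
# Assumptions (not stated in the slides): 1 gigabase = 1e9 bases,
# and output is spread evenly over a 7-day, 24-hour week.
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800 seconds


def bases_per_second(gigabases_per_week: float) -> float:
    """Convert a weekly sequencing rate in gigabases to bases per second."""
    return gigabases_per_week * 1e9 / SECONDS_PER_WEEK


rate = bases_per_second(80)
print(f"{rate:,.0f} bases per second")  # prints: 132,275 bases per second
```

The result, roughly 132,000 bases per second, agrees with the rounded figure of 130,000 given on the slide.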
Through sharing data we can increase its power
- Genome-wide association studies are revealing the genetic basis of common diseases through combining data across large patient cohorts…
- …and databases such as DECIPHER at the Wellcome Trust Sanger Institute enable researchers to share data to gain new insights
DECIPHER: Overview map of consortium members
Courtesy of Leena Peltonen
Researchers are integrating large datasets to gain new insights into complex systems
- Malaria Atlas Project, University of Oxford
- Heart modelling - CardioViz3D*
- Linking cholera outbreaks to sea temperature in Bangladesh
*Toussaint et al. (2008) An Integrated Platform for Dynamic Cardiac Simulation and Image Processing: Application to … In Proc. Eurographics Workshop on Visual Computing for Biomedicine (VCBM)
And there is immense potential to link research papers and data…
A vast number of research users are accessing key data resources
- For the year to August 07, the EBI website served an average of 340k unique hosts per month, with over 2m requests per day*
- The Wellcome Trust Sanger Institute website regularly received 15m hits per week during 2007/08 (a rise of 25% compared with the previous year)
*Source: European Bioinformatics Institute, Annual Scientific Report 2007
Wellcome Trust Sanger Institute: Total number of web pages requested
Meeting the challenges 1: infrastructure
- rising volumes and complexity of data pose immense challenges for storage and curation, e.g. WT Sanger Institute data storage capacity increased from 300 TB in 2005 to 1,500 TB in 2008
- key data resources need coordinated and long-term sustainable funding
- ELIXIR is aiming to build a sustainable infrastructure for biological information across Europe
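To put the Sanger Institute storage figures in perspective, the growth from 300 TB in 2005 to 1,500 TB in 2008 can be expressed as an implied annual rate. The sketch below assumes smooth compound growth between the two data points, which the slide does not state; the function name `annual_growth_rate` is hypothetical.

```python
# Implied compound annual growth rate between two storage figures.
# Assumption (not from the slides): capacity grew at a constant
# compound rate between the 2005 and 2008 data points.


def annual_growth_rate(start: float, end: float, years: float) -> float:
    """Return the constant yearly growth rate that takes `start` to `end`."""
    return (end / start) ** (1 / years) - 1


growth = annual_growth_rate(start=300, end=1500, years=2008 - 2005)
print(f"{growth:.0%} per year")  # prints: 71% per year
```

Under that assumption, a fivefold increase over three years corresponds to capacity growing by roughly 71% each year.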
Meeting the challenges 2: technical and cultural issues
- coordination and advocacy from key communities (e.g. funders, institutions, publishers)
- provision of information and guidance for researchers
- appropriate incentives and recognition for researchers
- development of key technical standards, metadata, etc.
- nurturing skills in data management – career support and training
Meeting the challenges 3: data security
- research involving personal data must have appropriate safeguards to protect participants
- recent high-profile incidents have sharpened concerns around privacy, confidentiality and responsibilities of researchers
- the issues have been addressed in several recent reports:
  - Academy of Medical Sciences
  - Council for Science and Technology
  - Thomas/Walport data sharing review
  - US Institute of Medicine
- management & governance of data is a key concern
The Wellcome Trust's approach
- long track record of promoting access to research outputs:
  - Bermuda principles (1996); Fort Lauderdale principles (2003)
  - tailored data policies for major initiatives
  - strong advocate of open access publishing
- data management and sharing policy published in Jan 2007:
  - researchers should maximise access to research data with as few restrictions as possible
  - data management plans (DMPs) required for projects generating resources or large datasets that could be shared for added value
  - will meet costs for data sharing activities outlined in DMPs
- increasing convergence of DMP approach amongst funders
The UKRDS in context: ongoing Trust activity
At UK level…
- ongoing interactions with RCUK, JISC, HEFCE, RIN, UKDA and others
- several other multi-funder initiatives – e.g. NCRI Informatics, UK Data Forum
- UK PubMed Central development
- active discussions around research uses of electronic patient records
At European level…
- ESFRI (ELIXIR)
- proposals for EU PMC resource
At international level…
- developing a code of conduct for public health and epidemiological data
- Fort Lauderdale follow-up meeting (led by Genome Canada) in May 09
UKRDS – what is needed
- a coordinated approach to preserve key research data and ensure its long-term value is maximised
- a service which meets the needs of researchers and funders
- but the devil is in the detail – must ensure that the project is developed in a way that truly adds value
- will still depend upon sustainable long-term funding for key data resources
UKRDS – critical success factors
- must build upon and link effectively with existing UK activity, and develop effective international links
- will need full buy-in from all major funders (Research and Funding Councils & charities) and other key stakeholders
- will need to accommodate the differences between funders' approaches, and between disciplines
- must be appropriately resourced to meet its goals
UKRDS – pathfinder study
- clarification on the role that it is envisaged research funders will play
- a more detailed project specification will need to be developed with:
  - clearly stated expectations for partners
  - full justification for the anticipated costs
- the study will need full buy-in from funders, institutions and the wider research community
- the resource implications of the study will need to be assessed carefully