Building a community for genome and proteome annotation

Presentation transcript:

Building a community for genome and proteome annotation
Claire O’Donovan (odonovan@ebi.ac.uk)

Building a community for genome and proteome annotation
Workshop: March 2015 at Georgetown University, Washington, USA (funded by NIH through the UniProt grant)
Workshop aims:
To discuss a shared vision for the future of annotation in the genome era, with a special focus on protein functional prediction.
To have a stimulating discussion about what we all do, what we would like to achieve in the future and how to build a community.

Participants (Institute: Name)
University of California: Dr Patricia Babbitt
J. Craig Venter Institute: Dr Granger Sutton
Broad Institute: Dr Gustavo Cerqueira
Joint Genome Institute: Dr Nikos Kyrpides
Texas A&M University: Eric Rasche
The University of Maryland: Dr Michelle Giglio
Indiana University (CAFA SIG): Dr Predrag Radivojac
Miami University (CAFA SIG): Dr Iddo Friedberg
University of Florida: Dr Svetlana Gerdes
SRI International: Dr Peter Karp
NCBI: Dr Tatiana Tatusova
EMBL-EBI: Dr Maria Martin
University of Southern California: Dr Huaiyu Mi

Current status
Huge advances in genome sequencing technology, quality and standards.
But sequence does not equal function! This is our challenge!

Aim to go beyond the name:
Ontologies
Nomenclature
Functional annotations
Sequence features

Building a community for genome and proteome annotation
Presented lightning talks about our perspectives/opinions.
Broke into groups to brainstorm on:
Our vision of the future of annotation
What are the barriers?
What are the solutions?

Solutions / What is needed
Establishing the community.
Bringing together the experimentalists and experts in annotation pipelines, database providers, standards developers and computational researchers is critical to annotating genomes and proteomes successfully and accurately. By doing so, we ensure we have the experimental data and domain knowledge needed to inform the computational pipelines and databases, and we create a forum for the establishment and exchange of data, code and best practices.

Solutions / What is needed
Defining comprehensive annotation requirements and standards for automatic annotation systems, so that they deliver consistent annotation across our resources and are available for the community to use.
We need to develop and make available Standard Operating Procedures (SOPs) for the annotation pipelines to enable both accurate interpretation and reproducibility.
It is imperative for the community to move away from the protein name as the primary piece of annotation information for the gene/protein product, as it is simply insufficient to capture the relevant information available for each gene/protein.

Solutions / What is needed
The importance of a central resource of curated and experimental data and computational methods.
Currently there is no exhaustive database of experimentally determined protein function, which inhibits the proper validation and generalization of computational methods.
Sharing code and annotation pipelines in a central, publicly supported platform with well-defined modules and interfaces would enable smaller resources or individual researchers to contribute new methods and utilise others’ efforts and expertise.

Solutions / What is needed
Interaction with the journals to ensure standardized data submission, enabling effective computational data gathering.
While natural language processing is improving, the only reliable way to capture information computationally is via structured data. The authors are best positioned and motivated to provide this data at the time of publication, so it is critical to engage with the publishers to ensure this is part of the submission process.

Solutions / What is needed
Engaging with funding agencies to develop new proposals for annotation that involve both experimentalists and computational scientists, and to have them mandate openness, e.g. publications, software and data sharing.
We need to communicate better what annotation actually is:
1) gene calling
2) developing a controlled vocabulary for function
3) capturing/propagating functional annotation
4) discovering the function of uncharacterized genes/proteins
Building the collaborations to enable such comprehensive proposals to deliver on all aspects.

Solutions / What is needed
Engaging with the experimentalists in the development of experimental assays that would address the questions we need answered.
The annotation and prediction communities know there are many known unknowns, and it would be of great benefit to work together with the experimentalists to direct research towards these experimental gaps.

Solutions / What is needed
Engaging with cutting-edge computational scientists and hardcoding methods identified as useful for our community.
We need YOUR expertise for the development of better computational approaches for functional prediction and for improving text mining to extract data from the experimental literature.
So get involved!