GO Term Integration and Curation in Pathway Tools and EcoCyc. Ingrid M. Keseler, Bioinformatics Research Group, SRI International.

Presentation transcript:

1 SRI International Bioinformatics GO Term Integration and Curation in Pathway Tools and EcoCyc Ingrid M. Keseler Bioinformatics Research Group SRI International

2 History of Classification and GO terms in EcoCyc
- The MultiFun classification scheme was/is used for gene/gene product classification in EcoCyc.
- Developed by Monica Riley and collaborators
- Hierarchical classification scheme with 10 major categories for cellular function
- In 2005, we began to add support for adding GO terms to genes/gene products.

3 Why go with GO?
- GO has become the standard ontology/classification scheme for gene products
- GO is being actively developed with input from the user communities
- GO allows standardization of annotation across all domains of life
  - Data mining across genomes
  - Genome annotation by similarity (e.g. via InterPro, Pfam, TIGRFAM, COG mappings)
- Tools take advantage of GO annotations, e.g. microarray data clustering

4 The Evolution of GO Within EcoCyc
1. 12/ Mapping of MultiFun terms to GO terms (multifun2go – Ashburner and Lomax): multiple specific GO terms were sometimes mapped to one general MultiFun term, resulting in misleading GO term annotations in EcoCyc; no evidence codes, citations
2. 12/ Mapping of EC reactions to GO terms (ec2go): imported GO terms for enzymes that catalyzed reactions with full EC number assignments; no evidence codes, citations

5 The Evolution of GO Within EcoCyc
3. 4/ Importing GO term assignments from UniProt; mostly computational evidence codes
4. Since ~ Manual curation of GO terms based on publications, with evidence codes (mostly experimental) and literature citations
5. Since ~ EcoCyc and EcoliWiki are the source of the official E. coli gene-association file (in collaboration with J. Hu and D. Siegele, EcoliWiki, Texas A&M)
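
The ec2go step (item 2) can be sketched roughly as follows. This is a minimal illustration, not the code used for EcoCyc. It assumes the mapping file uses the usual external2go line shape (`EC:<number> > GO:<term name> ; GO:<id>`, with `!` comment lines), and it keeps only full four-part EC numbers, as the slide describes. The function name `parse_ec2go` and the sample lines are illustrative.

```python
import re

# Assumed line shape: "EC:1.1.1.1 > GO:alcohol dehydrogenase (NAD+) activity ; GO:0004022"
LINE_RE = re.compile(r"^EC:(?P<ec>\S+)\s+>\s+GO:.*;\s+(?P<go>GO:\d{7})\s*$")
# A "full" EC number has four numeric parts (no "-" placeholders).
FULL_EC_RE = re.compile(r"^\d+\.\d+\.\d+\.\d+$")


def parse_ec2go(lines):
    """Return {full EC number: [GO IDs]}, skipping comments and partial EC numbers."""
    mapping = {}
    for line in lines:
        if line.startswith("!"):  # comment/header line
            continue
        match = LINE_RE.match(line.strip())
        if match is None or not FULL_EC_RE.match(match.group("ec")):
            continue
        mapping.setdefault(match.group("ec"), []).append(match.group("go"))
    return mapping


# Illustrative sample lines (not taken from the presentation).
sample = [
    "! comment line",
    "EC:1.1.1.1 > GO:alcohol dehydrogenase (NAD+) activity ; GO:0004022",
    "EC:1.1.1.- > GO:oxidoreductase activity, acting on the CH-OH group of donors, NAD or NADP as acceptor ; GO:0016616",
]
ec2go = parse_ec2go(sample)
```

Annotations produced this way carry no evidence code or citation, which is the limitation the slide points out.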

6 Of Requirements and Differences
- Specific requirements for GO gene-association file
  - Presence of evidence codes and citations
  - Pathway Tools uses a different evidence code ontology; it is therefore necessary to map the evidence codes carefully
  - Some types of evidence require use of a With/From qualifier in GO, e.g. IPI, ISS
  - Annotation with other qualifiers is not required by GO (e.g. NOT, contributes_to, colocalizes_with) and is not (yet) supported by Pathway Tools
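
The evidence-code mapping and the With/From requirement could be checked along these lines. This is only a sketch: the mapping table is hypothetical (the actual correspondence used for EcoCyc is not given in the slides), the With/From set contains just the two codes the slide names, and `to_go_evidence` is an illustrative function name.

```python
# Hypothetical mapping from Pathway Tools-style evidence codes to GO evidence
# codes; the real mapping must be worked out carefully, as the slide notes.
PT_TO_GO_EVIDENCE = {
    "EV-EXP-IDA": "IDA",
    "EV-EXP-IMP": "IMP",
    "EV-EXP-IPI": "IPI",
    "EV-COMP-HINF": "ISS",
    "EV-COMP-AINF": "IEA",
}

# Codes the slide names as requiring With/From; other codes may need it too.
NEEDS_WITH_FROM = {"IPI", "ISS"}


def to_go_evidence(pt_code, with_from=""):
    """Map an evidence code and enforce the With/From requirement."""
    go_code = PT_TO_GO_EVIDENCE.get(pt_code)
    if go_code is None:
        raise ValueError(f"no GO evidence code mapped for {pt_code!r}")
    if go_code in NEEDS_WITH_FROM and not with_from:
        raise ValueError(f"{go_code} annotations need a With/From value")
    return go_code, with_from


# The With/From identifier below is a placeholder, not a real annotation.
print(to_go_evidence("EV-EXP-IDA"))
print(to_go_evidence("EV-EXP-IPI", "UniProtKB:EXAMPLE"))
```

Raising an error on an unmapped code, rather than guessing a default, keeps unmappable annotations out of the exported file until a curator has looked at them.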

7 Tools for the Curator
- GO classification editor is accessible via the protein editor
- GO database can be searched in the editor; term definitions are available
- Tools available locally (ask developers about general availability):
  - Import new GO database (for newly created terms etc.)
  - Export gene-association file
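
To make the export step concrete, here is a rough sketch of writing one gene-association line. It is not the Pathway Tools exporter. It assumes the tab-separated, 15-column GAF 1.0 layout, which should be checked against the GO Consortium's file-format documentation, and every value in the example record is a placeholder.

```python
# Column order assumed to follow the 15-column GAF 1.0 layout.
GAF_COLUMNS = [
    "db", "db_object_id", "db_object_symbol", "qualifier", "go_id",
    "db_reference", "evidence_code", "with_from", "aspect",
    "db_object_name", "db_object_synonym", "db_object_type",
    "taxon", "date", "assigned_by",
]

# The presentation names evidence codes and citations as required.
REQUIRED_FIELDS = ("evidence_code", "db_reference")


def gaf_line(annotation):
    """Format one annotation dict as a tab-separated gene-association line."""
    missing = [name for name in REQUIRED_FIELDS if not annotation.get(name)]
    if missing:
        raise ValueError("missing required fields: " + ", ".join(missing))
    return "\t".join(annotation.get(name, "") for name in GAF_COLUMNS)


# Placeholder record; identifiers and reference are not real annotations.
example = {
    "db": "EcoCyc",
    "db_object_id": "EXAMPLE-MONOMER",
    "db_object_symbol": "exaA",
    "go_id": "GO:0004022",
    "db_reference": "PMID:0000000",
    "evidence_code": "IDA",
    "aspect": "F",
    "db_object_type": "protein",
    "taxon": "taxon:83333",
    "date": "20090801",
    "assigned_by": "EcoCyc",
}
line = gaf_line(example)
print(line)
```

Optional columns that are absent from the record are written as empty fields so that the column count stays fixed.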

8 Manual Curation of GO terms
- Ongoing when we curate or re-curate gene products within EcoCyc
- No particular effort to back-fill GO terms; e.g. metabolic enzymes get experimental GO term assignments when we re-curate old metabolic pathways, or when new literature appears
- Texas A&M team is part of the Reference Genome Annotation Project; GO term assignments from EcoliWiki get imported into EcoCyc on a regular basis

9 GO Term Statistics for E. coli (8/2009)
- 3721 gene products annotated with at least one GO term
- total GO term annotations, of which there are 6330 non-IEA annotations
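
Counts of this kind can be derived from a gene-association file roughly as shown below. This is a sketch under assumptions: it expects tab-separated lines with the object ID in column 2 and the evidence code in column 7, and `!` comment lines. The sample rows are made up and do not reproduce the E. coli numbers.

```python
def go_annotation_stats(gaf_lines):
    """Count annotated gene products, total annotations and non-IEA annotations."""
    products = set()
    total = 0
    non_iea = 0
    for line in gaf_lines:
        if line.startswith("!") or not line.strip():
            continue  # skip header/comment and blank lines
        cols = line.rstrip("\n").split("\t")
        products.add(cols[1])      # DB object ID (assumed column 2)
        total += 1
        if cols[6] != "IEA":       # evidence code (assumed column 7)
            non_iea += 1
    return {"gene_products": len(products), "annotations": total, "non_iea": non_iea}


def make_row(obj_id, go_id, evidence):
    """Build a 15-column test row with only the fields used above filled in."""
    cols = [""] * 15
    cols[1], cols[4], cols[6] = obj_id, go_id, evidence
    return "\t".join(cols)


# Made-up rows for illustration only.
sample = [
    "!gaf-version: 1.0",
    make_row("P1", "GO:0004022", "IDA"),
    make_row("P1", "GO:0016616", "IEA"),
    make_row("P2", "GO:0004022", "IEA"),
]
stats = go_annotation_stats(sample)
print(stats)
```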

10 Acknowledgements
- Peter Karp
- Suzanne Paley
- Markus Krummenacker
- Tomer Altman
- Jim Hu
- Debby Siegele
- GO experts at the GO consortium