What's True For E. coli… Enlisting The Community In Ongoing Genome Annotation Jim Hu EcoliHub/EcoliWiki Texas A&M University.

Presentation transcript:

What's True For E. coli… Enlisting The Community In Ongoing Genome Annotation Jim Hu EcoliHub/EcoliWiki Texas A&M University

Why more E. coli websites?
The number of E. coli databases is large
Extensive coverage exists for many aspects of E. coli biology
Journals contain half a century of E. coli data
Don't we already know everything?
#(1-3) The problem isn't the amount of information, it's finding it
#4: No

Why more E. coli websites?
Part of what we don't know yet is how the things we do know fit together
Most of us need help mining what's out there
"The diversity of information on different genomes, proteins, phenotypes and so on makes it difficult to keep track of all details." Molecular Systems Biology 3:128 (2007)

Problems and approaches
Finding data from different resources
–EcoliHub - information from collaborating biological electronic data resources
Making data curation faster, cheaper, and better
–EcoliWiki - community annotation for E. coli K-12
Community functional curation for cross-species comparison
–GONUTS - a community Gene Ontology resource
1-2:30 today: Session 173/K Poster K-133, Board 0542 EcoliHub: Development of the Information Resource

Integrating information from multiple sites
EcoliHub is based on web services
A user query to EcoliHub is passed on to participating sites
EcoliHub gathers the responses and assembles output for the user

Integrating information from multiple sites
But the users won't have to start at the EcoliHub site
EcoliHub will provide the infrastructure to help member sites do peer-to-peer queries
–"Who has info?" "Try EcoCyc and RegulonDB"
The users don't need to know or care about the EcoliHub
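
The fan-out described on these slides (one query goes to every participating site, then the responses are gathered and assembled) can be sketched roughly as below. This is only an illustration, not EcoliHub's actual code: the function names, the record fields, and the "Offline" site are all made up here, and the member sites are replaced by local stub functions so the sketch runs without a network.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for member-site web services. Real sites would be
# reached over the network; each stub just returns placeholder records.
def ecocyc_stub(gene):
    return [{"gene": gene, "summary": "placeholder pathway record"}]

def regulondb_stub(gene):
    return [{"gene": gene, "summary": "placeholder operon record"}]

def offline_stub(gene):
    # Simulates a participating site that is down.
    raise ConnectionError("site unavailable")

def hub_query(gene, sites):
    """Pass one query to every site and assemble the responses by site name."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max(1, len(sites))) as pool:
        # Submit all queries first so the sites are asked in parallel.
        futures = {name: pool.submit(fn, gene) for name, fn in sites.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result(timeout=5)
            except Exception as exc:
                # One failing site should not sink the whole answer.
                errors[name] = str(exc)
    return results, errors

def who_has_info(gene, sites):
    """Answer a peer's 'who has info?' with the sites that returned records."""
    results, _ = hub_query(gene, sites)
    return sorted(name for name, records in results.items() if records)

# Assumed registry of participating sites (names illustrative only).
SITES = {"EcoCyc": ecocyc_stub, "RegulonDB": regulondb_stub, "Offline": offline_stub}
```

Calling `who_has_info("lacZ", SITES)` returns the names of the sites that answered, which is the kind of reply a member site could use to point its own users elsewhere without sending them through the hub.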

What kinds of nodes are connected to EcoliHub?
So far:
–EcoCyc: everything E. coli; professionally curated
–EcoGene*: everything E. coli; professionally curated
–GenoBase: functional genomics and resources
–EcoliPredict: protein structure models
–OU GenExpDB: transcriptomes, experimental data
–RegulonDB*: operons and regulons
–EcoliWiki: everything E. coli; community curated
–GONUTS: community curation of the Gene Ontology; not just E. coli
More coming…

The need for Annotation is growing

“What is true of Escherichia coli is true of the elephant” - Jacques Monod
“Thanks to annotation creep, what’s false for E. coli is false for the elephant too” - Jim Hu

People are limiting for annotation
Major MODs (EcoCyc, SGD, WormBase, FlyBase, MGI, ZFIN, TAIR etc.) employ large numbers of PhD-level curators
This model is problematic for the future of biocuration, and not just for E. coli
–Curators are expensive: NIH and NSF cannot afford to staff every organism at this level
–Broad expertise across all areas is hard: curators have to read papers in areas they were not trained in, and may not recognize the significance of those papers
Can we make it:
–cheaper?
–faster?
–better?

The Wikipedia approach
Get your user community to work for free!
Many groups have tried community annotation, with mixed success (at best)
Wikipedia has added more than a million articles in English since I made the first version of this slide!

EcoliWiki or.net or.com or come from EcoliHub

EcoliWiki philosophy
Any registered user can edit
Any registered user can register new users
Any registered user can create new pages
It's easier to revise than to create new content
–Seed content from other sites, mostly EcoCyc

But won't that invite chaos? GenBank's managers are dead set against letting users into GenBank's files, however. They say there already are procedures to deal with errors in the database, and researchers themselves have created secondary databases that improve on what GenBank has to offer. "That we would wholesale start changing people's records goes against our idea of an archive," says David Lipman, director of the National Center for Biotechnology Information (NCBI), GenBank's home in Bethesda, Maryland. "It would be chaos."

Correct compared to what? NCBI RefSeq: Wikipedia:

This is how biology achieves fidelity A collage of books I haven’t read

Biology Wikis are proliferating

Participation is the major challenge
Anyone can edit ≠ Anyone will edit
Wikipedia: a tiny fraction of the users edit anything
–A tiny fraction of those do major editing
–Really big denominator
Outreach to increase our user base

Participation is the major challenge Tools to make it easier to edit

Participation is the major challenge
Biggest difference from other systems:
–Partial annotations are wanted
–It doesn't matter if you don't know the wiki markup
–It doesn't matter if what you're adding isn't fully worked out
Someone else can fix it
And you can fix what others write

Community annotation for everyone What if I don't work on E. coli? Community annotation of gene function via the Gene Ontology Gene Ontology Normal Usage Tracking System (GONUTS)

Community annotation for everyone Annotation pages based on UniProt IDs

The future of EcoliHub and EcoliWiki
Making the resource more useful to the community
–incorporating more resources
–providing integration workflows
–teaching users how to use them
–adding content people want
Making the approach available to other biology communities
–reusable open source tools
–public web services
E. COLI 2008
don't forget the acknowledgements!

Thanks to
EcoliWiki/GONUTS Team
–Chris Elsik
–Gwen Knapp
–Debby Siegele
–Daniel Renfro
–Jerry Tsai
–Xiaotao Qu
–Rosemarie Swanson
–Anand Venkatraman
–Adrienne Zweifel
Sabbatical hosts
–SGD/Stanford
–Stein Lab/CSHL
GO consortium
EcoliHub Team Leaders
–Barry Wanner, PI, Purdue
–Walid Aref, co-PI, Purdue
–Tyrell Conway, co-PI, Oklahoma
–Mike Gribskov, co-PI, Purdue
–Peter Karp, co-PI, SRI
–Daisuke Kihara, co-PI, Purdue
Funding NIH U24-GM
URLs: