The GMOD Project
Lincoln Stein, Cold Spring Harbor Laboratory

Test Subject: Michael Caudy
o Drosophila neurobiologist
o Proneural differentiation
o Notch pathway
o HLH transcriptional activators/repressors
o achaete/scute complex
o No computer science training
o Took my “bioinformatics for biologists” course

“Simple” Problem
o Discover the transcription factor binding site code controlling proneural differentiation.

Regular Expression Search
o Using the achaete promoter as an exemplar, search for combinations of known binding sites in particular architectures
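
As a rough illustration of this kind of search, the Python sketch below scans a sequence for clusters of E-box motifs (CANNTG, the consensus bound by HLH proteins). The slides do not give the actual site combinations or architectures used, so the toy sequence, the window size, the minimum site count, and the helper name `find_site_clusters` are all assumptions made for the example.

```python
import re

# E-box consensus bound by HLH proteins: CANNTG (N = any base).
# The lookahead lets finditer report overlapping matches.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")
SITE_LEN = 6


def find_site_clusters(seq, window=30, min_sites=2):
    """Return (start, end, n_sites) for runs of closely spaced sites.

    Hypothetical helper: a cluster is a maximal run of sites in which each
    site starts no more than `window` bp after the previous one.
    `end` is exclusive.
    """
    starts = [m.start() for m in EBOX.finditer(seq.upper())]
    clusters, run = [], []
    for pos in starts:
        if run and pos - run[-1] > window:
            if len(run) >= min_sites:
                clusters.append((run[0], run[-1] + SITE_LEN, len(run)))
            run = []
        run.append(pos)
    if len(run) >= min_sites:
        clusters.append((run[0], run[-1] + SITE_LEN, len(run)))
    return clusters


# Toy sequence, not real promoter data: two nearby E-boxes, then a lone one.
toy = "TTT" + "CAGCTG" + "AAAA" + "CACGTG" + "T" * 40 + "CATATG" + "TT"
print(find_site_clusters(toy))  # [(3, 19, 2)]
```

The lone E-box near the end is dropped because it has no neighbor within the window. A stricter "architecture" (site order, spacing, orientation) could be expressed by changing the pattern or the grouping rule.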

Mike’s Got Lots of Data
o 90-11,000 TF binding site clusters
o 100s-1000s of genes
o millions of interactions
o Which genes are involved in neural differentiation?
o Which have interactions with the pathway?
o Which have suggestive mutant phenotypes?

Mike Needs a Database
o Database management system for proneural differentiation genes.
o Visualization/exploration tools for relationship of genes to putative TF clusters.
o Literature citations
o Link out to FlyBase, GenBank & other DBs.
o Add notes and other annotations.

Try to do it with FileMaker
o “Cluster-centric” vs “gene-centric”?
o Data import from FlyBase?
o Storing images?
o Maintaining relationships between genes & clusters?
o Updates?

Mike Needs a MOD
o Model Organism Database
o Repository for reagents
  o Stocks, vectors, clones
o Genetic & physical maps
o Large-scale data sets
  o Genome
  o EST sets, microarray results, two-hybrid interactions
o Literature
o Ontologies & Nomenclature
o Meetings, announcements

Example MOD: WormBase

Looking for Sex

An Author Entry

Bibliography

Citation

Gene

Genome

Proteome

Comparative Genomics

Functional Genomics

Anatomy

How WormBase Works
[Architecture diagram. Components: You; Web server; Perl scripts; Database access library; ACeDB; MySQL; Genomic Data; Images, Movies]

Can Mike reuse WormBase to manage his data? No!

Sorry Mike
o WormBase website difficult to install
o Data model nematode-centric
o Data entry tools very process-specific
o Customization difficult
o Software documentation uneven
o Standard operating procedure documentation uneven

MOD Redux
o SGD, MGD, FlyBase, TAIR, RGD…
o The same basic idea as WormBase
o Implementation entirely different
o Wheel reinvented many times
o Little software sharing
o This madness must stop!

The GMOD Project
o Portable, open source software to support model organism databases
o Multiple MODs involved
  o Worm, fly, yeast, mouse, Arabidopsis, rat, monocot, [fugu], [E. coli]
o Funded by NIH as of June 2002
  o Programmers, coordinator, quarterly meetings

GMOD Home Page

The GMOD Pyramid
[Pyramid diagram. Layers as listed: Open Source DBMS & Middleware; Modular Schema; Modular Applications]

A MOD Construction Set
[Diagram of interchangeable components in three layers]
Application Layer: genome browser, genome editor, map browser, map editor, citation browser, citation editor, annotation pipeline
Middleware Layer: Bioperl, BioJava, BioPython
Database Layer: genomes, genetic maps, literature (citations)

Chado – Modular Schema
o Common schema for use by FlyBase and WormBase
o Ontology Driven
  o Small number of generic tables, e.g. “feature”
  o Controlled vocabulary names object types and relationships among them:
    o “achaete protein is a HLH activator”
    o “m8 protein inhibits achaete transcription”
o Evidence-Savvy
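
The idea of a few generic tables typed by a controlled vocabulary can be sketched with an in-memory SQLite database. This is a heavily simplified illustration, not the Chado schema itself: the table names `cvterm`, `feature`, and `feature_relationship` follow Chado, but the columns are cut down to the minimum, and the relationship term `inhibits_transcription_of` and the helper functions are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Controlled vocabulary: names for object types and relationship types.
CREATE TABLE cvterm (
    cvterm_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE
);
-- One generic table for every kind of biological object.
CREATE TABLE feature (
    feature_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    type_id    INTEGER NOT NULL REFERENCES cvterm(cvterm_id)
);
-- Typed links between features (subject, relationship, object).
CREATE TABLE feature_relationship (
    subject_id INTEGER NOT NULL REFERENCES feature(feature_id),
    type_id    INTEGER NOT NULL REFERENCES cvterm(cvterm_id),
    object_id  INTEGER NOT NULL REFERENCES feature(feature_id)
);
""")


def term(name):
    """Return the id of a vocabulary term, creating it if needed."""
    conn.execute("INSERT OR IGNORE INTO cvterm (name) VALUES (?)", (name,))
    row = conn.execute(
        "SELECT cvterm_id FROM cvterm WHERE name = ?", (name,)
    ).fetchone()
    return row[0]


def add_feature(name, type_name):
    """Insert a feature whose type is a vocabulary term."""
    cur = conn.execute(
        "INSERT INTO feature (name, type_id) VALUES (?, ?)",
        (name, term(type_name)),
    )
    return cur.lastrowid


# "achaete protein is a HLH activator": the type is just a vocabulary term.
add_feature("achaete protein", "HLH activator")
# "m8 protein inhibits achaete transcription": a typed relationship.
achaete = add_feature("achaete", "gene")
m8 = add_feature("m8 protein", "protein")
conn.execute(
    "INSERT INTO feature_relationship VALUES (?, ?, ?)",
    (m8, term("inhibits_transcription_of"), achaete),
)

rows = conn.execute("""
    SELECT s.name, rel.name, o.name
    FROM feature_relationship fr
    JOIN feature s  ON s.feature_id  = fr.subject_id
    JOIN cvterm rel ON rel.cvterm_id = fr.type_id
    JOIN feature o  ON o.feature_id  = fr.object_id
""").fetchall()
print(rows)
```

Adding a new kind of object or relationship in this design means adding a vocabulary term, not a new table.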

GMOD Applications
o Apollo genome annotation editor
o GBrowse generic genome browser
o PubSearch literature curation editor
o CMap comparative map browser
o IMD insertional mutagenesis database management system

Apollo – BDGP & Sanger Center

Apollo Data Adapters
o Parser -> data models -> display
o Existing data adapters
  o GAME XML
  o GFF
  o Ensembl CGI server
  o DAS
o Write your own data adapter!
  o Extend AbstractDataAdapter class
o Display options defined in config file
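
The parser -> data model -> display flow can be sketched as below. Apollo itself is a Java application, and this is not its API: the sketch only illustrates the adapter pattern in Python, and every name in it (`Feature`, `DataAdapter`, `TabAdapter`, `display`) and the four-column input format are invented for the example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Feature:
    """Format-independent data model shared by all adapters."""
    seqid: str
    type: str
    start: int
    end: int


class DataAdapter(ABC):
    """Base class: each input format supplies its own parser."""

    @abstractmethod
    def read(self, text):
        """Parse `text` and return a list of Feature objects."""


class TabAdapter(DataAdapter):
    """Adapter for a made-up format: seqid, type, start, end (tab-separated)."""

    def read(self, text):
        features = []
        for line in text.splitlines():
            if not line.strip() or line.startswith("#"):
                continue  # skip blank lines and comments
            seqid, ftype, start, end = line.split("\t")
            features.append(Feature(seqid, ftype, int(start), int(end)))
        return features


def display(features):
    """Stand-in for the display layer: it only sees the data model."""
    return [f"{f.seqid}:{f.start}..{f.end} {f.type}" for f in features]


text = "# toy input\nscaffold_1\texon\t100\t250\n"
print(display(TabAdapter().read(text)))  # ['scaffold_1:100..250 exon']
```

Supporting a new format then means writing one more subclass; the display code does not change.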

Who is Using Apollo?
o BDGP
  o Reannotated Drosophila genome
o Bristol-Myers Squibb
  o Launching Apollo from web browser via MIME types
o GNF
  o JDBC adapter layer over BioSQL
o Biogen
  o View human genome alignment between public and Biogen internal database
  o Connected BLAT pipeline to Apollo
o HGMP-RC Fugu Genomics group
  o Displaying annotations on fugu scaffolds

PubSearch – TAIR & RatDB

PubSearch – Gene Association

IMD – Insertional Mutagenesis Db

CMap – Gramene

CMap – Detailed View

GBrowse – WormBase

GBrowse – Zoomed in

GBrowse – Zoomed Way In

GBrowse – Zoomed Way Way In

GBrowse – Keyword Search

GBrowse – Third Party Annotations

Sequence dumps & other reports

Extensively Customizable
o End-user
  o Turn tracks on and off, change order, change packing & labeling attributes (stored in cookie)
o Data provider
  o Change fonts, colors, text.
  o Change overview – genetic map, contigs, coverage, karyotype.
  o Define new tracks using simple config file.
  o Tinker with track appearance to heart’s content.

Adding a New Track
(a) Create a GFF file named “deletions.gff”
    Chr1 targeted deletion Deletion d101k2
    Chr1 targeted deletion Deletion d680k2
    Chr2 targeted deletion Deletion d007k2
(b) Run the load_gff.pl script
    > load_gff.pl -d example_database deletions.gff
    Loading features…
    Done. 3 features loaded.
(c) Add a new track “stanza” to the gbrowse configuration file
    [Knockout]
    feature = deletion
    glyph = span
    fgcolor = red
    key = Knockouts
    link =
    citation = These are deletion knockouts produced by the example knockout consortium (
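
The GFF lines in the transcript have lost their numeric columns. The Python sketch below shows how a complete nine-column line of this kind could be assembled, assuming the usual GFF2 layout (sequence, source, method, start, end, score, strand, frame, group). The coordinates are placeholders, not values from the talk, and `gff2_line` is a hypothetical helper.

```python
def gff2_line(seqid, source, method, start, end, group_class, group_name,
              score=".", strand=".", frame="."):
    """Build one tab-separated, nine-column GFF2 line (hypothetical helper)."""
    if start > end:
        raise ValueError("start must not exceed end")
    fields = [
        seqid, source, method,
        str(start), str(end),
        score, strand, frame,
        f"{group_class} {group_name}",
    ]
    return "\t".join(fields)


# Placeholder coordinates: the real ones are not in the transcript.
deletions = [
    ("Chr1", 1000, 2000, "d101k2"),
    ("Chr1", 5000, 6500, "d680k2"),
    ("Chr2", 300, 900, "d007k2"),
]

lines = [
    gff2_line(seqid, "targeted", "deletion", start, end, "Deletion", name)
    for seqid, start, end, name in deletions
]
print("\n".join(lines))
```

The third column (`deletion`) is the value that the stanza's `feature = deletion` line is expected to match, which is what ties the loaded data to the new track.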

Extensively Extensible
[Architecture diagram. Components:]
o Apache Web Server
o gbrowse CGI script
o Plugins
o BioPerl library
o Bio::Graphics library, Glyphs
o Data adaptors: Bio::DB::GFF adaptor, Chado adaptor, Oracle adaptor, Flat File adaptor
o Data sources: MySQL/Postgres, Oracle, Flat Files

GBrowse on GenBank?
[Architecture diagram. Components: Apache Web Server; gbrowse CGI script; Plugins; BioPerl library; Bio::Graphics library; Glyphs; GenBank Proxy Adaptor; GenBank; Bio::DB::GFF adaptor; MySQL]
GBrowse on GenBank!
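
One plausible reading of the proxy adaptor in this diagram is a layer that fetches a record from the remote source the first time it is requested and then serves range queries from a local store. The Python sketch below illustrates that idea only. It is not GBrowse or BioPerl code: `MemoryAdaptor`, `ProxyAdaptor`, the `fake_fetch` function, and the accession `ACC0001` are all invented, and no network access is performed.

```python
class MemoryAdaptor:
    """Local feature store answering range queries (stand-in for a database)."""

    def __init__(self):
        self._store = {}

    def has(self, seqid):
        return seqid in self._store

    def load(self, seqid, features):
        self._store[seqid] = list(features)

    def features(self, seqid, start, end):
        # Keep features that overlap the closed interval [start, end].
        return [f for f in self._store.get(seqid, [])
                if f[1] <= end and f[2] >= start]


class ProxyAdaptor:
    """Fetch a sequence's features on first use, then answer locally."""

    def __init__(self, local, fetch):
        self.local = local
        self.fetch = fetch  # callable: seqid -> list of (type, start, end)

    def features(self, seqid, start, end):
        if not self.local.has(seqid):
            self.local.load(seqid, self.fetch(seqid))
        return self.local.features(seqid, start, end)


calls = []


def fake_fetch(seqid):
    """Pretend remote lookup that records how often it is called."""
    calls.append(seqid)
    return [("gene", 100, 500), ("gene", 900, 1200)]


proxy = ProxyAdaptor(MemoryAdaptor(), fake_fetch)
first = proxy.features("ACC0001", 1, 600)
second = proxy.features("ACC0001", 800, 1000)
print(first, second, calls)
```

Because the browser only talks to the adaptor interface, swapping the local database for a proxy like this requires no change to the display code.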

B. burgdorferi via GenBank proxy

Who is Using GBrowse?
o GMOD Members
  o WormBase, FlyBase, RatDB
o HGMP-RC Fugu genomics group
o KEGG (multiple microorganisms)
o Ingenium AG (mouse)
o Bristol-Myers Squibb (Drosophila)
o Texas A&M University (Salmonella)
o McGill University (human chr7)
o Institute for Systems Biology (human)

Genome Knowledgebase (GK)

“Constellation View” (in dev)
[Diagram of pathway clusters: TCA Cycle; Oxidative Decarboxylation; Amino Acid Biosynthesis; Ethanol Catabolism; Glucose Metabolism; RNA Splicing; DNA Replication]

Can Mike use GMOD to manage his data? Almost

Mike’s very own flybase

Uploaded Annotations

Details

Essential Pieces in Progress
o Generic MOD web site
o Strain & phenotype curation tools
o Pathway tools and browsers
o Tree (e.g. phylogenetic) tools & browsers
o Biopipe – genome annotation pipeline

Find out more about GMOD
o Go to
o Examine software matrix
o Find a project you’re interested in
o Contact project leader
o Or contact Scott Cain:
o Or mail

Credits
CSHL: Adrian Arva, Shuly Avraham, Scott Cain, Ken Clark, Allen Day, Xiaokang Pan
BDGP: Nomi Harris, Suzanna Lewis, Chris Mungall, John Richter, ShengQiang Shu, Colin Weil
EBI: Michele Clamp, Stephen Searle
Carnegie Institute: Sue Rhee, Danny Yoo
Harvard: David Emmert, Stan Letovsky
Cornell Medical School: Michael Caudy