Published by Britney Syms. Modified over 9 years ago.
1
The GMOD Project
Lincoln Stein
Cold Spring Harbor Laboratory
2
Test Subject: Michael Caudy
- Drosophila neurobiologist
- Proneural differentiation
- notch pathway
- HLH transcriptional activators/repressors
- achaete/scute complex
- No computer science training
- Took my “bioinformatics for biologists” course
3
“Simple” Problem
- Discover the transcription factor binding site code controlling proneural differentiation.
4
Regular Expression Search
- Using the achaete promoter as an exemplar, search for combinations of known binding sites in particular architectures
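As a rough illustration of this kind of search, the sketch below looks for one binding site followed by a second within a bounded distance. The two consensus motifs (an E-box, CANNTG, and an N-box, CACNAG), the spacing limits, and the function name are assumptions chosen for the example; they are not the patterns used in the actual analysis.

```python
import re

# Illustrative consensus motifs (assumptions, not taken from the slides):
# an E-box (CANNTG) and an N-box (CACNAG), where N is any base.
E_BOX = "CA[ACGT]{2}TG"
N_BOX = "CAC[ACGT]AG"


def find_architecture(seq, first=E_BOX, second=N_BOX, min_gap=0, max_gap=20):
    """Find every place where `first` is followed by `second` with
    min_gap..max_gap bases in between.

    Returns a list of (start, end, matched_text) tuples. A lookahead is
    used so that overlapping hits are all reported; the lazy gap keeps
    the shortest spacing for each start position.
    """
    pattern = re.compile(
        f"(?=({first}[ACGT]{{{min_gap},{max_gap}}}?{second}))"
    )
    seq = seq.upper()
    return [(m.start(1), m.end(1), m.group(1)) for m in pattern.finditer(seq)]


# Made-up promoter fragment containing one E-box, a 3-base gap, one N-box.
hits = find_architecture("TTTCAGCTGAAACACGAGTTT")
```

Changing `min_gap` and `max_gap`, or swapping the order of the motifs, is one way to describe a different "architecture" without writing a new pattern by hand.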
5
Mike’s Got Lots of Data
- 90-11,000 TF binding site clusters
- 100s-1000s of genes
- millions of interactions
- Which genes are involved in neural differentiation?
- Which have interactions with the pathway?
- Which have suggestive mutant phenotypes?
6
Mike Needs a Database
- Database management system for proneural differentiation genes.
- Visualization/exploration tools for relationship of genes to putative TF clusters.
- Literature citations
- Link out to FlyBase, Genbank & other DBs.
- Add notes and other annotations.
7
Try to do it with Filemaker
- “Cluster-centric” vs “gene-centric”?
- Data import from FlyBase?
- Storing images?
- Maintaining relationships between genes & clusters?
- Updates?
8
Mike Needs a MOD
- Model Organism Database
- Repository for reagents
  - Stocks, vectors, clones
- Genetic & physical maps
- Large-scale data sets
  - Genome
  - EST sets, microarray results, 2-cell hybrid interactions
- Literature
- Ontologies & Nomenclature
- Meetings, announcements
9
Example MOD: WormBase
10
Looking for Sex
11
An Author Entry
12
Bibliography
13
Citation
14
Gene
15
Genome
16
Proteome
17
Comparative Genomics
18
Functional Genomics
19
Anatomy
20
How WormBase Works
[Diagram labels: ACeDB; MySQL; Genomic Data; Images, Movies; Database access library; Perl scripts; Web server; You]
21
Can Mike reuse WormBase to manage his data? No!
22
Sorry Mike
- WormBase website difficult to install
- Data model nematode-centric
- Data entry tools very process-specific
- Customization difficult
- Software documentation uneven
- Standard operating procedure documentation uneven
23
MOD Redux
- SGD, MGD, FlyBase, TAIR, RGD…
- The same basic idea as WormBase
- Implementation entirely different
- Wheel reinvented many times
- Little software sharing
- This madness must stop!
24
The GMOD Project
- Portable, open source software to support model organism databases
- Multiple MODs involved
  - Worm, fly, yeast, mouse, arabidopsis, rat, monocot, [fugu], [E. coli]
- Funded by NIH as of June 2002
- Programmers, coordinator, quarterly meetings
http://www.gmod.org
25
GMOD Home Page
26
The GMOD Pyramid
[Pyramid diagram layers: Open Source DBMS & Middleware; Modular Schema; Modular Applications]
27
A MOD Construction Set
[Layered diagram]
- Database Layer: genome, genetic maps, literature
- Middleware Layer: genomes, maps, citations; Bioperl, BioJava, BioPython
- Application Layer: genome browser, genome editor, map browser, map editor, citation browser, citation editor, annotation pipeline
28
Chado – Modular Schema
- Common schema for use by FlyBase and WormBase
- Ontology Driven
- Small number of generic tables e.g. “feature”
- Controlled vocabulary names object types and relationships among them:
  - “achaete protein is a HLH activator”
  - “m8 protein inhibits achaete transcription”
- Evidence-Savvy
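The idea of a few generic tables typed by a controlled vocabulary can be sketched with an in-memory SQLite database. The table names below echo Chado's, but the columns are heavily reduced and the helper functions, the "gene" and "protein" type terms, and the "inhibits transcription of" relation are assumptions made for illustration; this is not the actual Chado DDL.

```python
import sqlite3


def build_demo_db():
    """Create a tiny, simplified schema: one vocabulary table, one generic
    feature table, and one table of typed relationships between features."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE cvterm (
            cvterm_id INTEGER PRIMARY KEY,
            name      TEXT UNIQUE NOT NULL
        );
        CREATE TABLE feature (
            feature_id INTEGER PRIMARY KEY,
            name       TEXT UNIQUE NOT NULL,
            type_id    INTEGER NOT NULL REFERENCES cvterm(cvterm_id)
        );
        CREATE TABLE feature_relationship (
            subject_id INTEGER NOT NULL REFERENCES feature(feature_id),
            type_id    INTEGER NOT NULL REFERENCES cvterm(cvterm_id),
            object_id  INTEGER NOT NULL REFERENCES feature(feature_id)
        );
    """)
    return conn


def term_id(conn, name):
    """Return the id of a vocabulary term, creating it on first use."""
    conn.execute("INSERT OR IGNORE INTO cvterm (name) VALUES (?)", (name,))
    row = conn.execute(
        "SELECT cvterm_id FROM cvterm WHERE name = ?", (name,)
    ).fetchone()
    return row[0]


def add_feature(conn, name, type_name):
    """Insert a feature whose type is a vocabulary term, not a new table."""
    cur = conn.execute(
        "INSERT INTO feature (name, type_id) VALUES (?, ?)",
        (name, term_id(conn, type_name)),
    )
    return cur.lastrowid


def relate(conn, subject_id, relation, object_id):
    """Record 'subject <relation> object', with the relation as a term."""
    conn.execute(
        "INSERT INTO feature_relationship VALUES (?, ?, ?)",
        (subject_id, term_id(conn, relation), object_id),
    )


def describe(conn):
    """Render every stored relationship as a plain sentence."""
    rows = conn.execute("""
        SELECT s.name, t.name, o.name
        FROM feature_relationship r
        JOIN feature s ON s.feature_id = r.subject_id
        JOIN cvterm  t ON t.cvterm_id  = r.type_id
        JOIN feature o ON o.feature_id = r.object_id
        ORDER BY s.name
    """)
    return [" ".join(row) for row in rows]


conn = build_demo_db()
achaete_gene = add_feature(conn, "achaete gene", "gene")
achaete_protein = add_feature(conn, "achaete protein", "HLH activator")
m8_protein = add_feature(conn, "m8 protein", "protein")
relate(conn, m8_protein, "inhibits transcription of", achaete_gene)
```

Adding a new kind of object or relationship here means adding a row to `cvterm`, not altering the schema, which is the property the slide attributes to an ontology-driven design.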
29
GMOD Applications
- Apollo genome annotation editor
- GBrowse generic genome browser
- PubSearch literature curation editor
- CMap comparative map browser
- IMD insertional mutagenesis database management system
30
Apollo – BDGP & Sanger Center
31
Apollo Data adapters
- Parser -> data models -> display
- Existing data adapters
  - GAME XML
  - GFF
  - Ensembl CGI server
  - DAS
- Write your own data adapter!
  - Extend AbstractDataAdapter class
- Display options defined in config file
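The parser -> data models -> display flow can be sketched as follows. Apollo itself is a Java application, so this Python version only mirrors the pattern: the `DataAdapter` base class, the `Feature` model, the `CsvAdapter`, and the comma-separated input format are all hypothetical and are not Apollo's API.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass
class Feature:
    """Minimal data model shared by every adapter and by the display."""
    seq_id: str
    name: str
    start: int
    end: int


class DataAdapter(ABC):
    """Hypothetical base class: each subclass parses one input format."""

    @abstractmethod
    def parse(self, text: str) -> List[Feature]:
        ...


class CsvAdapter(DataAdapter):
    """Adapter for an invented 'seq_id,name,start,end' text format."""

    def parse(self, text: str) -> List[Feature]:
        features = []
        for line in text.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            seq_id, name, start, end = [c.strip() for c in line.split(",")]
            features.append(Feature(seq_id, name, int(start), int(end)))
        return features


def display(features: List[Feature]) -> List[str]:
    """Stand-in for the display layer: it sees only the data model."""
    return [f"{f.seq_id}:{f.start}..{f.end} {f.name}" for f in features]


def load_and_display(adapter: DataAdapter, text: str) -> List[str]:
    """Parser -> data models -> display, with the adapter swappable."""
    return display(adapter.parse(text))
```

Because the display code depends only on `Feature`, supporting another format in this sketch means writing one more subclass with its own `parse` method.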
32
Who is Using Apollo?
- BDGP
  - Reannotated Drosophila genome
- Bristol-Myers Squibb
  - Launching Apollo from web browser via mime types
- GNF
  - JDBC adapter layer over BioSQL
- Biogen
  - View human genome alignment between public and Biogen internal database
  - Connected BLAT pipeline to Apollo
- HGMP-RC Fugu Genomics group
  - Displaying annotations on fugu scaffolds
33
PubSearch – TAIR & RatDB
34
PubSearch – Gene Association
35
IMD – Insertional Mutagenesis Db
36
CMap – Gramene
37
CMap – Detailed View
38
GBrowse – WormBase
39
GBrowse – Zoomed in
40
GBrowse – Zoomed Way In
41
GBrowse – Zoomed Way Way In
42
GBrowse – Keyword Search
43
GBrowse – Third Party Annotations
44
Sequence dumps & other reports
45
Extensively Customizable
- End-user
  - Turn tracks on and off, change order, change packing & labeling attributes (stored in cookie)
- Data provider
  - Change fonts, colors, text.
  - Change overview – genetic map, contigs, coverage, karyotype.
  - Define new tracks using simple config file.
  - Tinker with track appearance to heart’s content.
46
Adding a New Track
(a) Create a GFF file named “deletions.gff”
    Chr1  targeted  deletion  1293224  1294901  .  .  .  Deletion d101k2
    Chr1  targeted  deletion  8239811  8241116  .  .  .  Deletion d680k2
    Chr2  targeted  deletion  5866382  5866500  .  .  .  Deletion d007k2
(b) Run the load_gff.pl script
    > load_gff.pl -d example_database deletions.gff
    Loading features…
    Done. 3 features loaded.
(c) Add a new track “stanza” to the gbrowse configuration file
    [Knockout]
    feature  = deletion
    glyph    = span
    fgcolor  = red
    key      = Knockouts
    link     = http://example.org/cgi-bin/knockout_details?$name
    citation = These are deletion knockouts produced by the example knockout consortium (http://example.org/knockouts.html)
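When the deletions already live in a spreadsheet or script, the GFF file from step (a) could be generated rather than typed. The sketch below assumes the nine tab-separated GFF columns, with the score, strand, and phase columns left as "." placeholders; the function names are made up for this example.

```python
def deletion_to_gff(ref, start, end, name, source="targeted", method="deletion"):
    """Format one deletion as a nine-column, tab-separated GFF line.

    Score, strand and phase are written as '.' (not applicable), and the
    last column groups the feature as 'Deletion <name>'.
    """
    if start > end:
        raise ValueError(f"start {start} is after end {end} for {name}")
    columns = [ref, source, method, str(start), str(end),
               ".", ".", ".", f"Deletion {name}"]
    return "\t".join(columns)


def render_gff(deletions):
    """Join (ref, start, end, name) tuples into the text of a GFF file."""
    return "\n".join(deletion_to_gff(*d) for d in deletions) + "\n"


# The three deletions shown on the slide.
DELETIONS = [
    ("Chr1", 1293224, 1294901, "d101k2"),
    ("Chr1", 8239811, 8241116, "d680k2"),
    ("Chr2", 5866382, 5866500, "d007k2"),
]

gff_text = render_gff(DELETIONS)
```

Writing `gff_text` to a file named `deletions.gff` would then give the input expected by step (b).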
47
Extensively Extensible
[Architecture diagram labels: Apache Web Server; gbrowse CGI script; Plugins; BioPerl library; Bio::Graphics library; Glyphs; Bio::DB::GFF adaptor; Chado adaptor; Oracle adaptor; Flat File adaptor; MySQL/Postgres; Oracle; Flat Files]
48
GBrowse on GenBank?
[Architecture diagram labels: Apache Web Server; gbrowse CGI script; Plugins; BioPerl library; Bio::Graphics library; Glyphs; GenBank Proxy Adaptor; GenBank; Bio::DB::GFF adaptor; MySQL]
GBrowse on GenBank!
49
B. burgdorferi via GenBank proxy
50
Who is Using GBrowse?
- GMOD Members
  - WormBase, FlyBase, RatDB
- HGMP-RC Fugu genomics group
- KEGG (multiple microorganisms)
- Ingenium AG (mouse)
- Bristol-Myers Squibb (drosophila)
- Texas A&M University (salmonella)
- McGill University (human chr7)
- Institute of Systems Biology (human)
51
Genome Knowledgebase (GK)
52
“Constellation View” (in dev)
[Diagram labels: TCA Cycle; Oxidative Decarboxylation; Amino Acid Biosynthesis; Ethanol Catabolism; Glucose Metabolism; RNA Splicing; DNA Replication]
53
“Constellation View” (in dev)
[Diagram labels: TCA Cycle; Oxidative Decarboxylation; Amino Acid Biosynthesis; Ethanol Catabolism; Glucose Metabolism; RNA Splicing; DNA Replication]
54
Can Mike use GMOD to manage his data? Almost
55
Mike’s very own flybase
56
Uploaded Annotations
57
Details
58
Essential Pieces in Progress
- Generic MOD web site
- Strain & phenotype curation tools
- Pathway tools and browsers
- Tree (e.g. phylogenetic) tools & browsers
- Biopipe – genome annotation pipeline
59
Find out more about GMOD
- Go to www.gmod.org
- Examine software matrix
- Find a project you’re interested in
- Contact project leader
- Or contact Scott Cain: cain@cshl.org
- Or mail gmod-dev@lists.sourceforge.net
60
Credits
- CSHL: Adrian Arva, Shuly Avraham, Scott Cain, Ken Clark, Allen Day, Xiaokang Pan
- BDGP: Nomi Harris, Suzanna Lewis, Chris Mungall, John Richter, ShengQiang Shu, Colin Weil
- EBI: Michele Clamp, Stephen Searle
- Carnegie Institute: Sue Rhee, Danny Yoo
- Harvard: David Emmert, Stan Letovsky
- Cornell Medical School: Michael Caudy
http://www.gmod.org