Genome Database Comparative Genomics Phylogenomics Variation GrameneMart (BioMart) Discovery Environment Josh Stein Cold Spring Harbor Laboratory
Exploring Plant Genomes Browse Search Upload personal data Analysis tools
Gramene’s Key Strengths Comparative genomics – Complete reference genomes for 11 plant species including A. thaliana & A. lyrata – Whole genome alignments – Phylogenetic gene trees Ability to upload and share data Data mining using Gramene Mart Extensive variation data sets for Arabidopsis Integration with Pathways databases
Quick entry points
Browser tracks Whole genome alignments Synteny views Location-based variation
Gene sequence Splice variants Gene centered variation Phylogenetic trees Cross-reference to external databases
Transcript & protein sequences Protein structure Transcript & protein based variation GO and other ontologies
Location View Browser Tracks TAIR 10 Annotation EST/cDNA alignments Array probes Variation Genome alignments (cross-species browsing) Repeats
Configuring Tracks
Standard Analysis & Visualization InterPro domain & GO functional annotation Cross-reference to external IDs Whole Genome Alignment (Blastz-chain-net) Phylogenetic Gene Trees (Compara) Synteny Analysis Consequences of SNPs
InterPro/dbXref/GO Structural prediction: Pfam, PIRSF, PRINTS, PROSITE, SMART, SUPERFAMILY, TIGRFAM, TMHMM, SignalP Cross-reference genes to third-party identifiers: Entrez Gene, PlantGDB, PUTs, RefSeq, Gene Index, UniGene, UniProtKB/Swiss-Prot, NASC, IPI, WikiGene Gene Ontology, Plant Ontology
Alignment View Pairwise BLASTZ-CHAIN-NET whole genome alignment Arabidopsis lyrata, Poplar, Grapevine; Rice, Brachypodium, Sorghum; Physcomitrella
Multi-species View A. lyrata Arabidopsis Grapevine Poplar
Conserved non-coding regions
View Sequence Alignment
Phylogenetic Analysis Tools
Compara Gene Trees Gene Trees for 11 plants plus human, Ciona, fly, worm, & yeast Infers orthologs and paralogs by reconciling gene tree with input species tree Taxonomic dating Reconstructing evolutionary histories mology_method.html Vilella A.J., et al. (2008). Genome Res. Pre-print: doi: /gr ~35,000 trees ~24,500 plant specific ~10,000 containing Arabidopsis 1059 specific to Arabidopsis genus 79 specific to A. thaliana 527 specific to A. lyrata
Tree Viewer Speciation node = ortholog Duplication node = paralog
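The rule on this slide (speciation node = ortholog, duplication node = paralog) can be illustrated with a minimal Python sketch. It assumes a toy reconciled gene tree in which every internal node is already labelled with its event; the gene names and the tuple layout are hypothetical and are not the Compara data model.

```python
# Toy reconciled gene tree. Internal nodes are (event, left, right);
# leaves are (gene_name, species). All names here are hypothetical.
tree = ("speciation",
        ("duplication",
         ("geneA1", "A.thaliana"),
         ("geneA2", "A.thaliana")),
        ("geneL1", "A.lyrata"))


def is_leaf(node):
    # Leaves are 2-tuples, internal nodes are 3-tuples.
    return len(node) == 2


def leaves(node):
    """Return the gene names of all leaves under a node."""
    if is_leaf(node):
        return [node[0]]
    return leaves(node[1]) + leaves(node[2])


def classify_pairs(node, out=None):
    """Label each gene pair by the event at its most recent common ancestor."""
    if out is None:
        out = {}
    if is_leaf(node):
        return out
    event, left, right = node
    relation = "ortholog" if event == "speciation" else "paralog"
    # Genes split across the two children meet for the first time at this node.
    for a in leaves(left):
        for b in leaves(right):
            out[frozenset((a, b))] = relation
    classify_pairs(left, out)
    classify_pairs(right, out)
    return out


pairs = classify_pairs(tree)
for pair, relation in pairs.items():
    print(sorted(pair), relation)
```

Keying the result on an unordered pair (`frozenset`) reflects that orthology and paralogy are symmetric relations.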
Newick Tree & Alignment
Newick tree:
(((ENSCINP _Cint_:0.0000, R10D12.12_Cele_:3.4477):0.7716, FBpp _Dmel_:0.8566):0.0000, (((((BRADI3G _Bdis_:0.0615, BRADI2G _Bdis_:0.1536):0.0214, ((LOC_Os02g _Osat_:0.0000, BGIOSGA PA_Oind_:0.0000):0.0000, ORGLA02G _Ogla_:0.0000):0.0938):0.0231, (((GRMZM2G050705_P02_Zmay_:0.0099, GRMZM2G124671_P01_Zmay_:0.0745):0.0043, Sb08g _Sbic_:0.0348):0.0000, (GRMZM2G022470_P01_Zmay_:0.0475, Sb04g _Sbic_:0.1037):0.0000):0.0917):0.1118, (((POPTR_0005s _Ptri_:0.0420, POPTR_0013s _Ptri_:0.0427):0.0918, (GSVIVT _Vvin_:0.0342, GSVIVT _Vvin_:0.0817):0.1210):0.0363, ((scaffold_ _Alyr_:0.0043, scaffold_ _Alyr_:0.0632):0.0277, AT4G _Atha_:0.0204):0.2813):0.1261):0.5081, E_GW _Ppat_:0.3698):0.3605):0.0000;
Alignment:
ORGLA02G _Ogla_ VFVTVGTTCF DALVKAVDSP QVKEALLEKG YTDLIIQMGR GTY
BRADI2G _Bdis_ VFVTVGTTCF DALVKAVDSE EVKQALLRKG YTDLLIQMGR GTY
GRMZM2G050705_P02_Zmay_ VFVTVGTTCF DALVMAVDSP EVKKALLQKG YSNLLIQMGR GTY
POPTR_0005s _Ptri_ VFVTVGTTLF DALVRTVDTK EVKQELLRNG YTHLIIQMGR GSY
GRMZM2G022470_P01_Zmay_ VFVTVGTTCF DALVMAVDSP EVKKTLLQKG YSNLLIQMGR GTY
BRADI3G _Bdis_ VFVTVGTTCF DALVKKVDSP QVKEALWQKG YTDLFIQMGR GTY
GSVIVT _Vvin_ VFVTVGTTCF DALVKAVDTQ EFKKELSARG YTHLLIQMGR GSY
Sb08g _Sbic_ MAVDSP EVKMALLQKG YSNLLIQMGR GTY
GRMZM2G124671_P01_Zmay_ VFVTVGTTCF DALVMAVDSP EVKKALLQKG YSNLLIQMGR GTY
Sb04g _Sbic_ MAVASP EVKKALLQKG YSNLVIQMGR GTY
BGIOSGA PA_Oind_
E_GW _Ppat_ VLVTVGTTLF DALVREASSQ PCRQVLADFG YSSLVIQRGK GSF
scaffold_ _Alyr_ VFVTVGTTSF DALVKAVVSE DVKDELQKRG FTHLLIQMGR GIF
R10D12.12_Cele_ NQDVIDR
ENSCINP _Cint_ IFVTVGTTSF DELTETITSK PVQKVLQSQG YDKVTIQYGR GKH
scaffold_ _Alyr_ VFVTVGTTSF DALVKAVVSE DVKDELQKRG FTHLLIQMGR GNF
AT4G _Atha_ VFVTVGTTSF DALVKAVVSQ NVKDELQKRG FTHLLIQMGR GIF
LOC_Os02g _Osat_ VFVTVGTTCF DALVKAVDSP QVKEALLEKG YTDLIIQMGR GTY
GSVIVT _Vvin_ VFVTVGTTCF DALVKAVDTH EFKRELFARG YTHLLIQMGR GSY
FBpp _Dmel_ VYITVGTTKF DALISTASTE PALKALQNRK CTKLVIQHGN SQP
POPTR_0013s _Ptri_ VFVTVGTTLF DALVRTVDTK EVKQELLRKG YTDLVIQMGR GSY
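A downloaded Newick tree can be read with a few lines of code. The following is a minimal Python sketch of a recursive-descent parser, assuming unquoted labels and no bracketed comments; the toy tree and its gene labels are hypothetical stand-ins for a real export. Established libraries handle the full format and would be the safer choice for production use.

```python
def parse_newick(text):
    """Parse a Newick string into nested dicts.

    Each node is {"name": str, "length": float or None, "children": list}.
    Quoted labels and [comments] are not supported in this sketch.
    """
    text = text.strip()
    pos = 0

    def skip_spaces():
        nonlocal pos
        while pos < len(text) and text[pos].isspace():
            pos += 1

    def parse_node():
        nonlocal pos
        skip_spaces()
        children = []
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume "("
            while True:
                children.append(parse_node())
                skip_spaces()
                if pos < len(text) and text[pos] == ",":
                    pos += 1
                    continue
                if pos < len(text) and text[pos] == ")":
                    pos += 1
                    break
                raise ValueError(f"malformed Newick near position {pos}")
        # Optional label: everything up to the next structural character.
        start = pos
        while pos < len(text) and text[pos] not in ",():;":
            pos += 1
        name = text[start:pos].strip()
        # Optional branch length after ":".
        length = None
        if pos < len(text) and text[pos] == ":":
            pos += 1
            start = pos
            while pos < len(text) and text[pos] not in ",();":
                pos += 1
            length = float(text[start:pos])
        return {"name": name, "length": length, "children": children}

    root = parse_node()
    skip_spaces()
    if pos >= len(text) or text[pos] != ";":
        raise ValueError("Newick string must end with ';'")
    return root


def leaf_names(node):
    """Collect leaf labels in left-to-right order."""
    if not node["children"]:
        return [node["name"]]
    names = []
    for child in node["children"]:
        names.extend(leaf_names(child))
    return names


# Hypothetical three-gene tree in the same label style as the slide.
toy = "((geneA_Atha_:0.02, geneB_Alyr_:0.03):0.28, geneC_Ppat_:0.37);"
root = parse_newick(toy)
print(leaf_names(root))
```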
Orthologs & Paralogs
Gene-Centered Synteny Build
Oryza sativa Japonica O.jap
Brachypodium distachyon YES B.dis
Sorghum bicolor YES S.bic
Arabidopsis thaliana --- A.tha
Arabidopsis lyrata --- YES A.lyr
Vitis vinifera --- YES V.vin
Populus trichocarpa --- YES P.tri
Compara Orthologs
Collinear mappings (DAGchainer)
“in-range” mappings near collinear anchors
Map
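The build steps listed on this slide end with "in-range" mappings near collinear anchors. The slide does not define "in-range", so the following Python sketch is only one plausible reading: an ortholog pair is kept when both of its gene-order positions fall within a fixed window of some collinear anchor. The function name, the `window` parameter, and the example positions are all hypothetical.

```python
def in_range_pairs(anchors, candidates, window=5):
    """Return candidate ortholog pairs that lie near a collinear anchor.

    anchors, candidates: lists of (index_a, index_b) gene-order positions
    in genome A and genome B. `window` is an assumed tolerance, counted
    in genes; it is not a documented Gramene setting.
    """
    anchor_set = set(anchors)
    kept = []
    for a, b in candidates:
        if (a, b) in anchor_set:
            continue  # already mapped as part of a collinear chain
        near = any(abs(a - x) <= window and abs(b - y) <= window
                   for x, y in anchors)
        if near:
            kept.append((a, b))
    return kept


# Hypothetical collinear chain and candidate ortholog pairs.
anchors = [(10, 110), (12, 112), (15, 116)]
candidates = [(11, 113), (13, 400), (300, 114), (12, 112)]
result = in_range_pairs(anchors, candidates)
print(result)
```

Only the first candidate is near an anchor in both genomes; the next two are close in one genome but distant in the other, and the last is itself an anchor.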
Synteny View Available for A. lyrata, grapevine, & poplar Navigate to other genome Ortholog browser Link to multi-species view
Browse across duplicated regions from polyploidy Chr 1 vs Poplar Chr 1 vs Grapevine Switch reference to grape
Some Applications …
Distinguish “Real” Genes From Transposons FAR1/FHY3 transcription factor family functions in light sensing Evolved from Mu-related transposases Cannot distinguish by BLAST FHY3 “Rule-in” functioning genes Missing annotation in A. lyrata? Domesticated TE
Enrich Annotations in Other Species Arabidopsis and Rice orthologs both show one gene Arabidopsis ortholog in correct syntenic context Putative mis-annotated Grape gene
Adding Custom Tracks
Custom Tracks Salk T-DNA lines Uploaded from my laptop GFF file format EST alignments from non-model plants DAS: Distributed Annotation System Protocol for sharing third-party data DAS Registry Methylome (Ecker) Uploaded from a URL BED file format
Upload Your Data
chr1 SALK T-DNA e ID=SALK_ x
chr1 SALK T-DNA e ID=SALK_ x
chr1 SALK T-DNA e ID=SALK_ x
chr1 SALK T-DNA e ID=SALK_ n
chr1 SALK T-DNA e ID=SALK_ x
chr1 SALK T-DNA e ID=SALK_ n
chr1 SALK T-DNA ID=SALK_ x
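For anyone preparing a GFF file like this one for upload, the following Python sketch formats a single record in the standard nine-column, tab-separated layout (seqid, source, type, start, end, score, strand, phase, attributes). The helper name is hypothetical, and the coordinates and ID are placeholders, not values from the slide.

```python
def gff_line(seqid, source, feature, start, end,
             score=".", strand=".", phase=".", attributes=None):
    """Format one nine-column, tab-separated GFF record."""
    if start < 1 or end < start:
        raise ValueError("GFF coordinates are 1-based with start <= end")
    # Attributes are written as key=value pairs joined by semicolons.
    attrs = ";".join(f"{k}={v}" for k, v in (attributes or {}).items()) or "."
    fields = [seqid, source, feature, start, end, score, strand, phase, attrs]
    return "\t".join(str(f) for f in fields)


# Placeholder insertion site: coordinates and ID are illustrative only.
line = gff_line("chr1", "SALK", "T-DNA", 1000, 1001,
                attributes={"ID": "SALK_example"})
print(line)
```

Unknown values are written as `.` rather than left blank, so every record keeps exactly nine columns.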
Attach From Remote File
track name="mCIP col/met1 BU" color=darkgreen description="Methylation" useScore=3 visibility=2 height=30
chr mCIP_col/met1_BU
chr mCIP_col/met1_BU
chr mCIP_col/met1_BU
chr mCIP_col/met1_BU
chr mCIP_col/met1_BU
chr mCIP_col/met1_BU
Add DAS: Distributed Annotation System Protocol for sharing third-party data via a DAS registry
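The `track` header of a remote BED file is a series of key=value settings, some of them quoted. The following Python sketch, offered as an illustration rather than the browser's own parser, reads such a header into a dictionary so the settings can be checked before the file is attached. The function name is hypothetical; the header string is the one shown on the slide.

```python
import shlex


def parse_track_line(line):
    """Split a `track` header line into a dict of its key=value settings."""
    # shlex honours double quotes, so name="a b c" stays one token.
    tokens = shlex.split(line)
    if not tokens or tokens[0] != "track":
        raise ValueError("not a track line")
    settings = {}
    for token in tokens[1:]:
        key, _, value = token.partition("=")
        settings[key] = value
    return settings


header = ('track name="mCIP col/met1 BU" color=darkgreen '
          'description="Methylation" useScore=3 visibility=2 height=30')
settings = parse_track_line(header)
print(settings)
```

All values are kept as strings; converting `height` or `visibility` to integers is left to the caller.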
Manage Custom Tracks
Turn On/Off Custom Tracks
GrameneMart Orthologs in lyrata, grape, poplar, rice, Brachypodium, sorghum, maize, & moss Custom queries for bulk downloads Powerful tool for data mining
BioMart Use Cases All transmembrane-targeted genes, showing InterPro domains, GO terms, and AFFY IDs
BioMart Use Case Evolution of cyclin genes: Taxon of origin for paralog pairs of cyclin-domain genes that have an ortholog in Physcomitrella
BioMart Use Cases Mine germplasm for loss of function alleles in diversity populations: All Myb-domain genes with “STOP_GAINED” SNP allele
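Use cases like these can also be scripted, because BioMart servers accept queries as small XML documents made of a `Query` element, a `Dataset`, and `Filter` and `Attribute` children. The following Python sketch only builds such a document and does not contact any server. The dataset, filter, and attribute names are hypothetical placeholders; the real names must be looked up in the GrameneMart interface.

```python
import xml.etree.ElementTree as ET


def build_query(dataset, filters, attributes):
    """Build a BioMart-style XML query string (no network access)."""
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "1",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset",
                       {"name": dataset, "interface": "default"})
    # Filters restrict which genes are returned.
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    # Attributes choose the output columns.
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


# All names below are assumed for illustration, not real GrameneMart IDs.
xml_text = build_query(
    dataset="example_plant_gene_dataset",
    filters={"example_domain_filter": "Myb"},
    attributes=["example_gene_id", "example_snp_consequence"],
)
print(xml_text)
```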
Additional Data Access FTP: Data files, SQL dump, Software Read-only Public MySQL Web Services
HELP!
Contact Us