Bioinformatics of mammalian gene expression (BoMGE) 07 June 2005 Gene Regulation Informatics.


1 Bioinformatics of mammalian gene expression (BoMGE) 07 June 2005 Gene Regulation Informatics

2 Deliver what? System... History/timelines Competitive position

3 Deliver what? ‘Comprehensive catalog’ of mammalian regulatory elements ‘Validated’, known accuracy Clustered into similar groups - ‘TF models’ Annotated as known/novel Modules identified, ‘specific to...’ Predictions extrapolated to remote regions

4 Predictive system Mostly Java Some Perl/bash 270 CPUs/OSCAR TRANSFAC 9.1 Manual TFBS EnsEMBL-based Generalize... OPTICS Accuracy metrics

5 Coexpression resource How best to use it? Motif discovery? Motif co-occurrence?

6 Multi-source orthologue resource Compara, HomoloGene, Inparanoid, KEGG, …
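One way a multi-source orthologue resource could be combined is sketched below. This is an illustration only: the slide names the sources but not how their calls are merged, so the function names, the pair representation, and the `min_sources` cutoff are all assumptions.

```python
from collections import defaultdict


def merge_orthologue_calls(calls_by_source):
    """Map each (human_gene, other_gene) pair to the set of sources reporting it.

    calls_by_source: dict of source name -> iterable of (human_gene, other_gene).
    The input layout is hypothetical; real exports from each source differ.
    """
    support = defaultdict(set)
    for source, pairs in calls_by_source.items():
        for human_gene, other_gene in pairs:
            support[(human_gene, other_gene)].add(source)
    return dict(support)


def well_supported(support, min_sources=2):
    """Return, sorted, the pairs reported by at least min_sources sources.

    min_sources is an assumed cutoff, not a value from the presentation.
    """
    return sorted(pair for pair, sources in support.items()
                  if len(sources) >= min_sources)
```

Keeping the set of supporting sources per pair, rather than only a count, makes it possible to see later which source disagrees on a given gene.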

7 Visual comparative genomics: Assessing ortholog annotations LAGAN alignment detects misannotated chicken gene Orthologues of a human gene Assess sequence conservation for a coding exon (MLAGAN).

8 Motif discovery with multiple methods/params Methods (W)CONSENSUS MEME MotifSampler Gibbs Sampler Bioprospector, MDmodule, … Weeder CisModule NestedMICA, Sombrero,... ‘Multiple’ means Methods Motif occurrence models Other parameters

9 Motif scores → p-values 1 Discover with target and random sequences. 2 Apply method-independent score. 3 Use random distribution to assign p-value to a score. [Figure: cumulative motif score distributions for target and random sequences, 1500b region; p-val = 0.02 vs. no p-val threshold]
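Step 3 on this slide, assigning a p-value to a score from the random-sequence score distribution, can be sketched as an empirical p-value. This is a minimal sketch and not the presenters' implementation: it assumes higher scores are better, adds a +1 pseudocount so no p-value is exactly zero, and uses hypothetical function names. The 0.02 default mirrors the threshold shown on the slide.

```python
from bisect import bisect_left


def empirical_p_value(score, random_scores):
    """Fraction of random-sequence scores >= score, with a +1 pseudocount.

    Assumes higher scores indicate stronger motifs (an assumption; the
    slide does not define the method-independent score).
    """
    ordered = sorted(random_scores)
    n_at_least = len(ordered) - bisect_left(ordered, score)
    return (n_at_least + 1) / (len(ordered) + 1)


def filter_motifs(target_scores, random_scores, threshold=0.02):
    """Keep target motifs whose empirical p-value is at or below threshold.

    target_scores: dict of motif id -> score (hypothetical layout).
    """
    kept = {}
    for motif_id, score in target_scores.items():
        p_value = empirical_p_value(score, random_scores)
        if p_value <= threshold:
            kept[motif_id] = p_value
    return kept
```

With this estimator the smallest reachable p-value is 1 / (N + 1) for N random scores, so a threshold of 0.02 needs at least 49 random scores before any motif can pass.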

10

11 Motif clustering, co-occurrence TRANSFAC 9.1 Manual TFBS OPTICS Accuracy metrics

12 Clustering with OPTICS 1 Pairwise motif similarity measure. 2 Scalable hierarchical clustering method with automatic stopping. [32 CPUs, 96 GB RAM, 64-bit OS] [Figure: reachability plot with labeled cluster contents; JASPAR scan test: 50 PWMs, 100 target sequence sets]
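The slide does not say which pairwise motif similarity measure was used. As a stand-in for step 1, the sketch below scores two position weight matrices by the best mean per-column Pearson correlation over all ungapped alignments. The measure, the `min_overlap` default, and the function names are assumptions chosen for illustration; reverse-complement comparison is omitted for brevity.

```python
from math import sqrt


def column_correlation(col_a, col_b):
    """Pearson correlation between two PWM columns (A, C, G, T frequencies)."""
    mean_a = sum(col_a) / len(col_a)
    mean_b = sum(col_b) / len(col_b)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(col_a, col_b))
    var_a = sum((a - mean_a) ** 2 for a in col_a)
    var_b = sum((b - mean_b) ** 2 for b in col_b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a uniform column carries no signal to correlate
    return cov / sqrt(var_a * var_b)


def motif_similarity(pwm_a, pwm_b, min_overlap=4):
    """Best mean column correlation over ungapped offsets of pwm_b along pwm_a.

    Each PWM is a list of columns; each column is [A, C, G, T] frequencies.
    min_overlap is an assumed parameter, not a value from the presentation.
    """
    if len(pwm_a) < min_overlap or len(pwm_b) < min_overlap:
        raise ValueError("both motifs must be at least min_overlap columns wide")
    best = -1.0
    # offset = index in pwm_a that column 0 of pwm_b is aligned to
    for offset in range(-(len(pwm_b) - min_overlap),
                        len(pwm_a) - min_overlap + 1):
        pairs = [(pwm_a[i], pwm_b[i - offset])
                 for i in range(len(pwm_a))
                 if 0 <= i - offset < len(pwm_b)]
        score = sum(column_correlation(a, b) for a, b in pairs) / len(pairs)
        best = max(best, score)
    return best
```

A distance such as `1 - motif_similarity(a, b)` could then be computed for every motif pair and passed to a density-based method like OPTICS, which accepts precomputed distances in common implementations.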

13 www.cisred.org v1.1: human, mouse human: 6K genes, 120K motifs Web database design and construction

14 Main competitors Zhang - Cold Spring Harbor Lab Lander/Kellis - MIT Bolouri - Institute for Systems Biology Hardison/Haussler - Penn State/UCSC... High throughput... low throughput

15 Large scale’s here. Now what? Production / R&D Hi/lo throughput. Collaborators Accuracy / complexity / data integration ChIP-xxxx, expression specificity, chromatin state, 3’UTRs, LREs... ENCODE Regulatory networks and cascades

16 Competitive opportunities Monica - C. elegans, briggsae, unannotated Erin - Drosophila,..., unannotated Han Hao / Jim Kronstad (UBC) - fungi Generalize SNPs - Stephen Montgomery Repetitive regions - Dixie Mager

17 Competitive opportunities Many target genes, many orthologues Low-coverage/unannotated genomes Accuracy - resources, methods, protocols,... Coexpression and orthology Discovery input vs. co-occurrence/modules Motif similarity, clustering - a superset? cisRED annotations in EnsEMBL ‘Contextual’ motif/module resource...

18 ‘Context’ in cisRED Discovered motifs Motif similarity measures Clustering methods ‘Known’ motif resources Annotate motifs as known/novel Motif groups (specific to...) Other result types ‘Accuracy’ Motif classification system

19 Competitive opportunities Validated predictions Myers/Stanford Collaborators Be ‘on the short list’ Collaborators, publications GC3 - ChIP-SAGE, networks...

20 Acknowledgements Misha Bilenky, Chris Fjell, Obi Griffith, Han Hao, Ann He, Bernard Li, Keven Lin, Stephen Montgomery, Mehrdad Oveisi, Erin Pleasance, Neil Robertson, Wenjia Pan, Monica Sleumer, Kevin Teague, Richard Varhol, Maggie Zhang, Asim Siddiqui, Steven Jones Jianjun Zhou, Jörg Sander Dept. Computing Science, University of Alberta Tamara Astakhova, Maik Hassel, James Kennedy, Eddy Tsang, Tony Fu,... Funding Genome Canada, BC Cancer Foundation, Michael Smith Foundation for Health Research

21 TF classification / known motifs

