Data Mining in Ensembl with EnsMart

Possible queries…
- All genes from a candidate region
- Genes with a particular protein domain
- Members of a protein family
- Genes associated with SNPs

Specific queries
- Disease-related genes between markers D10S255 and D10S259
- Transmembrane proteins with an Ig-MHC domain (IPR003006) on chromosome 2
- Genes with associated coding SNPs on chromosomal band 5q35.3
- Mouse homologues for human disease genes

More specific queries
- Human genes with upstream regions conserved with respect to mouse
- Upstream sequence for all Ensembl genes mapped to the U95A chip (similarly, complete genomic annotation of MG_U74)
- Genomic location and description of all mouse, rat and fugu homologues of all human genes that have transmembrane domains, are expressed in the cardiovascular system and have non-synonymous SNPs

EnsMart: vertical and horizontal data integration
- Data types: Ensembl Genes, EST Genes, Vega Genes, SNPs
- Species: Zebrafish, Human, Mouse, Anopheles, Fugu, Rat

Ensembl data sets
- Genes
- EST
- Markers
- Diseases
- Protein Annotation
- SNPs
- Homology
- Expression

EnsMart
- Data retrieval tool
- Query builder interface
- Gene or SNP lists
- Associated features or sequences
- Various output formats

Information flow: start, filter, output
- Start: SPECIES, FOCUS
- Filter: REGION, SNP, PROTEIN, HOMOLOGY, GENE, EXPRESSION
- External identifiers: REFSEQ, INTERPRO, GO, SWISSPROT, EMBL, AFFY
- Output: REGION, SNP, PROTEIN, HOMOLOGY, GENE, EXPRESSION
- Formats: FASTA, FILE, EXCEL, TEXT, GTF, HTML
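
The start, filter, output flow can be pictured as three small steps in code. The following is only a minimal sketch in Python, not EnsMart's actual interface: the `DATASETS` dictionary, the `run_query` and `to_text` helpers, the field names and the sample gene records are all hypothetical, invented to show how a species and focus, a set of filters and a list of output attributes might combine.

```python
import csv
import io

# Hypothetical in-memory "datasets"; the gene records are invented for illustration.
DATASETS = {
    ("human", "gene"): [
        {"gene_id": "G1", "chromosome": "2", "interpro": "IPR003006", "transmembrane": True},
        {"gene_id": "G2", "chromosome": "2", "interpro": "IPR000001", "transmembrane": False},
        {"gene_id": "G3", "chromosome": "5", "interpro": "IPR003006", "transmembrane": True},
    ],
}


def run_query(species, focus, filters, attributes):
    """Start (species + focus), filter (field == value), output (chosen columns)."""
    rows = DATASETS[(species, focus)]  # start: pick the dataset
    kept = [
        row for row in rows
        if all(row.get(field) == value for field, value in filters.items())
    ]  # filter: keep rows matching every restriction
    return [{name: row[name] for name in attributes} for row in kept]  # output


def to_text(rows, attributes):
    """Render the result as tab-separated text, one of several possible formats."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=attributes, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


attributes = ["gene_id", "chromosome"]
result = run_query(
    "human",
    "gene",
    filters={"chromosome": "2", "interpro": "IPR003006", "transmembrane": True},
    attributes=attributes,
)
print(to_text(result, attributes))
```

The filters here mirror the "transmembrane proteins with an Ig-MHC domain (IPR003006) on chromosome 2" query from the earlier slide; only the invented record `G1` satisfies all three restrictions.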

Species and focus

Restrict your query (two slides)

Select output options (two slides)

Output formats
- HTML

Obtaining sequences

Ensembl core database
- Normalised
- Each data point stored only once
- Quick updates
- Minimal storage requirements
But:
- Many tables
- Many joins for complicated queries
- Slow for data mining questions

Mart database
- De-normalised
- Tables with ‘redundant’ information
- Query-optimised
- Fast and flexible
- Ideal for data mining
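
The trade-off between the normalised and de-normalised layouts can be illustrated with a toy example. This is a sketch using Python's built-in `sqlite3` module; the table names, column names and rows are invented for illustration and are not the Ensembl core or EnsMart schema. The same question ("genes on chromosome 2 with domain IPR003006 and a coding SNP") needs three joins against the normalised tables, but only a single-table filter against the pre-joined wide table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Normalised layout (hypothetical schema): each fact is stored once,
# spread over several tables.
cur.executescript("""
CREATE TABLE gene (gene_id INTEGER PRIMARY KEY, name TEXT, chromosome TEXT);
CREATE TABLE domain (domain_id INTEGER PRIMARY KEY, accession TEXT);
CREATE TABLE gene_domain (gene_id INTEGER, domain_id INTEGER);
CREATE TABLE snp (snp_id INTEGER PRIMARY KEY, gene_id INTEGER, coding INTEGER);

INSERT INTO gene VALUES (1, 'geneA', '2'), (2, 'geneB', '2'), (3, 'geneC', '5');
INSERT INTO domain VALUES (10, 'IPR003006'), (11, 'IPR000001');
INSERT INTO gene_domain VALUES (1, 10), (2, 11), (3, 10);
INSERT INTO snp VALUES (100, 1, 1), (101, 2, 0), (102, 3, 1);
""")

# Answering the question on the normalised tables takes three joins.
normalised_result = cur.execute("""
SELECT DISTINCT g.name
FROM gene g
JOIN gene_domain gd ON gd.gene_id = g.gene_id
JOIN domain d ON d.domain_id = gd.domain_id
JOIN snp s ON s.gene_id = g.gene_id
WHERE g.chromosome = '2' AND d.accession = 'IPR003006' AND s.coding = 1
""").fetchall()

# De-normalised layout: one wide table built once from the joins above.
# Gene details are repeated on every row ("redundant" information).
cur.execute("""
CREATE TABLE gene_mart AS
SELECT g.gene_id, g.name, g.chromosome,
       d.accession AS domain_accession,
       s.coding AS snp_coding
FROM gene g
JOIN gene_domain gd ON gd.gene_id = g.gene_id
JOIN domain d ON d.domain_id = gd.domain_id
JOIN snp s ON s.gene_id = g.gene_id
""")

# The same question is now a filter on a single table, with no joins.
mart_result = cur.execute("""
SELECT DISTINCT name
FROM gene_mart
WHERE chromosome = '2' AND domain_accession = 'IPR003006' AND snp_coding = 1
""").fetchall()

print(normalised_result, mart_result)
conn.close()
```

The cost of the wide table is paid when it is built and whenever the underlying data change, which matches the slides' point that the normalised form suits quick updates while the de-normalised form suits querying.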

Acknowledgements
- Mart database: Arek Kasprzyk, Damian Keefe, Damian Smedley, Darin London, Craig Melsopp
- User interface (MartView): Will Spooner
- Data and general support: The entire Ensembl team