1
Data retrieval
- BioMart
- Data sets on ftp site
- MySQL queries of databases
- Perl API access to databases
- Export View
3
Data Mining in Ensembl with EnsMart August 2005
4
Possible queries…
- All genes from a candidate region
- Genes with a particular protein domain
- Members of a protein family
- Genes associated with SNPs
5
More specific queries
- Human genes with upstream regions conserved with respect to mouse
- Upstream sequence for all Ensembl genes mapped to the U95A chip (similarly, complete genomic annotation of MG_U74)
- Genomic location and description of all mouse, rat and fugu homologues of all human genes that have transmembrane domains, are expressed in the cardiovascular system and have non-synonymous SNPs
6
Ensembl core database
- Normalised
- Each data point stored only once
- Quick updates
- Minimal storage requirements
But:
- Many tables
- Many joins for complicated queries
- Slow for data mining questions
7
BioMart and EnsMart
- Large-scale data retrieval tool
- Query builder interface
- Databases: Ensembl, SNP, Vega, (MSD, UniProt)
- Associated features or sequences
- Flexible output formats
- http://www.ebi.ac.uk/biomart/
- http://www.ensembl.org/EnsMart/
8
Mart database
- De-normalised
- Tables with ‘redundant’ information
- Query-optimised
- Fast and flexible
- Designed for data mining
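The contrast between the normalised core database and the de-normalised mart can be illustrated with a small sketch. It uses Python's built-in sqlite3 module rather than MySQL, and every table name, column name and row below is hypothetical; none of it reflects the real Ensembl or Mart schemas. The same question (which genes carry a given protein domain?) needs two joins against the normalised tables and none against the mart-style table.

```python
import sqlite3


def build_demo(conn):
    """Create a toy normalised schema and a de-normalised copy of it.

    All tables and rows are hypothetical, for illustration only.
    """
    conn.executescript("""
        -- Normalised: each data point is stored only once.
        CREATE TABLE gene (gene_id INTEGER PRIMARY KEY, name TEXT, chrom TEXT);
        CREATE TABLE transcript (transcript_id INTEGER PRIMARY KEY, gene_id INTEGER);
        CREATE TABLE protein_domain (transcript_id INTEGER, domain TEXT);

        INSERT INTO gene VALUES (1, 'GENE_A', '1'), (2, 'GENE_B', '2');
        INSERT INTO transcript VALUES (10, 1), (20, 2);
        INSERT INTO protein_domain VALUES (10, 'kinase'), (20, 'zinc_finger');

        -- De-normalised: one wide table with 'redundant' information,
        -- built once so that later queries need no joins.
        CREATE TABLE gene_main AS
            SELECT g.gene_id, g.name, g.chrom, d.domain
            FROM gene g
            JOIN transcript t ON t.gene_id = g.gene_id
            JOIN protein_domain d ON d.transcript_id = t.transcript_id;
    """)


def genes_with_domain_normalised(conn, domain):
    """Answer the question against the normalised tables (two joins)."""
    rows = conn.execute(
        """
        SELECT DISTINCT g.name
        FROM gene g
        JOIN transcript t ON t.gene_id = g.gene_id
        JOIN protein_domain d ON d.transcript_id = t.transcript_id
        WHERE d.domain = ?
        ORDER BY g.name
        """,
        (domain,),
    ).fetchall()
    return [name for (name,) in rows]


def genes_with_domain_mart(conn, domain):
    """Answer the same question against the de-normalised table (no joins)."""
    rows = conn.execute(
        "SELECT DISTINCT name FROM gene_main WHERE domain = ? ORDER BY name",
        (domain,),
    ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_demo(conn)
    print(genes_with_domain_normalised(conn, "kinase"))
    print(genes_with_domain_mart(conn, "kinase"))
```

Both functions return the same gene names. The trade-off is that the wide table has to be rebuilt whenever the normalised data changes.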
9
Primary Data Sets
- Ensembl genes
- SNP
  - Single nucleotide polymorphisms
  - Deletion-insertion polymorphisms
  - Short tandem repeats
- Vega genes
- (MSD protein structures)
- (UniProt proteomes)
10
Secondary Data Sets
- Markers
- Diseases
- Gene ontology
- Gene expression information
- Homology predictions
- Protein annotation
11
Information flow
Diagram stages: start -> filter -> output
Diagram labels, in extracted order: SPECIES FOCUS REGION SNP PROTEIN HOMOLOGY GENE EXPRESSION REFSEQ INTERPRO GO SWISSPROT EMBL AFFY REGION SNP PROTEIN HOMOLOGY GENE EXPRESSION FASTA FILE EXCEL TEXT GTF HTML
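The start, filter and output stages named on this slide can be sketched in a few lines of Python. This is only an illustration of the flow, not the EnsMart implementation: the records, field names and the `run_query` function are all hypothetical, and the output stage is reduced to tab-separated text.

```python
# Hypothetical records standing in for rows of a mart table.
GENES = [
    {"species": "human", "name": "GENE_A", "chrom": "1", "has_snp": True},
    {"species": "human", "name": "GENE_B", "chrom": "2", "has_snp": False},
    {"species": "mouse", "name": "GENE_C", "chrom": "1", "has_snp": True},
]


def run_query(records, species, filters, attributes):
    """Run a toy three-stage query and return tab-separated text."""
    # Start: choose the species to query.
    rows = [r for r in records if r["species"] == species]

    # Filter: keep only rows matching every field/value pair.
    for key, value in filters.items():
        rows = [r for r in rows if r[key] == value]

    # Output: emit the requested attributes, with a header line first.
    lines = ["\t".join(attributes)]
    for r in rows:
        lines.append("\t".join(str(r[a]) for a in attributes))
    return "\n".join(lines)


if __name__ == "__main__":
    print(run_query(GENES, "human", {"has_snp": True}, ["name", "chrom"]))
```

Keeping the three stages separate means the same filtered rows could be passed to a different output writer without touching the earlier steps.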
12
BioMart http://www.biomart.org/
13
BioMart - Features
14
BioMart - Sequences
15
Output formats HTML
16
What about queries not possible to do in EnsMart?
- Direct database access at ensembldb.ensembl.org and martdb.ebi.ac.uk
- MySQL client
- Download MySQL for Windows: http://www.winmysql.com/page4.html (File: wmysr11.zip)
17
Access via Perl object API
- Based on bioperl
- Ensembl modules
- For an introduction, see the tutorial at: http://www.ensembl.org/info/software/core/
18
There are other ways…
- MartShell: command-line interface to Mart, written in Java. It works with a Mart Query Language.
19
MartExplorer