ID Mapping Tools: Converting Accessions between Databases
Genomic Annotation and Functional Modeling Workshop
Maxwell H. Gluck Equine Research Center, 15-16 November 2011
Converting database accessions
1. UniProt database
2. Ensembl BioMart
3. Online analysis tools (DAVID, g:Profiler, etc.)
4. AgBase database: ArrayIDer tool
1. UniProt ID Mapping
Paste your accession list (lists of more than 1000 accessions may cause errors).
Select the accession type you have and the accession type you want to convert to, then click MAP.
The mapping link will display a tab-separated file.
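The steps above can be sketched in code for handling the list before and after mapping. This is a minimal sketch, not part of the UniProt tool: the function names `batches` and `parse_mapping` are hypothetical, the accessions in the example are made up, and it assumes the downloaded file has a header row followed by two tab-separated columns (source accession, then mapped accession).

```python
import csv
import io
from collections import defaultdict


def batches(accessions, size=1000):
    """Yield successive slices of at most `size` accessions.

    The default of 1000 follows the limit mentioned above; adjust as needed.
    """
    for start in range(0, len(accessions), size):
        yield accessions[start:start + size]


def parse_mapping(text):
    """Parse tab-separated mapping text into {source: [targets]}.

    Assumption: the first row is a header and each later row holds
    a source accession and one mapped accession.
    """
    mapping = defaultdict(list)
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    next(reader, None)  # skip the assumed header row
    for row in reader:
        if len(row) < 2 or not row[0]:
            continue  # ignore blank or incomplete rows
        mapping[row[0]].append(row[1])
    return dict(mapping)


# Made-up accessions, for illustration only.
example = "From\tTo\nSRC_A\tTGT_1\nSRC_A\tTGT_2\nSRC_B\tTGT_3\n"
result = parse_mapping(example)
```

Keeping a list of targets per source accession preserves one-to-many mappings, which are common when converting between databases.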
2. Ensembl BioMart
Clicking on these headings allows you to set up searches. Selecting FILTERS gives you different filtering options:
Expand GENE and check “ID list limit” to select a defined list of accessions. Enter your list of accessions.
Selecting ATTRIBUTES allows you to choose what information is reported: Check accessions from external databases (UniProt & RefSeq).
Clicking on RESULTS will show you the output information. Output can be displayed online and/or downloaded (text, Excel). Selecting FILTERS or ATTRIBUTES will allow you to go back and make changes. BioMart is limited to species represented in Ensembl.
3. Online analysis tools
Database for Annotation, Visualization and Integrated Discovery (DAVID): http://david.abcc.ncifcrf.gov/conversion.jsp
This tool works for a wide range of species.
Paste in your accession list (you can also upload a file of accessions).
Select accession type. NOTE: If you choose “Not Sure” the tool will try to decide what type of accession you have.
Select gene list. Submit list.
Select the type of accession you want to convert TO.
Any ambiguous IDs are listed so that you can decide how to resolve them.
4. AgBase: ArrayIDer
ArrayIDer maps ESTs to gene/protein accessions.
An email will be sent with a link to the results. Results are formatted as an Excel file.
Combining ID Mapping
If you have mixed accession types, you may have to do your ID mapping in sections and link the results together. Most mappings do not cover 100% of the input, so note how many accessions from your original dataset remain available for further analysis.
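One way to link partial results together and record how much of the original dataset was mapped is sketched below. It is an illustration only: the helper names `combine_mappings` and `coverage` are hypothetical, the accessions are made up, and each per-tool result is assumed to already be a dictionary of source accession to a list of mapped accessions.

```python
def combine_mappings(*mappings):
    """Merge several {source: [targets]} dictionaries, dropping duplicate targets."""
    combined = {}
    for mapping in mappings:
        for source, targets in mapping.items():
            merged = combined.setdefault(source, [])
            for target in targets:
                if target not in merged:
                    merged.append(target)
    return combined


def coverage(original_ids, combined):
    """Return (mapped, unmapped, fraction mapped) for the original accession list."""
    mapped = [acc for acc in original_ids if combined.get(acc)]
    unmapped = [acc for acc in original_ids if not combined.get(acc)]
    fraction = len(mapped) / len(original_ids) if original_ids else 0.0
    return mapped, unmapped, fraction


# Made-up accessions standing in for results from two different tools.
from_tool_1 = {"ID_A": ["X1"], "ID_B": []}
from_tool_2 = {"ID_A": ["X1", "X2"], "ID_C": ["X3"]}

combined = combine_mappings(from_tool_1, from_tool_2)
mapped, unmapped, fraction = coverage(["ID_A", "ID_B", "ID_C", "ID_D"], combined)
print(f"{len(mapped)} of {len(mapped) + len(unmapped)} accessions mapped ({fraction:.0%})")
```

Reporting the unmapped accessions alongside the fraction makes it easy to retry them with another tool or to document what was lost before further analysis.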