1
Class 3 2009 European Resources Protein Focused
2
Protein Databases EBI – European Bioinformatics Institute http://www.ebi.ac.uk/
3
What is the difference between dealing with nucleotide DBs and protein DBs?
4
Protein information
Name & description
Gene encoded from
Organism
Function (only one?)
Enzyme? Ligands? PTMs? Interactions?
Biological processes
Structure
Sequence
Localization
More...
5
Protein DB: short history (pre-UniProt)
Swiss-Prot: created in July 1986; since 1987, a collaboration of the SIB and the EMBL/EBI.
TrEMBL: created at the EBI in 1996 as a computer-annotated protein sequence database supplementing Swiss-Prot. It was introduced to deal with the increased data flow from genome projects.
6
PIR EBI SIB
7
The three-layered approach
The UniProt Archive (UniParc): UniProtKB + all other publicly available protein sequences (completeness).
The UniProt Reference Clusters (UniRef): non-redundant views of UniProtKB + selected UniParc sets (speed).
The UniProt Knowledgebase (UniProtKB): central database of annotated protein sequences and functional information; UniProtKB/Swiss-Prot + UniProtKB/TrEMBL.
8
Protein DBs
Swiss-Prot – manually annotated.
TrEMBL – translated EMBL, automatically annotated.
UniProtKB – The UniProt Knowledgebase
UniParc – The UniProt Archive
PIR – Protein Information Resource
UniRef – The UniProt Reference Clusters
PDB – Protein Data Bank – structure
PRIDE – Resource for experimental proteomics (not in this class)
9
Databases growth www.genome.jp/en/db_growth.html
10
Protein DBs
Swiss-Prot – manually annotated
2005: ~100,000
2009: ~400,000
11
TrEMBL – translated EMBL, automatically annotated.
13
Protein Names
Different DBs – different accessions

Accessions                                            DB
P12345                                                TrEMBL
MAPK_HUMAN                                            Swiss-Prot (to be changed..)
NP_123456, XP_123456                                  RefSeq
UniRef100_P99999, UniRef90_P99999, UniRef50_P99999    UniRef
ENSP00000123456                                       Ensembl
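Since each database uses its own identifier format, the format alone often tells you where an identifier comes from. The sketch below is a minimal illustration, not an official validator: the function name `classify` and the regular expressions are assumptions modelled on the example identifiers in the table above (the accession pattern follows the format UniProt documents). The format cannot distinguish Swiss-Prot from TrEMBL, so those two cases are given generic UniProtKB labels.

```python
import re

# Ordered list of (label, pattern). Order matters: the more specific
# prefixes (RefSeq, UniRef, Ensembl) are tested before the generic
# UniProtKB patterns. All patterns are illustrative assumptions.
PATTERNS = [
    ("RefSeq protein", re.compile(r"[NX]P_\d+(\.\d+)?")),
    ("UniRef cluster", re.compile(r"UniRef(100|90|50)_\w+")),
    ("Ensembl protein", re.compile(r"ENSP\d{11}")),
    ("UniProtKB accession", re.compile(
        r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2}")),
    ("UniProtKB entry name", re.compile(r"[A-Z0-9]{1,10}_[A-Z0-9]{1,5}")),
]


def classify(identifier):
    """Return a label for the first pattern that matches the whole identifier."""
    for label, pattern in PATTERNS:
        if pattern.fullmatch(identifier):
            return label
    return "unknown"


if __name__ == "__main__":
    for example in ["P12345", "MAPK_HUMAN", "NP_123456", "XP_123456",
                    "UniRef90_P99999", "ENSP00000123456"]:
        print(example, "->", classify(example))
```

A format check like this only says what an identifier looks like; whether the record actually exists still has to be confirmed in the database itself.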
15
Principles
16
More in UniProt
UniProt: The Universal Protein Resource for protein sequences; a complete annotated protein sequence database.
UniProt Archive: A non-redundant archive of protein sequences extracted from public databases; it contains only protein sequences.
UniProt/UniRef: Features clustering of similar sequences to yield a representative subset of sequences. This produces very fast search times.
UniProt/UniMES: A repository specifically developed for metagenomic and environmental data.
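The idea of a non-redundant archive can be sketched in a few lines: every distinct sequence is stored once, together with the list of source records that carry it. This is a toy illustration only, not how UniParc is implemented. The function name `build_archive`, the record layout, and the sequences are made up; the identifiers are the placeholder examples used earlier in this class.

```python
from collections import defaultdict


def build_archive(records):
    """Group (source_db, identifier, sequence) records by exact sequence.

    Returns a dict mapping each distinct sequence to the list of
    (source_db, identifier) pairs that share it.
    """
    archive = defaultdict(list)
    for source_db, identifier, sequence in records:
        # Normalise whitespace and case so identical sequences compare equal.
        key = sequence.strip().upper()
        archive[key].append((source_db, identifier))
    return dict(archive)


# Hypothetical input: three databases report the same sequence,
# a fourth record has a different one.
records = [
    ("TrEMBL", "P12345", "MKTAYIAKQR"),
    ("RefSeq", "NP_123456", "MKTAYIAKQR"),
    ("Ensembl", "ENSP00000123456", "mktayiakqr"),
    ("RefSeq", "XP_123456", "MSLLTEVETY"),
]

archive = build_archive(records)

if __name__ == "__main__":
    for sequence, sources in archive.items():
        print(sequence, sources)
```

Four input records collapse to two archive entries here. Clustering by similarity, as in UniRef, goes a step further and also groups sequences that are close but not identical.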
18
How is it built?
20
What’s in UniProt? http://beta.uniprot.org/
21
EBI interface
22
PIR – Protein Information Resource
Protein Family Classification System
Integrated Protein Knowledgebase
Integrated Protein Literature, Information and Knowledge
23
END
If you got lost… (class exercise)
Some more slides…
24
EB-eye search
27
NCBI - Entrez