UniProt: Universal Protein Resource
  Central Resource of Protein Sequence and Function
  International Consortium
    PIR at GUMC
    European Bioinformatics Institute
    Swiss Institute of Bioinformatics
  Unifies PIR-PSD, Swiss-Prot, TrEMBL Protein Sequence Databases
  http://www.uniprot.org
UniProt Databases
  UniParc: Comprehensive Sequence Archive with Sequence History
  UniRef: Non-redundant Reference Databases for Sequence Search
  UniProtKB: Knowledgebase with Full Classification and Functional Annotation
  3-yr, $15M
UniProt Archive (UniParc)
  An archive for tracking protein sequences
  Comprehensive: All published protein sequences
  Non-Redundant: Merge identical sequence strings
  Traceable: Versioned, with ‘Active’ or ‘Obsolete’ status tag
  Concise: No annotation of function, species, tissue, etc.
  5 million unique entries from 13 million source-database entries
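The merging and status-tagging described on this slide can be illustrated with a minimal sketch. It is not UniParc's actual implementation: the class names `ArchiveEntry` and `SequenceArchive`, the `load` method, and all accessions and sequences below are hypothetical, chosen only to show identical sequence strings collapsing into one entry while each source record keeps an 'Active' or 'Obsolete' tag.

```python
from dataclasses import dataclass, field


@dataclass
class ArchiveEntry:
    """One unique sequence plus every source record that has carried it."""
    sequence: str
    # Maps (source_db, accession) -> "Active" or "Obsolete".
    xrefs: dict = field(default_factory=dict)


class SequenceArchive:
    """Hypothetical archive keyed by the exact sequence string."""

    def __init__(self):
        self._by_sequence = {}

    def load(self, source_db, accession, sequence):
        """Register a source record; identical sequence strings share one entry."""
        key = (source_db, accession)
        # If this source record previously carried a different sequence,
        # keep the old link but mark it obsolete (traceability).
        for entry in self._by_sequence.values():
            if key in entry.xrefs and entry.sequence != sequence:
                entry.xrefs[key] = "Obsolete"
        entry = self._by_sequence.setdefault(sequence, ArchiveEntry(sequence))
        entry.xrefs[key] = "Active"
        return entry

    def __len__(self):
        return len(self._by_sequence)


# Made-up accessions and sequences, for illustration only.
archive = SequenceArchive()
archive.load("Swiss-Prot", "P00001", "MKTAYIAK")
archive.load("TrEMBL", "Q00001", "MKTAYIAK")     # same string: merged
archive.load("PIR-PSD", "A00001", "MKTAYIAKQR")
archive.load("TrEMBL", "Q00001", "MKTAYIAKQR")   # sequence changed upstream
print(len(archive))  # 2 unique entries from 4 loads
```

Keying on the raw sequence string is what makes the archive non-redundant; no annotation is stored, which matches the "Concise" point above.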
UniProt Reference Clusters (UniRef)
  Non-Redundant Reference Clusters for Sequence Searching
  UniRef100 for Comprehensive Sequence Similarity Search
    100% sequence identity from all species, merging sub-fragments
    Derived from UniProtKB, with splice variants as separate entries
    Additional UniParc sources (e.g. Ensembl, IPI, EMBL_WGS)
  [Diagram labels: Sub-fragments; Splice variants; WGS (whole genome shotgun) UniParc since Sep 2004]
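As a rough illustration of 100% identity clustering with sub-fragment merging, the sketch below groups records whose sequence is identical to, or an exact substring of, a longer sequence, regardless of species. The function name `cluster_100` and the sample records are hypothetical, and the substring test is an assumed simplification, not the production UniRef procedure.

```python
def cluster_100(records):
    """Group (identifier, sequence) records at 100% identity.

    Returns {representative_sequence: [identifiers]}. A record joins the
    first cluster whose representative contains its sequence exactly.
    """
    # Process longest sequences first so every fragment is seen after
    # the full-length sequence that could contain it.
    ordered = sorted(records, key=lambda r: len(r[1]), reverse=True)
    clusters = {}
    for identifier, seq in ordered:
        for rep in clusters:
            if seq in rep:  # identical, or an exact sub-fragment
                clusters[rep].append(identifier)
                break
        else:
            clusters[seq] = [identifier]
    return clusters


# Made-up records: two species with the same sequence, one fragment,
# and one splice variant that is not a contiguous sub-fragment.
records = [
    ("frag1", "TAYIA"),
    ("full_human", "MKTAYIAKQR"),
    ("full_mouse", "MKTAYIAKQR"),
    ("isoform2", "MKTAYKQR"),
]
for rep, members in cluster_100(records).items():
    print(rep, members)
```

In this toy data the splice variant stays in its own cluster because it is not a contiguous piece of the full-length sequence, which mirrors the "splice variants as separate entries" point.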
UniProt Reference Clusters (UniRef)
  UniRef90/50 for Faster Searches using Reduced Data Sets
    UniRef90: 90% sequence identity (35% reduction from UniRef100)
    UniRef50: 50% sequence identity (65% reduction)
    Representative Sequence for cluster
  [Chart: Database Size, Release 4.4 (03/29/05); WGS (whole genome shotgun) UniParc since Sep 2004]
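The slide does not say which clustering algorithm is used, so the following is only a sketch of one common approach, greedy clustering around a representative sequence. The `identity` function is a deliberately crude, ungapped stand-in for alignment-based sequence identity, and both function names and the sample sequences are hypothetical.

```python
def identity(a, b):
    """Toy identity: matching positions over the shorter length.

    Assumption: sequences are compared left-aligned and ungapped; a real
    pipeline would compute identity from a sequence alignment.
    """
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / n


def greedy_cluster(sequences, threshold):
    """Return {representative: [members]} for the given identity threshold.

    Sequences are visited longest first; each joins the first cluster whose
    representative it matches at or above the threshold, else it founds a
    new cluster and becomes that cluster's representative.
    """
    clusters = {}
    for seq in sorted(sequences, key=len, reverse=True):
        for rep in clusters:
            if identity(rep, seq) >= threshold:
                clusters[rep].append(seq)
                break
        else:
            clusters[seq] = [seq]
    return clusters


# Made-up sequences for illustration.
seqs = ["MKTAYIAKQR", "MKTAYIAKQK", "MKTAYLLKQK", "GGGGGGGGGG"]
print(len(greedy_cluster(seqs, 0.9)))  # 3 clusters at 90% identity
print(len(greedy_cluster(seqs, 0.5)))  # 2 clusters at 50% identity
```

Lowering the threshold merges more sequences under fewer representatives, which is the source of the smaller UniRef90 and UniRef50 data sets. Note that in this sketch members are compared only with the representative, not with each other.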
UniProt Knowledgebase (UniProtKB)
  Objective: Stable, Comprehensive, Fully Classified, Richly and Accurately Annotated
    Describe in a single record all protein products derived from a certain gene in a given species
  Information Content
    Isoform Presentation: Alternatively Spliced Forms, Proteolytic Cleavage, and Post-Translational Modification (each with FTid)
    Nomenclature: Gene/Protein Names (Nomenclature Committees)
    Family Classification and Domain Identification: InterPro and PIRSF
    Functional Annotation: Function, Functional Site, Developmental Stage, Catalytic Activity, Modification, Regulation, Induction, Pathway, Tissue Specificity, Subcellular Location, Disease, Process
  VARSPLIC: derived by alternative splicing, proteolytic cleavage, and post-translational modification; own identifiers
UniProtKB Report (I)
UniProtKB Report (II)
  http://www.pir.uniprot.org/cgi-bin/upEntry?id=PH4H_HUMAN