Published by Gilbert Hartshorne. Modified over 10 years ago.
1
Tutorial: Bioinformatics Resources (http://pir.georgetown.edu/~huz/class/bioinfo_resource.html)
Bio-Trac 25 (Proteomics: Principles and Methods), March 26, 2004
Zhang-Zhi Hu, M.D., Senior Bioinformatics Scientist, Protein Information Resource, National Biomedical Research Foundation, GUMC
2
What is Bioinformatics?
computer (information) + mouse (biology) = bioinformatics
NIH Biomedical Information Science and Technology Initiative (BISTI) Working Definition (2000): Research, development, or application of computational tools and approaches for expanding the use of biological, medical, behavioral or health data, including those to acquire, store, organize, archive, analyze, or visualize such data.
3
Molecular Biology Database Collection (http://nar.oupjournals.org/cgi/content/full/32/suppl_1/D3): 548 key databases in 11 categories
4
(http://pir.georgetown.edu/~huz/class/2004_database_update.html)
5
Overview: Database Contents, Search and Retrieval
I. Text search / Information retrieval
II. Sequence & genomics databases
III. Protein family databases
IV. Databases of protein functions
V. Databases of protein structures
VI. 2D-gel databases
VII. Proteomics databases
6
Text Searches: Entrez (http://www.ncbi.nlm.nih.gov/Entrez/)
7
PubMed Literature Database (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=Search&DB=PubMed)
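The PubMed text search above can also be scripted. The following is a minimal sketch, assuming NCBI's E-utilities ESearch endpoint, which the slides do not mention; the query term and the helper name `build_esearch_url` are illustrative. It only builds the query URL and sends no request.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (an assumption; not shown in the slides).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, db="pubmed", retmax=20):
    """Return an ESearch URL for a text query against an Entrez database."""
    params = {"db": db, "term": term, "retmax": retmax}
    # urlencode escapes spaces and special characters in the query term.
    return ESEARCH + "?" + urlencode(params)


# Illustrative query; paste the printed URL into a browser to run it.
url = build_esearch_url("alpha crystallin AND rabbit")
print(url)
```

Changing `db` (for example to `protein`) points the same query at another Entrez database.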
8
UniProt Text Search (http://www.pir.uniprot.org/cgi-bin/textSearch)
9
PIR Text Search (I) (http://pir.georgetown.edu/pirwww/search/textsearch.html)
What is the difference between CRAA_RABIT and CYRBAA? How about the search "Crystallin and SuperFamily"?
10
PIR Text Search (II)
Using the PIR text search, can you find which crystallin has a determined 3D structure?
11
II. Sequence & Genomics Databases
GenBank: An annotated collection of all publicly available nucleotide and protein sequences.
RefSeq: NCBI non-redundant set of reference sequences, including genomic DNA, transcript (RNA), and protein products.
UniProt Consortium Database: Universal protein knowledgebase, a central resource of protein sequence and function from Swiss-Prot, TrEMBL and PIR.
LocusLink: Curated sequences and descriptions of genetic loci.
UniGene: Unified clusters of ESTs and full-length mRNA sequences.
OMIM: Online Mendelian Inheritance in Man, a catalog of human genetic and genomic disorders.
Model Organism Genome Databases: MGD, RGD, SGD, FlyBase…
GeneCards: Integrated database of human genes, maps, proteins and diseases.
SNP Consortium Database
12
UniProt Consortium Database (http://www.uniprot.org): UniProt (knowledgebase), UniRef (100, 90, 50), UniParc (archive)
13
UniProt Sequence Report (I) (http://www.pir.uniprot.org/cgi-bin/unipEntry?id=CRAA_RABIT)
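Identifiers such as CRAA_RABIT follow the Swiss-Prot entry-name convention: a protein mnemonic, an underscore, and a species code. A minimal sketch of splitting such names follows; the helper `split_entry_name` is hypothetical. PIR identifiers such as CYRBAA do not follow this convention and are rejected.

```python
def split_entry_name(entry_name):
    """Split a UniProt entry name such as 'CRAA_RABIT' into its two parts."""
    mnemonic, sep, species = entry_name.partition("_")
    if not sep or not mnemonic or not species:
        raise ValueError(f"not a UniProt entry name: {entry_name!r}")
    return mnemonic, species


# Entry names that appear in the slides.
for name in ("CRAA_RABIT", "CRD2_ANAPL", "CRGE_RAT"):
    protein, species = split_entry_name(name)
    print(f"{name}: protein mnemonic={protein}, species code={species}")
```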
14
UniProt Sequence Report (II) (http://www.pir.uniprot.org/cgi-bin/unipEntry?id=UniRef90_P02489)
15
NCBI LocusLink (http://www.ncbi.nlm.nih.gov/LocusLink)
16
OMIM: Online Mendelian Inheritance in Man (http://www.ncbi.nlm.nih.gov/entrez/dispomim.cgi?id=123580)
17
III. Protein Family Databases
Whole Proteins
  PIRSF: A Network Classification System of Protein Families
  COG (Clusters of Orthologous Groups) of Complete Genomes
  ProtoNet: Automated Hierarchical Classification of Proteins
Protein Domains
  Pfam: Alignments and HMM Models of Protein Domains
  SMART: Protein Domain Families
  CDD: Conserved Domain Database
Protein Motifs
  PROSITE: Protein Patterns and Profiles
  BLOCKS: Protein Sequence Motifs and Alignments
  PRINTS: Protein Sequence Motifs and Signatures
Integrated Family Databases
  iProClass: Superfamilies/Families, Domains, Motifs, Rich Links
  InterPro: Integrates Pfam, PRINTS, PROSITE, ProDom, SMART, PIRSF, SuperFamily
18
Domain Classification
(http://pir.georgetown.edu/cgi-bin/ipcEntry?id=CRAA_RABIT)
(http://www.sanger.ac.uk/cgi-bin/Pfam/swisspfamget.pl?name=CRAA_RABIT)
19
Pfam Domain (http://www.sanger.ac.uk/cgi-bin/Pfam/getacc?PF00525)
20
Integrated Family Classification
InterPro: An integrated resource unifying PROSITE, PRINTS, ProDom, Pfam, SMART, TIGRFAMs, and PIRSF. (http://www.ebi.ac.uk/interpro/search.html)
21
PIRSF: Full Length Classification
iProClass Family Report (http://pir.georgetown.edu/cgi-bin/ipcSF?id=SF002280)
22
IV. Databases of Protein Functions
Metabolic Pathways, Enzymes, and Compounds
  Enzyme Classification: Classification and Nomenclature of Enzyme-Catalysed Reactions (EC-IUBMB)
  KEGG (Kyoto Encyclopedia of Genes and Genomes): Metabolic Pathways
  LIGAND (at KEGG): Chemical Compounds, Reactions and Enzymes
  EcoCyc: Encyclopedia of E. coli Genes and Metabolism
  MetaCyc: Metabolic Encyclopedia (Metabolic Pathways)
  WIT: Functional Curation and Metabolic Models
  BRENDA: Enzyme Database
  UM-BBD: Microbial Biocatalytic Reactions and Biodegradation Pathways
Cellular Regulation and Gene Networks
  EpoDB: Genes Expressed during Human Erythropoiesis
  BIND: Descriptions of interactions, molecular complexes and pathways
  DIP: Catalog of experimentally determined interactions between proteins
  BioCarta: Biological pathways of human and mouse
  GO: Gene Ontology Consortium Database
23
KEGG Metabolic & Regulatory Pathways (http://www.genome.ad.jp/dbget-bin/show_pathway?hsa00220+4.3.2.1)
KEGG is a suite of databases and associated software, integrating our current knowledge on molecular interaction networks, the information of genes and proteins, and of chemical compounds and reactions. (http://www.genome.ad.jp/kegg/kegg2.html)
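The query string in the KEGG pathway URL above, `hsa00220+4.3.2.1`, combines a pathway identifier (organism code plus a five-digit map number) with an EC number to highlight on the map. A minimal sketch of taking it apart follows, assuming that layout; the helper `parse_pathway_query` is hypothetical, and the EC class names are the IUBMB top-level classes (class 7 was added after these slides were written).

```python
import re

# Top-level enzyme classes in the IUBMB EC nomenclature.
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
    7: "Translocases",
}

# Assumed layout: <organism code><5-digit map number>+<four-part EC number>
QUERY_PATTERN = re.compile(r"([a-z]{3,4})(\d{5})\+(\d+(?:\.\d+){3})")


def parse_pathway_query(query):
    """Split a query such as 'hsa00220+4.3.2.1' into its components."""
    match = QUERY_PATTERN.fullmatch(query)
    if match is None:
        raise ValueError(f"unrecognized pathway query: {query!r}")
    organism, pathway, ec = match.groups()
    # The first EC field gives the top-level enzyme class.
    ec_class = EC_CLASSES.get(int(ec.split(".")[0]), "unknown")
    return {"organism": organism, "pathway": pathway, "ec": ec, "ec_class": ec_class}


print(parse_pathway_query("hsa00220+4.3.2.1"))
```

EC 4.3.2.1 is argininosuccinate lyase, the enzyme named alongside delta crystallin II in the lab exercise, so the parsed class is "Lyases".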
24
BioCyc (EcoCyc/MetaCyc Metabolic Pathways)
The BioCyc Knowledge Library is a collection of Pathway/Genome Databases (http://biocyc.org/)
25
BioCarta Cellular Pathways (http://www.biocarta.com/index.asp)
26
Protein-Protein Interaction: BIND (http://www.bind.ca/)
27
Gene Ontology (http://www.geneontology.org/)
Three GOs: Molecular Function, Biological Process, Cellular Component
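Because every GO term belongs to exactly one of the three ontologies, a protein's annotations are usually read grouped by ontology. A minimal sketch follows; the three terms are real GO terms chosen for illustration and are not taken from the slides, and the helper `group_by_ontology` is hypothetical.

```python
from collections import defaultdict

# Illustrative (GO ID, term name, ontology) triples; not taken from the slides.
ANNOTATIONS = [
    ("GO:0005212", "structural constituent of eye lens", "molecular_function"),
    ("GO:0007601", "visual perception", "biological_process"),
    ("GO:0005737", "cytoplasm", "cellular_component"),
]


def group_by_ontology(annotations):
    """Group (GO ID, name, ontology) triples by their ontology."""
    grouped = defaultdict(list)
    for go_id, name, ontology in annotations:
        grouped[ontology].append((go_id, name))
    return dict(grouped)


for ontology, terms in group_by_ontology(ANNOTATIONS).items():
    print(ontology, terms)
```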
28
V. Databases of Protein Structures
Protein Structure
  PDB: Structures Determined by X-ray Crystallography and NMR
  PDBsum: Summaries and analyses of PDB structures
  MMDB: NCBI's database of 3D structures, part of NCBI Entrez
  SWISS-MODEL Repository: Database of annotated protein 3D models
  ModBase: Annotated comparative protein structure models
Structure Classification
  CATH: Hierarchical Classification of Protein Domain Structures
  SCOP: Familial and Structural Protein Relationships
  FSSP: Protein Fold Classification Based on Structure-Structure Alignment
29
PDB 3D Structure (http://www.rcsb.org/pdb/)
Rat gamma-crystallin, chain A, B. Can you do a text search at PIR to find this?
30
PDBsum: Summary and Analysis (http://www.biochem.ucl.ac.uk/bsm/pdbsum)
31
Protein Structural Classification (1)
CATH: Hierarchical domain classification of protein structures (http://www.biochem.ucl.ac.uk/bsm/cath_new/)
32
Protein Structural Classification (2)
SCOP: A comprehensive description of the structural and evolutionary relationships between all proteins whose structure is known. (http://scop.mrc-lmb.cam.ac.uk/scop/data/scop.b.html)
33
SWISS-MODEL Repository
A database of annotated three-dimensional comparative protein structure models (http://swissmodel.expasy.org/repository/smr.php?sptr_ac=CRGE_RAT&job=2)
34
VI. Proteomic Resources
GELBANK (http://gelbank.anl.gov): 2D-gel patterns from completed genomes
SWISS-2DPAGE (http://www.expasy.org/ch2d/)
PEP: Predictions for Entire Proteomes (http://cubic.bioc.columbia.edu/pep/): Summarized analyses of protein sequences
Proteome BioKnowledge Library (http://www.proteome.com): Detailed information on human, mouse and rat proteomes
Proteome Analysis Database (http://www.ebi.ac.uk/proteome/): Online application of InterPro and CluSTr for the functional classification of proteins in whole genomes
Expression profiling databases:
  GNF (http://expression.gnf.org/cgi-bin/index.cgi): human and mouse transcriptome
  SMD (http://genome-www5.stanford.edu/MicroArray/SMD/): Stanford microarray data analysis
  EBI Microarray Informatics (http://www.ebi.ac.uk/microarray/index.html): managing, storing and analyzing microarray data
35
2D-Gel Image Databases (1)
(http://us.expasy.org/ch2d/2d-index.html)
(http://us.expasy.org/cgi-bin/nice2dpage.pl?P02489)
36
2D-Gel Image Databases (2) (http://gelbank.anl.gov/2dgels/index.asp)
37
Expression Profiling
Human and Mouse Transcriptome (http://expression.gnf.org/cgi-bin/index.cgi)
(http://genome-www.stanford.edu/serum/)
38
Lab: Choose additional protein IDs to browse the variety of molecular biology databases each sequence report links to.
Delta crystallin II (Argininosuccinate lyase) (UniProt: CRD2_ANAPL)
Alpha crystallin (UniProt: CRAA_RABIT)
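For the lab, report links for each ID can be generated rather than typed. The sketch below reuses the two report URL patterns shown earlier in the slides (the UniProt sequence report and the iProClass entry report). These are 2004 addresses and may no longer resolve; the helper `report_urls` is hypothetical.

```python
from urllib.parse import urlencode

# Report URL patterns as they appear in the slides (2004); they may be obsolete.
REPORTS = {
    "UniProt sequence report": "http://www.pir.uniprot.org/cgi-bin/unipEntry",
    "iProClass report": "http://pir.georgetown.edu/cgi-bin/ipcEntry",
}


def report_urls(entry_id):
    """Return a mapping of report label to URL for one protein ID."""
    query = urlencode({"id": entry_id})
    return {label: f"{base}?{query}" for label, base in REPORTS.items()}


# The two lab proteins named in the slides.
for entry_id in ("CRD2_ANAPL", "CRAA_RABIT"):
    for label, link in report_urls(entry_id).items():
        print(f"{entry_id} | {label}: {link}")
```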