March 28, 2002 NIH Proteomics Workshop Bethesda, MD Lai-Su Yeh, Ph.D. Protein Scientist, National Biomedical Research Foundation Demo: Protein Information Resource

2 Database Demo. NREF Database. iProClass Database: iProClass Sequence (A58910), Motif (PCM00487). PIR-PSD Database: PIR Entry (A58910). Other Molecular Databases. Function: KEGG Enzyme (EC ), KEGG Pathway (MAP00230); BRENDA (EC ). Structure: PDB (1AK5), SCOP ( ), CATH (1AK5). Family: Pfam (PF00478), BLOCKS (BL00487), PROSITE (PS00487).

3 PIR Web Site (

4 Text Search Result with NULL/NOT NULL

5 Peptide Search Results

6 PIR-NREF Search Result (I) Test Sequence: ftp://nbrfa.georgetown.edu/pir/misc/test.seq

7 PIR-NREF Search Result (II)

8 HMM Domain/Motif Search

9 PIR Pattern Search

10 PIR Pattern Search Result (I) Pattern Match: Sequence vs. PROSITE

11 PIR Pattern Search Result (II) Search a query pattern against a sequence database.
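
The two pattern-search modes on slides 10 and 11 both amount to matching a PROSITE-style pattern against sequence text. A minimal sketch of that idea follows, assuming the standard PROSITE pattern syntax (`x` for any residue, `[...]` for allowed residues, `{...}` for excluded residues, `(n)` or `(n,m)` for repeats, `<` and `>` for terminus anchors). The function names `prosite_to_regex` and `scan` are hypothetical, the example pattern is the well-known N-glycosylation site pattern rather than one shown in the slides, and the test sequence is made up. This is not the PIR implementation.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    pattern = pattern.strip().rstrip(".")  # PROSITE patterns end with a period
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):  # anchor at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):  # anchor at the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as (3) or (2,4).
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.group(1), m.group(2), m.group(3)
        if core == "x":  # any residue
            core = "."
        elif core.startswith("{") and core.endswith("}"):  # excluded residues
            core = "[^" + core[1:-1] + "]"
        # Single letters and [ABC] sets are already valid regex fragments.
        repeat = ""
        if low is not None:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


def scan(pattern: str, sequence: str):
    """Return (1-based start, matched text) for non-overlapping matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    # Made-up test sequence; the second N is followed by P, so it is rejected.
    print(scan("N-{P}-[ST]-{P}.", "MKNGSAAANPTQ"))
```

Running `scan` over every entry of a sequence database gives the "query pattern against a sequence database" mode; running every stored pattern against one sequence gives the "sequence vs. PROSITE" mode. Note that `finditer` reports non-overlapping matches only.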

12 PIR Domain Display

13 PIR-NREF Database ( search

14 PIR-NREF Report

15 PIR-iProClass Database

16 iProClass Sequence Report

17 PDB Structure of Molecule: Inosine-5'-Monophosphate Dehydrogenase

18 Related Sequences

19 Protein Family Classification: Superfamily, Domain, and Motif Classification. Superfamily Concept: End-to-End Similarity & Same Overall Domain Architecture. Significance: Improve Sensitivity of Protein Identification; Provide Complete Clustering for Database Organization; Detect and Correct Genome Annotation Errors Systematically; Drive Other Annotations; Stimulate Evolution, Genomics and Proteomics Research; Discovery of New Knowledge by Using Information Embedded within Families of Homologous Sequences and Their Structures.

20 Protein Family/Superfamily Definitions. Family: A Set of Protein Sequences That Share a Common Evolutionary Ancestor with End-to-End Sequence Similarity (No Major Discrepancy by Standard Multiple Alignment Methods); Have the Same Domain Architecture (Except Incomplete or Alternately Spliced); Overall Sequence Identity >=45%. Superfamily: A Set of Protein Families That Share a Common Evolutionary Ancestor From End to End; Have the Same Domain Architecture; Overall Sequence Identity >=20%.
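
The thresholds on slide 20 can be read as a simple decision rule. The sketch below is only an illustration under stated assumptions: the two sequences are already globally aligned with `-` as the gap character, percent identity is counted over all alignment columns (the slide does not say how identity is computed), and domain architecture is given as a plain list of domain labels. The function names are hypothetical, and the slide's exception for incomplete or alternately spliced sequences is not modeled.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns (assumed definition)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences have a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def relationship(aligned_a, aligned_b, arch_a, arch_b) -> str:
    """Apply the >=45% (family) and >=20% (superfamily) identity cutoffs."""
    if list(arch_a) != list(arch_b):
        return "none"  # both levels require the same domain architecture
    identity = percent_identity(aligned_a, aligned_b)
    if identity >= 45.0:
        return "family"
    if identity >= 20.0:
        return "superfamily"
    return "none"


if __name__ == "__main__":
    arch = ["DOM_1"]  # hypothetical domain label
    print(relationship("ACDEFGHIKL", "ACDEFGHIKL", arch, arch))
    print(relationship("ACDEFGHIKL", "ACDEAAAAAA", arch, arch))
```

Pairwise identity alone does not establish common ancestry; in practice the cutoffs would be applied alongside the end-to-end alignment check the slide describes.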

21 Protein Domain Definition. Homology Domain: A Recognizable Region of Similarity; Has a Common Ancestry; Found in Diverse Protein Sequences (in >= 2 Superfamilies). A Sequence Can Belong to Only One Protein Family and Superfamily, but May Contain More Than One Domain.
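
The ">= 2 superfamilies" criterion on slide 21 can be checked mechanically once each sequence carries one superfamily assignment and a list of domains. A minimal sketch follows; the annotation layout, the function name `homology_domains`, and all identifiers in the example (`seq1`, `SF_A`, `DOM_1`, and so on) are hypothetical and are not taken from PIR or iProClass.

```python
from collections import defaultdict


def homology_domains(annotations, min_superfamilies=2):
    """Return domains that occur in at least `min_superfamilies` superfamilies.

    `annotations` maps a sequence ID to (superfamily, [domains]); each
    sequence has exactly one superfamily but may carry several domains.
    """
    seen = defaultdict(set)
    for superfamily, domains in annotations.values():
        for domain in domains:
            seen[domain].add(superfamily)
    return {domain: sorted(sfs) for domain, sfs in seen.items()
            if len(sfs) >= min_superfamilies}


if __name__ == "__main__":
    example = {
        "seq1": ("SF_A", ["DOM_1", "DOM_2"]),
        "seq2": ("SF_B", ["DOM_1"]),
        "seq3": ("SF_A", ["DOM_2"]),
    }
    # DOM_1 spans two superfamilies; DOM_2 is confined to SF_A.
    print(homology_domains(example))
```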

22 Superfamily-Domain-Motif Relationship

23 iProClass Superfamily List: All Superfamilies Containing PF00001. Superfamily-Domain Relationship: ~6000 SFs have >=1 Domains. Superfamily for Domain Architecture.

24 iProClass Superfamily Report

25 Alignment and Tree View

26 PIR-Protein Sequence Database

27 PIR-PSD Entry

28 BLAST/FASTA Search

29 PIR FASTA Search Result

30 PIR Searches and Alignment: BLAST Search; Multiple Alignment & Tree View

31 PIR Hidden Markov Model: HMM Model Building & Sequence Search; One Protein Against All HMMs; All Proteins Against One HMM

32

33 Bibliography Submission View Bibliography Information View Protein Entry Submit Citation with Optional Categorization S1

34 Bibliography Information Display (I): From PIR-NREF; From Other Curated Database (e.g., SGD)

35 Bibliography Information Display (II): From User Submission; From Computer Mapping (e.g., Gene Symbol)

36 Oracle Demo. Java Web Interface for Oracle Database Search: ( WebDB Interface to Oracle (WebDB): Tables/Views/Objects; Functions/Procedures/Packages

37 Proteomic Bioinformatics Large-Scale Analysis of Proteomic Data: Homology Search for Pathways