Protein Structure Prediction II


Protein Structure Prediction II
SCOP: protein structure classification
CATH: protein structure classification
genTHREADER: 3D structure prediction
SWISS-MODEL: 3D structure prediction
ModBase: a database of 3D structure predictions

SCOP: Structural Classification of Proteins
http://scop.mrc-lmb.cam.ac.uk/scop/
Based on known protein structures.
Manually created by visual inspection.
Hierarchical database structure: Class, Fold, Superfamily, Family, Protein and Species.
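The six-level hierarchy can be pictured as a tree in which every entry has a chain of parents and a list of children. The following is a minimal sketch only: the `ScopNode` class and its methods are hypothetical names chosen for illustration and are not part of any SCOP software. The sperm whale myoglobin lineage is used as the example entry.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Level names in the order given on the slide.
SCOP_LEVELS = ["Class", "Fold", "Superfamily", "Family", "Protein", "Species"]


@dataclass(eq=False)  # identity comparison; avoids recursive field-by-field equality
class ScopNode:
    """Hypothetical node type for illustrating the SCOP hierarchy."""
    level: str
    name: str
    parent: Optional["ScopNode"] = None
    children: List["ScopNode"] = field(default_factory=list)

    def add_child(self, name: str) -> "ScopNode":
        """Attach a child one level below this node and return it."""
        index = SCOP_LEVELS.index(self.level)
        if index + 1 >= len(SCOP_LEVELS):
            raise ValueError("Species is the lowest level and has no children")
        child = ScopNode(SCOP_LEVELS[index + 1], name, parent=self)
        self.children.append(child)
        return child

    def parents(self) -> List["ScopNode"]:
        """Return all ancestors, ordered from Class downwards."""
        lineage = []
        node = self.parent
        while node is not None:
            lineage.append(node)
            node = node.parent
        return list(reversed(lineage))


# Example lineage: sperm whale myoglobin.
root = ScopNode("Class", "All alpha proteins")
fold = root.add_child("Globin-like")
superfamily = fold.add_child("Globin-like")
family = superfamily.add_child("Globins")
protein = family.add_child("Myoglobin")
species = protein.add_child("Sperm whale (Physeter catodon)")

for ancestor in protein.parents():
    print(f"{ancestor.level}: {ancestor.name}")
print("Children:", [child.name for child in protein.children])
```

Browsing a node in this sketch means reading `parents()` upwards and `children` downwards.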

[Figure: a node in the hierarchy, shown with its parents and its children]


CATH: Protein Structure Classification by Class, Architecture, Topology and Homology
http://www.cathdb.info/latest/index.html
Class: the secondary structure composition (mainly-alpha, mainly-beta or alpha-beta).
Architecture: the overall shape of the domain structure, given by the orientations of the secondary structures (e.g. barrel or 3-layer sandwich).
Topology: structures are grouped into fold groups at this level, depending on both the overall shape and the connectivity of the secondary structures.
Homologous Superfamily: evolutionarily conserved structures.
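CATH writes a classification as four dot-separated numbers, one per level (for example 3.40.50.300, where 3 is the alpha-beta class and 3.40 is a 3-layer sandwich architecture). The sketch below splits such an identifier into its levels. The function names are hypothetical and this is not CATH's own software. The class table covers only the three classes named on the slide.

```python
from typing import Dict

# Class numbers for the three classes listed on the slide.
CATH_CLASSES = {1: "mainly-alpha", 2: "mainly-beta", 3: "alpha-beta"}

CATH_LEVELS = ("class", "architecture", "topology", "homologous_superfamily")


def parse_cath_id(cath_id: str) -> Dict[str, str]:
    """Split a four-part CATH identifier into one prefix per level."""
    parts = cath_id.split(".")
    if len(parts) != 4 or not all(part.isdigit() for part in parts):
        raise ValueError(f"expected four dot-separated numbers, got {cath_id!r}")
    return {
        level: ".".join(parts[: depth + 1])
        for depth, level in enumerate(CATH_LEVELS)
    }


def class_name(cath_id: str) -> str:
    """Name of the CATH class, or 'other' for classes not listed above."""
    class_number = int(parse_cath_id(cath_id)["class"])
    return CATH_CLASSES.get(class_number, "other")


levels = parse_cath_id("3.40.50.300")
print(levels["architecture"])     # prefix identifying the architecture
print(class_name("3.40.50.300"))  # class name looked up from the first number
```

Two domains that share the same `topology` prefix are in the same fold group. If they also share the full four-part identifier, they belong to the same homologous superfamily.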


genTHREADER
http://bioinf.cs.ucl.ac.uk/psipred/psiform.html
Input: sequence
Type of analysis: PSIPRED, MEMSAT or genTHREADER


GenTHREADER Output
The output sequences show some sequence homology but a high level of secondary structure conservation.

SWISS-MODEL
An automated protein modeling server.
http://swissmodel.expasy.org/

SWISS-MODEL
The SWISS-MODEL algorithm can be divided into three steps:
1. Search for suitable templates: the server finds all similarities of the query sequence to sequences of known structure. It uses the BLASTP2 program with the ExNRL-3D database (a derivative of the PDB database, prepared specifically for SWISS-MODEL). These partial results are returned as a SwissModel TraceLog file.
2. Check sequence identity with the target: all templates with sequence identities above 25% are selected.
3. Create the model using the ProModII program. The result is returned as a SwissModel-Model file.
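The template-selection step can be sketched as a filter on percent sequence identity. This is an illustration under stated assumptions, not SWISS-MODEL's code: the function names are hypothetical, identity is assumed to be counted over aligned columns where neither sequence has a gap, and the toy sequences and template names (`tmplA`, `tmplB`, `tmplC`) are invented for the example.

```python
from typing import Dict, List, Tuple

IDENTITY_CUTOFF = 25.0  # percent; the threshold quoted on the slide


def percent_identity(aligned_query: str, aligned_template: str) -> float:
    """Percent of identical residues over gap-free aligned columns (assumed definition)."""
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (q, t)
        for q, t in zip(aligned_query, aligned_template)
        if q != "-" and t != "-"
    ]
    if not columns:
        return 0.0
    matches = sum(q == t for q, t in columns)
    return 100.0 * matches / len(columns)


def select_templates(
    aligned_query: str,
    aligned_templates: Dict[str, str],
    cutoff: float = IDENTITY_CUTOFF,
) -> List[Tuple[str, float]]:
    """Keep templates strictly above the cutoff, best identity first."""
    scored = [
        (name, percent_identity(aligned_query, seq))
        for name, seq in aligned_templates.items()
    ]
    selected = [(name, identity) for name, identity in scored if identity > cutoff]
    return sorted(selected, key=lambda item: item[1], reverse=True)


# Toy pre-aligned sequences (invented for illustration).
query = "ACDEFGHIKL"
templates = {
    "tmplA": "ACDEFGHIKV",  # 9 of 10 columns identical
    "tmplB": "AC--WWWWWW",  # 2 of 8 gap-free columns identical
    "tmplC": "WWWWWWWWWW",  # no identical columns
}
print(select_templates(query, templates))
```

In this toy run only `tmplA` passes. `tmplB` sits exactly at 25% and is dropped because the slide says "above 25%".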

SWISS-MODEL
Receive the PDB file by e-mail and load it into Jmol.

Homology Modeling Single Structure

[Figure: Swiss-Model file showing the query and the structures used for the homology model]

Comparative Modeling
The accuracy of a comparative model is related to the sequence identity on which it is based:
>50% sequence identity: high accuracy
30%-50% sequence identity: about 90% modeled
<30% sequence identity: low accuracy (many errors)
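These three bands can be written as a small lookup. This is a sketch only: the function name is hypothetical, and the label "intermediate" for the 30%-50% band is an assumption, since the slide itself only says that about 90% is modeled in that range.

```python
def expected_accuracy(identity_percent: float) -> str:
    """Map target-template sequence identity (in percent) to an accuracy band."""
    if not 0.0 <= identity_percent <= 100.0:
        raise ValueError("identity must be between 0 and 100 percent")
    if identity_percent > 50.0:
        return "high"
    if identity_percent >= 30.0:
        # Label assumed; the slide states only "90% modeled" for this band.
        return "intermediate"
    return "low"


for identity in (72.0, 41.5, 18.0):
    print(f"{identity:.1f}% identity -> {expected_accuracy(identity)} accuracy")
```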

ModBase: A Homology Model Database

Ligand Binding Site

A clan is defined as a group of Pfam families that share a common evolutionary origin. The families in a clan generally differ at the sequence and functional levels but are similar at the structural level. Example: the histone superfamily.