Protein structure prediction

Presentation transcript:

Protein structure prediction Siddhartha Jain

Amino acid structure

4 levels of protein structure

Protein secondary structural motifs: alpha helices. Each amino acid (AA) contributes a 100-degree turn around the helix axis and a translation of 1.5 angstroms along it.
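The two numbers on this slide are enough to place the alpha carbons of an idealized helix. The sketch below is illustrative only: the 2.3 angstrom helix radius is an assumed typical value that does not come from the slide, and `ideal_helix_ca` is a hypothetical helper name.

```python
import math

RISE_PER_RESIDUE = 1.5    # angstroms, from the slide
TURN_PER_RESIDUE = 100.0  # degrees, from the slide
CA_RADIUS = 2.3           # angstroms; assumed typical C-alpha radius, not from the slide


def ideal_helix_ca(n_residues):
    """Return (x, y, z) C-alpha positions for an idealized helix along the z axis."""
    coords = []
    for i in range(n_residues):
        theta = math.radians(TURN_PER_RESIDUE * i)
        coords.append((CA_RADIUS * math.cos(theta),
                       CA_RADIUS * math.sin(theta),
                       RISE_PER_RESIDUE * i))
    return coords


# Derived quantities: residues per full turn and rise per full turn (pitch).
residues_per_turn = 360.0 / TURN_PER_RESIDUE
pitch = residues_per_turn * RISE_PER_RESIDUE

helix = ideal_helix_ca(4)
# Distance between consecutive alpha carbons in this idealized model.
neighbor_distance = math.dist(helix[0], helix[1])
```

Under these assumptions the helix has 3.6 residues per turn and a pitch of 5.4 angstroms, and consecutive alpha carbons come out roughly 3.8 angstroms apart.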

Protein secondary structural motifs: beta sheets. Composed of beta strands hydrogen-bonded together. Participating strands do not have to be close in the primary sequence.

Protein secondary structural motifs: turns and loops. Turns allow the polypeptide chain to change direction, are classified according to various criteria (number of residues, bonding, etc.), and usually have 4-5 residues. Loops are any irregular or unclassified turns.

Structure prediction strategies: molecular dynamics; energy function minimization.

Protein representation
Cartesian space: X, Y, Z coordinates.
Advantages: highly parallelizable; small changes in coordinates likely lead to small changes in energy, so it is easy to prevent steric clashes.
Disadvantages: harder to maintain bond length, bond angle, and dihedral angle constraints (local geometry).
Torsion (internal coordinate) space: bond length (2 atoms), bond angle (3 atoms), torsion/dihedral angle (4 atoms).
Advantages: easy to maintain local geometry; energy functions are usually characterized in these parameters.
Disadvantages: harder to parallelize; small changes can lead to big structural changes.
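To make the link between the two representations concrete, a torsion angle can be computed from the Cartesian coordinates of four consecutive atoms. This is a minimal sketch using the standard atan2 formulation; the function name `dihedral_degrees` and the test points are made up for illustration.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p0, p1, p2, p3):
    """Signed torsion angle (degrees) defined by four points p0-p1-p2-p3."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Three toy conformations that differ only in the position of the fourth atom.
cis = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
quarter = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
trans = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
```

The twelve Cartesian numbers collapse to a single torsion value here, which is why local geometry is easy to maintain in torsion space while a small change in one angle can move everything downstream of it.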

Amber energy function
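For reference, the functional form usually quoted for Amber-style force fields is sketched below. The symbols are the conventional ones and are assumptions here, not taken from the slide: $k_b$, $k_\theta$, and $V_n$ are force constants, $r_0$ and $\theta_0$ are equilibrium values, $\gamma$ is a phase, $A_{ij}$ and $B_{ij}$ are Lennard-Jones coefficients, and $q_i$ are partial charges.

```latex
E_{\text{total}} =
  \sum_{\text{bonds}} k_b \,(r - r_0)^2
+ \sum_{\text{angles}} k_\theta \,(\theta - \theta_0)^2
+ \sum_{\text{dihedrals}} \frac{V_n}{2}\,\bigl[1 + \cos(n\phi - \gamma)\bigr]
+ \sum_{i<j} \left[ \frac{A_{ij}}{r_{ij}^{12}} - \frac{B_{ij}}{r_{ij}^{6}}
+ \frac{q_i q_j}{\varepsilon\, r_{ij}} \right]
```

The first three sums are the bonded terms (bond lengths, bond angles, torsions), and the last sum collects the non-bonded van der Waals and electrostatic interactions.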

Lennard Jones potential
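For reference, the standard 12-6 Lennard-Jones form is V(r) = 4ε[(σ/r)^12 - (σ/r)^6]. The sketch below evaluates it in reduced units; the parameter names `epsilon` and `sigma` and their default values are illustrative assumptions.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """12-6 Lennard-Jones energy at separation r (same length units as sigma)."""
    sr6 = (sigma / r) ** 6
    # Repulsive r^-12 term minus attractive r^-6 term.
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# The energy crosses zero at r = sigma and reaches its minimum, -epsilon,
# at r = 2^(1/6) * sigma.
r_min = 2.0 ** (1.0 / 6.0)
energy_at_sigma = lennard_jones(1.0)
energy_at_minimum = lennard_jones(r_min)
```

The steep repulsive wall below `sigma` is what penalizes steric clashes in an energy function.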

Strategies for protein folding: Rosetta (template-based structure search); AlphaFold (by DeepMind).

AlphaFold

Features: multiple sequence alignment (MSA) features and sequence features. The MSA features carry coevolutionary information and are very important: on contact prediction, performance drops from 50% to 13% without them.

Coevolutionary constraints: homologs of the protein are identified, a multiple sequence alignment (MSA) is built, and coevolutionary restraints are identified from it.
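One simple way to see a coevolutionary signal in an MSA is the mutual information between two alignment columns. This is only a stand-in for illustration: real pipelines typically rely on more sophisticated methods such as direct coupling analysis, and the four-sequence alignment below is invented.

```python
import math
from collections import Counter


def column_mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    count_i = Counter(seq[i] for seq in msa)
    count_j = Counter(seq[j] for seq in msa)
    count_ij = Counter((seq[i], seq[j]) for seq in msa)
    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((count_i[a] / n) * (count_j[b] / n)))
    return mi


# Toy alignment (made up): columns 0 and 1 always change together,
# column 2 is fully conserved, column 3 varies independently of column 0.
toy_msa = ["AKLE", "AKLD", "GRLE", "GRLD"]

mi_covarying = column_mutual_information(toy_msa, 0, 1)
mi_independent = column_mutual_information(toy_msa, 0, 3)
mi_conserved = column_mutual_information(toy_msa, 0, 2)
```

Column pairs that mutate together score high, which is the kind of evidence used to suggest that two residues are in contact; conserved or independently varying columns score zero.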

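Main idea: predict a distribution over inter-residue distances and backbone torsion angles (distances are measured between the beta carbons of the residues, or the alpha carbon for glycine). The network is trained with a cross-entropy loss. The predicted distance histogram is called a distogram.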
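The cross-entropy training signal for a single residue pair can be sketched as follows. The binning (64 equal-width bins from 2 to 22 angstroms) is assumed for illustration, and the helper names are hypothetical; this is not the authors' code.

```python
import math

# Assumed binning for illustration: 64 equal-width bins covering 2 to 22 angstroms.
N_BINS, D_MIN, D_MAX = 64, 2.0, 22.0


def distance_to_bin(d):
    """Map a distance in angstroms to a bin index, clamping out-of-range values."""
    width = (D_MAX - D_MIN) / N_BINS
    idx = int((d - D_MIN) // width)
    return max(0, min(N_BINS - 1, idx))


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distogram_cross_entropy(logits, true_distance):
    """Negative log-probability assigned to the bin containing the true distance."""
    probs = softmax(logits)
    return -math.log(probs[distance_to_bin(true_distance)])


# An uninformative prediction (all logits equal) versus one peaked on the right bin.
uniform_loss = distogram_cross_entropy([0.0] * N_BINS, 8.0)
peaked_logits = [0.0] * N_BINS
peaked_logits[distance_to_bin(8.0)] = 5.0
peaked_loss = distogram_cross_entropy(peaked_logits, 8.0)
```

A uniform prediction costs log(64) nats per pair, and the loss falls as probability mass concentrates on the bin that contains the true distance.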

Structure generation: just do gradient descent, which works very well! The score function for gradient descent is (statistical potential + torsion likelihood + Rosetta energy function).
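The idea of descending a distance-based score can be illustrated with a toy problem. In this sketch a simple harmonic penalty on three pairwise distances stands in for the real score function, and the descent runs directly on Cartesian coordinates; the target distances, step size, and starting points are all made-up values.

```python
import math


def potential_and_gradient(coords, targets):
    """Sum of (d_ij - target_ij)^2 over listed pairs, plus its gradient."""
    energy = 0.0
    grad = [[0.0, 0.0, 0.0] for _ in coords]
    for (i, j), t in targets.items():
        diff = [a - b for a, b in zip(coords[i], coords[j])]
        d = math.sqrt(sum(c * c for c in diff))
        energy += (d - t) ** 2
        for k in range(3):
            g = 2.0 * (d - t) * diff[k] / d
            grad[i][k] += g
            grad[j][k] -= g
    return energy, grad


def gradient_descent(coords, targets, step=0.05, n_steps=1000):
    """Plain fixed-step gradient descent on the toy potential."""
    coords = [list(p) for p in coords]
    for _ in range(n_steps):
        _, grad = potential_and_gradient(coords, targets)
        for p, g in zip(coords, grad):
            for k in range(3):
                p[k] -= step * g[k]
    return coords


# Hypothetical target distances (angstroms) for three residues.
targets = {(0, 1): 3.8, (1, 2): 3.8, (0, 2): 6.0}
start = [(0.0, 0.0, 0.0), (1.0, 0.5, 0.0), (2.0, -0.3, 0.2)]

start_energy, _ = potential_and_gradient(start, targets)
final = gradient_descent(start, targets)
final_energy, _ = potential_and_gradient(final, targets)
```

Starting from a compressed arrangement, the points spread out until all three target distances are satisfied. A real score function has many more terms and local minima, so this shows only the mechanics of the update.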

Statistical potential

Learn statistical potential likelihood: learn a potential function that assigns a potential to every state (based on just inter-residue distances as features). Normalize the potential function with respect to a reference state, which is based on the location of residues and the protein length and is learnt from data.
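Normalizing against a reference state can be read as a difference of log-probabilities. The sketch below shows one distance bin for one residue pair; the function name and the probability values are invented for illustration, and both inputs are assumed to be strictly positive.

```python
import math


def distance_potential(p_predicted, p_reference):
    """Potential for one distance bin: -log P(d | features) + log P_ref(d)."""
    return -math.log(p_predicted) + math.log(p_reference)


# If the prediction matches the reference, the bin contributes nothing.
neutral = distance_potential(0.5, 0.5)
# A distance that is more likely than the reference expects is favorable (negative).
favorable = distance_potential(0.8, 0.2)
# A distance that is less likely than the reference expects is penalized (positive).
penalized = distance_potential(0.1, 0.4)
```

Subtracting the reference term removes the part of the signal that comes only from chain length and sequence separation, leaving what the features actually say about the pair.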

Final scoring network: use the distogram, a contact map derived from the distogram, and MSA features to predict a GDT distribution. Use this network to select among the final set of structures.

Evaluation criteria. Root mean square deviation (RMSD): sensitive to outlier regions created by poor modeling of individual loop regions. Global distance test (GDT_TS): based on the largest set of amino acid alpha carbon atoms falling within a defined distance cutoff of their position in the experimental structure.
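The contrast between the two metrics shows up in a small example. This sketch assumes the model and the experimental structure are already superimposed and uses the commonly quoted GDT_TS cutoffs of 1, 2, 4, and 8 angstroms; a full GDT calculation also searches over superpositions, which is omitted here. The coordinates are invented.

```python
import math


def ca_distances(model, reference):
    """Per-residue distances between matched alpha carbons."""
    return [math.dist(a, b) for a, b in zip(model, reference)]


def rmsd(model, reference):
    d = ca_distances(model, reference)
    return math.sqrt(sum(x * x for x in d) / len(d))


def gdt_ts(model, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Average percentage of residues within each cutoff (no superposition search)."""
    d = ca_distances(model, reference)
    fractions = [sum(x <= c for x in d) / len(d) for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)


# Toy structures: three residues placed perfectly, one displaced by 10 angstroms.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 10.0, 0.0)]

toy_rmsd = rmsd(model, reference)
toy_gdt = gdt_ts(model, reference)
```

A single badly placed residue pushes the RMSD to 5 angstroms even though three quarters of the model is exact, while the GDT_TS score of 75 reflects the fraction that is well modeled.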