Computational Structure Prediction

Presentation transcript:

Computational Structure Prediction Kevin Drew Systems Biology/Bioinformatics 2/11/16

Outline
Structural Biology Basics: torsion angles, secondary structure, Ramachandran plots
Comparative Modeling: create a structure model for a protein of interest
  Find templates: HHPRED
  Build model: MODELLER
  Evaluate: PyMOL

Sequence defines Structure Structure defines Function

Protein Data Bank (PDB) http://www.rcsb.org/pdb/
PDBid: 1DFJ
Each entry lists: Molecules, Resolution, Publication, Download Links, etc.
Experimental methods: X-ray crystallography, NMR, Electron Microscopy

What is a 3D structure? A representation of a molecule: a static snapshot of a dynamic object.
Views shown: Atoms and Bonds, Coordinates, Secondary Structure, Surface
Coordinates (PDB ATOM records):
ATOM      1  N   LYS E   1      15.101  25.279 -11.672  1.00 97.78           N
ATOM      2  CA  LYS E   1      14.101  24.190 -11.496  1.00 95.96           C
ATOM      3  C   LYS E   1      13.269  24.511 -10.248  1.00 94.22           C
ATOM      4  O   LYS E   1      12.861  25.671 -10.051  1.00 94.62           O
ATOM      5  CB  LYS E   1      14.792  22.807 -11.375  1.00 97.64           C
ATOM      6  CG  LYS E   1      13.854  21.594 -11.530  1.00102.46           C
ATOM      7  CD  LYS E   1      14.278  20.409 -10.652  1.00109.05           C
ATOM      8  CE  LYS E   1      13.220  19.304 -10.681  1.00108.13           C
ATOM      9  NZ  LYS E   1      13.536  18.165  -9.780  1.00106.31           N
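ATOM records are fixed-width, so they are safer to read by column position than by splitting on whitespace. The sketch below illustrates this with atom 6 from the slide, where occupancy and B-factor run together as "1.00102.46". `format_atom` and `parse_atom` are hypothetical helper names, and the column positions follow the standard PDB ATOM layout rather than anything stated in the slides.

```python
def format_atom(serial, name, res_name, chain, res_seq, x, y, z, occ, b_factor, element):
    """Hypothetical helper: lay one atom out in fixed-width PDB ATOM columns."""
    return (
        f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain}{res_seq:4d}"
        + " " * 4
        + f"{x:8.3f}{y:8.3f}{z:8.3f}{occ:6.2f}{b_factor:6.2f}"
        + " " * 10
        + f"{element:>2s}"
    )


def parse_atom(line):
    """Parse one ATOM record by column position (0-based Python slices)."""
    return {
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain": line[21],
        "res_seq": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),
        "b_factor": float(line[60:66]),
        "element": line[76:78].strip(),
    }


# Atom 6 (CG of LYS 1, chain E) from the coordinates on the slide.
record = format_atom(6, " CG", "LYS", "E", 1, 13.854, 21.594, -11.530, 1.00, 102.46, "C")
atom = parse_atom(record)

print(record)
print(atom["name"], atom["x"], atom["y"], atom["z"], atom["b_factor"])
# Whitespace splitting merges occupancy and B-factor into a single token here.
print(len(record.split()), "whitespace-separated tokens")
```

For real work an established parser (for example the one in Biopython) is the more robust choice; the point here is only why the columns matter.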

What is a 3D structure? Atoms and Bonds
Red = Oxygen, Blue = Nitrogen, Green = Carbon. Ignore hydrogens for now.
R = 1 of 20 amino acids
PHI / PSI are rotatable; Omega = 180 (sometimes 0 for proline)

Phi / Psi torsion angles 135 -90 -140
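A torsion angle is defined by four consecutive atoms. For phi these are C(i-1), N(i), CA(i), C(i); for psi they are N(i), CA(i), C(i), N(i+1). A minimal sketch of the calculation follows. The test points are toy coordinates chosen for illustration, not atoms from a real structure, and the function names are made up for this example.

```python
import math


def sub(a, b):
    return tuple(ai - bi for ai, bi in zip(a, b))


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond."""
    b0 = sub(p0, p1)
    b1 = sub(p2, p1)
    b2 = sub(p3, p2)
    # Unit vector along the central bond.
    length = math.sqrt(dot(b1, b1))
    b1 = tuple(c / length for c in b1)
    # Components of the outer bonds perpendicular to the central bond.
    v = sub(b0, tuple(c * dot(b0, b1) for c in b1))
    w = sub(b2, tuple(c * dot(b2, b1) for c in b1))
    x = dot(v, w)
    y = dot(cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Toy geometries: the first three points are fixed, the fourth is rotated.
trans = dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
cis = dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0))
quarter = dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 0, 1))
print(abs(trans), cis, quarter)
```

The trans case corresponds to the planar omega = 180 geometry of the peptide bond, and the cis case to omega = 0.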

Ramachandran Plot
Propensity for phi/psi value combinations (statistics from PDB)
Relationship between phi/psi angles and secondary structure
S.C. Lovell et al. 2003
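To make the link between phi/psi and secondary structure concrete, here is a rough sketch that assigns a phi/psi pair to the nearest of three canonical regions. The region centers are approximate textbook values chosen for illustration, not the empirical contours of Lovell et al., and `nearest_region` is a hypothetical name. Real validation tools use density contours derived from the PDB.

```python
import math

# Approximate textbook centers (phi, psi) in degrees; assumptions for illustration.
CENTERS = {
    "right-handed alpha helix": (-57.0, -47.0),
    "beta strand": (-139.0, 135.0),
    "left-handed helix": (57.0, 47.0),
}


def angle_diff(a, b):
    """Smallest absolute difference between two angles, respecting the 360 degree wrap."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def nearest_region(phi, psi):
    """Name of the canonical region whose center is closest to (phi, psi)."""
    return min(
        CENTERS,
        key=lambda name: math.hypot(
            angle_diff(phi, CENTERS[name][0]),
            angle_diff(psi, CENTERS[name][1]),
        ),
    )


print(nearest_region(-60.0, -45.0))
print(nearest_region(-140.0, 135.0))
print(nearest_region(-150.0, -175.0))  # psi wraps around the edge of the plot
```

The periodic distance matters because the plot wraps: psi = -175 sits right next to psi = +180, so the last point still falls in the beta region.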

Levinthal’s Paradox: a thought experiment
Want to find the lowest energy conformation of a protein (values of all phi and psi angles).
RiboA (ribonuclease A) = 124 residues = 123 peptide bonds
2 torsion angles per peptide bond (phi and psi) = 246 degrees of freedom
Assume 3 stable conformations per torsion angle: 3^246 ≈ 10^117 possible states
Assume each state takes a picosecond to sample: ≈ 10^98 years to test all states, far longer than the age of the universe (13.8 x 10^9 years)
Yet proteins take microseconds to milliseconds to fold.
Thus a paradox: how do proteins do it? More importantly, how are we going to do it?
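The arithmetic of the thought experiment can be checked in a few lines. This sketch uses only the assumptions stated on the slide (124 residues, 3 states per torsion angle, one picosecond per state); the variable names are illustrative.

```python
import math

residues = 124
peptide_bonds = residues - 1
torsions = 2 * peptide_bonds            # phi and psi for each peptide bond
states = 3 ** torsions                  # 3 stable conformations per torsion angle

seconds = states * 1e-12                # one picosecond per state
seconds_per_year = 365.25 * 24 * 3600
years = seconds / seconds_per_year
age_of_universe_years = 13.8e9

print(f"degrees of freedom: {torsions}")
print(f"states: about 10^{math.log10(states):.1f}")
print(f"exhaustive search: about 10^{math.log10(years):.1f} years")
print(f"ratio to age of universe: about 10^{math.log10(years / age_of_universe_years):.1f}")
```

Whatever the exact per-state cost, the exponent is what matters: changing the sampling time by a few orders of magnitude does not bring an exhaustive search anywhere near feasible.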

Use similar proteins with known structure
Structure is more conserved than sequence (Chothia, C. and A.M. Lesk, 1986).
Figure: Structure Similarity vs. Sequence Similarity for pairs of homologues

Comparative Modeling
Predict structure of a protein using the structure of a closely related protein.
1) Identify related proteins with known structure (templates)
2) Align protein sequence with template sequence
3) Build model based on alignment with template
4) Evaluate
Eswar et al. 2006

Comparative Modeling
Predict structure of a protein using the structure of a closely related protein.
1) Identify related proteins with known structure (templates)
2) Align protein sequence with template sequence
Steps 1 and 2 are generally both done by the same tool:
  Single sequence (previous lectures): e.g. BLAST
  Sequence vs profile (profile = frequencies in a multiple sequence alignment): e.g. PSI-BLAST
  Profile vs profile: e.g. COMPASS
  Hidden Markov Models (HMM, next lecture): e.g. HMMER
  HMM vs HMM: e.g. HHPRED
3) Build model based on alignment with template
4) Evaluate

HHPRED Demo! http://toolkit.tuebingen.mpg.de/hhpred
Chinchilla Ribonuclease
>gi|533199034|ref|XP_005412130.1| PREDICTED: ribonuclease pancreatic [Chinchilla lanigera]
MTLEKSLVLFSLLILVLLGLGWVQPSLGKESSAMKFQRQHMDSSGSPSTNANYCNEMMKGRNMTQGYCKP
VNTFVHEPLADVQAVCFQKNVPCKNGQSNCYQSNSNMHITDCRLTSNSKYPNCSYRTSRENKGIIVACEG
NPYVPVHFDASV
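The demo input is a FASTA record: one header line starting with ">" followed by sequence lines. As a small sketch of how such a record can be read before it is pasted into a tool, the following parser joins the sequence lines and pulls the accession out of the header. `read_fasta` is a hypothetical helper written for this example, and taking the fourth "|"-separated field as the accession is an assumption that matches this particular header style.

```python
def read_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


fasta = """>gi|533199034|ref|XP_005412130.1| PREDICTED: ribonuclease pancreatic [Chinchilla lanigera]
MTLEKSLVLFSLLILVLLGLGWVQPSLGKESSAMKFQRQHMDSSGSPSTNANYCNEMMKGRNMTQGYCKP
VNTFVHEPLADVQAVCFQKNVPCKNGQSNCYQSNSNMHITDCRLTSNSKYPNCSYRTSRENKGIIVACEG
NPYVPVHFDASV
"""

header, sequence = read_fasta(fasta)[0]
accession = header.split("|")[3]  # assumes the gi|...|ref|ACCESSION| header layout
print(accession, len(sequence), "residues")
```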

Sequence Profiles
Profiles can be built from multiple sequence alignments and contain the frequencies of all amino acids in each column. A profile therefore carries more information than a single sequence.
Hidden Markov Models (HMMs) are like profiles but also model insertions and deletions.
HHPRED compares HMM vs HMM and additionally compares predicted secondary structure.
Söding 2005
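A minimal sketch of the profile idea: count the amino acids in each column of an alignment and convert the counts to frequencies. The four aligned sequences are a made-up toy alignment, and `build_profile` is a hypothetical name. Real profile builders such as PSI-BLAST also apply sequence weighting and pseudocounts, which are omitted here.

```python
from collections import Counter

# Toy alignment, invented for illustration ("-" marks a gap).
alignment = [
    "KETAAAKF",
    "KESAAMKF",
    "RETPAQKF",
    "KETA-AKF",
]


def build_profile(seqs):
    """Per-column amino acid frequencies, ignoring gap characters."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for column in zip(*seqs):
        counts = Counter(aa for aa in column if aa != "-")
        total = sum(counts.values())
        if total == 0:
            profile.append({})  # column contains only gaps
        else:
            profile.append({aa: n / total for aa, n in counts.items()})
    return profile


profile = build_profile(alignment)
for position, frequencies in enumerate(profile, start=1):
    print(position, frequencies)
```

Column 1 shows why this is richer than a single sequence: it records that K is preferred but R is tolerated, which a single query sequence cannot express.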

HHPRED Performance http://toolkit.tuebingen.mpg.de/hhpred/help_ov

3) Build Model: Computational Modeling
Representation: Internal, Cartesian, Full Atom, Centroid
Sampling Procedures: Monte Carlo, Molecular Dynamics, Minimization, Simulated Annealing, …
Energy Function: Molecular Mechanics, Knowledge Based (stats from PDB), Specific knowledge (restraints)
Energy = van der Waals (Lennard-Jones) + Implicit Solvent (LK model) + Residue Pair Interactions (PDB) + Hydrogen Bonding + Side chains (Dunbrack) + Torsion Parameters (PDB)
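To show how a representation, an energy function and a sampling procedure fit together, here is a toy sketch: a single Lennard-Jones term as the energy, one inter-atom distance as the representation, and Metropolis Monte Carlo with simulated annealing as the sampler. The parameters (reduced units, temperature schedule, step size, distance bounds) are arbitrary assumptions for illustration and are not taken from any modeling package.

```python
import math
import random


def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """12-6 Lennard-Jones energy for two atoms at distance r (reduced units)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def anneal(r_start=2.0, steps=20000, t_start=0.5, t_end=0.01, max_step=0.3, seed=0):
    """Metropolis Monte Carlo over one distance with a geometric cooling schedule."""
    rng = random.Random(seed)
    r, energy = r_start, lennard_jones(r_start)
    best_r, best_energy = r, energy
    for i in range(steps):
        temperature = t_start * (t_end / t_start) ** (i / (steps - 1))
        trial = r + rng.uniform(-max_step, max_step)
        if not 0.5 < trial < 3.0:
            continue  # keep the distance inside an assumed search box
        trial_energy = lennard_jones(trial)
        delta = trial_energy - energy
        # Metropolis criterion: always accept downhill, sometimes accept uphill.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            r, energy = trial, trial_energy
            if energy < best_energy:
                best_r, best_energy = r, energy
    return best_r, best_energy


r_min = 2 ** (1 / 6)  # analytic minimum of the 12-6 potential for sigma = 1
best_r, best_energy = anneal()
print(f"analytic minimum: r = {r_min:.4f}, E = {lennard_jones(r_min):.4f}")
print(f"annealing result: r = {best_r:.4f}, E = {best_energy:.4f}")
```

A real energy function sums many such terms over all atom pairs plus the solvent, hydrogen-bond and torsion terms listed above, but the accept/reject logic of the sampler is the same.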

MODELLER: Modeling by satisfaction of spatial restraints
3) Build model based on alignment with template
A. Gather spatial restraints: Residue-Residue distance, Main chain PHI / PSI angles, Solvent Accessibility, Side chain angles, H-bonds, Residue neighborhood, Secondary Structure, B-factor, Resolution of template, …
S.C. Lovell et al. 2003; Rost 2007

MODELLER: Modeling by satisfaction of spatial restraints https://salilab.org/modeller/
3) Build model based on alignment with template
A. Gather spatial restraints
B. Convert restraints to a probability density function (pdf). Example: the target aligns to two template structures, t1 and t2. The Cα-Cα distance pdf for a residue pair in the target is built from the equivalent residue pair distances in the two templates: the pdf from t1 is centered around 18 Å and the pdf from t2 around 22 Å. Search the PDB (or another database) for proteins homologous to templates t1 and t2 to obtain frequency counts. Build a pdf from the frequency counts for each template (dashed lines) and combine them (weighted linearly by similarity to the template neighborhood) into one pdf (solid line).
C. Satisfy spatial restraints: find the model that maximizes the probability P, sampling with Molecular Dynamics, Conjugate Gradient Minimization and Simulated Annealing.
Sali 1993
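Step B can be sketched numerically. Assuming, for illustration only, that each template contributes a Gaussian pdf for the Cα-Cα distance (centered at 18 Å for t1 and 22 Å for t2), the combined restraint is their weighted sum. The weights 0.7 and 0.3 and the 1 Å widths are made-up values, and this is not MODELLER's actual implementation.

```python
import math


def gaussian(d, mean, sd):
    """Normal density evaluated at distance d."""
    return math.exp(-0.5 * ((d - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def combined_pdf(d, components):
    """Weighted sum of per-template pdfs; weights are normalized to sum to 1."""
    total_weight = sum(weight for weight, _, _ in components)
    return sum(weight / total_weight * gaussian(d, mean, sd)
               for weight, mean, sd in components)


# (weight, mean distance in Angstrom, standard deviation): assumed values.
components = [
    (0.7, 18.0, 1.0),  # template t1, assumed more similar to the target
    (0.3, 22.0, 1.0),  # template t2
]

step = 0.01
grid = [10.0 + step * i for i in range(2001)]      # distances from 10 to 30 Angstrom
density = [combined_pdf(d, components) for d in grid]

area = sum(density) * step                          # should be close to 1
best_index = max(range(len(grid)), key=density.__getitem__)
best_distance = grid[best_index]

# Maximizing P is equivalent to minimizing -ln(P), which acts like an energy term.
restraint_energy = -math.log(density[best_index])
print(f"area = {area:.4f}, most probable distance = {best_distance:.2f} A, "
      f"-ln(P) = {restraint_energy:.3f}")
```

With these weights the combined pdf is bimodal and the higher peak sits near the t1 distance. In step C the optimizer has to satisfy many such restraints at once, which is why sampling methods are needed rather than a per-restraint lookup.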

MODELLER Demo! Chinchilla Ribonuclease (XP_005412130.1, same sequence as in the HHPRED demo)

Comparative Modeling 4) Evaluate Eswar et al. 2006

Comparative Modeling
4) Evaluate
Common Errors:
A. Side chain packing
B. Alignment shift
C. No template
D. Misalignment
E. Wrong template
Eswar et al. 2006

PyMOL Demo! Chinchilla Ribonuclease (XP_005412130.1, same sequence as in the HHPRED demo)