Protein Structure: protein folding Bioinformatics in Biosophy Park, Jong Hwa MRC-DUNN Hills Road Cambridge CB2 2XY England 1 Next : 02/06/2001.

Presentation transcript:

An Unusual Fold Alpha helix is in the middle of beta sheets

From DNA To Protein In Cell

Different types of protein folds

Types of protein folds

Most proteins => Common folds From Alex Finkelstein’s talk

unusual fold: No Alpha no Beta

Beta Inside

Few seq: Rare fold, Many seq: Common Fold

Physical Selection of Folds (NOT evolutionary selection)

What is protein folding? Protein folding is the process by which a protein's primary structure (its amino acid sequence) folds into a tertiary structure, producing a stable, functional biological molecule.

Experimental background for Bioinformatics Understanding the experimental techniques and findings on protein folding is essential for designing bioinformatic folding algorithms. Theories of folding can be formulated and implemented, and doing so is an accepted approach. For small peptides, 3D folds can already be computed.

Why the protein folding problem? 1. Protein folding is one of the best-known open challenges → a good research target. 2. The physical rules of folding are not fully known. 3. The computational complexity is extremely high. 4. Many diseases are linked to protein folding, e.g. Alzheimer's disease, BSE (mad cow disease), and Parkinson's disease. 5. Without knowing how proteins fold, it is not possible to design proteins quickly and accurately.

Proteins fold back! In 1961, Christian Anfinsen showed that proteins fold themselves: if a protein becomes unfolded, it folds back into its proper shape of its own accord; no shaper or folder is needed. This means that most of the information for protein folding is encoded in the primary structure (the sequence) of the protein! It is quite unbelievable!

Levinthal’s paradox The fact that many naturally-occurring proteins fold reliably and quickly to their native state despite the astronomical number of possible configurations has come to be known as Levinthal's Paradox. C. Levinthal, in Mössbauer Spectroscopy in Biological Systems, Proceedings of a Meeting Held at Allerton House, Monticello, Illinois, edited by J. T. P. DeBrunner and E. Munck, pp , University of Illinois Press, Illinois (1969).

Astronomical possible conformations? There would be 3^N possible configurations in our theoretical protein (N = length of protein). In nature, proteins apparently do not sample all of these possible configurations since they fold in a few seconds, and even postulating a minimum time for going from one conformation to another, the proteins would have time to try on the order of 10^8 different conformations at most before reaching their final state. –From Levinthal

Longer than the age of universe? Levinthal estimated the number of configurations a typical protein could adopt (3^N, where N is the number of amino acids) and noted that even if protein configurations could be sampled at a rate of, say, per second, it would take longer than the age of the universe to find the native structure if this sampling was simply random. The result of this simple calculation markedly contradicts the actual properties of proteins, which can generally find the native structure from an unfolded state on time scales of the order of seconds or less. This well-known contradiction has come to be known as Levinthal's paradox.
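
The estimate above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not Levinthal's own calculation: the sampling rate is an assumed, illustrative value (the slide does not state one), and the age of the universe is rounded to about 13.8 billion years.

```python
# Back-of-the-envelope version of the random-search estimate.
STATES_PER_RESIDUE = 3        # from the slide: 3^N configurations
ASSUMED_RATE_PER_S = 1e13     # ASSUMPTION: illustrative sampling rate, not given in the slide
AGE_OF_UNIVERSE_S = 4.35e17   # about 13.8 billion years, expressed in seconds


def count_conformations(n_residues):
    """Number of configurations of a chain with 3 states per residue."""
    return STATES_PER_RESIDUE ** n_residues


def exhaustive_search_seconds(n_residues, rate_per_s=ASSUMED_RATE_PER_S):
    """Time needed to visit every configuration once at the given rate."""
    return count_conformations(n_residues) / rate_per_s


if __name__ == "__main__":
    n = 100  # a typical small protein length, chosen for illustration
    seconds = exhaustive_search_seconds(n)
    print(f"conformations: {count_conformations(n):.3e}")
    print(f"search time:   {seconds:.3e} s")
    print(f"universe ages: {seconds / AGE_OF_UNIVERSE_S:.3e}")
```

Changing the assumed rate by a few orders of magnitude does not alter the conclusion, because the number of configurations grows exponentially with N.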

NP-Hard? The most relevant complexity class is that of NP-hard problems, for which there is no known algorithm that is guaranteed to find the solution within polynomial time, effectively rendering NP-hard problems intractable for large sizes. It has been shown that finding the global energy minimum of a protein is NP-hard.

Native fold is not the most stable

Guided Folding and nucleation We feel that protein folding is speeded and guided by the rapid formation of local interactions which then determine the further folding of the peptide. This suggests local amino acid sequences which form stable interactions and serve as nucleation points in the folding process.

Nucleus of folding

Folding Time and protein size

Proteins have Folding Pathways Because proteins do not search through all possible conformations while folding, there must be one or MORE folding pathways that lead to the final stable fold.

Many different paths Folding paths can have different shapes. A path could be narrow and specific, or it could be some kind of funnel. The paths can be complicated and reversible.

Multiple pathways on a protein-folding energy landscape Goldbeck, R.A., Thomas, Y.G., Chen, E., Esquerra, R.M., and Kliger, D.S. (1999) Multiple pathways on a protein-folding energy landscape: kinetic evidence. Proc Natl Acad Sci U S A 96: The funnel landscape model predicts that protein folding proceeds through multiple kinetic pathways. Abstract: Experimental evidence is presented for more than one such pathway in the folding dynamics of a globular protein, cytochrome c. After photodissociation of CO from the partially denatured ferrous protein, fast time-resolved CD spectroscopy shows a submillisecond folding process that is complete in approximately 10^-6 s, concomitant with heme binding of a methionine residue. Kinetic modeling of time-resolved magnetic circular dichroism data further provides strong evidence that a 50-microsecond heme-histidine binding process proceeds in parallel with the faster pathway, implying that Met and His binding occur in different conformational ensembles of the protein, i.e., along respective ultrafast (microseconds) and fast (milliseconds) folding pathways. This kinetic heterogeneity appears to be intrinsic to the diffusional nature of early folding dynamics on the energy landscape, as opposed to the late-time heterogeneity associated with nonnative heme ligation and proline isomers in cytochrome c. Evidence from BPTI

Pathway means some intermediates A folding pathway naturally implies some intermediate states in protein folding. 1. Pre-folded or misfolded state 2. Some nucleus 3. Intermediates such as a molten globule or a partially folded state 4. Native (folded) state

Intermediates and Molten Globules A molten globule is not just a misfolded structure.

Native versus Molten globules

Free energy barrier

Native folds and energy difference

Correct Knotting for the final Str. From Alexy Finkelstein’s lecture

Is folding hierarchical? Baldwin, R.L., and Rose, G.D. (1999) Is protein folding hierarchic? II. Folding intermediates and transition states. Trends Biochem Sci 24: The folding reactions of some small proteins show clear evidence of a hierarchic process, whereas others, lacking detectable intermediates, do not. Evidence from folding intermediates and transition states suggests that folding begins locally, and that the formation of native secondary structure precedes the formation of tertiary interactions, not the reverse. Some computational experiment results support a hierarchic model of protein folding.

Dynamic nature of folding Protein folding is dynamic, so intermediates, misfolded structures, and folded structures coexist until all the species become naturally folded. There is a continuous exchange between species during the process. Dynamic nature → studying the unfolding of proteins is necessary

Network of unfolding pathways

Typically accepted stages of folding 1. Formation of short segments of secondary structure that act as folding nuclei 2. Interaction of nuclei to form domains 3. Interaction of domains to form the molten globule 4. Rearrangement of the molten globule to the native conformation of a monomer 5. Interaction of monomers to form multimers in multisubunit proteins 6. Further conformational adjustments

Wait! Here comes the chaperone Some proteins have been found that help other proteins fold correctly. Chaperones make the picture of folding more complicated.

Computational Folding

The essence of the protein folding problem in Bioinformatics We have learnt the physical and experimental background of folding. For bioinformaticians, the folding problem is HOW to map and predict all the long-range residue-residue interactions. Secondary structure prediction is comparatively easy. Assembling secondary structures into a tertiary structure is NOT. Solving the protein folding problem can be described as “Ab Initio Structure Prediction”.

Theory of protein structure. There are two theoretical approaches to the protein folding problem. The first approach is to consider proteins merely as long polymer chains whose energies must be minimized by searching in the space of torsion angles or by using simplified lattice models.
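
To make the lattice idea concrete, here is a minimal sketch of one common simplified model, the 2D HP lattice model, which the slide does not name explicitly and is used here only as an illustration. Residues are either hydrophobic (H) or polar (P), the chain is a self-avoiding walk on a square grid, and each pair of H residues that are lattice neighbours but not chain neighbours contributes -1 to the energy. The function names are hypothetical, and the search is exhaustive, so it is only practical for very short chains.

```python
from itertools import product

# Unit moves on a 2D square lattice.
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def fold_coords(moves):
    """Return the lattice coordinates of a walk, or None if it self-intersects."""
    pos = (0, 0)
    coords = [pos]
    seen = {pos}
    for m in moves:
        dx, dy = MOVES[m]
        pos = (pos[0] + dx, pos[1] + dy)
        if pos in seen:
            return None
        seen.add(pos)
        coords.append(pos)
    return coords


def hp_energy(sequence, coords):
    """-1 for every H-H pair adjacent on the lattice but not along the chain."""
    index = {c: i for i, c in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look right and up only, so each neighbouring pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index.get((x + dx, y + dy))
            if j is not None and abs(i - j) > 1 and sequence[j] == "H":
                energy -= 1
    return energy


def best_fold(sequence):
    """Exhaustively enumerate all walks and return (lowest energy, coordinates)."""
    best_energy, best_coords = None, None
    for moves in product("UDLR", repeat=len(sequence) - 1):
        coords = fold_coords(moves)
        if coords is None:
            continue
        e = hp_energy(sequence, coords)
        if best_energy is None or e < best_energy:
            best_energy, best_coords = e, coords
    return best_energy, best_coords


if __name__ == "__main__":
    energy, coords = best_fold("HPPH")
    print(energy, coords)
```

The enumeration visits 4^(N-1) candidate walks, which shows in miniature why exhaustive conformational search does not scale to real proteins.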

All atoms model Parameters are obtained through high-level quantum mechanical calculations on short peptide fragments. Such an approach has several advantages: it assures generality and allows further refinement as more accurate quantum mechanical methods become available or as the need for improvement arises. However, detailed models require both a large number of particles, typically more than 10,000, and a small time step of one to two femtoseconds (10^-15 seconds), so direct simulation of folding processes, which take place on a microsecond or longer time scale, has been difficult.
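
The cost implied by these numbers can be worked out directly. This is a rough sketch under stated assumptions: a naive all-pairs interaction count (real simulation codes use cutoffs and neighbour lists, so this is an upper bound on pair evaluations), and the function names are hypothetical.

```python
def md_steps(simulated_time_s, timestep_s):
    """Number of integration steps needed to cover the simulated time."""
    return round(simulated_time_s / timestep_s)


def naive_pair_count(n_particles):
    """Number of distinct particle pairs if every pair were evaluated each step."""
    return n_particles * (n_particles - 1) // 2


if __name__ == "__main__":
    steps = md_steps(1e-6, 2e-15)      # one microsecond at a 2 fs time step
    pairs = naive_pair_count(10_000)   # particle count quoted in the slide
    print(f"steps: {steps:,}")
    print(f"pairs per step (naive): {pairs:,}")
    print(f"pair evaluations (naive): {steps * pairs:.3e}")
```

Even a single microsecond therefore requires hundreds of millions of steps, which is why direct all-atom simulation of folding is described as difficult.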

An alternative way of looking at the problem is to consider proteins as systems of regular secondary structures. In terms of secondary structure, the protein folding process can be represented as a sequence of the following events: (1) formation of α-helices and β-sheets by the hydrophobically collapsed peptide chain, (2) assembly of the regular secondary structure elements into the protein core, and (3) joining of nonregular loops and the less stable "peripheral" helices and β-strands to the core and the association of independently formed domains.

Theoretical protein folding A theory of protein self-organization must reproduce all these events to finally calculate three-dimensional structures of proteins. The development of this theory requires, first of all, the design of a new, specialized force field to calculate free energies of α-helices and β-sheets relative to the coil (the secondary structures and the coil are represented by large ensembles of conformers), instead of enthalpies of individual conformers given by molecular mechanics force fields. This theory also requires the design of alternative global energy optimization strategies, such as calculation of the lowest energy partition of the peptide chain into different secondary and supersecondary structures, instead of searching in the space of torsion angles or lattice simulations.

Protein Structure Modelling: Comparative modelling In 4 years, humans will have almost all protein domain structures found in nature. Instead of solving the difficult protein folding problem, we can model most protein structures by comparative modelling methods.

The steps of modelling Sequence → homologous structure → alignment → superimposition → adjustment of atomic coordinates (such as by satisfaction of spatial restraints) → optimization → validation → iteration of the process.
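
The first steps of this pipeline depend on how similar the target sequence is to a candidate template. As a small illustration, the sketch below computes percent identity from an existing pairwise alignment. The function name and the gap convention ("-") are assumptions made for this example, and the alignment itself is taken as given rather than computed.

```python
def percent_identity(aligned_target, aligned_template, gap="-"):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    aligned = 0
    identical = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == gap or b == gap:
            continue  # gap columns carry no residue-to-residue equivalence
        aligned += 1
        if a == b:
            identical += 1
    return 100.0 * identical / aligned if aligned else 0.0


if __name__ == "__main__":
    # Toy alignment; the sequences are made up for illustration.
    print(percent_identity("ACD-EF", "ACDGEY"))
```

Higher identity between target and template generally means more of the template's coordinates can be trusted when building the model, which is why this check comes before superimposition and refinement.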

Modelling process

Modeller by Andrej Sali First, many distance and dihedral angle restraints on the target sequence are calculated from its alignment with template 3D structures. The form of these restraints was obtained from a statistical analysis of the relationships between many pairs of homologous structures. This analysis relied on a database of 105 family alignments that included 416 proteins with known 3D structure [Šali & Overington, 1994]. By scanning the database, tables quantifying various correlations were obtained, such as the correlations between two equivalent Cα-Cα distances, or between equivalent mainchain dihedral angles from two related proteins.

These relationships were expressed as conditional probability density functions (pdf's) and can be used directly as spatial restraints. For example, probabilities for different values of the mainchain dihedral angles are calculated from the type of a residue considered, from mainchain conformation of an equivalent residue, and from sequence similarity between the two proteins. Another example is the pdf for a certain Cα-Cα distance given equivalent distances in two related protein structures. An important feature of the method is that the spatial restraints are obtained empirically, from a database of protein structure alignments.
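
One way to picture a pdf acting as a spatial restraint is to take its negative logarithm as a penalty. The sketch below is a simplification for illustration only and is not Modeller's actual functional form: it assumes a single Gaussian centred on the equivalent template distance, with a hypothetical standard deviation `sigma`.

```python
import math


def gaussian_restraint_penalty(distance, template_distance, sigma=0.5):
    """Negative log of a Gaussian pdf centred on the template distance.

    ASSUMPTION: one Gaussian with a fixed sigma; the empirically derived
    pdfs described in the slides are more elaborate than this.
    """
    z = (distance - template_distance) / sigma
    return 0.5 * z * z + math.log(sigma * math.sqrt(2.0 * math.pi))


if __name__ == "__main__":
    template = 6.0  # hypothetical distance in the template structure
    for d in (5.0, 6.0, 7.0):
        print(d, gaussian_restraint_penalty(d, template))
```

The penalty is lowest when the model distance equals the template distance and grows quadratically as the two diverge, so summing such terms gives a function that can be minimized.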

Next, the spatial restraints and CHARMM energy terms enforcing proper stereochemistry [Brooks et al., 1983] are combined into an objective function. Finally, the model is obtained by optimizing the objective function in Cartesian space. The optimization is carried out by the use of the variable target function method [Braun & Gõ, 1985] employing methods of conjugate gradients and molecular dynamics with simulated annealing. Several slightly different models can be calculated by varying the initial structure. The variability among these models can be used to estimate the errors in the corresponding regions of the fold.
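
The optimization step can be sketched on a toy problem. The example below is not Modeller's algorithm: it replaces conjugate gradients and molecular dynamics with a plain Metropolis-style simulated annealing loop, works on a one-dimensional "structure" of three points, and uses made-up distance restraints. It only illustrates the idea of minimizing a restraint-based objective function from several different starting structures.

```python
import math
import random

# Hypothetical restraints (i, j, target_distance) on a 1D toy structure.
RESTRAINTS = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 2.0)]


def objective(x):
    """Sum of squared violations of the distance restraints."""
    return sum((abs(x[i] - x[j]) - d) ** 2 for i, j, d in RESTRAINTS)


def anneal(x0, seed, steps=5000, t_start=1.0, t_end=1e-3):
    """Simulated annealing with a geometric cooling schedule."""
    rng = random.Random(seed)
    x = list(x0)
    f = objective(x)
    best_x, best_f = list(x), f
    for k in range(steps):
        t = t_start * (t_end / t_start) ** (k / (steps - 1))
        cand = list(x)
        cand[rng.randrange(len(cand))] += rng.gauss(0.0, 0.1)
        fc = objective(cand)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if fc <= f or rng.random() < math.exp((f - fc) / t):
            x, f = cand, fc
            if f < best_f:
                best_x, best_f = list(x), f
    return best_x, best_f


if __name__ == "__main__":
    start_rng = random.Random(0)
    for seed in range(3):
        x0 = [start_rng.uniform(-1.0, 1.0) for _ in range(3)]
        model, score = anneal(x0, seed)
        print([round(v, 3) for v in model], round(score, 6))
```

Running the loop from several starting points yields slightly different models, and the spread between them gives a rough indication of which parts of the solution are well determined by the restraints.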

Running Modelling Program MODEL.html