Protein Folding Protein Structure Prediction Protein Design

Slides:



Advertisements
Similar presentations
GroEL and GroES By: Jim, Alan, John. Background Proteins fold spontaneously to their native states, based on information in their amino acid sequence.
Advertisements

PROTEOMICS 3D Structure Prediction. Contents Protein 3D structure. –Basics –PDB –Prediction approaches Protein classification.
Rosetta Energy Function Glenn Butterfoss. Rosetta Energy Function Major Classes: 1. Low resolution: Reduced atom representation Simple energy function.
Protein Structure Prediction using ROSETTA
Proteins - Many Structures, Many Functions 1.A polypeptide is a polymer of amino acids connected to a specific sequence 2.A protein’s function depends.
S ASC Answer to Practice Problem
Protein Threading Zhanggroup Overview Background protein structure protein folding and designability Protein threading Current limitations.
Folding and flexibility. Outline What is protein folding ? How proteins fold in vivo ? What is protein flexibility ?
Prediction to Protein Structure Fall 2005 CSC 487/687 Computing for Bioinformatics.
Biology 107 Macromolecules II September 5, Macromolecules II Student Objectives:As a result of this lecture and the assigned reading, you should.
College 4. Coordination interaction A dipolar bond, or coordinate covalent bond, is a description of covalent bonding between two atoms in which both.
Energetics and kinetics of protein folding. Comparison to other self-assembling systems?
. Protein Structure Prediction [Based on Structural Bioinformatics, section VII]
(Foundation Block) Dr. Ahmed Mujamammi Dr. Sumbul Fatma
Proteins account for more than 50% of the dry mass of most cells
Proteins Dr. Sumbul Fatma Clinical Chemistry Unit
Bioinf. Data Analysis & Tools Molecular Simulations & Sampling Techniques117 Jan 2006 Bioinformatics Data Analysis & Tools Molecular simulations & sampling.
Protein modelling ● Protein structure is the key to understanding protein function ● Protein structure ● Topics in modelling and computational methods.
Protein Structure Prediction Dr. G.P.S. Raghava Protein Sequence + Structure.
Proteins account for more than 50% of the dry mass of most cells
Diverse Macromolecules. V. proteins are macromolecules that are polymers formed from amino acids monomers A. proteins have great structural diversity.
Supersecondary structures. Supersecondary structures motifs motifs or folds, are particularly stable arrangements of several elements of the secondary.
What are proteins? Proteins are important; e.g. for catalyzing and regulating biochemical reactions, transporting molecules, … Linear polymer chain composed.
CRB Journal Club February 13, 2006 Jenny Gu. Selected for a Reason Residues selected by evolution for a reason, but conservation is not distinguished.
Protein “folding” occurs due to the intrinsic chemical/physical properties of the 1° structure “Unstructured” “Disordered” “Denatured” “Unfolded” “Structured”
Bioinformatics: Practical Application of Simulation and Data Mining Protein Folding I Prof. Corey O’Hern Department of Mechanical Engineering & Materials.
A protein’s function depends on its specific conformation (shape) A functional proteins consists of one or more polypeptides that have been precisely twisted,
Protein Structure Stryer Short Course Chapter 4. Peptide bonds Amide bond Primary structure N- and C-terminus Condensation and hydrolysis.
Shaping up the protein folding funnel by local interaction: Lesson from a structure prediction study George Chikenji*, Yoshimi Fujitsuka, and Shoji Takada*
Department of Mechanical Engineering
Secondary structure prediction
Operone lac Principles of protein structure and function Function is derived from structure Structure is derived from amino acid sequence Different.
Part I : Introduction to Protein Structure A/P Shoba Ranganathan Kong Lesheng National University of Singapore.
Protein Structure (Foundation Block) What are proteins? Four levels of structure (primary, secondary, tertiary, quaternary) Protein folding and stability.
Protein Folding and Modeling Carol K. Hall Chemical and Biomolecular Engineering North Carolina State University.
Proteins are instrumental in about everything that an organism does. These functions include structural support, storage, transport of other substances,
THE STRUCTURE AND FUNCTION OF MACROMOLECULES Proteins - Many Structures, Many Functions 1.A polypeptide is a polymer of amino acids connected to a specific.
Protein Structure (Foundation Block) What are proteins? Four levels of structure (primary, secondary, tertiary, quaternary) Protein folding and stability.
Tertiary Structure Globular proteins (enzymes, molecular machines)  Variety of secondary structures  Approximately spherical shape  Water soluble 
Protein Modeling Protein Structure Prediction. 3D Protein Structure ALA CαCα LEU CαCαCαCαCαCαCαCα PRO VALVAL ARG …… ??? backbone sidechain.
Protein Structure Prediction ● Why ? ● Type of protein structure predictions – Sec Str. Pred – Homology Modelling – Fold Recognition – Ab Initio ● Secondary.
Introduction to Protein Structure Prediction BMI/CS 576 Colin Dewey Fall 2008.
Folding of proteins Proteins are synthesized on ribosomes as linear chains of amino acids. In order to be biologically active, they must fold into a unique.
Protein Design with Backbone Optimization Brian Kuhlman University of North Carolina at Chapel Hill.
Proteins Dr. Sumbul Fatma Clinical Chemistry Unit Department of Pathology Tel
Structure prediction: Ab-initio Lecture 9 Structural Bioinformatics Dr. Avraham Samson Let’s think!
3-D Structure of Proteins
PROTEIN FOLDING AND DEGRADATION Kanokporn Boonsirichai.
PROTEIN FOLDING: H-P Lattice Model 1. Outline: Introduction: What is Protein? Protein Folding Native State Mechanism of Folding Energy Landscape Kinetic.
Objective 7: TSWBAT recognize and give examples of four levels of protein conformation and relate them to denaturation.
Intended Learning Objectives You should be able to… 1. Give 3 examples of proteins that are important to humans and are currently produced by transgenic.
Chemistry XXI Unit 3 How do we predict properties? M1. Analyzing Molecular Structure Predicting properties based on molecular structure. M4. Exploring.
Structural classification of Proteins SCOP Classification: consists of a database Family Evolutionarily related with a significant sequence identity Superfamily.
Protein backbone Biochemical view:
Levels of Protein Structure. Why is the structure of proteins (and the other organic nutrients) important to learn?
What is Protein Folding? Implications of Misfolding Computational Techniques Background image: Staphylococcal protein A, Z Domain (
Protein Folding & Biospectroscopy Lecture 4 F14PFB David Robinson.
CS-ROSETTA Yang Shen et al. Presented by Jonathan Jou.
Protein Structure Prediction: Threading and Rosetta BMI/CS 576 Colin Dewey Fall 2008.
CHAPTER 5 THE STRUCTURE AND FUNCTION OF MACROMOLECULES Copyright © 2002 Pearson Education, Inc., publishing as Benjamin Cummings Section D: Proteins -
Polypeptide Chains Can Change Direction by Making Reverse Turns and Loops.
Protein Folding.
Protein Structure and Properties
Enzyme Kinetics & Protein Folding 9/7/2004
Directed Mutagenesis and Protein Engineering
Rosetta: De Novo determination of protein structure
Protein structure prediction.
CHAPTER 5 THE STRUCTURE AND FUNCTION OF MACROMOLECULES
Protein structure (Foundation Block).
The Three-Dimensional Structure of Proteins
Presentation transcript:

Protein Folding Protein Structure Prediction Protein Design Brian Kuhlman Department of Biochemistry and Biophysics

Protein Folding Why do we care about protein folding? The process by which a protein goes from being an unfolded polymer with no activity to a uniquely structured and active protein. Why do we care about protein folding? If we understand how proteins fold, maybe it will help us predict their three-dimensional structure from sequence information alone. Protein misfolding has been implicated in many human diseases (Alzheimer's, Parkinson’s, …)

Protein folding in vitro is often reversible (indicating that the final folded structure is determined by its amino acid sequence) 37° C 70° C 37° C Chris Anfinsen - 1957

How Do Proteins Fold? Do proteins fold by performing an exhaustive search of conformational space? Cyrus Levinthal tried to estimate how long it would take a protein to do a random search of conformational space for the native fold. Imagine a 100-residue protein with three possible conformations per residue. Thus, the number of possible folds = 3100 = 5 x 1047. Let us assume that protein can explore new conformations at the same rate that bonds can reorient (1013 structures/second). Thus, the time to explore all of conformational space = 5 x 1047/1013 = 5 x 1034 seconds = 1.6 x 1027 years >> age of universe This is known as the Levinthal paradox.

How do proteins fold? Do proteins fold by a very discrete pathway? Flat landscape (Levinthal paradox) Tunnel landscape (discrete pathways) Realistic landscape (“folding funnel”)

How do proteins fold? Typically, proteins fold by progressive formation of native-like structures. Folding energy surface is highly connected with many different routes to final folded state.

How do proteins fold? Interactions between residues close to each other along the polypeptide chain are more likely to form early in folding.

Protein Folding Rates Correlate with Contact Order N = number of contacts in the protein DLij = sequence separation between contacting residues

Protein misfolding: the various states a protein can adopt.

Molecular Chaperones Nature has a developed a diverse set of proteins (chaperones) to help other proteins fold. Over 20 different types of chaperones have been identified. Many of these are produced in greater numbers during times of cellular stress.

Example: The GroEL(Hsp60) family GroEL proteins provide a protected environment for other proteins to fold. Binding of U occurs by interaction with hydrophobic residues in the core of GroEL. Subsequent binding of GroES and ATP releases the protein into an enclosed cage for folding.

Hsp60 Proteins The Chaperonin - GroEL

Protein misfolding: the various states a protein can adopt.

Amyloid fibrils rich in b strands (even if wild type protein was helical) forms by a nucleation process, fibrils can be used to seed other fibrils generally composed of a single protein (sometimes a mutant protein and sometimes the wildtype sequence)

Amyloid fibrils implicated in several diseases Amyloid fibrils have been observed in patients with Alzheimers disease, type II diabetes, Creutzfeldt-Jakob disease (human form of Mad Cow’s disease), and many more …. In some cases it is not clear if the fibrils are the result of the disease or the cause. Fibrils can form dense plaques which physically disrupt tissue The formation of fibrils depletes the soluble concentration of the protein

Folding Diseases: Amyloid Formation

Misfolded proteins can be infectious (Mad Cow’s Disease, Prion proteins) PrPSc Active protein PrPC Stanely Prusiner: 1997 Nobel Prize in Medicine

Structure Prediction DEIVKMSPIIRFYSSGNAGLRTYIGDHKSCVMCTYWQNLLTYESGILLPQRSRTSR

Prediction Strategies Homology Modeling Proteins that share similar sequences share similar folds. Use known structures as the starting point for model building. Can not be used to predict structure of new folds. De Novo Structure Prediction Do not rely on global similarity with proteins of known structure Folds the protein from the unfolded state. Very difficult problem, search space is gigantic

De Novo Structure Prediction DEIVKMSPIIRFYSSGNAGLRTYIGDHKSCVMCTYWQNLLTYESGILLPQRSRTSR

Fragment-based Methods (Rosetta) Hypothesis, the PDB database contains all the possible conformations that a short region of a protein chain might adopt. How do we choose fragments that are most likely to correctly represent the query sequence?

Fragment-based Methods (Rosetta) Hypothesis, the PDB database contains all the possible conformations that a short region of a protein chain might adopt. How do we choose fragments that are most likely to correctly represent the query sequence?

Fragment Libraries A unique library of fragments is generated for each 9-residue window in the query sequence. Assume that the distributions of conformations in each window reflects conformations this segment would actually sample. Regions with very strong local preferences will not have a lot of diversity in the library. Regions with weak local preferences will have more diversity in the library.

Monte Carlo-based Fragment Assembly start with an elongated chain make a random fragment insertion accept moves which pass the metropolis criterian ( random number < exp(-DU/RT) ) to converge to low energy solutions decrease the temperature during the simulation (simulated annealing)

movie

Multiple Independent Simulations Any single search is rapidly quenched Carry out multiple independent simulations from multiple starting points.

Fragments are only going to optimize local interactions Fragments are only going to optimize local interactions. How do we favor non-local protein-like structures? An energy function for structure prediction should favor:

An energy function for structure prediction should favor: Fragments are only going to optimize local interactions. How do we favor non-local protein-like structures? An energy function for structure prediction should favor: Buried hydrophobics and solvent exposed polars Compact structures, but not overlapped atoms Favorable arrangement of secondary structures. Beta strand pairing, beta sheet twist, right handed beta-alpha-beta motifs, … Favorable electrostatics, hydrogen bonding For the early parts of the simulation we may want a smoother energy function that allows for better sampling.

Protein Design XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

Protein Design A rigorous test of our understanding of protein stability and folding Applications increase protein stability increase protein solubility enhance protein binding affinities alter protein-protein binding specificities (new tools to probe cell biology) build small molecule binding sites into proteins (biosensors, enzymes)

Central Problem: Identifying amino acids that are compatible with a target structure. To solve this problem we will need: A protocol for searching sequence space An energy function for ranking the fitness of a particular sequence for the target structure

Rosetta Energy Function 1) Lennard-Jones Potential (favors atoms close, but not too close) 2) implicit solvation model (penalizes buried polar atoms) 3) hydrogen bonding (allows buried polar atoms) 4) electrostatics (derived from the probability of two charged amino acids being near each other in the PDB) 5) PDB derived torsion potentials 6) Unfolded state energy (3) (2) (1) (5) (4)

Search Procedure – Scanning Through Sequence Space Monte Carlo optimization start with a random sequence make a single amino acid replacement or rotamer substitution accept change if it lowers the energy if it raises the energy accept at some small probability determined by a boltzmann factor repeat many times (~ 2 million for a 100 residue protein) 34

Search Procedure start with a random sequence 35

Search Procedure try a new Trp rotamer 36

Search Procedure Trp to Val 37

Search Procedure Leu to Arg 38

Search Procedure 39

Search Procedure final optimized sequence 40

Designing a Completely New Backbone draw a schematic of the protein Identify constraints that specify the fold (arrows) Assign a secondary structure type to each residue (s = strand, t = turn) Pick backbone fragments from the PDB that have the desired secondary structure Assemble 3-dimensional structure by combining fragments in a way that satisfies the constraints (Rosetta). t t s s s s s s s s

Target Structure

An Example of a Starting Structure

Design Model and Crystal Structure of Top7