Multi-Scale Hierarchical Structure Prediction of Helical Transmembrane Proteins Zhong Chen and Ying Xu Department of Biochemistry and Molecular Biology.

Presentation transcript:

Multi-Scale Hierarchical Structure Prediction of Helical Transmembrane Proteins Zhong Chen and Ying Xu Department of Biochemistry and Molecular Biology and Institute of Bioinformatics University of Georgia

Outline
1. Background information
2. Statistical analysis of known membrane protein structures
3. Structure prediction at residue level
4. Helix packing at atomistic level
5. Linking predictions at residue and atomistic levels

Membrane Proteins
- Roles in biological processes: receptors; channels, gates and pumps; electric/chemical potential; energy transduction
- More than 50% of new drug targets are membrane proteins (MPs).
[Figure panels: Beta structure; Helical structure]

Membrane Proteins
- 20-30% of the genes in a genome encode MPs.
- Fewer than 1% of the structures in the Protein Data Bank (PDB) are MPs, reflecting the difficulties in experimental structure determination.

Membrane Proteins
- Prediction of transmembrane (TM) segments (α-helix or β-sheet) from sequence alone is very accurate (up to 95%).
- Open question: prediction of the tertiary structure of the TM segments. How do these α-helices/β-sheets arrange themselves within the constraints of the bi-lipid layer?
- Helical structures are relatively easier to solve computationally.

Membrane Protein Structures
- Difficult to solve experimentally
- Computational techniques could play a significant role in solving MP structures, particularly helical structures

High Level Plan: a multi-scale, hierarchical computational framework
- Statistical analysis of known structures: unveil the underlying principles of MP structure and stability; develop knowledge-based propensity scales and energy functions.
- Structure prediction at residue level
- Structure prediction at atomistic level: MC, MD

Part I: Statistical Analysis of Known Structures

Database for Known MP Structures: Helical Bundles
- Redundant database: 50 PDB files, 135 protein chains
- Non-redundant database (identity < 30%): 39 PDB files, 95 protein chains (avg. length ~220 AA)

Bi-lipid Layer Chemistry: polar head group (glycerol, phosphate); hydrophobic tail (fatty acid)

Statistics-based energy functions
- Length of bi-lipid layer: ~60 Å, with central regions and terminal regions
- Three energy terms: lipid-facing potential; residue-depth potential; inter-helical interaction potential
[Figure labels: Central; Terminal; 30 Å; 60 Å]

Lipid-facing Propensity Scale
[Table: columns Residue, Termini, Central; rows ILE, VAL, LEU, PHE, CYS, MET, ALA, GLY, THR, SER, TRP, TYR, PRO, HIS, ASP, GLU, ASN, GLN, LYS, ARG]
LF_scale(AA) = (fraction of AA that are lipid-facing) / (fraction of AA that are in the interior)
- The most hydrophobic residues (ILE, VAL, LEU) prefer the surface of MPs in the central region, but prefer interior positions in the terminal regions.
- Small residues (GLY, ALA, CYS, THR) tend to be buried in the helix bundle.
- Bulky residues (LYS, ARG, TRP, HIS) are likely to be found on the surface.
- This propensity scale reflects both hydrophobic interactions and helix packing.
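The ratio defining LF_scale can be sketched in a few lines. This is only an illustration: it assumes "fraction" means the share of a residue type among all lipid-facing (or all interior) positions, which the slide does not spell out, and the function name `lf_scale` and the toy residue lists are hypothetical, not data from the talk.

```python
from collections import Counter

def lf_scale(lipid_facing, interior):
    """Return {residue: fraction among lipid-facing / fraction among interior}.

    Assumption: each argument is a list of residue names observed at
    lipid-facing or interior positions in a set of known structures.
    """
    lf_counts = Counter(lipid_facing)
    in_counts = Counter(interior)
    lf_total = sum(lf_counts.values())
    in_total = sum(in_counts.values())
    scale = {}
    for aa in sorted(set(lf_counts) | set(in_counts)):
        frac_lf = lf_counts[aa] / lf_total
        frac_in = in_counts[aa] / in_total
        # The ratio is undefined for residues never observed in the interior.
        if frac_in > 0:
            scale[aa] = frac_lf / frac_in
    return scale

# Toy observations (hypothetical, for illustration only).
toy_lipid_facing = ["LEU", "LEU", "ILE", "GLY"]
toy_interior = ["GLY", "GLY", "LEU", "ALA"]
toy_scale = lf_scale(toy_lipid_facing, toy_interior)
```

Values above 1 indicate a preference for lipid-facing positions and values below 1 a preference for the interior. Computing the scale separately for the central and terminal regions would give the two columns of the table.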

Helical Wheel and Moment Analysis
Lipid-facing vector prediction, state of the art (average error):
- kPROT: ~41º
- Samatey scale: 61º
- Hydrophobicity scales: 65-68º
[Figure: helical wheel; axes X (Angstrom), Y (Angstrom)] Average prediction error: 41 degrees
The magnitude of each thin vector is proportional to the LF propensity, and the overall lipid-facing vector is the sum of all thin vectors.
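The vector sum described on this slide can be sketched as follows. The sketch assumes an ideal α-helix with 100º between successive residues on the wheel (3.6 residues per turn); the function names and the propensity dictionary passed in are hypothetical, and the talk's actual implementation may differ.

```python
import math

def lipid_facing_vector(sequence, propensity, step_deg=100.0):
    """Sum per-residue "thin vectors" on a helical wheel.

    Residue i is placed at angle i * step_deg; the length of its vector
    is its lipid-facing propensity. Returns (angle in degrees, magnitude)
    of the summed vector.
    """
    x = y = 0.0
    for i, aa in enumerate(sequence):
        theta = math.radians(i * step_deg)
        weight = propensity.get(aa, 0.0)  # unknown residues contribute nothing
        x += weight * math.cos(theta)
        y += weight * math.sin(theta)
    return math.degrees(math.atan2(y, x)) % 360.0, math.hypot(x, y)

def angular_error(predicted_deg, observed_deg):
    """Smallest absolute difference between two directions, in degrees."""
    diff = abs(predicted_deg - observed_deg) % 360.0
    return min(diff, 360.0 - diff)
```

The prediction errors quoted on the slide would correspond to `angular_error` between the predicted direction and the lipid-facing direction observed in the known structure, averaged over helices.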

Residue-Depth Potential
- hydrophobic residues tend to be located in the hydrocarbon core;
- hydrophilic residues tend to be closer to the terminal regions;
- aromatic residues prefer the interface region.

TM Helix Tilt Angle Prediction: major pVIII coat protein of the filamentous fd bacteriophage (1MZT) [Figure label: 23º]

Inter-Helical Pair-wise Potential [Figure; axis unit: Å]

Statistical energy potentials (summary)
1. Three residue-based statistical potentials were derived from the database: (a) lipid-facing propensity, (b) residue-depth potential, (c) inter-helical pair-wise potential.
2. The lipid-facing scale predicted the lipid-facing direction for a single helix with an uncertainty of about ±40º.
3. The residue-depth potential was able to predict the tilt angle for a single helix with high accuracy.
4. More data are needed to make the inter-helical pair-wise potential more reliable.

Part II: Structure Prediction at Residue Level

Key Prediction Steps
- Structure prediction through optimizing our statistical potential (weighted sum);
- Idealized and rigid helical backbone configurations;
- Monte Carlo moves: translations, rotations, rotation about the helix axis;
- Wang-Landau sampling technique for MC simulation;
- Principal component analysis.
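Two of the rigid-body moves listed above, translation and rotation about a helix axis, can be sketched with plain coordinate tuples. This is a minimal illustration using Rodrigues' rotation formula; the function names are hypothetical, and how the talk's code parameterizes or selects moves is not stated in the slides.

```python
import math

def translate(points, delta):
    """Shift every point by the same displacement (rigid translation)."""
    return [tuple(p[i] + delta[i] for i in range(3)) for p in points]

def rotate_about_axis(points, origin, axis, angle):
    """Rotate points by `angle` (radians) about the line through `origin`
    along `axis`, using Rodrigues' rotation formula."""
    norm = math.sqrt(sum(a * a for a in axis))
    kx, ky, kz = (a / norm for a in axis)
    c, s = math.cos(angle), math.sin(angle)
    rotated = []
    for p in points:
        vx, vy, vz = (p[i] - origin[i] for i in range(3))
        dot = kx * vx + ky * vy + kz * vz
        # Cross product k x v
        cx = ky * vz - kz * vy
        cy = kz * vx - kx * vz
        cz = kx * vy - ky * vx
        rotated.append((
            origin[0] + vx * c + cx * s + kx * dot * (1.0 - c),
            origin[1] + vy * c + cy * s + ky * dot * (1.0 - c),
            origin[2] + vz * c + cz * s + kz * dot * (1.0 - c),
        ))
    return rotated
```

Because the helices are treated as rigid, each move applies the same transformation to every atom (or residue center) of one helix, so intra-helix distances are unchanged.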

Wang-Landau Method for MC
- The density of states g(E) is not known a priori.
- Observation: if a random walk in energy space is performed with probability proportional to the reciprocal of the density of states, 1/g(E), a flat energy histogram is obtained.
- In Wang-Landau, g(E) is initially set to 1 and modified "on the fly".
- A Monte Carlo move from energy E1 to E2 is accepted with probability p(E1 → E2) = min(1, g(E1)/g(E2)).
- Each time an energy level E is visited, its density of states is updated by a modification factor f > 1, i.e., g(E) → g(E) · f.
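The Wang-Landau scheme can be illustrated on a toy system whose density of states is known exactly. The sketch below uses the standard formulation (acceptance with probability min(1, g(E_old)/g(E_new)), update g(E) → g(E)·f, and f reduced once the histogram is flat), working with ln g for numerical stability. The toy system (independent two-state spins with E = number of "up" spins), the flatness criterion, and all parameter values are assumptions chosen for illustration, not the settings used in the talk.

```python
import math
import random

def wang_landau(n_spins=4, ln_f_final=1e-4, flatness=0.8, seed=0):
    """Estimate g(E) for n_spins independent two-state spins,
    where E is the number of spins in the "up" state.
    Returns g(E) normalized so that g(0) = 1."""
    rng = random.Random(seed)
    spins = [0] * n_spins
    energy = 0
    ln_g = [0.0] * (n_spins + 1)  # ln g(E) = 0 means g(E) = 1 initially
    ln_f = 1.0                    # ln of the modification factor f > 1
    while ln_f > ln_f_final:
        hist = [0] * (n_spins + 1)
        while True:
            for _ in range(1000):
                i = rng.randrange(n_spins)
                new_energy = energy + (1 - 2 * spins[i])  # effect of flipping spin i
                # Accept with probability min(1, g(E_old) / g(E_new)).
                ratio = math.exp(min(0.0, ln_g[energy] - ln_g[new_energy]))
                if rng.random() < ratio:
                    spins[i] ^= 1
                    energy = new_energy
                ln_g[energy] += ln_f  # g(E) -> g(E) * f at the visited level
                hist[energy] += 1
            # Histogram counts as flat when every level is close to the mean.
            if min(hist) >= flatness * sum(hist) / len(hist):
                break
        ln_f /= 2.0  # refine the modification factor
    return [math.exp(v - ln_g[0]) for v in ln_g]

estimated_g = wang_landau()
```

For four spins the exact density of states is the binomial sequence 1, 4, 6, 4, 1, so the estimate should approach those values as the modification factor shrinks.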

Wang-Landau Method for MC
Advantages:
1. Simple formulation and general applicability;
2. Entropy and free energy information derivable from g(E);
3. Each energy state is visited with equal probability, so energy barriers are overcome with relative ease.

Principal Component Analysis
Purpose:
- analyze the conformational variations during a simulation, and
- identify the most important conformational degrees of freedom.
Covariance matrix: C_ij = <(x_i - <x_i>)(x_j - <x_j>)>, where x_i are the coordinates and <...> denotes the average over sampled conformations.
* A large part of the system's fluctuations can be described in terms of only a few PCA eigenvectors.
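A minimal PCA over sampled conformations might look like the sketch below. It assumes each conformation has been flattened into one row of coordinates (for example, the rigid-body parameters or atom positions of the helices) and uses NumPy's symmetric eigensolver; the function name `pca` and the data layout are illustrative assumptions.

```python
import numpy as np

def pca(frames):
    """Principal component analysis of sampled conformations.

    frames: array-like of shape (n_frames, n_coords), one conformation per row.
    Returns (eigenvalues, eigenvectors, projections), sorted by decreasing
    variance; eigenvectors are the columns of the second array.
    """
    X = np.asarray(frames, dtype=float)
    centered = X - X.mean(axis=0)
    # Covariance matrix C_ij = <(x_i - <x_i>)(x_j - <x_j>)>
    cov = centered.T @ centered / X.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: symmetric matrices
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return eigvals, eigvecs, centered @ eigvecs
```

The fraction `eigvals[0] / eigvals.sum()` indicates how much of the total fluctuation the first principal component (PC1) captures, and the first column of the projections gives each conformation's coordinate along PC1.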

A Model System: Glycophorin (GpA) Dimer
- 22 residues, 189 atoms
- Sequence: EITLIIFGVMAGVIGTILLISY
- GxxxG motif
- Ridges-into-grooves packing

Glycophorin (GpA) Dimer (1AFO)
- A: GEM (global energy minimum): RMSD = 3.6 Å, E = -114.6 kcal/mol
- B: LEM (local energy minimum): RMSD = 0.8 Å, E = -93.9 kcal/mol
RED: experiment; GREY: simulation

Helices A and B of Bacteriorhodopsin (1QHJ)
- A: GEM: RMSD = 2.7 Å, E = -94 kcal/mol
- B: LEM: RMSD = 0.9 Å, E = -86 kcal/mol
RED: experiment; GREY: simulation

Bacteriorhodopsin (1QHJ): RMSD = 5.0 Å
[Figure: helices A-G; panels: Experimental structure; Computational prediction]

Residue-level structure prediction (Summary)
1. A computational scheme was established for TM helix structure prediction at residue level;
2. For two-helix systems, LEM structures very close to native structures (RMSD < 1.0 Å) were consistently predicted;
3. For a seven-helix bundle, a packing topology within 5.0 Å of the crystal structure was identified as one of the LEMs.
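The RMSD values quoted in this part compare predicted and experimental structures. The slides do not say which atoms were compared or how the structures were superposed, so the sketch below is only one common choice: optimal rigid superposition with the Kabsch algorithm followed by the root-mean-square deviation over matched points. The function name `superposed_rmsd` is hypothetical.

```python
import numpy as np

def superposed_rmsd(P, Q):
    """RMSD between matched point sets after optimal rigid superposition.

    P, Q: array-like of shape (n_points, 3); row i of P corresponds to
    row i of Q. Uses the Kabsch algorithm to find the best rotation.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    Pc = P - P.mean(axis=0)  # remove translation by centering both sets
    Qc = Q - Q.mean(axis=0)
    H = Pc.T @ Qc            # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    diff = Pc @ R.T - Qc     # rotate P onto Q and compare
    return float(np.sqrt((diff ** 2).sum() / len(P)))
```

A structure that differs from the reference only by a rotation and translation gives an RMSD of zero, so the measure reflects differences in helix packing rather than in overall placement.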

Part III: Structure Prediction at Atomistic Level

Key Prediction Steps
- Structure prediction through optimizing an atom-level energy potential:
  - CHARMM19 force field for helix-helix interaction
  - Knowledge-based energy function for lipid-helix interaction
- Idealized, rigid helix backbone with flexible side chains;
- Helix orientation constraint applied (i.e., N-terminus inside/outside the cell);
- MC moves: translations, rotations, rotation about the helix axis, and side-chain torsional rotation;
- Wang-Landau algorithm for MC simulation

CHARMM19 Polar Hydrogen Force Field
- nonpolar hydrogen atoms are combined with the heavy atoms they are bound to;
- polar hydrogen atoms are modeled explicitly.

2D Wang-Landau Sampling in PC1 and E Spaces [Figure labels: LEM1; LEM2]

Effect of Helix-Lipid Interactions: Helices A&B of Bacteriorhodopsin
[Figure panels: Helix-helix interactions; Helix-helix & helix-lipid interactions]
Helix-lipid interactions play a critical role in the correct packing of helices.

Effect of Helix-Lipid Interactions: Helices A&B of Bacteriorhodopsin (BR)
[Figure: four LEM structures, RMSD = 4.4 Å, 0.2 Å, 5.7 Å, 7.1 Å; 30 Å hydrocarbon core region]
- All four LEM structures share essentially the same contact surfaces.
- In the native structure, the polar N-termini of both helices are located outside the hydrocarbon core region, resulting in low helix-lipid energy.

Docking of a Seven-helix Bundle: Bacteriorhodopsin (1QHJ)
- 7 helices, 174 residues, 1619 atoms
- CHARMM19 + lipid-helix potential
- One month of CPU time on one PC
[Figure panels A, B: Initial configuration; Crystal structure]

Potential Energy Landscape [Figure: structures labeled RMSD = 3.0 Å, 4.7 Å, 6.6 Å, 8.0 Å, 8.4 Å]

Global Energy Minimum Structure (RMSD = 3.0 Å). RED: experiment; GREY: simulation

Atom-level Structure Prediction (Summary)
1. The Wang-Landau algorithm proved to be effective for the energetics study of TM helix packing;
2. Prediction results for two-helix and seven-helix structures are highly promising;
3. Practical application of the Wang-Landau method to large systems requires further work.

Part IV: Linking Predictions at Residue- and Atomistic levels

Correspondence between simulations at two levels
A multi-scale hierarchical modeling approach is feasible and practical:
- LEMs identified at the residue level can be used as candidates for atomistic simulation;
- PC vectors from the residue-level simulation can be used to improve search speed in the atomistic simulation.

Future Work
1. Further improvement of the residue-based folding potentials;
2. Speed-up and parallelization of Wang-Landau sampling;
3. Construction of a hierarchical computational framework and development of a corresponding software package.

Acknowledgements
1. Funding from NSF/DBI, NSF/ITR, NIH, and Georgia Cancer Coalition
2. Dr. David Landau (Wang-Landau algorithm) and Dr. Jim Prestegard (NMR data generation) of UGA
3. Thanks to DIMACS for the invitation to speak here