Secondary Structure & Solvent accessible surface Calculation Lecture 6 Structural Bioinformatics Dr. Avraham Samson 81-871.


DSSP. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Wolfgang Kabsch, Christian Sander. Biopolymers, Volume 22, Issue 12, pages 2577–2637, December 1983. 2012 Avraham Samson - Faculty of Medicine - Bar Ilan University

Amino Acids Secondary Structure Solvent Accessibility

Hydrogen bond donors and acceptors. The amide nitrogen is the main-chain hydrogen bond donor; the carbonyl oxygen is the main-chain hydrogen bond acceptor. There are also side-chain acceptors and donors.

Hydrogen bonded turns

Hydrogen bonded bridges

Bend

Chirality

Dihedral angle calculation. The book "Crystal Structure Analysis for Chemists and Biologists" by Jenny P. Glusker gives four different ways of calculating the dihedral angle (p. …). Probably the most direct is the following. Consider the four-atom chain 1-2-3-4. The distance between any two atoms i and j is denoted d(ij); for example, d13 is the distance between atoms 1 and 3. Since you already have Cartesian coordinates, this is easily calculated as
d13 = SQRT( SQ(x3-x1) + SQ(y3-y1) + SQ(z3-z1) )
The dihedral angle is defined as follows:
cos(angle) = P/SQRT(Q)
where
P = SQ(d12) * ( SQ(d23)+SQ(d34)-SQ(d24)) + SQ(d23) * (-SQ(d23)+SQ(d34)+SQ(d24)) + SQ(d13) * ( SQ(d23)-SQ(d34)+SQ(d24)) - 2 * SQ(d23) * SQ(d14)
and
Q = (d12 + d23 + d13) * (d12 + d23 - d13) * (d12 - d23 + d13) * (-d12 + d23 + d13) * (d23 + d34 + d24) * (d23 + d34 - d24) * (d23 - d34 + d24) * (-d23 + d34 + d24)
A test case: d12 = 2.38, d23 = 1.48, d34 = 1.48, d13 = 3.56, d14 = 3.61, d24 = 2.40 gives P = 20.83, SQRT(Q) = 21.40, angle = 13.3 degrees.
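As a rough check of the distance formula above, here is a minimal Python sketch. The function names and the synthetic four-point geometry are illustrative assumptions, not part of the lecture. Note that the formula yields only the magnitude of the angle (the sign is lost), and that feeding in the rounded distances of the test case reproduces the quoted 13.3 degrees only approximately, because the result is sensitive to rounding when the angle is small.

```python
import math

def dihedral_from_distances(d12, d23, d34, d13, d14, d24):
    """Unsigned dihedral angle (degrees) of the chain 1-2-3-4 from its six distances."""
    sq = lambda d: d * d
    p = (sq(d12) * (sq(d23) + sq(d34) - sq(d24))
         + sq(d23) * (-sq(d23) + sq(d34) + sq(d24))
         + sq(d13) * (sq(d23) - sq(d34) + sq(d24))
         - 2.0 * sq(d23) * sq(d14))
    q = ((d12 + d23 + d13) * (d12 + d23 - d13) * (d12 - d23 + d13) * (-d12 + d23 + d13)
         * (d23 + d34 + d24) * (d23 + d34 - d24) * (d23 - d34 + d24) * (-d23 + d34 + d24))
    # Clamp to [-1, 1] so rounding noise cannot push acos out of its domain.
    cos_angle = max(-1.0, min(1.0, p / math.sqrt(q)))
    return math.degrees(math.acos(cos_angle))

def distance(a, b):
    """Euclidean distance between two Cartesian points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Synthetic check (hypothetical coordinates): four points built with a 60 degree torsion.
phi = math.radians(60.0)
a1, a2, a3 = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
a4 = (math.cos(phi), math.sin(phi), 1.0)
angle_synthetic = dihedral_from_distances(
    distance(a1, a2), distance(a2, a3), distance(a3, a4),
    distance(a1, a3), distance(a1, a4), distance(a2, a4))

# The slide's test case, using the rounded distances as printed.
angle_slide = dihedral_from_distances(2.38, 1.48, 1.48, 3.56, 3.61, 2.40)
print(angle_synthetic, angle_slide)
```

The sketch does not guard against collinear atoms, for which Q is zero and the dihedral angle is undefined.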

Helices

Ladders and sheets

More details: SS-bonds, chain breaks, handedness (chirality). PyMOL and MOLMOL use DSSP to assign secondary structure.

Because of the repetitive nature of secondary structures, and particularly beta-sheets, proteins can form fibrillar structures and aggregates. Figure: amyloid-like fibril (left) of the peptide GNNQNNY from the yeast prion protein Sup35, and its atomic structure (right), with the amide stacks and the fibril axis labelled. In the case of this fibril the side chains also hydrogen bond to each other. These types of fibrils are important in Huntington's disease and related disorders. Source: Nelson et al. (Eisenberg lab), Nature 435:773 (2005). For background on "polar zippers": Perutz et al., PNAS 91:5355 (1994).

Fibrillar helical structures: the leucine zipper. The GCN4 "leucine zipper" (green) is bound as a dimer (two copies of the polypeptide) to target DNA. The GCN4 dimer is formed through hydrophobic interactions between leucines (red) in the two polypeptide chains.

DSSP Code:
H = alpha helix
G = 3-helix (3/10 helix)
I = 5-helix (pi helix)
B = residue in isolated beta-bridge
E = extended strand, participates in beta ladder
T = hydrogen bonded turn
S = bend
Blank = loop
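The code table above can be kept as a simple lookup when post-processing a per-residue DSSP string. The sketch below is illustrative only: the names `DSSP_CODES` and `summarize` and the example string are assumptions, not DSSP output.

```python
from collections import Counter

# One-letter DSSP codes as listed on the slide; a blank means loop.
DSSP_CODES = {
    "H": "alpha helix",
    "G": "3-helix (3/10 helix)",
    "I": "5-helix (pi helix)",
    "B": "residue in isolated beta-bridge",
    "E": "extended strand, participates in beta ladder",
    "T": "hydrogen bonded turn",
    "S": "bend",
    " ": "loop",
}

def summarize(ss_string):
    """Count how many residues carry each DSSP code in a per-residue string."""
    counts = Counter(ss_string)
    return {DSSP_CODES[code]: n for code, n in counts.items()}

# Hypothetical per-residue assignment string, not taken from a real structure.
example = "  EEEE TT HHHHHH S"
summary = summarize(example)
print(summary)
```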

Question: How would you assign structural neighbors (< 5 Å) from a PDB file? Answer: Parse the PDB file for atoms separated by less than 5 Ångströms!
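One way to act on that answer, sketched in Python under stated assumptions: coordinates are read from the fixed columns of ATOM/HETATM records, every atom pair is compared by brute force, and two residues count as neighbors if any of their atoms are closer than the cutoff. The helper `pdb_line` and the three-atom toy input are hypothetical and exist only to make the example self-contained.

```python
import math

def parse_atoms(pdb_text):
    """Return (chain, residue number, residue name, atom name, xyz) per ATOM/HETATM record."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            atoms.append((
                line[21],                      # chain identifier (column 22)
                int(line[22:26]),              # residue sequence number (columns 23-26)
                line[17:20].strip(),           # residue name (columns 18-20)
                line[12:16].strip(),           # atom name (columns 13-16)
                (float(line[30:38]), float(line[38:46]), float(line[46:54])),
            ))
    return atoms

def neighbors(atoms, cutoff=5.0):
    """Residue pairs with at least one atom pair closer than cutoff (in Angstroms)."""
    pairs = set()
    for i, (ci, ri, _, _, pi) in enumerate(atoms):
        for cj, rj, _, _, pj in atoms[i + 1:]:
            if (ci, ri) != (cj, rj) and math.dist(pi, pj) < cutoff:
                pairs.add(((ci, ri), (cj, rj)))
    return pairs

def pdb_line(serial, name, resname, chain, resseq, x, y, z):
    """Hypothetical helper: format one ATOM record with the fixed PDB column layout."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")

# Toy input: three CA atoms; only residues 1 and 2 are within 5 Angstroms.
toy = "\n".join([
    pdb_line(1, "CA", "ALA", "A", 1, 0.0, 0.0, 0.0),
    pdb_line(2, "CA", "GLY", "A", 2, 3.8, 0.0, 0.0),
    pdb_line(3, "CA", "SER", "A", 3, 20.0, 0.0, 0.0),
])
pairs = neighbors(parse_atoms(toy))
print(pairs)
```

The all-against-all loop is quadratic in the number of atoms; for large structures a grid or tree search would be the usual replacement, but that is beyond this sketch.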

Contact maps of protein structures: 1avg, the structure of triabin. Rainbow ribbon diagram, blue to red: N to C. Map of Cα-Cα distances < 6 Å; both axes are the sequence of the protein. Near the diagonal: local contacts in the sequence. Off the diagonal: long-range (nonlocal) contacts.

Contact maps of protein structures: structure of N15 Cro. Rainbow ribbon diagram, blue to red: N to C. Map of Cα-Cα distances < 6 Å; both axes are the sequence of the protein.

Contact maps of protein structures: structure of N15 Cro. Rainbow ribbon diagram, blue to red: N to C. Map of all heavy-atom distances < 6 Å (includes side chains); both axes are the sequence of the protein.
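A contact map of this kind is a symmetric boolean matrix over residue pairs. The following is a minimal sketch, assuming one representative coordinate per residue (for example Cα) and the 6 Å threshold from the slides. The `min_separation` parameter used to separate local from long-range contacts, and the hairpin-like toy coordinates, are assumptions made for illustration.

```python
import math

def contact_map(coords, cutoff=6.0):
    """n x n boolean matrix: True where two residues are closer than cutoff (Angstroms)."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) < cutoff for j in range(n)]
            for i in range(n)]

def long_range_contacts(cmap, min_separation=4):
    """Off-diagonal contacts: pairs at least min_separation apart in sequence."""
    n = len(cmap)
    return [(i, j) for i in range(n)
            for j in range(i + min_separation, n) if cmap[i][j]]

# Hypothetical hairpin: residues 0-2 run along x, residues 3-5 run back at y = 5.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
          (7.6, 5.0, 0.0), (3.8, 5.0, 0.0), (0.0, 5.0, 0.0)]
cmap = contact_map(coords)
long_range = long_range_contacts(cmap)
print(long_range)
```

Contacts between sequence neighbors sit next to the diagonal, while the pairing of the two strands of the toy hairpin shows up away from it.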

Surface and interior of globular proteins: solvent accessible surface; molecular surface; residue fractional accessibility; pockets and cavities; "hydrophobic core"; ordered waters in protein structures.

"Accessible Surface" (Lee & Richards, 1971; Shrake & Rupley, 1973). Represent atoms as spheres with appropriate radii and eliminate the overlapping parts. Mathematically roll a sphere all around that surface; the sphere's center traces out a surface as it rolls.

Now look at a cross-section (slice) of a protein structure (figure from Lee & Richards, 1971). The inner surfaces here are van der Waals surfaces. The outer surface is the one traced out by the center of the sphere as it rolls around the van der Waals surface. If any part of the arc around a given atom is traced out, that atom is accessible to solvent. The solvent accessible surface of an atom is defined as the sum of the arcs traced around that atom. There is not much solvent accessible surface in the middle.
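The rolling-sphere definition can be approximated numerically in the spirit of Shrake & Rupley: place test points on each atom's expanded sphere (van der Waals radius plus probe radius) and keep the points that do not fall inside any other expanded sphere. The sketch below is not the published implementation. The 1.4 Å probe radius, the 1.7 Å atom radius, the number of test points, and the golden-spiral point layout are all assumptions chosen for illustration.

```python
import math

def sphere_points(n):
    """n roughly evenly spaced points on the unit sphere (golden-spiral layout)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    pts = []
    for k in range(n):
        z = 1.0 - (2.0 * k + 1.0) / n
        r = math.sqrt(1.0 - z * z)
        pts.append((r * math.cos(golden * k), r * math.sin(golden * k), z))
    return pts

def accessible_surface(atoms, probe=1.4, n_points=960):
    """atoms: list of ((x, y, z), vdw_radius). Returns per-atom accessible area in A^2."""
    unit = sphere_points(n_points)
    areas = []
    for i, (ci, ri) in enumerate(atoms):
        big_r = ri + probe  # radius of the surface traced by the probe center
        exposed = 0
        for ux, uy, uz in unit:
            p = (ci[0] + big_r * ux, ci[1] + big_r * uy, ci[2] + big_r * uz)
            # A test point is accessible if no other expanded sphere contains it.
            if all(math.dist(p, cj) >= rj + probe
                   for j, (cj, rj) in enumerate(atoms) if j != i):
                exposed += 1
        areas.append(4.0 * math.pi * big_r * big_r * exposed / n_points)
    return areas

# Hypothetical inputs: one isolated atom, then two overlapping atoms 3 A apart.
isolated = accessible_surface([((0.0, 0.0, 0.0), 1.7)])[0]
pair = accessible_surface([((0.0, 0.0, 0.0), 1.7), ((3.0, 0.0, 0.0), 1.7)])
print(isolated, pair)
```

An isolated atom keeps the full area of its expanded sphere; each atom of the overlapping pair loses the part of that sphere that lies inside its partner's expanded sphere.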

"Accessible surface" / "Molecular surface". Note: these are alternative ways of representing the same reality: the surface which is essentially in contact with solvent.

Molecular and accessible surfaces are both useful representations, but the molecular surface is more closely related to the actual atomic surfaces. This makes it somewhat better for visualizing the texture of the outer surface, as well as for assessing the shape and volume of any internal cavities. You will often hear the term Connolly surface, after Michael Connolly. A Connolly surface is a particular way of calculating the molecular surface. The accessible surface is also occasionally called the Richards surface, after Fred Richards.

Molecular surface of proteins. Figure: depiction of the heavy atoms (O, N, C, S) in a protein as van der Waals spheres, and depiction of the corresponding "molecular surface". The volume contained by this surface is the van der Waals volume plus the "interstitial volume" (the spaces in between).

The irregular surface of proteins: pockets and cavities. A pocket is an empty concavity on a protein surface which is accessible to solvent from the outside. A cavity or void in a protein is a pocket which has no opening to the outside; it is an interior empty space inside the protein. Pockets and cavities can be critical features of proteins in terms of their binding behavior, and identifying them is usually a first step in structure-based ligand design and similar tasks.

Fractional accessibility. Calculate the total solvent accessible surface of the protein structure (the solvent accessible surface can also be calculated for individual residues or side chains within the protein). The accessible surface area in a disordered or unfolded protein can be modelled using accessible surface area calculations on model tripeptides such as Ala-X-Ala or Gly-X-Gly. From these we can calculate what fraction of the surface is buried (inaccessible to solvent) by virtue of being within the folded, native structure of the protein. This is done by dividing the accessible surface area in the native protein structure by the accessible surface area in the modelled unfolded protein. That ratio is the fractional accessibility, and one minus the ratio is the fraction buried. The residue fractional accessibility and side chain fractional accessibility refer to the same quantity calculated for individual residues or side chains within the structure.
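The ratio described above is a one-line calculation. The sketch below only fixes the convention; the function name and the two area values are hypothetical placeholders, not measured or tabulated numbers.

```python
def fractional_accessibility(asa_native, asa_unfolded):
    """Accessible area in the native structure divided by the area in the unfolded model."""
    if asa_unfolded <= 0:
        raise ValueError("reference (unfolded) area must be positive")
    return asa_native / asa_unfolded

# Hypothetical residue: 30 A^2 exposed in the fold, 150 A^2 in a Gly-X-Gly style model.
frac = fractional_accessibility(30.0, 150.0)
buried = 1.0 - frac
print(frac, buried)
```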

Accessible surface area in globular protein structures. The accessible surface area $A_s$ in native states of proteins is a non-linear function of molecular weight (Miller, Janin, Lesk & Chothia, 1987): $A_s = 6.3\,M_r^{0.73}$, where $M_r$ is the molecular weight. This is an empirical correlation, but it comes close to the expected two-thirds power law relating surface area to volume or mass for a set of bodies of similar shape and density.

How much surface area is buried when a protein adopts its native structure in solution? Estimate the total accessible surface area $A_t$ of the extended/disordered polypeptide chain using the accessible surface areas in Gly-X-Gly or Ala-X-Ala models. This is a linear function of molecular weight: $A_t = 1.48\,M_r + 21$. The total fractional accessibility is $A_s/A_t$, and the fraction of surface area buried is $1 - A_s/A_t$. What is the total fractional surface area buried for a protein of molecular weight 10,000? 20,000? Is the fraction higher for small proteins or large?
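The question on this slide can be worked through directly from the two empirical relations. A small sketch follows; the function names are illustrative.

```python
def native_asa(mr):
    """Accessible surface area of the native state: A_s = 6.3 * M_r ** 0.73."""
    return 6.3 * mr ** 0.73

def unfolded_asa(mr):
    """Accessible surface area of the extended chain: A_t = 1.48 * M_r + 21."""
    return 1.48 * mr + 21.0

def fraction_buried(mr):
    """Fraction of surface area buried on folding: 1 - A_s / A_t."""
    return 1.0 - native_asa(mr) / unfolded_asa(mr)

for mr in (10_000, 20_000):
    print(mr, round(native_asa(mr)), round(unfolded_asa(mr)), round(fraction_buried(mr), 3))
```

With these relations the buried fraction comes out at about 0.65 for a molecular weight of 10,000 and about 0.71 for 20,000, so the larger protein buries the larger fraction of its surface.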

Distribution of residue fractional accessibilities (from Miller et al., 1987). Note that a sizeable group of residues are completely buried (hatched) or nearly completely buried. Note the broad distribution among non-buried residues, with a mean fractional accessibility of around 0.5. Note that few residues are completely exposed to solvent, but that a fractional accessibility > 1 is possible.

Buried residues in proteins. Table columns: size class (small, medium, large, XL, all); mean Mr; fraction of buried residues at 0% ASA and at 5% ASA. The fraction of buried residues (defined by 0% or 5% ASA cutoffs) increases as a function of molecular weight. For an average protein, around 25% of the residues will be buried. These form the core.
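Given per-residue fractional accessibilities, the buried fraction at either cutoff is a simple count. In the sketch below the input list is hypothetical, and treating a residue exactly at the cutoff as buried is an assumption made so that the same function serves the 0% cutoff.

```python
def fraction_buried_residues(fractional_asa, cutoff=0.05):
    """Share of residues whose fractional accessibility is at or below the cutoff."""
    buried = sum(1 for f in fractional_asa if f <= cutoff)
    return buried / len(fractional_asa)

# Hypothetical fractional accessibilities for eight residues.
residues = [0.0, 0.02, 0.0, 0.45, 0.80, 0.04, 0.60, 1.05]
at_zero = fraction_buried_residues(residues, cutoff=0.0)
at_five = fraction_buried_residues(residues, cutoff=0.05)
print(at_zero, at_five)
```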

Residue fractional accessibility correlates with free energies of transfer for amino acids between water and organic solvents (Miller, Janin, Lesk & Chothia, 1987; Fauchere & Pliska, 1983). The interior of a protein is akin to a nonpolar solvent in which the nonpolar side chains are buried. Polar side chains, on the other hand, are usually on the surface. However, some polar side chains do get buried, and it must also be remembered that the backbone of every residue is polar, including those with nonpolar side chains. So a lot of polar moieties do get buried in proteins.

The hydrophobic core of a small protein: N15 Cro. 0% ASA: Pro 3, Leu 6, Ala 16, Val 27, Ile 36, Ile 44. < 5% ASA: Met 1, Ala 17, Val 20, Gln 41, Ser of 66 ordered residues have less than 5% ASA. Note that some polar residues are buried.

The outer surface: water in protein structures. Structures of water-soluble proteins determined at reasonably high resolution will be decorated on their outer surfaces with water molecules (cyan balls) with relatively well-defined positions, and waters may also occur internally. Water is not just surrounding the protein; it is interacting with it.

Water interacts with protein surfaces. First shell waters: in contact with, or hydrogen bonded to, the protein. Second shell waters: only contact other waters. Most waters visible in crystal structures make hydrogen bonds to each other and/or to the protein, as donor, acceptor, or both.

DSSP Web Service

Amino Acids Secondary Structure Solvent Accessibility

STRIDE web service bin/stride/stridecgi.py

REM Detailed secondary structure assignment 1L4W
REM 1L4W
REM |---Residue---| |--Structure--| |-Phi-| |-Psi-| |-Area-| 1L4W
ASG ILE A 1 1 C Coil 1L4W
ASG VAL A 2 2 E Strand 1L4W
ASG CYS A 3 3 E Strand 1L4W
ASG HIS A 4 4 E Strand 1L4W
ASG THR A 5 5 E Strand 1L4W
ASG THR A 6 6 E Strand 1L4W
ASG ALA A 7 7 C Coil 1L4W
ASG THR A 8 8 T Turn 1L4W
ASG SER A 9 9 T Turn 1L4W
ASG PRO A T Turn 1L4W
ASG ILE A E Strand 1L4W
ASG SER A E Strand 1L4W
ASG ALA A E Strand 1L4W
ASG VAL A E Strand 1L4W
ASG THR A E Strand 1L4W
ASG CYS A C Coil 1L4W
ASG PRO A C Coil 1L4W
ASG PRO A T Turn 1L4W
ASG GLY A T Turn 1L4W
ASG GLU A T Turn 1L4W
ASG ASN A T Turn 1L4W
ASG LEU A E Strand 1L4W
ASG CYS A E Strand 1L4W
ASG TYR A E Strand 1L4W
ASG ARG A E Strand 1L4W
ASG LYS A E Strand 1L4W
ASG MET A E Strand 1L4W
ASG TRP A E Strand 1L4W
ASG CYS A E Strand 1L4W
ASG ASP A E Strand 1L4W
ASG ALA A B Bridge 1L4W
ASG PHE A T Turn 1L4W
ASG CYS A T Turn 1L4W
ASG SER A T Turn 1L4W
ASG SER A T Turn 1L4W
ASG ARG A C Coil 1L4W
ASG GLY A E Strand 1L4W
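A per-residue assignment listing like the one above is easy to turn into a secondary structure string. The sketch below is an illustration, not a STRIDE reader: it splits each ASG record on whitespace and assumes the field order residue name, chain, residue number, ordinal, one-letter code, as seen in the first nine rows. Records with missing fields are skipped.

```python
def parse_asg(lines):
    """Return (residue name, chain, residue number, one-letter code) per complete ASG record."""
    records = []
    for line in lines:
        fields = line.split()
        # A usable record needs at least: ASG, residue, chain, number, ordinal, code, state.
        if len(fields) >= 7 and fields[0] == "ASG":
            records.append((fields[1], fields[2], fields[3], fields[5]))
    return records

# The first nine ASG rows of the 1L4W listing, as they appear in this transcript.
sample = """\
ASG ILE A 1 1 C Coil 1L4W
ASG VAL A 2 2 E Strand 1L4W
ASG CYS A 3 3 E Strand 1L4W
ASG HIS A 4 4 E Strand 1L4W
ASG THR A 5 5 E Strand 1L4W
ASG THR A 6 6 E Strand 1L4W
ASG ALA A 7 7 C Coil 1L4W
ASG THR A 8 8 T Turn 1L4W
ASG SER A 9 9 T Turn 1L4W
""".splitlines()

records = parse_asg(sample)
ss_string = "".join(code for _, _, _, code in records)
print(ss_string)
```

For these nine rows the string reads CEEEEECTT: a coil residue, a five-residue strand, another coil residue, and the start of a turn.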

Structure Analysis. Assign secondary structure for amino acids from the 3D structure. Generate solvent accessible area for amino acids from the 3D structure. Most widely used tool: DSSP (Dictionary of Protein Secondary Structure: Pattern Recognition of Hydrogen-Bonded and Geometrical Features. Kabsch and Sander, 1983).

2D: Contact Map Prediction. Figure: a 3D structure and its 2D contact map, an n × n matrix indexed by residues i and j. Distance threshold = 8 Å. Cheng, Randall, Sweredoski, Baldi. Nucleic Acids Research, 2005.

3D Structure Prediction Tools: MULTICOM, I-TASSER, HHpred, Robetta, 3D-Jury, FFAS, Pcons, Sparks (sp3.html), FUGUE (cryst.bioc.cam.ac.uk/%7Efugue/prfsearch.html), FOLDpro, SAM, Phyre, 3D-PSSM, mGenThreader.