Structural Biology: What does 3D tell us? Stephen J Everse University of Vermont.

Presentation transcript:

Structural Biology: What does 3D tell us? Stephen J Everse University of Vermont

The life of a bio-chemist!! Training: PhD & postdoc with Russell F. Doolittle, UCSD (structure of fragment D of fibrinogen; structures of double-D of fibrin); joined the faculty at UVM in 1998. Structural biologist (crystallographer). Current projects: factor Va; thioredoxin reductase; transferrin.

Everse Group Maria Cristina Bravo Brian Eckenroth, Ph.D.

Fundamental Questions: How do protein cofactors modulate enzymes? What determines and mediates protein-protein and protein-membrane interactions? How is a protein’s function defined by structure? How does structure prescribe the binding affinity of a metal?

Coagulation Cascade [Figure: Intrinsic Pathway (Contact Activation Pathway): Factor XII, prekallikrein, HMW kininogen, “surface”; Factor XIa, HMW kininogen, membrane, Ca2+, Zn2+. Intrinsic Tenase: Factor IXa, Factor VIIIa, membrane, Ca2+. Extrinsic Pathway (Extrinsic Tenase): Factor VIIa, tissue factor, membrane, Ca2+. Prothrombinase: Factor Xa, Factor Va, membrane, Ca2+. Conversions shown: XI to XIa, IX to IXa, X to Xa, II to IIa (“thrombin”).]

“Prothrombinase” [Figure: FXa with FVa heavy chain (HC), FVa light chain (LC) and Ca2+, converting prothrombin to α-thrombin.] [Table: Prothrombinase Components vs. Relative Rate of Prothrombin Activation; FXa alone = 1.]

Bovine Factor Vai [Figure: domains A1, A3, C1, C2 with bound Cu2+ and Ca2+] Funded by: NIH, American Society of Hematology

Prothrombinase (Va + Xa): hypothetical model [Figure labels: domains A1, A2, A3, C1, C2]

Thioredoxin reductase DmTR Eckenroth et al. Biochemistry 2007

Outline Determining a 3D structure –X-ray crystallography Structural elements Modeling a 3D structure

Protein Structures. Primary: amino acid sequence. Secondary: alpha helices & beta sheets, loops. Tertiary: arrangement of secondary elements in 3D space. Quaternary: packing of several polypeptide chains. Given an amino acid sequence, we are interested in its secondary structures and how they are arranged in higher structures.

Secondary Structure: α Helix. First predicted by Linus Pauling. Modeled on the basis of X-ray data, which provided accurate geometries, bond lengths, and angles. Modeled before Kendrew’s structure. 3.6 residues/turn, 5.4 Å/turn. The main chain forms a central cylinder with R-groups projecting out. Variable lengths: from 4 to 40+ residues, with an average helix length of 10 residues (3 turns).

Secondary Structure: The β Sheet. Unlike the α helix, the β sheet is composed of secondary structure elements (β strands) that can be distant in sequence. The β strands are located next to each other. Hydrogen bonds can form between C=O groups of one strand and NH groups of an adjacent strand. Two different orientations: all strands run in the same direction (“parallel”), or strands run in alternating orientation (“antiparallel”).

β-Turns. Type I: also referred to as a β turn; H-bond between the acyl O of AA1 and the NH of AA4. Type II: glycine must occupy the AA3 position due to steric effects. Type III is equivalent to a 3₁₀ helix. Types I & III constitute some 70% of all β turns. Proline is typically found in the second position, and most β turns have Asp, Asn, or Gly at the third position.

Other Secondary Structural Elements: Random coil. Loop. γ-turn: defined for 3 residues i, i+1, i+2 if a hydrogen bond exists between residues i and i+2 and the phi and psi angles of residue i+1 fall within 40 degrees of one of the following 2 classes (turn type, phi(i+1), psi(i+1)): classic, inverse. Disordered structure.

Viewing Structures: Cα or CA; Ball-and-stick; CPK

Ribbon and Topology Diagrams: Representations of Secondary Structures [Figure labels: α-helix, β-strand, N and C termini]

Tools for Viewing Structures: Jmol; PyMOL; Swiss PDB viewer; Mage/KiNG; Rasmol

RCSB

GRASP Graphical Representation and Analysis of Structural Properties Red = negative surface charge Blue = positive surface charge

ConSurf: The ConSurf server enables the identification of functionally important regions on the surface of a protein or domain of known three-dimensional (3D) structure, based on the phylogenetic relations between its close sequence homologues. A multiple sequence alignment (MSA) is used to build a phylogenetic tree consistent with the MSA, and conservation scores are calculated with either an empirical Bayesian or the Maximum Likelihood method.

Movies

Proteopedia

Higher Level Structures: Motifs & Domains. A motif is a simple combination of a few secondary structures that appears in several different proteins in nature. A collection of motifs forms a domain. A domain is a more complex combination of secondary structures. It often has a specific function (for example, it may contain an active site). A protein may contain more than one domain.

Super-secondary Structures or Motifs

Domains: "Within a single subunit [polypeptide chain], contiguous portions of the polypeptide chain frequently fold into compact, local semi-independent units called domains." - Richardson, 1981. Domains can be built from structural motifs and are: independently folding elements; functional units; separable by proteases. Typically, globular proteins are organized into one or more domains. [Figure: EGF domain from P-selectin]

Evolutionarily Conserved Domains Often certain structural themes (domains) repeat themselves, but not always in proteins that have similar biological functions. This phenomenon of repeating structures is consistent with the notion that the proteins are genetically related, and that they arose from one another or from a common ancestor. In looking at the amino acid sequences, sometimes there are obvious homologies, and you could predict that the 3-D structures would be similar. But sometimes virtually identical 3-D structures have no sequence similarities at all!

Rates of Change Not all proteins change at the same rate; Why? Functional pressures –Surface residues are observed to change most frequently; –Interior less frequently;

Sequence → Structure → Function. Many sequences can give the same structure: side chain pattern is more important than sequence. When homology is high (>50%), proteins are likely to have the same structure and function (Structural Genomics): cores conserved; surfaces and loops more variable. *3-D shape more conserved than sequence* *There are a limited number of structural frameworks* W. Chazin © 2003

Degree of Evolutionary Conservation [Figure: DNA seq, Protein seq, Structure, Function, ordered from less conserved (information poor) to more conserved (information rich). Example DNA seq: ACAGTTACAC CGGCTATGTA CTATACTTTG. Example protein seq: HDSFKLPVMS KFDWEMFKPC GKFLDSGKLG.] S. Lovell © 2002

How is a 3D structure determined? 1. Experimental methods (best approach): X-ray crystallography - needs a stable fold and good quality crystals. NMR - needs a stable fold; not suitable for large molecules. 2. In-silico methods (partial solutions, based on similarity): Sequence or profile alignment - uses similar sequences, limited use of 3D information. Threading - needs a 3D structure, combinatorial complexity. Ab-initio structure prediction - not always successful.

Experimental Determination of Atomic Resolution Structures. X-ray: X-rays, diffraction pattern; direct detection of atom positions; crystals. NMR: RF, resonance, H0; indirect detection of H-H distances; in solution.

Resolving Power: the ability to see two points that are separated by a given distance as distinct. Resolution of two points separated by a distance d requires radiation with a wavelength on the order of d or shorter. [Figure: signal vs. position for two points separated by d, with the wavelength marked] Mark Rould © 2007

X-ray Microscopes? Lenses require a difference in refractive index between the air and lens material in order to 'bend' and redirect light (or any other form of electromagnetic radiation). The refractive index for X-rays is almost exactly 1.00 for all materials. ∴ There are no lenses for X-rays. [Figure labels: n air, n glass, n air] Mark Rould © 2007

Light Scattering and Lenses are Described by Fourier Transforms. Scattering = Fourier Transform of specimen. Lens applies a second Fourier Transform to the scattered rays to give the image. Since X-rays cannot be focused by lenses, and the refractive index of X-rays in all materials is very close to 1.0, how do we get an atomic image? Mark Rould © 2007

X-ray Diffraction with “The Fourier Duck” Images by Kevin Cowtan The molecule The diffraction pattern

Animal Magic Images by Kevin Cowtan The CAT (molecule) The diffraction pattern

Solution: Measure Scattered Rays, Use Fourier Transform to Mimic Lens Transforms [Figure: X-ray, detector, computer] Mark Rould © 2007
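The idea on this slide can be sketched in a few lines of code. The following is a minimal one-dimensional illustration, not the speaker's method: a toy "density" is transformed into complex scattered amplitudes, and a second (inverse) transform performed by the computer stands in for the lens. The function names and the 8-point toy density are hypothetical, and the sketch assumes that both the amplitude and the phase of each scattered ray are available to the computer.

```python
import cmath


def fourier_transform(density):
    """Forward transform: the scattered amplitude F(h) for each index h."""
    n = len(density)
    return [
        sum(density[x] * cmath.exp(2j * cmath.pi * h * x / n) for x in range(n))
        for h in range(n)
    ]


def inverse_fourier_transform(amplitudes):
    """Inverse transform: the computer plays the role of the lens."""
    n = len(amplitudes)
    return [
        (sum(amplitudes[h] * cmath.exp(-2j * cmath.pi * h * x / n)
             for h in range(n)) / n).real
        for x in range(n)
    ]


# Hypothetical toy "molecule": two scatterers on an 8-point grid.
toy_density = [0.0, 0.0, 1.0, 0.0, 0.0, 3.0, 0.0, 0.0]
scattered = fourier_transform(toy_density)
image = inverse_fourier_transform(scattered)
```

Here `image` reproduces `toy_density` up to floating-point rounding, which is the sense in which the second transform "mimics" a lens.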

A Problem… A single molecule is a very weak scatterer of X-rays. Most of the X-rays will pass through the molecule without being diffracted. Those rays which are diffracted are too weak to be detected. Solution: Analyze diffraction from crystals instead of single molecules. A crystal is made of a three-dimensional repeat of ordered molecules (~10^14) whose signals reinforce each other. The resulting diffracted rays are strong enough to be detected. Sylvie Doublié © 2000. A Crystal: 3D repeating lattice; the unit cell is the smallest unit of the lattice; crystals come in all shapes and sizes. Crystals come from slowly precipitating the biological molecule out of solution under conditions that will not damage or denature it (sometimes).

Putting it all together: X-ray diffraction [Figure: X-rays, object, scattered rays, detector, computer, crystallographer, electron density map, model] A diffraction pattern is a collection of diffraction spots (reflections). [Figure: Rubisco diffraction pattern] Sylvie Doublié © 2000

What information does structure give you? A 3-D view of macromolecules at near atomic resolution. The result of a successful structural project is a “structure” or model of the macromolecule in the crystal. You can assign: - secondary structure elements - position and conformation of side chains - position of ligands, inhibitors, metals etc. A model allows you to: - understand biochemical and genetic data (i.e., the structural basis of functional changes in a mutant or modified macromolecule) - generate hypotheses regarding the roles of particular residues or domains. Sylvie Doublié © 2000

What did I just say????!!! A structure is a “MODEL”!! What does that mean? –It is someone’s interpretation of the primary data!!!

So what happens when we can’t get an NMR or X-ray structure?

2° & 3° Structure Prediction

Secondary (2°) Structure Table

Secondary Structure Prediction One of the first fields to emerge in bioinformatics (~1967) Grew from a simple observation that certain amino acids or combinations of amino acids seemed to prefer to be in certain secondary structures Subject of hundreds of papers and dozens of books, many methods…

Simplified C-F Algorithm: Select a window of 7 residues. Calculate the average Pα over this window and assign that value to the central residue. Repeat the calculation for Pβ and Pc. Slide the window down one residue and repeat until the sequence is complete. Analyze the resulting “plot” and assign secondary structure (H, B, C) for each residue according to the highest value.
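The windowing steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the published Chou-Fasman procedure: the propensity numbers in `TOY_PROPENSITY` are made-up placeholders (not the real Chou-Fasman parameters), the function name is hypothetical, and windows are simply truncated at the ends of the sequence, a choice the slide does not specify.

```python
# Placeholder propensities (P_alpha, P_beta, P_coil) for a few residues.
# ASSUMPTION: illustrative values only, NOT the published parameters.
TOY_PROPENSITY = {
    "A": (1.4, 0.8, 0.7),
    "E": (1.5, 0.4, 0.8),
    "V": (1.0, 1.7, 0.6),
    "I": (1.0, 1.6, 0.6),
    "G": (0.6, 0.7, 1.6),
    "P": (0.6, 0.5, 1.5),
}


def predict_secondary_structure(sequence, table, window=7):
    """Assign H, B or C to each residue from window-averaged propensities."""
    half = window // 2
    states = "HBC"  # helix, beta strand, coil
    prediction = []
    for i in range(len(sequence)):
        # Window centred on residue i, truncated at the sequence ends.
        start = max(0, i - half)
        end = min(len(sequence), i + half + 1)
        residues = sequence[start:end]
        averages = [
            sum(table[aa][k] for aa in residues) / len(residues)
            for k in range(3)
        ]
        # The state with the highest averaged propensity wins.
        prediction.append(states[averages.index(max(averages))])
    return "".join(prediction)


example = predict_secondary_structure("AEAEAEAEVIVIVIVIGPGPGPGP", TOY_PROPENSITY)
```

With a real propensity table passed as `table`, the same loop applies unchanged; only the numbers differ.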

Limitations of Chou-Fasman: Does not take into account long range information (>3 residues away). Does not take into account sequence content or probable structure class. Assumes simple additive probability (not true in nature). Does not include related sequences or alignments in the prediction process. Only about 55% accurate (on good days).

Protein Principles Proteins reflect millions of years of evolution. Most proteins belong to large evolutionary families. 3D structure is better conserved than sequence during evolution. Similarities between sequences or between structures may reveal information about shared biological functions of a protein family.

The PHD Algorithm: Search the SWISS-PROT database and select high scoring homologues. Create a sequence “profile” from the resulting multiple alignment. Include global sequence info in the profile. Input the profile into a trained two-layer neural network to predict the structure and to “clean up” the prediction.
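The “profile” step can be illustrated with a small sketch. This assumes the simplest possible profile, per-column residue frequencies with gaps ignored; the actual encoding used by PHD is not described on the slide, and the function name and the three-sequence alignment are hypothetical.

```python
from collections import Counter


def sequence_profile(alignment):
    """Per-column residue frequencies for equal-length aligned sequences."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for column in zip(*alignment):
        # Gap characters ('-') do not contribute to the frequencies.
        residues = [aa for aa in column if aa != "-"]
        counts = Counter(residues)
        total = len(residues)
        profile.append({aa: n / total for aa, n in counts.items()} if total else {})
    return profile


# Hypothetical alignment of a query with two homologues.
toy_alignment = ["ACD", "ACE", "A-E"]
toy_profile = sequence_profile(toy_alignment)
```

Each entry of `toy_profile` describes one alignment column; a vector of this kind per residue, rather than a single amino acid letter, is what would be fed to a predictor.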

Prediction Performance

Best of the Best: PredictProtein-PHD (72%); Jpred (73-75%) – jpred/index.html; SAM-T08 (75%) – query.html; PSIpred (77%)

Structure Prediction: Threading. A protein fold recognition technique that involves incrementally replacing the sequence of a known protein structure with a query sequence of unknown structure. Why threading? Secondary structure is more conserved than primary structure. Tertiary structure is more conserved than secondary structure.

An Approach: SAS Calculations. DSSP - Database of Secondary Structures for Proteins; VADAR - Volume Area Dihedral Angle Reporter; GetArea; Naccess - Atomic Solvent Accessible Area Calculations

3D Threading Servers: Generate 3D models or coordinates of possible models based on input sequence. PredictProtein-PHDacc; PredAcc – bin/portal.py?form=PredAcc; Loopp (version 2); Phyre; SwissModel. All require email addresses since the process may take hours to complete.

Ab Initio Folding Two Central Problems –Sampling conformational space ( ) –The energy minimum problem The Sampling Problem (Solutions) –Lattice models, off-lattice models, simplified chain methods, parallelism The Energy Problem (Solutions) –Threading energies, packing assessment, topology assessment

Lattice Folding
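To give a feel for why sampling is a problem even on a lattice, here is a minimal sketch that counts the conformations of a short chain. It assumes the simplest case, a chain on a 2D square lattice where no two residues may occupy the same site; the function name is hypothetical, and symmetry-related conformations are counted separately.

```python
def count_lattice_conformations(n_bonds):
    """Count self-avoiding chains with n_bonds bonds on a 2D square lattice."""
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]

    def extend(position, occupied, remaining):
        if remaining == 0:
            return 1
        total = 0
        for dx, dy in moves:
            nxt = (position[0] + dx, position[1] + dy)
            if nxt not in occupied:  # two residues cannot share a site
                occupied.add(nxt)
                total += extend(nxt, occupied, remaining - 1)
                occupied.remove(nxt)
        return total

    return extend((0, 0), {(0, 0)}, n_bonds)


# Conformation counts for chains with 1 to 8 bonds.
counts = [count_lattice_conformations(n) for n in range(1, 9)]
```

The count grows by a factor of roughly 2.6 to 3 with each added bond, so exhaustive enumeration is out of reach for chains of protein length even in this simplified model.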

For the gamers out there…

Print & Online Resources: Crystallography Made Crystal Clear, by Gale Rhodes. Online tutorial with interactive applets and quizzes. Nice pictures demonstrating Fourier transforms. Cool movies demonstrating key points about diffraction, resolution, data quality, and refinement. Notes from a macromolecular crystallography course taught in Cambridge.