Goals for Today Introduce automated refinement and validation.

Presentation transcript:

Goals for Today. Introduce automated refinement and validation. Evaluate Rwork and Rfree for your ProK model. Refine ProK (automatic). Validate ProK (web server). Awards. Automatic refinement: refine the ProK-PMSF complex. Go forth wielding the tools of X-ray crystallography and discover the secrets of other biological macromolecules.

Real Space Refinement with manual intervention. A simplistic target: atoms move into the closest electron density. Manual adjustments improve the radius of convergence. [Figure labels: positive density, negative density.]

Radius of convergence. Manual adjustments improve the radius of convergence. [Figure: Cα-Cβ torsion angle; credit: Rupp.]

REAL vs RECIPROCAL
Real Space: Manual. Local. Large radius of convergence. Atomic movements are guided by the phases. Improvement in the model is limited by the quality of the phases.
Reciprocal Space: Automatic. Global. Small radius of convergence. Phases are not used in the refinement; they change. Improved phases lead to improved maps, improved interpretability, and an improved model.

Reciprocal Space. Target function: Edata (R-factor). Move atoms to minimize the R-factor, that is, minimize the discrepancy between Fobs and Fcalc. Specifically, least-squares refinement minimizes $E_{data} = \sum_{hkl} w\,(F_{obs} - F_{calc})^2$, summed over all hkl. Maximum likelihood allows for a non-random error model: given this model, what is the probability that the given set of data would be observed?
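The least-squares target above can be sketched in a few lines. This is only an illustration: the function name and the toy amplitudes are invented for the example, and unit weights are assumed when none are given.

```python
def least_squares_target(f_obs, f_calc, weights=None):
    """E_data = sum over hkl of w * (Fobs - Fcalc)^2."""
    if weights is None:
        # Assumption for this sketch: unit weights for every reflection.
        weights = [1.0] * len(f_obs)
    return sum(w * (fo - fc) ** 2 for w, fo, fc in zip(weights, f_obs, f_calc))


# Toy amplitudes (made up), one entry per reflection hkl.
f_obs = [100.0, 50.0, 25.0, 80.0]
f_calc_before = [90.0, 55.0, 30.0, 80.0]   # model before an atomic shift
f_calc_after = [98.0, 51.0, 26.0, 80.0]    # model after an atomic shift

e_before = least_squares_target(f_obs, f_calc_before)
e_after = least_squares_target(f_obs, f_calc_after)
print(e_before, e_after)
```

A shift that brings Fcalc closer to Fobs lowers the target, which is the direction a refinement program moves the atoms.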

Importance of supplementing the Data to Parameter Ratio in crystallographic refinement.
PARAMETERS: Each atom has 4 parameters (variables) to refine: x coordinate, y coordinate, z coordinate, and B factor. In proteinase K there are approximately 2000 atoms to refine. This corresponds to 2000 × 4 = 8000 variables.
DATA: At 2.5 Å resolution we have 8400 observations (data points, Fobs). When the number of observations equals the number of variables, a perfect fit can be obtained irrespective of the accuracy of the model. At 1.7 Å resolution we have 25,000 observations, about 3 observations per variable. The reliability of the model is still questionable. Adding stereochemical restraints is equivalent to adding observations.
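The arithmetic on this slide can be checked with a short sketch. The numbers are the ones quoted above; the function name is made up for illustration.

```python
PARAMS_PER_ATOM = 4  # x, y, z, and B factor


def observations_per_parameter(n_atoms, n_observations):
    """Ratio of observations (Fobs) to refined variables."""
    return n_observations / (n_atoms * PARAMS_PER_ATOM)


n_atoms = 2000  # approximate atom count for proteinase K, from the slide
ratio_2p5 = observations_per_parameter(n_atoms, 8400)    # 2.5 A data
ratio_1p7 = observations_per_parameter(n_atoms, 25000)   # 1.7 A data
print(ratio_2p5, ratio_1p7)
```

At 2.5 Å the ratio is barely above one, and at 1.7 Å it is about three, which is why the restraints are needed as extra observations.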

Automated Refinement (distinct from manual building). Two terms: $E_{total} = w_{data} E_{data} + E_{stereochemistry}$. Edata describes the difference between observed and calculated data. wdata is a weight chosen to balance the gradients arising from the two terms. Estereochemistry comprises empirical information about chemical interactions between atoms in the model. It is a function of all atomic positions and includes information about both covalent and non-bonded interactions.

Estereochemistry (Geometry)
BOND LENGTHS & ANGLES have ideal values (Engh & Huber dictionary).
CHIRALITY of α-carbons.
PLANARITY of peptide bonds and aromatic side chains.
NONBONDED CONTACTS: two atoms cannot occupy the same space at the same time.
TORSION ANGLE PREFERENCES: side chains have preferred rotamers. Some values of φ and ψ are forbidden (Ramachandran). Not restrained; used for validation.
Dictionary entry for the bonds of alanine:
loop_
_chem_comp_bond.comp_id
_chem_comp_bond.atom_id_1
_chem_comp_bond.atom_id_2
_chem_comp_bond.type
_chem_comp_bond.value_dist
_chem_comp_bond.value_dist_esd
ALA N CA single 1.458 0.019
ALA CA CB single 1.521 0.033
ALA CA C single 1.525 0.021
ALA C O double 1.231 0.020
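To make the bond-length part of Estereochemistry concrete, here is a minimal sketch that scores a set of coordinates against the alanine values listed above. The harmonic form, with each deviation divided by its esd, is an assumption about how such restraints are commonly weighted; the slides do not give the functional form. The coordinates are toy values placed along the axes, not a real alanine.

```python
import math

# (atom_1, atom_2): (ideal distance in A, esd in A), from the ALA entry above.
ALA_BONDS = {
    ("N", "CA"): (1.458, 0.019),
    ("CA", "CB"): (1.521, 0.033),
    ("CA", "C"): (1.525, 0.021),
    ("C", "O"): (1.231, 0.020),
}


def bond_energy(coords, bonds=ALA_BONDS):
    """Sum of ((d - d_ideal) / esd)^2 over all restrained bonds (assumed form)."""
    total = 0.0
    for (a, b), (ideal, esd) in bonds.items():
        d = math.dist(coords[a], coords[b])  # requires Python 3.8+
        total += ((d - ideal) / esd) ** 2
    return total


# Toy coordinates: every bond at its ideal length (angles are not realistic).
ideal_coords = {
    "CA": (0.0, 0.0, 0.0),
    "N": (1.458, 0.0, 0.0),
    "CB": (0.0, 1.521, 0.0),
    "C": (0.0, 0.0, 1.525),
    "O": (0.0, 0.0, 2.756),
}
# Same model with the N-CA bond stretched by two esd (0.038 A).
stretched_coords = dict(ideal_coords, N=(1.496, 0.0, 0.0))

e_ideal = bond_energy(ideal_coords)
e_stretched = bond_energy(stretched_coords)
print(e_ideal, e_stretched)
```

A bond that is off by two esd contributes about 4 to the sum, so tightly defined bonds (small esd) are penalized more strongly for the same absolute error.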

Jeopardy clue: The appearance of the atomic model when stereochemical restraints are not included in crystallographic refinement. $E_{total} = E_{stereochemistry} + w_{data} E_{data}$. What is spaghetti, Alex?

[Figure: the model refined with restraints (restrained) and without them (not restrained).]

2nd Jeopardy clue: The value of the R-factor resulting when stereochemical restraints are not included in crystallographic refinement. $E_{total} = E_{stereochemistry} + w_{data} E_{data}$. What is zero, Alex?

The need for Cross-Validation. An atomic model should be validated by several unbiased indicators. Low RMS deviations in bond lengths and angles do not guarantee a correct structure. Rfree is an unbiased indicator of the discrepancy between the model and the data. The data used in this R-factor calculation were not used in determining atomic shifts in the refinement process. The Ramachandran plot is unbiased because the phi and psi torsion angles are not restrained in the refinement process.
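The split between Rwork and Rfree can be illustrated with a small sketch. It assumes the usual R-factor definition, R = Σ|Fobs − Fcalc| / ΣFobs over amplitudes on a common scale, and a boolean flag per reflection marking the test set; the function names, flags, and amplitudes are all invented for the example.

```python
def r_factor(pairs):
    """R = sum |Fobs - Fcalc| / sum Fobs over (Fobs, Fcalc) pairs."""
    return sum(abs(fo - fc) for fo, fc in pairs) / sum(fo for fo, _ in pairs)


def r_work_and_free(f_obs, f_calc, free_flags):
    """Compute R separately for working reflections and the held-out test set."""
    work = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, free_flags) if not free]
    test = [(fo, fc) for fo, fc, free in zip(f_obs, f_calc, free_flags) if free]
    return r_factor(work), r_factor(test)


# Toy data: the flagged reflections are never used to shift atoms.
f_obs = [100.0, 50.0, 25.0, 80.0, 60.0, 40.0]
f_calc = [95.0, 52.0, 24.0, 70.0, 66.0, 40.0]
free_flags = [False, False, False, True, False, True]

r_work, r_free = r_work_and_free(f_obs, f_calc, free_flags)
print(round(r_work, 4), round(r_free, 4))
```

Because the flagged reflections play no part in the refinement, overfitting lowers Rwork without lowering Rfree.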

Lowest energy φ,ψ angles correspond to α-helices and β-sheets. [Figure: Ramachandran plot with the α-helix region labeled.] Let's focus on recognizing helix and strand features in electron density maps.
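φ and ψ are dihedral angles, each defined by four consecutive backbone atoms (φ: C of the previous residue, N, Cα, C; ψ: N, Cα, C, N of the next residue). The sketch below computes a dihedral from four points in plain Python. The function name is made up, and the test points are toy coordinates chosen to give easy angles, not protein atoms.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle about the p1-p2 bond, in degrees."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


p0, p1, p2 = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
angle_eclipsed = dihedral_degrees(p0, p1, p2, (1.0, 0.0, 1.0))    # same side
angle_opposite = dihedral_degrees(p0, p1, p2, (-1.0, 0.0, 1.0))   # opposite side
angle_quarter = dihedral_degrees(p0, p1, p2, (0.0, 1.0, 1.0))     # quarter turn
print(angle_eclipsed, angle_opposite, angle_quarter)
```

Applying this to every residue gives the (φ, ψ) pairs that are plotted on a Ramachandran plot.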

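[Figure: backbone amide, with atoms O, N, and H labeled.]

BAD. [Figure: backbone amide (O, N, H) 2.8 Å from an Asn side chain (H, O, N).]

GOOD. [Figure: backbone amide (O, N, H) 2.8 Å from an Asn side chain (H, O, N, H).]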

ERRAT examines distances between non-bonded atoms. Reports the deviations of C-C, C-N, C-O, N-N, N-O, O-O distances from distributions characteristic of reliable structures.

Verify 3D plot. Indicates whether the sequence has been improperly threaded through the density. It measures the compatibility of a model with its sequence. For each residue in the structure, measured values of (1) surface area buried, (2) fraction of side-chain area covered by polar atoms, and (3) local secondary structure are compared to the values preferred for its amino acid type. [Figure panels: correct trace, backwards trace.] Report the fraction of residues with a score greater than 0.2.
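The quantity to report is a simple count. As a sketch, assuming the per-residue scores have already been copied out of the server output into a list (the scores below are invented, and the function name is hypothetical):

```python
def fraction_above(scores, threshold=0.2):
    """Fraction of residues whose score is strictly greater than the threshold."""
    return sum(1 for s in scores if s > threshold) / len(scores)


# Made-up per-residue scores for an eight-residue stretch.
scores = [0.45, 0.31, 0.18, 0.52, -0.05, 0.27, 0.40, 0.22]
fraction = fraction_above(scores)
print(fraction)
```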

Plan for today: Solve the structure of the ProK-PMSF complex. PMSF: phenylmethylsulfonyl fluoride. [Figure: PMSF (atoms H, O, F labeled) and the ProK active site, Ser225.]

The beauty of isomorphism. $\rho(x,y,z) = \frac{1}{V} \sum_{hkl} |F_{obs}|\, e^{-2\pi i (hx + ky + lz - \phi_{calc})}$
Initial phases: phases from the native proteinase K structure, φcalc ProK.
Fobs amplitudes: use the |FProK-PMSF| data measured earlier in the course.
protein a (Å) b (Å) c (Å) α β γ
ProK 67.9 101.8 90°
ProK+PMSF 102.5
Riso = 15.2%
What is the maximum possible Riso? What is the minimum possible Riso? Why don't we have to use heavy atoms? Why don't we have to use molecular replacement?
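Riso compares two sets of amplitudes measured from related crystals. The slide does not give the formula; the sketch below assumes one common definition, Riso = Σ| |F_complex| − |F_native| | / Σ|F_native|, with both data sets already on the same scale. The function name and amplitudes are made up.

```python
def r_iso(f_native, f_complex):
    """Riso = sum | |F_complex| - |F_native| | / sum |F_native| (assumed definition)."""
    numerator = sum(abs(abs(fc) - abs(fn)) for fn, fc in zip(f_native, f_complex))
    denominator = sum(abs(fn) for fn in f_native)
    return numerator / denominator


# Toy amplitudes for the same four reflections in both crystals.
f_prok = [100.0, 50.0, 25.0, 80.0]
f_prok_pmsf = [110.0, 45.0, 25.0, 70.0]

riso = r_iso(f_prok, f_prok_pmsf)
print(f"Riso = {100 * riso:.1f}%")
```

A small Riso indicates that the two crystals are close to isomorphous, which is what allows the native phases to be reused with the complex amplitudes.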

Fo-Fc Difference Fourier map. $\rho(x,y,z) = \frac{1}{V} \sum_{hkl} \left(|F_{obs}| - |F_{calc}|\right) e^{-2\pi i (hx + ky + lz - \phi_{calc})}$
Here, Fobs will correspond to the Proteinase K-PMSF complex. Fcalc will correspond to the model of Proteinase K by itself after a few cycles of automated refinement. Positive electron density will correspond to features present in the PMSF complex that are not in the native structure. Negative electron density will correspond to features present in the native structure that should be removed in the inhibitor complex. After model building, do more automated refinement and then validate.

Plan for today (continued). Remove waters from the autobuilt ProK model. Use this as a starting model to refine against the ProK-PMSF data. Then manually build the PMSF inhibitor into an Fo-Fc difference Fourier map. The refinement process typically iterates between automated refinement and manual building. Automated refinement has a limited radius of convergence. For example, automated refinement cannot jump between rotamers or flip between cis and trans peptides. Validate the structure. Fill out the Refinement Statistics table.

3 Key Concepts
When to use isomorphous difference Fourier to solve the phase problem.
How to interpret an Fo-Fc Difference Fourier map.
RMS deviation from ideal geometry; difference between cis and trans peptides; methods of cross-validation.

Name _______________________
Columns: Proteinase K from Tritirachium album; ProK + PMSF.
Rows: Number of least-squares parameters; protein atoms; Errat overall quality factor.

Cis vs. Trans peptide. [Figure: two peptide planes (Cα, C, O, N) with side chains R in each configuration, labeled "LOTS OF FREEDOM!" and "Steric CLASH".]

Cis OK with glycine or proline. Steric hindrance is equivalent for cis or trans. [Figure: peptide planes (Cα, C, O, N) with a single side chain R.]

Steric hindrance is equivalent for cis or trans proline. [Figure: peptide planes with the proline ring atoms Cα, Cβ, Cγ, and Cδ labeled.]