1
Direct Methods and Many Site Se-Met MAD Problems using BnP. W. Furey
2
Classical Direct Methods: the main method for “small molecule” structure determination. Highly automated (almost totally “black box”). Solves structures containing up to a few hundred non-hydrogen atoms in the asymmetric unit.
3
Direct Methods Assumptions and Requirements: non-negativity of electron density; atoms are “resolved”, i.e. “atomic resolution” data are available; unit cell, symmetry and contents are known.
4
Important Concepts - 1: Normalized structure factors $E_H$ are given by $E_H = F_H / \langle |F_H|^2 \rangle^{1/2}$, with averaging in resolution shells. The phase $\phi_H$ of $E_H$ is the same as for $F_H$. $\langle |E_H|^2 \rangle = 1$, hence “normalized”.
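The normalization described on this slide can be sketched in a few lines. This is a minimal illustration, not the BnP implementation: the function name, the equal-width binning on 1/d² and the input layout (a list of `(d_spacing, amplitude)` pairs) are assumptions, and symmetry (epsilon) factors are ignored.

```python
import math
from collections import defaultdict


def normalize_structure_factors(reflections, n_shells=10):
    """Return |E| for each (d_spacing, |F|) pair, in input order.

    Hypothetical helper: shells are equal-width bins in 1/d^2, and each
    |F| is divided by the root-mean-square |F| of its own shell.
    """
    s2 = [1.0 / (d * d) for d, _ in reflections]
    lo, hi = min(s2), max(s2)
    width = (hi - lo) / n_shells or 1.0  # guard against a single-resolution input

    def shell(x):
        # Clamp so the highest-resolution reflection lands in the last shell.
        return min(int((x - lo) / width), n_shells - 1)

    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, (_, f) in zip(s2, reflections):
        k = shell(x)
        sums[k] += f * f
        counts[k] += 1

    e_values = []
    for x, (_, f) in zip(s2, reflections):
        k = shell(x)
        mean_f2 = sums[k] / counts[k]
        e_values.append(f / math.sqrt(mean_f2) if mean_f2 > 0 else 0.0)
    return e_values


# Two low-resolution and two high-resolution reflections, two shells.
example = [(2.0, 10.0), (2.0, 20.0), (1.0, 1.0), (1.0, 3.0)]
print(normalize_structure_factors(example, n_shells=2))
```

By construction the mean of E² is 1 within every shell, which is what makes E magnitudes from different resolution ranges comparable.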
5
Important Concepts - 2: A structure invariant is a structural quantity independent of the choice of unit cell origin. Probabilistic estimates can be made for the values of structure invariants, given the associated E magnitudes and cell contents.
7
Fundamental formulas involving individual triplets: $P(\Phi_{HK}) = [2\pi I_0(A_{HK})]^{-1} \exp(A_{HK} \cos\Phi_{HK})$, where $P(\Phi_{HK})$ is the probability of the structure invariant having the value $\Phi_{HK}$, and $A_{HK} = 2 |E_H E_K E_{-H-K}| / N^{1/2}$, where N is the number of atoms in the cell and the E’s are normalized structure factors.
8
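Note that the probability of $\Phi_{HK}$ lying close to zero increases as $A_{HK}$ increases, and that $A_{HK}$ is proportional to the product of the E’s and inversely proportional to $N^{1/2}$. The expected value of $\cos\Phi_{HK}$ is given by $\langle \cos\Phi_{HK} \rangle = I_1(A_{HK}) / I_0(A_{HK})$.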
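The triplet formulas can be checked numerically. The sketch below evaluates the Cochran distribution and the expected cosine, using a plain power series for the modified Bessel functions I_0 and I_1; all function names are illustrative, and the E values and atom counts in the loop are made-up inputs.

```python
import math


def bessel_i(order, x, terms=40):
    """Modified Bessel function I_order(x) from its power series."""
    total = 0.0
    for k in range(terms):
        total += (x / 2.0) ** (2 * k + order) / (
            math.factorial(k) * math.factorial(k + order)
        )
    return total


def cochran_a(e_h, e_k, e_hk, n_atoms):
    """A_HK = 2 |E_H E_K E_-H-K| / sqrt(N)."""
    return 2.0 * abs(e_h * e_k * e_hk) / math.sqrt(n_atoms)


def cochran_pdf(phi, a):
    """P(Phi_HK) = exp(A cos Phi) / (2 pi I_0(A))."""
    return math.exp(a * math.cos(phi)) / (2.0 * math.pi * bessel_i(0, a))


def expected_cos(a):
    """<cos Phi_HK> = I_1(A) / I_0(A)."""
    return bessel_i(1, a) / bessel_i(0, a)


# The same three strong reflections become less informative as N grows.
for n_atoms in (100, 1000, 10000):
    a = cochran_a(2.0, 2.0, 2.0, n_atoms)
    print(n_atoms, round(a, 3), round(expected_cos(a), 3))
```

The loop illustrates why classical direct methods weaken for large structures: with fixed E magnitudes, A_HK falls as N^(-1/2) and the expected cosine drifts toward zero.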
9
Figure: Cochran distribution for various values of K, plotted against $\Phi_3$, where $\Phi_3 = \Phi_{HK}$ and $K = A_{HK}$.
13
Classical Direct Methods Applications for Proteins: used for phase extension to very high resolution; used with moderate success to locate heavy atom sites in isomorphous derivatives; E values used in molecular replacement calculations.
14
Current Direct Methods Applications for Proteins: Shake-and-Bake (SnB, based on the minimum function) has been used to solve complete protein structures with over 1,000 atoms (rubredoxin, lysozyme, calmodulin, etc.), provided data to 1.1 Å or better are available. It is also used to locate anomalous scatterer sites from MAD or SAS data.
15
General Shake-and-Bake Concept: Use a multi-solution method starting with random phases (or randomly positioned atoms) in each trial. For each trial phase set, use a “dual space” procedure iterating between real and reciprocal space optimization/constraints.
16
Reciprocal space optimization is based on shifting phases to reduce the “minimum function” $R(\phi)$. Real space optimization and constraints are based on computing new phases only from the largest peaks in the map calculated with the previous cycle's phases. Each trial phase set is ranked by its value of $R(\phi)$.
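The slide does not write out R itself. In the published Shake-and-Bake literature the minimal function is the A-weighted mean squared difference between each triplet cosine and its expected value I_1(A)/I_0(A), with the triplet invariant taken as the sum of the phases of H, K and -H-K. The sketch below evaluates that form; the function names and the data layout (a dict of phases keyed by hkl, and a list of `(h, k, A_HK)` triplets) are assumptions for illustration only.

```python
import math


def bessel_ratio(a, terms=40):
    """I_1(a) / I_0(a), both evaluated from their power series."""
    i0 = sum((a / 2.0) ** (2 * k) / math.factorial(k) ** 2 for k in range(terms))
    i1 = sum(
        (a / 2.0) ** (2 * k + 1) / (math.factorial(k) * math.factorial(k + 1))
        for k in range(terms)
    )
    return i1 / i0


def phase_of(phases, hkl):
    """Look up a phase, using Friedel's law (phase of -H is minus phase of H)."""
    if hkl in phases:
        return phases[hkl]
    friedel = tuple(-x for x in hkl)
    return -phases[friedel]


def minimal_function(phases, triplets):
    """R = sum_A A (cos Phi_HK - I1(A)/I0(A))^2 / sum_A A over all triplets."""
    numerator = 0.0
    denominator = 0.0
    for h, k, a in triplets:
        minus_hk = tuple(-(x + y) for x, y in zip(h, k))
        invariant = phase_of(phases, h) + phase_of(phases, k) + phase_of(phases, minus_hk)
        numerator += a * (math.cos(invariant) - bessel_ratio(a)) ** 2
        denominator += a
    return numerator / denominator


# One triplet, all phases zero: cos(Phi) = 1 overshoots the expected value.
trial = {(1, 0, 0): 0.0, (0, 1, 0): 0.0, (1, 1, 0): 0.0}
print(minimal_function(trial, [((1, 0, 0), (0, 1, 0), 2.0)]))
```

A lower R means the trial phases reproduce the statistically expected triplet cosines more closely, which is why trials can be ranked by it.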
17
SnB inner loop for a trial structure: generate random trial structure -> compute phases from structure -> shift phases to reduce $R(\phi)$ -> compute map from new phases -> select “structure” from largest peaks -> compute phases from structure again. Stop after N iterations.
18
Choice of data for Se determination: use the anomalous difference $\big| |F_H^{+}| - |F_H^{-}| \big|$ at a single $\lambda$; use the dispersive difference $\big| |F_H|_{\lambda_i} - |F_H|_{\lambda_j} \big|$ between two $\lambda$’s; use $F_A$ values (derived from data at all $\lambda$’s); use $F_{HLE}$ values based on the maximum anomalous and maximum dispersive differences.
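The first two choices are simple amplitude differences and can be sketched directly. The data layout (wavelength label, then hkl, then a Bijvoet pair of amplitudes), the wavelength labels and all numbers below are invented for illustration; F_A and F_HLE need more machinery and are not attempted here.

```python
def mean_amplitude(f_plus, f_minus):
    """Bijvoet-averaged amplitude for one reflection at one wavelength."""
    return 0.5 * (f_plus + f_minus)


def anomalous_difference(f_plus, f_minus):
    """| |F+| - |F-| | at a single wavelength."""
    return abs(f_plus - f_minus)


def dispersive_difference(pair_i, pair_j):
    """| |F|_i - |F|_j | between two wavelengths, using Bijvoet-averaged |F|."""
    return abs(mean_amplitude(*pair_i) - mean_amplitude(*pair_j))


# Hypothetical amplitudes: data[wavelength][hkl] = (|F+|, |F-|)
data = {
    "peak": {(1, 2, 3): (105.0, 95.0)},
    "remote": {(1, 2, 3): (98.0, 96.0)},
}

hkl = (1, 2, 3)
print("anomalous at peak:", anomalous_difference(*data["peak"][hkl]))
print("dispersive peak/remote:", dispersive_difference(data["peak"][hkl], data["remote"][hkl]))
```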
21
MAD Phasing: For data collected at $\lambda_1$, $\lambda_2$, etc., choose a wavelength $\lambda_n$ as “native” data, and “reduce” that data set by averaging Bijvoet pairs. For other “derivative” wavelengths $\lambda_d$, reduce both with averaging of Bijvoet pairs to form “isomorphous” data sets, and without averaging to form “anomalous” data sets.
22
MAD Phasing: For “isomorphous” and “derivative anomalous” data sets, scale “derivative” to “native” and use scattering factors of $f_0 = 0$, $f' = f'(\lambda_d) - f'(\lambda_n)$, $f'' = f''(\lambda_d)$. For “native anomalous” data, use the original native Bijvoet pairs and scattering factors of $f_0 = 0$, $f' = 0$, $f'' = f''(\lambda_n)$.
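The bookkeeping on this slide, treating MAD as a pseudo-MIR problem, can be written down as two small functions. The names are illustrative, and the f' and f'' numbers in the table are placeholders, not measured selenium values.

```python
from typing import NamedTuple


class ScatteringFactors(NamedTuple):
    f0: float
    f_prime: float
    f_double_prime: float


def derivative_factors(fp_d, fpp_d, fp_n):
    """Factors for "isomorphous" and "derivative anomalous" data sets.

    f0 = 0, f' = f'(lambda_d) - f'(lambda_n), f'' = f''(lambda_d)
    """
    return ScatteringFactors(0.0, fp_d - fp_n, fpp_d)


def native_anomalous_factors(fpp_n):
    """Factors for "native anomalous" data: f0 = 0, f' = 0, f'' = f''(lambda_n)."""
    return ScatteringFactors(0.0, 0.0, fpp_n)


# Placeholder (f', f'') per wavelength; real values come from a scan or tables.
anomalous_table = {
    "edge": (-10.0, 3.0),
    "peak": (-8.0, 5.0),
    "remote": (-3.0, 3.5),
}

native = "remote"
fp_n, fpp_n = anomalous_table[native]
print(native, "native anomalous:", native_anomalous_factors(fpp_n))
for name, (fp_d, fpp_d) in anomalous_table.items():
    if name != native:
        print(name, "vs", native, ":", derivative_factors(fp_d, fpp_d, fp_n))
```

Setting f_0 to zero reflects that the "heavy atom" in this treatment is only the wavelength-dependent part of the scatterer; the normal scattering is already contained in the "native" amplitudes.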
23
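Phase Refinement: Minimizing $\sum_h W_h \sum_{\phi_P} P_{\phi_P} \left[ |F_{PH}^{obs}|_h - |F_{PH}^{calc}(\phi_P)|_h \right]^2$, where $|F_{PH}^{calc}|_h^2 = |F_P^{obs}|_h^2 + |F_H^{calc}|_h^2 + 2 |F_P^{obs}|_h |F_H^{calc}|_h \cos(\phi_P - \phi_H)_h$.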
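The quantity being minimized is a weighted lack-of-closure sum: for each reflection h, the calculated derivative amplitude follows from the law of cosines applied to the protein and heavy-atom vectors, and the squared misfit to the observed amplitude is averaged over protein phases with probabilities P and then weighted by W_h. The sketch below assumes that reading of the slide; the dictionary keys and function names are hypothetical.

```python
import math


def fph_calc(fp_obs, fh_calc, phi_p, phi_h):
    """|FPHcalc| from |FPobs|, |FHcalc| and the two phases (law of cosines)."""
    squared = (
        fp_obs ** 2
        + fh_calc ** 2
        + 2.0 * fp_obs * fh_calc * math.cos(phi_p - phi_h)
    )
    return math.sqrt(max(squared, 0.0))  # guard against tiny negative round-off


def weighted_residual(reflections):
    """Sum over h of W_h * sum over phi_P of P(phi_P) * (|FPHobs| - |FPHcalc(phi_P)|)^2."""
    total = 0.0
    for r in reflections:
        inner = sum(
            prob * (r["fph_obs"] - fph_calc(r["fp_obs"], r["fh_calc"], phi_p, r["phi_h"])) ** 2
            for phi_p, prob in r["phase_probs"]
        )
        total += r["weight"] * inner
    return total


# One reflection with two equally likely protein phases (0 and pi).
reflection = {
    "fp_obs": 100.0,
    "fh_calc": 10.0,
    "phi_h": 0.0,
    "fph_obs": 110.0,
    "weight": 1.0,
    "phase_probs": [(0.0, 0.5), (math.pi, 0.5)],
}
print(weighted_residual([reflection]))
```

With a single phase of probability 1 per reflection this reduces to the "classical" centroid-phase case; supplying a full phase distribution gives the probability-weighted form.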
24
Phase Refinement Options: “Classical”: $\phi_P$ = centroid, $W_h = 1/E^2$, 1/ or unity, $P_{\phi_P} = 1$, use reflections with FOM > 0.4-0.6. “Maximum Likelihood”: $\phi_P$ stepped over allowed phases, $P_{\phi_P}$ = corresponding probability, $W_h = 1/E^2$, 1/ or unity, use reflections with FOM > 0.2. $\phi_P$ and $P_{\phi_P}$ can also come from an external source, i.e. solvent flattened or NC-symmetry averaged maps. In all cases the quantity minimized is $\sum_h W_h \sum_{\phi_P} P_{\phi_P} \left[ |F_{PH}^{obs}|_h - |F_{PH}^{calc}(\phi_P)|_h \right]^2$.
28
Projection of peaks down NC twofold
29
Flowchart: MAD $\lambda_1$, $\lambda_2$, $\lambda_3$ data (Scalepack files); “iso” and “ano” scaled files; “extension” file with all “native” ($\lambda_3$) data; programs CMBISO, CMBANO, PHASIT, MISSNG, FSFOUR, BNDRY, MAPINV, EXTRMP, MAPAVG, BLDCEL; “phase” file; “submap” file; “averaging” mask file; final map.
31
MAD Phasing/Averaging Statistics
34
Peak anomalous ($\lambda_2$) difference Patterson
40
With SnB it is possible to automatically locate the anomalous scatterer substructure with data from any one of the dispersive combinations or anomalous pair sets. As expected, sets with the maximum dispersive or anomalous signal typically yield a greater frequency of success.
41
Automated Applications of BnP: Methodology. W. Furey (1), L. Pasupulati (1), S. Potter (2), H. Xu (2), R. Miller (3) & C. Weeks (2). (1) University of Pittsburgh School of Medicine and VA Medical Center; (2) Hauptman-Woodward Medical Research Institute; (3) Center for Computational Research, SUNY at Buffalo.
42
SnB Strengths: 1. Powerful, state-of-the-art direct methods for automatically locating heavy atom sites. 2. Friendly graphical user interface.
SnB Weaknesses: 1. Stops after finding sites, i.e. no protein phasing. 2. No software interface.
PHASES Strengths: 1. Proven protein phasing (MAD, MIRAS, etc.), solvent flattening, NCS averaging, external program interfacing. 2. Interactive graphics.
PHASES Weaknesses: 1. Doesn’t automatically find heavy atom sites. 2. Script based, i.e. no GUI.
Goal: Provide user-friendly software for automatic determination of protein crystal structures.
43
Adopted Strategy: Combine the SnB program with the “PHASES” package, putting everything under GUI control. Establish default parameters and procedures allowing all aspects of the structure determination to be fully automated. Also provide a manual mode that gives experienced users more control and facilitates development. Provide graphical feedback when possible. Facilitate coupling with popular external software.
44
Main Developments Required for Automated Structure Determination: automatic substructure solution detection; automatic substructure validation; automatic hand determination (including space group changes, when needed).
45
Automatic Substructure Solution Detection. Original method: based on histogram (manual, time consuming, requires user interaction). Current method: based on $R_{min}$ and $R_{cryst}$ statistics (automatic, fast, no user interaction).
46
Automatic Substructure Validation. Original method: left up to the user to decide which peaks correspond to true sites (manual). Current method (auto mode): based on occupancy refinement against Bijvoet differences (automatic, fast, requires no coordinate refinement, hand insensitive). Current method (manual mode): as in auto mode, but peaks from different solutions can also be compared (manual).
47
Automatic Substructure Validation
48
Automatic Hand Determination. Original method: visual inspection of map projections (manual, requires user interaction). Current method (MAD, SIRAS or MIRAS): based on variance differences in protein and solvent regions (automatic, fast since it requires no refinement, and requires no user interaction).
49
Automatic Hand Determination. Current method (SAS data only): comparative analysis of R, FOM and CC after solvent flattening/phase combination (automatic, fast, requires no refinement). Current method (SIR, MIR data only): both hands tried, map examination needed (requires user interaction).
50
No man (or program) is an island.
Importing data files: Scalepack files, D*Trek files, MTZ files ($), free format files.
Exporting control files: O, RESOLVE 2.08, Arp/wARP 6.1.1.
Exporting data files: free format files, CNS files, MTZ files ($), O files, CHAIN files, PDB files.
Job submission from GUI: RESOLVE 2.08 ($), Arp/wARP 6.1.1 ($).
($) RESOLVE, Arp/wARP and/or CCP4 must be obtained from their respective authors/distributors for these options to work.
51
Results for 1JC4
Cell: a = 43.6, b = 78.6, c = 89.4 Å, $\beta$ = 91.95°, space group P2_1
Contents: 4 molecules (592 residues) in the asymmetric unit
Data: 2.1 Å, 3-wavelength MAD
Substructure: found 24 of 24 Se
Phasing: mean phasing power 2.95; mean FOM 0.661
Time to map: ~41 min on G4 (1.5 GHz) Powerbook; ~13 min on G5 (2.7 GHz) desktop
Auto traceability: RESOLVE 87% main chain, 68% side chain; Arp/wARP 82% main chain, 73% side chain
52
SeMet ASU Size & Data Resolution
PDB Code   No. Sites   No. Residues   NCS   d (Å)
1QC2        4           169            1    1.5
1BX4        7           345            1    2.25
1CB0        8           283            1    2.2
1T5H       10           504            1    2.5
2JXH       12           576            2    3.1
1GSO       13           431            1    2.22
2TPS       15           454            2    2.7
1DBT       19           717            3    2.49
1JEN       22           668            2    2.25
1JC4       24           592            4    2.1
1CLI       28          1380            4    3.0
1A7A       30           864            2    2.8
1L8A       40          1772            2    2.6
1E3M       45          1600            2    3.0
1HI8       50          1328            2    2.8
1GKP       54          2748            6    2.5
1DQ8       60          1868            4    2.33
1E2Y       60          1880           10    3.2
1M32       66          2196            6    2.55
1EQ2       70          3100           10    2.9
53
Phasing Flexibility (Manual Mode)
54
Conclusion: BnP is a user-friendly, efficient package for the automated determination of protein structures from X-ray diffraction data. BnP downloads for Linux, Apple (G4, G5 and Intel) and SGI are available to academic and non-profit institutions at http://www.hwi.buffalo.edu/BnP/