A Probabilistic Approach to Protein Backbone Tracing in Electron Density Maps. Frank DiMaio and Jude Shavlik (Computer Sciences Department), George Phillips (Biochemistry Department), University of Wisconsin – Madison, USA. Presented at the Fourteenth Conference on Intelligent Systems for Molecular Biology (ISMB 2006), Fortaleza, Brazil, August 7, 2006.

X-ray Crystallography: [Figure: an X-ray beam strikes a protein crystal and the diffraction pattern is recorded on a collection plate; an FFT yields the electron density map (“3D picture”).]

Given: Sequence + Electron Density Map

Find: Each Atom’s Coordinates

Our Subtask: Backbone Trace [Figure: chain of connected Cα positions.]

The Unit Cell: A 3D density function ρ(x,y,z) is provided over the unit cell. The unit cell may contain multiple copies of the protein.

Density Map Resolution: [Figure: resolution axis from 2Å to 4Å placing ARP/wARP (Perrakis et al. 1997), TEXTAL (Ioerger et al. 1999), and Resolve (Terwilliger 2002), with the range labeled “Our focus” marked.]

Overview of ACMI (our method): (1) Local Match. The algorithm searches for sequence-specific 5-mers centered at each amino acid; this produces many false positives. (2) Global Consistency. Use a probabilistic model to filter false positives and find the most probable backbone trace.

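5-mer Lookup and Cluster: [Figure: a 5-mer from the sequence (… VKH V LVSPEKIEELIKGY …) is looked up in the PDB, and the matching structures are grouped into Cluster 1 and Cluster 2, with weights wt=0.67 and wt=0.33.] NOTE: this can be done in a precompute step.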

5-mer Search: Perform a 6D search (rotation + translation) for the representative structures in the density map. Compute a “similarity” score at each position, using Fourier convolution (Cowtan 2001). Use a tuneset to convert the similarity score to a probability.
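The translational part of such a search can be illustrated with a small sketch. It assumes a plain (unnormalized) cross-correlation as the similarity score and a periodic grid; the actual scoring function of Cowtan (2001) is not reproduced here, and the function name `fft_cross_correlation` and the toy data are hypothetical. A full 6D search would repeat this once per sampled rotation of the template.

```python
import numpy as np

def fft_cross_correlation(density, template):
    """Score every periodic translation of `template` against `density`.

    Assumption: similarity = raw cross-correlation (illustrative only).
    """
    # Zero-pad the template so it lives on the same grid as the map.
    padded = np.zeros_like(density)
    sx, sy, sz = template.shape
    padded[:sx, :sy, :sz] = template
    # Correlation theorem: corr[t] = sum_x template[x] * density[x + t]
    spectrum = np.fft.fftn(density) * np.conj(np.fft.fftn(padded))
    return np.real(np.fft.ifftn(spectrum))

# Toy map: weak noise plus one planted copy of the template.
rng = np.random.default_rng(0)
density = rng.normal(scale=0.1, size=(16, 16, 16))
template = rng.random((4, 4, 4)) + 0.5
density[5:9, 2:6, 10:14] += template

scores = fft_cross_correlation(density, template)
best = tuple(int(i) for i in np.unravel_index(np.argmax(scores), scores.shape))
print("best translation:", best)
```

All translations are scored with three FFTs instead of one pass over the template per grid point.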

Convert Scores to Probabilities: [Figure: each 5-mer representative is searched against the density map, giving scores t_i(u_i); the scores are matched to the tuneset score distributions (POS and NEG), and Bayes’ rule gives a probability distribution over the unit cell, P(5-mer at u_i | Map).]
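As a rough illustration of this step, the sketch below applies Bayes' rule to raw match scores. It assumes the positive and negative tuneset score distributions are summarized as Gaussians and that a prior match probability is supplied; the slides do not state how the distributions are modeled, so `gaussian_pdf`, `scores_to_probabilities`, and every number here are hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution, evaluated elementwise."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def scores_to_probabilities(scores, pos, neg, prior):
    """P(match | score) by Bayes' rule.

    pos, neg: (mean, std) of tuneset scores for true and false matches
              (Gaussian form is an assumption of this sketch).
    prior:    assumed prior probability that a location is a true match.
    """
    like_pos = gaussian_pdf(scores, *pos)
    like_neg = gaussian_pdf(scores, *neg)
    return like_pos * prior / (like_pos * prior + like_neg * (1.0 - prior))

scores = np.array([0.1, 0.5, 0.9])
probs = scores_to_probabilities(scores, pos=(0.8, 0.15), neg=(0.2, 0.15), prior=0.01)
print(probs)
```

With equal spreads, a score halfway between the two means is uninformative and the posterior equals the prior; higher scores push the probability up.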

In This Talk… Where we are now: for each amino acid in the protein, we have a probability distribution over the unit cell. Where we are headed: find the backbone layout that maximizes the probability of the layout given the density map.

Pairwise Markov Field Models: A type of undirected graphical model. They represent joint probabilities as a product of vertex and edge potentials. Similar to (but more general than) Bayesian networks. [Figure: graph with label vertices u1, u2, u3 and evidence y.]

Protein Backbone Model (example chain: ALA, GLY, LYS, LEU): Each vertex is an amino acid. Each label is a location + orientation. The evidence y is the electron density map. Each vertex (or observational) potential comes from the 5-mer matching.

Protein Backbone Model (example chain: ALA, GLY, LYS, LEU): There are two types of structural (edge) potentials. Adjacency constraints ensure adjacent amino acids are ~3.8 Å apart and in the proper orientation. Occupancy constraints ensure nonadjacent amino acids do not occupy the same 3D space.
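A minimal sketch of what two such edge potentials might look like, using position only. The ~3.8 Å spacing comes from the slide; the Gaussian shape, the tolerance `DISTANCE_SIGMA`, and the clash radius `MIN_SEPARATION` are assumptions made for illustration, and the orientation term of the adjacency constraint is omitted.

```python
import numpy as np

CA_CA_DISTANCE = 3.8   # Angstroms, from the slide
DISTANCE_SIGMA = 0.3   # Angstroms, assumed tolerance (not from the source)
MIN_SEPARATION = 3.0   # Angstroms, assumed clash radius (not from the source)

def _distance(xi, xj):
    return float(np.linalg.norm(np.asarray(xi, float) - np.asarray(xj, float)))

def adjacency_potential(xi, xj):
    """High when sequence-adjacent residues sit ~3.8 A apart (orientation ignored)."""
    d = _distance(xi, xj)
    return float(np.exp(-0.5 * ((d - CA_CA_DISTANCE) / DISTANCE_SIGMA) ** 2))

def occupancy_potential(xi, xj):
    """Zero when two nonadjacent residues would occupy the same space."""
    return 0.0 if _distance(xi, xj) < MIN_SEPARATION else 1.0

print(adjacency_potential([0, 0, 0], [3.8, 0, 0]))   # ideal spacing
print(adjacency_potential([0, 0, 0], [6.0, 0, 0]))   # too far apart
print(occupancy_potential([0, 0, 0], [1.0, 0, 0]))   # clash
```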

Backbone Model Potential: constraints between adjacent amino acids.

Backbone Model Potential: constraints between nonadjacent amino acids.

Backbone Model Potential: observational (“amino-acid-finder”) probabilities.
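The three kinds of potential named on these slides combine in the usual pairwise Markov field form. The equation below is a sketch in assumed notation, not the slide's own formula: U = (u_1, …, u_N) is the backbone layout, y is the density map, and the symbols ψ_i, ψ_adj, and ψ_occ are names introduced here for the observational, adjacency, and occupancy potentials.

```latex
P(U \mid y) \;\propto\;
  \prod_{i=1}^{N} \psi_i(u_i, y)
  \;\times \prod_{i,\; j = i+1} \psi_{\mathrm{adj}}(u_i, u_j)
  \;\times \prod_{i,\; j > i+1} \psi_{\mathrm{occ}}(u_i, u_j)
```

Every pair of residues is linked by exactly one edge potential: an adjacency potential if the two are neighbors in the sequence, an occupancy potential otherwise.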

Probabilistic Inference: Exact methods are intractable. Use belief propagation (BP) to approximate the marginal distributions. We want to find the backbone layout that maximizes the joint probability under this model.

Belief Propagation (BP): An iterative, message-passing method (Pearl 1988). A message from amino acid i to amino acid j indicates where i expects to find j. An approximation to the marginal (the belief) is given as the product of incoming messages.
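The message-passing idea can be tried on a toy problem. The sketch below runs sum-product BP on a short chain with a handful of discrete labels per vertex, keeping only adjacency-style edges; this is a simplification, since the model described in this talk also has edges between nonadjacent residues and labels over a whole unit cell. The function names and random potentials are hypothetical. Each belief multiplies the vertex's own observational potential by its incoming messages, and on a chain the result matches the exact marginals, which the brute-force routine confirms.

```python
import itertools
import numpy as np

def chain_bp(unary, pairwise):
    """Sum-product BP on a chain.

    unary:    (N, K) vertex potentials.
    pairwise: (K, K) edge potential, pairwise[a, b] for (vertex i = a, vertex i+1 = b).
    """
    n, k = unary.shape
    fwd = np.ones((n, k))   # fwd[i]: message arriving at i from i-1
    bwd = np.ones((n, k))   # bwd[i]: message arriving at i from i+1
    for i in range(1, n):
        msg = (unary[i - 1] * fwd[i - 1]) @ pairwise
        fwd[i] = msg / msg.sum()
    for i in range(n - 2, -1, -1):
        msg = pairwise @ (unary[i + 1] * bwd[i + 1])
        bwd[i] = msg / msg.sum()
    beliefs = unary * fwd * bwd
    return beliefs / beliefs.sum(axis=1, keepdims=True)

def brute_force_marginals(unary, pairwise):
    """Exact marginals by enumerating all K**N labelings (tiny problems only)."""
    n, k = unary.shape
    marginals = np.zeros((n, k))
    for labels in itertools.product(range(k), repeat=n):
        p = np.prod([unary[i, labels[i]] for i in range(n)])
        p *= np.prod([pairwise[labels[i], labels[i + 1]] for i in range(n - 1)])
        for i, a in enumerate(labels):
            marginals[i, a] += p
    return marginals / marginals.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
unary = rng.random((4, 3))      # 4 residues, 3 candidate labels each
pairwise = rng.random((3, 3))
beliefs = chain_bp(unary, pairwise)
exact = brute_force_marginals(unary, pairwise)
print(np.abs(beliefs - exact).max())
```

On graphs with loops, the same updates are iterated and give only approximate marginals.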

Belief Propagation Example: [Figure: worked example for the adjacent pair ALA, GLY.]

Technical Challenges: (1) Representation of potentials. Store Fourier coefficients in Cartesian space; at each location x, store a single orientation r. (2) Speeding up the O(N²X²) naïve implementation, where X = the unit cell size (# Fourier coefficients) and N = the number of residues in the protein.

Speeding Up the O(N²X²) Implementation: (1) Each occupancy message costs O(X²), because each message must integrate over the unit cell; this drops to O(X log X) when done as a multiplication in Fourier space. (2) O(N²) messages are computed and stored; approximate the N-3 occupancy messages with a single message, giving O(N) messages using a message product accumulator. The improved implementation is O(NX log X).
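The Fourier-space speedup rests on the convolution theorem. The sketch below is a toy 1D version with a periodic grid: it assumes the edge potential depends only on the offset between the two positions, so the message integral is a circular convolution. The names `message_direct` and `message_fft` and the random inputs are hypothetical.

```python
import numpy as np

def message_direct(belief, kernel):
    """O(X^2): m[x] = sum_u belief[u] * kernel[(x - u) mod X]."""
    size = belief.size
    out = np.zeros(size)
    for x in range(size):
        for u in range(size):
            out[x] += belief[u] * kernel[(x - u) % size]
    return out

def message_fft(belief, kernel):
    """O(X log X): the same circular convolution as a product in Fourier space."""
    return np.real(np.fft.ifft(np.fft.fft(belief) * np.fft.fft(kernel)))

rng = np.random.default_rng(1)
belief = rng.random(64)
belief /= belief.sum()
kernel = rng.random(64)     # stand-in for an offset-dependent edge potential

slow = message_direct(belief, kernel)
fast = message_fft(belief, kernel)
print(np.abs(slow - fast).max())
```

The two results agree to floating-point precision; only the cost differs.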

1XMT at 3Å Resolution: 1.12Å RMSd, 100% coverage. [Figure: trace colored by prob(AA at location); color scale labeled LOW to HIGH with values 0.17 and 0.82.]

1VMO at 4Å Resolution: 3.63Å RMSd, 72% coverage. [Figure: trace colored by prob(AA at location); color scale labeled LOW to HIGH with values 0.02 and 0.25.]

1YDH at 3.5Å Resolution: 1.47Å RMSd, 90% coverage. [Figure: trace colored by prob(AA at location); color scale labeled LOW to HIGH with values 0.02 and 0.27.]

Experiments: Tested ACMI against other map interpretation algorithms, TEXTAL and Resolve. Used ten model-phased maps. Smoothly diminished the reflection intensities, yielding 2.5, 3.0, 3.5, and 4.0 Å resolution maps.

RMS Deviation: [Figure: Cα RMS deviation versus density map resolution for ACMI, TEXTAL, and Resolve.]
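For reference, a Cα RMS deviation between a predicted trace and a reference structure can be computed as in the sketch below. It assumes the two coordinate lists are already matched residue by residue and share the map's coordinate frame, so no superposition is applied; how residues were paired for the results reported in this talk is not stated, and `ca_rmsd` and the coordinates are hypothetical.

```python
import numpy as np

def ca_rmsd(predicted, reference):
    """Root-mean-square deviation over matched C-alpha coordinates (Angstroms)."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    if predicted.shape != reference.shape:
        raise ValueError("coordinate arrays must have the same shape")
    squared = np.sum((predicted - reference) ** 2, axis=1)
    return float(np.sqrt(squared.mean()))

reference = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0]])
predicted = reference + np.array([1.0, 0.0, 0.0])   # every atom off by 1 A
print(ca_rmsd(predicted, reference))
```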

Model Completeness: [Figure: % chain traced and % residues identified versus density map resolution for ACMI, TEXTAL, and Resolve.]

Per-protein RMS Deviation: [Figure: per-protein scatter plots with axes ACMI RMS error, TEXTAL RMS error, and Resolve RMS error.]

Conclusions: ACMI effectively combines weakly-matching templates to construct a full model. It produces an accurate trace even with poor-quality density map data. It reduces computational complexity from O(N²X²) to O(NX log X). Inference is possible even for large unit cells.

Future Work: Improve the “amino-acid-finding” algorithm. Incorporate sidechain placement / refinement. Manage missing data: disordered regions, and cases where only the exterior is visible (e.g., in CryoEM).

Acknowledgements: Ameet Soni; Craig Bingman; NLM grants 1R01 LM008796 and 1T15 LM007359.
