Assigning Transmembrane Segments to Helices in Intermediate-Resolution Structures. Angela Enosh, Sarel J. Fleishman, Nir Ben-Tal & Dan Halperin. Adapted from a presentation made by Angela Enosh.


Assigning Transmembrane Segments to Helices in Intermediate-Resolution Structures. Angela Enosh, Sarel J. Fleishman, Nir Ben-Tal & Dan Halperin. Adapted from a presentation made by Angela Enosh.

Lecture Outline: Background; The assignment problem; The algorithm; Validation.

TM proteins form helix bundles. Figure 1: 3D structure of bacteriorhodopsin. Transmembrane (TM) proteins cross the membrane planes and constitute approximately 50% of contemporary drug targets. Helices typically cross the membrane. Loops are typically located on the external or internal side of the membrane, connecting consecutive helices.

Adapted from vertrees.org/ by Jason Vertrees

TM protein amino-acid sequence: the 2D arrangement of TM segments and extra-membrane (EM) segments relative to the membrane can be predicted on the basis of the sequence data alone.

TM protein 3D structure: technical problems hamper TM protein structure determination. Only 30 distinct folds have been solved using high-resolution methods such as X-ray crystallography.

Cryo-electron microscopy (cryo-EM): determines protein structure at low resolution (>4 Å). Individual amino acids cannot be identified. It supplies the locations of the helices, but the exact structure is left ambiguous.

Cryo-electron microscopy (cryo-EM) Bovine rhodopsin; adapted from Krebs et al. (2003) J. Biol. Chem. 278, **

Problem description: input and target. Input: (1) the position, orientation and azimuth of the helices with respect to the membrane planes; (2) a partitioning of the sequence into TM segments (helices) and extra-membrane segments (loops). Target: find the correspondence between the TM helix segments and the cryo-EM helices, and attempt to reduce the number of possible assignments.

Example: given the helices seen in cryo-EM maps (A-G) and the sequence classified into TM/EM segments (I-VII), find the native assignment of TM segments (I-VII) to cryo-EM helices (A-G).

The Algorithm, Stage I: Pruning by distance constraints. Eliminate helix assignments based on the estimated maximal length of the loops. Construct an assignment graph that contains only the set of feasible assignments.

The Algorithm, Stage II: Ranking the feasible assignments. Use known protein structures taken from the Protein Data Bank (PDB). Score each assignment based on the capability of the loops to connect pairs of helices in 3D.

Formal Statement of the problem Sequence of all segments: TM segments: EM segments:

Formal Statement of the problem (cont.): Each 3D helix is given by the coordinates of its Cα atoms. The membrane is defined by an inner and an outer plane. The maximal distance between two points that can be connected by an EM segment is deduced from the distance between consecutive Cα atoms, typically 3.8 Å. The external and internal ends of each helix are denoted separately.

Formal Goals: Find all feasible assignments of TM segments to cryo-EM helices. An assignment is a permutation in which each TM segment is assigned to exactly one helix. Attribute a score to each assignment based on its compatibility with the locations of the helices. Remark: the locations of the N-terminus and C-terminus can be deduced experimentally.

Stage I: Pruning by Distance Constraints Acyclic Graph: Vertices: Edges:

Graph Example: [figure: assignment graph for TM segments I-III and helices A-C; EM segment I-II has 12 AA, EM segment II-III has 4 AA]. A valid path in G corresponds to a feasible assignment. Shorter EM segments yield fewer feasible assignments.

Graph construction: construction is bottom-up. A valid path in the graph is a path that starts at the first level, ends at the last level, follows an alternating sequence of internal/external edges, and does not contain two vertices with the same helix.
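The pruning idea of Stage I can be illustrated with a small Python sketch. It is not the layered graph G from the slides: it is a plain depth-first enumeration that applies the same validity rules (each helix used once, loops alternating between membrane sides, each loop short enough to span its gap). The function names, the `first_side` parameter, the toy coordinates and the bound of (n + 1) × 3.8 Å for a loop of n residues are all assumptions made for illustration.

```python
import math

CA_STEP = 3.8  # distance in Å between consecutive Cα atoms (value from the slides)

def max_span(loop_len):
    # Assumption: a loop of n residues bridges at most (n + 1) Cα-Cα steps.
    return (loop_len + 1) * CA_STEP

def feasible_assignments(helix_ends, loop_lens, first_side="ext"):
    """Enumerate helix orderings whose loops can all span their gaps.

    helix_ends: {name: {"ext": (x, y, z), "int": (x, y, z)}}
    loop_lens[i]: number of residues in the loop between TM segments i and i+1
    first_side: membrane side of the first loop (hypothetical parameter)
    """
    names = sorted(helix_ends)
    results = []

    def extend(path, side):
        if len(path) == len(names):
            results.append(tuple(path))
            return
        limit = max_span(loop_lens[len(path) - 1])
        here = helix_ends[path[-1]][side]
        for nxt in names:
            if nxt in path:
                continue  # a helix may appear only once in an assignment
            if math.dist(here, helix_ends[nxt][side]) <= limit:
                path.append(nxt)
                # consecutive loops lie on alternating sides of the membrane
                extend(path, "int" if side == "ext" else "ext")
                path.pop()

    for start in names:
        extend([start], first_side)
    return results

# Toy input: three helices in a row, 10 Å apart, with loops of 12 and 4 residues.
helix_ends = {
    name: {"ext": (x, 0.0, 15.0), "int": (x, 0.0, -15.0)}
    for name, x in (("A", 0.0), ("B", 10.0), ("C", 20.0))
}
assignments = feasible_assignments(helix_ends, [12, 4])
```

In this toy case the 4-residue loop cannot bridge the 20 Å gap between A and C, so two of the 3! = 6 orderings are pruned and four remain.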

Stage II: Ranking Feasible Assignments. A score is assigned to each feasible assignment stored in G. For each EM segment we define a score that expresses the feasibility of connecting two helices in 3D space by that segment.

Evaluation: the score is based on the length of the EM segment and on a statistical analysis conducted on solved structures of soluble proteins. Only helix-loop-helix motifs are used, denoted motif (A, L, B). All motifs with the same loop length (2-7) are examined together.

Loop length classification: only proteins that were less than 20% similar to one another were selected.

Evaluation: preprocessing. All motifs with the same loop length are placed in a common orthogonal reference frame so that all A's overlap. The starting points of the B's are placed in separate data structures. KD-trees are used for efficient axis-aligned queries.
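As a rough illustration of the preprocessing step, the following Python sketch builds a small kd-tree over 3D points and answers an axis-aligned box query. It is a minimal teaching version, not the data structure used by the authors; the function names and the sample points are hypothetical, and in practice one tree per loop length would be built from the starting points of the B helices.

```python
def build_kdtree(points, depth=0):
    """Build a kd-tree over 3D points, cycling through the x, y, z axes."""
    if not points:
        return None
    axis = depth % 3
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return {
        "point": pts[mid],
        "axis": axis,
        "left": build_kdtree(pts[:mid], depth + 1),       # coordinates <= split
        "right": build_kdtree(pts[mid + 1:], depth + 1),  # coordinates >= split
    }

def box_query(node, lo, hi, found=None):
    """Collect all stored points p with lo[i] <= p[i] <= hi[i] on every axis."""
    if found is None:
        found = []
    if node is None:
        return found
    p, axis = node["point"], node["axis"]
    if all(lo[i] <= p[i] <= hi[i] for i in range(3)):
        found.append(p)
    # Descend only into subtrees that can still intersect the query box.
    if lo[axis] <= p[axis]:
        box_query(node["left"], lo, hi, found)
    if p[axis] <= hi[axis]:
        box_query(node["right"], lo, hi, found)
    return found

# Hypothetical end points for one loop length, already in the common frame.
end_points = [(0, 0, 0), (1, 1, 1), (2, 2, 2), (5, 5, 5), (1, 0, 2)]
tree = build_kdtree(end_points)
inside = sorted(box_query(tree, (0, 0, 0), (2, 2, 2)))
```

The query skips any subtree whose splitting coordinate places it entirely outside the box, which is what makes axis-aligned cube queries cheap compared with scanning every point.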

Distribution of end points of short loops: kinematic considerations alone would allow a reachable space limited only by the length of the loop. Example: a loop of length 4 has 8 degrees of freedom. In reality the distribution of the end points tends to be highly nonuniform. The effect is highly significant for loops of length two to five and still noticeable in loops of length up to seven.

Distribution of the end points of EM loops of length 4

Distribution of the end points of EM loops of lengths 3 (left) and 4 (right)

Evaluation: scoring. The two helices are placed in the same reference frame. Q is a cube around the start of B (side length in Å). We define a colony function whose score depends on the number of neighboring points in the vicinity of q and on the distances between these neighboring points and q.
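A score of this kind might be sketched in Python as follows. The exact form of the colony function is not given in the slides, so the weighting 1 / (1 + distance), the function name and the `half_side` parameter are assumptions chosen only to show the two ingredients named above: how many observed end points fall near q, and how close they are.

```python
import math

def neighborhood_score(q, end_points, half_side):
    """Hypothetical density score for a query point q.

    Every observed end point inside the axis-aligned cube of half-width
    half_side centred on q contributes 1 / (1 + distance to q), so both
    more neighbors and closer neighbors raise the score.
    """
    score = 0.0
    for p in end_points:
        if all(abs(p[i] - q[i]) <= half_side for i in range(3)):
            score += 1.0 / (1.0 + math.dist(p, q))
    return score

# Toy data: two end points near the origin and one far away.
observed = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
near = neighborhood_score((0.0, 0.0, 0.0), observed, half_side=2.0)
far = neighborhood_score((20.0, 0.0, 0.0), observed, half_side=2.0)
```

Here the brute-force loop stands in for the kd-tree lookup; a query point in a densely populated region scores higher than one where no loop end points were observed.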

Evaluation: the score of an assignment is the total score of its extra-membrane segments. A weight is defined for each edge of the graph, and the score of each path is defined from the weights of its edges.
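The ranking step can be sketched in a few lines of Python. The sketch assumes that the total is a plain sum of edge weights and that a higher total is better; neither detail is stated in the slides, and the `edge_weight` callback and the sample weights are hypothetical.

```python
def rank_assignments(assignments, edge_weight):
    """Sort assignments by total loop score, best first.

    edge_weight(i, a, b) returns the score of the i-th loop when it
    connects helix a to helix b (hypothetical callback).
    """
    def total(path):
        return sum(
            edge_weight(i, a, b)
            for i, (a, b) in enumerate(zip(path, path[1:]))
        )
    return sorted(assignments, key=total, reverse=True)

# Toy weights for two candidate assignments of three TM segments.
weights = {
    (0, "A", "B"): 0.9, (1, "B", "C"): 0.8,
    (0, "A", "C"): 0.2, (1, "C", "B"): 0.8,
}
candidates = [("A", "C", "B"), ("A", "B", "C")]
ranked = rank_assignments(candidates, lambda i, a, b: weights[(i, a, b)])
```

Returning the whole sorted list, rather than only the top entry, matches the point made in the summary that the method provides more than a single assignment.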

Validation: 19 TM proteins with a known high-resolution structure were tested. Two distinct cases were considered: accurate data, and noisy data regarding the locations and orientations of the helices.

Dealing with uncertainty in cryo-EM data: the orientation of the helix with respect to its axis is unknown, and so is the translation of the helix. Solution: a cylinder envelope is constructed around the helix ends (termini).

Performance of the Algorithm
Name                      #h   Loop lengths (#AA)        Possible     Feasible   Rank
Bacteriorhodopsin         7    3,14,2,3,10,4             7! = 5040    (missing)  (missing)
Sensory rhodopsin         7    7,12,2,3,3,4              7! = 5040    (missing)  (missing)
Lactose permease          12   3,2,1,3,1,24,3,1,3,1,1    12!          (missing)  (missing)
Cytochrome c oxidase E    5    5,6,1,1                   5! = 120     2          1
Cytochrome c oxidase H    3    7,2                       3! = 6       6          1
Acetylcholine receptor    4    4,4,103                   4! = 24      22         1

Summary: the method provides more than a single assignment. The complexity of the problem scales with the number of amino acids in the extra-membrane segments, not with the number of TM helices.

Questions