Alignment of Flexible Molecular Structures


1 Alignment of Flexible Molecular Structures

2 Motivation: Proteins are flexible. One would like to align proteins modulo the flexibility. Hinge and shear protein domain motions (Gerstein, Lesk, Chothia). Conformational flexibility in drugs.

3

4 Motivation

5 Flexible protein alignment without prior hinge knowledge. FlexProt algorithm: detects flexibility regions automatically; exploits amino acid sequence order.

6 Examples

7 Experimental Results

8 Task: find the largest flexible alignment by decomposing the two molecules into a minimal number of rigid fragment pairs having similar 3-D structure.

9 FlexProt Main Steps: Detection of Congruent Rigid Fragment Pairs; Joining Rigid Fragment Pairs; Rigid Structural Comparison; Clustering (removing ins/dels).

10 Structural Similarity Matrix Congruent Rigid Fragment Pair

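11 Detection of Congruent Rigid Fragment Pairs. A fragment pair of length $l$ starting at $v_k$ and $w_t$: $\mathrm{Frag}_{kt}(l) = (v_k \ldots v_i \ldots v_{k+l-1},\; w_t \ldots w_j \ldots w_{t+l-1})$. The pair is congruent if $\mathrm{RMSD}(\mathrm{Frag}_{kt}(l)) < \varepsilon$. (Figure: axes labeled $i-1, i, i+1$ and $j-1, j, j+1$, with points $v_{i-1}, v_i, v_{i+1}$ and $w_{j-1}, w_j, w_{j+1}$; the fragment spans $k \ldots k+l-1$ and $t \ldots t+l-1$.)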
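The fragment-pair test on this slide can be illustrated with a small sketch. It is not FlexProt's implementation: instead of the RMSD after optimal superposition, it uses the distance RMSD (dRMSD), a superposition-free stand-in that compares intra-fragment distances and cannot tell mirror images apart. The function names, the fixed fragment length, and the threshold `eps` are all assumptions made for illustration.

```python
import math
from itertools import combinations

def drmsd(frag_v, frag_w):
    """Distance RMSD between two equal-length lists of 3-D points (length >= 2)."""
    pairs = list(combinations(range(len(frag_v)), 2))
    total = 0.0
    for a, b in pairs:
        dv = math.dist(frag_v[a], frag_v[b])  # distance inside fragment of v
        dw = math.dist(frag_w[a], frag_w[b])  # same pair inside fragment of w
        total += (dv - dw) ** 2
    return math.sqrt(total / len(pairs))

def congruent_fragment_pairs(v, w, length, eps):
    """Return all (k, t) start indices whose fragments of the given length
    score below eps under the dRMSD stand-in."""
    hits = []
    for k in range(len(v) - length + 1):
        for t in range(len(w) - length + 1):
            if drmsd(v[k:k + length], w[t:t + length]) < eps:
                hits.append((k, t))
    return hits
```

Calling `congruent_fragment_pairs(v, w, 3, 0.1)` on two C-alpha traces returns every start pair `(k, t)` that passes the test. Extending each hit to its maximal length is left out of this sketch.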

12 FlexProt Main Steps: Detection of Congruent Rigid Fragment Pairs; Joining Rigid Fragment Pairs; Rigid Structural Comparison; Clustering (removing ins/dels).

13 How to Join Rigid Fragment Pairs ?

14 Graph Representation Graph Node Graph Edge

15 Graph Representation: The fragments are in ascending order. The gaps (ins/dels) are limited. Allow some overlapping. Edge weight W from node a to node b: + size of the rigid fragment pair (node b); − penalties for gaps (ins/dels) and overlapping.

16 Graph Representation: DAG (directed acyclic graph) with edge weights W_i, W_k, W_t, W_m, W_n.

17 “Single-source shortest paths” on the DAG with edge weights W_i, W_k, W_t, W_m, W_n: O(|E|+|V|).
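The joining step can be sketched as a best-path search over fragment pairs processed in topological order. Maximizing a score on a DAG is equivalent to the single-source shortest-paths formulation with negated weights. This is a sketch under several assumptions, not FlexProt's scoring: fragment pairs are hypothetical `(start_v, start_w, length)` tuples, overlaps are disallowed, the ins/del size is taken as the difference between the two gaps, and `max_gap` and `indel_penalty` are illustrative parameters. The input list is assumed to be non-empty.

```python
def best_chain(frags, max_gap=10, indel_penalty=1.0):
    """Highest-scoring chain of fragment pairs.

    frags: list of (start_v, start_w, length) tuples (hypothetical format).
    Returns (score, chain), where chain lists indices into frags.
    """
    # Sorting by start_v gives a topological order: an edge a -> b
    # requires b to start after a ends in molecule v.
    order = sorted(range(len(frags)), key=lambda n: frags[n][0])
    best = [float(f[2]) for f in frags]  # best score of a chain ending at n
    prev = [None] * len(frags)
    for pos, b in enumerate(order):
        bv, bw, bl = frags[b]
        for a in order[:pos]:
            av, aw, al = frags[a]
            gap_v = bv - (av + al)
            gap_w = bw - (aw + al)
            if gap_v < 0 or gap_w < 0:
                continue  # not ascending in both molecules (no overlap here)
            if max(gap_v, gap_w) > max_gap:
                continue  # gap limit exceeded, so no edge
            # Edge weight: size of node b minus the ins/del penalty.
            weight = bl - indel_penalty * abs(gap_v - gap_w)
            if best[a] + weight > best[b]:
                best[b] = best[a] + weight
                prev[b] = a
    end = max(range(len(frags)), key=lambda n: best[n])
    chain = []
    node = end
    while node is not None:
        chain.append(node)
        node = prev[node]
    return best[end], chain[::-1]
```

The relaxation itself visits each node and edge once. The nested loop that discovers the edges is quadratic in the number of fragment pairs in this simplified version.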

18 FlexProt Main Steps: Detection of Congruent Rigid Fragment Pairs; Joining Rigid Fragment Pairs; Rigid Structural Comparison; Clustering (removing ins/dels).

19 Clustering (removing ins/dels): If joining two fragment pairs gives a small RMSD (T_1 ~ T_2), then put them into one cluster.

20 FlexProt Main Steps: Detection of Congruent Rigid Fragment Pairs; Joining Rigid Fragment Pairs; Rigid Structural Comparison; Clustering (removing ins/dels).

21 Rigid Structural Comparison

22 Multiple Structural Alignment

23 Multiple Structural Alignment Schemes. Linear progressive: starts with one object and successively compares the other objects to the results. Tree progressive: the alignment is created according to a similarity tree; the alignment direction is from the leaves to the tree root. References: Gerstein and Levitt 1998; Orengo and Taylor 1994 (SSAPm method); Sali and Blundell 1990; Russell and Barton 1992; Ding et al. 1994.

24 Multiple Structural Alignment Schemes Pivot. Uses one object as the pivot and compares it to all other objects. The results are then analyzed to find the common similarities. Leibowitz, Fligelman, Nussinov, and Wolfson 1999. Geometric Hashing technique. Escalier, Pothier, Soldano, Viari 1998. Exploits all common substructures.

25 Multiple Structural Alignment Schemes. Optimization Techniques: Guda, Scheeff, Bourne, Shindyalov. Monte Carlo optimization.

26 Previous Work – Multiple Structural Alignment Disadvantages: Most methods do not detect partial solutions. The methods which detect partial solutions are not efficient for a large number of molecules.

27 Partial Solutions: Detection of local similarities. Detection of a subset of molecules that share some local structural pattern. (Figure: patterns A and B; B is harder to detect than A.)

28 Largest Common Point Set (LCP): Given two point sets, detect the largest common subset [exact congruence or ε-congruence]. Multiple-LCP is NP-hard even in one-dimensional space for the case of exact congruence (Akutsu 2000). 3-D with ε-congruence is a more complex problem.
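For intuition, the pairwise case in one dimension is easy. The sketch below finds the largest common point set of two integer point sets under translation only, by letting every point pair vote for the shift that maps one point onto the other. It covers two sets, exact matches, and no reflections. It does not address the multiple-set problem that the slide states is NP-hard. The function name and the integer-coordinate assumption are illustrative, and both inputs are assumed to be non-empty.

```python
from collections import Counter

def lcp_1d(a, b):
    """Largest common point set of two 1-D integer point sets under translation.

    Returns (shift, common), where common lists the points of b that land
    on a point of a after adding shift.
    """
    a_set, b_set = set(a), set(b)
    # Each pair (x, y) votes for the translation that maps y onto x.
    votes = Counter(x - y for x in a_set for y in b_set)
    shift, _count = votes.most_common(1)[0]
    common = sorted(y for y in b_set if y + shift in a_set)
    return shift, common
```

For example, `lcp_1d([1, 4, 6, 9, 13], [0, 3, 5, 20])` returns `(1, [0, 3, 5])`: shifting the second set by 1 matches three points.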

29 Solution Space: The number of solutions that satisfy the minimal criteria could be exponential. (Figure: molecules with helices labeled α-1, α-2, α-3.)

30 Partial Multiple-LCP: Detect the t largest alignments between exactly k molecules. We are interested in such solutions for each k, 2 ≤ k ≤ m.

31 MultiProt: Non-predefined pattern detection. Partial solutions. Time efficient: 5 proteins in 14 seconds; 20 proteins (~500 a.a.) in 10 minutes; 50 proteins (~200 a.a.) in 19 minutes [Pentium II 500 MHz, 512 MB memory]. /home/silly6/mol/demos/MultiProt/

32 (Figure: molecules with helices labeled α-1, α-2, α-3.)

33 Algorithm Features. Assumption: any multiple alignment of proteins should align at least short contiguous fragments (minimum 3 points) of the input points. Reduction of solution space: the aligned contiguous fragments are of maximal length. Almost all possible solutions (transformations) are detected (almost, because of ε-congruence); optimal solutions are ‘hard’ to select.

34 Multiple Alignment with Pivot. Input: pivot molecule M_p (participates in all solutions); set of molecules S' = S \ {M_p}; error threshold ε. Detect all possibly aligned fragments of maximal length between the input molecules (chance to detect subtle similarities). Select solutions that give high-scoring global structural similarity. Iterate over all possible pivots, M_p = M_1 … M_m.

35 Bio-Core Detection: Geom. + Bio. Constraints. Classification: hydrophobic (Ala, Val, Ile, Leu, Met, Cys); polar/charged (Ser, Thr, Pro, Asn, Gln, Lys, Arg, His, Asp, Glu); aromatic (Phe, Tyr, Trp); glycine (Gly). Or any other scoring matrix!
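The four-class scheme on this slide can be written as a lookup table. The sketch below keeps only those aligned columns in which all residues fall into the same class. The column format (one tuple of three-letter residue codes per aligned position) and the function name `bio_core` are assumptions for illustration. How MultiProt combines this filter with the geometric constraints is not shown on the slide.

```python
# Residue classes exactly as listed on the slide.
_CLASSES = {
    "hydrophobic": "Ala Val Ile Leu Met Cys",
    "polar/charged": "Ser Thr Pro Asn Gln Lys Arg His Asp Glu",
    "aromatic": "Phe Tyr Trp",
    "glycine": "Gly",
}
RESIDUE_CLASS = {
    name: cls for cls, names in _CLASSES.items() for name in names.split()
}

def bio_core(columns):
    """Keep the aligned columns whose residues all share one class.

    columns: list of tuples of three-letter residue codes, one tuple per
    aligned position (hypothetical input format).
    """
    kept = []
    for column in columns:
        classes = {RESIDUE_CLASS[res] for res in column}
        if len(classes) == 1:
            kept.append(column)
    return kept
```

A substitution matrix could replace the class lookup, as the slide notes; the filter would then compare a pairwise score against a cutoff instead of testing class equality.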

36 Experimental Results

37 Superhelix, 5 molecules.

38 Concanavalin, 6 molecules.

39 Partial Solution Detection: proteins 1adj, 1hc7, 1qf6, 1ati. Task: detect A and B. (Figure: schematic with regions labeled A, B, x, y, z.)

40 Domain A ranked first (142 matched atoms). Domain B ranked eighth (85 matched atoms).

41 4 proteins aligned based on detected domain A

42 Multiple Alignment of domain A

43 Multiple Alignment of domain A (enlarged)

44 4 proteins aligned based on domain B

45 Multiple Alignment of domain B

46 Multiple Alignment of domain B (enlarged)

47 Application to G proteins A A B

48

49 Substrate assisted catalysis – application to G proteins. Mickey Kosloff and Zvi Selinger, TRENDS in Biochemical Sciences, Vol. 26, No. 3, March 2001, p. 161.

50 Aspects of Structural Comparison. A large number of structures (hundreds): Molecular Dynamics. Structural flexibility: proteins are not rigid structures. Structure representation: C-alpha atoms are suitable for comparisons of folds; detection of similar function requires a different representation, which brings another problem: side chain flexibility. Sequence order in structural alignment: detection of active sites might require a different approach; proteins with different folds might provide the same function. Statistical significance. Measure of geometrical similarity (RMSD, bottleneck, …), biological scoring function.

51 Molecular Surface Representation Applications to docking

52 Motivation Prediction of biomolecular recognition. Detection of drug binding ‘cavities’. Molecular Graphics.

53 Rasmol Spacefill display

54 1. Solvent Accessible Surface – SAS 2. Connolly Surface

55 Connolly’s MS algorithm: A ‘water’ probe ball (1.4-1.8 Å radius) is rolled over the van der Waals surface. Smoothes the surface and bridges narrow ‘inaccessible’ crevices.

56 Connolly’s MS algorithm - cont. Convex, concave and saddle patches according to the number of contact points between the surface atoms and the probe ball. Outputs points + normals according to the required sampling density (e.g. 10 pts/Å²).

57 Example - the surface of crambin

58 Critical points based on Connolly rep. (Lin, Wolfson, Nussinov): Define a single point + normal for each patch. Convex: caps; concave: pits; saddle: belts.

59 Critical point definition

60 Connolly => Shou Lin

61 Solid Angle local extrema: knobs and holes

62 Chymotrypsin surface colored by solid angle (yellow: convex, blue: concave)

63 Protein-protein and Protein-ligand Docking: The geometric filtering

64 Shape Complementarity

65 Geometric Docking Algorithms Based on the assumption of shape complementarity between the participating molecules. Molecular surface complementarity - protein-protein, protein-ligand, (protein - drug). Hydrogen donor/acceptor complementarity - protein-drug. Remark : usually “protein” here can be replaced by “DNA” or “RNA” as well.

66 Issues to be examined when evaluating docking methods. Rigid docking vs flexible docking. If the method allows flexibility: is flexibility allowed for the ligand only, the receptor only, or both? Number of flexible bonds allowed and the cost of adding additional flexibility. Does the method require prior knowledge of the active site? Performance in “unbound” docking experiments. Speed: ability to explore large libraries.

67 General Algorithm outline Calculate the molecular surface of the receptor and the ligands and their interest points (+ normals). Match the interest points and recover candidate transformations. Check for inter-molecule and intra-molecule penetrations and score the amount of contact. Rank by geom-score/energies.
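The penetration check and contact scoring in this outline can be illustrated with a deliberately simple geometric score. This is a sketch, not the scoring function of any particular docking program: the receptor is reduced to atom centers, the ligand to already-transformed surface points, and the distance cutoffs `clash_dist` and `contact_dist` (in Å) as well as the `max_clashes` tolerance are hypothetical values chosen for illustration. Intra-molecule penetration, which matters for flexible docking, is not checked.

```python
import math

def geometric_score(receptor_atoms, ligand_points,
                    clash_dist=2.0, contact_dist=4.0, max_clashes=0):
    """Score one candidate pose by counting contacts and penetrations.

    receptor_atoms: non-empty list of (x, y, z) atom centers.
    ligand_points: list of (x, y, z) ligand surface points, already
        transformed by the candidate transformation.
    Returns the number of contact points, or None if the pose penetrates
    the receptor at more than max_clashes points.
    """
    contacts = 0
    clashes = 0
    for p in ligand_points:
        nearest = min(math.dist(p, a) for a in receptor_atoms)
        if nearest < clash_dist:
            clashes += 1      # ligand point lies inside the receptor
        elif nearest <= contact_dist:
            contacts += 1     # ligand point touches the receptor surface
    if clashes > max_clashes:
        return None
    return contacts
```

Candidate transformations would then be ranked by this count. A grid or spatial hash over the receptor atoms would replace the linear nearest-atom scan in anything beyond a toy example.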

68 Shape feature and signature (Norel et al.)

69 Unbound docking examples

70 GGH based flexible docking Applies either to flexible ligands or to flexible receptors.

71 Flexible Docking Calmodulin with M13 ligand

72 Flexible Docking HIV Protease Inhibitor

