Energy Maintenance for Molecular Simulation: kinematics + energy → motion + structure. Main computational issue: proximity computation.

1 1 Energy Maintenance for Molecular Simulation
kinematics + energy → motion + structure
Main computational issue: Proximity computation

2 2 Energy
Function defined over a high-dimensional conformation space
[figure: chain with conformation parameters q_1, q_2, ..., q_i, ..., q_j, ..., q_{N-1}, q_N]

3 3 Energy Function
$E = E_S + E_\theta + E_{S\theta} + E_{Tor} + E_{vdW} + E_{dipole}$
- Bonded terms (linear in number)

4 4 Energy Function
$E = E_S + E_\theta + E_{S\theta} + E_{Tor} + E_{vdW} + E_{dipole}$
- Bonded terms (linear in number)
- $E_{vdW}$: non-bonded terms (quadratic in number)

5 5 Role of vdW Terms
- vdW terms create a maze in conformational space
- Other terms steer the molecule in this maze

6 6 Heuristic Energy Terms (e.g., Gō Models)

7 7 Interaction with Solvent
- Explicit solvent models: 100s or 1000s of discrete solvent molecules
- Implicit solvent models: solvent as continuous medium, interface is solvent-accessible surface

8 8 Energy Function
- E = Σ bonded terms + Σ non-bonded terms + Σ solvation terms
- Bonded terms
  - Relatively few
- Non-bonded terms
  - Depend on distances between pairs of atoms
  - Quadratic number → expensive to compute
- Solvation terms
  - May require computing molecular surface

10 10 Uses of Energy Function
- Generate energetically plausible conformations: sample (at random), minimize, cluster
- Generate meaningful distributions (e.g., Boltzmann) of conformations: Monte Carlo simulation
- Generate motion pathways to study molecular kinetics: molecular dynamics, MC simulation

11 11 Monte Carlo Simulation (MCS)
- Popular approach to study thermodynamic and kinetic properties of proteins
- Random walk through conformation space
- At each cycle:
  - Perturb current conformation at random
  - Accept step with probability $P = \min(1, e^{-\Delta E / k_B T})$ (Metropolis acceptance criterion)
- The conformations generated by an arbitrarily long MCS are Boltzmann distributed, i.e., #conformations in a region V ∝ $\int_V e^{-E/k_B T}\,dV$
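The cycle described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes the standard Metropolis rule (accept with probability min(1, exp(-ΔE / kT))), and the names `metropolis_accept`, `monte_carlo`, `energy`, and `perturb` are hypothetical.

```python
import math
import random


def metropolis_accept(delta_e, kT, rng=random):
    """Return True if a move that changes the energy by delta_e is accepted."""
    if delta_e <= 0.0:
        return True  # downhill (or neutral) moves are always accepted
    # Uphill moves are accepted with probability exp(-delta_e / kT).
    return rng.random() < math.exp(-delta_e / kT)


def monte_carlo(energy, perturb, q0, steps, kT, seed=0):
    """Random walk through conformation space; returns the visited conformations.

    energy(q) -> float and perturb(q, rng) -> new q are caller-supplied
    (hypothetical) callbacks.
    """
    rng = random.Random(seed)
    q, e = q0, energy(q0)
    samples = []
    for _ in range(steps):
        q_new = perturb(q, rng)
        e_new = energy(q_new)  # the energy is evaluated at every step
        if metropolis_accept(e_new - e, kT, rng):
            q, e = q_new, e_new
        samples.append(q)
    return samples


if __name__ == "__main__":
    # Toy 1-D "conformation" with a harmonic energy, for illustration only.
    walk = monte_carlo(lambda q: q * q,
                       lambda q, rng: q + rng.uniform(-0.5, 0.5),
                       q0=2.0, steps=1000, kT=1.0)
    print(len(walk))
```

Note that the energy callback runs once per attempted step, which is why the cost of evaluating or updating the energy dominates long runs.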

12 12 Uses of Energy Function (continued)
- One issue in common: energy must be evaluated frequently
  - E.g., MD and MC simulation runs may consist of millions of steps

13 13 Uses of Energy Function (continued)
Problem: How to efficiently compute and update energy during minimization and simulation?

14 14 Non-Bonded Energy Terms
- Quadratic number of pairs of atoms
- Energy terms go to 0 when distance increases → cutoff distance (6 - 12 Å)
- vdW forces prevent atoms from bunching up → only O(n) interacting pairs [Halperin & Overmars, 98]
Problems:
- How can we find the interacting pairs without enumerating all atom pairs?
- How can we detect atomic clashes quickly?
Main computational issue: Proximity computation

15 15 Grid Method
- Subdivide 3-space into cubic cells
- Compute cell that contains each atom center
- Represent grid as hashtable
[figure: grid with cutoff distance d_cutoff]

16 16 Grid Method
- O(n) time to build grid
- O(1) time to find interacting pairs for each atom
- Θ(n) time to find all interacting pairs of atoms [Halperin & Overmars, 98]
- Asymptotically optimal in worst case
[figure: grid with cutoff distance d_cutoff]
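A minimal sketch of the grid method, under the assumption that the cell side equals the cutoff distance, so that every interacting partner of an atom lies in its own cell or one of the 26 adjacent cells. The function names are hypothetical, and a Python `dict` plays the role of the hashtable.

```python
import math
from collections import defaultdict
from itertools import product


def build_grid(centers, cell):
    """Hash each atom index into the cubic cell that contains its center: O(n)."""
    grid = defaultdict(list)
    for idx, (x, y, z) in enumerate(centers):
        key = (math.floor(x / cell), math.floor(y / cell), math.floor(z / cell))
        grid[key].append(idx)
    return grid


def interacting_pairs(centers, cutoff):
    """Return all index pairs (i, j), i < j, whose centers are within cutoff."""
    grid = build_grid(centers, cutoff)  # assumption: cell side == cutoff
    pairs = set()
    for (cx, cy, cz), members in grid.items():
        # Only the 27 cells around (and including) the current cell can matter.
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            neighbors = grid.get((cx + dx, cy + dy, cz + dz), ())
            for i in members:
                for j in neighbors:
                    if i < j and math.dist(centers[i], centers[j]) <= cutoff:
                        pairs.add((i, j))
    return pairs


if __name__ == "__main__":
    atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
    print(interacting_pairs(atoms, cutoff=2.0))
```

The per-atom cost is constant only because excluded volume bounds how many atom centers fit in a cell; the sketch itself does not enforce that.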

17 17 Energy Update
- Compare the interacting pairs at new step with those at previous step
- For every pair that has disappeared, subtract the corresponding energy term from energy value
- For every new pair, add the corresponding energy term to energy value
- Takes Θ(n) time, even if very few pairs have changed
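The update rule can be sketched as a set comparison. This is an illustrative sketch with hypothetical names (`update_energy`, `old_terms`, `term`). It also refreshes pairs that persist across the step, which is an assumption added here because their distance, and hence their term, may have changed.

```python
def update_energy(total, old_terms, new_pairs, term):
    """Update a running energy total after one simulation step.

    old_terms: dict mapping each interacting pair at the previous step to its
               energy term.
    new_pairs: set of interacting pairs at the new step.
    term:      callable giving the energy term of a pair at the new step.
    Returns the new total and the new pair -> term dictionary.
    """
    # Pairs that disappeared: subtract their old contribution.
    for pair in old_terms.keys() - new_pairs:
        total -= old_terms[pair]
    new_terms = {}
    for pair in new_pairs:
        value = term(pair)
        if pair in old_terms:
            # Persisting pair (assumption): replace the old term by the new one.
            total += value - old_terms[pair]
        else:
            # New pair: add its contribution.
            total += value
        new_terms[pair] = value
    return total, new_terms


if __name__ == "__main__":
    old = {(0, 1): -1.0, (1, 2): -0.5}
    new_values = {(0, 1): -1.0, (0, 2): -0.25}
    total, terms = update_energy(-1.5, old, set(new_values), new_values.__getitem__)
    print(total, terms)
```

Both loops walk every pair, so the cost stays linear in the number of pairs even when only a handful of them changed.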

18 18 Conservation of Partial Energy Sums
The grid method is unable to recognize and re-use such partial sums

19 19 Grid Method (continued)
- But:
  - Energy partial sums?
  - Atomic clashes? [second grid with small cutoff distance]

20 20 Grid Method: Surface [Halperin and Shelton, 97]
- Each sphere intersects O(1) spheres
- Computing each atom's contribution to molecular surface takes O(1) time
- Computation of molecular surface takes Θ(n) time → implicit solvation term in Θ(n) time

21 21 General Problem
Molecules form geometrically complex objects that deform and move relative to each other
- (Self-)collision detection
- Distance computation
Several computational approaches:
- Space occupancy: grid, octree
- Tracking pairs of closest features
- Polynomial equation
- Bounding-volume hierarchies (BVH)
- Spanners

22 22 Bounding Volume Hierarchies (BVHs)
Outline:
- Case of rigid objects:
  - Bounding volume (BV)
  - BV hierarchy (BVH)
  - Types of BVs
  - Collision detection with BVHs
  - Distance computation
- Application to deformable objects
- Application to protein simulation
http://www.cs.unc.edu/~geom/collide/index.shtml

23 23 Basic Problem Given the geometric models and relative positions of two objects, determine whether they overlap

24 24 Basic Problem
Given the geometric models and relative positions of two objects, determine whether they overlap
distance = 0 ⇔ collision

25 25 Applications
- Computer graphics & simulation
- Robotics
- Haptics

26 26

27 27 Basic Idea of Solution
- Enclose objects into bounding volumes (spheres or boxes)
- Check the bounding volumes first

28 28 Basic Idea of Solution (continued)
- Decompose an object into two

29 29 Basic Idea of Solution (continued)
- Proceed hierarchically

31 31 Bounding Volume Hierarchy (BVH)
- BVH is pre-computed for each object
- BVH is typically a balanced binary tree

32 32 BVH in 3D

33 33 Collision Detection
Two objects described by their precomputed BVHs
[figure: two BVH trees, each with nodes A, B, C, D, E, F, G]

34 34 Collision Detection
[figure: search tree rooted at the node pair AA; pruning]

35 35 Collision Detection
[figure: search tree with root AA expanded into the node pairs BB, BC, CB, CC]

36 36 Collision Detection
[figure: search tree with root AA and node pairs BB, BC, CB, CC; pruning]

37 37 Collision Detection
If two leaves of the BVHs overlap (here, G and D), check their content for collision
[figure: search tree with root AA, node pairs BB, BC, CB, CC, and leaf pairs GE, GD, FE, FD]

38 38 Variant
[figure: search tree with root AA expanded into the node pairs CA and BA, next to the tree with node pairs BB, BC, CB, CC]

39 39 Collision Detection
- Pruning discards subsets of the two objects that are separated by the BVs
- Each path is followed until pruning or until two leaves overlap
- When two leaves overlap, their contents are tested for overlap

40 40 Search Strategy and Heuristics
- If there is no collision, all paths must eventually be followed down to pruning or a leaf node
- But if there is collision, it is desirable to detect it as quickly as possible
- Greedy best-first search strategy with $f(N) = d/(r_X + r_Y)$ [expand the node XY with largest relative overlap (most likely to contain a collision)]
[figure: BVs X and Y with radii r_X, r_Y and distance d]

41 41 Recursive (Depth-First) Collision Detection Algorithm
Test(A,B)
1. If A and B do not overlap, then return 1
2. If A and B are both leaves, then return 0 if their contents overlap and 1 otherwise
3. Switch A and B if A is a leaf, or if B is bigger and not a leaf
4. Set A_1 and A_2 to be A's children
5. If Test(A_1,B) = 1 then return Test(A_2,B) else return 0

42 42 Performance
Several thousand collision checks per second for two three-dimensional objects, each described by 500,000 triangles, on a 1-GHz PC

43 43 Greedy Distance Computation (same recursion as collision detection)
Greedy-Distance(A,B)
1. If dist(A,B) > 0, then return dist(A,B)
2. If A and B are both leaves, then return distance between their contents
3. Switch A and B if A is a leaf, or if B is bigger and not a leaf
4. Set A_1 and A_2 to be A's children
5. d_1 ← Greedy-Distance(A_1,B)
6. If d_1 > 0 then
   a. d_2 ← Greedy-Distance(A_2,B)
   b. If d_2 > 0 then return Min(d_1, d_2)
7. Return 0

44 44 Exact Distance Computation
Distance(A,B)
1. If dist(A,B) > M, then return M
2. If A and B are both leaves, then
   a. d ← distance between their contents
   b. Return Min(d,M)
3. Switch A and B if A is a leaf, or if B is bigger and not a leaf
4. Set A_1 and A_2 to be A's children
5. M ← Distance(A_1,B)
6. If M > 0 then return Distance(A_2,B)
7. Else return 0
M (upper bound on distance) is initialized to a very large number
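A minimal Python sketch of this recursion, under illustrative assumptions: BVs are spheres, each leaf's content is its own sphere, internal nodes have two children, and the bound M is passed explicitly as the parameter `m` rather than kept as shared state. The `Node` class and function names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Node:
    """Sphere BV; internal nodes have two children, leaves have none."""
    center: Tuple[float, float, float]
    radius: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    @property
    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def bv_dist(a: Node, b: Node) -> float:
    """Distance between two sphere BVs (0 if they overlap)."""
    return max(0.0, math.dist(a.center, b.center) - a.radius - b.radius)


def distance(a: Node, b: Node, m: float = math.inf) -> float:
    """Return min(m, exact distance between the contents of a and b)."""
    if bv_dist(a, b) > m:
        return m  # step 1: this pair cannot improve the current bound
    if a.is_leaf and b.is_leaf:
        return min(bv_dist(a, b), m)  # step 2: leaf contents are the spheres
    if a.is_leaf or (b.radius > a.radius and not b.is_leaf):
        a, b = b, a  # step 3: split a node that is not a leaf
    m = distance(a.left, b, m)  # step 5: tighten the bound with the first child
    if m > 0:
        return distance(a.right, b, m)  # step 6
    return 0.0  # step 7: a collision was found, nothing can be closer


if __name__ == "__main__":
    obj_a = Node((2, 0, 0), 3, Node((0, 0, 0), 1), Node((4, 0, 0), 1))
    obj_b = Node((10, 0, 0), 3, Node((8, 0, 0), 1), Node((12, 0, 0), 1))
    print(distance(obj_a, obj_b))
```

The pruning is valid because the distance between two BVs never exceeds the distance between what they enclose, so a pair whose BVs are farther apart than `m` cannot contain a closer pair.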

45 45 Approximate Distance Computation
Approx-Distance(A,B) [d_a: d_a ≤ d_e and d_e − d_a ≤ ε d_e, where d_e is the exact distance and d_a the approximate one]
1. If dist(A,B) > M, then return M
2. If A and B are both leaves, then
   a. d ← distance between their contents
   b. If d < M then return (1−ε)·d else return M
3. Switch A and B if A is a leaf, or if B is bigger and not a leaf
4. Set A_1 and A_2 to be A's children
5. M ← Approx-Distance(A_1,B)
6. If M > 0 then return Approx-Distance(A_2,B)
7. Return 0
M (upper bound on distance) is initialized to a very large number

46 46 Approximate Distance Computation (continued)
Guaranteed to return an approximate distance between (1−ε)d and d

47 47 Collision detection < Greedy distance computation < 0.5-approximate distance computation << Exact distance computation
- "<": slightly faster
- "<<": much faster

48 48 Desirable Properties of BVs and BVHs
BVs:
- Tightness
- Efficient testing
- Invariance
BVH:
- Separation
- Balanced tree

50 50 Spheres
- Invariant
- Efficient to test
- But tight?

51 51 Axis-Aligned Bounding Box (AABB)

52 52 Axis-Aligned Bounding Box (AABB)
- Not invariant
- Efficient to test
- Not tight

53 53 Oriented Bounding Box (OBB) [Gottschalk, Lin, and Manocha, 96]

54 54 Oriented Bounding Box (OBB)
- Invariant
- Less efficient to test
- Tight

55 55 Rectangle Swept Spheres (RSS)
- Similar to OBBs
- Efficient distance computation

56 56 Computation of Distance Between Two RSSs
- Compute the distance between the two underlying rectangles
- Subtract the growing radius

57 57 Comparison of BVs
BV types compared: Sphere, AABB, OBB, RSS
- Tightness: ---++
- Testing: ++--+
- Invariance: yes, no, yes
No type of BV is optimal for all situations

58 58 Computation of a BV Sphere
- Each intermediate sphere encloses the geometry contained in its descendant leaf nodes
- Simple solution: compute each intermediate sphere to minimally enclose its two children
- Tighter-fitting solution: each intermediate sphere is computed to minimally enclose the sphere's leaf descendants [Welzl, 91] → expected O(N) time
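The "simple solution" reduces to computing the smallest sphere that encloses two given spheres. A minimal sketch, with a hypothetical function name `enclose`:

```python
import math


def enclose(c1, r1, c2, r2):
    """Return (center, radius) of the smallest sphere enclosing two spheres."""
    d = math.dist(c1, c2)
    if d + r2 <= r1:
        return tuple(c1), r1  # sphere 2 lies inside sphere 1
    if d + r1 <= r2:
        return tuple(c2), r2  # sphere 1 lies inside sphere 2
    # Otherwise the enclosing sphere spans the two far ends of the pair.
    r = (d + r1 + r2) / 2.0
    t = (r - r1) / d  # fraction of the way from c1 to c2 (d > 0 here)
    center = tuple(p + t * (q - p) for p, q in zip(c1, c2))
    return center, r


if __name__ == "__main__":
    print(enclose((0.0, 0.0, 0.0), 1.0, (4.0, 0.0, 0.0), 1.0))
```

Applied bottom-up, this gives each intermediate sphere in constant time, at the price of looser spheres near the root than the tighter-fitting alternative.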

59 59 Computation of an OBB [Gottschalk, Lin, and Manocha, 96]
- N points $a_i = (x_i, y_i, z_i)^T$, $i = 1, \dots, N$
- SVD of $A = (a_1\ a_2\ \dots\ a_N)$: $A = U D V^T$, where
  - $D = \mathrm{diag}(\sigma_1, \sigma_2, \sigma_3)$ such that $\sigma_1 \ge \sigma_2 \ge \sigma_3 \ge 0$
  - $U$ is a 3x3 rotation matrix that defines the principal axes of variance of the $a_i$'s → OBB's directions
- The OBB is defined by max and min coordinates of the $a_i$'s along these directions
- Possible improvements: use vertices of convex hull of the $a_i$'s or dense uniform sampling of convex hull
[figure: axes x, y and rotated axes X, Y; rotation described by matrix U]

60 60 OBB of a Collection of Spheres
- Compute the OBB of the centers
- Grow the OBB by moving each of its faces outward by the atom radius
[figure: axes x, y and rotated axes X, Y]

61 61 Computation of an RSS [Larsen, Gottschalk, Lin, and Manocha, 00]
- Similar to OBB. Compute the two principal axes of variance of the $a_i$'s (atom centers)
- Project all $a_i$'s into the plane P defined by these two directions
- Compute minimum enclosing rectangle R contained in P and aligned with these directions
- Grow R by half the length of the interval spanned by the $a_i$'s along the direction perpendicular to P, increased by the atom radius

62 62 Desirable Properties of BVs and BVHs
BVs:
- Tightness
- Efficient testing
- Invariance
BVH:
- Separation
- Balanced tree

63 63 Desirable Properties of BVs and BVHs (continued)
- Separation: group pieces that are close together, not pieces that are far apart

64 64 Construction of a BVH
- Top-down recursive algorithm from the root to the leaves
- At each step, create the two children of a BV

65 65 Subdivision of a Sphere BV
- Split longest axis of AABB at mid or median point
- Median point guarantees balanced BVH, but takes slightly more time to compute
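A minimal sketch of the median split, assuming the geometry is given as a list of 3-D points (for example, atom centers). The names `split_at_median` and `build` are hypothetical. Sorting is used for brevity; a linear-time median selection would serve the same purpose.

```python
def split_at_median(points):
    """Split points into two halves along the longest axis of their AABB."""
    lows = [min(p[k] for p in points) for k in range(3)]
    highs = [max(p[k] for p in points) for k in range(3)]
    axis = max(range(3), key=lambda k: highs[k] - lows[k])  # longest AABB axis
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2  # median position, so the halves differ by at most 1
    return ordered[:mid], ordered[mid:]


def build(points, leaf_size=1):
    """Top-down construction: a leaf is a list of points, an internal node is a
    (left, right) tuple. BVs are omitted to keep the sketch short."""
    if len(points) <= leaf_size:
        return points
    left, right = split_at_median(points)
    return (build(left, leaf_size), build(right, leaf_size))


if __name__ == "__main__":
    pts = [(0, 0, 0), (10, 1, 0), (2, 0, 1), (8, 1, 1)]
    print(build(pts))
```

Splitting at the midpoint of the axis instead would avoid the ordering step but can leave one child with most of the points.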

66 66 Subdivision of an OBB/RSS
- Split longest axis at mid or median point

67 67 Application to Deformable Objects  The BVH computed for some initial or nominal geometry may become useless

68 68 Application to Deformable Objects
- The BVH computed for some initial or nominal geometry may become useless
- Group pieces hierarchically based on topological rather than geometric proximity
- Topological proximity:
  - is invariant
  - implies geometric proximity (the converse is not true)

69 69 Particular Case: Long Chain

70 70 Application to Deformable Objects (continued)
- BVH with fixed topology, but BVs must still be adjusted in size and position
- Self-collision detection is done by testing a BVH against itself

71 71 Particular Case: Long Chain
A chain of spheres is well-behaved iff:
1. The ratio of the radii of the largest and smallest spheres is less than some constant
2. The distance between any two sphere centers is greater than some constant
Complexity for updating the BVH and testing self-collision of a well-behaved chain of spheres

72 72 Application to Monte Carlo Simulation of Proteins (ChainTree) [I. Lotan, D. Halperin, F. Schwarzer and J.C. Latombe. Algorithm and Data Structures for Efficient Energy maintenance During Monte Carlo Simulation of Proteins, J. Computational Biology, 2004]

73 73 Monte Carlo Simulation (MCS)
- Random walk through conformation space
- At each cycle:
  - Perturb current conformation at random
  - Accept step with the probability given by the Metropolis acceptance criterion
- Problem: Update energy value

74 74 Energy Function
- E = Σ bonded terms + Σ non-bonded terms + Σ solvation terms
- Bonded terms
  - Relatively few
- Non-bonded terms
  - Depend on distances between pairs of atoms
  - Quadratic number → expensive to compute
- Solvation terms
  - May require computing molecular surface

75 75 Non-Bonded Energy Terms
- They go to 0 when distance increases → use cutoff distance (6 - 12 Å)
- vdW forces prevent atoms from bunching up → only O(n) interacting pairs [Halperin & Overmars, 98]
Problem: How to find these interacting pairs without enumerating all atom pairs?

76 76 Can We Do Better on Average than Grid Method?
Few DOFs are changed at each MC step
[figure: plot over the number k of DOF changes, axis from 0 to 30; simulation of 100,000 attempted steps]

78 78 Can We Do Better on Average?
- Few DOFs are changed at each MC step
- Proteins are long kinematic chains
- Long sub-chains stay rigid at each step
- Many partial energy sums remain constant
Problem: How to retrieve the unchanged partial sums?

79 79 ChainTree (Twofold Hierarchy: BVs + Transforms)
[figure: chain of links]

80 80 ChainTree (Twofold Hierarchy: BVs + Transforms)
[figure: joints with transforms T_AB, T_JK, T_NO]

81 81 Updating the ChainTree
Update path to root:
- Recompute transforms that "shortcut" the DOF change
- Recompute BVs that contain the DOF change
- O(k (log(n/k)+1)) work for k changes
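To make the "update path to root" concrete, the sketch below counts which nodes of a balanced binary tree over a chain would need recomputation after a set of joint changes. It is a simplified, hypothetical model rather than the ChainTree implementation: the tree is a complete binary tree in heap layout, the number of links `n` is assumed to be a power of two, joint `j` sits between links `j` and `j + 1`, and a node is marked when that joint lies inside its sub-chain.

```python
def marked_nodes(n, changed_joints):
    """Return the heap indices of tree nodes affected by the changed joints.

    Root has index 1; the leaf for link i has index n + i.
    """
    marked = set()
    for j in changed_joints:
        a, b = n + j, n + j + 1  # leaves of the two links around joint j
        while a != b:  # climb to the lowest node containing both links
            a //= 2
            b //= 2
        while a >= 1:  # that node and all its ancestors contain the change
            marked.add(a)
            a //= 2
    return marked


if __name__ == "__main__":
    print(sorted(marked_nodes(8, [0])))     # joint near the start of the chain
    print(sorted(marked_nodes(8, [0, 3])))  # paths to the root are shared
```

Because paths from different joints merge on their way to the root, the number of marked nodes grows more slowly than k times the tree height, which is the intuition behind the O(k (log(n/k)+1)) bound stated on the slide.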

82 82 Finding Interacting Pairs 

83 83 Finding Interacting Pairs

84 84 Finding Interacting Pairs
- Do not search inside rigid sub-chains (unmarked nodes)
- Do not test two nodes with no marked node between them

86 86 EnergyTree
[figure: tree of partial energy sums with nodes E(N,N), E(J,L), E(K,L), E(L,L), E(M,M)]

88 88 Computational Complexity
- n: total number of DOFs
- k: number of DOF changes at each MCS step
- k << n
- Complexity of:
  - updating ChainTree: O(k (log(n/k)+1))
  - finding interacting pairs: $O(n^{4/3})$, but performs much better in practice

89 89 Experimental Setup
- Energy function:
  - Van der Waals
  - Electrostatic
  - Attraction between native contacts
  - Cutoff at 12 Å
- 300,000-step MCS with Grid and ChainTree
- Steps are the same with both methods
- Early rejection for large vdW terms

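90 90 Results: 1-DOF change
[figure: speedup for proteins of 68, 144, 374, and 755 amino acids; speedup labels as extracted: 3.5, 12.5, 5.8, 7.8]

91 91 Results: 5-DOF change
[figure: speedup for proteins of 68, 144, 374, and 755 amino acids; speedup labels as extracted: 2.2, 3.4, 4.5, 5.9]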

92 92 Two-Pass ChainTree (ChainTree+)
- 1st pass: small cutoff distance to detect steric clashes
- 2nd pass: normal cutoff distance
[figure: tests around native state; label ">5"]

93 93 Interaction with Solvent
- Explicit solvent models: 100s or 1000s of discrete solvent molecules
- Implicit solvent models: solvent as continuous medium, interface is solvent-accessible surface
E. Eyal, D. Halperin. Dynamic Maintenance of Molecular Surfaces under Conformational Changes. http://www.give.nl/movie/publications/telaviv/EH04.pdf

94 94 Conclusion
- ChainTree significantly reduces average time of MCS for proteins (vs. grid)
- It exploits:
  - Atomic exclusion
  - Cutoff distance on potentials
  - Chain kinematics of protein
  - Small # of DOF changes at each MC step
  [slide annotation: "Already exploited by grid method"]
- Larger speed-up for bigger proteins and smaller # of simultaneous DOF changes
- Extension to updating protein surface
- http://robotics.stanford.edu/~itayl/mcs

