Structure and Motion Jean-Claude Latombe Computer Science Department Stanford University NSF-ITR Meeting on November 14, 2002.



Stanford’s Participants
- PIs: L. Guibas, J.C. Latombe, M. Levitt
- Research Associate: P. Koehl
- Postdocs: F. Schwarzer, A. Zomorodian
- Graduate students: S. Apaydin (EE), S. Ieong (CS), R. Kolodny (CS), I. Lotan (CS), A. Nguyen (Sc. Comp.), D. Russel (CS), R. Singh (CS), C. Varma (CS)
- Undergraduate students: J. Greenberg (CS), E. Berger (CS)
- Collaborating faculty:
  - A. Brunger (Molecular & Cellular Physiology)
  - D. Brutlag (Biochemistry)
  - D. Donoho (Statistics)
  - J. Milgram (Math)
  - V. Pande (Chemistry)

Problems Addressed
Biological functions derive from the structures (shapes) achieved by molecules through motions
- Determination, classification, and prediction of 3D protein structures
- Modeling of molecular energy and simulation of folding and binding motion

What’s New for Computer Science?
- Massive amount of experimental data
- Importance of similarities
- Multiple representations of structure
- Continuous energy functions
- Many objects forming deformable chains
- Many degrees of freedom
- Ensemble properties of pathways

Massive amount of experimental data
- Abstract/simplify data sets into compact data structures
E.g.: Electron density map → Medial axis

Importance of similarities
- Segmentation/matching/scoring techniques
Figure: data set → clustered data → small library
E.g.: Libraries of protein fragments [Kolodny, Koehl, Guibas, Levitt, JMB (2002)]

1tim approximations (figure panels):
- Real protein
- Complexity 10 (100 fragments of length 5), cRMS in Å
- Complexity 2.26 (50 fragments of length 7), cRMS in Å

Alignment of Structural Motifs [Singh and Saha; Kolodny and Linial]
Problem:
- Determine if two structures share common motifs: given 2 (labelled) structures in $\mathbb{R}^3$, $A = \{a_1, a_2, \ldots, a_n\}$ and $B = \{b_1, b_2, \ldots, b_m\}$, find subsequences $s_a$ and $s_b$ such that the substructures $\{a_{s_a(1)}, a_{s_a(2)}, \ldots, a_{s_a(l)}\}$ and $\{b_{s_b(1)}, b_{s_b(2)}, \ldots, b_{s_b(l)}\}$ are similar
- Twofold problem: alignment and correspondence
- Score
- Approximation
- Complexity

Iterative Closest Point (Besl-McKay) for alignment:
- Score: RMSD distance
[R. Singh and M. Saha. Identifying Structural Motifs in Proteins. Pacific Symp. on Biocomputing, Jan ]

Figure: trypsin; trypsin active site
[R. Singh and M. Saha. Identifying Structural Motifs in Proteins. Pacific Symp. on Biocomputing, Jan ]

Figure: trypsin active site against 42 trypsin-like proteins
[R. Singh and M. Saha. Identifying Structural Motifs in Proteins. Pacific Symp. on Biocomputing, Jan ]

Multiple representations of structure
ProShape software [Koehl, Levitt (Stanford), Edelsbrunner (Duke)]

Statistical potentials for proteins based on alpha complex [Guibas, Koehl, Zomorodian]
- Decoys generated using “physical” potentials
- Select best decoys using distance information

Continuous energy functions
Many objects in deformable chains
- Many pairs of objects, but relatively few are close enough to interact
- Data structures that capture proximity, but undergo small or rare changes
During motion simulation:
- detect steric clashes (self-collisions)
- find pairs of atoms closer than cutoff
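The cutoff query on this slide can be illustrated with the uniform-grid baseline that the ChainTree slides later compare against. This is only a minimal sketch under stated assumptions: the function name `pairs_within_cutoff` and the choice of a cell side equal to the cutoff are hypothetical and do not come from the slides.

```python
from collections import defaultdict
from itertools import product
from math import dist, floor

def pairs_within_cutoff(coords, cutoff):
    """Return the set of index pairs (i, j), i < j, with distance < cutoff."""
    # Hash every atom into a cubic cell whose side equals the cutoff
    # (an assumed, conventional choice for this kind of grid).
    cells = defaultdict(list)
    for idx, (x, y, z) in enumerate(coords):
        key = (floor(x / cutoff), floor(y / cutoff), floor(z / cutoff))
        cells[key].append(idx)

    found = set()
    for (cx, cy, cz), members in cells.items():
        # Partners within the cutoff can only sit in the 27 surrounding cells.
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            for j in cells.get((cx + dx, cy + dy, cz + dz), ()):
                for i in members:
                    if i < j and dist(coords[i], coords[j]) < cutoff:
                        found.add((i, j))
    return found
```

Because few pairs are close enough to interact, most cells hold only a handful of atoms and the query avoids the quadratic all-pairs scan. The grid must still be rebuilt or patched whenever atoms move, which is the cost the proximity data structures on the following slides try to reduce.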

Other application domains:
- Modular reconfigurable robots
- Reconstructive surgery

- Fixed bounding-volume hierarchies don’t work

- Instead, exploit what doesn’t change: chain topology
- Adaptive BV hierarchies
[Guibas, Nguyen, Russel, Zhang] [Lotan, Schwarzer, Halperin, Latombe] (SoCG’02)
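The idea of a hierarchy whose topology follows the chain and never changes can be sketched as below. This is a simplified illustration, not the construction of either cited paper: each parent sphere simply encloses its two child spheres, atoms are `(center, radius)` pairs, and all names (`ChainHierarchy`, `enclose`, `self_collisions`) are hypothetical.

```python
from math import dist

def enclose(s1, s2):
    """Smallest sphere containing spheres s1 and s2, each given as (center, radius)."""
    (c1, r1), (c2, r2) = s1, s2
    d = dist(c1, c2)
    if d + r2 <= r1:          # s2 lies inside s1
        return s1
    if d + r1 <= r2:          # s1 lies inside s2
        return s2
    radius = (d + r1 + r2) / 2.0
    t = (radius - r1) / d
    return (tuple(a + (b - a) * t for a, b in zip(c1, c2)), radius)

class ChainHierarchy:
    """Balanced binary tree over atoms lo..hi-1 of a chain; the topology is fixed."""
    def __init__(self, atoms, lo=0, hi=None):
        hi = len(atoms) if hi is None else hi
        self.lo, self.hi = lo, hi
        self.left = self.right = None
        if hi - lo > 1:
            mid = (lo + hi) // 2
            self.left = ChainHierarchy(atoms, lo, mid)
            self.right = ChainHierarchy(atoms, mid, hi)
        self.refit(atoms)

    def refit(self, atoms):
        """Recompute this node's sphere from its children (or its atom, for a leaf)."""
        if self.left is None:
            self.sphere = atoms[self.lo]
        else:
            self.sphere = enclose(self.left.sphere, self.right.sphere)

    def update(self, atoms, i):
        """Repair only the spheres on the root-to-leaf path after atom i has moved."""
        if self.left is not None:
            (self.left if i < self.left.hi else self.right).update(atoms, i)
        self.refit(atoms)

def overlap(s1, s2):
    return dist(s1[0], s2[0]) < s1[1] + s2[1]

def self_collisions(a, b, out):
    """Append clashing atom pairs (i, j), j > i + 1, found between subtrees a and b."""
    if a is b:
        if a.left is not None:
            self_collisions(a.left, a.left, out)
            self_collisions(a.left, a.right, out)
            self_collisions(a.right, a.right, out)
        return
    if not overlap(a.sphere, b.sphere):
        return                                   # prune: nothing below can touch
    if a.left is None and b.left is None:
        i, j = sorted((a.lo, b.lo))
        if j > i + 1:                            # bonded neighbours always touch
            out.append((i, j))
    elif a.left is not None and (b.left is None or a.hi - a.lo >= b.hi - b.lo):
        self_collisions(a.left, b, out)
        self_collisions(a.right, b, out)
    else:
        self_collisions(a, b.left, out)
        self_collisions(a, b.right, out)
```

A typical use is to build the tree once, then after each move call `update` for the moved atoms and rerun `self_collisions(tree, tree, out)`. The sketch repairs one atom at a time; a torsion change in a real chain moves many atoms at once, which is the case the cited structures are designed to handle efficiently.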

Wrapped bounding sphere hierarchies [Guibas, Nguyen, Russel, Zhang] (SoCG 2002)
- WBSH undergoes small number of changes
- Self-collision: $O(n \log n)$ in $\mathbb{R}^2$, $O(n^{2-2/d})$ in $\mathbb{R}^d$, $d \ge 3$

ChainTrees [Lotan, Schwarzer, Halperin, Latombe] (SoCG’02)
Assumption: Few degrees of freedom change at each motion step (e.g., Monte Carlo simulation)
- Find all pairs of atoms closer than a given cutoff
- Find which energy terms can be reused
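Why few changed degrees of freedom make energy terms reusable can be shown with a small sketch. It assumes a simplified model that is not taken from the slides: the chain is a sequence of rigid links, joint `k` connects link `k` to link `k + 1`, and a pairwise term depends only on the relative placement of its two links. The function name is hypothetical, and the quadratic enumeration is for illustration only; it is not how ChainTrees find these pairs.

```python
def pairs_to_recompute(n_links, changed_joints):
    """Return link pairs (i, j), i < j, separated by at least one changed joint."""
    changed = set(changed_joints)
    # prefix[i] = number of changed joints with index < i
    prefix = [0] * n_links
    for i in range(1, n_links):
        prefix[i] = prefix[i - 1] + ((i - 1) in changed)
    # A pair keeps its relative placement unless a changed joint lies between its links.
    return [(i, j)
            for i in range(n_links)
            for j in range(i + 1, n_links)
            if prefix[j] != prefix[i]]
```

With 5 links and a single changed joint `2`, only the 6 pairs that straddle the joint need new energy terms; the other 4 of the 10 pairs can reuse their cached values.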

ChainTrees [Lotan, Schwarzer, Halperin, Latombe] (SoCG’02)
- Updating:
- Finding interacting pairs: (in practice, sublinear)

ChainTrees: Application to MC simulation (comparison to grid method)
[Two plots, for m = 1 and m = 5; tick labels (68), (144), (374), (755)]

Future work: ChainTrees
Open problem: How to find good moves to make when the conformation is compact and random moves are rejected with high probability?
- Run new series of experiments with more complex energy field: EEF1 [Lazaridis & Karplus] (with Pande)
- Use library of fragments (with Koehl)

Future Work: Spanner for deformable chain [Agarwal, Gao, Duke; Nguyen, Zhang, Stanford]
Capture proximity information with a sparse spanner (figure: 3HVT)

Many degrees of freedom
- Tools to explore high-dimensional conformation space:
  - Sampling strategies
  - Nearest neighbors

Sampling structures by combining fragments [Kolodny, Levitt]
Library of protein fragments → Discrete set of candidate structures
Figure labels: a b c d; cabcab; bbc
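The step from a fragment library to a discrete set of candidates can be sketched by enumerating label strings. This is a minimal illustration only: the function name is hypothetical, fragments are represented by single-letter labels, and building 3D coordinates from the chosen fragments is left out.

```python
from itertools import product

def candidate_structures(library, length):
    """Yield every string of `length` fragment labels drawn from `library`."""
    for combo in product(library, repeat=length):
        yield "".join(combo)
```

A library of 4 fragments and candidates made of 3 fragments gives 4^3 = 64 strings, so the candidate set is finite but grows exponentially with the number of fragments per structure.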

Nearest neighbors in high-dimensional space [Lotan and Schwarzer]
Find k nearest neighbors of a given protein conformation in a set of n conformations (cRMS, dRMS)
Idea: Cut backbone into m equal subsequences
Figure labels: $a_0, a_1, a_2, a_3, a_4, a_5, a_6, \ldots, a_m$
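The idea of cutting the backbone into m equal subsequences can be sketched as follows, assuming each subsequence is replaced by the average of its atom positions and conformations are compared with dRMS. The normalization used for dRMS here (mean over all index pairs) is one common convention and an assumption, as are the function names; the search is plain brute force.

```python
import heapq
from itertools import combinations
from math import dist, sqrt

def averaged(conf, m):
    """Cut a backbone (list of 3D points) into m equal runs and average each run."""
    size = len(conf) // m          # assumes len(conf) is a multiple of m
    return [tuple(sum(p[k] for p in conf[s:s + size]) / size for k in range(3))
            for s in range(0, size * m, size)]

def drms(a, b):
    """Root-mean-square difference between the intra-molecular distances of a and b."""
    pairs = list(combinations(range(len(a)), 2))
    total = sum((dist(a[i], a[j]) - dist(b[i], b[j])) ** 2 for i, j in pairs)
    return sqrt(total / len(pairs))

def k_nearest(query, confs, k, m):
    """Indices of the k conformations closest to `query` under dRMS on averaged points."""
    q = averaged(query, m)
    return heapq.nsmallest(k, range(len(confs)),
                           key=lambda i: drms(q, averaged(confs[i], m)))
```

Averaging shrinks each conformation from n points to m points, so every dRMS evaluation touches m(m-1)/2 distances instead of n(n-1)/2. Since dRMS compares internal distances only, no superposition step is needed.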

100,000 decoys of 1CTF (Park-Levitt set); computation of 100 NN of each conformation

Representation   Metric   Search        Time
Full rep.        dRMS     brute force   ~84 h
Ave. rep.        dRMS     brute force   ~4.8 h
SVD red. rep.    dRMS     brute force   41 min
SVD red. rep.    dRMS     kd-tree       19 min

- ~80% of computed NNs are true NNs
- kd-tree software from ANN library (U. Maryland)

Ensemble properties of pathways
- Stochastic nature of molecular motion requires characterizing average properties of many pathways

Example #1: Probability of Folding $p_{fold}$
Figure: from a given conformation, pathways reach the folded set first with probability $p_{fold}$ and the unfolded set first with probability $1 - p_{fold}$. HIV integrase [Du et al. ’98]
“We stress that we do not suggest using $p_{fold}$ as a transition coordinate for practical purposes as it is very computationally intensive.” Du, Pande, Grosberg, Tanaka, and Shakhnovich, “On the Transition Coordinate for Protein Folding,” Journal of Chemical Physics (1998).

Example #2: Ligand-Protein Interaction [Sept, Elcock and McCammon ’99]
10K to 30K independent simulations

Probabilistic Roadmap [Apaydin, Brutlag, Hsu, Guestrin, Latombe] (RECOMB’02, ECCB’02)
Idea: Capture the stochastic nature of molecular motion by a network of randomly selected conformations and by assigning probabilities to edges
Figure labels: nodes $v_i$, $v_j$; edge probability $P_{ij}$
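The slide does not say how the edge probabilities are assigned. One plausible choice, shown here purely as an assumption, is a Metropolis-style rule: a move to a neighbor is proposed uniformly, accepted with probability `exp(-dE / kT)` when it goes uphill in energy, and any leftover probability becomes a self-transition. The function name and the data layout (dictionaries of energies and adjacency lists) are hypothetical.

```python
from math import exp

def transition_probabilities(energy, neighbors, kT=1.0):
    """Build P[i][j] for a roadmap from node energies and adjacency lists."""
    P = {}
    for i, nbrs in neighbors.items():
        row = {}
        for j in nbrs:
            dE = energy[j] - energy[i]
            accept = exp(-dE / kT) if dE > 0 else 1.0   # Metropolis acceptance (assumed)
            row[j] = accept / len(nbrs)                 # uniform proposal over neighbors
        row[i] = 1.0 - sum(row.values())                # probability of staying at node i
        P[i] = row
    return P
```

Every row sums to 1, so the roadmap can be read as a Markov chain over the sampled conformations.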

Probabilistic Roadmap [Apaydin, Brutlag, Hsu, Guestrin, Latombe] (RECOMB’02, ECCB’02)
F: Folded set; U: Unfolded set
Figure: node i with neighbors j, k, l, m and transition probabilities $P_{ii}$, $P_{ij}$, $P_{ik}$, $P_{il}$, $P_{im}$
Let $f_i = p_{fold}(i)$. After one step:
$f_i = P_{ii} f_i + P_{ij} f_j + P_{ik} f_k + P_{il} f_l + P_{im} f_m$
with $f = 1$ for nodes in F and $f = 0$ for nodes in U
- One linear equation per node
- Solution gives $p_{fold}$ for all nodes
- No explicit simulation run
- All pathways are taken into account
- Sparse linear system
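The one-equation-per-node system can be solved without running any simulation. The sketch below uses Gauss-Seidel sweeps, which is a choice made here for brevity and not something the slides specify; the function name and the dictionary layout `P[i][j]` are hypothetical, and every node outside the folded and unfolded sets is assumed to have a self-transition probability below 1.

```python
def solve_pfold(P, folded, unfolded, sweeps=10_000, tol=1e-12):
    """Solve f_i = sum_j P[i][j] * f_j with f = 1 on `folded` and f = 0 on `unfolded`."""
    f = {i: 1.0 if i in folded else 0.0 for i in P}
    free = [i for i in P if i not in folded and i not in unfolded]
    for _ in range(sweeps):
        change = 0.0
        for i in free:
            # Move the self-loop term P_ii * f_i to the left-hand side, then divide.
            rest = sum(p * f[j] for j, p in P[i].items() if j != i)
            new = rest / (1.0 - P[i].get(i, 0.0))
            change = max(change, abs(new - f[i]))
            f[i] = new
        if change < tol:
            break
    return f
```

On a five-node line where node 0 is unfolded, node 4 is folded, and each interior node steps left or right with probability 0.25 and stays with probability 0.5, the solution is f_i = i/4. Each sweep only touches the stored edges, so sparsity of the roadmap carries over directly to the cost.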

Probabilistic Roadmap: Correlation with MC Approach
1ROP (repressor of primer): 2 α helices, 6 DOF

Probabilistic Roadmap: Computation Times (1ROP)
Monte Carlo:
- 49 conformations
- Over 11 days of computer time
- Over $10^6$ energy computations
Roadmap:
- 5000 conformations
- hours of computer time
- ~15,000 energy computations
~4 orders of magnitude speedup!

Future work: Probabilistic Roadmap
- Non-uniform sampling strategies
- Encoding molecular dynamics into probabilistic roadmaps (with V. Pande)
- Quantitative experiments with ligand-protein binding (with V. Pande)

Bio-X – Clark Center

The following slides relate to non-research issues. I do not plan to present them. Jack and Leo may want to use the contents of some of them for their own presentations.

Education
- Tutorial on Delaunay, Alpha-Shape and Pockets (Koehl)
- A biocomputing Notebook (Koehl)
- Biocomputation lectures in pre-existing classes:
  - CS326, motion planning: molecular motion, probabilistic roadmaps, self-collision detection (Latombe)
  - CS468, intro to computational topology: finding pockets and tunnels in molecules, computing surface areas and volumes and their derivatives (Zomorodian)
- New class on Algorithmic Biology (Batzoglou, Guibas, Latombe)
- Graduate Curriculum Committee, Bio-Engineering Dept., Stanford (Latombe)

Trained Students (1/2)
PhD students:
- Serkan Apaydin, EE
- An Nguyen, Scientific Computing
- Carlos Guestrin, CS (Daphne Koller’s group)
- Itay Lotan, CS
- Rachel Kolodny, CS
- Daniel Russel, CS
- Samuel Ieong, CS
Most graduate students have a principal advisor in CS and a secondary one in a bio-related department (Levitt, Brutlag, Pande)

Trained Students (2/2)
Graduated Master’s students:
- Rohit Singh, finding motifs in proteins, best Stanford CS master’s thesis, June ’02 [current position: bioinformatics company in San Diego]
- Chris Varma, study of ligand-protein interaction with probabilistic roadmaps, June ’02 [current position: PhD student, Harvard/MIT Biomedical program]
Current Master’s student:
- Ben Wong, modeling T cell activity
Undergraduates:
- Eric Berger, CS, Stanford, summer internship
- Julie Greenberg, CS, Harvard, summer internship

Visitors
- Prof. Alberto Munoz, Math Dept., University of Yucatan, Mexico: 3 months, Summer ’02; haptic interaction and probabilistic roadmaps
- Prof. Ileana Streinu, Smith College: 6 months, from Sept. ’02; protein folding

Interactions Within Stanford
- Guibas and Levitt, with J. Milgram (Math): topology of configuration spaces of chains
- Guibas, with V. Pande (Chemistry) and D. Donoho (Statistics): non-linear multi-resolution analysis of molecular motions
- Latombe and Apaydin, with D. Brutlag (Biochemistry) and V. Pande: probabilistic roadmaps
- Latombe and Lotan, with V. Pande: efficient MC simulation

Interactions Outside Stanford
- Collision Detection for Deforming Necklaces, P. Agarwal, L. Guibas, A. Nguyen, D. Russel, and L. Zhang. Invited to special issue of Comp. Geom., Theory and Applications, following presentation at SoCG’02.
- Kinetic Medians and kd-Trees, P. Agarwal, J. Gao, and L. Guibas. Proc. 10th European Symp. Algorithms, LNCS 2461, Springer-Verlag, 5-16,
- Stochastic Roadmap Simulation: An Efficient Representation and Algorithm for Analyzing Molecular Motion, M.S. Apaydin, D.L. Brutlag, C. Guestrin, D. Hsu, and J.C. Latombe. Proc. RECOMB’02, Washington D.C., pp ,
- Efficient Maintenance and Self-Collision Testing for Kinematic Chains, I. Lotan, F. Schwarzer, D. Halperin, and J.C. Latombe, SoCG’02, pp June
- Stochastic Conformational Roadmaps for Computing Ensemble Properties of Molecular Motion, M.S. Apaydin, D.L. Brutlag, C. Guestrin, D. Hsu, and J.C. Latombe. Workshop on Algorithmic Foundations of Robotics (WAFR), Nice, Dec

Attendance at Conferences
- BCATS ’01 and ’02 [Bio-Computation At Stanford]
- RECOMB ’02 [Int. Conf. on Research in Computational Biology]
- ISMB ’02 [Int. Conf. on Intelligent Syst. for Molecular Biology]
- ECCB 2002 [European Conf. on Computational Biology]
- Biophysical Society Symp. on Molecular Simulations in Structural Biology,
- SoCG 2002 [ACM Symp. on Computational Geometry]

Outreach
- Latombe and Levitt serve as members of the Scientific Leadership Council of Stanford’s Bio-X program
- Presentations: Stanford’s Bio-X Symposium (3/02), Stanford’s Computer Forum (3/02), Berkeley’s Broad Area Seminar (4/02)
- Conference committees:
  - Guibas, program committee, WAFR’02 and SoCG’03
  - Latombe, program committee, 1st IEEE Bioinformatics Conf. ’03
  - Apaydin, organization committee of BCATS’02

The following slides are extra slides that I removed from my presentation for lack of time

General Goals
- Larger proteins considered → computational efficiency
- Diversity of molecules and interactions → computational abstractions
- Extension of in-silico experiments → computational correctness
Enable biological studies that were not possible before, more systematically

Approach
- Select hard problems
- Close interaction between computer scientists (Guibas, Koehl, Latombe) and biologists (Koehl, Levitt, Brutlag, Pande, Brunger)
- Most graduate students are CS students with secondary advisor in biology
- Perform extensive tests

Electron density map → Medial axis [Guibas, Brunger, Russel]
- Medial axis of iso-surfaces to estimate backbone
- Cleaning and simplification of axis to filter noise out
- Persistence of features across multiple iso-surfaces

Continuous energy function
- Essential for protein structure prediction and molecular motion simulation:
  - Statistical potentials based on alpha complex
  - Maintenance of energy values during simulation

- Instead, exploit what doesn’t change: chain topology
- Adaptive BV hierarchies
  - Balanced binary trees of constant topology
  - Efficient repair of position/size of BVs
[Guibas, Nguyen, Russel, Zhang] [Lotan, Schwarzer, Halperin, Latombe] (SoCG’02)

Future Work: Spanner for deformable chain [Agarwal, Gao, Duke; Nguyen, Zhang, Stanford]

Probabilistic Roadmap
- 1ROP (repressor of primer): 2 α helices, 6 DOF
- 1HDD (Engrailed homeodomain): 3 α helices, 12 DOF
H-P energy model with steric clash exclusion [Sun et al., 95]