Clustering the Temporal Sequences of 3D Protein Structure Mayumi Kamada +*, Sachi Kimura, Mikito Toda ‡, Masami Takata +, Kazuki Joe + + : Graduate School.

Presentation transcript:

Clustering the Temporal Sequences of 3D Protein Structure. Mayumi Kamada +*, Sachi Kimura, Mikito Toda ‡, Masami Takata +, Kazuki Joe +. + : Graduate School of Humanities and Science, Information and Computer Sciences, Nara Women’s University. ‡ : Department of Physics, Nara Women’s University

Outline Motivation Flexibility Docking Feature Extraction using Motion Analysis Conclusions and Future Work

Motivation. Proteins among biological molecules. “Docking” – a protein changes its own shape and combines with other materials. Prediction of docking → prediction of the resulting functions

Existing Docking Simulation. [Diagram labels: structure A, structure B, PDB*, docking simulation, predicted structures from docking.] * PDB: Protein Data Bank; it holds rigid structures. Proteins fluctuate in living cells → low prediction accuracy. Docking simulation → should consider fluctuations

Flexibility Docking. [Diagram labels: structure A, structure B, PDB, flexibility handling, docking simulation, predicted structures from docking.] Flexibility handling: considering the fluctuation of proteins in living cells. Extraction of fluctuated structures. Consideration of the structural fluctuation of proteins

Flexibility Handling. Flexibility handling: MD → output files → Filter → representative structures. Molecular dynamics simulation (MD): simulation of the motion of molecules in a polyatomic system. Filtering: selection of representative structures from similar structures. Filters are created by using RMSD

Filters using RMSD. RMSD (Root Mean Square Deviation) – comparison of the similarity of two structures. Two filtering algorithms are proposed: the maximum RMSD selection filter and the below RMSD 1 Å deletion filter. Result – useful under the thermal fluctuation condition – RMSD → unification of topology information → loss of information. Hence feature extraction focusing on protein motion, not structure
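The slide names the filters but does not spell them out, so the following is only a minimal sketch of what a "below RMSD 1 Å deletion" filter could look like. It assumes the frames are already superposed, and it assumes the filter means "drop a frame if it lies within 1 Å of a frame that was already kept". The function names `rmsd` and `deletion_filter` are illustrative and do not come from the presentation.

```python
import math

def rmsd(a, b):
    """RMSD between two pre-aligned structures given as lists of (x, y, z)."""
    if len(a) != len(b):
        raise ValueError("structures must have the same number of atoms")
    squared = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(squared / len(a))

def deletion_filter(frames, threshold=1.0):
    """Return indices of frames kept as representatives.

    Assumption: a frame is deleted when its RMSD to any already kept
    frame is below `threshold` (in Å).
    """
    kept = []
    for i, frame in enumerate(frames):
        if all(rmsd(frame, frames[k]) >= threshold for k in kept):
            kept.append(i)
    return kept

# Three toy two-atom "structures" (hypothetical data).
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0)],  # 0.1 Å away from frame 0
    [(2.0, 0.0, 0.0), (3.0, 0.0, 0.0)],  # 2.0 Å away from frame 0
]
representatives = deletion_filter(frames)
```

Because RMSD collapses a whole structure pair into one number, two quite different motions can yield the same value, which is the loss of information the slide points to.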

Capture Protein Motion. Pipeline: MD → wavelet transform → clustering. Feature extraction by continuous wavelet transform with the Morlet wavelet, because the frequency may change momentarily! Selection of representative motions by clustering; clustering algorithm: Affinity Propagation
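As a rough illustration of why a continuous wavelet transform suits signals whose frequency changes over time, here is a small NumPy sketch of a Morlet transform computed by direct convolution. The centre frequency `omega0 = 6`, the scale grid and the test signal are assumptions made for the example; the presentation does not state its wavelet parameters or software.

```python
import numpy as np

def morlet(t, scale, omega0=6.0):
    """Morlet wavelet stretched to `scale` (omega0 = 6 is an assumed value)."""
    x = t / scale
    return np.pi ** -0.25 * np.exp(1j * omega0 * x - 0.5 * x ** 2) / np.sqrt(scale)

def cwt_morlet(signal, scales, dt=1.0, omega0=6.0):
    """Continuous wavelet transform: one row of coefficients per scale."""
    signal = np.asarray(signal, dtype=float)
    n = signal.size
    m = (n - 1) // 2
    t = np.arange(-m, m + 1) * dt          # symmetric time grid for the kernel
    coeffs = np.empty((len(scales), n), dtype=complex)
    for k, scale in enumerate(scales):
        # Convolving with conj(psi(-t)) correlates the signal with psi.
        kernel = np.conj(morlet(-t, scale, omega0))
        coeffs[k] = np.convolve(signal, kernel, mode="same") * dt
    return coeffs

# Toy signal: a sine wave with a period of 20 samples (hypothetical data).
n = 512
time = np.arange(n)
signal = np.sin(2 * np.pi * time / 20.0)
scales = np.array([5.0, 10.0, 19.0, 40.0])
power = np.abs(cwt_morlet(signal, scales)) ** 2
best_scale = scales[power[:, n // 2].argmax()]
```

For `omega0 = 6` a scale `s` corresponds roughly to a period of `2 * pi * s / omega0` samples, so the period-20 sine responds most strongly near scale 19. Each column of `power` is the spectrum at one time step, which is what makes time-local frequency changes visible.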

Target Protein 1TIB –Residue length: 269 MD simulation –Software: AMBER –Simulation run time: 2 nsec –Result data files: 200 Space coordinates of Cα atoms

Singular Value Decomposition. SVD (singular value decomposition). Definition: $A = U \Sigma V^{*}$. Unitary matrix U: left-singular vectors → spatial motion. Unitary matrix V: right-singular vectors → frequency fluctuation. Matrix A: at time step i ($t_i$). Components: column: Cα, row: frequency. ★ Matrix size of A: 807×199
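A minimal NumPy sketch of the decomposition for a matrix of the stated size. The random matrix below is only a placeholder for the wavelet data at one time step; reading 807 as 3 coordinates × 269 Cα atoms is an inference from the target-protein slide, not something the slides state.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((807, 199))   # placeholder for the real matrix at time t_i

# Thin SVD: A = U @ diag(s) @ Vh, singular values sorted in descending order.
U, s, Vh = np.linalg.svd(A, full_matrices=False)

first_left = U[:, 0]    # first left-singular vector (length 807)
first_right = Vh[0]     # first right-singular vector (length 199)
reconstructed = (U * s) @ Vh
```

With `full_matrices=False` the factors have shapes (807, 199), (199,) and (199, 199), which is enough to reproduce `A` exactly.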

Verification of Reproducibility. Singular values and principal components. [Figure panels: N = 1, 4, 6, 8; M = 1, 4, 6, 8; left singular vectors (spatial motion); right singular vectors (frequency fluctuation).]

Reproducibility. Using eight principal components, the motion expressed by 199 components can be reproduced! The reconstruction almost matches the original!
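The idea of reproducing the motion from a few principal components can be sketched as a truncated SVD reconstruction. The synthetic matrix below (rank 8 plus small noise) is an assumption chosen so that eight components suffice; it is not the authors' data, and `truncated_reconstruction` and `relative_error` are illustrative names.

```python
import numpy as np

def truncated_reconstruction(A, k):
    """Rebuild A from its k largest singular values and vectors."""
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vh[:k]

def relative_error(A, k):
    """Frobenius-norm error of the rank-k reconstruction, relative to A."""
    return np.linalg.norm(A - truncated_reconstruction(A, k)) / np.linalg.norm(A)

# Synthetic stand-in: a rank-8 signal plus weak noise (hypothetical data).
rng = np.random.default_rng(1)
signal = rng.standard_normal((807, 8)) @ rng.standard_normal((8, 199))
A = signal + 0.01 * rng.standard_normal((807, 199))

errors = {k: relative_error(A, k) for k in (1, 4, 6, 8)}
```

Plotting `errors` against `k` for real data would show how quickly the reconstruction approaches the original as components are added.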

Examination. (1) Each of the singular values. (2) The first singular value – accounts for more than about 30%. Expression of the original motion → possible with six singular values. The first singular value is useful
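One way to quantify how much a singular value "accounts for" is its share of the sum of all singular values. The slide does not say whether the 30% figure refers to this ratio or to a squared (variance-like) ratio, so the definition below is an assumption, and the helper names are illustrative.

```python
import numpy as np

def singular_value_shares(A):
    """Share of each singular value in the sum of all singular values."""
    s = np.linalg.svd(A, compute_uv=False)
    return s / s.sum()

def components_needed(shares, target):
    """Smallest number of leading components whose cumulative share reaches target."""
    return int(np.searchsorted(np.cumsum(shares), target) + 1)

# Toy matrix with singular values 6, 3 and 1 (hypothetical data).
A = np.diag([6.0, 3.0, 1.0])
shares = singular_value_shares(A)
k = components_needed(shares, 0.85)
```

Here the first component alone carries 60% and two components are needed to pass 85%; the same two calls applied to the real matrices would give the kind of figures the slide reports.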

Clustering Analysis. Focus on the first principal component. Definition – similarities and preference → clustering by using the above values

Similarities (1). For left singular vectors – Difference of spatial directions → inner products – Similarity: same direction / different direction. K_ij: Value 10 Cα
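The exact formula for K_ij did not survive in the transcript, so the sketch below uses a normalized inner product (cosine of the angle between two vectors) as a stand-in for an inner-product similarity between first left singular vectors taken at different time steps. The function names are illustrative.

```python
import math

def direction_similarity(u, v):
    """Normalized inner product: 1 for the same direction, 0 for orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def similarity_matrix(vectors):
    """Pairwise similarities between vectors, e.g. one per time step."""
    return [[direction_similarity(u, v) for v in vectors] for u in vectors]

# Toy direction vectors (hypothetical data).
vectors = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
K = similarity_matrix(vectors)
```

A singular vector is only defined up to sign, so in practice one might take the absolute value of the inner product; whether the authors did so is not stated.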

Similarities (2). For right singular vectors – Difference between distributions of spectrum → Hellinger distance – Similarity:
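The Hellinger distance between two discrete distributions p and q is $H(p, q) = \frac{1}{\sqrt{2}} \sqrt{\sum_k (\sqrt{p_k} - \sqrt{q_k})^2}$. The sketch below applies it to spectra derived from right singular vectors. Turning a vector into a distribution by normalizing its squared entries is an assumption, since the slide's own similarity formula is missing from the transcript.

```python
import math

def spectrum_distribution(vector):
    """Assumed normalization: squared entries scaled to sum to 1."""
    squares = [x * x for x in vector]
    total = sum(squares)
    return [x / total for x in squares]

def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 to 1)."""
    return math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2
                         for a, b in zip(p, q)) / 2.0)

# Toy right singular vectors (hypothetical data).
p = spectrum_distribution([1.0, 0.0, 0.0])
q = spectrum_distribution([0.0, 1.0, 0.0])
r = spectrum_distribution([-1.0, 0.0, 0.0])

d_far = hellinger(p, q)    # disjoint spectra
d_same = hellinger(p, r)   # identical spectra despite the sign flip
```

Since affinity propagation expects larger values for more similar pairs, a distance like this would typically be negated before clustering.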

Clustering Method. Affinity propagation (AP) – Brendan J. Frey and Delbert Dueck, “Clustering by Passing Messages Between Data Points”. Science 315, 972 – – Obtains “exemplars”: cluster centers. Preference – Left singular vectors: average of similarities – Right singular vectors: minimum of similarities − (maximum of similarities − minimum of similarities)
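To make the roles of the similarities and the preference concrete, here is a compact NumPy sketch of affinity propagation with the standard responsibility and availability updates, plus the two preference rules from the slide. The damping factor, iteration count, function names and toy data are assumptions; this is not the authors' implementation.

```python
import numpy as np

def off_diagonal(S):
    """All pairwise similarities, excluding self-similarities."""
    return S[~np.eye(S.shape[0], dtype=bool)]

def preference_left(S):
    """Slide rule for left singular vectors: average of similarities."""
    return off_diagonal(S).mean()

def preference_right(S):
    """Slide rule for right singular vectors: min - (max - min)."""
    off = off_diagonal(S)
    return off.min() - (off.max() - off.min())

def affinity_propagation(S, preference, damping=0.5, iterations=200):
    """Return (exemplar indices, exemplar index assigned to each point)."""
    S = np.array(S, dtype=float)
    n = S.shape[0]
    np.fill_diagonal(S, preference)
    R = np.zeros((n, n))
    A = np.zeros((n, n))
    idx = np.arange(n)
    for _ in range(iterations):
        # Responsibilities: r(i,k) = s(i,k) - max over k' != k of a(i,k') + s(i,k').
        AS = A + S
        top = AS.argmax(axis=1)
        first = AS[idx, top]
        AS[idx, top] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[idx, top] = S[idx, top] - second
        R = damping * R + (1 - damping) * R_new
        # Availabilities: sums of positive responsibilities, capped at 0 off-diagonal.
        Rp = np.maximum(R, 0)
        Rp[idx, idx] = R[idx, idx]
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diagonal = A_new[idx, idx].copy()
        A_new = np.minimum(A_new, 0)
        A_new[idx, idx] = diagonal
        A = damping * A + (1 - damping) * A_new
    exemplars = np.flatnonzero(np.diag(A + R) > 0)
    if exemplars.size == 0:
        return exemplars, np.full(n, -1)
    labels = exemplars[S[:, exemplars].argmax(axis=1)]
    labels[exemplars] = exemplars
    return exemplars, labels

# Toy data: two well separated groups on a line (hypothetical data).
points = np.array([0.0, 0.1, 0.2, 10.0, 10.1, 10.2])
S = -(points[:, None] - points[None, :]) ** 2
exemplars, labels = affinity_propagation(S, preference_left(S))
```

The preference sets how readily a point becomes an exemplar: `preference_right` always lies at or below the smallest similarity, so it tends to give fewer clusters than the average used by `preference_left`.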

Similarities between Left Singular Vectors

Clustering of Left Singular Vectors

Similarities between Right Singular Vectors

Clustering of Right Singular Vectors

Discussions. Each of the motions – Spatial motion: repetition of several similar spatial motions over time – Frequency fluctuation: repetition of similar frequency patterns over time. Relationship: characteristic frequency fluctuation and group transition in spatial motion

Conclusions and Future Work. Flexibility docking – Flexibility handling: MD and filter. Feature extraction based on motion – Wavelet analysis – Analysis of motions – Clustering. Future work – Collective motion – Relationship – Perform the docking simulation
