Basic Computations with 3D Structures

Basic Computations with 3D Structures Concepts

General issues How do we represent structure for computation? How do we compare structures? How can we summarize structural families?

Outline: General issues in representing structure. Bond lengths, bond angles, and dihedral (torsion) angles. Coordinate systems: Cartesian coordinates, internal coordinates, and object-based representations. Comparing structures with the RMSD (root mean squared deviation between two structures). Summarizing structures as mean and variance.

Basic measurements: bond lengths, bond angles, and dihedral (torsion) angles.

Bond length: The distance between bonded atoms is (almost) constant. It depends on the "type" of the bond (single: C-C, double: C=C, triple: C≡C). It varies from 1.0 Å (C-H) to 1.5 Å (C-C); some others are slightly longer. BOND LENGTH IS A FUNCTION OF THE POSITION OF TWO ATOMS.

Computing bond length = computing distance: For two points $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$, the distance is $d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2}$. Some non-covalent distances are virtually constant in the protein peptide backbone: the CA-CA distance for consecutive residues is 3.8 Å (we will see why in a moment).
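The distance formula above translates directly into a few lines of Python. This is a minimal sketch, assuming atoms are given as (x, y, z) tuples in Å; the function name `bond_length` and the sample coordinates are illustrative and do not come from the slides.

```python
import math

def bond_length(a, b):
    """Euclidean distance between two atoms given as (x, y, z) tuples."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

# Hypothetical coordinates (in Å) for two consecutive alpha carbons.
ca_1 = (0.0, 0.0, 0.0)
ca_2 = (3.8, 0.0, 0.0)
print(f"CA-CA distance: {bond_length(ca_1, ca_2):.2f} Å")
```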

Bond angle: All bond angles are determined by the chemical makeup of the atoms involved and are (almost) constant. They depend on the type of atom and the number of electrons available for bonding. They range from 100° to 180°. BOND ANGLE IS A FUNCTION OF THE POSITION OF THREE ATOMS.

Computing bond angle: For three atoms A, B, C with B at the vertex, the angle is the arccosine of the dot product between the unit vectors along BA and BC. With X = BA and Y = BC: $\theta = \arccos\left(\frac{X \cdot Y}{|X|\,|Y|}\right)$.
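The bond-angle formula can be sketched as follows. The function name `bond_angle` and the tuple representation of atoms are assumptions made for illustration. The cosine is clamped to [-1, 1] because floating-point rounding can push it slightly outside the domain of `acos`.

```python
import math

def bond_angle(a, b, c):
    """Angle A-B-C in degrees, with atom B at the vertex."""
    ba = [p - q for p, q in zip(a, b)]  # vector from B to A
    bc = [p - q for p, q in zip(c, b)]  # vector from B to C
    dot = sum(u * v for u, v in zip(ba, bc))
    norms = math.sqrt(sum(u * u for u in ba)) * math.sqrt(sum(v * v for v in bc))
    cos_theta = max(-1.0, min(1.0, dot / norms))  # guard against rounding
    return math.degrees(math.acos(cos_theta))

# Hypothetical positions forming a right angle at the origin.
print(bond_angle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)))
```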

Dihedral angle: These are usually variables. They range from 0 to 360° in molecules. The most important are φ, ψ, ω, and χ (see next figure). DIHEDRAL ANGLES ARE A FUNCTION OF THE POSITION OF FOUR ATOMS.

Dihedral angle

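Computing dihedral angle: For four consecutive atoms A, B, C, D, compute the cross product of BA and CB, then the cross product of CB and DC. This produces two vectors, one perpendicular to the ABC plane and one perpendicular to the BCD plane. The angle between those vectors (computed as for a bond angle) is the dihedral angle. The arccosine gives only the magnitude, so we still need to check whether the angle is positive or negative.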
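One way to get the magnitude and the sign in a single step is to use `atan2` instead of an arccosine followed by a separate sign check. The sketch below is an assumption about implementation, not the slides' own code: it uses the bond vectors b1 = B - A, b2 = C - B, b3 = D - C, which gives the same two plane normals as the cross products described above, and returns an angle in the range (-180°, 180°]. The function and helper names are hypothetical.

```python
import math

def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dihedral(a, b, c, d):
    """Signed dihedral angle A-B-C-D in degrees (0 = cis, +/-180 = trans)."""
    b1, b2, b3 = _sub(b, a), _sub(c, b), _sub(d, c)
    n1 = _cross(b1, b2)  # normal to the ABC plane
    n2 = _cross(b2, b3)  # normal to the BCD plane
    b2_len = math.sqrt(_dot(b2, b2))
    # y carries the sign: it is the component of n1 x n2 along the B-C axis.
    y = _dot(_cross(n1, n2), b2) / b2_len
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Hypothetical coordinates: B-C lies on the z axis, A and D are offset from it.
print(dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)))
```

An angle reported in the 0 to 360° convention can be obtained by adding 360 to negative results.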

Important dihedral angles

Omega is essentially constant at 180° (the C-N peptide bond doesn't rotate). Phi and psi have a range of values (the N-Cα and Cα-C bonds rotate). The range is restricted because atoms must not bump into each other.

Ramachandran Plot

Typical values (in degrees) for the secondary structures you already know:
Alpha helix: phi = -57, psi = -47
Parallel beta strand: phi = -119, psi = 113
Antiparallel beta strand: phi = -139, psi = 135
3-10 helix: phi = -49, psi = -26

Non-Cartesian coordinates: Cartesian coordinates assume orthogonal (xyz) axes, and then provide coordinate values along each axis (as in PDB). But if bond lengths and bond angles are virtually constant, why not reduce the number of variables by specifying only the dihedral angles? This is one of the internal coordinate schemes used.

Think about advantages: 3 peptide units = 12 atoms = 36 coordinates, OR 6 dihedral angles. 3 side chains = 12 atoms = 36 coordinates, OR 5 dihedral angles. In total: 72 Cartesian coordinates vs. 11 internal coordinates.

Think about disadvantages: Some basic computations become much more difficult. What is the distance between two points? What are the N points closest to point X? It is difficult to relate objects that are not connected together. The relationships between coordinates are much more nonlinear, which may make some kinds of optimization computations difficult.

Object-based coordinate systems: What if we know that certain parts of the protein are in standard alpha-helical conformation? We can treat an entire helical backbone as a rigid body. How many parameters do we need to represent the location of a helix? Six: three for translation and three for rotation.

Object-based coordinates: We can build a helix in an arbitrary coordinate system, and then represent its location in a particular protein as a TRANSLATION and a ROTATION.

Translation & rotation: Translation = (x, y, z) vector of some reference point. Rotation = many different representations: rotation matrix (3 x 3); Euler angles (rotate around z, then around x, then around z, and other conventions); direction cosines; quaternions; many others.
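To make the rigid-body idea concrete, the sketch below builds a 3 x 3 rotation matrix from z-x-z Euler angles and applies it, together with a translation, to a set of points. The composition order R = Rz(alpha) Rx(beta) Rz(gamma) is one of several conventions in use and is an assumption here, as are all function names. Three angles plus three translation components give the six parameters of a rigid body.

```python
import math

def rot_z(t):
    """Rotation by angle t (radians) about the z axis."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    """Rotation by angle t (radians) about the x axis."""
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(a, b):
    """Product of two 3 x 3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def euler_zxz(alpha, beta, gamma):
    """Rotation matrix R = Rz(alpha) Rx(beta) Rz(gamma) (assumed convention)."""
    return matmul(rot_z(alpha), matmul(rot_x(beta), rot_z(gamma)))

def place(points, rotation, translation):
    """Apply x' = R x + t to every point of a rigid body."""
    return [tuple(sum(rotation[i][k] * p[k] for k in range(3)) + translation[i]
                  for i in range(3))
            for p in points]

# Hypothetical rigid body: two points rotated 90 degrees about z, then shifted.
body = [(1.0, 0.0, 0.0), (0.0, 0.0, 2.0)]
print(place(body, euler_zxz(math.pi / 2, 0.0, 0.0), (1.0, 2.0, 3.0)))
```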

Comparing structures: In order to compare two structures, A and B, we need some basic information:
1. We need to know which atoms in A correspond to which atoms in B. This is why we do alignments of sequence.
2. We need to know where the atoms are. This is what we do with PDB files.
3. We need some criteria for comparing.

Criteria for comparing structures: All atoms being equal, we would like to know how closely the two structures can be superimposed. If they are exactly the same, then they should superimpose perfectly and have a "distance" of 0. As they diverge, the distance should increase.

The RMSD = root mean squared deviation: $\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} d_i^2}$, where $N$ is the number of atoms and $d_i$ is the distance between the two atoms with index $i$ in the two structures.
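For two structures that are already in the same frame, the definition can be evaluated directly. This is a minimal sketch, assuming each structure is a list of (x, y, z) tuples in matching atom order; the name `rmsd` is illustrative. It does not superimpose the structures first, so it does not by itself give the minimum RMSD.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples, no fitting."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("structures must be non-empty and of equal length")
    # Sum of squared distances d_i^2 over corresponding atoms.
    total = sum((p - q) ** 2
                for a, b in zip(coords_a, coords_b)
                for p, q in zip(a, b))
    return math.sqrt(total / len(coords_a))

# Hypothetical two-atom structures.
print(rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
           [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]))
```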

We want minimum RMSD

Computing the RMSD: It can be formulated as an inefficient search around superimposed centers of mass (Huang, Blostein, Margerum). There is a method based on quaternions (Faugeras & Hebert), and a method based on singular value analysis of a specially formed matrix (Arun, Huang, Blostein).

Algorithm: See the Arun et al. paper. Basically:
1. Compute the centroid of the points for each object.
2. Subtract off the centroid, so both objects are at the origin.
3. Create a special matrix as a sum of vector products.
4. Decompose the matrix using SVD (singular value decomposition) and use the resulting matrices to form the optimal ROTATION.
5. Compute the TRANSLATION necessary to get the rotated points in the proper location.
The algorithm is guaranteed to be optimal under a very broad set of conditions.
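The five steps can be sketched with NumPy as follows. This is an illustrative outline rather than the paper's own code: the function name `superimpose` and the argument names are assumptions, and the determinant check that prevents a mirror-image (reflection) solution is a commonly used safeguard that the slide does not mention.

```python
import numpy as np

def superimpose(mobile, target):
    """Least-squares fit of `mobile` onto `target` (both N x 3, same atom order).

    Returns the rotation R, the translation t, and the RMSD after fitting,
    such that target is approximated by R @ p + t for each point p in mobile.
    """
    P = np.asarray(mobile, dtype=float)
    Q = np.asarray(target, dtype=float)
    # Steps 1-2: compute centroids and move both objects to the origin.
    p_c, q_c = P.mean(axis=0), Q.mean(axis=0)
    P0, Q0 = P - p_c, Q - q_c
    # Step 3: 3 x 3 matrix formed as the sum of outer products p_i q_i^T.
    H = P0.T @ Q0
    # Step 4: SVD of H, then build the rotation from its factors.
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # -1 would mean a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    # Step 5: translation that carries the rotated centroid onto the target's.
    t = q_c - R @ p_c
    fitted = P @ R.T + t
    rmsd = float(np.sqrt(((fitted - Q) ** 2).sum(axis=1).mean()))
    return R, t, rmsd

# Hypothetical example: the target is the mobile structure shifted along x.
mobile = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]])
R, t, value = superimpose(mobile, mobile + np.array([5.0, 0.0, 0.0]))
print(np.round(t, 6), round(value, 6))
```

Only atoms that have been put in correspondence (for example the aligned alpha carbons) should be passed in, in matching order.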

Benefits of RMSD: It has appropriate behavior (identical structures have RMSD = 0.0, and it degrades continuously). It is relatively easy to compute. Its units are natural ones (e.g. Ångströms). There is lots of experience with it: similar structures usually have an RMSD of 1-3 Å.

Disadvantages of RMSD: It weights all atoms equally. It has an unclear upper bound (what is the RMSD of two random sets of points subject to the bond constraints of proteins?). The significance of values changes as a function of the size of the protein.

Case study: Myoglobin family. Eight structures taken: sperm whale myoglobin, sea hare myoglobin, plant leghemoglobin, sea lamprey hemoglobin, human hemoglobin A & B chains, Chironomus hemoglobin, and bloodworm hemoglobin. Aligned by hand, because of very low amino acid identity (~20%). 115 common positions. RMS values (for alpha carbons): average RMSD 2.19 Å, maximum RMSD 3.16 Å, minimum RMSD 1.22 Å.

Largest RMSD

Superimpose all on an average

Conclusions: We know how to compute basic structural parameters (bond length, bond angle, and torsion angle). Different coordinate systems have different advantages. RMSD is a basic measure of similarity, and can be computed relatively easily. We can superimpose related structures and summarize commonalities and differences.

Major methods for structure determination: X-ray crystallography: grow crystals, shoot X-rays through the crystals, collect the pattern of deflected X-rays, structure = FFT(pattern). NMR spectroscopy: concentrate protein in solution, expose to high magnetic field