Geometric and Kinematic Models of Proteins. From a course taught first at Stanford by JC Latombe, then in Singapore by Sung Wing Kin, and now in Rome by AG…web solidarity. With excerpts from a course by D. Wishart. LECT_4, 8th Oct 2007
Kinematic Models of Bio-Molecules. Atomistic model: The position of each atom $i$ is defined by its coordinates $(x_i, y_i, z_i)$ in 3-D space (the figure labels atoms 1 to 8). $p$ atoms give $3p$ parameters. Drawback: The bond structure is not taken into account.
Peptide bonds make proteins into long kinematic chains. The atomistic model does not encode this kinematic structure (so algorithms must maintain appropriate bond lengths).
Protein Features: ACEDFHIKNMFSDQWWIPANMCASDFDPQWERELIQNMDKQERTQATRPQDS... Sequence View / Structure View
Where To Go**
Compositional Features: Molecular Weight; Amino Acid Frequency; Isoelectric Point; UV Absorptivity; Solubility, Size, Shape; Radius of Gyration; Free Energy of Folding
Kinematic Models of Bio-Molecules. Atomistic model: The position of each atom is defined by its coordinates in 3-D space. Linkage model: The kinematics is defined by internal coordinates (bond lengths and angles, and torsional angles around bonds).
Linkage Model T?
Issues with Linkage Model: (1) Update the position of each atom in the world coordinate system. (2) Determine which pairs of atoms are within some given distance (topological proximity along the chain implies spatial proximity, but the reverse is not true).
Rigid-Body Transform: $T$ maps a point $x$ to $T(x)$ [figure: x, y, z axes]
2-D Case [figure: two sets of x, y axes]
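2-D Case: translation $t = (t_x, t_y)$ and rotation by angle $\theta$.
Rotation matrix, with columns $i$ and $j$ (the unit axes of the rotated frame):
$$R = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} = \begin{pmatrix} i_1 & j_1 \\ i_2 & j_2 \end{pmatrix}$$
A vector $v$ with coordinates $(a, b)$ in the rotated frame has coordinates $(a', b')$ in the original frame:
$$\begin{pmatrix} a' \\ b' \end{pmatrix} = \begin{pmatrix} i_1 & j_1 \\ i_2 & j_2 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix}$$
Transform of a point?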
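Homogeneous Coordinate Matrix
$$\begin{pmatrix} i_1 & j_1 & t_x \\ i_2 & j_2 & t_y \\ 0 & 0 & 1 \end{pmatrix}$$
$$\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta & t_x \\ \sin\theta & \cos\theta & t_y \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = \begin{pmatrix} t_x + x\cos\theta - y\sin\theta \\ t_y + x\sin\theta + y\cos\theta \\ 1 \end{pmatrix}$$
$T = (t, R)$, $T(x) = t + Rx$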
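The 2-D formula $T(x) = t + Rx$ can be checked numerically. The following is a minimal sketch in plain Python; the function names `make_transform_2d` and `apply_2d` and the sample values are illustrative assumptions, not part of the course material.

```python
import math

def make_transform_2d(tx, ty, theta):
    """Return the 3x3 homogeneous matrix: rotation by theta, then translation (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, tx],
            [s,  c, ty],
            [0.0, 0.0, 1.0]]

def apply_2d(T, point):
    """Apply T to a 2-D point by appending the homogeneous coordinate 1."""
    h = (point[0], point[1], 1.0)
    x_new = sum(T[0][k] * h[k] for k in range(3))
    y_new = sum(T[1][k] * h[k] for k in range(3))
    return (x_new, y_new)

# Hypothetical example values: rotate by 90 degrees, then shift by (2, 1).
T = make_transform_2d(2.0, 1.0, math.pi / 2)
moved = apply_2d(T, (1.0, 0.0))  # (1, 0) rotates to (0, 1), then shifts to about (2, 2)
print(moved)
```

The result agrees with the expanded form $x' = t_x + x\cos\theta - y\sin\theta$, $y' = t_y + x\sin\theta + y\cos\theta$ up to floating-point rounding.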
3-D Case [figure labels: 1, 2] ?
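Homogeneous Coordinate Matrix in 3-D
$$\begin{pmatrix} i_1 & j_1 & k_1 & t_x \\ i_2 & j_2 & k_2 & t_y \\ i_3 & j_3 & k_3 & t_z \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
with:
– $i_1^2 + i_2^2 + i_3^2 = 1$
– $i_1 j_1 + i_2 j_2 + i_3 j_3 = 0$
– $\det(R) = +1$
– $R^{-1} = R^T$
where $R$ is the 3×3 rotation block whose columns $i$, $j$, $k$ are the unit axes of the rotated frame.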
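Example: rotation by $\theta$ about the y axis, with translation $(t_x, t_y, t_z)$
$$\begin{pmatrix} \cos\theta & 0 & \sin\theta & t_x \\ 0 & 1 & 0 & t_y \\ -\sin\theta & 0 & \cos\theta & t_z \\ 0 & 0 & 0 & 1 \end{pmatrix}$$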
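Rotation Matrix (rotation by angle $\theta$ about the unit axis $k$)
$$R(k, \theta) = \begin{pmatrix} k_x k_x v + c & k_x k_y v - k_z s & k_x k_z v + k_y s \\ k_x k_y v + k_z s & k_y k_y v + c & k_y k_z v - k_x s \\ k_x k_z v - k_y s & k_y k_z v + k_x s & k_z k_z v + c \end{pmatrix}$$
where: $k = (k_x\ k_y\ k_z)^T$, $s = \sin\theta$, $c = \cos\theta$, $v = 1 - \cos\theta$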
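The axis-angle matrix $R(k, \theta)$ translates directly into code, and the properties listed for rotation matrices ($\det(R) = +1$, orthonormal columns) make a convenient sanity check. This is a small sketch in plain Python; the helper names `rotation_matrix` and `det3` are assumptions made for illustration, and the axis is normalized inside the function as a convenience.

```python
import math

def rotation_matrix(k, theta):
    """3x3 rotation by angle theta about axis k (normalized here to unit length)."""
    kx, ky, kz = k
    norm = math.sqrt(kx * kx + ky * ky + kz * kz)
    kx, ky, kz = kx / norm, ky / norm, kz / norm
    s, c = math.sin(theta), math.cos(theta)
    v = 1.0 - c
    return [
        [kx * kx * v + c,      kx * ky * v - kz * s, kx * kz * v + ky * s],
        [kx * ky * v + kz * s, ky * ky * v + c,      ky * kz * v - kx * s],
        [kx * kz * v - ky * s, ky * kz * v + kx * s, kz * kz * v + c],
    ]

def det3(R):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (R[0][0] * (R[1][1] * R[2][2] - R[1][2] * R[2][1])
            - R[0][1] * (R[1][0] * R[2][2] - R[1][2] * R[2][0])
            + R[0][2] * (R[1][0] * R[2][1] - R[1][1] * R[2][0]))

# Rotation by 90 degrees about z: the x axis should map onto the y axis.
R_z90 = rotation_matrix((0.0, 0.0, 1.0), math.pi / 2)
first_column = [R_z90[row][0] for row in range(3)]  # image of (1, 0, 0)
print(first_column, det3(R_z90))
```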
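Homogeneous Coordinate Matrix in 3-D
A point $(x, y, z)$ is mapped to $(x', y', z')$:
$$\begin{pmatrix} x' \\ y' \\ z' \\ 1 \end{pmatrix} = \begin{pmatrix} i_1 & j_1 & k_1 & t_x \\ i_2 & j_2 & k_2 & t_y \\ i_3 & j_3 & k_3 & t_z \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}$$
Composition of two transforms represented by matrices $T_1$ and $T_2$: $T_2 T_1$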
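Because composition is a matrix product, the order of the factors matters: $T_2 T_1$ applies $T_1$ first. The sketch below illustrates this with 4×4 matrices in plain Python; the helpers `matmul`, `translation`, `rotation_z`, and `apply` are hypothetical names chosen for this example.

```python
import math

def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(tx, ty, tz):
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0, 0.0],
            [s,  c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def apply(T, p):
    """Apply a 4x4 homogeneous transform to a 3-D point."""
    h = (p[0], p[1], p[2], 1.0)
    return tuple(sum(T[i][k] * h[k] for k in range(4)) for i in range(3))

T1 = rotation_z(math.pi / 2)      # rotate 90 degrees about z
T2 = translation(1.0, 0.0, 0.0)   # shift by one unit along x

rotate_then_shift = apply(matmul(T2, T1), (1.0, 0.0, 0.0))  # about (1, 1, 0)
shift_then_rotate = apply(matmul(T1, T2), (1.0, 0.0, 0.0))  # about (0, 2, 0)
print(rotate_then_shift, shift_then_rotate)
```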
Building a Serial Linkage Model Rigid bodies are: atoms (spheres), or groups of atoms
Building a Serial Linkage Model
1. Build the assembly of the first 3 atoms:
a. Place 1st atom anywhere in space
b. Place 2nd atom anywhere at bond length
Bond Length
Building a Serial Linkage Model
1. Build the assembly of the first 3 atoms:
a. Place 1st atom anywhere in space
b. Place 2nd atom anywhere at bond length
c. Place 3rd atom anywhere at bond length with bond angle
Bond angle
Coordinate Frame [figure: local x, y, z axes; atom labels -2 and 0]
Building a Serial Linkage Model
1. Build the assembly of the first 3 atoms:
a. Place 1st atom anywhere in space
b. Place 2nd atom anywhere at bond length
c. Place 3rd atom anywhere at bond length with bond angle
2. Introduce each additional atom in the sequence one at a time
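Transform $T_{i+1}$
$T_{i+1}$ relates the frame of atom $i+1$ to the frame of atom $i$ (atoms $i-2$, $i-1$, $i$, $i+1$ along the chain). It combines the torsional (dihedral) angle $\tau$, the bond angle $\beta$, and the bond length $d$:
$$T_{i+1} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & c_\tau & -s_\tau & 0 \\ 0 & s_\tau & c_\tau & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} c_\beta & -s_\beta & 0 & 0 \\ s_\beta & c_\beta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & d \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
with $c_\tau = \cos\tau$, $s_\tau = \sin\tau$, $c_\beta = \cos\beta$, $s_\beta = \sin\beta$. The first factor is a rotation about the x axis (torsional angle), the second a rotation about the z axis (bond angle), and the third a translation along the x axis (bond length).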
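The serial construction (place the first atoms, then add one atom at a time) can be sketched by chaining one transform per bond. The code below assumes the factorization $T_{i+1} = R_x(\tau)\,R_z(\beta)\,\mathrm{Trans}_x(d)$ and the convention that each transform maps child-frame coordinates into the parent frame; the names `link_transform` and `build_chain` are hypothetical. Under these assumptions $\beta$ acts as the turn angle between successive bond directions, so an interior bond angle $\alpha$ corresponds to $\beta = \pi - \alpha$ in this sketch.

```python
import math

def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def link_transform(d, beta, tau):
    """Assumed form: rotation about x by tau, rotation about z by beta, shift d along x."""
    cb, sb = math.cos(beta), math.sin(beta)
    ct, st = math.cos(tau), math.sin(tau)
    rot_x = [[1.0, 0.0, 0.0, 0.0], [0.0, ct, -st, 0.0],
             [0.0, st, ct, 0.0], [0.0, 0.0, 0.0, 1.0]]
    rot_z = [[cb, -sb, 0.0, 0.0], [sb, cb, 0.0, 0.0],
             [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
    shift = [[1.0, 0.0, 0.0, d], [0.0, 1.0, 0.0, 0.0],
             [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
    return matmul(matmul(rot_x, rot_z), shift)

def build_chain(params):
    """params: list of (d, beta, tau), one per bond. Returns atom positions
    expressed in the frame of the first atom."""
    frame = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    positions = [(0.0, 0.0, 0.0)]
    for d, beta, tau in params:
        frame = matmul(frame, link_transform(d, beta, tau))
        # The translation column of the accumulated frame is the atom position.
        positions.append((frame[0][3], frame[1][3], frame[2][3]))
    return positions

# Hypothetical chain: bond length 1.5, interior bond angle 110 degrees.
turn = math.pi - math.radians(110.0)
atoms = build_chain([(1.5, turn, 0.0),
                     (1.5, turn, math.radians(60.0)),
                     (1.5, turn, math.radians(-120.0))])
for atom in atoms:
    print(atom)
```

Changing any one `tau` moves every atom after that bond, which is the "small local change, big global effect" behaviour of linkage models.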
Readings:
J.J. Craig. Introduction to Robotics. Addison-Wesley, Reading, MA.
Zhang, M. and Kavraki, L. E. A New Method for Fast and Accurate Derivation of Molecular Conformations. Journal of Chemical Information and Computer Sciences, 42(1):64–70. comp-mole-conform.pdf
Serial Linkage Model [figure: successive transforms $T_1$, $T_2$ along the chain]
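Relative Position of Two Atoms $i$ and $k$
$$T_k^{(i)} = T_{i+1}\, T_{i+2} \cdots T_k$$
gives the position of atom $k$ in the frame of atom $i$, where each $T_m$ maps coordinates in the frame of atom $m$ to the frame of atom $m-1$. The factors are written in this order so that the product agrees with the decomposition $T_k^{(i)} = T_j^{(i)}\, T_{j+1}\, T_k^{(j+1)}$.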
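Update
$$T_k^{(i)} = T_{i+1}\, T_{i+2} \cdots T_k$$
Atom $j$ between $i$ and $k$:
$$T_k^{(i)} = T_j^{(i)}\, T_{j+1}\, T_k^{(j+1)}$$
When a parameter between $j$ and $j+1$ is changed, $T_{j+1}$ becomes $T'_{j+1}$ and $T_k^{(i)}$ becomes
$$T_k'^{(i)} = T_j^{(i)}\, T'_{j+1}\, T_k^{(j+1)}$$
so only the middle factor has to be recomputed.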
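The update rule suggests caching the products on either side of the changed bond. The sketch below assumes each $T_m$ maps frame-$m$ coordinates to frame $m-1$, so the chain product is written with ascending indices from left to right; the toy `link` transform and all variable names are illustrative assumptions rather than the course's implementation.

```python
import math
from functools import reduce

def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def identity():
    return [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

def link(d, angle):
    """Toy link: rotation about z by `angle` combined with a shift of d along the new x axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0, d * c],
            [s,  c, 0.0, d * s],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def chain_product(transforms):
    """Multiply the transforms left to right, starting from the identity."""
    return reduce(matmul, transforms, identity())

links = [link(1.5, 0.1 * m) for m in range(1, 9)]  # T_1 ... T_8, stored 0-based
j = 4                                              # the changed transform is T_{j+1} = links[j]

prefix = chain_product(links[:j])      # T_1 ... T_j, cached
suffix = chain_product(links[j + 1:])  # T_{j+2} ... T_8, cached

new_link = link(1.5, 1.0)              # T'_{j+1} after a parameter change
updated = matmul(matmul(prefix, new_link), suffix)  # two products instead of seven

# Reference: recompute the whole chain from scratch.
recomputed = chain_product(links[:j] + [new_link] + links[j + 1:])
max_diff = max(abs(updated[r][c] - recomputed[r][c])
               for r in range(4) for c in range(4))
print(max_diff)
```

The two results differ only by floating-point rounding, since matrix multiplication is associative.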
Tree-Shaped Linkage: root group of 3 atoms. $p$ atoms give $3p - 6$ parameters. Why?
Tree-Shaped Linkage: root group of 3 atoms. $p$ atoms give $3p - 6$ parameters. The transform $T_0$ places the root group in the world coordinate system.
Simplified Linkage Model
In physiological conditions:
Bond lengths are assumed constant [they depend on the "type" of bond, e.g., single C-C or double C=C, and vary from 1.0 Å (C-H) to 1.5 Å (C-C)]
Bond angles are assumed constant [~120°]
Only some torsional (dihedral) angles may vary
Fewer parameters: from $3p - 6$ down to $p - 3$
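The parameter counts can be verified by tallying internal coordinates. The worked count below is a sketch for a serial chain of $p$ atoms; the value $p = 10$ is only an illustration.

```latex
% Internal coordinates of a serial chain of p atoms
\underbrace{(p-1)}_{\text{bond lengths}}
+ \underbrace{(p-2)}_{\text{bond angles}}
+ \underbrace{(p-3)}_{\text{torsional angles}}
= 3p - 6

% Same count from Cartesian coordinates: remove the 6 rigid-body
% degrees of freedom (3 translations + 3 rotations)
3p - 6 = 3p - (3 + 3)

% With bond lengths and bond angles held constant, only torsions remain
(3p - 6) - (p-1) - (p-2) = p - 3

% Example: p = 10 gives 30 Cartesian coordinates, 24 internal
% parameters, and at most 7 variable torsional angles
```

Since only some torsional angles may vary, $p - 3$ is an upper bound on the number of free parameters.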
Bond Lengths and Angles in a Protein [figure: backbone atoms N, Cα, C with labeled bond lengths and angles; one labeled distance is 3.8 Å]
Linkage Model [figure labels: peptide group, side-chain group]
Convention for φ-ψ Angles
φ is defined as the dihedral angle composed of atoms C(i-1)–N(i)–Cα(i)–C(i)
If all atoms are coplanar: [figure: the two coplanar arrangements of C, N, Cα, C]
Sign of φ: Use the right-hand rule. With the right thumb pointing along the central bond (N-Cα), a rotation along the curled fingers is positive
Same convention for ψ
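A dihedral angle with this sign convention can be computed from the four atom positions. The sketch below uses the common `atan2` formulation in plain Python; the function name `dihedral` and the coordinates are hypothetical and chosen so that the central bond lies along the z axis.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p1, p2, p3, p4):
    """Signed dihedral angle in degrees for atoms p1-p2-p3-p4.
    Positive when p4 is rotated from p1 following the right-hand rule
    about the central bond direction p2 -> p3."""
    b1, b2, b3 = sub(p2, p1), sub(p3, p2), sub(p4, p3)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Hypothetical coordinates: central bond (standing in for N -> C-alpha) along +z.
c_prev = (1.0, 0.0, 0.0)
n_atom = (0.0, 0.0, 0.0)
ca_atom = (0.0, 0.0, 1.0)
a = math.radians(60.0)
c_atom = (math.cos(a), math.sin(a), 1.0)

phi = dihedral(c_prev, n_atom, ca_atom, c_atom)  # about +60 degrees
print(phi)
```

In the two coplanar arrangements the same function gives 0° when the outer atoms lie on the same side of the central bond and ±180° when they lie on opposite sides.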
Ramachandran Maps: They assign probabilities to φ-ψ pairs based on frequencies in known folded structures [figure: map with axes φ and ψ]
Linkage Model of Protein
The sequence of N-Cα-C-… atoms is the backbone (or main chain)
Rotatable bonds along the backbone define the φ-ψ torsional degrees of freedom
Small side-chains with χ degree of freedom
Side Chains with Multiple Torsional Degrees of Freedom (χ angles): 0 to 4 χ angles: χ1, ..., χ4
Kinematic Models of Bio-Molecules
Atomistic model: The position of each atom is defined by its coordinates in 3-D space. Drawback: Fixed bond lengths/angles are encoded as additional constraints. More parameters.
Linkage model: The kinematics is defined by internal parameters (bond lengths and angles, and torsional angles around bonds). Drawback: Small local changes may have big global effects. Errors accumulate. Forces are more difficult to express.
Simplified (φ-ψ-χ) linkage model: Fixed bond lengths, bond angles and torsional angles are directly embedded in the representation. Drawback: Fine tuning is difficult.
In a linkage model, a small local change may have a big global effect. Computational errors may accumulate.
Drawback of Homogeneous Coordinate Matrix
$$\begin{pmatrix} x' \\ y' \\ z' \\ 1 \end{pmatrix} = \begin{pmatrix} i_1 & j_1 & k_1 & t_x \\ i_2 & j_2 & k_2 & t_y \\ i_3 & j_3 & k_3 & t_z \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix}$$
Too many rotation parameters
Accumulation of computing errors along a protein backbone and repeated computation
Non-redundant 3-parameter representations of rotations have many problems: singularities, no simple algebra
A useful, less redundant representation of rotation is the unit quaternion
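Unit Quaternion
For a rotation by angle $\theta$ about the unit axis $r = (r_1, r_2, r_3)$:
$$R(r, \theta) = \left(\cos\tfrac{\theta}{2},\; r_1 \sin\tfrac{\theta}{2},\; r_2 \sin\tfrac{\theta}{2},\; r_3 \sin\tfrac{\theta}{2}\right) = \cos\tfrac{\theta}{2} + r \sin\tfrac{\theta}{2}$$
$R(r, \theta)$ and $R(r, \theta + 2\pi) = -R(r, \theta)$ represent the same rotation.
Space of unit quaternions: Unit 3-sphere in 4-D space with antipodal points identified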
Operations on Quaternions
$P = p_0 + p$, $Q = q_0 + q$
Product $R = r_0 + r = PQ$:
$$r_0 = p_0 q_0 - p \cdot q \qquad \text{("}\cdot\text{" denotes the inner product)}$$
$$r = p_0 q + q_0 p + p \times q \qquad \text{("}\times\text{" denotes the cross product)}$$
Conjugate of $P$: $P^* = p_0 - p$
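The product and conjugate rules map directly onto a few lines of code. In this sketch a quaternion is stored as a 4-tuple (scalar part first); the names `qmul` and `qconj` are illustrative assumptions.

```python
def qmul(P, Q):
    """Quaternion product: r0 = p0*q0 - p.q,  r = p0*q + q0*p + p x q."""
    p0, p = P[0], P[1:]
    q0, q = Q[0], Q[1:]
    r0 = p0 * q0 - (p[0] * q[0] + p[1] * q[1] + p[2] * q[2])
    cross = (p[1] * q[2] - p[2] * q[1],
             p[2] * q[0] - p[0] * q[2],
             p[0] * q[1] - p[1] * q[0])
    r = tuple(p0 * q[m] + q0 * p[m] + cross[m] for m in range(3))
    return (r0,) + r

def qconj(P):
    """Conjugate: keep the scalar part, negate the vector part."""
    return (P[0], -P[1], -P[2], -P[3])

i_unit = (0.0, 1.0, 0.0, 0.0)
j_unit = (0.0, 0.0, 1.0, 0.0)

ij = qmul(i_unit, j_unit)  # equals the unit k: (0, 0, 0, 1)
ji = qmul(j_unit, i_unit)  # equals minus k: the product is not commutative
print(ij, ji)
```

For a unit quaternion, multiplying by its conjugate gives the identity quaternion, so the conjugate plays the role of the inverse rotation.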
Transformation of a Point
Point $x = (x, y, z)$ is represented by the quaternion $0 + x$
Transform: translation $t = (t_x, t_y, t_z)$ and rotation $(n, \theta)$
The transform of $x$ is $x'$, given by:
$$0 + x' = R(n, \theta)\,(0 + x)\,R^*(n, \theta) + (0 + t)$$
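The point-transform formula can be exercised end to end: build the unit quaternion for $(n, \theta)$, sandwich the point between it and its conjugate, then add the translation. This is a minimal sketch in plain Python; all function names and the sample values are assumptions made for illustration.

```python
import math

def qmul(P, Q):
    """Quaternion product with quaternions stored as (scalar, x, y, z)."""
    p0, p1, p2, p3 = P
    q0, q1, q2, q3 = Q
    return (p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,
            p0 * q1 + q0 * p1 + (p2 * q3 - p3 * q2),
            p0 * q2 + q0 * p2 + (p3 * q1 - p1 * q3),
            p0 * q3 + q0 * p3 + (p1 * q2 - p2 * q1))

def qconj(P):
    return (P[0], -P[1], -P[2], -P[3])

def rotation_quaternion(n, theta):
    """Unit quaternion cos(theta/2) + n sin(theta/2); n is normalized here."""
    norm = math.sqrt(n[0] ** 2 + n[1] ** 2 + n[2] ** 2)
    half = theta / 2.0
    s = math.sin(half) / norm
    return (math.cos(half), n[0] * s, n[1] * s, n[2] * s)

def transform_point(x, n, theta, t):
    """Compute x' from 0 + x' = R (0 + x) R* + (0 + t)."""
    R = rotation_quaternion(n, theta)
    X = (0.0, x[0], x[1], x[2])
    rotated = qmul(qmul(R, X), qconj(R))
    return tuple(rotated[m + 1] + t[m] for m in range(3))

# Hypothetical example: rotate (1, 0, 0) by 90 degrees about z, then shift by (1, 0, 0).
moved_point = transform_point((1.0, 0.0, 0.0), (0.0, 0.0, 1.0),
                              math.pi / 2, (1.0, 0.0, 0.0))
print(moved_point)  # close to (1, 1, 0)
```

Only four numbers describe the rotation, and renormalizing the quaternion after repeated products is cheaper than re-orthogonalizing a 3×3 matrix, which is one reason the representation is attractive along a long backbone.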