Lecture 3.3 Superposition & Threading. Gary Van Domselaar, University of Alberta. Slides adapted from David Wishart.

1 Lecture 3.3 Superposition & Threading. Gary Van Domselaar, University of Alberta, gary.vandomselaar@ualberta.ca. Slides adapted from David Wishart.

2 Outline: Vectors, matrices and other geometry issues. General superposition concepts. Threading and threading methods.

3 Vectors Define Bonds and Atomic Positions [Figure: x, y, z axes with the origin marked and a CO bond drawn as a vector]

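4 Review - Vectors: Vectors have a length and a direction. The vector u from (0,0,0) to (1,2,1) is $\mathbf{u} = 1\hat{\imath} + 2\hat{\jmath} + 1\hat{k}$, or as a column $\mathbf{u} = (1, 2, 1)^T$. Its length is $|\mathbf{u}| = \sqrt{(1-0)^2 + (2-0)^2 + (1-0)^2} = \sqrt{6}$. [Figure: u drawn on x, y, z axes]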
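
As a quick check of the length calculation on this slide, here is a minimal Python sketch. The helper names `vector_between` and `length` are illustrative and not part of the lecture.

```python
import math

def vector_between(start, end):
    """Vector pointing from `start` to `end` (e.g. along a bond)."""
    return tuple(e - s for s, e in zip(start, end))

def length(v):
    """Euclidean length: square root of the sum of squared components."""
    return math.sqrt(sum(c * c for c in v))

u = vector_between((0, 0, 0), (1, 2, 1))  # the slide's vector u
u_length = length(u)                      # sqrt(1 + 4 + 1) = sqrt(6)
```

The same two helpers would give a bond vector and bond length when `start` and `end` are atomic coordinates.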

5 Review - Vectors: Vectors can be added together. Vectors can be subtracted. Vectors can be multiplied (dot or cross or by a matrix). Vectors can be transformed (resized). Vectors can be translated. Vectors can be rotated.

6 Matrices: A matrix is a table or “array” of characters. A matrix is also called a tensor of “rank 2”. Example, a 5 x 6 matrix (5 rows, 6 columns):
    2 4 6 8 9 4
    1 3 5 7 9 3
    1 0 1 0 1 0
    9 4 6 4 3 5
    3 4 3 4 3 4

7 Different Types of Matrices:
A square matrix:
    2 4 6 8 9 4
    1 3 5 7 9 3
    1 0 1 0 1 0
    9 4 6 4 3 5
    3 4 3 4 3 4
    3 6 7 9 1 0
A symmetric matrix:
    2 4 6 8 9 4
    4 3 5 7 9 3
    6 5 1 0 1 0
    8 7 0 4 3 5
    9 9 1 3 3 4
    4 3 0 5 4 0
A column matrix (a vector):
    1
    3
    5
    9
    7
    3

8 Different Types of Matrices: A rectangular matrix (entries A through X). A rotation matrix: $\begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}$. A row matrix (a vector): (2 4 6 8 9).

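9 Review - Matrix Multiplication: each entry of the product is a row of the first matrix multiplied term by term with a column of the second, then summed.
$$\begin{pmatrix} 2 & 4 & 0 \\ 1 & 3 & 1 \\ 1 & 0 & 0 \end{pmatrix} \times \begin{pmatrix} 1 & 0 & 2 \\ 2 & 1 & 3 \\ 0 & 1 & 0 \end{pmatrix} = \begin{pmatrix} 2\cdot1 + 4\cdot2 + 0\cdot0 & 2\cdot0 + 4\cdot1 + 0\cdot1 & 2\cdot2 + 4\cdot3 + 0\cdot0 \\ 1\cdot1 + 3\cdot2 + 1\cdot0 & 1\cdot0 + 3\cdot1 + 1\cdot1 & 1\cdot2 + 3\cdot3 + 1\cdot0 \\ 1\cdot1 + 0\cdot2 + 0\cdot0 & 1\cdot0 + 0\cdot1 + 0\cdot1 & 1\cdot2 + 0\cdot3 + 0\cdot0 \end{pmatrix} = \begin{pmatrix} 10 & 4 & 16 \\ 7 & 4 & 11 \\ 1 & 0 & 2 \end{pmatrix}$$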
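
The row-times-column rule can be checked with a few lines of Python. This is a minimal sketch using plain nested lists; `matmul` is an illustrative helper, not something the lecture defines.

```python
def matmul(a, b):
    """Multiply matrices given as lists of rows: C[i][j] = sum_k A[i][k] * B[k][j]."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(len(b[0]))]
            for i in range(len(a))]

A = [[2, 4, 0],
     [1, 3, 1],
     [1, 0, 0]]
B = [[1, 0, 2],
     [2, 1, 3],
     [0, 1, 0]]
C = matmul(A, B)  # [[10, 4, 16], [7, 4, 11], [1, 0, 2]]
```

Note that the bottom-right entry is 1x2 + 0x3 + 0x0 = 2.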

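10 Rotation: Rotate about x: $R_x(\theta) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix}$. Rotate about z: $R_z(\theta) = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}$. [Figure: x, y, z axes with the rotation angles marked]

11 Rotation: Clockwise about x: $\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix}$. Clockwise about z: $\begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}$. Counterclockwise about x: $\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}$. Counterclockwise about z: $\begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}$.

12 Rotation: the rotated coordinates are the rotation matrix applied to the original coordinates: $\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}$

13 Rotation (Detail): for the point (1, 1, 1): $\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ \cos\theta + \sin\theta \\ -\sin\theta + \cos\theta \end{pmatrix}$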
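
The rotation slides can be tried out numerically. The sketch below assumes the slides' sign convention for a clockwise rotation and uses θ as a generic angle name; `rot_x`, `rot_z`, `apply` and `transpose` are illustrative helpers.

```python
import math

def rot_x(theta):
    """Clockwise rotation about x (the slides' convention)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0],
            [0, c, s],
            [0, -s, c]]

def rot_z(theta):
    """Clockwise rotation about z (the slides' convention)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0],
            [-s, c, 0],
            [0, 0, 1]]

def apply(m, v):
    """Multiply a 3x3 matrix by a column vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def transpose(m):
    """The transpose of a rotation matrix is the opposite (counterclockwise) rotation."""
    return [list(row) for row in zip(*m)]

theta = math.pi / 2
rotated = apply(rot_x(theta), [1, 1, 1])             # (1, cos+sin, -sin+cos) = (1, 1, -1)
restored = apply(transpose(rot_x(theta)), rotated)   # back to (1, 1, 1)
```

Undoing a rotation with the transposed matrix is what makes it possible to return to an original frame later in the lecture.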

14 Superposition: Objective is to match or overlay 2 or more similar objects. Requires use of translation and rotation operators (matrices/vectors). Recall that every three-dimensional object can be represented by a plane defined by 3 points.

15 Superposition: Identify 3 “equivalence” points in the objects to be aligned (a, b, c in one object and a’, b’, c’ in the other).

16 Superposition: Translate points a,b,c and a’,b’,c’ to the origin.

17 Superposition: Rotate the a,b,c plane clockwise by φ about the x axis.

18 Superposition: Rotate the a,b,c plane clockwise by θ about the z axis.

19 Superposition: Rotate the a,b,c plane clockwise by ψ about the x axis.

20 Superposition: Rotate the a’,b’,c’ plane anticlockwise by φ’ about the x axis.

21 Superposition: Rotate the a’,b’,c’ plane anticlockwise by θ’ about the z axis.

22 Superposition: Rotate the a’,b’,c’ plane clockwise by ψ’ about the x axis.

23 Superposition: Apply all rotations and translations to the remaining points.
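
The three-point procedure above can be sketched in code. Note the substitution: instead of chaining three axis rotations per object as the slides do, this sketch builds an orthonormal frame from each triple of equivalence points and maps one frame onto the other, which yields the same net rigid-body transform. All function names are illustrative, and the three points are assumed not to be collinear.

```python
import math

def sub(p, q):
    return [a - b for a, b in zip(p, q)]

def add(p, q):
    return [a + b for a, b in zip(p, q)]

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

def cross(p, q):
    return [p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0]]

def unit(v):
    n = math.sqrt(dot(v, v))  # zero for degenerate input (assumed not to occur)
    return [c / n for c in v]

def frame(a, b, c):
    """Orthonormal axes attached to the plane through a, b, c (a is the origin)."""
    e1 = unit(sub(b, a))                 # along a -> b
    e3 = unit(cross(e1, sub(c, a)))      # normal to the a,b,c plane
    e2 = cross(e3, e1)                   # completes a right-handed set
    return [e1, e2, e3]

def superpose(ref_pts, mob_pts, points):
    """Move `points` of the mobile object so that its a',b',c' land on the reference a,b,c."""
    f_ref, f_mob = frame(*ref_pts), frame(*mob_pts)
    moved = []
    for p in points:
        d = sub(p, mob_pts[0])                       # translate a' to the origin
        local = [dot(axis, d) for axis in f_mob]     # coordinates in the mobile frame
        rebuilt = [sum(local[k] * f_ref[k][i] for k in range(3)) for i in range(3)]
        moved.append(add(rebuilt, ref_pts[0]))       # place in the reference frame
    return moved

ref = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]              # a, b, c
mob = [(5, 5, 5), (5, 6, 5), (4, 5, 5)]              # a', b', c' (rotated and shifted copy)
moved = superpose(ref, mob, [(4, 6, 6)] + mob)       # a "remaining" point plus a', b', c'
```

Unlike the slides, which leave both objects at the origin in a common plane, this version places the mobile object directly in the reference object's frame.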

24 Superposition: Before and after [Figure: the two objects before and after superposition on x, y, z axes]

25 Returning to the “red” frame: Before and after [Figure: the superimposed objects moved back to the red object's original position]

26 Returning to the “red” frame: Begin with the superimposed structures on the x-y plane. Apply the counterclockwise rotation for each of the three angles in turn, last angle first. Apply the red translation to the red origin. Just do things in reverse order!

27 Superposition - Applications: Ideal for comparing or overlaying two or more protein structures. Allows identification of structural homologues (CATH and SCOP). Allows loops to be inserted or replaced from loop libraries (comparative modelling). Allows side chains to be replaced or inserted with relative ease.

28 Side Chain Placement: SCWRL (http://www.fccc.edu/research/labs/dunbrack/scwrl/)

29 Amino Acid Side Chains [Figure: chemical structure of an amino acid with its side chain]

30-34 Adding a Side Chain [Figure sequence over five slides: a side chain is positioned onto a backbone, drawn on x, y, z axes]

35 Superposition: The concept of superposition is key to many aspects of protein structure generation and comparison. Superposition may be used to insert side chains and loops (for homology models). Side chains require more consideration, as side chain packing ultimately determines the 3D structure of proteins.

36 Superposition - RMSD: The degree of similarity between two or more structures is described by their root mean square deviation (RMSD). [Figure: two superimposed chains with equivalent points x1...x5 and y1...y5]
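
The formula itself did not survive in this transcript. The standard definition for N pairs of equivalent points x_i and y_i is RMSD = sqrt((1/N) Σ |x_i − y_i|²). A minimal sketch, assuming the two structures are already superimposed and given as equal-length lists of (x, y, z) tuples:

```python
import math

def rmsd(xs, ys):
    """Root mean square deviation between paired 3D points."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(sum((a - b) ** 2 for a, b in zip(x, y)) for x, y in zip(xs, ys))
    return math.sqrt(squared / len(xs))

chain_x = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
chain_y = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]  # same chain shifted 1 unit in z
value = rmsd(chain_x, chain_y)  # every pair is 1 unit apart, so the RMSD is 1.0
```

In practice the value depends on which atoms are paired (for example, alpha carbons only) and on the superposition applied beforehand.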

37 Superposition Software: Swiss PDB Viewer. Aligns 2 homologous structures.

38 Superposition Software: CE, Structure Comparison by Combinatorial Extension (http://cl.sdsc.edu/ce.html). Superposition for 2 chains and for multiple chains (new).

39 Superposition Software: SuperPose (http://wishart.biology.ualberta.ca/SuperPose/). Superposition for 2 chains and for multiple chains. Subdomain superposition. Superposition of structures with low sequence identity.

40 Definition: Threading - A protein fold recognition technique that involves incrementally replacing the sequence of a known protein structure with a query sequence of unknown structure. The new “model” structure is evaluated using a simple heuristic measure of protein fold quality. The process is repeated against all known 3D structures until an optimal fit is found.

41 Why Threading? Secondary structure is more conserved than primary structure. Tertiary structure is more conserved than secondary structure. Therefore very remote relationships can be better detected through 2° or 3° structural homology instead of sequence homology.

42-45 Visualizing Threading [Animation over four slides: the letters T H R E A D are shown above the sequence THREADINGSEQNCEECNQESGNI ERHTHREADINGSEQNCETHREAD GSEQNCEQCQESGIDAERTHR... and are consumed two at a time, frame by frame]

46 Visualizing Threading: THREAD..SEQNCEECN..THREAD..SEQNCEECN..

47 Threading: Database of 3D structures and sequences: Protein Data Bank (or non-redundant subset). Query sequence: sequence < 25% identity to known structures. Alignment protocol: dynamic programming. Evaluation protocol: distance-based potential or secondary structure. Ranking protocol.

48 2 Kinds of Threading: 2D Threading or Prediction Based Methods (PBM): predict secondary structure (SS) or ASA of query; evaluate on basis of SS and/or ASA matches. 3D Threading or Distance Based Methods (DBM): create a 3D model of the structure; evaluate using a distance-based “hydrophobicity” or pseudo-thermodynamic potential.

49 2D Threading Algorithm: Convert PDB to a database containing sequence, SS and ASA information. Predict the SS and ASA for the query sequence using a “high-end” algorithm. Perform a dynamic programming alignment using the query against the database (include sequence, SS & ASA). Rank the alignments and select the most probable fold.

50 Database Conversion:
>Protein1
THREADINGSEQNCEECNQESGNI
HHHHHHCCCCEEEEECCCHHHHHH
ERHTHREADINGSEQNCETHREAD
HHCCEEEEECCCCCHHHHHHHHHH
>Protein2
QWETRYEWQEDFSHAECNQESGNI
EEEEECCCCHHHHHHHHHHHHHHH
YTREWQHGFDSASQWETRA
CCCCEEEEECCCEEEEECC
>Protein3
LKHGMNSNWEDFSHAECNQESG
EEECCEEEECCCEEECCCCCCC
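
A database in this layout is simple to read back. The sketch below assumes, based only on the example on this slide, that each `>` header is followed by alternating sequence lines and secondary-structure lines of equal length; `parse_database` is a hypothetical helper, not a tool from the lecture.

```python
def parse_database(text):
    """Parse '>name' records whose body alternates sequence and SS lines."""
    records = {}
    name = None
    pending_seq = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = {"seq": "", "ss": ""}
            pending_seq = None
        elif name is None:
            raise ValueError("data before the first '>' header")
        elif pending_seq is None:
            pending_seq = line                      # sequence line; wait for its SS line
        else:
            if len(line) != len(pending_seq):
                raise ValueError("sequence and SS lines differ in length")
            records[name]["seq"] += pending_seq
            records[name]["ss"] += line
            pending_seq = None
    return records

example = """>Protein1
THREADINGSEQNCEECNQESGNI
HHHHHHCCCCEEEEECCCHHHHHH
ERHTHREADINGSEQNCETHREAD
HHCCEEEEECCCCCHHHHHHHHHH
>Protein3
LKHGMNSNWEDFSHAECNQESG
EEECCEEEECCCEEECCCCCCC
"""
db = parse_database(example)
```

Each record then holds one residue string and one SS string of the same length, which is the form a per-residue alignment needs.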

51 Secondary Structure Table [Table not preserved in the transcript]

52 2° Structure Identification: DSSP - Database of Secondary Structures for Proteins (swift.embl-heidelberg.de/dssp). VADAR - Volume Area Dihedral Angle Reporter (redpoll.pharmacy.ualberta.ca). PDB - Protein Data Bank (www.rcsb.org). Example:
QHTAWCLTSEQHTAAVIWDCETPGKQNGAYQEDCA
HHHHHHCCEEEEEEEEEEECCHHHHHHHCCCCCCC

53 Accessible Surface Area [Figure: a solvent probe on a molecule, with the accessible surface, van der Waals surface and reentrant surface labelled]

54 ASA Calculation: DSSP - Database of Secondary Structures for Proteins (swift.embl-heidelberg.de/dssp). VADAR - Volume Area Dihedral Angle Reporter (www.redpoll.pharmacy.ualberta.ca/vadar/). GetArea - www.scsb.utmb.edu/getarea/area_form.html. Example:
QHTAWCLTSEQHTAAVIWDCETPGKQNGAYQEDCAMD
BBPPBEEEEEPBPBPBPBBPEEEPBPEPEEEEEEEEE
1056298799415251510478941496989999999
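
The example on this slide pairs each residue with a digit and a letter. Reading the digits as relative accessibility in tenths and the letters as buried (B), partly exposed (P) and exposed (E) is an assumption, and the cut-offs below were inferred from the example alone: 0-2 map to B, 4-6 to P and 7-9 to E. The digit 3 never occurs in the example, so assigning it to B is a guess.

```python
def asa_state(digit):
    """Map a one-digit accessibility value to a three-state label."""
    if digit <= 3:      # 0-2 observed as B; 3 is an assumption
        return "B"
    if digit <= 6:      # 4-6 observed as P
        return "P"
    return "E"          # 7-9 observed as E

def classify(digits):
    """Convert a string of digits into a string of B/P/E states."""
    return "".join(asa_state(int(d)) for d in digits)

digits = "1056298799415251510478941496989999999"
states = classify(digits)
```

With these cut-offs, `states` reproduces the B/P/E line shown on the slide.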

55 Other ASA sites: Connolly Molecular Surface Home Page (http://www.biohedron.com/). Naccess Home Page (http://sjh.bi.umist.ac.uk/naccess.html). ASA Parallelization (http://cmag.cit.nih.gov/Asa.htm). Protein Structure Database (http://www.psc.edu/biomed/pages/research/PSdb/).

56 2D Threading Algorithm: Convert PDB to a database containing sequence, SS and ASA information. Predict the SS and ASA for the query sequence using a “high-end” algorithm. Perform a dynamic programming alignment using the query against the database (include sequence, SS & ASA). Rank the alignments and select the most probable fold.

57 ASA Prediction: PredictProtein-PHDacc (58%) (http://cubic.bioc.columbia.edu/predictprotein). PredAcc (70%?) (condor.urbb.jussieu.fr/PredAccCfg.html). Example:
QHTAWCLTSEQHTAAVIW
BBPPBEEEEEPBPBPBPB

58 2D Threading Algorithm: Convert PDB to a database containing sequence, SS and ASA information. Predict the SS and ASA for the query sequence using a “high-end” algorithm. Perform a dynamic programming alignment using the query against the database (include sequence, SS & ASA). Rank the alignments and select the most probable fold.

59 [Two dynamic programming tables for GENETICS (columns) against GENESIS (rows): an initial match matrix in which identical residues score 10 and all other cells 0, and the summed score matrix. Most cell values are not preserved in the transcript.]

60 $S_{ij}$ (Identity Matrix):
    A C D E F G H I K L M N P Q R S T V W Y
A   1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
C   0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
D   0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
E   0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
F   0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
G   0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
H   0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
I   0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
K   0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
L   0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0
M   0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0
N   0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0
P   0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0
Q   0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
R   0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
S   0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
T   0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
V   0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
W   0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
Y   0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1

61 [Animation: the score matrix for AATVD (columns) against AVVD (rows) is filled one cell at a time.] Last frame shown:
    A A T V D
A   1 1 0 0 0
V   0 1 1 2
V
D

62 A Simple Example... Completed score matrix:
    A A T V D
A   1 1 0 0 0
V   0 1 1 2 1
V   0 1 1 2 2
D   0 1 1 1 3
Alignments shown:
    AATVD
    A-VVD

    AATVD
    AVVD (gap position not preserved in the transcript)

    AATVD
    AV-VD
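
The score matrix on this slide can be regenerated with a short dynamic programming routine. The recurrence used here is an assumption chosen because it reproduces the table: each cell is its own match score (1 for identical residues, 0 otherwise, as in the identity matrix) plus the best value among the cells of the previous row to its left and the cells of the previous column above it, with no gap penalty. `fill_matrix` is an illustrative name.

```python
def fill_matrix(rows, cols, score):
    """Fill a forward-summed score matrix; score(i, j) is the match score for cell (i, j)."""
    n, m = len(rows), len(cols)
    h = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            # Allowed predecessors: previous row, any earlier column ...
            prev = [h[i - 1][k] for k in range(j)] if i > 0 else []
            # ... and previous column, any earlier row (no gap penalty).
            prev += [h[k][j - 1] for k in range(i)] if j > 0 else []
            h[i][j] = score(i, j) + max(prev, default=0)
    return h

template = "AVVD"   # rows
query = "AATVD"     # columns
table = fill_matrix(template, query,
                    lambda i, j: 1 if template[i] == query[j] else 0)
best = table[-1][-1]  # 3: the residues A, V and D can all be matched
```

Tracing back from the bottom-right cell through the best predecessors gives the alternative alignments, which score equally because gaps cost nothing here.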

63 Let’s Include 2° info & ASA:
$S_{ij}^{strc}$:
    H E C
H   1 0 0
E   0 1 0
C   0 0 1
$S_{ij}^{asa}$:
    E P B
E   1 0 0
P   0 1 0
B   0 0 1
$S_{ij}^{total} = k_1 S_{ij}^{seq} + k_2 S_{ij}^{strc} + k_3 S_{ij}^{asa}$

64 [Animation: the score matrix for AATVD (columns, secondary structure E E E C C) against AVVD (rows, secondary structure E E C C) is filled one cell at a time.] Last frame shown:
       E E E C C
       A A T V D
E  A   2 2 1 0 0
E  V   1 3 3 3
C  V
C  D

65 A Simple Example... Completed score matrix:
       E E E C C
       A A T V D
E  A   2 2 1 0 0
E  V   1 3 3 3 2
C  V   0 2 3 5 4
C  D   0 2 3 4 7
Alignments shown:
    AATVD
    A-VVD

    AATVD
    AVVD (gap position not preserved in the transcript)

    AATVD
    AV-VD
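
The same table can be regenerated by adding a secondary-structure term to the cell score. The weights are assumptions inferred from the numbers: a sequence identity and a secondary-structure identity each contribute 1 (k1 = k2 = 1), and the ASA term is left out (k3 = 0) because the example carries no ASA labels. The recurrence is the same assumed no-gap-penalty rule, and all names are illustrative.

```python
def fill_matrix(rows, cols, score):
    """Fill a forward-summed score matrix; score(i, j) is the match score for cell (i, j)."""
    n, m = len(rows), len(cols)
    h = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            prev = [h[i - 1][k] for k in range(j)] if i > 0 else []   # previous row, earlier columns
            prev += [h[k][j - 1] for k in range(i)] if j > 0 else []  # previous column, earlier rows
            h[i][j] = score(i, j) + max(prev, default=0)
    return h

def combined_score(t_seq, t_ss, q_seq, q_ss, k1=1, k2=1):
    """Per-cell score: k1 * sequence identity + k2 * secondary-structure identity."""
    def score(i, j):
        return k1 * (t_seq[i] == q_seq[j]) + k2 * (t_ss[i] == q_ss[j])
    return score

template_seq, template_ss = "AVVD", "EECC"    # rows
query_seq, query_ss = "AATVD", "EEECC"        # columns
table = fill_matrix(template_seq, query_seq,
                    combined_score(template_seq, template_ss, query_seq, query_ss))
```

Compared with the sequence-only example, the secondary-structure term more than doubles the top score (7 instead of 3), which is what lets 2D threading separate folds that sequence identity alone cannot.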

66 2D Threading Performance: In test sets, 2D threading methods can identify 30-40% of proteins having very remote homologues (i.e. not detected by BLAST) using “minimal” non-redundant databases (<700 proteins). If the database is expanded ~4x, the performance jumps to 70-75%. Performs best on true homologues as opposed to postulated analogues.

67 2D Threading Advantages: Algorithm is easy to implement. Algorithm is very fast (10x faster than 3D threading approaches). The 2D database is small (1.5 Gbytes). Appears to be just as accurate as DBM or other 3D threading approaches. Very amenable to web servers.

68 Servers - PredictProtein

69 Servers - 123D

70 Servers - GenThreader

71 More Servers - www.bronco.ualberta.ca

72 2D Threading Disadvantages: Reliability is not 100%, making most threading predictions suspect unless experimental evidence can be used to support the conclusion. Does not produce a 3D model at the end of the process. Doesn’t include all aspects of 2° and 3° structure features in the prediction process. PSI-BLAST may be just as good (faster too!).

73 Making it Better: Include 3D threading analysis as part of the 2D threading process; it offers another layer of information. Include more information about the “coil” state (3-state prediction isn’t good enough). Include other biochemical (ligands, function, binding partners, motifs) or phylogenetic (origin, species) information.

74 3D Threading Servers: Generate 3D models or coordinates of possible models based on input sequence. Loopp (version 2) (http://ser-loopp.tc.cornell.edu/loopp.html). 3D-PSSM (http://www.sbg.bio.ic.ac.uk/~3dpssm/). All require email addresses since the process may take hours to complete.
