Protein Folding and Protein Threading

Some slides from:
- Tolga Can, CENG 465: Introduction to Bioinformatics, Middle East Technical University, Turkey
- Kristen Huber, EC 697S: Topics in Computational Biology, University of Massachusetts
Protein threading
- Structure is better conserved than sequence.
- A structure can tolerate a wide range of mutations.
- Physical forces favor certain structures.
- The number of folds is limited. Currently: ~700. Total: 1,000 to ~10,000.
[Figure: TIM barrel]
Tolga Can, METU, CENG 465
Protein Threading: basic premise
- Statistics from the Protein Data Bank (~35,000 structures).
- The number of unique structural (domain) folds in nature is fairly small (possibly a few thousand).
- 90% of new structures submitted to the PDB in the past three years have similar structural folds already in the PDB.
Concept of Threading
- Thread (align or place) a query protein sequence onto a template structure in an "optimal" way.
- A good alignment gives an approximate backbone structure.
Query sequence: MTYKLILNGKTKGETTTEAVDAATAEKVFQYANDNGVDGEWTYTE
[Figure: template set]
Protein Threading – energy function
Query sequence: MTYKLILNGKTKGETTTEAVDAATAEKVFQYANDNGVDGEWTYTE
- E_p: how preferable it is to put two particular residues near each other
- E_s: how well a residue fits a structural environment
- E_g: alignment gap penalty
- Total energy: E_p + E_s + E_g
Goal: find the sequence-structure alignment that minimizes the energy function.
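The three-term energy above can be illustrated with a minimal sketch. It only evaluates the energy of one given alignment; it does not search for the minimum. The scoring tables, the flat gap penalty, the "buried"/"exposed" environment labels, and the function name `threading_energy` are all hypothetical values chosen for illustration, not taken from the slides.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy scoring tables; real threading programs derive such
# potentials from statistics over known structures.
PAIR_ENERGY: Dict[frozenset, float] = {
    frozenset("LV"): -1.0,  # two hydrophobic residues in contact: favorable
    frozenset("KE"): -0.8,  # opposite charges in contact: favorable
    frozenset("KL"): 0.5,   # charged next to hydrophobic: unfavorable
}
ENV_ENERGY: Dict[Tuple[str, str], float] = {
    ("L", "buried"): -0.7, ("V", "buried"): -0.6,
    ("K", "exposed"): -0.5, ("E", "exposed"): -0.5,
    ("K", "buried"): 0.9, ("L", "exposed"): 0.4,
}
GAP_PENALTY = 1.0  # assumed flat cost per unaligned template position


def threading_energy(
    query: str,
    alignment: List[Optional[int]],
    environments: List[str],
    contacts: List[Tuple[int, int]],
) -> float:
    """Return E_p + E_s + E_g for one sequence-structure alignment.

    alignment[i] is the query index placed at template position i,
    or None if that template position is left unaligned (a gap).
    """
    e_s = 0.0
    e_g = 0.0
    for pos, q_idx in enumerate(alignment):
        if q_idx is None:
            e_g += GAP_PENALTY
        else:
            # Unknown (residue, environment) pairs score 0 in this toy table.
            e_s += ENV_ENERGY.get((query[q_idx], environments[pos]), 0.0)

    e_p = 0.0
    for i, j in contacts:
        qi, qj = alignment[i], alignment[j]
        if qi is not None and qj is not None:
            pair = frozenset((query[qi], query[qj]))
            e_p += PAIR_ENERGY.get(pair, 0.0)

    return e_p + e_s + e_g
```

For example, with `environments = ["buried", "exposed", "buried", "exposed"]` and `contacts = [(0, 2), (1, 3)]`, threading the query `"LKVE"` in order (`[0, 1, 2, 3]`) gives a lower energy than the swapped placement `[1, 0, 3, 2]`, which puts the charged residue K in a buried position.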
Prediction of Protein Structures: a few good examples
[Figures: four pairs of actual vs. predicted structures]
Prediction of Protein Structures: a not-so-good example
[Figure]
CASP/CAFASP
- CASP: Critical Assessment of Structure Prediction
- CAFASP: Critical Assessment of Fully Automated Structure Prediction
CASP predictor vs. CAFASP predictor: the fully automated CAFASP predictor won't get tired and is high-throughput.
Protein Threading Kristen Huber, UMass, EC 697S
Protein Threading (RAPTOR)
Jinbo Xu, Ying Xu, Dongsup Kim, Ming Li. RAPTOR: Optimal Protein Threading by Linear Programming. Journal of Bioinformatics and Computational Biology, April 2003.
Given a query sequence S = (s1, s2, s3, …, sn) and a template (library) sequence T = (t1, t2, t3, …, tm), pair up elements from S and T, possibly inserting gaps, while minimizing an energy function.
Assumptions:
- The template is a sequence of cores (conserved segments: α-helix or β-sheet) connected by loops.
- Gaps are allowed only within the loops.
- Only interactions between residues in the cores are considered; two residues are assumed to interact if they are within 7 Å of each other and at least 4 positions apart in the sequence.
Protein Threading (RAPTOR)
Steps of the RAPTOR algorithm:
1. Build a contact map for the template structure.
2. Find all possible alignments for each core within the query sequence.
3. Build a contact map for the query sequence and template structure.
4. Define the energy function and carry out minimization.
Energy terms:
- Em – mutation score
- Es – environment fitness score
- Ep – pairwise interaction score
- Eg – gap penalty score
- Ess – secondary structure compatibility
- Wx – weights (determined experimentally)
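Step 2 (finding where each core can be placed in the query) can be sketched as follows. This is a simplified illustration, not RAPTOR's actual procedure: it assumes cores keep their template order, are placed without internal gaps, may not overlap, and that loops between cores may shrink to length zero. The function name `feasible_core_starts` is hypothetical.

```python
from typing import List


def feasible_core_starts(query_len: int, core_lens: List[int]) -> List[range]:
    """For each core, return the range of query start positions (0-based)
    that still leaves room for all other cores, in order and without overlap.

    Assumption: loops between cores may have any length >= 0.
    """
    total = sum(core_lens)
    if total > query_len:
        # The cores do not fit into the query at all.
        return [range(0) for _ in core_lens]

    # Residues left over once every core is placed; they are shared
    # among the loops and the two termini.
    slack = query_len - total

    starts = []
    offset = 0  # earliest possible start: all preceding cores packed left
    for length in core_lens:
        starts.append(range(offset, offset + slack + 1))
        offset += length
    return starts
```

For a query of length 10 and two cores of lengths 3 and 4, the first core may start at positions 0 to 3 and the second at positions 3 to 6. The ranges are per-core bounds only; a full threading still has to pick one start per core so that consecutive cores do not overlap.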
Protein Threading (RAPTOR)
Step 1: Build a contact map for the template structure.
The contact map indicates interactions between cores, i.e., whether any two residues within the cores interact.
[Figure: Xu et al., JBCB, 2003]
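A contact map of this kind can be sketched in a few lines, using the interaction rule stated in the assumptions (within 7 Å and at least 4 positions apart, core residues only). The sketch assumes one representative 3D coordinate per residue; which atom is used is not stated on the slides. The function name `contact_map` and its parameters are hypothetical.

```python
import math
from typing import Iterable, List, Sequence, Tuple


def contact_map(
    coords: Sequence[Tuple[float, float, float]],
    core_positions: Iterable[int],
    cutoff: float = 7.0,      # distance threshold in angstroms
    min_separation: int = 4,  # minimum distance along the sequence
) -> List[Tuple[int, int]]:
    """Return pairs (i, j), i < j, of core residues that are in contact.

    coords[i] is one representative 3D coordinate for template residue i
    (an assumption of this sketch).
    """
    core = sorted(core_positions)
    contacts = []
    for a, i in enumerate(core):
        for j in core[a + 1:]:
            close_in_space = math.dist(coords[i], coords[j]) <= cutoff
            far_in_sequence = (j - i) >= min_separation
            if close_in_space and far_in_sequence:
                contacts.append((i, j))
    return contacts
```

Residues outside the cores (loop residues) are never reported, which matches the assumption that only core-core interactions are considered.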
Protein Threading (RAPTOR) Step 3: Build a contact map for the query and template Xu et al., JBCB, 2003
Protein Threading (RAPTOR) Step 4: define energy function and carry out minimization Xu et al., JBCB, 2003
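The figure for this step is not reproduced here. Assuming the total energy is the weighted sum of the five terms listed in the algorithm steps (each term E_x paired with its weight W_x), it can be written as:

```latex
E = W_m E_m + W_s E_s + W_p E_p + W_g E_g + W_{ss} E_{ss}
```

Minimization then means choosing the sequence-structure alignment with the smallest E. The weight subscripts are an assumed notation following the "Wx" convention above.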