Published by Norman McCarthy. Modified over 8 years ago.
An Eclectic Energy Function to Discriminate Native From Decoys

Avdesh Mishra, Sumaiya Iqbal, Md Tamjidul Hoque
Email: {amishra2, siqbal1, thoque}@uno.edu
Department of Computer Science, University of New Orleans, New Orleans, LA, USA

Poster sections: Methods, Introduction, Results, Discussions, Conclusions, Acknowledgements

We propose an accurate potential that combines the following features:
- HP, HH and PP interactions among the amino acids
- Sequence-based accessibility obtained for each amino acid
- 3D structure-based properties, i.e. uPhi and uPsi

The improved potential can be used for:
- Protein-ligand binding site prediction
- Ab initio protein structure prediction
- Fold recognition
- Drug design and enzyme design

The proposed potential outperforms all the state-of-the-art approaches it was compared against.

3D structure prediction is useful in the design of drugs and novel enzymes. Energy functions can aid in protein structure prediction and fold recognition. We propose the 3DIGARS3.0 potential for improved accuracy.

We introduce two 3D structural features:
- uPhi-based energy
- uPsi-based energy

The motivation is that 3D structural features help to improve accuracy.

uPhi and uPsi are linearly combined with the prior energy components:
- 3DIGARS energy, which is based on HP, HH and PP interactions and their respective ideal gas reference states
- ASA energy, computed by modeling the real accessibility and the predicted accessibility obtained from protein sequences

The linearly combined energies are optimized using a GA.

Three decoy sets were used in optimization:
- Moulder
- Rosetta
- I-Tasser

Five independent test decoy sets were used to evaluate the accuracy:
- 4state_reduced
- fisa_casp3
- hg_structal
- ig_structal
- ig_structal_hires

Based on the independent test datasets, 3DIGARS3.0 outperformed the state-of-the-art approaches:
- DFIRE by 440.91%
- RWplus by 440.91%
- dDFIRE by 72.46%
- GOAP by 20.20%
- 3DIGARS by 417.39%
- 3DIGARS2.0 by 440.91%

The percentage weighted average improvement is calculated over the decoy sets, where y_i represents the new value and x_i represents the old value.

Figure 1: (a) Native-like protein conformation, presented in a 3D hexagonal-close-packing (HCP) configuration using hydrophobic (H) and hydrophilic or polar (P) residues. The H-H interaction space is relatively smaller than the P-P interaction space, since hydrophobic residues (black balls), being averse to water, tend to remain inside the central space. (b) 3D metaphoric HP folding kernels, depicted using the HCP-configuration-based HP model, showing the 3 layers of amino acid distribution.

Figure 5: Process flow of the design and development of the 3DIGARS3.0 energy function.

3DIGARS potential
- Core statistical function based on HP, HH and PP interactions (see Fig. 1)
- Segregated ideal gas reference states and libraries for the HP, HH and PP groups
- Better training dataset (a 100% sequence identity cutoff can capture the natural frequency distribution)
- Three shape parameters (α_hp, α_hh and α_pp) control the shape of the assumed spherical protein surface
- Three contribution parameters (β_hp, β_hh and β_pp) control the contribution of each group

3DIGARS2.0 potential
- Integration of the core energy and sequence-specific features
- The sequence-specific feature is computed by modeling the error between the real and predicted ASA (see Fig. 2)
- Real and predicted ASA are obtained from DSSP and REGAd3p, respectively
- 3DIGARS2.0 is a linearly weighted accumulation of 3DIGARS and mined ASA

3DIGARS3.0 potential
- Integration of the core energy, sequence-specific energy and 3D structural features (see Fig. 5)
- The added 3D structural features are derived from the uPhi and uPsi angles
- uPhi and uPsi are computed using the Cartesian coordinates of a set of 4 atoms (see Fig. 3 and 4)
- uPhi- and uPsi-based energies are computed in the following steps (see Fig. 4):
  - The cosine value range (-1 to 1) of the angles uPhi and uPsi is divided into 20 bins, each of width 0.1
  - Individual frequency tables for uPhi and uPsi are computed
  - The frequency tables are then used to compute individual energy score libraries
  - The energy scores are then used to compute the uPhi and uPsi energies for a given protein

Protein folding and structure prediction problems rely on an accurate energy function. The accuracy of the potential function depends on:
- Interaction distance between atom pairs
- Hydrophobic (H) and hydrophilic (P) properties
- Sequence-specific information
- Orientation-dependent interactions
- Optimization techniques

We develop a potential function that is an optimized, linearly weighted accumulation of:
- The 3-Dimensional Ideal Gas Reference State based Energy Function (3DIGARS), which is formulated using the HP, HH and PP properties of amino acids
- Mined accessible surface area (ASA)
- Ubiquitously computed Phi (uPhi) and Psi (uPsi) energies

Optimization is performed using a Genetic Algorithm (GA). Based on the independent test datasets, the proposed energy function significantly outperformed state-of-the-art approaches.

Figure 2: The dark central area, composed of atoms, can be thought of as a 3D protein, and the green and red outlines around the area can be thought of as the real and predicted accessible surface area, respectively. The error between the real and predicted ASA is modelled as an energy feature.

Figure 3: Definition of the angle formed by four atoms (At1, At2, At3 and At4). uPhi is computed using At1 belonging to one residue and a set of atoms At2, At3, At4 belonging to some other residues. Similarly, uPsi is computed using a set of atoms At1, At2, At3 belonging to some residues and an atom At4 belonging to some other residue.

Figure 4: (a) Shows the arrangement of atoms as well as the vectors created using the Cartesian coordinates of the atoms. (b) Shows the dihedral angle involving the four atoms.

Table 1: Performance comparison of different energy functions on the optimization datasets, based on correct native count. Columns are the compared methods.

Decoy set (no. of targets)  DFIRE        RWplus       dDFIRE       GOAP         3DIGARS      3DIGARS2.0   3DIGARS3.0
Moulder (20)                19 (-2.97)   19 (-2.84)   18 (-2.74)   19 (-3.58)   19 (-2.99)   19 (-2.68)   20 (-3.851)
Rosetta (58)                20 (-1.82)   20 (-1.47)   12 (-0.83)   45 (-3.70)   31 (-2.023)  49 (-2.987)  46 (-2.683)
I-Tasser (56)               49 (-4.02)   56 (-5.77)   48 (-5.03)   45 (-5.36)   53 (-4.036)  56 (-4.296)  56 (-5.573)
Weighted average in %       38.64        28.42        56.41        11.93        18.45        -1.61

Legend: Entry format is native-count (z-score). Bold indicates best scores. Underscore indicates close to best scores.

Table 2: Performance comparison of different energy functions on the independent test datasets, based on correct native count. Columns are the compared methods.

Decoy set (no. of targets)  DFIRE        RWplus       dDFIRE       GOAP         3DIGARS      3DIGARS2.0   3DIGARS3.0
4state_reduced (7)          6 (-3.48)    6 (-3.51)    7 (-4.15)    7 (-4.38)    6 (-3.371)   4 (-2.642)   7 (-3.456)
fisa_casp3 (5)              4 (-4.80)    4 (-5.17)    4 (-4.83)    5 (-5.27)    5 (-4.319)   5 (-4.682)   4 (-4.076)
hg_structal (29)            12 (-1.97)   12 (-1.74)   16 (-1.33)   22 (-2.73)   12 (-1.914)  12 (-1.589)  28 (-3.678)
ig_structal (61)            0 (0.92)     0 (1.11)     26 (-1.02)   47 (-1.62)   0 (0.645)    0 (0.268)    60 (-2.526)
ig_structal_hires (20)      0 (0.17)     0 (0.32)     16 (-2.05)   18 (-2.35)   0 (-0.002)   1 (0.030)    20 (-2.378)
Weighted average in %       440.91       440.91       72.46        20.20        417.39       440.91

Legend: Entry format is native-count (z-score). Bold indicates best scores. Underscore indicates close to best scores.

We gratefully acknowledge the Louisiana Board of Regents through the Board of Regents Support Fund, LEQSF (2013-16)-RD-A-19.
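
The poster's formula for the percentage weighted average improvement is not reproduced in the text, so the following Python sketch rests on an assumption: that the improvement is 100 * (sum of y_i - sum of x_i) / sum of x_i, with y_i the native count of 3DIGARS3.0 and x_i the native count of the compared method on decoy set i. This assumed form does reproduce the percentages reported for the independent test sets, but it may not be the authors' exact expression. The names `TEST_COUNTS`, `weighted_improvement` and `improvements` are illustrative only.

```python
# Native counts per independent test decoy set, copied from Table 2, in the order:
# 4state_reduced, fisa_casp3, hg_structal, ig_structal, ig_structal_hires.
TEST_COUNTS = {
    "DFIRE":      [6, 4, 12, 0, 0],
    "RWplus":     [6, 4, 12, 0, 0],
    "dDFIRE":     [7, 4, 16, 26, 16],
    "GOAP":       [7, 5, 22, 47, 18],
    "3DIGARS":    [6, 5, 12, 0, 0],
    "3DIGARS2.0": [4, 5, 12, 0, 1],
    "3DIGARS3.0": [7, 4, 28, 60, 20],
}


def weighted_improvement(new_counts, old_counts):
    """Percentage improvement of new over old.

    ASSUMPTION: 100 * (sum(y) - sum(x)) / sum(x); the poster's own
    formula is not available in the text.
    """
    total_old = sum(old_counts)
    if total_old == 0:
        raise ValueError("old counts sum to zero; improvement is undefined")
    return 100.0 * (sum(new_counts) - total_old) / total_old


def improvements(counts, reference="3DIGARS3.0"):
    """Improvement of the reference method over every other method, in percent."""
    new = counts[reference]
    return {
        method: round(weighted_improvement(new, old), 2)
        for method, old in counts.items()
        if method != reference
    }


if __name__ == "__main__":
    for method, pct in improvements(TEST_COUNTS).items():
        print(f"{method}: {pct:.2f}%")
```

Summing counts across decoy sets means larger sets such as ig_structal (61 targets) contribute more to the result, which is one plausible reading of "weighted".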
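
The uPhi and uPsi steps listed under the 3DIGARS3.0 potential can be sketched in Python as follows. This is a minimal illustration under stated assumptions and not the authors' implementation: the cosine of the dihedral angle is taken from the normals of the two planes spanned by the four atoms, and the energy score is assumed to be the negative log of the observed bin frequency relative to a uniform distribution, with a pseudocount. The poster does not state how frequencies are converted to energies, so `energy_library` in particular is hypothetical, as are all function names here.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def dihedral_cosine(at1, at2, at3, at4):
    """Cosine of the dihedral angle defined by four atoms (x, y, z tuples)."""
    b1, b2, b3 = _sub(at2, at1), _sub(at3, at2), _sub(at4, at3)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    denom = math.sqrt(_dot(n1, n1)) * math.sqrt(_dot(n2, n2))
    if denom == 0.0:
        raise ValueError("three consecutive atoms are collinear")
    # Clamp to [-1, 1] to absorb floating-point round-off.
    return max(-1.0, min(1.0, _dot(n1, n2) / denom))


def cosine_bin(cosine, n_bins=20):
    """Map a cosine in [-1, 1] to one of n_bins equal bins (width 0.1 for 20)."""
    index = int((cosine + 1.0) * n_bins / 2.0)
    return min(index, n_bins - 1)  # cosine == 1.0 goes into the last bin


def frequency_table(cosines, n_bins=20):
    """Count how many observed cosines fall into each bin."""
    table = [0] * n_bins
    for c in cosines:
        table[cosine_bin(c, n_bins)] += 1
    return table


def energy_library(table, pseudocount=1.0):
    """ASSUMED scoring: -ln(observed frequency / uniform frequency) per bin."""
    total = sum(table) + pseudocount * len(table)
    uniform = 1.0 / len(table)
    return [-math.log(((n + pseudocount) / total) / uniform) for n in table]


def angle_energy(cosines, library):
    """Sum the library scores over all angles observed in one protein."""
    return sum(library[cosine_bin(c, len(library))] for c in cosines)
```

Working with the cosine rather than the angle itself avoids an inverse trigonometric call and gives the fixed range of -1 to 1 that the 20 bins divide. Under the assumed scoring, frequently observed bins receive lower (more favorable) energies.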