Avdesh Mishra, Md Tamjidul Hoque {amishra2,

Presentation transcript:

Next Generation Evolutionary Sampling and Energy Function Guided Ab Initio Protein Structure Prediction

Avdesh Mishra, Md Tamjidul Hoque
email: {amishra2, thoque}@uno.edu
Department of Computer Science, University of New Orleans, LA, USA

Introduction

The conformation of a protein is vital to understanding the function it performs within the cell. Towards this goal, we developed a computer program that applies a memory-assisted evolutionary algorithm to sample the energy hyper-surface of the protein folding process, searching for the global minimum, that is, the native fold of the protein. The energy hyper-surface is sampled by novel mutation and crossover operations based on angular rotation and translation. Furthermore, the crossover operations in the current generation are enhanced by using the best parents selected from previous generations. In addition, we employ a novel knowledge-based energy function, 3DIGARS3.0, which effectively differentiates the native structure, corresponding to the most thermodynamically stable state, from the possible decoy structures. The 3DIGARS3.0 energy function is an optimized combination of crucial properties such as hydrophobic versus hydrophilic character, sequence-specific predicted accessibility and ubiquitous phi-psi characterization.

Methods

Flowchart steps, as recovered from the poster:

Backbone Models
Dataset of 4332 Protein Structures
Obtain Secondary Structure (SS) and Φ, Ψ Angles using DSSP
Generate Frequency Distribution of Φ, Ψ Angles and SS Types
Initialize Population for GA using Single Point Angular Mutation
Calculate Fitness using 3DIGARS3.0
Select 5% Elite Models
Save Best Model in Memory
Perform Memory Assisted Crossover @ 70%
Perform Angular Mutation @ 60%
Fill Rest Randomly
Save Models
Generation < 2000 (repeat), otherwise End
Best Models

Results

Example of 3DIGARS-PSP modeling results on known hard E. coli and protease inhibitor proteins

Figure 1 | Cysteine protease inhibitor (PDB ID: 1nyc). Left: superposition of the 3DIGARS-PSP model on the native (initial seeds from Rosetta). Right: superposition of the top Rosetta model (based on TM-score) on the native.
Figure 2 | E. coli protein (PDB ID: 1pohA). Left: superposition of the 3DIGARS-PSP model on the native (initial seeds from Rosetta). Right: superposition of the top Rosetta model (based on TM-score) on the native.
Figure 3 | E. coli protein (PDB ID: 1pohA). Left: superposition of the 3DIGARS-PSP model on the native (initial seeds from I-TASSER). Right: superposition of the top I-TASSER model (based on TM-score) on the native.
Figure 4 | E. coli protein (PDB ID: 2z9hA). Left: superposition of the 3DIGARS-PSP model on the native (initial seeds from Rosetta). Right: superposition of the top Rosetta model (based on TM-score) on the native.
Figure 5 | E. coli protein (PDB ID: 2z9hA). Left: superposition of the 3DIGARS-PSP model on the native (initial seeds from I-TASSER). Right: superposition of the top I-TASSER model (based on TM-score) on the native.
Figure 6 | E. coli protein (PDB ID: 2p7vA). Left: superposition of the 3DIGARS-PSP model on the native (initial seeds from Rosetta). Right: superposition of the top Rosetta model (based on TM-score) on the native.
Figure 7 | E. coli protein (PDB ID: 2p7vA). Left: superposition of the 3DIGARS-PSP model on the native (initial seeds from I-TASSER). Right: superposition of the top I-TASSER model (based on TM-score) on the native.
Figure 8 | E. coli protein (PDB ID: 1k4nA). Left: superposition of the 3DIGARS-PSP model on the native (initial seeds from Rosetta). Right: superposition of the top Rosetta model (based on TM-score) on the native.
Figure 9 | E. coli protein (PDB ID: 1k4nA). Left: superposition of the 3DIGARS-PSP model on the native (initial seeds from I-TASSER). Right: superposition of the top I-TASSER model (based on TM-score) on the native.

Note: Natives are shown in cyan and pink; models are shown in red and yellow.

Discussions and Conclusions

In the past we have shown that our energy function, 3DIGARS3.0, significantly outperforms state-of-the-art methods. In our prior work we have also shown that our associative-memory-based sampling algorithm provides superior performance. In this work, we are looking for the right combination of our energy function and the sampling algorithm to predict the 3D structure of a protein better than state-of-the-art approaches. To this end, we have successfully applied dihedral-angle mutation by rotation and crossover by protein-segment translation to enhance the mutation and crossover operations of the sampling algorithm. We are working on a case-by-case basis to obtain an accurate prediction of the useful secondary structures in a protein. Towards this, we have used Ramachandran Plot information within our sampling algorithm, and we have found that it yields a significant improvement. We are exploring topics such as effective use of the Ramachandran Plot, move sets and associated memory to find more efficient and effective rules to apply within the sampling algorithm. In the near future we plan to make our PSP approach more robust by combining the 3DIGARS and sDFIRE energy functions.

Ongoing Research

Effective use of Ramachandran Plot
Effective initialization and use of associated memory
Development of new operator to implement move sets

Acknowledgements

The authors gratefully acknowledge the Louisiana Board of Regents through the Board of Regents Support Fund, LEQSF (2013-16)-RD-A-19.
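The genetic-algorithm loop named in the Methods flowchart (elite selection, memory-assisted crossover, angular mutation, random fill) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Only the 5% elite fraction and the 70% and 60% operator rates come from the poster. Everything else is an assumption made for illustration: `toy_energy` is a hypothetical stand-in for 3DIGARS3.0, a model is reduced to a list of (phi, psi) pairs, the crossover is a plain single-point splice with a parent drawn from memory, and the population size and generation count are kept small so the example runs quickly.

```python
import random

# Values quoted on the poster.
ELITE_FRACTION = 0.05
CROSSOVER_RATE = 0.70
MUTATION_RATE = 0.60


def toy_energy(model):
    """Hypothetical stand-in for 3DIGARS3.0 (lower is better).

    Assumption: penalise distance of each (phi, psi) pair from (-60, -45).
    """
    return sum((phi + 60.0) ** 2 + (psi + 45.0) ** 2 for phi, psi in model)


def random_model(n_res, rng):
    """A model is simplified here to one (phi, psi) pair per residue."""
    return [(rng.uniform(-180, 180), rng.uniform(-180, 180)) for _ in range(n_res)]


def angular_mutation(model, rng):
    """Single-point angular mutation: redraw the angles of one residue."""
    child = list(model)
    i = rng.randrange(len(child))
    child[i] = (rng.uniform(-180, 180), rng.uniform(-180, 180))
    return child


def memory_crossover(parent, memory_parent, rng):
    """Assumed form: single-point splice of a parent with a remembered best."""
    cut = rng.randrange(1, len(parent))
    return parent[:cut] + memory_parent[cut:]


def evolve(n_res=20, pop_size=40, generations=200, seed=0):
    rng = random.Random(seed)
    seed_model = random_model(n_res, rng)
    # Initialise the population by single-point mutation of a seed model.
    population = [angular_mutation(seed_model, rng) for _ in range(pop_size)]
    memory = []   # best model of every generation so far
    history = []  # best energy of every generation
    n_elite = max(1, int(ELITE_FRACTION * pop_size))
    for _ in range(generations):
        population.sort(key=toy_energy)
        elites = population[:n_elite]
        memory.append(elites[0])
        history.append(toy_energy(elites[0]))
        next_pop = list(elites)  # elites survive unchanged
        for parent in population[: pop_size - n_elite]:
            child, changed = parent, False
            if rng.random() < CROSSOVER_RATE:
                child = memory_crossover(child, rng.choice(memory), rng)
                changed = True
            if rng.random() < MUTATION_RATE:
                child = angular_mutation(child, rng)
                changed = True
            if changed:
                next_pop.append(child)
        # Slots left by parents that received no operator are filled randomly.
        while len(next_pop) < pop_size:
            next_pop.append(random_model(n_res, rng))
        population = next_pop
    best = min(population + memory, key=toy_energy)
    return best, history


if __name__ == "__main__":
    best, history = evolve()
    print("first/last best energy:", history[0], history[-1])
```

Because the elites are copied unchanged into the next generation, the best energy recorded per generation never gets worse in this sketch. A real run would replace `toy_energy` with the actual energy function and operate on full backbone coordinates.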