Avdesh Mishra, Md Tamjidul Hoque {amishra2,


Presentation transcript:

Next Generation Evolutionary Sampling and Energy Function Guided Ab Initio Protein Structure Prediction

Avdesh Mishra, Md Tamjidul Hoque
email: {amishra2, thoque}@uno.edu
Department of Computer Science, University of New Orleans, LA, USA

Introduction

The conformation of a protein is vital to understanding the function it performs within the cell. Towards this goal, we developed a computer program that applies a memory-assisted evolutionary algorithm to sample the energy hyper-surface of the protein folding process, searching for the global minimum, that is, the native fold of the protein. The energy hyper-surface is sampled by novel mutation and crossover operations based on angular rotation and translation. Furthermore, the crossover operations in the current generation are enhanced by using the best parents selected from previous generations. In addition, we employ a novel knowledge-based energy function, 3DIGARS3.0, which effectively differentiates the native structure, corresponding to the most thermodynamically stable state, from the possible decoy structures. The 3DIGARS3.0 energy function is an optimized combination of crucial properties such as hydrophobic versus hydrophilic interactions, sequence-specific predicted accessibility and ubiquitous phi-psi characterization.

Methods

Angle statistics (flowchart boxes):
Dataset of 4332 Protein Structures
Obtain Secondary Structure (SS) and Φ, Ψ Angles using DSSP
Generate Frequency Distribution of Φ, Ψ Angles and SS Types

Genetic algorithm (flowchart boxes):
Backbone Models
Initialize Population for GA using Single Point Angular Mutation
Save Best Model in Memory
Select 5% Elite Models
Perform Memory Assisted Crossover @ 70%
Fill Rest Randomly
Perform Angular Mutation @ 60%
Calculate Fitness using 3DIGARS3.0
Save Models
Generation < 2000
End
Best Models

Results

Example of 3DIGARS-PSP modeling results on known hard E. coli and protease inhibitor proteins.

Note: Natives are shown in cyan and pink, and models are shown in red and yellow.

Figure 1 | Cysteine protease inhibitor (PDB ID: 1nyc); left: superposition of the 3DIGARS-PSP model on the native (initial seeds from Rosetta); right: superposition of the top Rosetta model (based on TM-score) on the native.
Figure 2 | E. coli protein (PDB ID: 1pohA); left: superposition of the 3DIGARS-PSP model on the native (initial seeds from Rosetta); right: superposition of the top Rosetta model (based on TM-score) on the native.
Figure 3 | E. coli protein (PDB ID: 1pohA); left: superposition of the 3DIGARS-PSP model on the native (initial seeds from I-TASSER); right: superposition of the top I-TASSER model (based on TM-score) on the native.
Figure 4 | E. coli protein (PDB ID: 2z9hA); left: superposition of the 3DIGARS-PSP model on the native (initial seeds from Rosetta); right: superposition of the top Rosetta model (based on TM-score) on the native.
Figure 5 | E. coli protein (PDB ID: 2z9hA); left: superposition of the 3DIGARS-PSP model on the native (initial seeds from I-TASSER); right: superposition of the top I-TASSER model (based on TM-score) on the native.
Figure 6 | E. coli protein (PDB ID: 2p7vA); left: superposition of the 3DIGARS-PSP model on the native (initial seeds from Rosetta); right: superposition of the top Rosetta model (based on TM-score) on the native.
Figure 7 | E. coli protein (PDB ID: 2p7vA); left: superposition of the 3DIGARS-PSP model on the native (initial seeds from I-TASSER); right: superposition of the top I-TASSER model (based on TM-score) on the native.
Figure 8 | E. coli protein (PDB ID: 1k4nA); left: superposition of the 3DIGARS-PSP model on the native (initial seeds from Rosetta); right: superposition of the top Rosetta model (based on TM-score) on the native.
Figure 9 | E. coli protein (PDB ID: 1k4nA); left: superposition of the 3DIGARS-PSP model on the native (initial seeds from I-TASSER); right: superposition of the top I-TASSER model (based on TM-score) on the native.

Discussions and Conclusions

In the past we have shown that our energy function, 3DIGARS3.0, significantly outperforms state-of-the-art methods. In prior work we have also shown that our associated-memory-based sampling algorithm provides superior performance. In this work, we are looking for the right combination of our energy function and the sampling algorithm to predict the 3D structure of a protein better than state-of-the-art approaches. To this end, we have successfully applied dihedral angle mutation by rotation and crossover by protein segment translation to enhance the mutation and crossover operations of the sampling algorithm. We are working on a case-by-case basis to obtain an accurate prediction of the useful secondary structures in a protein. Towards this, we have used Ramachandran plot information within our sampling algorithm, and we have found that it yields a significant improvement. We are exploring effective use of the Ramachandran plot, move sets and associated memory to find more efficient and effective rules to apply within the sampling algorithm. We plan to further improve our PSP approach in the near future by combining the 3DIGARS and sDFIRE energy functions to make it more robust.

Ongoing Research

Effective use of the Ramachandran plot
Effective initialization and use of associated memory
Development of a new operator to implement move sets

Acknowledgements

The authors gratefully acknowledge the Louisiana Board of Regents through the Board of Regents Support Fund, LEQSF (2013-16)-RD-A-19.
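The genetic algorithm outlined in the Methods flowchart can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the authors' implementation: `toy_energy` is a hypothetical stand-in for 3DIGARS3.0, a model is reduced to a list of (phi, psi) pairs, "memory assisted crossover" is assumed to pair the stored best model with a current parent, and the 70% and 60% figures are assumed to mean the share of the population produced by crossover and the per-model mutation probability. All function and parameter names are illustrative.

```python
import random


def toy_energy(angles):
    # Hypothetical stand-in for 3DIGARS3.0 (lower is better): minimal when
    # every residue sits at (phi, psi) = (-60, -45).
    return sum((phi + 60.0) ** 2 + (psi + 45.0) ** 2 for phi, psi in angles)


def wrap(angle):
    # Keep a dihedral angle inside [-180, 180).
    return (angle + 180.0) % 360.0 - 180.0


def random_model(n_res, rng):
    return [(rng.uniform(-180.0, 180.0), rng.uniform(-180.0, 180.0))
            for _ in range(n_res)]


def mutate(angles, rng, max_step=30.0):
    # Single-point angular mutation: rotate one phi or psi of one residue.
    child = list(angles)
    i = rng.randrange(len(child))
    phi, psi = child[i]
    delta = rng.uniform(-max_step, max_step)
    if rng.random() < 0.5:
        phi = wrap(phi + delta)
    else:
        psi = wrap(psi + delta)
    child[i] = (phi, psi)
    return child


def crossover(a, b, rng):
    # One-point crossover on the residue list.
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(n_res=8, pop_size=40, generations=200, elite_frac=0.05,
           cx_rate=0.70, mut_rate=0.60, seed=0):
    rng = random.Random(seed)
    seed_model = random_model(n_res, rng)
    # Initialize the population by single-point mutation of a seed model.
    pop = [mutate(seed_model, rng) for _ in range(pop_size)]
    memory = min(pop, key=toy_energy)  # best model seen so far
    history = [toy_energy(memory)]
    n_elite = max(1, int(elite_frac * pop_size))
    n_cx = int(cx_rate * pop_size)
    for _ in range(generations):
        pop.sort(key=toy_energy)
        if toy_energy(pop[0]) < toy_energy(memory):
            memory = pop[0]
        nxt = pop[:n_elite]  # elite models survive unchanged
        # Assumed form of memory-assisted crossover: stored best x parent.
        while len(nxt) < min(pop_size, n_elite + n_cx):
            parent = rng.choice(pop[: pop_size // 2])
            nxt.append(crossover(memory, parent, rng))
        # Fill the rest of the population randomly.
        while len(nxt) < pop_size:
            nxt.append(random_model(n_res, rng))
        # Angular mutation on non-elite models only.
        for i in range(n_elite, pop_size):
            if rng.random() < mut_rate:
                nxt[i] = mutate(nxt[i], rng)
        pop = nxt
        history.append(toy_energy(memory))
    best = min(pop + [memory], key=toy_energy)
    return best, history
```

Calling `evolve(generations=2000)` would mirror the flowchart's stopping condition. Because the elite models and the stored best model are never mutated, the energy recorded in `history` cannot increase from one generation to the next.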