Genetic Algorithms Select Protein Features Most Predictive of Enzyme Function
Andrew Kernytsky, Burkhard Rost
Columbia University


Enzyme function prediction
Given a protein sequence, predict its Enzyme Commission (EC) number.
EC classes: Oxidoreductases, Transferases, Hydrolases, Lyases, Isomerases, Ligases
[Figure: EC wheel. Porter CT, Bartlett GJ, Thornton JM. Nucleic Acids Res January 1; 32: D129–D133.]
Reference: NC-IUBMB (1992) Recommendations of the International Union of Biochemistry on the Nomenclature and Classification of Enzymes. In: Enzyme Nomenclature. Academic Press, New York.
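An EC number is a dot-separated hierarchy whose first field names one of the six classes on this slide, so predictions can be scored at different EC levels by truncating the number. The sketch below assumes the standard numbering EC 1 to EC 6 in the order the classes are listed; the helper names `ec_prefix` and `ec_class_name` are hypothetical and not taken from the presentation.

```python
# Top-level EC classes, numbered in the order listed on the slide (assumed EC 1-6).
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
}


def ec_prefix(ec_number: str, level: int) -> str:
    """Truncate an EC number such as '3.4.21.4' to its first `level` fields."""
    fields = ec_number.split(".")
    if not 1 <= level <= len(fields):
        raise ValueError(f"level must be between 1 and {len(fields)}")
    return ".".join(fields[:level])


def ec_class_name(ec_number: str) -> str:
    """Return the top-level class name for an EC number."""
    return EC_CLASSES[int(ec_number.split(".")[0])]


example = "3.4.21.4"
level_two = ec_prefix(example, 2)
class_name = ec_class_name(example)
```

Comparing `ec_prefix(predicted, k)` with `ec_prefix(true, k)` gives a per-level notion of a correct prediction.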

Intersection properties capture local information
AA:   TAGHCVNYDYGAGCQSGSPV
Acc:  bbbbbieeeiibbieeeeee
Cons: ..|....|......||....
Further per-residue tracks: HHHEEEEELLEEEEELLLLL, iiibbbbbbboooobbbbbb
[Figure: feature frequencies for Feat 4, Feat 5, Feat 6; values shown: 20%, 10%, 5%, 1%, 0.1%, 0.01%]
All global: limited local information
All intersection: significant risk of overfitting during training (features > 10^2 positive samples)
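One way to read the contrast between global and intersection properties is as composition vectors. A global feature class counts the states of a single track over the whole protein, while an intersection feature class counts the combined per-residue states of several aligned tracks. The sketch below is a minimal illustration under that assumption; the slide does not give the exact encoding, and the function names are hypothetical.

```python
from collections import Counter


def composition(states):
    """Fraction of positions occupied by each state."""
    states = list(states)
    counts = Counter(states)
    return {state: count / len(states) for state, count in counts.items()}


def intersect(*tracks):
    """Combine aligned per-residue tracks into one track of state tuples."""
    if len({len(track) for track in tracks}) != 1:
        raise ValueError("all tracks must have the same length")
    return list(zip(*tracks))


# Tracks from the slide: amino acids and accessibility states.
aa = "TAGHCVNYDYGAGCQSGSPV"
acc = "bbbbbieeeiibbieeeeee"

aa_comp = composition(aa)                      # global feature class "AA"
aa_acc_comp = composition(intersect(aa, acc))  # intersection class "AA×acc"
```

In this toy sequence glycine makes up 20% of the residues, glycine in state `b` makes up 10%, and glycine in state `e` makes up 5%. Each additional intersected track multiplies the number of possible states and thins out the counts per state, which is consistent with the overfitting risk noted on the slide.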

Algorithm overview
Protein sequence: MSNLLKDFEVAQC
All intersection and global feature classes: AA, sec, AA×sec
All possible combinations of feature classes [genomes]
Inner learning algorithm: SVM or neural network
Fitness assessed
Genetic algorithm (GA evolution): selection, crossover, mutation
Generation populations: 1st, 2nd, 3rd, 4th
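The loop on this slide can be sketched as follows. Each genome is a bit mask over feature classes, fitness comes from an inner learner, and selection, crossover and mutation produce the next generation. This is a minimal sketch, not the authors' implementation: `toy_fitness` is a hypothetical stand-in for training an SVM or neural network on the selected feature classes, and truncation selection, one-point crossover and the parameter values are all assumptions.

```python
import random

FEATURE_CLASSES = ["AA", "sec", "AA×sec"]  # feature classes named on the slide


def toy_fitness(genome):
    """ASSUMPTION: placeholder score; the slide uses an SVM or neural network here."""
    weights = [0.5, 0.05, 0.4]  # invented usefulness of each feature class
    gain = sum(w for w, bit in zip(weights, genome) if bit)
    return gain - 0.1 * sum(genome)  # small penalty per selected class


def evolve(fitness, n_classes, pop_size=8, generations=4,
           mutation_rate=0.1, seed=0):
    """Evolve bit-mask genomes and return the fittest one found."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_classes)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half of the population as parents.
        parents = sorted(population, key=fitness, reverse=True)[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_classes)  # one-point crossover
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


best = evolve(toy_fitness, len(FEATURE_CLASSES))
selected = [name for name, bit in zip(FEATURE_CLASSES, best) if bit]
```

Because the parents survive into the next generation, the best fitness in the population never decreases. Swapping `toy_fitness` for a cross-validated score from a real classifier would turn the sketch into a wrapper-style feature selector.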

GA improves performance
[Figure: performance by EC level]

Balance between intersection and global features gives best performance
AA, acc, sec, htm, cons-95
AA, acc, sec, cons-95
AA, acc, acc×sec, htm, cons-95
AA, sec, cons-97
AA, acc×sec, sec, cons-95
AA, acc, acc×sec×cons-94, sec
AA, AA×acc×sec×cons-95, sec, cons-95
AA, sec
AA, acc, sec×cons-94, cons-83×cons-94
AA, acc×sec×cons-89, cons-95
AA, acc×sec×cons-84×cons-94, sec
AA×acc×htm×cons-84×cons-95, acc, cons-94
AA
AA, acc×cons-96, sec×cons-91
AA, acc×sec×cons-94, acc×cons-94
AA, acc×sec×cons-84×cons-94
AA, acc×sec×cons-88×cons-91×cons-95
AA×cons-94, acc×cons-94
AA×cons-82, acc×sec×cons-94
AA×cons-82, acc×sec×cons-94×cons-96
AA×sec×htm×cons-95×cons-96