Project list: 1. Peptide MHC binding predictions using position-specific scoring matrices, including pseudo counts and sequence weighting (Hobohm clustering)


Project list
1. Peptide MHC binding predictions using position-specific scoring matrices, including pseudo counts and sequence weighting (Hobohm clustering) techniques
2. Peptide MHC binding predictions using artificial neural networks with different sequence encoding schemes
3. Gibbs sampler approach to the prediction of MHC class II binding motifs
4. Implementation of the HMM Baum-Welch algorithm
5. Comparative study of PSSM and ANN for peptide MHC binding
6. Comparison of “fake” versus “true” cross-validation
7. ...

What is a Project
Purpose
–Use a method introduced in the course to describe some biological problem
How
–Construct a data set describing the problem
–Define which method to use
–(Develop method)
–Train and evaluate method
–(Compare performance to other methods)
Documentation
–Write report in the form of a research article (10-15 pages)
  Abstract
  Introduction
  Materials and methods
  Results
  Discussion
  References

PSSM
Peptide MHC binding predictions using position-specific scoring matrices, including pseudo counts and sequence weighting techniques
–Compare methods for sequence weighting
  Clustering vs heuristics
–Benchmark (Peters et al 2006) covering some 20 MHC molecules, compare to best other methods
  Data are at –
–Use IC50 = 500 nM as threshold for binders when making PSSM
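As a starting point for this project, the PSSM construction could be sketched as below. This is a minimal sketch under stated assumptions, not the course's reference implementation: the function names, the flat background frequencies, and the simple background-proportional pseudo count are all illustrative choices (a BLOSUM-based pseudo count would replace the flat one in a fuller version). The weighting shown is the heuristic position-based scheme; clustering-based weights can be passed in through the same `weights` argument.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def heuristic_weights(peptides):
    """Position-based heuristic weights: w_k = sum over positions of 1 / (r * s),
    where r is the number of distinct amino acids in the column and s is the
    number of peptides sharing peptide k's amino acid at that position."""
    length = len(peptides[0])
    weights = [0.0] * len(peptides)
    for pos in range(length):
        counts = Counter(pep[pos] for pep in peptides)
        r = len(counts)
        for k, pep in enumerate(peptides):
            weights[k] += 1.0 / (r * counts[pep[pos]])
    return weights


def build_pssm(peptides, weights=None, pseudocount=1.0, background=None):
    """Return one dict per position mapping amino acid -> log2-odds score.
    Assumption: pseudo counts are spread according to the background
    frequencies, which is a simplification chosen for this sketch."""
    length = len(peptides[0])
    if weights is None:
        weights = [1.0] * len(peptides)
    if background is None:
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    pssm = []
    for pos in range(length):
        counts = {aa: pseudocount * background[aa] for aa in AMINO_ACIDS}
        for pep, w in zip(peptides, weights):
            counts[pep[pos]] += w
        total = sum(counts.values())
        pssm.append({aa: math.log2((counts[aa] / total) / background[aa])
                     for aa in AMINO_ACIDS})
    return pssm


def score(pssm, peptide):
    """Sum the per-position log-odds scores of a peptide."""
    return sum(column[aa] for column, aa in zip(pssm, peptide))


# Hypothetical 9-mer binders (IC50 below 500 nM), for illustration only.
binders = ["ALAKAAAAM", "ALAKAAAAN", "ALAKAAAAR", "GMNERPILT"]
pssm = build_pssm(binders, weights=heuristic_weights(binders))
```

Comparing `build_pssm(binders)` with the weighted call above is one simple way to see how much the weighting changes the matrix when the training set contains near-duplicate peptides.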

NN
Peptide MHC binding predictions using artificial neural networks with different sequence encoding schemes
–Benchmark (Peters et al 2006) covering some 20 MHC molecules, compare to best other methods
–Compare sequence encoding schemes
  Sparse, Blosum, composition, charge, amino acid size, ..
  AA index database of amino acid physicochemical properties
  –
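Of the encoding schemes listed, sparse encoding is the simplest to illustrate. The sketch below assumes a 20-letter alphabet in a fixed, arbitrary order and 0/1 values; the function name and the alphabet ordering are illustrative choices, and other encodings (Blosum, physicochemical properties) would replace the 20-value one-hot vector per residue with a different per-residue vector.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sparse_encode(peptide):
    """Encode a peptide as a flat list of 20 values per residue,
    with 1.0 at the index of the residue's amino acid and 0.0 elsewhere."""
    encoded = []
    for aa in peptide:
        one_hot = [0.0] * len(AMINO_ACIDS)
        one_hot[AMINO_ACIDS.index(aa)] = 1.0  # raises ValueError on unknown letters
        encoded.extend(one_hot)
    return encoded


# A 9-mer becomes a 180-dimensional network input.
vector = sparse_encode("ALAKAAAAM")
```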

Gibbs sampler
Gibbs sampler approach to the prediction of MHC class II binding motifs
–Develop a Gibbs sampler for the prediction of MHC class II binding motifs
–Benchmark: Nielsen et al 2007, covering 14 HLA-DR alleles
  II-2.0.php
–Use IC50 = 500 nM (log50k = 0.426) as threshold for binders
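The log50k value quoted for the threshold is consistent with the commonly used transform 1 - log(IC50)/log(50000). The sketch below assumes that transform; the function names and the clipping of affinities to the range 1 to 50000 nM are illustrative choices, as is treating an IC50 exactly at the threshold as a binder.

```python
import math


def to_log50k(ic50_nm):
    """Map an IC50 in nM onto [0, 1]; stronger binders get higher values.
    Assumption: values are clipped to the range 1 nM to 50000 nM."""
    clipped = min(max(ic50_nm, 1.0), 50000.0)
    return 1.0 - math.log(clipped) / math.log(50000.0)


def is_binder(ic50_nm, threshold_nm=500.0):
    """Classify a peptide as a binder when its IC50 is at or below the threshold."""
    return ic50_nm <= threshold_nm


threshold = to_log50k(500.0)  # about 0.426
```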

Comparative study
Compare methods for MHC peptide binding
–PSSM
–ANN
Data: Benchmark by Peters et al 2006 covering some 20 MHC molecules

HMM
Implement Baum-Welch HMM training
–Based on code from the Tapas Kanungo HMM toolkit
–Hidden Markov Model (HMM) Software: Implementation of Forward-Backward, Viterbi, and Baum-Welch algorithms. The software has been compiled and tested on UNIX platforms (Sun Solaris, DEC OSF and Linux) and PC NT running the GNU package from Cygnus (has gcc, sh, etc.). It is distributed as a tar file or a zip file, together with a README file, PostScript slides for tutorial talks on HMMs, and a PDF version of the tutorial.
–Test code on the unfair casino example
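Baum-Welch re-estimation is built on the forward and backward variables, so a useful first check is that both recursions give the same sequence likelihood. The sketch below does this for a two-state casino model. It is independent of the Kanungo toolkit, and the transition, emission and start probabilities are illustrative assumptions rather than values taken from the slides. No scaling is applied, so it is only suitable for short sequences.

```python
STATES = ("fair", "loaded")

# Illustrative parameters (assumed for this sketch).
START = {"fair": 0.5, "loaded": 0.5}
TRANS = {
    "fair": {"fair": 0.95, "loaded": 0.05},
    "loaded": {"fair": 0.10, "loaded": 0.90},
}
EMIT = {
    "fair": {r: 1.0 / 6.0 for r in "123456"},
    "loaded": {**{r: 0.1 for r in "12345"}, "6": 0.5},
}


def forward(obs):
    """alpha[t][s] = P(x_1..x_t, state_t = s)."""
    alpha = [{s: START[s] * EMIT[s][obs[0]] for s in STATES}]
    for symbol in obs[1:]:
        prev = alpha[-1]
        alpha.append({
            s: EMIT[s][symbol] * sum(prev[r] * TRANS[r][s] for r in STATES)
            for s in STATES
        })
    return alpha


def backward(obs):
    """beta[t][s] = P(x_{t+1}..x_T | state_t = s)."""
    beta = [{s: 1.0 for s in STATES}]
    for symbol in reversed(obs[1:]):
        nxt = beta[0]
        beta.insert(0, {
            s: sum(TRANS[s][r] * EMIT[r][symbol] * nxt[r] for r in STATES)
            for s in STATES
        })
    return beta


def likelihood(obs):
    """P(x_1..x_T), summed over the final forward column."""
    return sum(forward(obs)[-1].values())


rolls = "316664"
alpha, beta = forward(rolls), backward(rolls)
# Posterior probability of each state at each roll, the quantity
# Baum-Welch uses to accumulate expected emission counts.
posterior = [
    {s: alpha[t][s] * beta[t][s] / likelihood(rolls) for s in STATES}
    for t in range(len(rolls))
]
```

For longer roll sequences the probabilities underflow, so a full implementation would rescale each column or work in log space.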

Method evaluation using cross-validation
Compare performance of data-driven prediction methods when evaluated using cross-validation
What is the difference between the “fake” and “true” cross-validated performance as a function of
–Data set size
–..
Data: Benchmark by Peters et al 2006 covering some 20 MHC molecules