An Analysis of “Coronavirus 3CLpro proteinase cleavage sites: Possible relevance to SARS virus pathology”
Connie Wu

Article Resources
BMC Bioinformatics 2004, 5:72
Published on Jun 6, 2004
Article URL /5/ /5/72
NetCorona URL

Outline
SARS outbreak in 2003
Introduction to the SARS virus
Experimental database used
Pattern recognition method
Neural network method
Biological significance of NetCorona

SARS Outbreak in 2003
A Chinese man in Hong Kong was found to have caught the infectious respiratory disease, the first case to emerge from the general population since July
The outbreak infected more than 8,000 people in close to 30 nations and killed more than 750.

SARS Virus
Belongs to the coronavirus family; human coronaviruses normally cause mild cold symptoms.
Proteolytic cleavage of host proteins by viral proteinases is a feature of the pathology of other virus families, such as picornaviruses.
Virus proliferation can be arrested using specific proteinase inhibitors.

Experimental database
Seven full-length coronavirus genomes were retrieved from the GenBank database.
Each sequence contained eleven 3CLpro proteinase cleavage sites, giving a total of 77 identifiable sites.
The main 3CLpro sites (P1) in the polyproteins were identified using alignment without gaps.
P1 = residue immediately N-terminal to the cleavage site
P1' = residue immediately C-terminal to the cleavage site

Consensus Pattern Recognition
Coronavirus proteinase cleavage sites show glutamine (Q) at position P1 and a strong preference for leucine (L) at position P2.
‘LQ’ consensus pattern prediction: 60/77 true positives (78%), plus 196 additional false positives from random occurrence of this pair of amino acids.
‘LQ[S/A]’ consensus pattern prediction: 48/77 true positives (62%), plus 36 additional false positives.
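The two consensus patterns can be illustrated with a small regular-expression scan. This is a minimal sketch, not the procedure used in the article: the function name `predict_sites` and the toy sequence are hypothetical, and positions are reported as the 1-based index of the P1 glutamine.

```python
import re

# The two consensus patterns named on the slide.
PATTERNS = {
    "LQ": re.compile(r"LQ"),
    "LQ[S/A]": re.compile(r"LQ[SA]"),
}

def predict_sites(sequence, pattern):
    """Return 1-based positions of the P1 glutamine for every pattern match."""
    # m.start() is the 0-based index of L (P2); Q (P1) follows it,
    # so its 1-based position is m.start() + 2.
    return [m.start() + 2 for m in pattern.finditer(sequence)]

# Hypothetical toy sequence, not taken from any coronavirus genome.
toy = "MKTLQSGAVLQAPLQGTT"

lq_hits = predict_sites(toy, PATTERNS["LQ"])
lqsa_hits = predict_sites(toy, PATTERNS["LQ[S/A]"])
```

On the toy sequence the stricter pattern reports a subset of the ‘LQ’ hits, which is the behaviour behind the drop in both true and false positives.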

Limitations of Pattern Recognition
Simple consensus pattern recognition (e.g. ‘LQ’): high sensitivity, low specificity
More sophisticated consensus pattern recognition (e.g. ‘LQ[S/A]’): high specificity, low sensitivity
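The trade-off can be made concrete with the counts reported for the two patterns. The sketch below computes sensitivity and, as a stand-in for specificity, precision, because the number of true negatives for the pattern methods is not given in these slides; that substitution is an assumption of this example.

```python
TOTAL_SITES = 77  # known cleavage sites in the data set

def sensitivity(tp, fn):
    """Fraction of real cleavage sites that the method finds."""
    return tp / (tp + fn)

def precision(tp, fp):
    """Fraction of predicted sites that are real cleavage sites."""
    return tp / (tp + fp)

# (true positives, false positives) as reported for each pattern.
counts = {"LQ": (60, 196), "LQ[S/A]": (48, 36)}

summary = {}
for name, (tp, fp) in counts.items():
    fn = TOTAL_SITES - tp  # known sites the pattern missed
    summary[name] = (sensitivity(tp, fn), precision(tp, fp))
```

Under these counts the stricter pattern loses sensitivity but a much larger share of its predictions are real sites.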

Neural Network
Input: a sequence window of 9 amino acids centered on the glutamine in the P1 position.
Output: a score between 0 and 1 assigned to every glutamine that is present.
Score > 0.8: most likely cleaved
Score 0.5 to 0.8: possibly cleaved
Score < 0.5: likely not cleaved
67/77 true positives (87.0%)
1358/1372 true negatives (99.0%)
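The windowing and the score bands can be sketched as follows. This is an illustrative sketch only: the padding symbol `X` for windows that run past the sequence ends and the function names are assumptions, not details from the article.

```python
WINDOW = 9
HALF = WINDOW // 2
PAD = "X"  # hypothetical padding symbol for windows near the sequence ends

def glutamine_windows(sequence):
    """Yield (1-based position, 9-residue window) for every Q in the sequence."""
    padded = PAD * HALF + sequence + PAD * HALF
    for i, residue in enumerate(sequence):
        if residue == "Q":
            # In the padded string, residue i sits at index i + HALF,
            # so this slice is centered on the glutamine.
            yield i + 1, padded[i:i + WINDOW]

def interpret(score):
    """Map a network score to the bands given on the slide."""
    if score > 0.8:
        return "most likely cleaved"
    if score >= 0.5:
        return "possibly cleaved"
    return "likely not cleaved"
```

Each window would then be encoded and passed to the network, and the resulting score read through `interpret`.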

Neural Network
Three-layered neural network
Two hidden neurons
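A forward pass through a network of this shape might look like the sketch below. The slide gives only the layer count and the two hidden neurons; the one-hot input encoding (9 positions × 20 amino acids), the sigmoid activations, and the random untrained weights are all assumptions made for illustration.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def encode(window):
    """One-hot encode a window: 20 inputs per position (unknown symbols are all zeros)."""
    vec = []
    for residue in window:
        vec.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
    return vec

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Input layer -> two hidden neurons -> one output score in (0, 1)."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
        for weights, bias in zip(w_hidden, b_hidden)
    ]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)

# Random, untrained weights: the score below is meaningless biologically.
rng = random.Random(0)
n_inputs = 9 * len(AMINO_ACIDS)
w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)] for _ in range(2)]
b_hidden = [0.0, 0.0]
w_out = [rng.uniform(-0.1, 0.1) for _ in range(2)]
b_out = 0.0

score = forward(encode("MKTLQSGAV"), w_hidden, b_hidden, w_out, b_out)
```

With only two hidden neurons the model has few free parameters, which suits a training set of 77 positive examples.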

Neural Network Training
Training was done with three-fold cross-validation, and Matthews correlation coefficients were calculated by summing values over all combinations of training and test sets.
The average of the scores of the three networks arising from the three-fold cross-validation was used for prediction.
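The pieces of this procedure can be sketched in a few lines. The Matthews correlation coefficient formula is standard; the interleaved fold assignment and the function names are assumptions, and applying the formula to the pooled counts from the Neural Network slide (67 true positives, 10 false negatives, 1358 true negatives, 14 false positives) is an illustration, not necessarily how the article computed its reported value.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def three_fold_splits(items):
    """Yield (train, test) pairs; each fold serves as the test set once."""
    folds = [items[i::3] for i in range(3)]  # hypothetical interleaved assignment
    for k in range(3):
        train = [x for j, fold in enumerate(folds) if j != k for x in fold]
        yield train, folds[k]

def ensemble_score(scores):
    """Average the scores of the three cross-validation networks."""
    return sum(scores) / len(scores)

# Pooled counts taken from the Neural Network slide.
pooled_mcc = mcc(tp=67, tn=1358, fp=14, fn=10)
splits = list(three_fold_splits(list(range(9))))
```

Averaging the three networks means every prediction draws on all of the training data, while each individual network was still evaluated on a fold it never saw.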

Neural Network on Host Cell Proteins
Cystic fibrosis transmembrane conductance regulator (CFTR), an ATP-dependent chloride channel, is predicted to be cleaved at Gln762 with a high score.
Transcription factor OCT-1 is predicted to be cleaved at Gln62 by the 3CLpro proteinase with a high confidence score of

Limitations of NetCorona
High specificity, low sensitivity
Not accurate in predicting sites with relatively low cleavage efficiency in vivo.
High-scoring cleavage sites that are inaccessible to the proteinase need to be disregarded.

Significance of NetCorona
Can be used by researchers who suspect a possible viral proteinase cleavage.
Useful when working on coronavirus function.
May facilitate proteinase inhibitor drug discovery.
Possible future strategy for drug development.