Protein Functional Site Prediction


Protein Functional Site Prediction: The identification of protein regions responsible for stability and function is an especially important post-genomic problem. With the explosion of genomic data from recent sequencing efforts, predicting protein functional sites from sequence alone is an increasingly important bioinformatics endeavor.

What is a “Functional Site”? Defining what constitutes a “functional site” is not trivial. Residues that include and cluster around known functionality are clear candidates for functional sites. We define a functional site as catalytic residues, binding sites, and the regions that cluster around them.

Protein

Protein + Ligand

Functional Sites (FS)

Regions that Cluster Around FS

Phylogenetic motifs: Phylogenetic motifs (PMs) are short sequence fragments that conserve the overall familial phylogeny. Are they functional? How do we detect them? First, we design a simple heuristic to find them. Then we check whether the detected sites are functional.

Phylogenetic Motif Identification: Compare every windowed tree with the whole-sequence tree and record the partition metric score for each window. Normalize the partition metric scores by converting them to z-scores. Call these normalized scores Phylogenetic Similarity Z-scores (PSZ). Set a PSZ threshold to identify the windows that represent phylogenetic motifs.
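
The scoring steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the presenters' implementation. It assumes each tree has already been reduced to a set of bipartitions (each written as the group of taxa on the side that excludes a fixed reference taxon), that the partition metric is the number of bipartitions found in one tree but not the other, and that more negative z-scores mean greater similarity because the metric is a distance. The function names, the toy trees, and the threshold of -1.0 are all hypothetical.

```python
from statistics import mean, pstdev

def partition_metric(splits_a, splits_b):
    """Count the bipartitions present in exactly one of the two trees."""
    return len(splits_a ^ splits_b)

def psz_scores(window_splits, whole_splits):
    """Z-score normalize each window's partition metric against all windows."""
    raw = [partition_metric(w, whole_splits) for w in window_splits]
    mu, sigma = mean(raw), pstdev(raw)
    if sigma == 0:  # every window scored the same, so no window stands out
        return [0.0] * len(raw)
    return [(score - mu) / sigma for score in raw]

def motif_windows(psz, threshold):
    """Indices of windows whose PSZ is at or below the (assumed negative) threshold."""
    return [i for i, z in enumerate(psz) if z <= threshold]

def split(*taxa):
    """Helper for the toy data: one side of a bipartition."""
    return frozenset(taxa)

# Toy data: five taxa A-E, with E as the fixed reference taxon.
whole = {split("A", "B"), split("C", "D")}
windows = [
    {split("A", "B"), split("C", "D")},       # same topology as the whole tree
    {split("A", "C"), split("B", "D")},       # shares no bipartition
    {split("A", "B"), split("A", "B", "C")},  # shares one bipartition
    {split("A", "C"), split("B", "D")},       # shares no bipartition
]

psz = psz_scores(windows, whole)
hits = motif_windows(psz, threshold=-1.0)
```

In this toy case the raw partition metric scores are 0, 4, 2, and 4, so only the first window falls below the threshold. In practice the windowed and whole-sequence trees would come from a tree-building program, and the threshold would be tuned per protein family.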

Set PSZ Threshold

Regions of PMs
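
One plausible way to go from individual windows that pass the PSZ threshold to contiguous PM regions is to merge windows that overlap or touch. The slides do not say how regions are formed, so the merging rule, the function name, and the window width below are assumptions made purely for illustration.

```python
def merge_windows(starts, width):
    """Merge overlapping or adjacent windows into regions.

    Each window covers alignment columns [start, start + width), using
    0-based, half-open coordinates (an assumed convention).
    """
    regions = []
    for start in sorted(starts):
        end = start + width
        if regions and start <= regions[-1][1]:
            # This window overlaps or touches the previous region: extend it.
            regions[-1][1] = max(regions[-1][1], end)
        else:
            regions.append([start, end])
    return [tuple(region) for region in regions]

# Hypothetical window starts that passed the threshold, with width 5.
pm_regions = merge_windows([10, 11, 12, 40], width=5)
# pm_regions == [(10, 17), (40, 45)]
```

Three consecutive windows collapse into a single region, while the isolated window at column 40 stays separate.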

TIM: Phylogenetic Similarity and False Positive Expectation

Cytochrome P450: Phylogenetic Similarity and False Positive Expectation

Enolase: Phylogenetic Similarity and False Positive Expectation

Glycerol Kinase: Phylogenetic Similarity and False Positive Expectation

Myoglobin: Phylogenetic Similarity and False Positive Expectation