Sourav Roy School of Informatics Indiana University

Presentation transcript:

Unraveling the segmentation clock with bioinformatics tools and techniques.
Sourav Roy, School of Informatics, Indiana University
Advisors: Dr. Predrag Radivojac & Dr. Santiago Schnell
Capstone Instructor: Dr. Mehmet Dalkilic
11/27/2018 Capstone Presentation 04/21/2006

Overview
Introduction
Background and Motivation
Notch Signaling Pathway (NSP)
Structure of NSP proteins (NSPPs)
Potential Binding Sites in NSPPs
Probable Protein-Protein Interactions
Interaction Map & Boolean Model for the Pathway

Beginning of Life
[Figure: zygote, cleavage, blastula, start of gastrulation]

Vertebrate segmentation
Segmentation was nature's answer to the development of complex organisms.
Presomitic mesoderm (PSM), or paraxial mesoderm, gives rise to equal-sized vertebrate segments, the somites.
Repeated formation of intersomitic boundaries is driven by a molecular oscillator (the segmentation clock).
Source: Cancer Research UK website

Periodicity

Why are we interested?
To date, models of the pathway have been built on the basis of knockout experiments, but in most cases it is not known whether a single interaction or a cascade is involved.

Notch Signaling Pathway
[Figure: Notch signaling pathway. Source: Biocarta]

Proteins in Mouse Notch Signaling Pathway
Notch1 - Transmembrane receptor.
Dll1 - Ligand.
Dll3 - Ligand.
Psen1 - Membrane-bound protein with γ-secretase activity.
Lfng - Beta-1,3-N-acetylglucosaminyltransferase.
RbpSuh - DNA-binding transcription factor.
Mesp2 - bHLH protein.
Hes1 - bHLH transcription repressor.
Hes7 - bHLH transcription repressor.

Flexibility, Rigidity and Binding

Order and Disorder
[Figure: ordered regions and disordered regions. Source: Bachinsky and V.V. Solovyev]

Disorder Prediction
Disorder has a role to play in signal transduction, cell cycle regulation and transcriptional activity (Dunker et al. 2002; Iakoucheva et al. 2002; Ward et al. 2004).
The Swiss-Prot database (June 2005 version) was downloaded and all mouse proteins were extracted with the help of a Matlab code.
The VL3 model (Obradovic et al., 2003) was used to calculate the disorder for each protein.
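The extraction step above was done in Matlab and the code is not shown on the slide. As a rough sketch only, the same kind of filter could look like this in Python, assuming the Swiss-Prot flat-file layout in which each entry carries `AC` and `OS` lines and ends with a `//` line. The function name `mouse_accessions` and the two example entries are made up for illustration.

```python
def mouse_accessions(flat_text):
    """Return the primary accession of every entry whose OS line names Mus musculus."""
    accessions = []
    # Assumption: entries in the flat file end with a line containing only "//".
    for entry in flat_text.split("\n//"):
        lines = entry.strip().splitlines()
        organism = " ".join(line[5:] for line in lines if line.startswith("OS   "))
        if "Mus musculus" not in organism:
            continue
        ac_lines = [line[5:] for line in lines if line.startswith("AC   ")]
        if ac_lines:
            # The first accession on the first AC line is taken as the primary one.
            accessions.append(ac_lines[0].split(";")[0].strip())
    return accessions


# Two made-up entries, trimmed to the line types the filter reads.
example = (
    "ID   EXAMPLE1_MOUSE\n"
    "AC   X00001;\n"
    "OS   Mus musculus (Mouse).\n"
    "//\n"
    "ID   EXAMPLE2_HUMAN\n"
    "AC   X00002; X00003;\n"
    "OS   Homo sapiens (Human).\n"
    "//\n"
)
print(mouse_accessions(example))
```

The sequences of the selected entries would then be passed to the disorder predictor; that step is not sketched here because the VL3 interface is not described on the slide.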

Notch Signaling Pathway Proteins Intrinsically Disordered?
Proteins                                                                       % Disorder
Average of 9 pathway proteins                                                  56.4
Average of all 9448 mouse proteins in the Swiss-Prot database on 05/10/2005    32.2
A Matlab code incorporating VL3 was used to calculate the average disorder.
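The slide does not say how the percentage was derived from the predictor output. A minimal sketch, assuming the predictor returns one score between 0 and 1 per residue and that a residue counts as disordered above a cutoff of 0.5 (both are assumptions, as are the function names and the toy scores):

```python
def percent_disorder(scores, threshold=0.5):
    """Percentage of residues whose disorder score exceeds the threshold."""
    if not scores:
        raise ValueError("no residue scores given")
    disordered = sum(1 for s in scores if s > threshold)
    return 100.0 * disordered / len(scores)


def average_disorder(scores_by_protein, threshold=0.5):
    """Mean of the per-protein disorder percentages."""
    values = [percent_disorder(s, threshold) for s in scores_by_protein.values()]
    return sum(values) / len(values)


# Toy per-residue scores for two hypothetical proteins.
toy = {"protA": [0.9, 0.8, 0.7, 0.2], "protB": [0.1, 0.2, 0.6, 0.3]}
print(average_disorder(toy))
```

Averaging per-protein percentages weights every protein equally regardless of length; pooling all residues instead would be a different statistic.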

Disorder Percentage in General
Sampling Method I:
a) Random samples of 10 / 30 / 60 / 90 / 100 / 300 / 600 / 900 / 1000 proteins were taken from the set of 9448 proteins.
b) The average disorder and standard deviation were calculated for each sample.
c) A t-test (Fisher's t-test) was performed.
Steps a) and b) were done with the help of a Matlab code; a Java code was used for the t-test.
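Steps a) to c) can be sketched as follows. This is an illustration under stated assumptions, not the Matlab and Java code used in the study: the population is synthetic, the test shown is a pooled two-sample t statistic of each sample against the full set, and the p-value uses a normal approximation that is only reasonable because the degrees of freedom are large. All function names are hypothetical.

```python
import math
import random
import statistics


def sample_summary(values, n, rng):
    """Draw n values without replacement and return their mean and standard deviation."""
    sample = rng.sample(values, n)
    return statistics.mean(sample), statistics.stdev(sample)


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with a pooled variance estimate."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))


def two_sided_p_normal(t):
    """Two-sided p-value from the standard normal; an approximation for large df."""
    return math.erfc(abs(t) / math.sqrt(2))


rng = random.Random(1)
# Synthetic stand-in for the 9448 per-protein disorder percentages.
population = [rng.uniform(0, 100) for _ in range(9448)]
pop_mean = statistics.mean(population)
pop_sd = statistics.stdev(population)

for n in (10, 30, 60, 90, 100, 300, 600, 900, 1000):
    mean, sd = sample_summary(population, n, rng)
    t = pooled_t(mean, sd, n, pop_mean, pop_sd, len(population))
    print(n, round(mean, 1), round(sd, 1), round(two_sided_p_normal(t), 4))
```

For small comparison groups an exact t distribution would be the safer choice; the normal shortcut is used here only to keep the sketch dependency-free.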

t-test results
# of Proteins   Mean   Sd     P value   Significance
9               56.4   25.2   0.0062    Yes
10              26.6   23.0   0.5029    No
60              31.6   26.5   0.8572
100             29.8   23.1   0.3628
600             30.6   25.7   0.1556
1000            32.1   26.1   0.9283
[Chart: % disorder (y-axis) against # of proteins (x-axis: 9, 10, 60, 100, 600, 1000)]

Sampling Method II
Different sample sets 1, 2, 3, …, 11 were taken.
Each of the sets had 9 random proteins from the entire set of 9448.
The average disorder of the 9 proteins in each set was calculated.
The average of the averages and the standard deviation across the sample sets were calculated.
A Matlab code was used for this method.
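The procedure above can be sketched in a few lines. The defaults of 11 sets and 9 proteins per set come from the slide; the function name, the seed and the synthetic input are assumptions made for illustration, and the original Matlab code may differ.

```python
import random
import statistics


def mean_of_sample_means(values, n_sets=11, set_size=9, seed=0):
    """Draw n_sets random sets of set_size values and summarise their means."""
    rng = random.Random(seed)
    set_means = [
        statistics.mean(rng.sample(values, set_size)) for _ in range(n_sets)
    ]
    return set_means, statistics.mean(set_means), statistics.stdev(set_means)


# Synthetic stand-in for the 9448 per-protein disorder percentages.
data_rng = random.Random(42)
population = [data_rng.uniform(0, 100) for _ in range(9448)]
set_means, grand_mean, spread = mean_of_sample_means(population)
print([round(m, 1) for m in set_means], round(grand_mean, 1), round(spread, 1))
```

Using sets of 9 makes each random set directly comparable in size to the 9 pathway proteins.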

Sampling Method II results
[Chart: % disorder (y-axis) for each sample set (x-axis, # of sample sets: N, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11)]

Potential binding sites
MoREs: short, interaction-prone, loosely structured or semi-structured regions in intrinsically disordered proteins.
α-MoRE: a subclass of MoREs.
Source: Oldfield et al., Biochemistry 2005
[Figure: ordered regions, disordered regions and an α-MoRE. Source: Bachinsky and V.V. Solovyev]

α-MoRE Prediction
[Plot: α-MoRE score (y-axis) against amino acid residue (x-axis)]

Disordered Regions and α-MoREs
[Figure legend: ordered regions, disordered regions, α-MoREs at p > 0.5, α-MoREs at p > 0.7]

To check if the disordered regions and the α-MoREs are conserved
PSI-BLAST of the pathway proteins
Extraction of the aligned regions with the help of a Perl script
Clustering of the aligned regions with the help of a Perl script
Multiple sequence alignment of the clustered sequences by ClustalW
Parsing the Clustal report and calculating the entropy of each column with the help of a Matlab code
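The last step above, per-column entropy, can be sketched as below. This assumes Shannon entropy in bits over the symbols in each alignment column, with the gap character counted as an ordinary symbol; the slide does not say how gaps were handled, and the function name and toy alignment are illustrative.

```python
import math
from collections import Counter


def column_entropies(aligned):
    """Shannon entropy (bits) of each column of equal-length aligned sequences."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    entropies = []
    for column in zip(*aligned):
        total = len(column)
        counts = Counter(column)
        # Sum of p * log2(1/p) over the symbols seen in this column.
        entropies.append(
            sum((c / total) * math.log2(total / c) for c in counts.values())
        )
    return entropies


# Toy alignment: the first column is fully conserved, the last is fully variable.
toy_alignment = ["MKLA", "MKIC", "MRLD", "MRIE"]
print(column_entropies(toy_alignment))
```

A column with a single residue type scores 0, so low entropy reads as high conservation.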

Results for Hes7

Prediction of possible protein-protein interactions
The ADVICE web tool was used for the prediction.
ADVICE is a web server providing Automated Detection and Validation of Interaction based on the Co-Evolutions between interacting proteins. It automates the steps needed to compute the similarities between proteins' evolutionary histories to detect co-evolved interacting proteins.
Step 1. Homolog search. ADVICE detects orthologous sequences for pair(s) of protein sequences and retrieves the orthologous sequences if both appear in the same species.
Step 2. Distance matrix construction. ADVICE then constructs a distance matrix for each group of orthologous sequences. The distance matrices are derived from multiple sequence alignments using ClustalW.
Step 3. Linear correlation coefficient computation. Pearson's correlation coefficient is used to calculate the similarity between the two distance matrices. The result r falls between -1 and 1. Previous studies have indicated that interacting proteins share similarity in their evolutionary histories and have a high r-value (>= 0.8).
Source: ADVICE website
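Step 3 can be illustrated with a small sketch. It assumes the two distance matrices are symmetric, cover the same species in the same order, and are compared through their upper triangles; this is a generic Pearson calculation, not ADVICE's own code, and the function names and toy matrices are made up.

```python
import math


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def matrix_correlation(dist_a, dist_b):
    """Correlation between the pairwise distances of two protein families."""
    return pearson_r(upper_triangle(dist_a), upper_triangle(dist_b))


# Toy distance matrices for two hypothetical families over the same three species.
family_a = [[0, 1, 2], [1, 0, 3], [2, 3, 0]]
family_b = [[0, 2, 4], [2, 0, 6], [4, 6, 0]]
print(matrix_correlation(family_a, family_b))
```

Only the upper triangle is used so that each species pair contributes once and the zero diagonal does not inflate the correlation.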

Predicted PPI
Protein Pairs       Chances of interaction (r)
Notch1 & Dll1       89.50%
Notch1 & Psen1      97.20%
Notch1 & Lfng       98.90%
Notch1 & RbpSuh     98.20%
RbpSuh & Hes7       97.70%
Hes1 & Lfng         97.80%
Notch1 & Hes7
Lfng & Dll1         88.70%

Interaction Map
[Figure: interaction map across two adjacent cells (Cell1, Cell2), showing Notch, Dll1/Dll, NICD, Psen1, Rbp/Suh, the nucleus, and the genes and products dll, mesp2/Mesp2, lfng/Lfng and hes1/7/Hes1/7]

Boolean Logic
Mesp2(t)  (Notch1(t)  Psen1(t)  RbpSuh(t)) → Dll1/3(t+1)
Dll1/3(t)  ¬Lfng(t) → Notch1(t+1)
{(Notch1(t)  Psen1(t)  RbpSuh(t))  Mesp2(t)}  ¬Hes1/7(t) → Lfng(t+1)
Notch1(t)  Psen1(t)  RbpSuh(t) → Mesp2(t+1)
(Notch1(t)  Psen1(t)  RbpSuh(t))  ¬Hes1/7(t) → Hes1/7(t+1)
Notch1(t)  Psen1(t) → RbpSuh(t+1)
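A Boolean model of this kind updates every node at time t+1 from the node states at time t. The sketch below shows that synchronous update scheme on a toy three-node negative feedback loop. The nodes A, B and C and their rules are hypothetical and are not the pathway rules listed on the slide; they only show how such a model can produce sustained oscillations.

```python
def step(state, rules):
    """Compute the state at t+1 from the state at t (synchronous update)."""
    return {name: rule(state) for name, rule in rules.items()}


def simulate(initial, rules, n_steps):
    """Return the list of states from t = 0 to t = n_steps."""
    trajectory = [dict(initial)]
    for _ in range(n_steps):
        trajectory.append(step(trajectory[-1], rules))
    return trajectory


# Hypothetical toy rules: A is repressed by C, B follows A, C follows B.
toy_rules = {
    "A": lambda s: not s["C"],
    "B": lambda s: s["A"],
    "C": lambda s: s["B"],
}

start = {"A": True, "B": False, "C": False}
for t, state in enumerate(simulate(start, toy_rules, 12)):
    print(t, "".join("1" if state[name] else "0" for name in ("A", "B", "C")))
```

Because every rule reads only the previous state, the order in which nodes are evaluated within a step does not matter.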

Expression of hes/her regulates the segmentation clock
[Figure: the molecular network constituting the segmentation clock in two adjacent cells, showing the Notch pathway and the HER1/HER7-based oscillator, excluding the Notch pathway components]
Source: Pourquie & Goldbeter, Current Biology 2003

Oscillations within the pathway
[Figure: state of Dll1, Notch1, Psen1, Lfng, Mesp2, RbpSuh and Hes7 over time steps 1 to 15]

Review: What was known?
Somitogenesis is controlled by the segmentation clock.
The segmentation clock in turn is controlled by the expression of the hairy and enhancer of split (Hes) family of genes.
Expression of the Hes family is determined by the Notch signaling pathway.

Review: Information from this study
The NSPPs have more disordered regions than the average mouse protein (56.4% vs. 32.2% disorder).
The MoREs seem to lie either in the disordered regions or near them, making them probable binding sites for binding partners.
These regions have low entropy and therefore seem to be conserved.
An interaction map was built on the basis of predicted PPIs and genetic interactions.
The Boolean model produced the required oscillations.

Review

Acknowledgements
Dr. S. Schnell. Dr. P. Radivojac. Dr. M. Dalkilic. Dr. H. Tang. Dr. S. Kim. Dr. C. Raphael. Junguk Hur. Systems Biology Group. School of Informatics. Everyone related to Bioinformatics at IUB. My family and friends.
Thanks !!!