Evaluation of Signaling Cascades Based on the Weights from Microarray and ChIP-seq Data by Zerrin Işık Volkan Atalay Rengül Çetin-Atalay Middle East Technical.

Slides:



Advertisements
Similar presentations
Linear Models for Microarray Data
Advertisements

Methods to read out regulatory functions
Statistical methods and tools for integrative analysis of perturbation signatures Mario Medvedovic Laboratory for Statistical Genomics and Systems Biology.
Computational discovery of gene modules and regulatory networks Ziv Bar-Joseph et al (2003) Presented By: Dan Baluta.
Prof. Carolina Ruiz Computer Science Department Bioinformatics and Computational Biology Program WPI WELCOME TO BCB4003/CS4803 BCB503/CS583 BIOLOGICAL.
Data integration across omics landscapes Bing Zhang, Ph.D. Department of Biomedical Informatics Vanderbilt University School of Medicine
Basic Genomic Characteristic  AIM: to collect as much general information as possible about your gene: Nucleotide sequence Databases ○ NCBI GenBank ○
Detecting DNA-protein Interactions Xinghua Lu Dept Biomedical Informatics BIOST 2055.
Next-generation sequencing and PBRC. Next Generation Sequencer Applications DeNovo Sequencing Resequencing, Comparative Genomics Global SNP Analysis Gene.
Genome-wide prediction and characterization of interactions between transcription factors in S. cerevisiae Speaker: Chunhui Cai.
27803::Systems Biology1CBS, Department of Systems Biology Schedule for the Afternoon 13:00 – 13:30ChIP-chip lecture 13:30 – 14:30Exercise 14:30 – 14:45Break.
ONCOMINE: A Bioinformatics Infrastructure for Cancer Genomics
Indiana University Bloomington, IN Junguk Hur Computational Omics Lab School of Informatics Differential location analysis A novel approach to detecting.
27803::Systems Biology1CBS, Department of Systems Biology Schedule for the Afternoon 13:00 – 13:30ChIP-chip lecture 13:30 – 14:30Exercise 14:30 – 14:45Break.
CISC667, F05, Lec24, Liao1 CISC 667 Intro to Bioinformatics (Fall 2005) DNA Microarray, 2d gel, MSMS, yeast 2-hybrid.
ChIP-seq QC Xiaole Shirley Liu STAT115, STAT215. Initial QC FASTQC Mappability Uniquely mapped reads Uniquely mapped locations Uniquely mapped locations.
Bryan Heck Tong Ihn Lee et al Transcriptional Regulatory Networks in Saccharomyces cerevisiae.
Epistasis Analysis Using Microarrays Chris Workman.
Systematic Analysis of Interactome: A New Trend in Bioinformatics KOCSEA Technical Symposium 2010 Young-Rae Cho, Ph.D. Assistant Professor Department of.
341: Introduction to Bioinformatics Dr. Natasa Przulj Deaprtment of Computing Imperial College London
Paola CASTAGNOLI Maria FOTI Microarrays. Applicazioni nella genomica funzionale e nel genotyping DIPARTIMENTO DI BIOTECNOLOGIE E BIOSCIENZE.
Quality assessment AATGCGTACATGCACCANTTCAG GTC TGTCANNTGCATTACATGCATTGA CC AATGCGTACATGCACCANTTCAG GTC TGTCTTTTGCATNACATGCAAAAA CC TGTCTTTTGCATNACATGCAGGG.
Bioinformatics Core Facility Ernesto Lowy February 2012.
Mapping protein-DNA interactions by ChIP-seq Zsolt Szilagyi Institute of Biomedicine.
Analysis of Microarray Data 1.Scan the images 2.Quantify intensity of spots 3.Normalization 4.Analysis of data 5.Identification of genes of interest 6.Validation.
A systems biology approach to the identification and analysis of transcriptional regulatory networks in osteocytes Angela K. Dean, Stephen E. Harris, Jianhua.
Detecting enriched regions (Chip- seq, RIP-seq) Statistical evaluation of enriched regions Data displayed in Genome Browser Detection of enriched motifs.
Amandine Bemmo 1,2, David Benovoy 2, Jacek Majewski 2 1 Universite de Montreal, 2 McGill university and Genome Quebec innovation centre Analyses of Affymetrix.
Analyzing ChIP-seq data
RNAseq analyses -- methods
Kristen Horstmann, Tessa Morris, and Lucia Ramirez Loyola Marymount University March 24, 2015 BIOL398-04: Biomathematical Modeling Lee, T. I., Rinaldi,
Massive Parallel Sequencing
* only 17% of SNPs implicated in freshwater adaptation map to coding sequences Many, many mapping studies find prevalent noncoding QTLs.
Finish up array applications Move on to proteomics Protein microarrays.
Chromatin Immunoprecipitation DNA Sequencing (ChIP-seq)
Vidyadhar Karmarkar Genomics and Bioinformatics 414 Life Sciences Building, Huck Institute of Life Sciences.
I519 Introduction to Bioinformatics, Fall, 2012
CS5263 Bioinformatics Lecture 20 Practical issues in motif finding Final project.
Integrating the Bioinformatic Technology Group into your research programme Introduction People and Skills Examples Integrating the BTG Contacts BHRC Away.
Modeling of complex systems: what is relevant? Arno Knobbe, Marvin Meeng, Joost Kok Leiden Institute of Advanced Computer Science (LIACS)
1 Global expression analysis Monday 10/1: Intro* 1 page Project Overview Due Intro to R lab Wednesday 10/3: Stats & FDR - * read the paper! Monday 10/8:
Other genomic arrays: Methylation, chIP on chip… UBio Training Courses.
COMPUTATIONAL ANALYSIS OF MULTILEVEL OMICS DATA FOR THE ELUCIDATION OF MOLECULAR MECHANISMS OF CANCER Presented by Azeez Ayomide Fatai Supervisor: Junaid.
Alistair Chalk, Elisabet Andersson Stem Cell Biology and Bioinformatic Tools, DBRM, Karolinska Institutet, September Day 5-2 What bioinformatics.
Starting Monday M Oct 29 –Back to BLAST and Orthology (readings posted) will focus on the BLAST algorithm, different types and applications of BLAST; in.
Algorithms in Bioinformatics: A Practical Introduction
GeWorkbench John Watkinson Columbia University. geWorkbench The bioinformatics platform of the National Center for the Multi-scale Analysis of Genomic.
While gene expression data is widely available describing mRNA levels in different cancer cells lines, the molecular regulatory mechanisms responsible.
DAVID Bioinformatics Web Site 2012 – 2015 David Huang, MD LMS/CCR/NCI
Copyright OpenHelix. No use or reproduction without express written consent1.
Lecture-5 ChIP-chip and ChIP-seq
Pathway Ranking Tool Dimitri Kosturos Linda Tsai SoCalBSI, 8/21/2003.
Eigengenes as biological signatures Dr. Habil Zare, PhD PI of Oncinfo Lab Assistant Professor, Department of Computer Science Texas State University 5.
Introduction of the ChIP-seq pipeline Shigeki Nakagome November 16 th, 2015 Di Rienzo lab meeting.
Affymetrix User’s Group Meeting Boston, MA May 2005 Keynote Topics: 1. Human genome annotations: emergence of non-coding transcripts -tiling arrays: study.
Nature as blueprint to design antibody factories Life Science Technologies Project course 2016 Aalto CHEM.
Expression Data Integration Microarray Gene Expression Database Meeting Sunday 14th November 1999.
Special Topics in Genomics ChIP-chip and Tiling Arrays.
BIOBASE Training TRANSFAC ® Containing data on eukaryotic transcription factors, their experimentally-proven binding sites, and regulated genes ExPlain™
Yiming Kang, Hien-haw Liow, Ezekiel Maier, & Michael Brent
KEY CONCEPT Entire genomes are sequenced, studied, and compared.
Altered microRNA expression in stenoses of native arteriovenous fistulas in hemodialysis patients  Lei Lv, MD, Weibin Huang, MD, Jiwei Zhang, MD, Yaxue.
KEY CONCEPT Entire genomes are sequenced, studied, and compared.
Widespread modulation of epigenetic landscape for differentiated organoids is driven by Hnf4g Widespread modulation of epigenetic landscape for differentiated.
ChIP-seq Robert J. Trumbly
Genome-wide analysis of p53 occupancy.
KEY CONCEPT Entire genomes are sequenced, studied, and compared.
BIOBASE Training TRANSFAC® ExPlain™
Genomewide profiling of chromatin accessibility in prostate cancer specimens Genomewide profiling of chromatin accessibility in prostate cancer specimens.
KEY CONCEPT Entire genomes are sequenced, studied, and compared.
Presentation transcript:

Evaluation of Signaling Cascades Based on the Weights from Microarray and ChIP-seq Data by Zerrin Işık Volkan Atalay Rengül Çetin-Atalay Middle East Technical University and Bilkent University Ankara - TURKEY

Content Analysis of Microarray Data ChIP-Seq Data Data Processing & Integration Scoring of Signaling Cascades Results

Traditional Analysis of Microarray Data Array2BIO BMC Bioinf. 2006

Traditional Analysis of Microarray Data Microarray Proteomics Tissuearray Protein Databases Scientific Literature Expression, Function, Interaction data Data Acquisition Integration Analysis ChIP-Seq

Traditional Analysis of Microarray Data

These tools depend on the primary significant gene lists!

Our Framework

Content Analysis of Microarray Data ChIP-Seq Data Data Processing & Integration Scoring of Signaling Cascades Results

Chromatin ImmunoPrecipitation

ChIP-Sequencing Chromatin Immunoprecipitation (ChIP) combined with genome re-sequencing (ChIP-seq) technology provides protein DNA interactome data. Generally, ChIP-seq experiments are designed for target transcription factors to provide their genome-wide binding information.

Analysis of ChIP-seq Data Several analysis tools avaliable: – QuEST: peak region detection – SISSRs : peak region detection – CisGenome: system to analyse ChIP data visualization data normalization peak detection FDR computation gene-peak association sequence and motif analysis

Analysis Steps of ChIP-seq Data Align reads to the reference genome. 1:17:900:850AGAACTTGGTGGTCATGGTGGAAGGGAG U chr2.fa F.. 19A

Analysis Steps of ChIP-seq Data Identification of peak (binding) regions. – Peak: Region has high sequencing read density FDR computation of peak regions. Sequence and motif analysis.

Further Analysis of ChIP-Seq Data Although there are a few number of early stage analysis tools for ChIP-seq data, gene annotation methods should also be integrated like in the case of microarray data analysis. ChIP-seq experiments provide detailed knowledge about target genes to predict pathway activities.

Content Analysis of Microarray Data ChIP-Seq Data Data Processing & Integration Scoring of Signaling Cascades Results

Our Framework

Data Set ChIP-Seq Data: OCT1 (TF) – Kang et.al. Genes Dev (GSE14283) – Performed on human HeLa S3 cells. – Identify the genes targeted by OCT1 TF under conditions of oxidative stress. Microarray Data: – Murray et.al. Mol Biol Cel (GSE4301) – human genes. – oxidative stress applied two channel data.

3.8 million reads Analysis of Raw ChIP-Seq Data CisGenome software identified peak regions of OCT1 data peak regions

Analysis of Raw ChIP-Seq Data Identify neighboring genes of peak regions bp ←.→ bp +

Analysis of Raw ChIP-Seq Data Total # of genes 2843 # selected genes 260 TSS 5'UTR

ChIP-Seq Data Ranking Percentile rank of each peak region is computed: cf l : cumulative frequency for all scores lower than score of the peak region r f r : frequency of score of peak region r T : the total number of peak regions

Microarray Data Analysis Two channel data Use limma package of R-Bioconductor – Apply background correction – Normalize data between arrays – Compute fold-change of gene x :

Microarray Data Ranking Set a percentile rank value for each gene : cf l : cumulative frequency for all fold-change values lower than the fold - change of the gene x f x : frequency of the fold-change of the gene x T : the total number of genes in chip

Integration of ChIP-Seq and Microarray Data Scores were associated by taking their weighted linear combinations.

Integration of ChIP-Seq and Microarray Data Scores were associated by taking their weighted linear combinations. Gene nameScore(x)ReadRankExpRank SPRY CNTFR OSMR PRLR PIK3CA

Content Analysis of Microarray Data ChIP-Seq Data Data Processing & Integration Scoring of Signaling Cascades Results

Scoring of Signaling Cascades KEGG pathways were used as the model to identify signaling cascades under the control of specific biological processes. Each signaling cascade was converted into a graph structure by extracting KGML files.

KGML example

KGML example JAK1CISHSTAT1 +p

Score Computation on Graph

Scoring Measures of Outcome Process    

Content Analysis of Microarray Data ChIP-Seq Data Data Processing & Integration Scoring of Signaling Cascades Results

Evaluated Signaling Cascades  Jak-STAT  TGF-β  Apoptosis  MAPK

Evaluated Signaling Cascades  Jak-STAT  TGF-β  Apoptosis  MAPK Apoptosis Cell cycle MAPK Ubiquitin mediated proteolysis Apoptosis Cell cycle MAPK Survival Apoptosis Degradation Apoptosis Cell cycle p53 signaling Wnt signaling Proliferation and differentiation

Control data Oxidative stress

Result of KegArray Tool

Enrichment Scores of Outcome Processes

Discussion The scores obtained with control experiment are lower compared to oxidative stress scores. The most effected biological process under oxidative stress condition and transcription of OCT1 protein was Apoptosis process having the highest score between signaling cascades. Biologist should perform lab experiment to validate this cause and effect relation.

Conclusion Our hybrid approach integrates large scale transcriptome data to quantitatively assess the weight of a signaling cascade under the control of a biological process. Signaling cascades in KEGG database were used as the models of the approach. The framework can be applicable to directed acyclic graphs.

Future Work Different ranking methods on the transcriptome data will be analyzed. In order to provide comparable scores on signaling cascades, score computation method will be changed. Permutation tests will be included to provide significance levels for enrichment scores of signaling cascades.

Acknowledgement My colleagues: – Prof.Dr. Volkan Atalay – Assoc. Prof. MD. Rengül Çetin-Atalay Sharing their raw ChIP-seq data: – Assist. Prof. Dr. Dean Tantin Travel support: – The Scientific and Technological Research Council of Turkey (TÜBİTAK)

Evaluation of Signaling Cascades Based on the Weights from Microarray and ChIP-seq Data Zerrin Işık, Volkan Atalay, and Rengül Çetin-Atalay Middle East Technical University and Bilkent University Ankara - TURKEY