Sudhakar Jonnalagadda and Rajagopalan Srinivasan

Slides:



Advertisements
Similar presentations
Microarray statistical validation and functional annotation
Advertisements

Outline Questions from last lecture? P. 40 questions on Pax6 gene Mechanism of Transcription Activation –Transcription Regulatory elements Comparison between.
Principal Component Analysis (PCA) for Clustering Gene Expression Data K. Y. Yeung and W. L. Ruzzo.
. Inferring Subnetworks from Perturbed Expression Profiles D. Pe’er A. Regev G. Elidan N. Friedman.
Work Process Using Enrich Load biological data Check enrichment of crossed data sets Extract statistically significant results Multiple hypothesis correction.
Learning rule-based models from gene expression time profiles annotated with Gene Ontology terms Jan Komorowski and Astrid Lägreid.
Principal Component Analysis
SocalBSI 2008: Clustering Microarray Datasets Sagar Damle, Ph.D. Candidate, Caltech  Distance Metrics: Measuring similarity using the Euclidean and Correlation.
Regulatory Network (Part II) 11/05/07. Methods Linear –PCA (Raychaudhuri et al. 2000) –NIR (Gardner et al. 2003) Nonlinear –Bayesian network (Friedman.
Identification of a Novel cis-Regulatory Element Involved in the Heat Shock Response in Caenorhabditis elegans Using Microarray Gene Expression and Computational.
Unsupervised Learning - PCA The neural approach->PCA; SVD; kernel PCA Hertz chapter 8 Presentation based on Touretzky + various additions.
L16: Micro-array analysis Dimension reduction Unsupervised clustering.
Yeast Dataset Analysis Hongli Li Final Project Computer Science Department UMASS Lowell.
Metabolomics Bob Ward German Lab Food Science and Technology.
Microarrays and Cancer Segal et al. CS 466 Saurabh Sinha.
Functional annotation and network reconstruction through cross-platform integration of microarray data X. J. Zhou et al
Automatic Analysis of BrdU Incorporation for Replication Timing Assays Intern: Albert F. Cervantes Advisors: York Marahrens Moira Regelson Silvia Diaz-Perez.
Finding Transcription Modules from large gene-expression data sets Ned Wingreen – Molecular Biology Morten Kloster, Chao Tang – NEC Laboratories America.
Dimension reduction : PCA and Clustering Christopher Workman Center for Biological Sequence Analysis DTU.
Microarray analysis Algorithms in Computational Biology Spring 2006 Written by Itai Sharon.
Classical tree view of cell cycle data (Spellman, et al MolBiolCell 9, 3273)
Fuzzy K means.
1 Segregation of sphingolipids and sterols during formation of secretory vesicles at the trans-Golgi network Kai Simons group, 2009, J. Cell Biology Deniz.
Promoter Analysis using Bioinformatics, Putting the Predictions to the Test Amy Creekmore Ansci 490M November 19, 2002.
ICA-based Clustering of Genes from Microarray Expression Data Su-In Lee 1, Serafim Batzoglou 2 1 Department.
TimeSearcher: Interactive Querying for Identification of Patterns in Genetic Microarray Time Series Data Harry Hochheiser Ben Shneiderman Eric Baehrecke,
Clustering and MDS Exploratory Data Analysis. Outline What may be hoped for by clustering What may be hoped for by clustering Representing differences.
Gene Expression Clustering. The Main Goal Gain insight into the gene’s function. Using: Sequence Transcription levels.
Microarray Gene Expression Data Analysis A.Venkatesh CBBL Functional Genomics Chapter: 07.
Principal Component Analysis (PCA) for Clustering Gene Expression Data K. Y. Yeung and W. L. Ruzzo.
CSE182 L14 Mass Spec Quantitation MS applications Microarray analysis.
Clustering of DNA Microarray Data Michael Slifker CIS 526.
Gene Regulation. Regulation in Prokaryotes Gene Expression = gene to protein processing that functions within cells. Regulation = We are talking about.
Epigenetic Analysis BIOS Statistics for Systems Biology Spring 2008.
A Short Overview of Microarrays Tex Thompson Spring 2005.
Metabolomics Metabolome Reflects the State of the Cell, Organ or Organism Change in the metabolome is a direct consequence of protein activity changes.
Intel Confidential – Internal Only Co-clustering of biological networks and gene expression data Hanisch et al. This paper appears in: bioinformatics 2002.
G 1 and S Phases of the Cell Cycle SIGMA-ALDRICH.
Differential analysis of Eigengene Networks: Finding And Analyzing Shared Modules Across Multiple Microarray Datasets Peter Langfelder and Steve Horvath.
Blind Information Processing: Microarray Data Hyejin Kim, Dukhee KimSeungjin Choi Department of Computer Science and Engineering, Department of Chemical.
Nuria Lopez-Bigas Methods and tools in functional genomics (microarrays) BCO17.
High-throughput omic datasets and clustering
A comparative study of survival models for breast cancer prognostication based on microarray data: a single gene beat them all? B. Haibe-Kains, C. Desmedt,
Analyzing Expression Data: Clustering and Stats Chapter 16.
E14.5E16.5E18.5 Normalized mRNA level Get1 Nfix Smarcd3 A Supplementary Figure 1 (A) The microarray expression levels of bladder terminal differentiation.
Nonlinear differential equation model for quantification of transcriptional regulation applied to microarray data of Saccharomyces cerevisiae Vu, T. T.,
Inferring gene regulatory networks from multiple microarray datasets (Wang 2006) Tiffany Ko ELE571 Spring 2009.
Principal Component Analysis and Linear Discriminant Analysis for Feature Reduction Jieping Ye Department of Computer Science and Engineering Arizona State.
Outline Molecular Cell Biology Assessment Review from last lecture Role of nucleoporins in transcription Activators and Repressors Epigenetic mechanisms.
Yeast Cell-Cycle Regulation Network inference Wang Lin.
An unsupervised conditional random fields approach for clustering gene expression time series Chang-Tsun Li, Yinyin Yuan and Roland Wilson Bioinformatics,
The transcriptional program in the response of human fibroblasts to serum Iyer et. al. (1999) Presented by: Paya Sarraf Brendan Finicle.
Gene Expression Ilana Granovsky Jonathan Laserson.
Identifying submodules of cellular regulatory networks Guido Sanguinetti Joint work with N.D. Lawrence and M. Rattray.
Supplementary figure 5.
Role of Arabidopsis ABF genes during det1 germination
Hierarchical clustering of DEGs in Shp−/− mice, chronic hepatitis C cirrhosis, and NASH. A: unsupervised hierarchical clustering of genes common to the.
Building and Analyzing Genome-Wide Gene Disruption Networks
Bacillus subtilis responses to stress
Cold Adaptation in Budding Yeast
HIS-24 regulates expression of infection-inducible genes.
Volume 22, Issue 3, Pages (January 2018)
Metabolic Homeostasis: HDACs Take Center Stage
Gene expression separates male and female single gametocytes into distinct populations. Gene expression separates male and female single gametocytes into.
RNA-Seq analysis of CYR1 cells in the adaptation to acid pH.
Loyola Marymount University
Stage-specific expression modules of preimplantation development.
Distinct subtypes of CAFs are detected in human PDAC
Panobinostat induces reversible perturbations in PBMC gene expression patterns. Panobinostat induces reversible perturbations in PBMC gene expression patterns.
DNSP16 mutant MERS-CoV infection produces minimal changes in early host responses. dNSP16 mutant MERS-CoV infection produces minimal changes in early host.
Presentation transcript:

Principal Component Analysis based Methodologies for Analyzing Time-Course Microarray Data Sudhakar Jonnalagadda and Rajagopalan Srinivasan Dept. of Chemical and Biomolecular Engineering National University of Singapore PCA-based technique for Clustering genes Finding distinct clusters Identifying differentially expressed genes

Motivation Time-course microarray experiments provide large amount of data related to dynamic changes in the cells Large number of genes are measured Multivariate data To answer different biological problems, different data mining techniques are needed Challenge: Can we develop a generalized tools that are applicable to several data-mining problems? PCA modeling Assays PC Gene Score Vectors Genes X Z PT = = Few PCs are sufficient to model the data adequately Removes noise from the data

PCA Modeling PCA Model Comparing Clusters Identifying DEG Gene Clustering Group genes into different cluster that minimizes the sum of normalized distance between each gene to the cluster centroid within the PCA model the orthogonal distance to the PCA model Comparing Clusters Model each cluster using PCA and measure the similarity of models using PCA similarity factor Identifying DEG Model the control data and project the treatment on to the model. Compare the scores to find differentially expression *** Add couple of overview figures θ21 Histone family protein H1f0 θ22 θ12 θ11 Heat-shock protein

Results: clustering genes DATA PCA clustering k-means clustering GK clustering PCA and GK clustering correctly identifies the clusters All clusters need two PCs to model Artificial Data 1 Only PCA clustering correctly identifies the clusters Clusters A, B and C needs 3,2, and 2 PCs to model Artificial Data 2 PCA and k-means identify homogenous clusters All clusters need two PCs to model GK method finds only four clusters which are not homogenous 384 cell-cycle regulated genes 5 Clusters: Early G1 Late G1 S, G2 & M Yeast cell-cycle data PCA clustering k-means GK clustering

Results: Finding Distinct Clusters Case Study: Yeast cell-cycle Data Expression data for ~6000 genes at 17 time points 384 genes found to be cell-cycle regulated Clusters reported: 5 Early G1, Late G1, S, G2, M Result: NEPSI correctly predicts 5 clusters Clusters enriched with similarly expressed genes Clusters are distinct from other clusters Early G1 Late G1 S G2 M Genes Early G1 Late G1 S G2 M 1 0.183 0.435 0.441 0.233 0.262 0.308 0.521 0.467 0.362 0.329 Time Gene activation Gene repression Source: Cho, et al. (1998) A genome-wide transcriptional analysis of the mitotic cell cycle. Mol. Cell., 2, 65-73.

Results: Finding DEG Case Study: Mouse data Result: Characterization of the role of HSF1 in mammalian cells Time-course expression data is collected for 9468 genes at 8 time points in WT (control) and HSF1 KO mouse (Treatment) Several mouse genes (homologue of human genes) bound by HSF1 are differentially expressed in KO mouse. However, several genes that are not bound by HSF1 are induced in both WT and KO mouse. Conclusion: HSF1 doesn’t regulate all the heat-induced genes in mammalian cells. Result: Novel genes shows differential expression in wild-type and mutant mice PCA identifies 288 differentially expressed genes 78 of them are previously reported as differentially expressed PCA identified 4 (out of 9) mouse genes homologues of human genes that are both bound by HSF1 and induced in WT mouse but not activated in HSF1 KO mouse 13 (out of 15) mouse genes homologue of human genes that are not bound by HSF1 are found to be similarly expressed in both WT and KO mouse Conclusions: PCA correctly identifies differentially expressed genes Results support that HSF1 doesn’t regulate all the heat-induced genes in mammalian cells Trinklen,N.D. et al. (2004) The role of heat shock transcription factor 1 in the genome-wide regulation of the mammalian heat shock response. Mol.Biol.cell. 15, 1254-1262.