ICBP, Stanford University 1 Implication Networks from Large Gene-expression Datasets Debashis Sahoo PhD Candidate, Electrical Engineering, Stanford University.

Presentation transcript:

ICBP, Stanford University 1 Implication Networks from Large Gene-expression Datasets Debashis Sahoo PhD Candidate, Electrical Engineering, Stanford University Joint work with David Dill, Andrew Gentles, Rob Tibshirani, Sylvia Plevritis Integrative Cancer Biology Program, Stanford University

ICBP, Stanford University 2 Motivation: Current approaches: clustering, co-expression, linear regression, mutual information. [Scatter plot: BUB1B vs. CCNB2]

ICBP, Stanford University 3 Hidden Relationships: Pearson's correlation = -0.1. GABRB1 and ACPP are not linearly related. There is a Boolean relationship: ACPP high ⇒ GABRB1 low; GABRB1 high ⇒ ACPP low. [Scatter plot: ACPP vs. GABRB1]
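
To illustrate how a Boolean relationship can hide behind a weak Pearson correlation, here is a small sketch on synthetic values. The numbers are made up for illustration and are not the GABRB1/ACPP data from the slide; the cutoff of 5 is likewise an assumption.

```python
import numpy as np

# Synthetic expression values (invented for illustration, not the slide's data):
# 3 arrays with A high, 3 with B high, 94 with both low, none with both high.
a = np.array([9.0] * 3 + [1.0] * 3 + [1.0] * 94)
b = np.array([1.0] * 3 + [9.0] * 3 + [1.0] * 94)

# Pearson correlation between the two genes.
pearson_r = np.corrcoef(a, b)[0, 1]

# Number of arrays where both genes are high (assumed cutoff: 5).
both_high = int(np.sum((a > 5) & (b > 5)))

print(f"Pearson r = {pearson_r:.3f}, arrays with both high = {both_high}")
```

The correlation is close to zero because most arrays sit in the low/low corner, yet the high/high corner is empty, which is exactly the pattern "A high ⇒ B low".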

ICBP, Stanford University 4 Outline Motivation Boolean analysis Boolean implication network Biological insights Conserved Boolean network Conclusion

ICBP, Stanford University 6 Boolean Analysis Workflow: Get data (GEO [Edgar et al. 02]); Normalize (RMA [Irizarry et al. 03]); Determine thresholds; Discover Boolean relationships; Biological interpretation.

ICBP, Stanford University 7 Determine threshold: A threshold is determined for each gene. The arrays are sorted by gene expression. StepMiner [Sahoo et al. 07] is used to determine the threshold. [Plot: CDH expression across sorted arrays, with the threshold separating high, intermediate and low values]
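
The thresholding step can be sketched roughly as follows. This is a simplified stand-in, not the StepMiner implementation: it sorts one gene's values, fits a single step by least squares, and takes the midpoint of the two fitted levels as the threshold. The function names and the `margin` width of the intermediate band are assumptions for illustration.

```python
import numpy as np

def step_threshold(values):
    """Fit a one-step function to the sorted values by least squares.

    Returns the midpoint between the mean of the lower segment and the
    mean of the upper segment (a simplification, not StepMiner itself).
    """
    x = np.sort(np.asarray(values, dtype=float))
    n = x.size
    if n < 2:
        raise ValueError("need at least two arrays")
    best_sse, best_mid = np.inf, None
    for k in range(1, n):  # candidate step between x[k-1] and x[k]
        lo, hi = x[:k], x[k:]
        sse = ((lo - lo.mean()) ** 2).sum() + ((hi - hi.mean()) ** 2).sum()
        if sse < best_sse:
            best_sse, best_mid = sse, (lo.mean() + hi.mean()) / 2.0
    return best_mid

def discretize(values, threshold, margin=0.5):
    """Label values 'low', 'high' or 'intermediate' (margin is an assumed width)."""
    v = np.asarray(values, dtype=float)
    labels = np.full(v.shape, "intermediate", dtype=object)
    labels[v < threshold - margin] = "low"
    labels[v > threshold + margin] = "high"
    return labels
```

For example, `discretize(expr, step_threshold(expr))` would label each array for one gene, where `expr` is a hypothetical vector of normalized expression values.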

ICBP, Stanford University 8 Discovering Boolean Relationships: Analyze pairs of genes. Analyze the four different quadrants. Identify sparse quadrants. Record the Boolean relationships. Example: ACPP high ⇒ GABRB1 low; GABRB1 high ⇒ ACPP low. [Scatter plot: ACPP vs. GABRB1]
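
A rough sketch of the quadrant analysis for one gene pair is given below. The function names, the `margin` for intermediate values, and the simple fraction-based sparsity rule are all assumptions; the fraction rule is only a placeholder for a proper statistical test.

```python
import numpy as np

def quadrant_counts(a, b, thr_a, thr_b, margin=0.5):
    """Count arrays in each (A state, B state) quadrant.

    Arrays where either gene falls in the intermediate band are skipped.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a_low, a_high = a < thr_a - margin, a > thr_a + margin
    b_low, b_high = b < thr_b - margin, b > thr_b + margin
    return {
        ("low", "low"): int(np.sum(a_low & b_low)),
        ("low", "high"): int(np.sum(a_low & b_high)),
        ("high", "low"): int(np.sum(a_high & b_low)),
        ("high", "high"): int(np.sum(a_high & b_high)),
    }

def sparse_quadrants(counts, max_fraction=0.05):
    """Placeholder rule: a quadrant is sparse if it holds at most
    max_fraction of the counted arrays (an assumed cutoff)."""
    total = sum(counts.values())
    return [q for q, n in counts.items() if total > 0 and n <= max_fraction * total]
```

With hypothetical inputs, `sparse_quadrants(quadrant_counts(a, b, thr_a, thr_b))` returns the list of sparse quadrants that the next step turns into a named relationship.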

ICBP, Stanford University 9 Boolean Relationships: There are six possible Boolean relationships: A low ⇒ B low; A low ⇒ B high; A high ⇒ B low; A high ⇒ B high; Equivalent; Opposite.
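
The six relationships can be read off from which quadrants are sparse. The sketch below assumes quadrants are given as (A state, B state) tuples; the `classify` helper and its string labels are hypothetical names used for illustration.

```python
# One sparse quadrant gives an asymmetric implication. For example, if the
# (A high, B high) quadrant is sparse, then B is rarely high when A is high,
# so A high => B low.
ASYMMETRIC = {
    ("high", "high"): "A high => B low",
    ("high", "low"): "A high => B high",
    ("low", "high"): "A low => B low",
    ("low", "low"): "A low => B high",
}

def classify(sparse):
    """Map a collection of sparse quadrants to one of the six relationships.

    Returns None when the pattern matches none of them.
    """
    s = frozenset(sparse)
    if s == {("high", "low"), ("low", "high")}:
        return "Equivalent"  # A and B are high together and low together
    if s == {("high", "high"), ("low", "low")}:
        return "Opposite"    # A is high exactly when B is low
    if len(s) == 1:
        return ASYMMETRIC[next(iter(s))]
    return None
```

Each asymmetric implication has a contrapositive that describes the same sparse quadrant: "A high ⇒ B low" and "B high ⇒ A low" both say the high/high quadrant is empty.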

ICBP, Stanford University 10 Four Asymmetric Boolean Relationships: A low ⇒ B low (PTPRC low ⇒ CD19 low); A low ⇒ B high (FAM60A low ⇒ NUAK1 high); A high ⇒ B low (XIST high ⇒ RPS4Y1 low); A high ⇒ B high (COL3A1 high ⇒ SPARC high). [Scatter plots: PTPRC vs. CD19, FAM60A vs. NUAK1, XIST vs. RPS4Y1, COL3A1 vs. SPARC]

ICBP, Stanford University 11 Two Symmetric Boolean Relationships: Equivalent (BUB1B and CCNB2); Opposite (XTP7 and EED).

ICBP, Stanford University 12 Outline Motivation Boolean analysis Boolean implication network Biological insights Conserved Boolean network Conclusion

ICBP, Stanford University 13 Boolean Implication Network: Boolean implications form a directed graph. Nodes: for each gene A, two nodes, A high and A low. Edges: the implication A high ⇒ B low is an edge from A high to B low. [Diagram: nodes A high, B low, C high]
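
One way to hold such a network in memory is an adjacency map over (gene, state) nodes. The sketch below is illustrative only: `build_graph`, `reachable` and the choice to store the contrapositive of each implication as a second edge are assumptions, not details from the slides.

```python
from collections import defaultdict, deque

def flip(state):
    """Return the opposite expression state."""
    return "low" if state == "high" else "high"

def build_graph(implications):
    """Build a directed graph from ((gene, state), (gene, state)) implications.

    Each implication X => Y also adds its contrapositive not-Y => not-X.
    """
    graph = defaultdict(set)
    for (ga, sa), (gb, sb) in implications:
        graph[(ga, sa)].add((gb, sb))
        graph[(gb, flip(sb))].add((ga, flip(sa)))
    return graph

def reachable(graph, start):
    """Nodes reachable from start by following implication edges (BFS)."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return seen
```

Following chains of edges treats the implications as exact logic; since each edge is a statistical finding, long chains should be read with caution.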

ICBP, Stanford University 14 Size of The Boolean Networks [Chart legend: high ⇒ low, low ⇒ low, low ⇒ high, high ⇒ high, Equivalent, Opposite]

ICBP, Stanford University 15 Boolean Networks Are Not Scale Free [Plot, Human: #relationships vs. #probesets; series: Total, Symmetric, Asymmetric]

ICBP, Stanford University 16 Outline Motivation Boolean analysis Boolean implication network Biological insights Conserved Boolean network Conclusion

ICBP, Stanford University 17 Gender Specific: XIST (X inactivation specific transcript), expressed in female. RPS4Y1 (Y-linked gene), expressed in male only. Boolean relationship: XIST high ⇒ RPS4Y1 low. [Scatter plot: XIST vs. RPS4Y1] [Day et al. 07]

ICBP, Stanford University 18 Tissue Specific: ACPP (acid phosphatase, prostate), prostate specific gene. GABRB1 (GABA A receptor, beta 1), brain specific. Boolean relationship: ACPP high ⇒ GABRB1 low. [Scatter plot: ACPP vs. GABRB1]

ICBP, Stanford University 19 Development: HOXD3 (homeobox D3), fruit fly antennapedia homolog. HOXA13 (homeobox A13), fruit fly ultrabithorax homolog. Boolean relationship: HOXD3 high ⇒ HOXA13 low. [Scatter plot: HOXD3 vs. HOXA13] [Rinn et al. 07]

ICBP, Stanford University 20 Differentiation: PTPRC (protein tyrosine phosphatase, receptor type, C; B220), expressed in B cell precursors and mature B cells. CD19, expressed in mature B cells. Boolean relationship: PTPRC low ⇒ CD19 low. [Scatter plot: PTPRC vs. CD19]

ICBP, Stanford University 21 Biological Insights: Gender (XIST, RPS4Y1); Tissue (ACPP, GABRB1); Development (HOXD3, HOXA13); Differentiation (PTPRC, CD19).

ICBP, Stanford University 22 Outline Motivation Boolean analysis Boolean implication network Biological insights Conserved Boolean network Conclusion

ICBP, Stanford University 23 Conserved Boolean Networks: Find orthologs between human, mouse and fly using the EUGene database [Gilbert, 02]. Search for orthologous gene pairs that have the same Boolean relationship. [Figure labels: Human 208M, Mouse 336M, Fly 17M; 4M; 41K]
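
The conservation search can be sketched as a lookup through an ortholog table. The data layout below (relationships as `(gene_a, gene_b, relationship)` tuples and a one-to-one human-to-mouse ortholog dictionary) is an assumption made for illustration; real ortholog tables are often many-to-many.

```python
def conserved(human_rels, mouse_rels, orthologs):
    """Return human relationships whose orthologous mouse pair
    carries the same Boolean relationship.

    human_rels, mouse_rels: sets of (gene_a, gene_b, relationship) tuples.
    orthologs: dict mapping a human gene to its mouse ortholog
               (assumed one-to-one for simplicity).
    """
    out = set()
    for a, b, rel in human_rels:
        ma, mb = orthologs.get(a), orthologs.get(b)
        if ma is not None and mb is not None and (ma, mb, rel) in mouse_rels:
            out.add((a, b, rel))
    return out

# Toy input: the first pair mirrors the slides, the second pair is invented.
human = {("BUB1B", "CCNB2", "Equivalent"), ("GENE_X", "GENE_Y", "Opposite")}
mouse = {("Bub1b", "Ccnb2", "Equivalent")}
table = {"BUB1B": "Bub1b", "CCNB2": "Ccnb2", "GENE_X": "Gene_x"}
shared = conserved(human, mouse, table)
```

Extending the same check to fly would repeat the lookup with a second ortholog table and keep only relationships found in all three species.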

ICBP, Stanford University 24 Conserved Boolean Relationships: Two largest connected components in the network of equivalent genes: 178 genes, highly enriched for cell-cycle and DNA replication; 32 genes, highly enriched for synaptic functions. [Figure: Fly Bub1 and CycB; Mouse Bub1b and Ccnb2; Human BUB1B and CCNB2]

ICBP, Stanford University 25 Conserved Asymmetric Boolean Relationships: GABRB1-expressing cells have low cell cycle (BUB1B) activity. [Figure: Fly Bub1 and Lcch3; Mouse Bub1b and Gabrb1; Human BUB1B and GABRB1]

ICBP, Stanford University 26 Outline Motivation Boolean analysis Boolean implication network Biological insights Conserved Boolean network Conclusion

ICBP, Stanford University 27 Conclusion: Boolean analysis: Boolean relationships are directly visible on the scatter plot. Enables discovery of asymmetric relationships. Can reveal known biological processes. Has potential for new biological discovery. Boolean network: is large; is not scale free.

ICBP, Stanford University 28 Acknowledgements: The Felsher Lab: Natalie Wu, Cathy Shachaf, Dean Felsher. Leonore A Herzenberg, James Brooks, Joe Lipsick, Gavin Sherlock, Howard Chang, Stuart Kim. Funding: ICBP Program (NIH grant: 5U56CA ).

ICBP, Stanford University 29 The END

ICBP, Stanford University 30 Example

ICBP, Stanford University 31 Determine threshold: It's hard to determine a threshold for this gene. StepMiner usually puts a threshold in the middle for this case.

ICBP, Stanford University 32 Statistical Tests: Compute the expected number of points under the independence model. Compute the maximum likelihood estimate of the error rate. [Diagram: quadrant counts $a_{00}$, $a_{01}$, $a_{10}$, $a_{11}$]

$$\text{statistic} = \frac{\text{expected} - \text{observed}}{\sqrt{\text{expected}}}$$

$$\text{error rate} = \frac{1}{2}\left(\frac{a_{00}}{a_{00} + a_{01}} + \frac{a_{00}}{a_{00} + a_{10}}\right)$$
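
A small sketch of the two quantities for the quadrant with count a00 follows. It assumes that "expected under the independence model" means row total times column total divided by the grand total, and that "observed" is a00; both are my reading of the slide rather than something it states. The slide gives no cutoffs, so the function just returns the two numbers.

```python
import math

def sparse_quadrant_test(a00, a01, a10, a11):
    """Return (statistic, error_rate) for the quadrant with count a00.

    Assumption: expected count under independence is
    (a00 + a01) * (a00 + a10) / total, and observed is a00.
    """
    total = a00 + a01 + a10 + a11
    row = a00 + a01   # arrays sharing the first gene's state with the quadrant
    col = a00 + a10   # arrays sharing the second gene's state with the quadrant
    if total == 0 or row == 0 or col == 0:
        raise ValueError("quadrant margins must be non-empty")
    expected = row * col / total
    statistic = (expected - a00) / math.sqrt(expected)
    error_rate = 0.5 * (a00 / row + a00 / col)
    return statistic, error_rate
```

A large statistic together with a small error rate would indicate a sparse quadrant; for instance, an empty quadrant whose neighbours each hold 50 arrays gives a statistic of 5.0 and an error rate of 0.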