Implication Networks from Large Gene-expression Datasets. Debashis Sahoo, PhD Candidate, Electrical Engineering, Stanford University (ICBP, Stanford University).

1 ICBP, Stanford University: Implication Networks from Large Gene-expression Datasets. Debashis Sahoo, PhD Candidate, Electrical Engineering, Stanford University. Joint work with David Dill, Andrew Gentles, Rob Tibshirani, Sylvia Plevritis. Integrative Cancer Biology Program, Stanford University.

2 Motivation: Current approaches: clustering, co-expression, linear regression, mutual information. [Scatter plot: BUB1B and CCNB2]

3 Hidden Relationships: Pearson’s correlation = -0.1. GABRB1 and ACPP are not linearly related, but there is a Boolean relationship: ACPP high ⇒ GABRB1 low, and GABRB1 high ⇒ ACPP low. [Scatter plot: ACPP and GABRB1]

4 Outline: Motivation; Boolean analysis; Boolean implication network; Biological insights; Conserved Boolean network; Conclusion

6 Boolean Analysis Workflow: Get data (GEO [Edgar et al. 02]); Normalize (RMA [Irizarry et al. 03]); Determine thresholds; Discover Boolean relationships; Biological interpretation

7 Determine threshold: A threshold is determined for each gene. The arrays are sorted by gene expression, and StepMiner [Sahoo et al. 07] is used to determine the threshold. [Plot: CDH expression across sorted arrays, with the threshold and the high, low, and intermediate regions marked]
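The thresholding step on this slide can be illustrated with a minimal sketch. It is not StepMiner itself: it assumes a single rising step fitted by least squares to the sorted values, takes the midpoint between the two fitted levels as the threshold, and uses a hypothetical `margin` parameter to define the intermediate band. The function names and the default margin are assumptions made for illustration.

```python
def _sse(segment):
    """Sum of squared deviations of a segment from its own mean."""
    mean = sum(segment) / len(segment)
    return sum((x - mean) ** 2 for x in segment)


def step_threshold(values):
    """Fit one step to the sorted values and return the midpoint of the step.

    Assumption: a simple exhaustive least-squares fit stands in for StepMiner.
    """
    xs = sorted(values)
    if len(xs) < 2:
        raise ValueError("need at least two expression values")
    # Try every split point and keep the one with the smallest total error.
    best_split = min(range(1, len(xs)),
                     key=lambda k: _sse(xs[:k]) + _sse(xs[k:]))
    low_mean = sum(xs[:best_split]) / best_split
    high_mean = sum(xs[best_split:]) / (len(xs) - best_split)
    return (low_mean + high_mean) / 2


def classify(value, threshold, margin=0.5):
    """Label a value as high, low, or intermediate (margin is hypothetical)."""
    if value > threshold + margin:
        return "high"
    if value < threshold - margin:
        return "low"
    return "intermediate"
```

With a threshold per gene, each array's expression value becomes one of three labels, and only the high and low labels feed the pairwise analysis that follows.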

8 Discovering Boolean Relationships: Analyze pairs of genes. Analyze the four different quadrants. Identify sparse quadrants. Record the Boolean relationships. Example: ACPP high ⇒ GABRB1 low; GABRB1 high ⇒ ACPP low. [Scatter plot: ACPP and GABRB1, quadrants numbered 1 to 4]
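The quadrant analysis can be sketched as follows. This is an illustrative sketch only: it assumes each array has already been labeled "high", "low", or "intermediate" for both genes, and it uses a plain fraction cutoff (`max_fraction`, a hypothetical parameter) to call a quadrant sparse, in place of the statistical test the talk describes on its Statistical Tests slide.

```python
from collections import Counter

LEVELS = ("low", "high")


def quadrant_counts(a_levels, b_levels):
    """Count arrays in each (A level, B level) quadrant.

    Arrays where either gene is intermediate are ignored.
    """
    counts = Counter()
    for a, b in zip(a_levels, b_levels):
        if a in LEVELS and b in LEVELS:
            counts[(a, b)] += 1
    return {(a, b): counts[(a, b)] for a in LEVELS for b in LEVELS}


def sparse_quadrants(counts, max_fraction=0.05):
    """Return quadrants holding at most max_fraction of the counted arrays.

    Assumption: a simple cutoff replaces the talk's statistical test.
    """
    total = sum(counts.values())
    if total == 0:
        return []
    return [q for q, c in counts.items() if c <= max_fraction * total]
```

For example, if no array has both genes high, the ("high", "high") quadrant is reported as sparse, which corresponds to A high ⇒ B low.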

9 Boolean Relationships: There are six possible Boolean relationships: A low ⇒ B low; A low ⇒ B high; A high ⇒ B low; A high ⇒ B high; Equivalent; Opposite
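One way to read the six relationships is as a mapping from sparse quadrants to implications: a single sparse quadrant gives one asymmetric implication, and two sparse quadrants on a diagonal give a symmetric relationship. The sketch below encodes that reading; the function name and the string labels are hypothetical.

```python
FLIP = {"low": "high", "high": "low"}


def relationships(sparse):
    """Translate sparse (A level, B level) quadrants into relationship labels.

    If the quadrant (A=x, B=y) is sparse, arrays with A=x almost always
    have B at the other level, i.e. A x => B not-y.
    """
    s = set(sparse)
    if s == {("low", "high"), ("high", "low")}:
        return ["A equivalent B"]
    if s == {("low", "low"), ("high", "high")}:
        return ["A opposite B"]
    return [f"A {a} => B {FLIP[b]}" for a, b in sorted(s)]
```

For instance, `relationships({("high", "high")})` yields the single implication A high => B low, matching the ACPP and GABRB1 example from the earlier slides.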

10 Four Asymmetric Boolean Relationships: A low ⇒ B low (PTPRC low ⇒ CD19 low); A high ⇒ B low (XIST high ⇒ RPS4Y1 low); A high ⇒ B high (COL3A1 high ⇒ SPARC high); A low ⇒ B high (FAM60A low ⇒ NUAK1 high). [Scatter plots: PTPRC and CD19; XIST and RPS4Y1; COL3A1 and SPARC; FAM60A and NUAK1]

11 Two Symmetric Boolean Relationships: Equivalent (scatter plot: BUB1B and CCNB2); Opposite (scatter plot: XTP7 and EED)

13 Boolean Implication Network: Boolean implications form a directed graph. Nodes: for each gene A, two nodes, A high and A low. Edges: an edge from A high to B low represents A high ⇒ B low. [Diagram nodes: A high, B low, C high]
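The graph described on this slide can be sketched as an adjacency map keyed by (gene, level) nodes. This is a minimal illustration, not the talk's implementation. Storing the contrapositive edge alongside each implication is a design choice made here, consistent with the earlier pairing of ACPP high ⇒ GABRB1 low with GABRB1 high ⇒ ACPP low.

```python
from collections import defaultdict

FLIP = {"low": "high", "high": "low"}


def build_network(implications):
    """Build a directed graph from ((gene, level), (gene, level)) implications.

    Each gene contributes two possible nodes: (gene, "high") and (gene, "low").
    """
    graph = defaultdict(set)
    for (gene_a, level_a), (gene_b, level_b) in implications:
        # Edge for the implication itself: A level_a => B level_b.
        graph[(gene_a, level_a)].add((gene_b, level_b))
        # Contrapositive edge (an assumption of this sketch).
        graph[(gene_b, FLIP[level_b])].add((gene_a, FLIP[level_a]))
    return graph


network = build_network([(("ACPP", "high"), ("GABRB1", "low"))])
```

Here `network[("ACPP", "high")]` contains `("GABRB1", "low")`, and the contrapositive places `("ACPP", "low")` under `("GABRB1", "high")`.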

14 Size of the Boolean Networks [Chart; relationship types shown: high ⇒ low, low ⇒ low, low ⇒ high, high ⇒ high, Equivalent, Opposite]

15 Boolean Networks Are Not Scale Free [Plot for the human dataset; axis labels: #relationships, #probesets; series: Total, Symmetric, Asymmetric]

17 Gender Specific: XIST (X inactivation specific transcript): expressed in female. RPS4Y1 (Y-linked gene): expressed in male only. Boolean relationship: XIST high ⇒ RPS4Y1 low [Day et al. 07]. [Scatter plot: XIST and RPS4Y1]

18 Tissue Specific: ACPP (acid phosphatase, prostate): prostate-specific gene. GABRB1 (GABA A receptor, beta 1): brain specific. Boolean relationship: ACPP high ⇒ GABRB1 low. [Scatter plot: ACPP and GABRB1]

19 Development: HOXD3 (homeobox D3): fruit fly antennapedia homolog. HOXA13 (homeobox A13): fruit fly ultrabithorax homolog. Boolean relationship: HOXD3 high ⇒ HOXA13 low [Rinn et al. 07]. [Scatter plot: HOXD3 and HOXA13]

20 Differentiation: PTPRC (protein tyrosine phosphatase, receptor type, C; B220): expressed in B cell precursors and mature B cells. CD19: expressed in mature B cells. Boolean relationship: PTPRC low ⇒ CD19 low. [Scatter plot: PTPRC and CD19]

21 Biological Insights: Gender (XIST, RPS4Y1); Tissue (ACPP, GABRB1); Development (HOXD3, HOXA13); Differentiation (PTPRC, CD19)

23 Conserved Boolean Networks: Find orthologs between human, mouse and fly using the EUGene database [Gilbert, 02]. Search for orthologous gene pairs that have the same Boolean relationship. [Diagram labels: Human 208M, Mouse 336M, Fly 17M, 4M, 41K]
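The search for conserved relationships can be sketched as a lookup through an ortholog table. This is an illustration under stated assumptions: relationships are stored as dictionaries keyed by ordered gene pairs, the ortholog map is one-to-one, and the function name and label strings are hypothetical. The toy input echoes the BUB1B and CCNB2 example from the talk and is not real data.

```python
def conserved(rel_a, rel_b, orthologs):
    """Keep relationships from species A that also hold in species B.

    rel_a, rel_b: dict mapping (gene_1, gene_2) -> relationship label.
    orthologs: dict mapping a species-A gene to its species-B ortholog
    (assumed one-to-one for this sketch).
    """
    kept = {}
    for (g1, g2), label in rel_a.items():
        o1, o2 = orthologs.get(g1), orthologs.get(g2)
        if o1 is None or o2 is None:
            continue  # at least one gene has no known ortholog
        if rel_b.get((o1, o2)) == label:
            kept[(g1, g2)] = label
    return kept


# Toy input for illustration only.
human = {("BUB1B", "CCNB2"): "equivalent", ("XIST", "RPS4Y1"): "high=>low"}
mouse = {("Bub1b", "Ccnb2"): "equivalent"}
human_to_mouse = {"BUB1B": "Bub1b", "CCNB2": "Ccnb2"}
shared = conserved(human, mouse, human_to_mouse)
```

Applying the same function again with a mouse-to-fly map would narrow the result to pairs conserved across all three species.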

24 Conserved Boolean Relationships: The two largest connected components in the network of equivalent genes: 178 genes, highly enriched for cell-cycle and DNA replication; 32 genes, highly enriched for synaptic functions. [Diagram: Fly Bub1 and CycB; Mouse Bub1b and Ccnb2; Human BUB1B and CCNB2]
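Finding the largest connected components of the equivalence network can be sketched with a depth-first traversal over an undirected graph. The function name and the toy gene labels are hypothetical; the input is assumed to be a list of gene pairs already found to be equivalent.

```python
def connected_components(pairs):
    """Return connected components of an undirected graph, largest first."""
    adjacency = {}
    for a, b in pairs:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        # Iterative depth-first search from an unvisited gene.
        stack = [start]
        seen.add(start)
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return sorted(components, key=len, reverse=True)


# Hypothetical gene labels, for illustration only.
groups = connected_components([("g1", "g2"), ("g2", "g3"), ("g4", "g5")])
```

Each resulting component is a set of genes linked by chains of equivalence, which is the kind of group that could then be checked for functional enrichment.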

25 Conserved Asymmetric Boolean Relationships: GABRB1-expressing cells have low cell cycle (BUB1B) activity. [Diagram: Fly Bub1 and Lcch3; Mouse Bub1b and Gabrb1; Human BUB1B and GABRB1]

27 Conclusion: Boolean analysis: Boolean relationships are directly visible on the scatter plot; enables discovery of asymmetric relationships; can reveal known biological processes; has potential for new biological discovery. Boolean network: is large; is not scale free.

28 Acknowledgements: The Felsher Lab: Natalie Wu, Cathy Shachaf, Dean Felsher. Leonore A Herzenberg, James Brooks, Joe Lipsick, Gavin Sherlock, Howard Chang, Stuart Kim. Funding: ICBP Program (NIH grant: 5U56CA112973-02).

29 The END

30 Example

31 Determine threshold: It's hard to determine a threshold for this gene. StepMiner usually puts a threshold in the middle for this case.

32 Statistical Tests: Compute the expected number of points under the independence model. Compute the maximum likelihood estimate of the error rate. $\text{statistic} = \dfrac{\text{expected} - \text{observed}}{\sqrt{\text{expected}}}$; $\text{error rate} = \dfrac{1}{2}\left(\dfrac{a_{00}}{a_{00} + a_{01}} + \dfrac{a_{00}}{a_{00} + a_{10}}\right)$. [Diagram: quadrant counts $a_{00}$, $a_{01}$, $a_{11}$, $a_{10}$]
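The two quantities on this slide can be computed for the quadrant counted by a00, as in the sketch below. It assumes the expected count under independence is the product of the two marginal totals through a00 divided by the total number of points, and that both marginal totals are nonzero. The function name is hypothetical and no cutoff values are implied.

```python
import math


def sparse_quadrant_test(a00, a01, a10, a11):
    """Return (statistic, error_rate) for the quadrant counted by a00.

    Assumes (a00 + a01) and (a00 + a10) are both nonzero.
    """
    total = a00 + a01 + a10 + a11
    # Expected count in the a00 quadrant if the two genes were independent.
    expected = (a00 + a01) * (a00 + a10) / total
    observed = a00
    statistic = (expected - observed) / math.sqrt(expected)
    # Average fraction of each margin that falls in the a00 quadrant.
    error_rate = 0.5 * (a00 / (a00 + a01) + a00 / (a00 + a10))
    return statistic, error_rate
```

A large statistic together with a small error rate indicates that the quadrant holds far fewer points than independence would predict, which is the condition for calling it sparse.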

