Published by Jodie Tate. Modified over 9 years ago.
1
Stanford University
Boolean Analysis of Large Gene-expression Datasets
Debashis Sahoo, PhD Candidate, Electrical Engineering
Joint work with David Dill, Andrew Gentles, Rob Tibshirani, Sylvia Plevritis
2
Outline
  Standard microarray work flow
  Data collection and preprocessing
  Boolean analysis
  Biological insights
  Conclusion and future work
3
Microarray Work Flow
  mRNA → Hybridization → Scanning → Image processing → Normalization → Data analysis
4
Data Collection
  Thousands of microarrays are freely available from public repositories:
    GEO (Gene Expression Omnibus)
    ArrayExpress
    SMD (Stanford Microarray Database)
    Celsius
5
Preprocessing
  Collect the original raw CEL files for a single platform.
    Typical number of CEL files: 2,000-11,000
  Normalize the CEL files with RMA.
    Needs a memory-efficient algorithm
    Generates an expression value for each probeset
6
Existing Methods
  Correlation analysis
  Conditional probability
  Mutual information
7
Boolean Analysis
  Get raw data → Normalize → Determine thresholds → Discover Boolean relationships → New biology
8
Example
9
Determine threshold
  Sort the expression values of each gene.
  Use StepMiner to determine the threshold.
10
Determine threshold
  It's hard to determine a threshold for this gene.
  StepMiner usually places the threshold in the middle in this case.
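The thresholding idea on these two slides (sort the values, then place a step) can be illustrated with a small sketch. This is not the StepMiner implementation: it fits a single step to the sorted values by least squares, and the choice to return the midpoint of the two fitted levels is an assumption made here for illustration. The function name `step_threshold` is hypothetical.

```python
import numpy as np

def step_threshold(values):
    """Fit a one-step function to the sorted values and return a threshold.

    Illustrative sketch only; the name and the midpoint rule are assumptions.
    """
    x = np.sort(np.asarray(values, dtype=float))
    n = x.size
    if n < 2:
        raise ValueError("need at least two expression values")

    # Prefix sums let us evaluate the squared error of every split in O(n).
    csum = np.cumsum(x)
    csum2 = np.cumsum(x * x)
    k = np.arange(1, n)                      # number of values in the low group
    left_sum, left_sq = csum[:-1], csum2[:-1]
    right_sum = csum[-1] - left_sum
    right_sq = csum2[-1] - left_sq

    # Sum of squared errors around the mean of each side of the split.
    sse = (left_sq - left_sum ** 2 / k) + (right_sq - right_sum ** 2 / (n - k))
    best = int(np.argmin(sse))               # split falls after x[best]

    low_mean = left_sum[best] / k[best]
    high_mean = right_sum[best] / (n - k[best])
    # Assumption: use the midpoint between the two fitted levels.
    return (low_mean + high_mean) / 2.0

# Example with made-up expression values for one gene.
threshold = step_threshold([2.1, 2.3, 2.0, 7.9, 8.2, 8.0])
```

For a gene whose sorted values rise smoothly with no clear jump, the best split tends to fall near the middle of the range, which matches the behavior described on the second slide.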
11
Discover Boolean Relationships
  Analyze scatter plots between two genes.
  Divide the space into four regions (quadrants) using the thresholds.
  Determine sparse quadrants.
  Determine the Boolean relationships.
  [Figure: scatter plot illustrating WNT5A high ⇒ PAX5 low, with quadrants numbered 0, 1, 2, 3]
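The quadrant step can be sketched as a 2x2 count table. This is a minimal illustration under assumptions not stated on the slide: values strictly above a threshold count as "high", everything else as "low", and the indexing `counts[a_level, b_level]` (0 = low, 1 = high) is a convention chosen here, not the 0-3 numbering shown in the figure. The function name is hypothetical.

```python
import numpy as np

def quadrant_counts(a_expr, b_expr, a_thr, b_thr):
    """Count samples per quadrant for genes A and B.

    Returns a 2x2 integer array where counts[i, j] is the number of samples
    with A at level i and B at level j (0 = low, 1 = high).
    """
    a_high = (np.asarray(a_expr, dtype=float) > a_thr).astype(int)
    b_high = (np.asarray(b_expr, dtype=float) > b_thr).astype(int)
    if a_high.shape != b_high.shape:
        raise ValueError("both genes must be measured on the same samples")

    counts = np.zeros((2, 2), dtype=int)
    # Unbuffered add so repeated (i, j) index pairs are all counted.
    np.add.at(counts, (a_high, b_high), 1)
    return counts

# Made-up values for five samples, both thresholds set to 3.
counts = quadrant_counts([1, 1, 5, 5, 5], [1, 5, 1, 5, 5], 3, 3)
```

A quadrant whose count is far below what the other counts would predict is a candidate sparse quadrant.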
12
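Statistical Tests
  Compute the expected number of points under the independence model.
  Compute the maximum likelihood estimate of the error rate.
  With a_{00}, a_{01}, a_{10}, a_{11} denoting the number of points in the four quadrants, the test for quadrant a_{00} is:

  $$\text{statistic} = \frac{\text{expected} - \text{observed}}{\sqrt{\text{expected}}}$$

  $$\text{error rate} = \frac{1}{2}\left(\frac{a_{00}}{a_{00} + a_{01}} + \frac{a_{00}}{a_{00} + a_{10}}\right)$$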
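The two quantities on this slide can be computed directly from the four quadrant counts. The sketch below assumes a00 is the quadrant being tested, that a01 and a10 are its row and column neighbors, and that the expected count under independence is (row total × column total) / grand total. The cutoffs `stat_cutoff` and `err_cutoff` are illustrative defaults, not values given on the slides, and the function name is hypothetical.

```python
import math

def sparse_quadrant_test(a00, a01, a10, a11, stat_cutoff=3.0, err_cutoff=0.1):
    """Test whether the quadrant with count a00 is sparse.

    Returns (statistic, error_rate, is_sparse).
    """
    total = a00 + a01 + a10 + a11
    row = a00 + a01          # all points sharing the tested quadrant's A level
    col = a00 + a10          # all points sharing the tested quadrant's B level
    if row == 0 or col == 0:
        raise ValueError("test is undefined when a row or column is empty")

    # Expected count in the tested quadrant if A and B were independent.
    expected = row * col / total
    statistic = (expected - a00) / math.sqrt(expected)

    # Average fraction of the row and of the column that lands in the quadrant.
    error_rate = 0.5 * (a00 / row + a00 / col)

    is_sparse = statistic > stat_cutoff and error_rate < err_cutoff
    return statistic, error_rate, is_sparse

# Made-up counts: the tested quadrant is empty, the others are well populated.
stat, err, sparse = sparse_quadrant_test(0, 50, 50, 100)
```

Requiring both conditions means a quadrant counts as sparse only if it holds far fewer points than independence predicts and those points are a small fraction of their row and column.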
13
Boolean Relationships
  Tightly co-regulated genes form two sparse quadrants.
  There are six possible Boolean relationships:
    Equivalent
    Opposite
    A low ⇒ B low
    A low ⇒ B high
    A high ⇒ B low
    A high ⇒ B high
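The six relationships follow from which quadrants are sparse: an implication such as A high ⇒ B low holds when the quadrant that would violate it (A high, B high) is sparse, and two diagonal sparse quadrants give a symmetric relationship. The sketch below encodes that mapping; the function name and the representation of quadrants as `(a_level, b_level)` tuples are assumptions made for illustration.

```python
def classify_relationship(sparse_quadrants):
    """Map a set of sparse quadrants to one of the six Boolean relationships.

    Each quadrant is a tuple (a_level, b_level) with levels "low" or "high".
    Returns a description string, or None if no relationship applies.
    """
    sparse = frozenset(sparse_quadrants)

    # Two sparse quadrants on a diagonal: symmetric relationships.
    symmetric = {
        frozenset({("low", "high"), ("high", "low")}): "A equivalent B",
        frozenset({("low", "low"), ("high", "high")}): "A opposite B",
    }
    # One sparse quadrant: the implication that the quadrant would violate.
    asymmetric = {
        ("low", "high"): "A low => B low",
        ("low", "low"): "A low => B high",
        ("high", "high"): "A high => B low",
        ("high", "low"): "A high => B high",
    }

    if sparse in symmetric:
        return symmetric[sparse]
    if len(sparse) == 1:
        return asymmetric.get(next(iter(sparse)))
    return None

relation = classify_relationship({("high", "high")})
```

Each implication is logically the same as its contrapositive, so "A high => B low" could equally be reported as "B high => A low".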
14
Boolean Relationships
  Symmetric: Equivalent, Opposite
  Asymmetric:
    PTPRC low ⇒ CD19 low
    XIST high ⇒ RPS4Y1 low
    COL3A1 high ⇒ COL1A1 high
    FAM60A low ⇒ NUAK1 high
15
Boolean Implication Network
  Directed graph
  Nodes: for each gene A, two nodes: A high and A low
  Edges: an edge from A high to B low represents A high ⇒ B low
  [Figure: example network with nodes A high, A low, B high, B low, C high, C low]
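A network of this shape can be sketched as an adjacency map keyed by `(gene, level)` nodes. The function name, the input format, and the optional `add_contrapositive` flag are assumptions made for illustration; the flag relies only on the logical fact that A high ⇒ B low is equivalent to B high ⇒ A low.

```python
def build_implication_network(implications, add_contrapositive=False):
    """Build a directed graph from Boolean implications.

    implications: iterable of ((gene_a, level_a), (gene_b, level_b)) pairs,
    each meaning "gene_a level_a => gene_b level_b" with levels "high"/"low".
    Returns a dict mapping each node to the set of nodes it points to.
    """
    flip = {"high": "low", "low": "high"}
    graph = {}

    def add_edge(src, dst):
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())   # make sure the target node exists too

    for src, dst in implications:
        add_edge(src, dst)
        if add_contrapositive:
            # (A x => B y) is equivalent to (B not-y => A not-x).
            add_edge((dst[0], flip[dst[1]]), (src[0], flip[src[1]]))
    return graph

# Hypothetical genes A, B, C, as on the slide.
network = build_implication_network(
    [(("A", "high"), ("B", "low")), (("A", "low"), ("C", "high"))],
    add_contrapositive=True,
)
```

Using a set for each adjacency list keeps duplicate implications from producing duplicate edges.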
16
New Biology
  This slide is under construction.
17
Biological Insights
  Gender
  Organ
  Tissue
  Development
  Differentiation
  Co-expression
18
Example Application
  Immunology
  B cell differentiation
  Goal: discover genes that mark unique B cell precursors
19
Differentiation Tree
  Hematopoietic stem cell (HSC) differentiation is a tree.
  Root: HSC
  Leaves:
    Lymphocytes: B cell, T cell, NK cell, Dendritic cell
    Erythrocytes
    Granulocytes: Basophil, Neutrophil, Eosinophil
    Monocytes: Dendritic cell
    Thrombocytes
20
[Figure: diagram of markers KIT, A, B, B220, CD19 with labels "KIT high", "A high", "B low", "B220 low", "CD19 low", and "A high B low"]
21
Conclusion
  Boolean analysis
    Relationships are directly visible on the scatter plot.
    Enables discovery of asymmetric relationships.
    Follows biology.
    Potential application to immunology
  Future work
    Cancer progression
    New biology
22
Acknowledgements
  The Felsher Lab: Natalie Wu, Cathy Shachaf, Dean Felsher
  Leonore A Herzenberg, James Brooks, Joe Lipsick, Gavin Sherlock, Howard Chang, Stuart Kim
  Funding: ICBP Program (NIH grant: 5U56CA112973-02)
23
The END