1
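Estimation of Gene Networks from Multiple Types of Genomic Data Based on a Bayesian Approach
Seiya Imoto
Laboratory of DNA Information Analysis, Human Genome Center, Institute of Medical Science, University of Tokyo
imoto@ims.u-tokyo.ac.jp
August 5, 2004, Statistics Summer Seminar, Tutorial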
2
Topics
Gene network estimation using Bayesian networks from microarray data
Combining gene expression and biological knowledge
Using promoter elements detection together with gene expression
Drug targets identification from network information
Summary
(C) Copyright 2004 Seiya Imoto, Human Genome Center, University of Tokyo
4
Transfer of Information from DNA to Protein
DNA (gene, e.g. AGGTTCAGCGC) -> Transcription -> mRNA -> Translation -> Protein
Splicing: a process that removes introns and joins exons in RNAs.
exon: coding region
intron: noncoding region
(C) Copyright 2003 Seiya Imoto, Human Genome Center, University of Tokyo
5
Gene Regulatory Network
Gene A regulates gene B.
Gene A regulates gene C.
Gene C regulates gene D.
...
[Figure: gene A acting on the upstream sequence (TAAAC) of gene B, and the corresponding directed graph over nodes A, B, C, D]
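The regulatory statements on this slide can be stored directly as a directed graph. The following is a minimal sketch; the dictionary layout and the helper name `parents_of` are illustrative assumptions, not part of the original slides.

```python
# Directed graph of the slide's example: an edge parent -> child means
# "parent regulates child".
regulates = {
    "A": ["B", "C"],  # Gene A regulates gene B and gene C
    "B": [],
    "C": ["D"],       # Gene C regulates gene D
    "D": [],
}


def parents_of(graph, gene):
    """Return the genes that directly regulate `gene` (hypothetical helper)."""
    return sorted(p for p, children in graph.items() if gene in children)


parents_d = parents_of(regulates, "D")
parents_a = parents_of(regulates, "A")
```

Storing children per gene keeps edge insertion cheap; the parent sets needed by a Bayesian network are recovered by scanning the edges.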
6
cDNA Microarray Data

          Green      Red        Ratio    Log-ratio
Gene 1    59358.75   49173.21   0.8284   -0.2716
Gene 2    27366.57   71966.98   2.6297    1.3949
:         :          :          :         :

Ratio = Red / Green
Log-ratio = $\log_2(\text{Ratio})$

We have p genes' expression values measured by n microarrays, giving an n x p matrix X, where $x_{ij}$ is the j-th gene's expression value measured by the i-th microarray.
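The ratio and log-ratio columns can be recomputed from the green and red intensities. This is a small sketch using the two rows shown on the slide; the function name `log_ratio` is an assumption made for illustration.

```python
import math


def log_ratio(green, red):
    """Return (ratio, log2 ratio) for one spot: ratio = red / green."""
    ratio = red / green
    return ratio, math.log2(ratio)


# Intensities for Gene 1 and Gene 2 as given on the slide.
ratio1, lr1 = log_ratio(59358.75, 49173.21)
ratio2, lr2 = log_ratio(27366.57, 71966.98)
```

A negative log-ratio means the green channel is stronger than the red one, and a positive log-ratio means the opposite.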
7
Bayesian Networks 1
A graphical model for capturing causal relationships among random variables.
・ Directed acyclic graph
・ Markov relation between nodes
[Figure: directed acyclic graph over nodes X1, X2, X3, X4]
Decomposition of joint probability:
$P(X_1, \dots, X_p \mid G) = P(X_1 \mid P_1) \times \cdots \times P(X_p \mid P_p)$
Example: $P_1 = (X_2, X_3)$, where $P_j$ denotes the parents of $X_j$ in G.
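The decomposition of the joint probability can be checked numerically on a small discrete example. In the sketch below, only the parent set of X1, namely (X2, X3), comes from the slide; the edge X1 -> X4, the binary variables, and all probability values are assumptions chosen for illustration.

```python
from itertools import product

# Parent sets. P1 = (X2, X3) is from the slide; X4's parent is assumed.
parents = {"X1": ("X2", "X3"), "X2": (), "X3": (), "X4": ("X1",)}

# Hypothetical conditional probabilities P(node = 1 | parent values).
p_one = {
    "X2": {(): 0.3},
    "X3": {(): 0.6},
    "X1": {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.7, (1, 1): 0.9},
    "X4": {(0,): 0.2, (1,): 0.8},
}


def joint(assignment):
    """P(X1..X4 | G) as the product of P(Xj | parents of Xj)."""
    prob = 1.0
    for node, pa in parents.items():
        key = tuple(assignment[p] for p in pa)
        p1 = p_one[node][key]
        prob *= p1 if assignment[node] == 1 else 1.0 - p1
    return prob


names = sorted(parents)
total = sum(
    joint(dict(zip(names, values)))
    for values in product((0, 1), repeat=len(names))
)
all_ones = joint({"X1": 1, "X2": 1, "X3": 1, "X4": 1})
```

Because every factor is a proper conditional distribution, the product sums to one over all assignments, which is what makes the factorization a valid joint distribution.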
8
Bayesian Networks 2
[Figure: graph over genes g1, g2, g3, g4]
For the i-th microarray, the joint density is written as a product of conditional density functions, each with its own parameters.
The essential problem in constructing a graph with a Bayesian network is how to construct each conditional density function.
9
References
Pacific Symposium on Biocomputing, 7, 175-186 (2002)
Journal of Bioinformatics and Computational Biology, 1(2), 231-252 (2003)
10
Parent-child Relationship
[Figure: network over genes G1 to G9 in which G2, G3, and G4 are the parents of G1]
$G_1 = m_2(G_2) + m_3(G_3) + m_4(G_4) + \varepsilon$
11
Bayesian Network with B-spline Nonparametric Regression
Components: DAG and Markov assumption; nonparametric regression with B-splines; conditional density.
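A smooth function such as m_2, m_3, or m_4 in the parent-child relationship can be represented as a weighted sum of B-spline basis functions. The sketch below evaluates such a sum with the standard Cox-de Boor recursion; the knot placement, the number of basis functions, and the coefficient values are assumptions for illustration, and no fitting to data is performed.

```python
def bspline_basis(x, knots, k, degree):
    """Value at x of the k-th B-spline basis function (Cox-de Boor recursion)."""
    if degree == 0:
        return 1.0 if knots[k] <= x < knots[k + 1] else 0.0
    left_den = knots[k + degree] - knots[k]
    right_den = knots[k + degree + 1] - knots[k + 1]
    left = 0.0
    if left_den != 0:
        left = (x - knots[k]) / left_den * bspline_basis(x, knots, k, degree - 1)
    right = 0.0
    if right_den != 0:
        right = ((knots[k + degree + 1] - x) / right_den
                 * bspline_basis(x, knots, k + 1, degree - 1))
    return left + right


def smooth(x, coefs, knots, degree=3):
    """m(x) = sum_k coefs[k] * b_k(x), a smooth function built from B-splines."""
    return sum(c * bspline_basis(x, knots, k, degree) for k, c in enumerate(coefs))


degree = 3
n_basis = 6
# Equally spaced knots 0, 1, ..., 9; the basis is complete on [3, 6).
knots = [float(t) for t in range(n_basis + degree + 1)]

basis_sum = sum(bspline_basis(4.5, knots, k, degree) for k in range(n_basis))
constant_fit = smooth(4.5, [2.0] * n_basis, knots, degree)
wiggly = smooth(4.5, [0.0, 1.0, -1.0, 2.0, 0.5, 0.0], knots, degree)
```

Inside the interval where the basis is complete, the basis functions sum to one, so equal coefficients reproduce a constant function; unequal coefficients give a smooth nonlinear curve.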
12
Graph Selection Problem
[Figure: three candidate graphs over nodes 1, 2, 3, 4, 5]
Which is better?
13
Criterion for Selecting Network Structure
Posterior probability of the network G ∝ (prior probability of the network G) × (marginal likelihood of the data)
14
Laplace Approximation
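The formula on this slide did not survive extraction. As a generic illustration of the technique, and not the derivation used for the network criterion, the sketch below applies the Laplace approximation to a one-dimensional integral with a known answer: n! is the integral of exp(n log x - x) over x > 0, and expanding the exponent around its mode gives Stirling's formula. The function name is an assumption.

```python
import math


def laplace_log_factorial(n):
    """Laplace approximation to log(n!), the log of the integral of exp(h(x)),
    where h(x) = n*log(x) - x for x > 0."""
    x_hat = float(n)                     # mode: h'(x) = n/x - 1 = 0
    h_hat = n * math.log(x_hat) - x_hat  # exponent at the mode
    curvature = n / x_hat ** 2           # -h''(x_hat)
    # Integral ~ exp(h_hat) * sqrt(2*pi / curvature)
    return h_hat + 0.5 * math.log(2.0 * math.pi / curvature)


exact = math.lgamma(21)           # log(20!)
approx = laplace_log_factorial(20)
error = exact - approx
```

The approximation replaces the integrand by a Gaussian centred at the mode, so its accuracy improves as the integrand becomes more sharply peaked, which is the situation for a likelihood based on many observations.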
15
BNRC (Bayesian Network and Nonparametric Regression Criterion)
We choose the optimal graph structure that minimizes the value of the BNRC score.
16
Network Learning
In Bayesian networks, determining the optimal graph is known to be an NP-hard problem.
The number of possible directed acyclic graphs grows super-exponentially with the number of nodes (Robinson, 1973).
For scale: 1 chō (兆, one trillion) = $10^{12}$; 1 muryōtaisū (無量大数) = $10^{68}$.
We carefully apply a greedy hill-climbing algorithm to learn a gene network.
When the number of genes is small, such as 20 or 30, we have developed an algorithm for finding the optimal network (Sun Fire 15k, 96 CPUs, 1200 MHz).
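The size of the search space behind the citation (Robinson, 1973) can be made concrete by counting labeled directed acyclic graphs with the standard recurrence attributed to Robinson. This is a sketch; the function name `count_dags` is an assumption, and the node counts shown are arbitrary examples rather than figures from the slide.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n):
    """Number of labeled DAGs on n nodes:
    a(n) = sum_{k=1..n} (-1)^(k+1) * C(n, k) * 2^(k*(n-k)) * a(n-k), a(0) = 1."""
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


small_counts = [count_dags(n) for n in range(1, 6)]
digits_for_20 = len(str(count_dags(20)))  # number of decimal digits
```

Exhaustive enumeration becomes infeasible after only a handful of nodes, which is why heuristic search such as greedy hill-climbing is used for larger networks.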
17
Reference (Learning graph structure)
Pacific Symposium on Biocomputing, 9, 557-567 (2004)
Note: Structural learning of Bayesian networks is known to be an NP-hard problem.
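Because exact structure learning is NP-hard, a greedy search is a common fallback. The sketch below is a simplified first-improvement hill climber that toggles one edge at a time, rejects cyclic graphs, and minimizes a score. The `toy_score` function, which counts edges that differ from a made-up target graph, is a stand-in assumption and is not the BNRC score; edge reversal moves and random restarts are omitted.

```python
from itertools import permutations


def is_acyclic(nodes, edges):
    """Check acyclicity with Kahn's algorithm (repeatedly remove source nodes)."""
    indegree = {v: 0 for v in nodes}
    for _, child in edges:
        indegree[child] += 1
    queue = [v for v in nodes if indegree[v] == 0]
    removed = 0
    while queue:
        v = queue.pop()
        removed += 1
        for parent, child in edges:
            if parent == v:
                indegree[child] -= 1
                if indegree[child] == 0:
                    queue.append(child)
    return removed == len(nodes)


def hill_climb(nodes, score):
    """Greedy search over DAGs: accept any single-edge toggle that lowers the score."""
    edges = frozenset()
    best = score(edges)
    improved = True
    while improved:
        improved = False
        for edge in permutations(nodes, 2):
            candidate = edges - {edge} if edge in edges else edges | {edge}
            if not is_acyclic(nodes, candidate):
                continue
            value = score(candidate)
            if value < best:
                edges, best, improved = candidate, value, True
    return edges, best


nodes = ["A", "B", "C", "D"]
target = frozenset({("A", "B"), ("A", "C"), ("C", "D")})  # hypothetical truth


def toy_score(edges):
    """Stand-in score: number of edges that differ from the target graph."""
    return len(edges ^ target)


learned, best = hill_climb(nodes, toy_score)
```

With a real score the search can stop in a local optimum, so in practice it is usually repeated from several starting graphs.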
18
Reference (robust method)
Proc. 2nd Computational Methods in Systems Biology, Lecture Notes in Bioinformatics, Springer-Verlag (2004), in press.
20
Models
Microarray data: Bayesian network, $\prod_{i,j} f_j(x_{ij} \mid p_{ij}, \theta_j)$
Biological knowledge: prior of G, $\pi(G)$
Network G (nodes A, B, C, D): BNRC, the goodness of G
21
Prior probability of G
Gibbs distribution, where Z is a normalizing constant.
We need to optimize.
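The slide's formula is not recoverable from the text, so the following is only a generic sketch of a Gibbs distribution used as a prior over graphs: each graph receives a weight exp(-zeta * E(G)) and Z normalizes the weights. The symbol `zeta`, the energy values, and the graph labels are all assumptions for illustration; in this setting the energy would typically be low for graphs that agree with biological knowledge.

```python
import math


def gibbs_prior(energies, zeta):
    """Return pi(G) = exp(-zeta * E(G)) / Z for each graph G.

    `energies` maps a graph label to its (hypothetical) energy E(G);
    Z is the sum of the unnormalized weights.
    """
    weights = {g: math.exp(-zeta * e) for g, e in energies.items()}
    z = sum(weights.values())
    return {g: w / z for g, w in weights.items()}


# Made-up energies for three candidate graphs.
energies = {"G_a": 0.0, "G_b": 1.0, "G_c": 2.0}
prior = gibbs_prior(energies, zeta=1.0)
flat = gibbs_prior(energies, zeta=0.0)
```

Setting `zeta` to zero gives a uniform prior, while larger values concentrate the prior on low-energy graphs, so this kind of parameter controls how strongly prior knowledge is trusted.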
22
Reference
Journal of Bioinformatics and Computational Biology, 2(1), 77-98 (2004)
A preliminary version appeared at the 2nd IEEE Computational Systems Bioinformatics Conference (CSB2003).
24
Binding Site Information
There may exist a consensus motif in the upstream DNA sequences of co-regulated genes.
[Figure labels: Transcription Factor; SFF, ACE2, CLB2, SWI5; DNA Sequence (…AAAAGGTAAACAATAAC…); Consensus motifs; mRNA]
(C) Copyright 2004 Y. Tamada & S. Imoto, Human Genome Center, University of Tokyo
25
Reference
Bioinformatics, 19 Suppl. 2, ii227-ii236 (2003)
This paper appeared at the 2nd European Conference on Computational Biology (ECCB2003).
26
Algorithm
1. Estimate a gene network from microarray data alone.
2. Detect a consensus motif based on the network.
3. Re-estimate the network using both the microarray data and the result of motif detection.
27
A Result toward Discovery
We found the MCM1 binding motif in the upstream regions of REB1 and ARG2. This suggests that these genes are actually regulated by the MCM1-SFF complex.
[Figure labels: Known genes & their sequences; Newly found genes & their sequences]
29
Drug Targets Identification
[Figure labels: molecule; nuclear receptor; TF; drug-affected gene (×); Side effect!!]
Network information: to avoid side effects
30
References
Journal of Bioinformatics and Computational Biology, 1(3), 459-474 (2003)
DNA Research, 10, 19-25 (2003)
31
Estimation of Gene Networks
Data: expression data, binding site, P-P interaction
Statistical models: Boolean net., Bayesian net., differential equations, dynamic Bayes net.
Result: gene network (gene1, gene2, gene3)
Imoto et al. PSB 2002; De Hoon et al. PSB 2003; Kim et al. Biosystems 2004; Tamada et al. Bioinformatics 2003; Nariai et al. PSB 2004
32
Gene and Protein Networks
Data: expression data, binding site, P-P interaction, localization, evolution, ...
Result: gene network (gene1, gene2, gene3) and protein network