Artificial Intelligence Term Project #3
Kyu-Baek Hwang
Biointelligence Lab, School of Computer Science and Engineering, Seoul National University
kbhwang@bi.snu.ac.kr
Outline
Bayesian network – revisit I
  Properties of Bayesian networks
  Structural learning of Bayesian networks
Project 3-1
  Analysis of structural learning algorithms
  ALARM dataset
Bayesian network – revisit II
  Bayesian network classifiers (probabilistic inference)
Project 3-2
  Classification of microarray gene expression data using Bayesian networks
Bayesian Network
The joint probability distribution over all the variables factors according to the network structure:
  P(X_1, ..., X_n) = Π_i P(X_i | Pa(X_i)),
where P(X_i | Pa(X_i)) is the local probability distribution for X_i given its parents.
(Figure: an example network over the variables A, B, C, D, E.)
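A minimal sketch of this factorization, assuming a made-up two-node network A → B with illustrative CPT values (not the network in the figure):

# Joint probability as a product of local (conditional) distributions.
# The network A -> B and its CPT entries are made up for illustration.
cpts = {
    "A": {(): {0: 0.6, 1: 0.4}},                 # P(A): no parents
    "B": {(0,): {0: 0.7, 1: 0.3},                # P(B | A = 0)
          (1,): {0: 0.2, 1: 0.8}},               # P(B | A = 1)
}
parents = {"A": (), "B": ("A",)}

def joint_probability(assignment):
    """P(x_1, ..., x_n) = prod_i P(x_i | parents(x_i))."""
    p = 1.0
    for var, table in cpts.items():
        parent_values = tuple(assignment[pa] for pa in parents[var])
        p *= table[parent_values][assignment[var]]
    return p

print(joint_probability({"A": 1, "B": 0}))       # 0.4 * 0.2 = 0.08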
Generative Model
From the underlying distribution, a set of data examples can be generated.
Any conditional probability of interest can be calculated from the joint probability distribution.
(Figure: an example network over Class and Genes A–H.)
This Bayesian network can classify examples by calculating the appropriate conditional probability, P(Class | other variables).
Classification by Bayesian Networks I
Calculate the conditional probability of the 'Class' variable given the values of the other variables.
Infer the conditional probability from the joint probability distribution. For example,
  P(Class = c | x_1, ..., x_n) = P(Class = c, x_1, ..., x_n) / Σ_c' P(Class = c', x_1, ..., x_n),
where the summation is taken over all possible class values.
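A sketch of this computation, assuming a made-up two-node network Class → Gene with illustrative probabilities; classify() normalizes the joint probability over the possible class values:

# P(Class | evidence) obtained by normalizing the joint distribution over
# the class values. The network and the numbers are illustrative only.
parents = {"Class": (), "Gene": ("Class",)}
cpts = {
    "Class": {(): {0: 0.5, 1: 0.5}},
    "Gene":  {(0,): {0: 0.9, 1: 0.1},
              (1,): {0: 0.3, 1: 0.7}},
}

def joint(assignment):
    p = 1.0
    for var in cpts:
        pa_vals = tuple(assignment[pa] for pa in parents[var])
        p *= cpts[var][pa_vals][assignment[var]]
    return p

def classify(evidence, class_values=(0, 1)):
    # P(Class=c | evidence) = P(Class=c, evidence) / sum_c' P(Class=c', evidence)
    scores = {c: joint({**evidence, "Class": c}) for c in class_values}
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

print(classify({"Gene": 1}))    # {0: 0.125, 1: 0.875}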
Knowing the Causal Structure
(Figure: the learned network over Class and Genes A–H.)
Gene C regulates Genes E and F.
Gene D regulates Genes G and H.
Class has an effect on Genes F and G.
The structure provides a set of comprehensible rules (or knowledge).
Learning Bayesian Networks
Metric approach
  Use a scoring metric to measure how well a particular structure fits an observed set of cases.
  A search algorithm is used.
  Find a canonical form of an equivalence class.
Independence approach
  An independence oracle (approximated by some statistical test) is queried to identify the equivalence class that captures the independencies in the distribution from which the data was generated.
  Search for a PDAG.
Scoring Metrics for Bayesian Networks
Likelihood: L(G, Θ_G, C) = P(C | G^h, Θ_G)
  G^h: the hypothesis that the given data C were generated by a distribution that can be factored according to G.
  Θ_G: the parameters of that factorization.
The maximum likelihood metric of G (the entropy metric with opposite sign):
  Score_ML(G, C) = max over Θ_G of Σ_{j=1..N} log P(x_j | G^h, Θ_G),
  where N is the data size and x_j is the jth example.
This metric prefers complete graph structures.
Information Criterion Scoring Metrics
The Akaike information criterion (AIC) metric:
  Score_AIC(G, C) = log P(C | Θ̂_G, G^h) − Dim(G)
The Bayesian information criterion (BIC) metric:
  Score_BIC(G, C) = log P(C | Θ̂_G, G^h) − (Dim(G) / 2) log N,
where Dim(G) is the number of free parameters of G and Θ̂_G is the maximum likelihood parameter setting.
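A sketch of the maximum likelihood, AIC, and BIC scores for a fixed structure on fully observed discrete data; the data representation (each example a dict, a parent map, and variable arities) is an assumption made for illustration:

import math
from collections import Counter

def log_likelihood(data, parents, arities):
    """Max-likelihood log P(C | G) with parameters set to empirical frequencies."""
    ll = 0.0
    for var, pas in parents.items():
        counts = Counter((tuple(row[p] for p in pas), row[var]) for row in data)
        parent_counts = Counter(tuple(row[p] for p in pas) for row in data)
        for (pa_cfg, _value), n in counts.items():
            ll += n * math.log(n / parent_counts[pa_cfg])
    return ll

def num_free_parameters(parents, arities):
    k = 0
    for var, pas in parents.items():
        q = 1
        for p in pas:
            q *= arities[p]                      # number of parent configurations
        k += q * (arities[var] - 1)              # free parameters per configuration
    return k

def aic(data, parents, arities):
    return log_likelihood(data, parents, arities) - num_free_parameters(parents, arities)

def bic(data, parents, arities):
    return (log_likelihood(data, parents, arities)
            - 0.5 * num_free_parameters(parents, arities) * math.log(len(data)))

# toy example with two binary variables and the structure A -> B
data = [{"A": 0, "B": 0}, {"A": 0, "B": 1}, {"A": 1, "B": 1}, {"A": 1, "B": 1}]
print(bic(data, {"A": (), "B": ("A",)}, {"A": 2, "B": 2}))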
MDL Scoring Metrics
The minimum description length (MDL) metric 1
The minimum description length (MDL) metric 2
Bayesian Scoring Metrics
A Bayesian metric
The BDe (Bayesian Dirichlet & likelihood equivalence) metric
Prior on the network structure
Greedy Search Algorithm
Generate an initial Bayesian network structure G_0.
For m = 1, 2, 3, ..., until convergence:
  Among the possible local changes to G_{m−1} (insertion of an edge, reversal of an edge, and deletion of an edge), perform the one that leads to the largest improvement in the score. The resulting graph is G_m.
Stopping criterion: Score(G_{m−1}) == Score(G_m).
At each iteration (when learning a Bayesian network of n variables), O(n^2) local changes must be evaluated to select the best one.
Random restarts are usually adopted to escape local maxima.
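A sketch of this greedy hill-climbing loop; it assumes the caller supplies some scoring function score(structure) (e.g., one of the metrics above), and represents a structure as a map from each variable to its set of parents (random restarts are omitted for brevity):

import itertools

def is_acyclic(parents):
    """Check that the parent map (variable -> set of parents) describes a DAG."""
    visited, on_stack = set(), set()
    def dfs(node):
        if node in on_stack:
            return False
        if node in visited:
            return True
        visited.add(node)
        on_stack.add(node)
        ok = all(dfs(p) for p in parents[node])
        on_stack.discard(node)
        return ok
    return all(dfs(v) for v in parents)

def neighbors(parents):
    """All structures reachable by inserting, deleting, or reversing one edge."""
    for child, parent in itertools.permutations(list(parents), 2):
        candidate = {v: set(ps) for v, ps in parents.items()}
        if parent in parents[child]:
            candidate[child].discard(parent)          # deletion
            yield candidate
            reversed_cand = {v: set(ps) for v, ps in parents.items()}
            reversed_cand[child].discard(parent)      # reversal
            reversed_cand[parent].add(child)
            if is_acyclic(reversed_cand):
                yield reversed_cand
        else:
            candidate[child].add(parent)              # insertion
            if is_acyclic(candidate):
                yield candidate

def greedy_search(variables, score):
    """Start from the empty graph; stop when no local change improves the score."""
    current = {v: set() for v in variables}
    current_score = score(current)
    while True:
        best, best_score = None, current_score
        for candidate in neighbors(current):
            s = score(candidate)
            if s > best_score:
                best, best_score = candidate, s
        if best is None:          # Score(G_{m-1}) == Score(G_m): converged
            return current
        current, current_score = best, best_score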
Project 3-1
Analysis of structural learning algorithms
Generate data from the ALARM network with various data set sizes, e.g., 1000, 3000, 5000, 10000.
Learn Bayesian network structures by greedy search (hill-climbing) with several kinds of scoring metrics.
Compare the results with respect to edge errors across the different sample sizes and learning methods.
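One possible way to count edge errors when comparing a learned structure against the true ALARM structure (missing, extra, and reversed edges); the edge names below are illustrative:

def edge_errors(true_edges, learned_edges):
    """Edges are (parent, child) pairs."""
    true_edges, learned_edges = set(true_edges), set(learned_edges)
    reversed_edges = {(a, b) for (a, b) in learned_edges if (b, a) in true_edges}
    extra   = learned_edges - true_edges - reversed_edges
    missing = {(a, b) for (a, b) in true_edges
               if (a, b) not in learned_edges and (b, a) not in learned_edges}
    return {"missing": len(missing), "extra": len(extra), "reversed": len(reversed_edges)}

print(edge_errors({("A", "B"), ("B", "C")}, {("A", "B"), ("C", "B"), ("A", "C")}))
# {'missing': 0, 'extra': 1, 'reversed': 1}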
ALARM Network
Number of nodes: 37
Number of edges: 46
Number of possible values per variable: 2–4
Data Generation
Using Netica (http://www.norsys.com)
Structural Learning
WEKA (http://www.cs.waikato.ac.nz/ml/weka/)
http://www.cs.waikato.ac.nz/~remco/weka_bn/
Probabilistic Inference
Calculate the conditional probability of a variable given the values of the observed variables.
  Junction tree algorithm
  Sampling methods
General probabilistic inference is intractable (it is known to be NP-hard).
However, calculating the conditional probability needed for classification is rather straightforward because of a property of the Bayesian network structure (d-separation).
Markov Blanket
Variables of interest: X = {X_1, X_2, ..., X_n}.
For a variable X_i, its Markov blanket MB(X_i) is a subset of X − {X_i} which satisfies the following:
  P(X_i | MB(X_i), Y) = P(X_i | MB(X_i)) for every Y ⊆ X − MB(X_i) − {X_i},
i.e., X_i is conditionally independent of all the remaining variables given MB(X_i).
Markov boundary: a minimal Markov blanket.
Markov Blanket in a Bayesian Network
Given the Bayesian network structure, determining the Markov blanket of a variable is straightforward, by the conditional independence assertions encoded in the graph.
(Figure: the Class/Gene network from the earlier slides.)
The Markov blanket of a node in a Bayesian network consists of all of its parents, spouses (the other parents of its children), and children.
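A sketch of reading the Markov blanket off the structure; the edge list mirrors part of the Class/Gene network on the slides but is only illustrative:

def markov_blanket(node, edges):
    """Parents, children, and spouses (other parents of the children) of `node`."""
    parents  = {a for (a, b) in edges if b == node}
    children = {b for (a, b) in edges if a == node}
    spouses  = {a for (a, b) in edges if b in children and a != node}
    return parents | children | spouses

edges = [("Gene B", "Class"), ("Class", "Gene F"), ("Class", "Gene G"),
         ("Gene C", "Gene F"), ("Gene D", "Gene G")]
print(markov_blanket("Class", edges))
# parents, children, and spouses: Gene B, Gene F, Gene G, Gene C, Gene D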
Classification by Bayesian Networks II
Project 3-2
Classification using Bayesian networks
Evaluate the performance (classification accuracy) of a Bayesian network classifier under various parameter settings, e.g., scoring metrics and learning methods.
If possible, compare with other learning methods such as neural networks and decision trees.
Use leave-one-out cross-validation.
Use WEKA.
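A sketch of the leave-one-out protocol; train and predict are placeholders for whatever classifier is evaluated (e.g., a Bayesian network learned in WEKA), not a real API:

def leave_one_out_accuracy(examples, labels, train, predict):
    """Hold out each example in turn, train on the rest, and count correct predictions."""
    correct = 0
    for i in range(len(examples)):
        train_x = examples[:i] + examples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = train(train_x, train_y)
        if predict(model, examples[i]) == labels[i]:
            correct += 1
    return correct / len(examples)

# toy usage with a majority-class "classifier"
train = lambda X, y: max(set(y), key=y.count)
predict = lambda model, x: model
print(leave_one_out_accuracy([[0], [1], [2], [3]], [0, 0, 1, 0], train, predict))  # 0.75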
Molecular Biology: Central Dogma
(Figure: the central dogma and a DNA microarray.)
DNA Microarrays
Monitor thousands of gene expression levels simultaneously (in contrast to traditional one-gene-at-a-time experiments).
Fabricated by high-speed robotics.
Use known probes.
Types of DNA Microarrays
Oligonucleotide chips: an array of oligonucleotide probes (20–80-mer oligos) is synthesized.
cDNA microarrays: probe cDNA (500–5,000 bases long) is immobilized on a solid surface.
Study
"Treatment-specific changes in gene expression discriminate in vivo drug response in human leukemia cells," M. H. Cheok et al., Nature Genetics 35, 2003.
60 leukemia patients
Bone marrow samples
Affymetrix GeneChip arrays
Gene expression data
Gene Expression Data
Number of data examples: 120 (60 before treatment, 60 after treatment)
Number of genes measured: 12,600 (Affymetrix HG-U95A array)
Task: classification between "before treatment" and "after treatment" based on the gene expression pattern
Affymetrix GeneChip Arrays
Use short oligos to detect gene expression levels; each gene is probed by a set of short oligos.
Each gene expression level is summarized by:
  Signal: a numerical value describing the abundance of mRNA
  A/P (absent/present) call: denotes the statistical significance of the signal
Preprocessing
Remove the genes having more than 60 'A' (absent) calls: the number of genes drops from 12,600 to 3,190.
Discretize the gene expression levels to 0 (low) and 1 (high), using the median gene expression value of each sample as the threshold.
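A sketch of this preprocessing, assuming a genes × samples array of signal values and a matching array of A/M/P call characters (the array names are made up):

import numpy as np

def preprocess(signal, calls, max_absent_calls=60):
    """signal, calls: (genes x samples) arrays of expression values and A/M/P calls."""
    # 1. Remove genes with more than 60 'A' (absent) calls across the samples.
    keep = (calls == "A").sum(axis=1) <= max_absent_calls
    signal = signal[keep]
    # 2. Discretize: 1 (high) if the value exceeds the median of its sample, else 0 (low).
    sample_medians = np.median(signal, axis=0)      # one median per sample (column)
    return keep, (signal > sample_medians).astype(int)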
Gene Filtering
Rank genes by mutual information with the class, estimated from the empirical probabilities: the number of genes drops from 3,190 to 50.
Final dataset
  Number of attributes: 51 (50 genes plus one class attribute)
  Class: 0 (after treatment), 1 (before treatment)
  Number of data examples: 120
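A sketch of the mutual information filter on the discretized data; the data layout (a dict from gene name to a list of 0/1 values, plus a list of class labels) is an assumption:

import math
from collections import Counter

def mutual_information(x, y):
    """I(X; Y) estimated from the empirical joint and marginal probabilities."""
    n = len(x)
    p_xy, p_x, p_y = Counter(zip(x, y)), Counter(x), Counter(y)
    mi = 0.0
    for (a, b), c in p_xy.items():
        pxy = c / n
        mi += pxy * math.log(pxy / ((p_x[a] / n) * (p_y[b] / n)))
    return mi

def top_genes(expression, labels, k=50):
    """expression: dict gene -> list of 0/1 values; labels: list of class values."""
    ranked = sorted(expression,
                    key=lambda g: mutual_information(expression[g], labels),
                    reverse=True)
    return ranked[:k]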
Final Dataset
120 examples × 51 attributes
Submission
Deadline: December 2, 2004
Location: Room 301-419