Learning with Hypergraphs: Discovery of Higher-Order Interaction Patterns from High-Dimensional Data

Learning with Hypergraphs: Discovery of Higher-Order Interaction Patterns from High-Dimensional Data
Moscow State University, Faculty of Computational Mathematics and Cybernetics, Feb. 22, 2007, Moscow, Russia
Byoung-Tak Zhang
Biointelligence Laboratory, School of Computer Science and Engineering
Brain Science, Cognitive Science, Bioinformatics Programs
Seoul National University, Seoul 151-742, Korea
btzhang@cse.snu.ac.kr
http://bi.snu.ac.kr/
I will talk about evolving DNA-encoded genetic programs in a test tube. We evaluate the potential of this approach by solving a medical diagnosis problem on a simulated DNA computer. Each individual genetic program represents a decision list of variable length, and the whole population takes part in making probabilistic decisions.

Probabilistic Graphical Models (PGMs)
- Represent the joint probability distribution over a set of random variables in graphical form.
- Two kinds: undirected PGMs and directed PGMs.
- Generative: the probability distribution for some variables given the values of other variables can be obtained (probabilistic inference).
- Examples over the variables A, B, C, D, E: C and D are independent given B. C asserts dependency between A and B. B and E are independent given C.
© 2007, SNU Biointelligence Lab, http://bi.snu.ac.kr/

Kinds of Graphical Models
- Undirected: Boltzmann Machines, Markov Random Fields
- Directed: Bayesian Networks, Latent Variable Models (Hidden Markov Models, Generative Topographic Mapping), Non-negative Matrix Factorization

Bayesian Networks
- BN = (S, P) consists of a network structure S and a set of local probability distributions P.
- [Figure: BN for detecting credit card fraud]
- The structure can be found by relying on prior knowledge of causal relationships.

From Bayes Nets to High-Order PGMs
[Figure: three models over the variables J, A, F, G, S: (1) Naïve Bayes, (2) Bayesian Net, (3) High-Order PGM]

The Hypernetworks

Hypergraphs
- A hypergraph is an (undirected) graph G whose edges each connect a non-empty set of vertices, i.e. G = (V, E), where V = {v1, v2, …, vn}, E = {E1, E2, …, En}, and Ei = {vi1, vi2, …, vim}.
- An m-hypergraph consists of a set V of vertices and a subset E of V[m], i.e. G = (V, E) with E ⊆ V[m], where V[m] is the set of subsets of V that have precisely m members.
- A hypergraph G is said to be k-uniform if every edge Ei in E has cardinality k.
- A hypergraph G is k-regular if every vertex has degree k, i.e. belongs to exactly k edges.
- Rem.: An ordinary graph is a 2-uniform hypergraph.

An Example Hypergraph
G = (V, E)
V = {v1, v2, v3, …, v7}
E = {E1, E2, E3, E4, E5}
E1 = {v1, v3, v4}
E2 = {v1, v4}
E3 = {v2, v3, v6}
E4 = {v3, v4, v6, v7}
E5 = {v4, v5, v7}
[Figure: drawing of the vertices v1 to v7 grouped by the hyperedges E1 to E5]
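The definitions of degree, k-uniformity, and k-regularity can be checked on the example hypergraph with a few lines of Python. This is a minimal sketch; the function names `degrees`, `is_k_uniform`, and `is_k_regular` are illustrative and do not come from the slides.

```python
from collections import Counter

# Vertices and hyperedges of the example hypergraph G = (V, E) from the slide.
V = {f"v{i}" for i in range(1, 8)}
E = {
    "E1": {"v1", "v3", "v4"},
    "E2": {"v1", "v4"},
    "E3": {"v2", "v3", "v6"},
    "E4": {"v3", "v4", "v6", "v7"},
    "E5": {"v4", "v5", "v7"},
}


def degrees(vertices, edges):
    """Degree of a vertex = number of hyperedges that contain it."""
    counts = Counter(v for edge in edges.values() for v in edge)
    return {v: counts.get(v, 0) for v in vertices}


def is_k_uniform(edges, k):
    """True if every hyperedge has exactly k vertices."""
    return all(len(edge) == k for edge in edges.values())


def is_k_regular(vertices, edges, k):
    """True if every vertex belongs to exactly k hyperedges."""
    return all(d == k for d in degrees(vertices, edges).values())


deg = degrees(V, E)
print(deg["v4"])             # v4 lies in E1, E2, E4 and E5
print(is_k_uniform(E, 3))    # edge sizes are 3, 2, 3, 4, 3
print(is_k_regular(V, E, 2))
```

The example is neither uniform nor regular, since its edge sizes and vertex degrees vary. An ordinary graph stored the same way passes `is_k_uniform(edges, 2)`.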

Hypernetworks [Zhang, DNA-2006]
- A hypernetwork is a hypergraph of weighted edges. It is defined as a triple H = (V, E, W), where V = {v1, v2, …, vn}, E = {E1, E2, …, En}, and W = {w1, w2, …, wn}.
- An m-hypernetwork consists of a set V of vertices and a subset E of V[m], i.e. H = (V, E, W) with E ⊆ V[m], where V[m] is the set of subsets of V that have precisely m members and W is the set of weights associated with the hyperedges.
- A hypernetwork H is said to be k-uniform if every edge Ei in E has cardinality k.
- A hypernetwork H is k-regular if every vertex has degree k.
- Rem.: An ordinary graph is a 2-uniform hypernetwork with wi = 1.

A Hypernetwork
[Figure: a hypernetwork over the vertices x1, x2, …, x15]

Learning with Hypernetworks

The Hypernetwork Model of Learning [Zhang, 2006]

Deriving the Learning Rule

Derivation of the Learning Rule

[Figure: building a hypernetwork from 4 training examples over the variables x1, …, x15 and the label y. Surviving values: example 1 has x1=1, x2=0, y=1; example 2 has x1=0, x2=1, y=0; example 3 has x1=0, x3=1; example 4 has x1=0, x8=1. In Rounds 1 to 3, hyperedges of three variables plus the label are sampled from each example: from example 1, (x1, x4, x10, y=1), (x1, x4, x12, y=1), (x4, x10, x12, y=1); from example 2, (x2, x3, x9, y=0), (x2, x3, x14, y=0), (x3, x9, x14, y=0); from example 3, (x3, x6, x8, y=1), (x3, x6, x13, y=1), (x6, x8, x13, y=1); from example 4, (x8, x11, x15, y=0).]
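The figure suggests that the library is built by repeatedly drawing small random subsets of variables from each training example and attaching the example's label. A minimal sketch of that sampling step follows. It assumes uniform sampling without replacement within one hyperedge. The function name, the seed, and the example values beyond x1 = 1 are hypothetical.

```python
import random


def sample_hyperedges(x, y, order, count, rng):
    """Draw `count` random hyperedges of the given order from one example.

    x maps a variable index to its 0/1 value. Each hyperedge keeps `order`
    (index, value) pairs of x together with the example's label y.
    """
    variables = sorted(x)
    edges = []
    for _ in range(count):
        chosen = rng.sample(variables, order)  # distinct variables
        edges.append((frozenset((i, x[i]) for i in chosen), y))
    return edges


rng = random.Random(0)  # fixed seed only to make the sketch repeatable

# Hypothetical example: 15 binary variables, label 1.
example_x = {i: 0 for i in range(1, 16)}
example_x[1] = 1
example_y = 1

library = sample_hyperedges(example_x, example_y, order=3, count=3, rng=rng)
for conditions, label in library:
    print(sorted(conditions), "-> y =", label)
```

Each sampled hyperedge is a partial copy of one example, so it matches that example by construction.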

Molecular Self-Assembly of Hypernetworks
[Figure: the molecular encoding of a hyperedge (xi, xj, y) next to the hypernetwork representation over the vertices X1, …, X8]

Encoding a Hypernetwork with DNA
a) Collection of (labeled) hyperedges:
z1 : (x1=0, x2=1, x3=0, y=1)
z2 : (x1=0, x2=0, x3=1, x4=0, x5=0, y=0)
z3 : (x2=1, x4=1, y=1)
z4 : (x2=1, x3=0, x4=1, y=0)
b) Library of DNA molecules corresponding to (a):
z1 : AAAACCAATTGGAAGGCCATGCGG
z2 : AAAACCAATTCCAAGGGGCCTTCCCCAACCATGCCC
z3 : AATTGGCCTTGGATGCGG
z4 : AATTGGAAGGCCCCTTGGATGCCC
Codebook: x1 = AAAA, x2 = AATT, x3 = AAGG, x4 = CCTT, x5 = CCAA, y = ATGC; value 1 = GG, value 0 = CC.
For example, the program (x1=1, x3=1, x5=1, y=1), written as a decision list or in its DNA encoding, denotes the decision rule: diagnose the DNA sample as positive for disease y if it contains all three markers x1, x3, and x5.
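The encoding on this slide is a concatenation of fixed-length code words: four nucleotides per variable name followed by two nucleotides for its value. The sketch below reproduces the four library strands from the hyperedges z1 to z4. The codebook is read off the slide; the function names `encode` and `decode` are illustrative.

```python
# Codebook from the slide: 4 nt per variable name, 2 nt per value.
VARIABLE_CODE = {
    "x1": "AAAA", "x2": "AATT", "x3": "AAGG",
    "x4": "CCTT", "x5": "CCAA", "y": "ATGC",
}
VALUE_CODE = {0: "CC", 1: "GG"}

# Inverse tables for decoding.
CODE_VARIABLE = {code: name for name, code in VARIABLE_CODE.items()}
CODE_VALUE = {code: value for value, code in VALUE_CODE.items()}


def encode(hyperedge):
    """Turn a list of (variable, value) pairs into a DNA strand."""
    return "".join(VARIABLE_CODE[name] + VALUE_CODE[value]
                   for name, value in hyperedge)


def decode(strand):
    """Split a strand into 6-nt blocks and look each block up."""
    blocks = [strand[i:i + 6] for i in range(0, len(strand), 6)]
    return [(CODE_VARIABLE[b[:4]], CODE_VALUE[b[4:]]) for b in blocks]


z1 = [("x1", 0), ("x2", 1), ("x3", 0), ("y", 1)]
z2 = [("x1", 0), ("x2", 0), ("x3", 1), ("x4", 0), ("x5", 0), ("y", 0)]
z3 = [("x2", 1), ("x4", 1), ("y", 1)]
z4 = [("x2", 1), ("x3", 0), ("x4", 1), ("y", 0)]

for z in (z1, z2, z3, z4):
    print(encode(z))
```

Because every block has the same length, decoding needs no separators; `decode(encode(z))` returns the original hyperedge.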

DNA Molecular Computing
[Figure labels: nanostructure, molecular recognition, self-assembly, self-replication, polymer; heat, cool, repeat]

Learning the Hypernetwork (by Molecular Evolution) [Zhang, DNA11]
- The aim is to build a decision-making system f that outputs a label.
- [Figure labels: library of combinatorial molecules; library + example; select the library elements matching the example; amplify the matched library elements by PCR; hybridize; next generation]

Molecular Information Processing
[Video: MP4.avi]

The Theory of Bayesian Evolution [Zhang, CEC-99]
- Evolution as a Bayesian inference process.
- Evolutionary computation (EC) is viewed as an iterative process of generating individuals of ever higher posterior probability from the priors and the observed data.
- [Figure: distributions over the individuals Ai from generation 0 to generation g: P0(Ai), …, Pg(Ai), Pg(Ai | D)]
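In the notation of the figure, one generation of this process can be written with Bayes' rule. The first equation is the standard rule. Using the posterior of generation g as the prior of generation g+1 is an assumption about how the iteration is meant, not a formula taken from the slide.

```latex
P_g(A_i \mid D) = \frac{P(D \mid A_i)\, P_g(A_i)}{\sum_j P(D \mid A_j)\, P_g(A_j)},
\qquad
P_{g+1}(A_i) := P_g(A_i \mid D) \quad \text{(assumed)}
```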

Evolutionary Learning Algorithm for Hypernetwork Classifiers
1. Let the hypernetwork H represent the current distribution P(X, Y).
2. Get a training example (x, y).
3. Classify x using H as follows:
3.1 Extract all molecules matching x into M.
3.2 Separate the molecules in M by class: extract the molecules with label Y = 0 into M0 and the molecules with label Y = 1 into M1.
3.3 Compute y* = argmax over Y ∈ {0, 1} of |MY| / |M|.
4. Update H:
If y* = y, then Hn ← Hn-1 + {c(u, v)} for u = x and v = y, for (u, v) ∈ Hn-1.
If y* ≠ y, then Hn ← Hn-1 − {c(u, v)} for u = x and v ≠ y, for (u, v) ∈ Hn-1.
5. Go to step 2 if not terminated.
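The steps above can be sketched in software with a multiset of hyperedges standing in for the molecular library. This is an illustrative simulation, not the author's implementation. It assumes that "u = x" means the hyperedge matches x, that c(u, v) denotes copies of a hyperedge, and that the second update rule removes copies. The names `matches`, `classify`, and `update` are hypothetical.

```python
from collections import Counter


def matches(conditions, x):
    """A hyperedge matches x if all its (index, value) pairs agree with x."""
    return all(x.get(i) == v for i, v in conditions)


def classify(library, x):
    """Step 3: majority vote over the copies of all matching hyperedges."""
    votes = Counter()
    for (conditions, label), copies in library.items():
        if matches(conditions, x):
            votes[label] += copies
    if not votes:
        return None  # nothing in the library matches x
    return max(votes, key=votes.get)


def update(library, x, y, copies=1):
    """Step 4: amplify correct matches or remove wrong ones."""
    y_star = classify(library, x)
    for key in list(library):
        conditions, label = key
        if not matches(conditions, x):
            continue
        if y_star == y and label == y:
            library[key] += copies          # add copies of correct hyperedges
        elif y_star != y and label != y:
            remaining = library[key] - copies
            if remaining > 0:
                library[key] = remaining    # remove copies of wrong hyperedges
            else:
                del library[key]
    return y_star


# Library: (frozenset of (index, value) pairs, label) -> number of copies.
edge_a = (frozenset({(1, 1), (4, 0)}), 1)
edge_b = (frozenset({(1, 1), (10, 1)}), 0)
library = Counter({edge_a: 2, edge_b: 1})
x = {1: 1, 4: 0, 10: 1}

print(classify(library, x))  # edge_a has more copies than edge_b
```

Keeping copy counts in a `Counter` mirrors amplification and removal of molecules while keeping the library compact.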

Learning with Hypergraphs: Application Results

Biological Applications
- DNA-Based Molecular Diagnosis
- MicroRNA-Based Diagnosis
- Aptamer-Based Diagnosis

DNA-Based Diagnosis
- Data: 120 samples from 60 leukemia patients [Cheok et al., Nature Genetics, 2003]
- Gene expression data; class: ALL/AML
- Training hypernets with 6-fold validation, followed by diagnosis

Learning Curve
[Figure: fitness evolution of the population of hyperedges]

Order Effects on Learning
[Figure: fitness curves for runs with fixed-cardinality hyperedges (card = 1, 4, 7, 10)]

Aptamer-Based Cardiovascular Disease Diagnosis

Training Data
▷ Disease: Cardiovascular Disease (CVD)
▷ Classes: 4 classes [Normal / 1st / 2nd / 3rd stages]
▷ Number of samples: 135 samples [N: 40 / 1st: 38 / 2nd: 19 / 3rd: 18]
▷ Preprocessing: 3K aptamer array → convert to real values → 3K real-valued data → feature selection using gain ratio → 150 real-valued data → binarization using MDL → 150 Boolean data
▷ Simulation parameter values: 1) Order: 2 ~ 70 2) Sampling rate: 50 3) Each case repeated 10 times and averaged
▷ Classification: majority voting with the sum of library element weights
▷ Training / test size: training 108 (80%) / test 27 (20%)

Learning & Classification by Hypernetworks
[Figure: the source data are binarized into the data set and split into training and test data; hyperedges are sampled from the training data into the library; the library weights are updated in the learning loop (evolution stage, with an adjusted learning rate); the library is then tested on the test data.]
Data items are Boolean vectors with a class label, e.g. X0=1 X1=1 X2=0 X3=0 X4=1 X5=1 X6=1 X7=0 … X149=1, C=1.
Library elements (initial weight W, weight after learning W'):
- X0=1 X1=1 X2=0 X3=0, C=1: W=1000, W'=1
- X0=1 X4=1 X6=1 X7=0, C=1: W=1000, W'=45
- X18=1 X35=0 X68=1 X82=0, C=1: W=1000, W'=4000
- X6=0 X7=0 X8=0 X9=1, C=0: W=1000, W'=12
- X14=0 X4=1 X5=1 X7=0, C=0: W=1000, W'=8530
- X22=0 X4=1 X6=0 X149=1, C=0: W=1000, W'=500
- X1=0 X33=1 X4=0 X9=1, C=1: W=1000, W'=1300
- X3=1 X6=0 X52=1 X8=0, C=1: W=1000, W'=4
- X0=0 X2=1 X4=0 X5=1, C=1: W=1000, W'=14
Weight update rule (learning by error correction): when all index-value pairs of a library element match the example, w = w*1.0001 if the class is correct, else w = w*0.95.
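The weight update rule and the weighted voting named on the previous slide can be sketched as follows. The factors 1.0001 and 0.95 and the initial weight 1000 are from the slide. The dictionary layout, the function names, and the example vector `x` are assumptions made for illustration.

```python
def matches(conditions, x):
    """True if every index-value pair of the library element agrees with x."""
    return all(x.get(i) == v for i, v in conditions.items())


def predict(library, x):
    """Vote with the summed weights of all matching library elements."""
    totals = {}
    for element in library:
        if matches(element["conditions"], x):
            cls = element["cls"]
            totals[cls] = totals.get(cls, 0.0) + element["weight"]
    return max(totals, key=totals.get) if totals else None


def train_step(library, x, true_class, up=1.0001, down=0.95):
    """Error correction: reward matching elements of the right class."""
    for element in library:
        if matches(element["conditions"], x):
            element["weight"] *= up if element["cls"] == true_class else down


# Two library elements from the slide, both starting at W = 1000.
library = [
    {"conditions": {0: 1, 1: 1, 2: 0, 3: 0}, "cls": 1, "weight": 1000.0},
    {"conditions": {6: 0, 7: 0, 8: 0, 9: 1}, "cls": 0, "weight": 1000.0},
]

# Hypothetical training example of class 1 that both elements match.
x = {0: 1, 1: 1, 2: 0, 3: 0, 6: 0, 7: 0, 8: 0, 9: 1}
train_step(library, x, true_class=1)
print([round(e["weight"], 4) for e in library])
print(predict(library, x))
```

The asymmetric factors mean one wrong match costs an element far more than one correct match earns it, which would explain the wide spread of W' values on the slide.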

Simulation Result (1/3) ▷ Training & test errors as learning goes on (order k=12)

Simulation Result (2/3) ▷ Accuracy on test data as learning goes on (order k=12)

Simulation Result (3/3) ▷ The effect of learning

Mining Cancer-Related MicroRNA Modules from miRNA Expression Profiles

Gene Regulation by microRNAs
- MicroRNAs (miRNAs) are endogenous RNAs of about 22 nt that can play important regulatory roles in animals, plants, and viruses.
- Post-transcriptional gene regulation: miRNAs bind target genes for degradation or translational repression.
- Recently, miRNAs have been reported to be related to cancer development and progression.

Dataset
The miRNA expression microarray data
- Expression profiles of human miRNAs for 11 tumor types: bladder, breast, colon, kidney, lung, pancreas, prostate, uterus, melanoma, mesothelioma, and ovary tissue (Lu et al., 2005).
- The dataset consists of an expression matrix of 151 miRNAs (rows) and 89 samples (columns).
[Table: cancer and normal sample counts per tissue type. Surviving entries: Bladder 1, 6; Breast 3; Colon 4, 7; Lung 2, 5; Pancreas 8; Uterus 10; All tissues 21, 68. The entries for Kidney, Prostate, Melanoma, Mesothelioma, and Ovary are missing.]

Representing a Hypernetwork from miRNA Expression Data
- Data items: 89 samples (1, 2, …, 89), each described by 151 miRNAs and a class (cancer or normal).
- Library (normal or cancer classification rules): a hypernetwork H = (X, E, W) of DNA molecules.
- Example hyperedges: (X=1, X=2, cancer), (X=10, X=20, normal), (X=1, X=45, cancer), (X=10, X=31, cancer), (X=1, X=80, normal), (X=31, X=20, normal).

Performance (leave-one-out cross-validation)
Algorithm: correct classification rate
- Bayesian Network: 79.77 %
- Naïve Bayes: 83.15 %
- ID3: 88.76 %
- Hypernetworks: 90.00 %
- Sequential Minimal Optimization (SMO): 91.01 %
- Multi-layer perceptron (MLP): 92.13 %

Accuracy vs. Order for Test Data (sampling only)

Learning Curves for Training Data

miRNA Data Mining
miRNAs related to cancer (miRNA: weight)
- hsa-miR-155: 295972.7
- hsa-miR-105: 283034.8
- hsa-miR-223: 280371.4
- hsa-miR-21: 277609.9
- hsa-let-7c: 270764.7
- hsa-miR-142-3p: 266700.1
- hsa-miR-29b: 263159
- hsa-miR-224: 260877.3
- hsa-miR-183: (weight missing)
- hsa-miR-184: 260116.7
- hsa-let-7a: 256313.8
miRNA modules related to cancer (weights and module members in transcript order; row boundaries are lost):
7919.249184 hsa-miR-215 1 hsa-miR-7 6787.927872 hsa-miR-194 hsa-miR-30d hsa-miR-214 hsa-miR-30e 6084.600896 hsa-miR-21 hsa-miR-321 5656.60656 hsa-miR-142-3p hsa-miR-34b hsa-miR-96 hsa-miR-126 hsa-miR-30c 5324.025784 hsa-miR-26b hsa-miR-29b hsa-let-7f hsa-miR-9* hsa-miR-224 hsa-miR-301

Non-Biological Applications
- Digit Recognition
- Face Classification
- Text Classification
- Movie Title Prediction

Digit Recognition: Dataset
- Original data: handwritten digits (0 ~ 9)
- Training data: 2,630 (263 examples for each class)
- Test data: 1,130 (113 examples for each class)
- Preprocessing: each example is an 8x8 binary matrix; each pixel is 0 or 1.

Pattern Classification
[Figure labels: "Layered" hypernetwork; probabilistic library (DNA representation)]

Simulation Results – without Error Correction
|Train set| = 3760, |Test set| = 1797.

Performance Comparison
Method: accuracy
- MLP with 37 hidden nodes: 0.941
- MLP with no hidden nodes: 0.901
- SVM with polynomial kernel: 0.926
- SVM with RBF kernel: 0.934
- Decision Tree: 0.859
- Naïve Bayes: 0.885
- kNN (k=1): 0.936
- kNN (k=3): 0.951
- Hypernet with learning (k = 10): 0.923
- Hypernet with sampling (k = 33): 0.949

Error Correction Algorithm
Initialize the library as before.
maxChangeCnt := librarySize.
For i := 0 to iteration_limit
    trainCorrectCnt := 0.
    Run classification for all training patterns.
    For each correctly classified pattern, increase trainCorrectCnt.
    For each library element
        Initialize its fitness value to 0.
        For each misclassified training pattern
            If the library element matches that example
                If it classifies the example correctly, the library element gains 2 fitness points.
                Else it loses 1 point.
    changeCnt := max{ librarySize * (1.5 * (trainSetSize - trainCorrectCnt) / trainSetSize + 0.01), maxChangeCnt * 0.9 }.
    maxChangeCnt := changeCnt.
    Delete the changeCnt library elements of lowest fitness and resample library elements of the same classes as the deleted ones.
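The two numeric parts of this algorithm, the fitness score and the number of elements to replace, can be sketched as below. The scoring (+2 / -1) and the changeCnt formula follow the slide as written, including the use of max. The data layout and the function names are assumptions, and the resampling of replacement elements is left out.

```python
def matches(conditions, x):
    """True if every index-value pair of the library element agrees with x."""
    return all(x.get(i) == v for i, v in conditions.items())


def fitness(element, misclassified):
    """+2 for each misclassified pattern the element gets right, -1 otherwise.

    Only patterns that the element matches are counted.
    """
    score = 0
    for x, y in misclassified:
        if matches(element["conditions"], x):
            score += 2 if element["cls"] == y else -1
    return score


def change_count(library_size, train_set_size, train_correct_cnt, max_change_cnt):
    """Number of library elements to replace, as written on the slide."""
    error_rate = (train_set_size - train_correct_cnt) / train_set_size
    return max(library_size * (1.5 * error_rate + 0.01), max_change_cnt * 0.9)


def lowest_fitness(library, misclassified, n):
    """Indices of the n library elements with the lowest fitness."""
    ranked = sorted(range(len(library)),
                    key=lambda i: fitness(library[i], misclassified))
    return ranked[:n]


# Hypothetical toy data: patterns are (x, true class) pairs.
misclassified = [({0: 1}, 1), ({0: 1}, 0), ({0: 0}, 1)]
library = [
    {"conditions": {0: 1}, "cls": 1},  # matches patterns 1 and 2
    {"conditions": {0: 0}, "cls": 0},  # matches pattern 3 only
]

print([fitness(e, misclassified) for e in library])
print(lowest_fitness(library, misclassified, 1))
print(change_count(1000, 100, 90, 100))
```

In a full loop, the indices returned by `lowest_fitness` would be overwritten with freshly sampled hyperedges of the same classes, and `change_count` would be stored as the next `max_change_cnt`.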

Simulation Results – with Error Correction
iterationLimit = 37, librarySize = 382,300

Performance Comparison
Algorithm: correct classification rate
- Random Forest (f=10, t=50): 94.10 %
- KNN (k=4): 93.49 %
- Hypernetwork (Order=26): 92.99 %
- AdaBoost (Weak Learner: J48): 91.93 %
- SVM (Gaussian Kernel, SMO): 91.37 %
- MLP: 90.53 %
- Naïve Bayes: 87.26 %
- J48: 84.86 %

Face Classification Experiments

Face Data Set
- Yale dataset
- 15 people
- 11 images per person
- Total 165 images

Training Images of a Person
- 10 for training
- The remaining 1 for test

Bitmaps for Training Data (Dimensionality = 480)

Classification Rate by Leave-One-Out

Classification Rate (Dimensionality = 64 by PCA)

Text Classification Experiments

Text Classification
[Figure: processing pipeline: 1. Documents (d1, d2, d3, …, dn); 2. Bag-of-words representation; 3. Term vectors (terms such as baseball, specs, graphics, hockey, unix, space); 4. Binary term-document matrix; 5. DNA encoded kernel functions, e.g. (x1=0, x2=1, y=1)]

Text Classification
- Data from Reuters-21578 ('ACQ' and 'EARN')
- Learning curves: average for 10 runs

Performance Comparison
- 'ACQ' data (4,724 documents)
- 'EARN' data (7,888 documents)
- Higher-dimensional kernel functions can improve the performance further.

Learning from Movie Captions Experiments

Learning Hypernets from Movie Captions
- Order: sequential, range 2~3
- Corpus: Friends, Prison Break, 24


Learning Hypernets from Movie Captions: Classification
- Query generation for "I intend to marry her": I ? to marry her / I intend ? marry her / I intend to ? her / I intend to marry ?
- Matching for "I ? to marry her": order 2: I intend, I am, intend to, …; order 3: I intend to, intend to marry, …
- Count the number of max-perfect-matching hyperedges
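Query generation and the sequential hyperedges of order 2 and 3 are simple to sketch. The code below masks one word at a time and extracts consecutive word tuples. The function names are illustrative, the matching count itself is not implemented, and the sketch masks every position including the first, whereas the slide lists only positions 2 to 5 for this sentence.

```python
def make_queries(sentence):
    """Replace each word in turn with '?' to form completion queries."""
    words = sentence.split()
    return [" ".join(words[:i] + ["?"] + words[i + 1:])
            for i in range(len(words))]


def ngrams(words, order):
    """Sequential hyperedges: all runs of `order` consecutive words."""
    return [tuple(words[i:i + order])
            for i in range(len(words) - order + 1)]


sentence = "I intend to marry her"
for query in make_queries(sentence):
    print(query)

words = sentence.split()
print(ngrams(words, 2))
print(ngrams(words, 3))
```

A library for one corpus would be the collection of such tuples over all its captions. A query is then scored by how many stored tuples agree with its known words.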

Learning Hypernets from Movie Captions: Completion & Classification Examples
Query: who are you (corpus: Friends, 24, Prison Break)
- Masked queries: ? are you / who ? you / who are ?
- Completion: what are you
- Classification: Friends
Query: you need to wear it (corpus: 24, Prison Break, House)
- Masked queries: ? need to wear it / you ? to wear it / you need ? wear it / you need to ? it / you need to wear ?
- Completions: i need to wear it / you want to wear it / you need to do it / you need to wear a
- Classification: 24, House

Conclusion
- Hypernetworks are a graphical model that employs higher-order nodes explicitly and allows a more natural representation for learning higher-order graphical models.
- We introduce an evolutionary learning algorithm that makes use of the high information density and massive parallelism of molecular computing to solve combinatorial explosion problems.
- Applied to pattern recognition (and completion) problems in IT and BT.
- Obtained performance competitive with conventional ML classifiers.
- Why does this work? It exploits the huge population size available in DNA computing to build an ensemble machine, i.e. a hypernetwork, of simple random hyperedges.
- A new kind of evolutionary algorithm in which very simple "molecular" operators are applied to a "huge" population of individuals in a "massively parallel" way.
- Another potential of hypernetworks is their application to biological problems where the data are given as "wet" DNA or RNA molecules.

Acknowledgements
Simulation Experiments
- Joo-Kyoung Kim, Sun Kim, Soo-Jin Kim, Jung-Woo Ha, Chan-Hoon Park, Ha-Young Jang
Collaborating Labs
- Biointelligence Laboratory, Seoul National University
- RNomics Lab, Seoul National University
- DigitalGenomics, Inc.
- GenoProt, Inc.
Supported by
- National Research Lab Program of Min. of Sci. & Tech. (2002-2007)
- Next Generation Tech. Program of Min. of Ind. & Comm. (2000-2010)
More Information at
- http://bi.snu.ac.kr/MEC/
- http://cbit.snu.ac.kr/