Multiple DAGs Learning with Non-negative Matrix Factorization

Presentation transcript:

Multiple DAGs Learning with Non-negative Matrix Factorization
Yun Zhou, National University of Defense Technology
AMBN-2017, Kyoto, Japan, 20/09/2017

Overview
The classic machine learning (ML) paradigm is isolated single-task learning: given a dataset, run an ML algorithm to build a model (e.g., SVM, CRF, neural nets, Bayesian networks, ...), without considering previously learned knowledge.
Weaknesses of "isolated learning":
- learned knowledge is neither retained nor accumulated;
- it needs a large number of training examples;
- it is suitable only for well-defined, narrow tasks.

Overview
Humans never learn in isolation: we retain knowledge learned in one task and use it in another task to learn more, and we learn simultaneously from different, similar tasks. Shared knowledge among tasks enables us to learn them with little data or effort.
- Multi-Task Learning: related tasks are learned jointly and simultaneously, exploiting the commonality across tasks.
- Inductive Transfer Learning: focuses on optimizing a target task using knowledge from related source tasks.

Overview
Bayesian networks have been well studied over the past two decades, but structure learning is still challenging, whether with score-based or constraint-based algorithms. Moreover, there are shared parts between different BNs.
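As a minimal, hedged illustration of the score-based family on a single task (not the multi-task method of this talk), the sketch below runs hill-climbing search guided by the BIC score in pgmpy; the sample file name is a placeholder.

```python
# Single-task, score-based BN structure learning sketch (illustration only,
# not the multi-task method proposed in these slides).
import pandas as pd
from pgmpy.estimators import HillClimbSearch, BicScore

# Assumed: a discrete dataset with one column per BN variable.
data = pd.read_csv("asia_samples.csv")  # hypothetical sample file

search = HillClimbSearch(data)
best_dag = search.estimate(scoring_method=BicScore(data))  # greedy hill climbing
print(sorted(best_dag.edges()))
```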

Overview
Example: the Asia and Cancer networks, taken from the bnlearn repository.

Overview
We focus on the multi-task setting of score-based algorithms for BN structure learning.

Related Works
Task-Relatedness Aware Multi-task (TRAM) learning [Oyen and Lane, 2012] combines:
- data fitness;
- regularization on model complexity;
- structure differences among tasks (see the sketch below).
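A minimal sketch of a TRAM-style penalized score, assuming a per-task data-fitness value (e.g., BIC or log-likelihood) and using structural Hamming distance as the structure-difference term; the function names, the lambda default, and the exact combination are assumptions, not the authors' formulation.

```python
import numpy as np

def shd(adj_a, adj_b):
    """Structural Hamming distance between two 0/1 adjacency matrices."""
    return int(np.sum(adj_a != adj_b))

def tram_style_score(adj_t, data_fit_t, other_adjs, lam=0.5):
    """Hedged sketch of a TRAM-like objective for task t: data fitness
    minus a penalty on structural differences to the other tasks.
    `data_fit_t` (e.g., a BIC value) and `lam` are assumed inputs."""
    penalty = sum(shd(adj_t, adj_j) for adj_j in other_adjs)
    return data_fit_t - lam * penalty
```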

Limitations of TRAM
- Different task learning orders produce different learning results (e.g., learning the tasks in order 1, 2, 3, 4 versus 4, 3, 2, 1 yields different structures).
- The task-relatedness parameter needs to be tuned with specific domain knowledge.

Learning a set of DAGs with a single hidden factor (MSL-SHF)
MSL-SHF alternates an M step and an E step to estimate the single hidden structure shared across all tasks.

Learning a set of DAGs with a single hidden factor (MSL-SHF)
In the illustration, the black DAG is the shared hidden structure across all tasks.

Learning a set of DAGs with multiple hidden factors (MSL-MHF) [Oates et al., 2016]
MSL-MHF also alternates an M step and an E step; in the E step, each task selects the hidden structure closest to it (see the sketch below).
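A hedged sketch of the alternation described on this slide, with assumptions beyond what the slide states: the E step assigns each task's current DAG to its closest hidden structure (Hamming distance), and the M step re-estimates each hidden structure by a majority vote over the edges of its assigned tasks.

```python
import numpy as np

def assign_to_closest(task_adjs, hidden_adjs):
    """E-step sketch: index of the closest hidden structure for each task
    (Hamming distance between 0/1 adjacency matrices)."""
    return [
        int(np.argmin([np.sum(a != h) for h in hidden_adjs]))
        for a in task_adjs
    ]

def update_hidden(task_adjs, assignment, n_hidden):
    """M-step sketch: re-estimate each hidden structure as the majority-vote
    edge set of the tasks assigned to it (an assumption, not the authors'
    exact update rule)."""
    hidden = []
    for k in range(n_hidden):
        members = [a for a, z in zip(task_adjs, assignment) if z == k]
        if members:
            hidden.append((np.mean(members, axis=0) >= 0.5).astype(int))
        else:
            hidden.append(np.zeros_like(task_adjs[0]))
    return hidden
```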

Learning a set of DAGs with parts-based factors
In real-world BN construction, people usually incorporate expert knowledge to handcraft the DAG, e.g., BN idioms [Neil et al., 2000] and BN fragments [Laskey et al., 2008].

Learning a set of DAGs with parts-based factors
A set of estimated DAGs (adjacency matrices) is stacked into a single non-negative matrix; NMF aims to factorize this matrix into two non-negative factors whose product approximates it, yielding parts-based sub-structures shared across the tasks (see the sketch below).
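A minimal sketch of this factorization step, assuming the estimated DAGs are vectorized adjacency matrices stacked as columns of a non-negative matrix and factorized with scikit-learn's NMF; the stacking convention, the number of parts, and the initialization are assumptions.

```python
import numpy as np
from sklearn.decomposition import NMF

def factorize_dags(adjacency_matrices, n_parts=2, seed=0):
    """Stack T estimated 0/1 adjacency matrices (all n x n) as columns of a
    non-negative matrix V of shape (n*n, T) and factorize V ~= W H."""
    n = adjacency_matrices[0].shape[0]
    V = np.column_stack([a.reshape(n * n) for a in adjacency_matrices])

    model = NMF(n_components=n_parts, init="nndsvda",
                random_state=seed, max_iter=500)
    W = model.fit_transform(V)   # (n*n) x K: parts-based sub-structures
    H = model.components_        # K x T: per-task activations of each part
    parts = [W[:, k].reshape(n, n) for k in range(n_parts)]
    return parts, H
```

Thresholding each reconstructed part (e.g., keeping entries above 0.5) would turn it back into a candidate sub-DAG; this post-processing choice is an assumption for illustration.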

Learning a set of DAGs with parts-based factors
Thus, the entire multi-task estimation problem (MSL-NMF) is defined by combining the per-task structure scores with this factorization: the two non-negative factors act as an encoder and a decoder, and their product gives the reconstructed hidden structures.

Experiments
Synthetic data from the Asia network: randomly insert or delete one edge of the ground truth to make T new BNs, then forward-sample 200, 350 and 500 cases per task, for 20 and 50 tasks (a sketch of the perturbation follows).
Real-world landmine problem: classifying the presence of landmines from synthetic-aperture radar data; 29 landmine fields correspond to 29 tasks, each with 400-600 data samples.
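A hedged sketch of the per-task perturbation described above: insert or delete one edge of the ground-truth DAG while keeping it acyclic (networkx is used for the acyclicity check). Forward sampling of the 200/350/500 cases per task would then follow, e.g., with pgmpy's BayesianModelSampling after re-fitting CPDs to the perturbed structure; those steps and the helper's exact behaviour are assumptions.

```python
import random
import networkx as nx

def perturb_dag(dag, seed=None):
    """Randomly insert or delete a single edge of the ground-truth DAG,
    keeping the result acyclic. `dag` is a networkx.DiGraph."""
    rng = random.Random(seed)
    new = dag.copy()
    if rng.random() < 0.5 and new.number_of_edges() > 0:
        new.remove_edge(*rng.choice(list(new.edges())))
    else:
        nodes = list(new.nodes())
        while True:
            u, v = rng.sample(nodes, 2)
            if not new.has_edge(u, v):
                new.add_edge(u, v)
                if nx.is_directed_acyclic_graph(new):
                    break
                new.remove_edge(u, v)  # undo edges that create a cycle
    return new
```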

Asia network
The number of hidden factors is set to 2 (K = 2); the relatedness parameter in MSL-TRAM is set to 0.5.

Landmine problem
The 9 features are discretized into two values by a standard K-means algorithm. The learned DAG contains 10 nodes, where node 10 is the landmine class node: a binary node with 1 for landmine and 0 for clutter. Half of each dataset is used for training and half for testing (for calculating AUC values); a sketch of this pipeline follows.
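A hedged sketch of the preprocessing and evaluation described here: two-cluster K-means discretization of each feature, a 50/50 train/test split, and AUC computed from predicted probabilities. `predict_proba` is a hypothetical stand-in for inference of P(landmine = 1 | features) in the learned BN, not the authors' code.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def discretize_features(X, seed=0):
    """Binarize each continuous feature with 2-cluster K-means."""
    Xd = np.empty_like(X, dtype=int)
    for j in range(X.shape[1]):
        km = KMeans(n_clusters=2, n_init=10, random_state=seed)
        Xd[:, j] = km.fit_predict(X[:, [j]])
    return Xd

def evaluate_task(X, y, predict_proba, seed=0):
    """50/50 split and AUC for one landmine field. `predict_proba` is an
    assumed callable returning P(landmine=1) for the test rows."""
    Xd = discretize_features(X, seed)
    X_tr, X_te, y_tr, y_te = train_test_split(
        Xd, y, test_size=0.5, random_state=seed, stratify=y)
    scores = predict_proba(X_tr, y_tr, X_te)
    return roc_auc_score(y_te, scores)
```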

Landmine problem (results)

Landmine problem
Small improvements observed.

Conclusions
Findings:
- There exist commonalities between different Bayesian networks.
- Multi-task learning achieves better results than learning each task individually when multiple similar tasks are available.
- This is the first attempt to learn multiple DAGs with parts-based factors.
Limitations:
- The improvements are not large.
- More experiments are needed to verify the proposed method.

Future works
- Consider shared parameters in the learning;
- Extend this method to the BN transfer-learning problem: a repository of well-learnt BNs transfers knowledge to a new learning task and is updated by it.

Thank you! zhouyun@nudt.edu.cn