
+ Multi-label Classification using Adaptive Neighborhoods
Tanwistha Saha, Huzefa Rangwala and Carlotta Domeniconi
Department of Computer Science, George Mason University, Fairfax, VA, USA

+ Outline
- Introduction
- Motivation
- Related Work
- Problem Definition
- Proposed Method
- Results
- Conclusion and Future Work

+ Importance of Multi-label Network Classification
Datasets with link structure can be represented as graphs or networks. Classifying nodes in networks is useful for:
1. Social network analysis, to identify user interests – product recommendations
2. Scientific collaboration analysis, to identify the research interests of individuals – recommendation of relevant scholarly articles
3. Protein-Protein Interaction (PPI) network analysis – protein function prediction
Entities can belong to multiple classes – this is multi-label classification.
The inter-label, intra-label and inter-instance dependencies make multi-label classification hard.

+ Multi-label Networks (Palla et al., Nature 2005)
[Figure: example of a multi-label PPI network, with a zoomed-in view of part of the network]

+ Related Work
- Single-label classification in network data (a.k.a. collective classification) [Lu & Getoor, ICML 2003; Neville & Jensen, AAAI 2000; Sen et al., AI Magazine 2008]
- Multi-label classification [Zhang et al., IJCAI 2011; Zhang et al., SIGKDD 2010]
- Multi-label classification in network data [Kong et al., SIAM SDM 2011]

+ Classification in Networks
Input: a graph G = (V, E) with a given percentage of labeled nodes for training, and node features for all nodes
Output: predicted labels of the test nodes
Model:
1. Relational features and node features are used to train a local classifier on the labeled nodes
2. Test node labels are initialized with the labels predicted by the local classifier from node attributes alone
3. Inference proceeds by iterative classification of the test nodes until a convergence criterion is reached (see the sketch below)
[Figure: a network of researchers labeled ML, DM, SW, AI, Bio, with one unlabeled node marked "?"]
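A minimal sketch of this train-bootstrap-iterate loop, assuming a scikit-learn-style local classifier `clf` already trained on concatenated node and relational features; the graph encoding, feature layout, and helper names are illustrative, not taken from the paper:

```python
import numpy as np

def relational_features(G, labels, node, n_classes):
    """Fraction of the node's currently labeled neighbors in each class."""
    counts = np.zeros(n_classes)
    nbrs = [u for u in G[node] if labels.get(u) is not None]
    for u in nbrs:
        counts[labels[u]] += 1.0
    return counts / max(len(nbrs), 1)

def iterative_classification(G, X, labels, clf, n_classes, max_iters=20):
    """G: adjacency dict {node: neighbors}; X: {node: feature vector};
    labels: {node: class index or None}; clf: a trained classifier over
    [node features | relational features] with a sklearn-style predict()."""
    test = [v for v in G if labels.get(v) is None]
    # Step 2: bootstrap test labels from node attributes alone
    # (relational part zeroed out).
    for v in test:
        x = np.concatenate([X[v], np.zeros(n_classes)])
        labels[v] = int(clf.predict([x])[0])
    # Step 3: re-predict with relational features until labels stabilize.
    for _ in range(max_iters):
        changed = False
        for v in test:
            x = np.concatenate([X[v], relational_features(G, labels, v, n_classes)])
            y = int(clf.predict([x])[0])
            if y != labels[v]:
                labels[v], changed = y, True
        if not changed:
            break
    return labels
```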

+ Collective Classification (Sen et al. 2008)
- Classification of linked data in networks using iterative inference
- Collectively predict the labels of all test nodes
- Relational features are computed from the labels of a node's neighbors [the feature vector is shown as a figure in the original slide]
- Total features = node features + relational features
[Figure: what is the label of the highlighted node, whose neighbors are labeled ML, DM, AI, SW, Bio?]

+ Single-Label Collective Classification
[Figure: a partially labeled network in which every node i has attributes Xi and a single label Yi; node attributes are available for all nodes]

+ Multi-label Classification in Non-network Data
[Figure: instances X1..X4, each with multiple labels Yi1, Yi2, Yi3; no link structure between instances]

+ Multi-label Collective Classification
[Figure: the same multi-labeled instances, now connected by a link structure; the network is partially labeled]

+ Multi-label Collective Classification (Kong et al., SIAM SDM 2011)

+ Our Contributions
- Incorporate rank-based neighborhood selection of influential nodes into multi-label collective classification
- Propose a simple neighborhood-ranking-based naïve Bayes network classifier
- Propose an unbiased validation method

+ Multi-label Collective Classification with Ranked Neighbors (ICML_Rank)
1. Assign a rank score to every training node
2. Use only the top-ranked nodes (ranks above a given threshold) as influential neighbors for relational feature computation
3. Train K different SVM models (K = number of classes) on the training nodes
4. Use the K trained models to predict the K labels of each test node from node features alone
5. Compute the relational features of all test nodes, considering only influential training nodes
6. Apply iterative inference to collectively predict the K labels of all test nodes until convergence (sketched below)
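A condensed sketch of ICML_Rank under stated assumptions: `LinearSVC` stands in for the paper's per-label SVMs, `rank_scores` is assumed precomputed, and letting already-predicted test neighbors contribute to relational features during inference is our reading of step 6, not something the slide spells out:

```python
import numpy as np
from sklearn.svm import LinearSVC

def icml_rank(G, X, Y, train, test, rank_scores, threshold, K, max_iters=10):
    """G: adjacency dict; X: {node: features}; Y: {node: length-K 0/1 vector},
    given for training nodes; rank_scores: {node: precomputed rank score}."""
    # Steps 1-2: prune to top-ranked (influential) training nodes.
    influential = {v for v in train if rank_scores[v] >= threshold}

    def rel_feats(v):
        # Per-label frequency among influential training neighbors plus any
        # test neighbors that already carry predictions (our assumption).
        nbrs = [u for u in G[v] if u in influential or (u in Y and u in test)]
        return np.mean([Y[u] for u in nbrs], axis=0) if nbrs else np.zeros(K)

    # Step 3: one binary SVM per label over [node | relational] features.
    Xtr = np.array([np.concatenate([X[v], rel_feats(v)]) for v in train])
    svms = [LinearSVC().fit(Xtr, [Y[v][k] for v in train]) for k in range(K)]

    # Step 4: bootstrap test labels from node features (relational part zeroed).
    for v in test:
        x = np.concatenate([X[v], np.zeros(K)])
        Y[v] = np.array([int(m.predict([x])[0]) for m in svms])

    # Steps 5-6: iterative inference with relational features until stable.
    for _ in range(max_iters):
        changed = False
        for v in test:
            x = np.concatenate([X[v], rel_feats(v)])
            y = np.array([int(m.predict([x])[0]) for m in svms])
            changed |= not np.array_equal(y, Y[v])
            Y[v] = y
        if not changed:
            break
    return {v: Y[v] for v in test}
```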

+ Multi-label Classification in Networks (RankNN)
- A local neighborhood-based ranking method
- Computes the rank of all training nodes
- Computes the prior probability of each label for a test node
- Computes the likelihood of each label for a test node based on its influential neighbors
- Computes the posterior probability of each label for a test node from the prior and the likelihood (see the sketch below)
- Faster than multi-label collective classification methods when the training sample size is small
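A minimal sketch of the RankNN scoring step; the exact prior and likelihood estimators are illustrative guesses (label frequencies and neighbor label fractions), since the slide does not give the formulas:

```python
import numpy as np

def ranknn_posterior(G, Y, train, influential, v, K, eps=1e-9):
    """Per-label posterior score for test node v. Y: {node: length-K 0/1
    label vector}; influential: the top-ranked training nodes."""
    # Prior: relative frequency of each label over the training nodes.
    prior = np.mean([Y[u] for u in train], axis=0) + eps
    # Likelihood: fraction of v's influential neighbors carrying each label.
    nbrs = [u for u in G[v] if u in influential]
    likelihood = (np.mean([Y[u] for u in nbrs], axis=0) if nbrs else np.zeros(K)) + eps
    posterior = prior * likelihood       # naive Bayes: prior x likelihood
    return posterior / posterior.sum()   # normalized score per label
```

Labels whose posterior score clears a probability threshold would then be assigned to the node; that threshold is the quantity varied in the RankNN results later in the deck.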

+ Influence: concept & measure
How to distinguish between different neighbors?

+ Influence of a Training Node
The rank of a training node is computed from: C, the cosine similarity matrix between pairwise node features; A, the weighted adjacency matrix; M, the node-label association matrix; and r, the rank of a label. [The rank formula itself appears as an image in the original slide.]
A threshold is used to prune out nodes with a low rank score; the default threshold is the median of the rank scores.

+ Influential Neighbors
Neighbors and their label sets: P1 = {DM, ML}, P2 = {DM, AI}, P3 = {DM, AI, ML}, P4 = {AI}, P5 = {DM}, P6 = {SW}, P7 = {ML}, P8 = {DM}
Rank(P1) = 0.5, Rank(P2) = 0.3, Rank(P3) = 0.2, and the rank of every other node is < 0.1
If the rank threshold is 0.2, then the influential neighbors are P1, P2 and P3 only (see the sketch below)
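A small sketch of the pruning rule, with the median default from the previous slide; the `>=` convention is our assumption (it is what makes P3, with rank exactly 0.2, influential in this example):

```python
import numpy as np

def influential_neighbors(rank_scores, threshold=None):
    """Keep nodes whose rank score clears the threshold; the default
    threshold is the median of all rank scores."""
    if threshold is None:
        threshold = float(np.median(list(rank_scores.values())))
    return {v for v, r in rank_scores.items() if r >= threshold}

# The P1..P8 example above (the sub-0.1 ranks are stand-in values):
ranks = {"P1": 0.5, "P2": 0.3, "P3": 0.2,
         "P4": 0.05, "P5": 0.05, "P6": 0.05, "P7": 0.05, "P8": 0.05}
print(influential_neighbors(ranks, threshold=0.2))  # {'P1', 'P2', 'P3'}
```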

+ Proposed Validation Protocol: Filtered vs. Unfiltered Data
- Goal: unbiased evaluation
- Filtered data: remove the information shared between a training node and a test node from the training node, then re-compute the labels of the training node
- Unfiltered data: no label pre-processing is performed before training

+ Unbiased Label Set: An Example
A training node has co-authored 3 papers in ML with the test node.
Label set before filtering: {DM, AI, ML}
Label set after filtering: {DM, AI}
(A sketch of this re-computation follows.)
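A hedged sketch of the re-computation, assuming a node's labels are derived from per-paper label sets; the paper ids and bookkeeping are hypothetical:

```python
def recompute_labels(papers, shared_with_test):
    """papers: {paper_id: labels that paper contributes to the training node};
    shared_with_test: ids of papers co-authored with the test node.
    A label survives only if some non-shared paper supports it."""
    kept = set()
    for pid, labels in papers.items():
        if pid not in shared_with_test:
            kept |= labels
    return kept

# The example above: three ML papers shared with the test node, plus the
# training node's own DM and AI work.
papers = {"p1": {"ML"}, "p2": {"ML"}, "p3": {"ML"}, "p4": {"DM"}, "p5": {"AI"}}
print(recompute_labels(papers, shared_with_test={"p1", "p2", "p3"}))  # {'DM', 'AI'}
```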

+ Datasets
- The DBLP computer science bibliography data is pre-processed to generate multi-labeled networks
- Publication records from 2000 to 2010 were extracted
- Conference locations are categorized into labels
- Each node is an author
- The weight on an edge is the number of co-authored papers
- Node attributes are tf-idf weights over the titles of co-authored papers (see the sketch below)
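A minimal sketch of building such node attributes with scikit-learn's `TfidfVectorizer`; the author-to-titles mapping is a toy stand-in:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical authors, each mapped to the concatenated titles of their papers.
author_titles = {
    "author_1": "multi label learning with adaptive neighborhoods in networks",
    "author_2": "collective classification of relational network data",
}
vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(list(author_titles.values()))  # one tf-idf row per author
print(X.shape)
```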

+ Baseline Methods & Proposed Methods
Baseline methods (multi-label collective classification):
1. ICML (Kong et al., SIAM SDM 2011)
2. ML_ICA (Kong et al., SIAM SDM 2011)
Proposed methods (neighborhood-ranking-based multi-label collective classification with unbiased evaluation):
1. ICML_Rank
2. ML_ICA_Rank
3. RankNN

+ Evaluation Metrics
- Hamming loss: counts both a label that is not predicted (a missing error) and a label that is incorrectly predicted (a prediction error)
- Subset loss: a strict loss, i.e. an error is counted whenever the predicted label set does not exactly match the true label set
- Macro-F1 score: the average of the per-label F1-scores
- Micro-F1 score: the harmonic mean of precision and recall computed over the pooled predictions across all labels and test nodes
(All four are illustrated in the sketch below.)
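Hedged examples of the four metrics on a toy multi-label prediction, using scikit-learn where it applies; subset loss is computed directly as exact-match zero-one loss:

```python
import numpy as np
from sklearn.metrics import hamming_loss, f1_score

Y_true = np.array([[1, 0, 1], [0, 1, 0]])  # rows: test nodes, columns: labels
Y_pred = np.array([[1, 0, 0], [0, 1, 0]])

print(hamming_loss(Y_true, Y_pred))             # fraction of wrong label slots
print(np.mean((Y_true != Y_pred).any(axis=1)))  # subset loss: exact-match errors
print(f1_score(Y_true, Y_pred, average="macro", zero_division=0))  # mean per-label F1
print(f1_score(Y_true, Y_pred, average="micro"))  # F1 over pooled label counts
```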

+ Results (Loss for different methods w.r.t training percentage)

+ Results (Loss of RankNN w.r.t. different probability thresholds)

+ Time consumption of different methods with 30% labeled data

+ Conclusion and Future Work
- Proposed a node-ranking method that prunes the neighborhood to select influential nodes adaptively
- Proposed an improved rank-based method for multi-label collective classification
- Proposed a simple classification method for multi-label network classification
- Reported results on a real-world dataset
- Future work: methods that scale to huge numbers of nodes while maintaining similar accuracy
- Future work: methods that can handle dynamic networks

+ Thank You!