A Fast Kernel for Attributed Graphs Yu Su University of California at Santa Barbara with Fangqiu Han, Richard E. Harang, and Xifeng Yan.

Presentation transcript:


INTRODUCTION

Graph Kernel
• A graph kernel defines a similarity measure over graphs, a core problem in graph mining
• It is an inner product in some (latent) feature space
• It decouples the data representation from the learning machine
  - Once a graph kernel is supplied, a whole toolbox of kernel machines becomes readily applicable
  - SVM, kernel PCA, support vector regression, clustering, etc.
  - A good graph kernel is thus the key
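To make the decoupling point concrete, here is a minimal sketch of a kernel perceptron that sees the graphs only through a precomputed Gram matrix. The toy kernel (an inner product of node-label counts), the function names, and the four tiny "graphs" are all assumptions made for illustration; none of them come from the talk.

```python
from collections import Counter

def label_histogram_kernel(g1, g2):
    """Toy graph kernel (assumption): inner product of node-label counts."""
    c1, c2 = Counter(g1), Counter(g2)
    return sum(c1[label] * c2[label] for label in c1)

def gram_matrix(graphs, kernel):
    """Pairwise kernel values; this is all the learner ever sees."""
    return [[kernel(a, b) for b in graphs] for a in graphs]

def decision_score(alpha, labels, gram, i):
    """Kernel expansion of the perceptron score for training example i."""
    return sum(alpha[j] * labels[j] * gram[j][i] for j in range(len(labels)))

def train_kernel_perceptron(gram, labels, epochs=10):
    """Kernel perceptron: alpha[i] counts the mistakes made on example i."""
    alpha = [0] * len(labels)
    for _ in range(epochs):
        mistakes = 0
        for i, y in enumerate(labels):
            if y * decision_score(alpha, labels, gram, i) <= 0:
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:  # every training example is classified correctly
            break
    return alpha

# Hypothetical graphs, reduced here to bags of node labels.
graphs = [["C", "C", "O"], ["C", "C", "C"], ["N", "N", "O"], ["N", "N", "N"]]
labels = [1, 1, -1, -1]

gram = gram_matrix(graphs, label_histogram_kernel)
alpha = train_kernel_perceptron(gram, labels)
predictions = [1 if decision_score(alpha, labels, gram, i) > 0 else -1
               for i in range(len(graphs))]
```

Swapping in a different graph kernel only changes the `gram_matrix` call; the learner itself is untouched.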

Broad Applications
• Chemo- & Bioinformatics
• Semantic web
• Software Engineering
• Natural Language Processing

Trends and Challenges in the Big Data Era
• Increasing graph size → more efficient methods
• Richer graph attributes → more versatile methods
This work: A linear-time kernel that can handle both categorical and numerical attributes.

Graph Kernel as a Measure of Graph Similarity
① Decompose each graph into a (multi-)set of features
  - Subgraphs (Gartner et al. 2003, NP-hard!)
  - Random walks (Gartner et al. 2003, Kashima et al. 2003)
  - Subtrees (Shervashidze and Borgwardt 2009)
  - Vectors (Neumann et al. 2016)
② Compare feature sets
  - Pair-wise comparison (quadratic)
  - Inner product (linear; only when features are discrete)
  - Discretization (linear; can handle numerical attributes)
This work: vector features + discretization
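The three comparison strategies in step ② can be sketched as follows. This is an illustration under assumptions, not the talk's implementation: features are plain Python values, the pairwise comparison uses exact equality, and `discretize` uses uniform bins of a hypothetical width.

```python
from collections import Counter

def pairwise_kernel(features_a, features_b):
    """Quadratic: compare every feature of one graph with every feature of the other."""
    return sum(1 for a in features_a for b in features_b if a == b)

def histogram_kernel(features_a, features_b):
    """Linear (with hashing): count each distinct feature, then take an inner product."""
    count_a, count_b = Counter(features_a), Counter(features_b)
    return sum(n * count_b[f] for f, n in count_a.items())

def discretize(values, bin_width):
    """Map numerical features to integer bin indices (uniform bins; an assumption)."""
    return [int(v // bin_width) for v in values]

# Discrete features: both kernels agree, but the second avoids the double loop.
labels_a = ["C", "C", "O", "N"]
labels_b = ["C", "O", "O"]
quadratic_value = pairwise_kernel(labels_a, labels_b)
linear_value = histogram_kernel(labels_a, labels_b)

# Numerical features: discretize first, then reuse the linear inner product.
numeric_a = [0.12, 0.18, 0.95]
numeric_b = [0.15, 0.91, 0.99]
binned_value = histogram_kernel(discretize(numeric_a, 0.25),
                                discretize(numeric_b, 0.25))
```

Exact equality on raw numerical values would almost never fire, which is why the inner-product route needs a discretization step before it can handle numerical attributes.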

METHOD

Descriptor Matching (DM) Kernel: An Overview

Desired Descriptor Property: Preserve Similarity
• Similar nodes should have similar descriptors
  - So it becomes meaningful to measure graph similarity by matching their descriptors
• Nodes are more similar if their attributes and neighbors are more similar
  - This recursive definition of similarity makes it natural to generate descriptors in a recursive manner

Desired Descriptor Property: Highly Discriminative

Descriptor Generation via Propagation
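The slide's formulas did not survive in this transcript, so the following is only a rough sketch of what propagation-based descriptor generation could look like. The update rule (mix a node's label distribution with the mean of its neighbors' distributions), the `self_weight` parameter, and the choice to concatenate all rounds into one descriptor are assumptions for illustration, not the authors' exact method.

```python
def propagate_descriptors(adjacency, labels, alphabet, rounds=2, self_weight=0.5):
    """Build node descriptors by repeatedly mixing each node's label
    distribution with the average of its neighbors' distributions.
    The update rule and self_weight are illustrative assumptions."""
    index = {label: i for i, label in enumerate(alphabet)}
    current = []
    for label in labels:
        one_hot = [0.0] * len(alphabet)
        one_hot[index[label]] = 1.0
        current.append(one_hot)

    # Round 0: the descriptor starts as the node's own one-hot label.
    descriptors = [list(vec) for vec in current]

    for _ in range(rounds):
        updated = []
        for node, vec in enumerate(current):
            neighbors = adjacency[node]
            if neighbors:
                mean = [sum(current[n][k] for n in neighbors) / len(neighbors)
                        for k in range(len(alphabet))]
            else:
                mean = vec  # an isolated node keeps its own distribution
            updated.append([self_weight * v + (1 - self_weight) * m
                            for v, m in zip(vec, mean)])
        current = updated
        # Append this round's distribution to each node's descriptor.
        for node, vec in enumerate(current):
            descriptors[node].extend(vec)
    return descriptors

# Hypothetical path graph 0 - 1 - 2 with categorical labels A, B, A.
adjacency = {0: [1], 1: [0, 2], 2: [1]}
descriptors = propagate_descriptors(adjacency, ["A", "B", "A"],
                                    alphabet=["A", "B"], rounds=1)
```

In this toy graph the two end nodes have the same label and the same neighborhood, so they receive identical descriptors, which is the similarity-preserving behavior asked for above. Each round touches every edge a constant number of times, so the cost per round is linear in the graph size.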

Descriptor Matching
• Optimal matching: maximum weighted bipartite matching
  - Cubic time. Not a valid kernel (Vert 2008)
• Discretization: uniform binning
  - Linear time. Valid kernel. Unweighted, independent bins.
• Discretization: data-dependent hierarchical binning
  - Linear time. Valid kernel. Weighted, multi-resolution bins.
  - Vocabulary-Guided pyramid matching (VG) kernel (Grauman and Darrell 2006)
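A small sketch of the uniform-binning option shows why independent bins at a single resolution can be limiting. It assumes descriptors are tuples of floats and uses histogram intersection as the match count; the function names and sample descriptors are hypothetical.

```python
from collections import Counter

def bin_index(descriptor, width):
    """Grid cell containing the descriptor (uniform bins of the given width)."""
    return tuple(int(x // width) for x in descriptor)

def uniform_binning_match(set_a, set_b, width):
    """Histogram intersection: descriptors can match only if they share a bin."""
    hist_a = Counter(bin_index(d, width) for d in set_a)
    hist_b = Counter(bin_index(d, width) for d in set_b)
    return sum(min(count, hist_b[cell]) for cell, count in hist_a.items())

# Two descriptor sets whose second elements sit on opposite sides of a bin edge.
set_a = [(0.10, 0.10), (0.49, 0.80)]
set_b = [(0.12, 0.14), (0.51, 0.80)]

fine_matches = uniform_binning_match(set_a, set_b, width=0.5)
coarse_matches = uniform_binning_match(set_a, set_b, width=1.0)
```

With width 0.5 the nearly identical descriptors (0.49, 0.80) and (0.51, 0.80) fall into different bins and go unmatched, while width 1.0 matches everything but cannot tell near from far. Weighted, multi-resolution bins are one way to get both behaviors at once.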

Descriptor Matching via Pyramid Matching Kernel
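The slide figures are not preserved here, so the sketch below illustrates the general pyramid matching idea only. It swaps in uniform grids whose bin width doubles at each level, rather than the data-dependent hierarchy of the VG kernel named earlier, and it omits normalization. The weights (1 / 2^level), function names, and sample descriptors are assumptions for illustration.

```python
from collections import Counter

def intersection(set_a, set_b, width):
    """Number of descriptor pairs that can be matched within shared grid cells."""
    hist_a = Counter(tuple(int(x // width) for x in d) for d in set_a)
    hist_b = Counter(tuple(int(x // width) for x in d) for d in set_b)
    return sum(min(n, hist_b[cell]) for cell, n in hist_a.items())

def pyramid_match(set_a, set_b, finest_width, levels):
    """Weighted count of matches that first appear at each resolution level.
    Matches found in finer bins (closer descriptors) get larger weights."""
    score, previous = 0.0, 0
    for level in range(levels):
        width = finest_width * 2 ** level          # bins double in size per level
        matched = intersection(set_a, set_b, width)
        score += (matched - previous) / 2 ** level  # only newly formed matches
        previous = matched
    return score

set_a = [(0.10, 0.10), (0.49, 0.80)]
set_b = [(0.12, 0.14), (0.51, 0.80)]
score = pyramid_match(set_a, set_b, finest_width=0.25, levels=3)
```

Here one pair matches at the finest level (weight 1) and the second pair only matches once the bins reach width 1.0 (weight 1/4). Each level is a single pass over the descriptors, so the whole computation stays linear in the number of descriptors for a fixed number of levels.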

EVALUATION

Efficiency on Synthetic Graphs
[Figure: x-axis is the number of nodes. Kernels compared: DM (this work), PK (ML’16), GH (NIPS’13), WLSP (JMLR’11), SP (ICDM’05), CSM (ICML’12).]

Accuracy on Real-world Graphs
• DM is among the best on 9 of the 10 datasets, and is significantly better than PK on 8 datasets (Student’s t-test at p=0.05).

Summary
• A graph kernel that
  - Can be computed in linear time w.r.t. graph size
  - Can handle both categorical and numerical attributes
• Key ideas
  - Descriptor generation via categorical attribute propagation
  - Descriptor matching via hierarchical data-dependent discretization
• Competitive performance
  - Efficient: scales to graphs with 100,000 nodes
  - Accurate: among the best on 9 out of 10 datasets

Thank You!