A Fast Kernel for Attributed Graphs
Yu Su, University of California at Santa Barbara
with Fangqiu Han, Richard E. Harang, and Xifeng Yan
INTRODUCTION
Graph Kernel
  A graph kernel defines a similarity measure over graphs, a core problem in graph mining
    Inner product in some (latent) feature space
    Decouples data representation from the learning machine
  Once a graph kernel is supplied, a whole toolbox of kernel machines becomes readily applicable
    SVM, Kernel PCA, Support Vector Regression, Clustering, etc.
  A good graph kernel is thus the key
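To illustrate how a kernel decouples the data representation from the learner, here is a minimal Python sketch. The toy kernel (an inner product of node-label counts) and the kernel perceptron are illustrative assumptions, not the kernel or learner from this work, and the names label_count_kernel, gram_matrix, kernel_perceptron, and predict are hypothetical. The learner only ever reads kernel values, so any valid graph kernel could be swapped in.

```python
from collections import Counter

def label_count_kernel(graph_a, graph_b):
    """Toy kernel (assumption): inner product of node-label count vectors.
    Each graph is given only as a list of node labels; edges are ignored."""
    counts_a, counts_b = Counter(graph_a), Counter(graph_b)
    return sum(c * counts_b[label] for label, c in counts_a.items())

def gram_matrix(graphs, kernel):
    """All pairwise kernel values; this is the only view the learner gets."""
    return [[kernel(a, b) for b in graphs] for a in graphs]

def kernel_perceptron(gram, labels, epochs=20):
    """Dual perceptron: alpha[i] counts the mistakes made on example i."""
    n = len(labels)
    alpha = [0] * n
    for _ in range(epochs):
        mistakes = 0
        for i in range(n):
            score = sum(alpha[j] * labels[j] * gram[j][i] for j in range(n))
            if labels[i] * score <= 0:
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:
            break
    return alpha

def predict(alpha, labels, train_graphs, kernel, graph):
    """Classify a new graph using kernel values against the training graphs."""
    score = sum(a * y * kernel(g, graph)
                for a, y, g in zip(alpha, labels, train_graphs))
    return 1 if score > 0 else -1

# Toy data (made up for illustration): carbon-rich vs. nitrogen-rich graphs.
train_graphs = [["C", "C", "O"], ["C", "C", "C"], ["N", "N", "O"], ["N", "N", "N"]]
train_labels = [1, 1, -1, -1]
gram = gram_matrix(train_graphs, label_count_kernel)
alpha = kernel_perceptron(gram, train_labels)
```

Replacing label_count_kernel with a different kernel function changes nothing in kernel_perceptron or predict, which is the sense in which a good graph kernel is the key.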
Broad Applications
  Chemo- & Bioinformatics
  Semantic Web
  Software Engineering
  Natural Language Processing
Trends and Challenges in the Big Data Era
  Increasing graph size: more efficient methods
  Richer graph attributes: more versatile methods
  This work: A linear-time kernel that can handle both categorical and numerical attributes.
Graph Kernel as a Measure of Graph Similarity
  ① Decompose each graph into a (multi-)set of features
    Subgraphs (Gartner et al. 2003, NP-hard!)
    Random walks (Gartner et al. 2003, Kashima et al. 2003)
    Subtrees (Shervashidze and Borgwardt 2009)
    Vectors (Neumann et al. 2016)
  ② Compare feature sets
    Pair-wise comparison (quadratic)
    Inner product (linear; only when features are discrete)
    Discretization (linear; can handle numerical attributes)
  Vector features + discretization
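The three comparison strategies in step ② can be sketched in a few lines of Python. This is an illustration under stated assumptions, not code from this work: the function names are hypothetical, features are plain hashable values, and the pair-wise comparison uses a simple exact-match (delta) base kernel.

```python
import math
from collections import Counter

def pairwise_kernel(features_a, features_b):
    """Quadratic: compare every feature of A with every feature of B
    using an exact-match (delta) base kernel."""
    return sum(1 for fa in features_a for fb in features_b if fa == fb)

def inner_product_kernel(features_a, features_b):
    """Linear: count each discrete feature once per graph, then take the
    inner product of the two count vectors."""
    counts_a, counts_b = Counter(features_a), Counter(features_b)
    return sum(c * counts_b[f] for f, c in counts_a.items())

def discretize(values, bin_width):
    """Map numerical features to integer bin ids so that the linear-time
    inner product becomes applicable (bin_width is an assumed parameter)."""
    return [math.floor(v / bin_width) for v in values]

# Discrete features: both strategies give the same value.
a, b = ["A", "B", "A"], ["A", "A", "C"]
same = pairwise_kernel(a, b) == inner_product_kernel(a, b)

# Numerical features: discretize first, then use the inner product.
k_num = inner_product_kernel(discretize([0.1, 0.4, 1.2], 0.5),
                             discretize([0.3, 1.1], 0.5))
```

With an exact-match base kernel the two functions agree on discrete features, but the second touches each feature only once; discretization is what extends that shortcut to numerical values.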
METHOD
Descriptor Matching (DM) Kernel: An Overview
Desired Descriptor Property: Preserve Similarity
  Similar nodes should have similar descriptors
    So it becomes meaningful to compare graph similarity by matching their descriptors
  Nodes are more similar if their attributes and neighbors are more similar
    Recursive definition of similarity makes it natural to generate descriptors in a recursive manner
Desired Descriptor Property: Highly Discriminative
Descriptor Generation via Propagation
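The slides do not spell out the propagation rule, so the following Python sketch is only one plausible reading, not the authors' procedure. It assumes that each node starts with a one-hot vector over its categorical label and that every iteration mixes a node's vector with the mean of its neighbors' vectors. The function name, the self_weight parameter, and the adjacency-dictionary input are all hypothetical.

```python
def propagate_descriptors(adjacency, labels, alphabet, iterations, self_weight=0.5):
    """Assumed scheme: one-hot label vectors smoothed over neighbors.

    adjacency: dict mapping node -> list of neighbor nodes
    labels:    dict mapping node -> categorical label
    alphabet:  list of all possible labels (fixes the vector layout)
    """
    k = len(alphabet)
    index = {label: i for i, label in enumerate(alphabet)}
    # Start from a one-hot encoding of each node's own label.
    desc = {v: [1.0 if index[labels[v]] == i else 0.0 for i in range(k)]
            for v in adjacency}
    for _ in range(iterations):
        updated = {}
        for v, neighbors in adjacency.items():
            if neighbors:
                mean = [sum(desc[u][i] for u in neighbors) / len(neighbors)
                        for i in range(k)]
            else:
                mean = desc[v]  # isolated node keeps its own vector
            # Recursive step: own vector mixed with the neighborhood summary.
            updated[v] = [self_weight * desc[v][i] + (1 - self_weight) * mean[i]
                          for i in range(k)]
        desc = updated
    return desc

# Path graph a - b - c with labels X, X, Y (toy data for illustration).
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
labels = {"a": "X", "b": "X", "c": "Y"}
descriptors = propagate_descriptors(adjacency, labels, ["X", "Y"], iterations=1)
```

Under this assumed rule, each iteration visits every edge a constant number of times per label, so the cost grows linearly with graph size, and nodes with the same label and similar neighborhoods end up with similar vectors.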
Descriptor Matching
  Optimal matching: Maximum weighted bipartite matching
    Cubic time. Not a valid kernel (Vert 2008)
  Discretization: Uniform binning
    Linear time. Valid kernel. Unweighted, independent bins.
  Discretization: Data-dependent hierarchical binning
    Linear time. Valid kernel. Weighted, multi-resolution bins.
    Vocabulary-Guided pyramid matching (VG) kernel (Grauman and Darrell 2006)
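The uniform-binning option can be illustrated with a short Python sketch. It assumes descriptors are tuples of floats and counts matches by histogram intersection over a fixed grid; the function names and the width parameter are hypothetical, and this shows the simple baseline, not the data-dependent hierarchical binning.

```python
import math
from collections import Counter

def bin_index(descriptor, width):
    """Grid cell that a descriptor falls into (one integer per dimension)."""
    return tuple(math.floor(x / width) for x in descriptor)

def histogram(descriptors, width):
    """Number of descriptors per grid cell; a single pass over the set."""
    return Counter(bin_index(d, width) for d in descriptors)

def binned_match(desc_a, desc_b, width):
    """Histogram intersection: descriptors sharing a cell count as matched."""
    hist_a, hist_b = histogram(desc_a, width), histogram(desc_b, width)
    return sum(min(count, hist_b[cell]) for cell, count in hist_a.items())

# Toy descriptors (made up for illustration).
graph_a = [(0.10, 0.20), (0.15, 0.22), (0.90, 0.80)]
graph_b = [(0.12, 0.18), (0.88, 0.85), (0.50, 0.50)]
matches = binned_match(graph_a, graph_b, width=0.25)

# Weakness of independent bins: close points split by a cell boundary never match.
boundary = binned_match([(0.24,)], [(0.26,)], width=0.25)
```

The boundary case is what "unweighted, independent bins" costs: two nearly identical descriptors contribute nothing once a grid line separates them, which motivates weighted, multi-resolution bins.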
Descriptor Matching via Pyramid Matching Kernel
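As a rough illustration of the pyramid matching idea, the Python sketch below implements the simpler uniform-grid pyramid match, in which bins double in width at each level and matches first found at level i are weighted by 1/2^i. It is not the vocabulary-guided (VG) variant named on the earlier slide, whose bins are data-dependent; the function names and the base_width and levels parameters are assumptions.

```python
import math
from collections import Counter

def histogram(descriptors, width):
    """Count descriptors per grid cell of the given side length."""
    return Counter(tuple(math.floor(x / width) for x in d) for d in descriptors)

def intersection(hist_a, hist_b):
    """Number of descriptors that can be paired up within shared cells."""
    return sum(min(count, hist_b[cell]) for cell, count in hist_a.items())

def pyramid_match(desc_a, desc_b, base_width, levels):
    """Uniform-grid pyramid match (assumed simplification, not the VG kernel).

    Level 0 uses the finest bins; each further level doubles the bin width.
    Only matches that are new at a level are counted, with weight 1 / 2**level.
    """
    score = 0.0
    previous = 0
    for level in range(levels):
        width = base_width * 2 ** level
        current = intersection(histogram(desc_a, width), histogram(desc_b, width))
        score += (current - previous) / 2 ** level
        previous = current
    return score

# Toy 1-D descriptors (made up for illustration).
graph_a = [(0.1,), (0.6,)]
graph_b = [(0.2,), (0.9,)]
score = pyramid_match(graph_a, graph_b, base_width=0.25, levels=3)
```

In this toy run, 0.1 and 0.2 match at the finest level with weight 1, while 0.6 and 0.9 only match once the bins widen to 0.5 and so contribute weight 0.5. Each level is a single pass over both descriptor sets, so the total cost stays linear in the number of descriptors for a fixed number of levels.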
EVALUATION
Efficiency on Synthetic Graphs
  [Plot; x-axis: number of nodes]
  Methods compared: DM (this work), PK (ML’16), GH (NIPS’13), WLSP (JMLR’11), SP (ICDM’05), CSM (ICML’12)
Accuracy on Real-world Graphs
  DM is among the best on 9 out of the 10 datasets, and is significantly better than PK on 8 datasets (Student’s t-test at p=0.05).
Summary
  A graph kernel
    Can be computed in linear time w.r.t. graph size
    Can handle both categorical and numerical attributes
  Key ideas
    Descriptor generation via categorical attribute propagation
    Descriptor matching via hierarchical data-dependent discretization
  Competitive performance
    Efficient: scales to graphs with 100,000 nodes
    Accurate: among the best on 9 out of 10 datasets
Thank You!