Graph Attention Networks


Graph Attention Networks Authors: Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, Yoshua Bengio

Outline
- Goal
- Challenges
- Method
- Discussion (advantages)
- Experiment
- Future work (limitations)
- Summary

Goal
- Build a model for node classification of graph-structured data
- Compute hidden representations of each node by attending over its neighbors

Challenges
Limitations of existing work:
- Depends on the graph structure, so it cannot be applied directly to unseen graphs
- Samples a fixed-size neighborhood of each node (e.g., GraphSAGE)
Contributions of the proposed model:
- Generalizes to completely unseen graphs
- Applies to graph nodes with different degrees

Method
Graph attention layer:
- Input: a set of node features
- Output: a set of new node features that incorporate neighboring information
- Attention coefficient e_ij indicates the importance of node j's features to node i
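
As a rough illustration, the NumPy sketch below (with hypothetical names; not the authors' code) computes the raw coefficients e_ij = a(W h_i, W h_j) for every edge, with a realized as a single-layer mechanism over the concatenated transformed features, as in the paper.

```python
import numpy as np

def raw_attention_coefficients(h, W, a, edges):
    """e_ij = a^T [W h_i || W h_j] for each directed edge (i, j).

    h     : (N, F)   input node features
    W     : (F, Fp)  shared linear transformation
    a     : (2*Fp,)  shared single-layer attentional mechanism
    edges : iterable of (i, j) pairs, j being a first-order neighbor of i
    """
    z = h @ W  # transformed features W h, shape (N, Fp)
    return {(i, j): float(a @ np.concatenate([z[i], z[j]])) for i, j in edges}
```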

Attention coefficients
- W is a weight matrix applying a shared linear transformation
- a is a shared attentional mechanism: e_ij = a(W h_i, W h_j)
- For node i, attention is computed only over the first-order neighbors of i
- After normalization (softmax over the neighbors) and LeakyReLU activation, the coefficient is
  α_ij = exp(LeakyReLU(aᵀ [W h_i ∥ W h_j])) / Σ_{k ∈ N_i} exp(LeakyReLU(aᵀ [W h_i ∥ W h_k]))
- ∥ is the concatenation operation
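
A minimal sketch of this normalization step, assuming the transformed features z = h @ W have already been computed as above (normalized_attention and its arguments are illustrative names, not the paper's code):

```python
import numpy as np

def normalized_attention(z, a, i, neighbors_i):
    """alpha_ij = softmax over j in N_i of LeakyReLU(a^T [W h_i || W h_j]).

    z           : (N, Fp) transformed node features (z = h @ W)
    a           : (2*Fp,) shared attention vector
    neighbors_i : first-order neighbors of node i (including i itself)
    """
    e = np.array([a @ np.concatenate([z[i], z[j]]) for j in neighbors_i])
    e = np.where(e > 0, e, 0.2 * e)   # LeakyReLU, negative slope 0.2 as in the paper
    e = np.exp(e - e.max())           # shift for numerical stability
    return e / e.sum()                # one coefficient per neighbor j
```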

Multi-head attention
- A single attention head, for node i: h_i' = σ( Σ_{j ∈ N_i} α_ij W h_j )
- K independent attention heads, concatenated: h_i' = ∥_{k=1..K} σ( Σ_{j ∈ N_i} α_ij^(k) W^(k) h_j )
- For the final layer, replace concatenation with averaging: h_i' = σ( (1/K) Σ_{k=1..K} Σ_{j ∈ N_i} α_ij^(k) W^(k) h_j )
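
Putting the pieces together, here is a sketch of one multi-head layer (reusing the hypothetical normalized_attention from the previous block): hidden layers concatenate the K head outputs after a per-head nonlinearity, while the final layer averages them before the output activation.

```python
import numpy as np

def multi_head_gat_layer(h, Ws, As, neighbors, final_layer=False):
    """One multi-head graph attention layer (NumPy sketch).

    Ws, As    : K weight matrices (F, Fp) and attention vectors (2*Fp,)
    neighbors : neighbors[i] lists first-order neighbors of i (including i)
    """
    heads = []
    for W, a in zip(Ws, As):                      # K independent heads
        z = h @ W
        out = np.zeros_like(z)
        for i in range(h.shape[0]):
            alpha = normalized_attention(z, a, i, neighbors[i])
            out[i] = alpha @ z[neighbors[i]]      # attention-weighted neighbor sum
        heads.append(out)
    if final_layer:
        # prediction layer: average heads (softmax/sigmoid applied outside)
        return np.mean(heads, axis=0)
    elu = lambda x: np.where(x > 0, x, np.exp(x) - 1.0)
    return np.concatenate([elu(o) for o in heads], axis=1)  # concatenate heads
```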

Advantages
- Computationally efficient: attention can be computed in parallel across edges; the time complexity of a single attention head is O(|V| F F' + |E| F')
- Assigns different weights to nodes of the same neighborhood
- Does not require the graph to be undirected
- Works on the entire neighborhood (no neighborhood sampling)
- Requires no ordering of the neighboring nodes

Datasets for the experiment
Tested on four datasets: Cora, Citeseer, and Pubmed (citation networks, transductive) and PPI (protein-protein interaction, inductive)

Experiment setup and evaluation metrics
Transductive learning: a two-layer GAT model
- For the Cora and Citeseer datasets:
  - First layer: K = 8, F' = 8, ELU (exponential linear unit)
  - Second layer (classifier): K = 1, F' = number of classes, softmax
- For Pubmed, the only change is K = 8 in the classification layer
- Metric: mean classification accuracy
Inductive learning: a three-layer GAT model
- First two layers: K = 4, F' = 256, ELU (exponential linear unit)
- Third layer (classifier): K = 6, F' = 121, logistic sigmoid
- Metric: micro-averaged F1
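
For concreteness, the transductive Cora configuration above could be wired up as below, given node features x and adjacency lists neighbors (Cora has 1433 input features and 7 classes; multi_head_gat_layer is the sketch from the earlier slide, and the random weights merely stand in for trained parameters):

```python
import numpy as np

rng = np.random.default_rng(0)
F_in, F_hidden, n_classes = 1433, 8, 7           # Cora feature/class counts

# Layer 1: K = 8 heads, F' = 8, ELU -> output width 8 * 8 = 64
Ws1 = [rng.normal(0, 0.1, (F_in, F_hidden)) for _ in range(8)]
As1 = [rng.normal(0, 0.1, 2 * F_hidden) for _ in range(8)]

# Layer 2 (classifier): K = 1 head, F' = n_classes, then softmax
Ws2 = [rng.normal(0, 0.1, (8 * F_hidden, n_classes))]
As2 = [rng.normal(0, 0.1, 2 * n_classes)]

h1 = multi_head_gat_layer(x, Ws1, As1, neighbors)                    # (N, 64)
logits = multi_head_gat_layer(h1, Ws2, As2, neighbors, final_layer=True)
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)   # softmax
```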

Experiment results (mean classification accuracy)
As reported in the paper, GAT achieves 83.0 ± 0.7% on Cora, 72.5 ± 0.7% on Citeseer, and 79.0 ± 0.3% on Pubmed, matching or outperforming prior methods such as GCN.

Experiment results (micro-averaged F1)
As reported in the paper, GAT reaches a micro-averaged F1 of 0.973 on PPI, substantially above the reported GraphSAGE results.

Future work
- Handle larger batch sizes in practice (the current implementation only supports sparse matrix multiplication for rank-2 tensors)
- Analyze the attention over neighboring nodes for model interpretability
- Incorporate edge features

Summary
Graph attention networks:
- Built from graph attention layers
- Deal with neighborhoods of different sizes
- Do not need to know the entire graph structure upfront