Graph Attention Networks
Authors: Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, Yoshua Bengio
Outline
- Goal
- Challenges
- Method
- Discussion (advantages)
- Experiments
- Future work (limitations)
- Summary
Goal
- Build a model for node classification on graph-structured data
- Compute a hidden representation of each node by attending over its neighbors
Challenges
- Limitations of existing work:
  - Depends on the specific graph structure, so it cannot be applied directly to unseen graphs
  - Or samples a fixed-size neighborhood of each node (e.g., GraphSAGE)
- Contributions of the proposed model:
  - Generalizes to completely unseen graphs
  - Applies to nodes with different degrees
Method: graph attention layer
- Input: a set of node features
- Output: a new set of node features that incorporates neighborhood information
- Attention coefficient $e_{ij}$ indicates the importance of node $j$'s features to node $i$
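Written out in the paper's notation, the layer maps $F$-dimensional input features to $F'$-dimensional output features, and the unnormalized coefficients come from a shared attentional mechanism $a$:

```latex
\mathbf{h} = \{\vec{h}_1, \dots, \vec{h}_N\}, \; \vec{h}_i \in \mathbb{R}^{F}
\;\longrightarrow\;
\mathbf{h}' = \{\vec{h}'_1, \dots, \vec{h}'_N\}, \; \vec{h}'_i \in \mathbb{R}^{F'},
\qquad
e_{ij} = a\left(\mathbf{W}\vec{h}_i,\, \mathbf{W}\vec{h}_j\right)
```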
Attention coefficients
- $\mathbf{W}$ is a shared weight matrix (a linear transformation); $a$ is a shared attentional mechanism
- For node $i$, attention is computed over its first-order neighbors (including $i$ itself)
- After the LeakyReLU nonlinearity and softmax normalization, the coefficients are
  $$\alpha_{ij} = \mathrm{softmax}_j(e_{ij}) = \frac{\exp\left(\mathrm{LeakyReLU}\left(\vec{a}^{\,T}[\mathbf{W}\vec{h}_i \,\Vert\, \mathbf{W}\vec{h}_j]\right)\right)}{\sum_{k \in \mathcal{N}_i} \exp\left(\mathrm{LeakyReLU}\left(\vec{a}^{\,T}[\mathbf{W}\vec{h}_i \,\Vert\, \mathbf{W}\vec{h}_k]\right)\right)}$$
- $\Vert$ denotes the concatenation operation
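To make the masked-softmax computation concrete, here is a minimal NumPy sketch of these coefficients. The function and variable names are illustrative assumptions of this summary, not taken from the authors' implementation:

```python
# Minimal NumPy sketch of the (single-head) attention coefficients above.
# Names are illustrative, not from the authors' code.
import numpy as np

def attention_coefficients(H, W, a, adj):
    """H: (N, F) node features; W: (F, F') weight matrix;
    a: (2F',) attention vector; adj: (N, N) adjacency with self-loops.
    Returns alpha: (N, N); alpha[i, j] is nonzero only for j in N_i."""
    Wh = H @ W                                    # transformed features, (N, F')
    Fp = Wh.shape[1]
    # a^T [Wh_i || Wh_j] splits into two dot products, broadcast to (N, N)
    s = (Wh @ a[:Fp])[:, None] + (Wh @ a[Fp:])[None, :]
    e = np.where(s > 0, s, 0.2 * s)               # LeakyReLU, negative slope 0.2
    e = np.where(adj > 0, e, -1e9)                # mask: attend only to neighbors
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    return alpha / alpha.sum(axis=1, keepdims=True)   # softmax over each row
```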
Multi-head attention
- A single attention head produces, for node $i$,
  $$\vec{h}'_i = \sigma\left(\sum_{j \in \mathcal{N}_i} \alpha_{ij} \mathbf{W} \vec{h}_j\right)$$
- With $K$ independent attention heads, the head outputs are concatenated:
  $$\vec{h}'_i = \big\Vert_{k=1}^{K}\, \sigma\left(\sum_{j \in \mathcal{N}_i} \alpha^k_{ij} \mathbf{W}^k \vec{h}_j\right)$$
- For the final (prediction) layer, concatenation is replaced with averaging:
  $$\vec{h}'_i = \sigma\left(\frac{1}{K} \sum_{k=1}^{K} \sum_{j \in \mathcal{N}_i} \alpha^k_{ij} \mathbf{W}^k \vec{h}_j\right)$$
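The per-head aggregation and the concatenate-vs-average choice can be sketched the same way. Again, this is a NumPy illustration with hypothetical names, not the authors' implementation:

```python
# Minimal NumPy sketch of multi-head aggregation (illustrative names only).
import numpy as np

def gat_head(H, W, a, adj):
    """One attention head: returns sum_j alpha_ij * W h_j for every node i."""
    Wh = H @ W
    Fp = Wh.shape[1]
    s = (Wh @ a[:Fp])[:, None] + (Wh @ a[Fp:])[None, :]
    e = np.where(s > 0, s, 0.2 * s)               # LeakyReLU
    e = np.where(adj > 0, e, -1e9)                # neighbors only
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha /= alpha.sum(axis=1, keepdims=True)
    return alpha @ Wh                             # weighted neighborhood sum, (N, F')

def multi_head_layer(H, Ws, As, adj, final=False):
    """Ws, As: lists of K per-head parameters."""
    outs = [gat_head(H, W, a, adj) for W, a in zip(Ws, As)]
    if final:                                     # output layer: average heads;
        return np.mean(outs, axis=0)              # apply softmax/sigmoid afterwards
    out = np.concatenate(outs, axis=1)            # hidden layers: concatenate heads
    # ELU is elementwise, so applying it after concatenation is equivalent
    # to applying sigma per head before concatenating
    return np.where(out > 0, out, np.exp(out) - 1)
```

Setting `final=True` corresponds to the paper's output layer, where head outputs are averaged before the final softmax or sigmoid.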
Advantages
- Computationally efficient (parallelizable); the time complexity of a single head is $O(|V| F F' + |E| F')$
- Assigns different weights to nodes within the same neighborhood
- Does not require the graph to be undirected
- Operates on the entire neighborhood (no sampling needed)
- Does not assume any ordering of the neighboring nodes
Datasets for the experiments
- Tested on four datasets: the Cora, Citeseer, and Pubmed citation networks (transductive) and the PPI protein-protein interaction dataset (inductive)
Experiment setup and evaluation metrics
- Transductive learning: a two-layer GAT model
  - Cora and Citeseer:
    - First layer: K = 8, F' = 8, ELU (exponential linear unit)
    - Second layer (classifier): K = 1, F' = number of classes, softmax
  - Pubmed: the only change is K = 8 in the classification layer
  - Metric: mean classification accuracy
- Inductive learning: a three-layer GAT model
  - First two layers: K = 4, F' = 256, ELU (exponential linear unit)
  - Third layer (classifier): K = 6, F' = 121, logistic sigmoid
  - Metric: micro-averaged F1
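To make the setup above easier to scan, here is the same configuration restated as plain Python dicts. This is a summary device of this write-up, not the authors' configuration format:

```python
# The experimental architectures above, restated as plain Python dicts
# (a summary device, not the authors' configuration format).
transductive_gat = {                 # Cora / Citeseer: two layers
    "layer1": {"K": 8, "F_out": 8, "activation": "ELU"},
    "layer2": {"K": 1, "F_out": "num_classes", "activation": "softmax"},
}
pubmed_gat = {                       # Pubmed: same, but K = 8 output heads (averaged)
    "layer1": {"K": 8, "F_out": 8, "activation": "ELU"},
    "layer2": {"K": 8, "F_out": "num_classes", "activation": "softmax"},
}
inductive_gat = {                    # PPI: three layers
    "layer1": {"K": 4, "F_out": 256, "activation": "ELU"},
    "layer2": {"K": 4, "F_out": 256, "activation": "ELU"},
    "layer3": {"K": 6, "F_out": 121, "activation": "logistic sigmoid"},
}
```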
Experiment results (mean classification accuracy)
- [Results table not reproduced here] GAT matches or outperforms the state of the art (e.g., GCN) on Cora, Citeseer, and Pubmed
Experiment results (micro-averaged F1)
- [Results table not reproduced here] GAT outperforms the GraphSAGE baselines on the inductive PPI benchmark
Future work
- In practice, handle larger batch sizes, since the current implementation only supports sparse matrix multiplication for rank-2 tensors
- Use the attention weights over neighboring nodes for model interpretability
- Incorporate edge features
Summary
- Graph attention networks, built from graph attention layers
- Deal with different-sized neighborhoods
- Do not need to know the entire graph structure upfront