Global Self-Attention as a Replacement for Graph Convolution

Md Shamim Hussain; Mohammed J. Zaki; Dharmashankar Subramanian

Abstract

We propose an extension to the transformer neural network architecture for general-purpose graph learning by adding a dedicated pathway for pairwise structural information, called edge channels. The resultant framework - which we call Edge-augmented Graph Transformer (EGT) - can directly accept, process and output structural information of arbitrary form, which is important for effective learning on graph-structured data. Our model exclusively uses global self-attention as an aggregation mechanism rather than static localized convolutional aggregation. This allows for unconstrained long-range dynamic interactions between nodes. Moreover, the edge channels allow the structural information to evolve from layer to layer, and prediction tasks on edges/links can be performed directly from the output embeddings of these channels. We verify the performance of EGT in a wide range of graph-learning experiments on benchmark datasets, in which it outperforms Convolutional/Message-Passing Graph Neural Networks. EGT sets a new state of the art for the quantum-chemical regression task on the OGB-LSC PCQM4Mv2 dataset containing 3.8 million molecular graphs. Our findings indicate that global self-attention based aggregation can serve as a flexible, adaptive and effective replacement for graph convolution in general-purpose graph learning. Therefore, convolutional local neighborhood aggregation is not an essential inductive bias.
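
To make the aggregation mechanism concrete, below is a minimal single-head PyTorch sketch of an EGT-style layer. It is not the authors' implementation (see the official repositories below); the class and parameter names (`EGTLayerSketch`, `edge_bias`, `edge_gate`, `edge_update`) and all dimensions are illustrative, and the real model additionally uses multiple attention heads, normalization, and feed-forward sublayers. The key idea sketched here is that edge channels add a bias to the all-pairs attention logits, gate the resulting attention weights, and are themselves updated from those logits, so structural information evolves across layers.

```python
import torch
import torch.nn as nn

class EGTLayerSketch(nn.Module):
    """Simplified single-head sketch of an edge-augmented transformer layer.

    h: node embeddings,        shape (batch, n, d_node)
    e: edge-channel embeddings, shape (batch, n, n, d_edge)
    """

    def __init__(self, d_node: int, d_edge: int):
        super().__init__()
        self.q = nn.Linear(d_node, d_node)
        self.k = nn.Linear(d_node, d_node)
        self.v = nn.Linear(d_node, d_node)
        self.edge_bias = nn.Linear(d_edge, 1)    # additive bias on attention logits
        self.edge_gate = nn.Linear(d_edge, 1)    # sigmoid gate on attention weights
        self.edge_update = nn.Linear(1, d_edge)  # maps logits back into edge channels
        self.out = nn.Linear(d_node, d_node)
        self.scale = d_node ** 0.5

    def forward(self, h, e):
        # Global (all-pairs) attention logits, biased by the edge channels.
        logits = torch.einsum('bid,bjd->bij', self.q(h), self.k(h)) / self.scale
        logits = logits + self.edge_bias(e).squeeze(-1)

        # Gated softmax aggregation over *all* nodes: no fixed neighborhood.
        gates = torch.sigmoid(self.edge_gate(e)).squeeze(-1)
        attn = torch.softmax(logits, dim=-1) * gates
        h_new = h + self.out(torch.einsum('bij,bjd->bid', attn, self.v(h)))

        # Edge channels evolve from the pairwise logits; their final-layer
        # outputs can feed an edge/link prediction head directly.
        e_new = e + self.edge_update(logits.unsqueeze(-1))
        return h_new, e_new

# Illustrative usage with hypothetical dimensions: 2 graphs, 10 nodes each.
layer = EGTLayerSketch(d_node=64, d_edge=16)
h = torch.randn(2, 10, 64)        # node embeddings
e = torch.randn(2, 10, 10, 16)    # pairwise edge-channel embeddings
h, e = layer(h, e)                # shapes are preserved layer to layer
```

Because the gates are learned per node pair, such a layer can recover sparse or localized aggregation patterns when they are useful, rather than having a fixed neighborhood hard-coded as an inductive bias.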

Code Repositories

shamim-hussain/egt_triangular (official, PyTorch)
shamim-hussain/egt_pytorch (official, PyTorch)
shamim-hussain/egt (official, TensorFlow)

Benchmarks

Benchmark | Methodology | Metrics
graph-classification-on-cifar10-100k | EGT | Accuracy (%): 68.702
graph-classification-on-mnist | EGT | Accuracy: 98.173
graph-property-prediction-on-ogbg-molhiv | EGT | Test ROC-AUC: 0.806 ± 0.0065
graph-property-prediction-on-ogbg-molpcba | EGT | Test AP: 0.2961 ± 0.0024
graph-regression-on-pcqm4m-lsc | EGT | Validation MAE: 0.1224
graph-regression-on-pcqm4mv2-lsc | EGT | Test MAE: 0.0862; Validation MAE: 0.0857
graph-regression-on-pcqm4mv2-lsc | EGT + Triangular Attention | Test MAE: 0.0683; Validation MAE: 0.0671
graph-regression-on-zinc-100k | EGT | MAE: 0.143
graph-regression-on-zinc-500k | EGT | MAE: 0.108
link-prediction-on-tsp-hcp-benchmark-set | EGT | F1: 0.853
node-classification-on-cluster | EGT | Accuracy: 79.232
node-classification-on-pattern | EGT | Accuracy: 86.821
node-classification-on-pattern-100k | EGT | Accuracy (%): 86.816
