An end-to-end attention-based approach for learning on graphs

David Buterez; Jon Paul Janet; Dino Oglic; Pietro Lio

Abstract

There has been a recent surge in transformer-based architectures for learning on graphs, mainly motivated by attention as an effective learning mechanism and the desire to supersede the handcrafted operators characteristic of message passing schemes. However, concerns have been raised over their empirical effectiveness, scalability, and the complexity of their pre-processing steps, especially in relation to much simpler graph neural networks that typically perform on par with them across a wide range of benchmarks. To tackle these shortcomings, we consider graphs as sets of edges and propose a purely attention-based approach consisting of an encoder and an attention pooling mechanism. The encoder vertically interleaves masked and vanilla self-attention modules to learn effective representations of edges, while allowing the model to tackle possible misspecifications in the input graphs. Despite its simplicity, the approach outperforms fine-tuned message passing baselines and recently proposed transformer-based methods on more than 70 node- and graph-level tasks, including challenging long-range benchmarks. Moreover, we demonstrate state-of-the-art performance across different tasks, ranging from molecular to vision graphs and heterophilous node classification. The approach also outperforms graph neural networks and transformers in transfer learning settings, and scales much better than alternatives with a similar performance level or expressive power.
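The architecture outlined in the abstract lends itself to a compact sketch: a graph is treated as a set of edge tokens built from endpoint (and edge) features, an encoder interleaves masked self-attention (here restricted to edges that share a node) with vanilla self-attention over the full edge set, and a learnable-query attention pooling produces a graph-level readout. The PyTorch sketch below is an illustration under these assumptions; the block structure, masking rule, and dimensions are simplifications for exposition, not the authors' reference implementation.

```python
# Minimal sketch of the edge-set attention (ESA) idea described in the abstract.
# All module names, sizes, and the exact masking rule are illustrative assumptions.
import torch
import torch.nn as nn


class EdgeSetAttentionSketch(nn.Module):
    def __init__(self, node_dim: int, edge_dim: int, hidden: int = 64, heads: int = 4):
        super().__init__()
        # Each edge token is built from its two endpoint features plus the edge feature.
        self.edge_proj = nn.Linear(2 * node_dim + edge_dim, hidden)
        # Interleaved blocks: masked self-attention, then vanilla (unmasked) self-attention.
        self.masked_attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.vanilla_attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        # Attention pooling: a learnable seed query attends over all edge tokens.
        self.seed = nn.Parameter(torch.randn(1, 1, hidden))
        self.pool_attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.readout = nn.Linear(hidden, 1)

    def forward(self, x, edge_index, edge_attr):
        # x: (num_nodes, node_dim); edge_index: (2, num_edges); edge_attr: (num_edges, edge_dim)
        src, dst = edge_index
        tokens = self.edge_proj(torch.cat([x[src], x[dst], edge_attr], dim=-1)).unsqueeze(0)

        # Masked block: an edge may only attend to edges sharing at least one endpoint.
        share = (src[:, None] == src[None, :]) | (src[:, None] == dst[None, :]) \
              | (dst[:, None] == src[None, :]) | (dst[:, None] == dst[None, :])
        attn_mask = ~share  # True entries are blocked by nn.MultiheadAttention
        h, _ = self.masked_attn(tokens, tokens, tokens, attn_mask=attn_mask)
        tokens = tokens + h

        # Vanilla block: unrestricted attention over the full edge set.
        h, _ = self.vanilla_attn(tokens, tokens, tokens)
        tokens = tokens + h

        # Attention pooling with the learnable seed query, followed by a linear readout.
        pooled, _ = self.pool_attn(self.seed, tokens, tokens)
        return self.readout(pooled.squeeze(1))


# Toy usage on a random 4-node, 5-edge graph.
if __name__ == "__main__":
    x = torch.randn(4, 8)
    edge_index = torch.tensor([[0, 1, 2, 3, 0], [1, 2, 3, 0, 2]])
    edge_attr = torch.randn(5, 3)
    model = EdgeSetAttentionSketch(node_dim=8, edge_dim=3)
    print(model(x, edge_index, edge_attr).shape)  # torch.Size([1, 1])
```

In this sketch the masked and vanilla blocks are stacked once each; the paper's "vertical interleaving" corresponds to repeating such blocks in depth before the pooling step.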

Benchmarks

Benchmark | Methodology | Metrics
graph-classification-on-cifar10-100k | ESA (Edge set attention, no positional encodings) | Accuracy (%): 75.413±0.248
graph-classification-on-dd | ESA (Edge set attention, no positional encodings) | Accuracy: 83.529±1.743
graph-classification-on-enzymes | ESA (Edge set attention, no positional encodings) | Accuracy: 79.423±1.658
graph-classification-on-imdb-b | ESA (Edge set attention, no positional encodings) | Accuracy: 86.250±0.957
graph-classification-on-malnet-tiny | ESA (Edge set attention, no positional encodings) | Accuracy: 94.800±0.424; MCC: 0.935±0.005
graph-classification-on-mnist | ESA (Edge set attention, no positional encodings) | Accuracy: 98.753±0.041
graph-classification-on-mnist | ESA (Edge set attention, no positional encodings, tuned) | Accuracy: 98.917±0.020
graph-classification-on-nci1 | ESA (Edge set attention, no positional encodings) | Accuracy: 87.835±0.644
graph-classification-on-nci109 | ESA (Edge set attention, no positional encodings) | Accuracy: 84.976±0.551
graph-classification-on-peptides-func | ESA (Edge set attention, no positional encodings, not tuned) | AP: 0.6863±0.0044
graph-classification-on-peptides-func | ESA (Edge set attention, no positional encodings, tuned) | AP: 0.7071±0.0015
graph-classification-on-peptides-func | ESA + RWSE (Edge set attention, Random Walk Structural Encoding, tuned) | AP: 0.7357±0.0036
graph-classification-on-peptides-func | ESA + RWSE (Edge set attention, Random Walk Structural Encoding, + validation set) | AP: 0.7479
graph-classification-on-proteins | ESA (Edge set attention, no positional encodings) | Accuracy: 82.679±0.799
graph-regression-on-esr2 | ESA (Edge set attention, no positional encodings) | R2: 0.697±0.000; RMSE: 0.486±0.697
graph-regression-on-f2 | ESA (Edge set attention, no positional encodings) | R2: 0.891±0.000; RMSE: 0.335±0.891
graph-regression-on-kit | ESA (Edge set attention, no positional encodings) | R2: 0.841±0.000; RMSE: 0.433±0.841
graph-regression-on-lipophilicity | ESA (Edge set attention, no positional encodings) | R2: 0.809±0.008; RMSE: 0.552±0.012
graph-regression-on-parp1 | ESA (Edge set attention, no positional encodings) | R2: 0.925±0.000; RMSE: 0.343±0.925
graph-regression-on-pcqm4mv2-lsc | ESA (Edge set attention, no positional encodings) | Test MAE: N/A; Validation MAE: 0.0235
graph-regression-on-peptides-struct | ESA + RWSE (Edge set attention, Random Walk Structural Encoding, tuned) | MAE: 0.2393±0.0004
graph-regression-on-peptides-struct | ESA (Edge set attention, no positional encodings, not tuned) | MAE: 0.2453±0.0003
graph-regression-on-pgr | ESA (Edge set attention, no positional encodings) | R2: 0.725±0.000; RMSE: 0.507±0.725
graph-regression-on-zinc | ESA + rings + NodeRWSE + EdgeRWSE | MAE: 0.051
graph-regression-on-zinc-500k | ESA + rings + NodeRWSE + EdgeRWSE | MAE: 0.051
graph-regression-on-zinc-full | ESA + rings + NodeRWSE + EdgeRWSE | Test MAE: 0.0109±0.0002
graph-regression-on-zinc-full | ESA + RWSE (Edge set attention, Random Walk Structural Encoding, tuned) | Test MAE: 0.0154±0.0001
graph-regression-on-zinc-full | ESA + RWSE (Edge set attention, Random Walk Structural Encoding) | Test MAE: 0.017±0.001
graph-regression-on-zinc-full | ESA + RWSE + CY2C (Edge set attention, Random Walk Structural Encoding, clique adjacency, tuned) | Test MAE: 0.0122±0.0004
graph-regression-on-zinc-full | ESA (Edge set attention, no positional encodings) | Test MAE: 0.027±0.001
molecular-property-prediction-on-esol | ESA (Edge set attention, no positional encodings) | R2: 0.944±0.002; RMSE: 0.485±0.009
molecular-property-prediction-on-freesolv | ESA (Edge set attention, no positional encodings) | R2: 0.977±0.001; RMSE: 0.595±0.013
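Several entries above pair ESA with RWSE (Random Walk Structural Encoding) positional encodings. As a point of reference, a minimal sketch of this kind of encoding is given below: for each node it stacks the k-step random-walk return probabilities, i.e. the diagonal entries of powers of the transition matrix D^{-1}A, for k = 1..K. The dense-matrix implementation and the choice K = 8 are illustrative assumptions suitable only for small graphs, not the exact encoding used in the benchmarks.

```python
# Illustrative sketch of a Random Walk Structural Encoding (RWSE); the dense
# formulation and walk length are assumptions for small toy graphs.
import torch


def random_walk_se(edge_index: torch.Tensor, num_nodes: int, k_max: int = 8) -> torch.Tensor:
    # Build the dense adjacency matrix and the row-normalized transition matrix D^{-1} A.
    adj = torch.zeros(num_nodes, num_nodes)
    adj[edge_index[0], edge_index[1]] = 1.0
    deg = adj.sum(dim=1).clamp(min=1.0)
    transition = adj / deg[:, None]

    # Collect the diagonal (return probabilities) of each power of the transition matrix.
    walks, powers = transition.clone(), []
    for _ in range(k_max):
        powers.append(torch.diagonal(walks))
        walks = walks @ transition
    return torch.stack(powers, dim=1)  # shape: (num_nodes, k_max)


# Example: 4-node cycle; column k holds each node's k-step return probability.
edge_index = torch.tensor([[0, 1, 1, 2, 2, 3, 3, 0], [1, 0, 2, 1, 3, 2, 0, 3]])
print(random_walk_se(edge_index, num_nodes=4))
```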
