Graph Inductive Biases in Transformers without Message Passing

Liheng Ma, Chen Lin, Derek Lim, Adriana Romero-Soriano, Puneet K. Dokania, Mark Coates, Philip Torr, Ser-Nam Lim

Abstract

Transformers for graph data are increasingly widely studied and successful in numerous learning tasks. Graph inductive biases are crucial for Graph Transformers, and previous works incorporate them using message-passing modules and/or positional encodings. However, Graph Transformers that use message-passing inherit known issues of message-passing, and differ significantly from Transformers used in other domains, thus making transfer of research advances more difficult. On the other hand, Graph Transformers without message-passing often perform poorly on smaller datasets, where inductive biases are more crucial. To bridge this gap, we propose the Graph Inductive bias Transformer (GRIT) -- a new Graph Transformer that incorporates graph inductive biases without using message passing. GRIT is based on several architectural changes that are each theoretically and empirically justified, including: learned relative positional encodings initialized with random walk probabilities, a flexible attention mechanism that updates node and node-pair representations, and injection of degree information in each layer. We prove that GRIT is expressive -- it can express shortest path distances and various graph propagation matrices. GRIT achieves state-of-the-art empirical performance across a variety of graph datasets, thus showing the power that Graph Transformers without message-passing can deliver.
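The abstract mentions relative positional encodings initialized with random walk probabilities. As an illustrative sketch (not the paper's official implementation; the function name and the choice of K are assumptions), the initialization can be thought of as stacking powers of the row-normalized random-walk matrix M = D^-1 A, so each node pair (i, j) receives a K-dimensional feature [I, M, M^2, ..., M^(K-1)][i, j]:

```python
import numpy as np

def random_walk_pair_encoding(adj: np.ndarray, k: int = 4) -> np.ndarray:
    """Sketch of a random-walk-based relative positional encoding.

    Returns an (n, n, k) array whose [i, j] slice holds
    [I, M, M^2, ..., M^(k-1)][i, j], where M = D^-1 A is the
    random-walk transition matrix of the graph.
    """
    n = adj.shape[0]
    deg = adj.sum(axis=1, keepdims=True)
    m = adj / np.clip(deg, 1, None)      # row-normalized transition matrix
    powers = [np.eye(n)]                 # 0-step walk: identity channel
    for _ in range(k - 1):
        powers.append(powers[-1] @ m)    # (t+1)-step walk probabilities
    return np.stack(powers, axis=-1)

# Example: 4-node path graph 0-1-2-3
a = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
pe = random_walk_pair_encoding(a, k=3)   # shape (4, 4, 3)
```

The diagonal of the last channel encodes return probabilities, which is one way such encodings carry structural (e.g. cycle) information; in the paper these initial values are further transformed by learned layers.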

Code Repositories

liamma/grit (official, PyTorch)
linusbao/MoSE (PyTorch, mentioned in GitHub)

Benchmarks

Benchmark                               | Method | Metric
graph-classification-on-cifar10-100k    | GRIT   | Accuracy (%): 76.468
graph-classification-on-mnist           | GRIT   | Accuracy: 98.108
graph-classification-on-peptides-func   | GRIT   | AP: 0.6988±0.0082
graph-regression-on-pcqm4mv2-lsc        | GRIT   | Validation MAE: 0.0859
graph-regression-on-peptides-struct     | GRIT   | MAE: 0.2460±0.0012
graph-regression-on-zinc                | GRIT   | MAE: 0.059
graph-regression-on-zinc-500k           | GRIT   | MAE: 0.059
graph-regression-on-zinc-full           | GRIT   | Test MAE: 0.023
node-classification-on-cluster          | GRIT   | Accuracy: 80.026
node-classification-on-pattern          | GRIT   | Accuracy: 87.196
