The Split Matters: Flat Minima Methods for Improving the Performance of GNNs

Nicolas Lell, Ansgar Scherp

Abstract

When training a neural network, it is optimized using the available training data with the hope that it generalizes well to new or unseen test data. At the same absolute loss value, a flat minimum in the loss landscape is presumed to generalize better than a sharp minimum. Methods for finding flat minima have mostly been studied for independent and identically distributed (i.i.d.) data such as images. Graphs are inherently non-i.i.d. since their vertices are connected by edges. We investigate flat minima methods, and combinations of those methods, for training graph neural networks (GNNs). We use GCN and GAT, and extend Graph-MLP to work with more layers and larger graphs. We conduct experiments on small and large citation, co-purchase, and protein datasets with different train-test splits, in both the transductive and inductive training procedures. Results show that flat minima methods can improve the performance of GNN models by over 2 points if the train-test split is randomized. Following Shchur et al., randomized splits are essential for a fair evaluation of GNNs, as other (fixed) splits like 'Planetoid' are biased. Overall, we provide important insights for improving and fairly evaluating flat minima methods on GNNs. We recommend that practitioners always use weight averaging techniques, in particular EWA when using early stopping. While weight averaging techniques are only sometimes the best-performing method, they are less sensitive to hyperparameters, require no additional training, and keep the original model unchanged. All source code is available at https://github.com/Foisunt/FMMs-in-GNNs.
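
As a minimal sketch of the recommended weight-averaging techniques, the snippet below maintains an SWA copy (uniform average over the tail of training) and an EWA copy (exponential moving average) using PyTorch's `swa_utils`. The toy linear model, synthetic data, decay of 0.99, and epoch thresholds are illustrative assumptions, not the authors' setup; see the linked repository for their actual code.

```python
# Hedged sketch of SWA and EWA weight averaging with PyTorch's swa_utils.
# The stand-in linear model and all hyperparameters are assumptions.
import torch
from torch import nn
from torch.optim.swa_utils import AveragedModel

torch.manual_seed(0)
X, y = torch.randn(128, 16), torch.randn(128, 1)  # synthetic data
model = nn.Linear(16, 1)                          # stand-in for a GNN
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

# SWA: uniform running average of the weights (default avg_fn).
swa_model = AveragedModel(model)

# EWA: exponential moving average via a custom avg_fn; decay is illustrative.
def ema_avg(avg_p, p, num_averaged, decay=0.99):
    return decay * avg_p + (1.0 - decay) * p

ewa_model = AveragedModel(model, avg_fn=ema_avg)

for epoch in range(100):
    optimizer.zero_grad()
    loss_fn(model(X), y).backward()
    optimizer.step()
    ewa_model.update_parameters(model)   # EWA: update after every epoch
    if epoch >= 75:                      # SWA: average only the tail epochs
        swa_model.update_parameters(model)

with torch.no_grad():
    print("SWA loss:", loss_fn(swa_model(X), y).item())
    print("EWA loss:", loss_fn(ewa_model(X), y).item())
```

Because both averages live in separate copies, the underlying model stays unchanged, which is part of why the abstract recommends weight averaging as a low-cost default; EWA's recency-weighted average is the natural fit when early stopping determines the final epoch.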

Code Repositories

foisunt/fmms-in-gnns (official, PyTorch)

Benchmarks

| Benchmark | Methodology | Metrics |
|---|---|---|
| node-classification-on-citeseer | Graph-MLP + SWA | Accuracy: 77.99 ± 1.57% |
| node-classification-on-citeseer-with-public | Graph-MLP + PGN | Accuracy: 74.73 ± 0.6% |
| node-classification-on-cora | GAT + SWA | Accuracy: 88.66 ± 1.38% |
| node-classification-on-cora-with-public-split | GAT + PGN | Accuracy: 83.26 ± 0.69% |
| node-classification-on-ppi | GCN + SAF | F1: 99.38 ± 0.01% |
| node-classification-on-ppi | GAT + PGN | F1: 99.34 ± 0.02% |
| node-classification-on-pubmed | Graph-MLP + SAF | Accuracy: 90.64 ± 0.46% |
| node-classification-on-pubmed-60-20-20-random | Graph-MLP + SAF | 1:1 Accuracy: 90.64 ± 0.46% |
| node-classification-on-pubmed-with-public | Graph-MLP + ASAM | Accuracy: 82.60 ± 0.80% |
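
The ASAM entries in the table are an adaptive variant of sharpness-aware minimization (SAM), and SAF and PGN build on related sharpness ideas: each update first perturbs the weights toward the locally worst case within a small radius, then takes the gradient step from that perturbed point. Below is a minimal sketch of the base SAM two-pass update under assumed settings (rho = 0.05, a toy linear model, and a hypothetical `sam_step` helper), not the authors' implementation:

```python
# Hedged sketch of a base SAM update step; rho and the toy problem are
# illustrative assumptions, not the paper's configuration.
import torch

def sam_step(model, loss_closure, optimizer, rho=0.05):
    # First pass: gradients at the current weights.
    loss = loss_closure(model)
    loss.backward()
    grads = [p.grad for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack([g.norm() for g in grads]))
    # Perturb weights to the (approximate) worst case inside a rho-ball.
    perturbations = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            perturbations.append((p, e))
    optimizer.zero_grad()
    # Second pass: "sharpness-aware" gradients at the perturbed point.
    loss_closure(model).backward()
    with torch.no_grad():
        for p, e in perturbations:  # undo perturbation before the real update
            p.sub_(e)
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()

# Example usage on a toy regression problem (illustrative only):
net = torch.nn.Linear(8, 1)
X, y = torch.randn(64, 8), torch.randn(64, 1)
opt = torch.optim.SGD(net.parameters(), lr=0.1)
closure = lambda m: torch.nn.functional.mse_loss(m(X), y)
for _ in range(20):
    sam_step(net, closure, opt)
```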
