
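Abstract
Transformer-based architectures for learning on graphs have developed rapidly in recent years, motivated by the effectiveness of attention as a learning method and by the desire to replace the hand-crafted operators of message passing schemes. However, concerns have been raised about their empirical effectiveness, their scalability, and the complexity of their pre-processing steps, especially in comparison with much simpler graph neural networks that typically perform on par with them across a wide range of benchmarks. To address these shortcomings, we treat graphs as sets of edges and propose a purely attention-based approach consisting of an encoder and an attention pooling mechanism. The encoder vertically interleaves masked and vanilla self-attention modules to learn an effective representation of edges, while also allowing the model to cope with possible misspecifications in the input graphs. Despite its simplicity, the approach outperforms carefully tuned message passing baselines and recently proposed transformer-based methods on more than 70 node-level and graph-level tasks, including challenging long-range benchmarks. Moreover, we demonstrate state-of-the-art performance across different tasks, ranging from molecular graphs to vision graphs and heterophilous node classification. The approach also outperforms graph neural networks and transformers in transfer learning settings, and it scales much better than alternatives with a similar level of performance or expressive power.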
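To make the edge-set view more concrete, the following is a minimal sketch in pure Python, not the authors' implementation. It assumes that the mask in the masked self-attention modules lets two edges attend to each other only when they share an endpoint; the abstract does not spell out the mask, so this is an assumption. The function names (`edge_adjacency_mask`, `self_attention`, `attention_pool`, `encode`) and the `seed` query vector are hypothetical, and the attention uses identity projections instead of learned weights.

```python
import math

def edge_adjacency_mask(edges):
    """mask[i][j] is True when edges i and j share at least one endpoint (assumed masking rule)."""
    n = len(edges)
    return [[bool(set(edges[i]) & set(edges[j])) for j in range(n)] for i in range(n)]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x, mask=None):
    """Single-head dot-product self-attention over edge features.

    Identity projections stand in for learned query/key/value weights.
    When a mask is given, disallowed pairs receive a score of -inf.
    """
    d = len(x[0])
    out = []
    for i, q in enumerate(x):
        scores = []
        for j, k in enumerate(x):
            s = sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
            if mask is not None and not mask[i][j]:
                s = float("-inf")
            scores.append(s)
        w = softmax(scores)
        out.append([sum(w[j] * x[j][c] for j in range(len(x))) for c in range(d)])
    return out

def attention_pool(x, seed):
    """Pool all edge representations into one vector using a (hypothetical) seed query."""
    d = len(seed)
    w = softmax([sum(a * b for a, b in zip(seed, k)) / math.sqrt(d) for k in x])
    return [sum(w[j] * x[j][c] for j in range(len(x))) for c in range(d)]

def encode(x, mask, num_blocks=2):
    """Alternate masked and unmasked (vanilla) self-attention layers."""
    for _ in range(num_blocks):
        x = self_attention(x, mask)   # masked: respects the graph structure
        x = self_attention(x, None)   # vanilla: every edge attends to every edge
    return x

# Path graph 0-1-2-3, with one toy feature vector per edge.
edges = [(0, 1), (1, 2), (2, 3)]
features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
mask = edge_adjacency_mask(edges)
graph_vector = attention_pool(encode(features, mask), seed=[1.0, 1.0, 1.0])
```

In this toy example the first and last edges do not share a node, so they exchange information only through the vanilla layers. This is one way to read the abstract's remark that interleaving the two module types helps with misspecified input graphs.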
Benchmarks
| Benchmark | Method | Metric |
|---|---|---|
| graph-classification-on-cifar10-100k | ESA (Edge set attention, no positional encodings) | Accuracy (%): 75.413±0.248 |
| graph-classification-on-dd | ESA (Edge set attention, no positional encodings) | Accuracy: 83.529±1.743 |
| graph-classification-on-enzymes | ESA (Edge set attention, no positional encodings) | Accuracy: 79.423±1.658 |
| graph-classification-on-imdb-b | ESA (Edge set attention, no positional encodings) | Accuracy: 86.250±0.957 |
| graph-classification-on-malnet-tiny | ESA (Edge set attention, no positional encodings) | Accuracy: 94.800±0.424 MCC: 0.935±0.005 |
| graph-classification-on-mnist | ESA (Edge set attention, no positional encodings) | Accuracy: 98.753±0.041 |
| graph-classification-on-mnist | ESA (Edge set attention, no positional encodings, tuned) | Accuracy: 98.917±0.020 |
| graph-classification-on-nci1 | ESA (Edge set attention, no positional encodings) | Accuracy: 87.835±0.644 |
| graph-classification-on-nci109 | ESA (Edge set attention, no positional encodings) | Accuracy: 84.976±0.551 |
| graph-classification-on-peptides-func | ESA (Edge set attention, no positional encodings, not tuned) | AP: 0.6863±0.0044 |
| graph-classification-on-peptides-func | ESA (Edge set attention, no positional encodings, tuned) | AP: 0.7071±0.0015 |
| graph-classification-on-peptides-func | ESA + RWSE (Edge set attention, Random Walk Structural Encoding, tuned) | AP: 0.7357±0.0036 |
| graph-classification-on-peptides-func | ESA + RWSE (Edge set attention, Random Walk Structural Encoding, + validation set) | AP: 0.7479 |
| graph-classification-on-proteins | ESA (Edge set attention, no positional encodings) | Accuracy: 82.679±0.799 |
| graph-regression-on-esr2 | ESA (Edge set attention, no positional encodings) | R2: 0.697±0.000 RMSE: 0.486±0.697 |
| graph-regression-on-f2 | ESA (Edge set attention, no positional encodings) | R2: 0.891±0.000 RMSE: 0.335±0.891 |
| graph-regression-on-kit | ESA (Edge set attention, no positional encodings) | R2: 0.841±0.000 RMSE: 0.433±0.841 |
| graph-regression-on-lipophilicity | ESA (Edge set attention, no positional encodings) | R2: 0.809±0.008 RMSE: 0.552±0.012 |
| graph-regression-on-parp1 | ESA (Edge set attention, no positional encodings) | R2: 0.925±0.000 RMSE: 0.343±0.925 |
| graph-regression-on-pcqm4mv2-lsc | ESA (Edge set attention, no positional encodings) | Test MAE: N/A Validation MAE: 0.0235 |
| graph-regression-on-peptides-struct | ESA + RWSE (Edge set attention, Random Walk Structural Encoding, tuned) | MAE: 0.2393±0.0004 |
| graph-regression-on-peptides-struct | ESA (Edge set attention, no positional encodings, not tuned) | MAE: 0.2453±0.0003 |
| graph-regression-on-pgr | ESA (Edge set attention, no positional encodings) | R2: 0.725±0.000 RMSE: 0.507±0.725 |
| graph-regression-on-zinc | ESA + rings + NodeRWSE + EdgeRWSE | MAE: 0.051 |
| graph-regression-on-zinc-500k | ESA + rings + NodeRWSE + EdgeRWSE | MAE: 0.051 |
| graph-regression-on-zinc-full | ESA + rings + NodeRWSE + EdgeRWSE | Test MAE: 0.0109±0.0002 |
| graph-regression-on-zinc-full | ESA + RWSE (Edge set attention, Random Walk Structural Encoding, tuned) | Test MAE: 0.0154±0.0001 |
| graph-regression-on-zinc-full | ESA + RWSE (Edge set attention, Random Walk Structural Encoding) | Test MAE: 0.017±0.001 |
| graph-regression-on-zinc-full | ESA + RWSE + CY2C (Edge set attention, Random Walk Structural Encoding, clique adjacency, tuned) | Test MAE: 0.0122±0.0004 |
| graph-regression-on-zinc-full | ESA (Edge set attention, no positional encodings) | Test MAE: 0.027±0.001 |
| molecular-property-prediction-on-esol | ESA (Edge set attention, no positional encodings) | R2: 0.944±0.002 RMSE: 0.485±0.009 |
| molecular-property-prediction-on-freesolv | ESA (Edge set attention, no positional encodings) | R2: 0.977±0.001 RMSE: 0.595±0.013 |