
Abstract
Attention mechanisms have shown great potential in fine-grained visual recognition tasks. In this paper, we present a counterfactual attention learning method based on causal inference to learn more effective attention. Unlike most existing methods that learn visual attention based on conventional likelihood, we propose to learn attention with counterfactual causality, which provides both a tool for measuring attention quality and a powerful supervisory signal to guide the learning process. Specifically, we analyze the effect of the learned visual attention on network prediction through counterfactual intervention, and maximize this effect to encourage the network to learn more useful attention for fine-grained image recognition. We evaluate our method on a wide range of fine-grained recognition tasks where attention plays a crucial role, including fine-grained image categorization, person re-identification, and vehicle re-identification. The consistent improvement on all benchmarks demonstrates the effectiveness of our method. Code is available at https://github.com/raoyongming/CAL.
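The core idea described above, intervening on the learned attention and maximizing its effect on the prediction, can be sketched as a small PyTorch training head. This is a minimal illustration rather than the released implementation: the module name `CounterfactualAttentionHead`, the part count `num_parts`, the attention-pooling layout, and the use of uniform-random counterfactual attention maps are assumptions made for the example; the official code at raoyongming/CAL is the reference.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CounterfactualAttentionHead(nn.Module):
    """Minimal sketch of counterfactual attention learning on top of a CNN backbone.

    The same classifier is applied twice: once with the learned attention maps
    and once with counterfactual (random) attention maps. Training then
    maximizes the effect of the learned attention, i.e. the gap between the
    factual and counterfactual predictions.
    """

    def __init__(self, in_channels: int, num_classes: int, num_parts: int = 32):
        super().__init__()
        # 1x1 conv producing one attention map per "part" (hypothetical layout).
        self.attention = nn.Conv2d(in_channels, num_parts, kernel_size=1)
        self.classifier = nn.Linear(in_channels * num_parts, num_classes)

    def _attend_and_classify(self, features: torch.Tensor, attn: torch.Tensor) -> torch.Tensor:
        # Attention pooling: weight backbone features by each attention map,
        # average over spatial locations, then classify the pooled descriptor.
        b, c, h, w = features.shape
        pooled = torch.einsum('bchw,bmhw->bmc', features, attn) / (h * w)
        return self.classifier(pooled.flatten(1))

    def forward(self, features: torch.Tensor):
        attn = torch.relu(self.attention(features))              # learned attention A
        logits = self._attend_and_classify(features, attn)       # prediction Y(A)

        # Counterfactual intervention: substitute random attention maps for A.
        attn_cf = torch.rand_like(attn)
        logits_cf = self._attend_and_classify(features, attn_cf)  # prediction Y(A_bar)

        effect = logits - logits_cf  # effect of the learned attention on the prediction
        return logits, effect


def cal_loss(logits: torch.Tensor, effect: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    # Classification loss plus a term that encourages a large, class-correct
    # attention effect (the counterfactual supervisory signal).
    return F.cross_entropy(logits, targets) + F.cross_entropy(effect, targets)
```

Supervising the effect logits with the same class labels pushes the learned attention to carry information the random attention cannot, which is one way to read "maximizing the effect" in the abstract.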
Code Repositories
raoyongming/CAL
Official
pytorch
Mentioned in GitHub
Benchmarks
| Benchmark | Method | Metrics |
|---|---|---|
| few-shot-learning-on-dtd | CAL | 4-shot Accuracy: 40.9, 8-shot Accuracy: 50.4, 12-shot Accuracy: 54.6, 16-shot Accuracy: 57.4 |
| few-shot-learning-on-fgvc-aircraft-1 | CAL | 4-shot Accuracy: 35.2, 8-shot Accuracy: 55.4, 12-shot Accuracy: 67.6, 16-shot Accuracy: 74.3, Harmonic mean: 35.2 |
| few-shot-learning-on-stanford-cars | CAL | 4-shot Accuracy: 42.2, 8-shot Accuracy: 71.8, 12-shot Accuracy: 82.9, 16-shot Accuracy: 88.9 |
| fine-grained-image-classification-on-cub-200-1 | CAL | Accuracy: 90.6 |
| fine-grained-image-classification-on-fgvc | CAL | Accuracy: 94.2 |
| fine-grained-image-classification-on-stanford | CAL | Accuracy: 95.5% |
| mitigating-contextual-bias-on-fgvc-aircraft | CAL | OOD Accuracy (%): 10.2, Top-1 Accuracy (%): 71.0 |
| mitigating-contextual-bias-on-fgvc-aircraft | CAL + ALIA | OOD Accuracy (%): 25.1, Top-1 Accuracy (%): 71.8 |
| person-re-identification-on-dukemtmc-reid | CAL | Rank-1: 90, mAP: 80.5 |
| person-re-identification-on-market-1501 | CAL | Rank-1: 95.5, mAP: 89.5 |
| person-re-identification-on-msmt17 | CAL (ResNet50) | Rank-1: 84.2, mAP: 64 |
| vehicle-re-identification-on-vehicleid-large | CAL | Rank-1: 75.1, mAP: 80.9 |
| vehicle-re-identification-on-vehicleid-medium | CAL | Rank-1: 78.2, mAP: 83.8 |
| vehicle-re-identification-on-vehicleid-small | CAL | Rank-1: 82.5, mAP: 87.8 |
| vehicle-re-identification-on-veri-776 | CAL | Rank-1: 95.4, Rank-5: 97.9, mAP: 74.3 |