
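Abstract
We propose WaveMix, a novel neural architecture for computer vision that is resource-efficient yet generalizable and scalable. While using fewer trainable parameters, less GPU RAM, and fewer computations, WaveMix networks achieve accuracy comparable to or better than state-of-the-art convolutional neural networks, vision transformers, and token mixers on several tasks. This efficiency can translate into savings in time, cost, and energy. To achieve these gains, we use a multi-level two-dimensional discrete wavelet transform (2D-DWT) in WaveMix blocks, which has the following advantages: (1) it reorganizes spatial information based on three strong image priors (scale invariance, shift invariance, and sparseness of edges); (2) it does so losslessly and without adding parameters; (3) it also reduces the spatial size of feature maps, which lowers the memory and time required for the forward and backward passes; and (4) it expands the receptive field faster than convolutions do. The whole architecture is a stack of self-similar, resolution-preserving WaveMix blocks, which allows architectural flexibility for various tasks and levels of resource availability. WaveMix establishes new benchmarks for segmentation on Cityscapes and for classification on Galaxy 10 DECals, Places-365, five EMNIST datasets, and iNAT-mini, and it performs competitively on other benchmarks. Our code and trained models are publicly available.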
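The lossless, parameter-free, size-halving properties claimed for the 2D-DWT can be checked with a minimal sketch. It assumes the orthonormal Haar wavelet, which the abstract does not specify, and it uses plain Python lists rather than the tensors a real WaveMix block would operate on. The helper names `haar_dwt2` and `haar_idwt2` are hypothetical and are not taken from the WaveMix repository; sub-band naming (`lh` vs. `hl`) also varies between libraries.

```python
def haar_dwt2(x):
    """One level of a 2D Haar DWT on a list-of-lists with even height and width.

    Returns four sub-bands (ll, lh, hl, hh), each with half the height and
    half the width of x. No learnable parameters are involved.
    """
    h, w = len(x), len(x[0])
    assert h % 2 == 0 and w % 2 == 0, "height and width must be even"
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, h, 2):
        r_ll, r_lh, r_hl, r_hh = [], [], [], []
        for j in range(0, w, 2):
            # 2x2 neighbourhood: a b / c d
            a, b = x[i][j], x[i][j + 1]
            c, d = x[i + 1][j], x[i + 1][j + 1]
            r_ll.append((a + b + c + d) / 2)  # local average (approximation)
            r_lh.append((a + b - c - d) / 2)  # top-minus-bottom detail
            r_hl.append((a - b + c - d) / 2)  # left-minus-right detail
            r_hh.append((a - b - c + d) / 2)  # diagonal detail
        ll.append(r_ll)
        lh.append(r_lh)
        hl.append(r_hl)
        hh.append(r_hh)
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuilds the full-resolution array exactly."""
    h, w = len(ll), len(ll[0])
    x = [[0.0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            s, v, u, t = ll[i][j], lh[i][j], hl[i][j], hh[i][j]
            x[2 * i][2 * j] = (s + v + u + t) / 2
            x[2 * i][2 * j + 1] = (s + v - u - t) / 2
            x[2 * i + 1][2 * j] = (s - v + u - t) / 2
            x[2 * i + 1][2 * j + 1] = (s - v - u + t) / 2
    return x


# Toy 4x4 "feature map" with values 0..15.
image = [[float(4 * r + c) for c in range(4)] for r in range(4)]

# Level 1: four 2x2 sub-bands hold the same 16 values' worth of information.
ll, lh, hl, hh = haar_dwt2(image)
restored = haar_idwt2(ll, lh, hl, hh)

# Level 2: transforming the approximation band again halves the size once more,
# so one output value now summarises the whole 4x4 input.
ll2, lh2, hl2, hh2 = haar_dwt2(ll)

print(ll)                  # [[5.0, 9.0], [21.0, 25.0]]
print(restored == image)   # True
print(ll2)                 # [[30.0]]
```

Each level halves the height and width while keeping the total number of coefficients constant, which is why the transform can be inverted exactly. After two levels a single coefficient depends on a 4x4 input region, illustrating how stacked levels widen the receptive field quickly.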
Code repositories
pranavphoenix/WaveMix
Official
PyTorch
Benchmarks
| Benchmark | Method | Metric |
|---|---|---|
| image-classification-on-caltech-256 | WaveMixLite-256/7 | Accuracy: 54.62 |
| image-classification-on-cifar-10 | WaveMixLite-144/7 | Percentage correct: 97.29 |
| image-classification-on-cifar-100 | WaveMix-Lite-256/7 | Percentage correct: 70.20 |
| image-classification-on-cifar-100 | WaveMixLite-256/7 | Percentage correct: 85.09 |
| image-classification-on-emnist-balanced | WaveMixLite-128/7 | Accuracy: 91.06 |
| image-classification-on-emnist-byclass | WaveMixLite-128/7 | Accuracy: 88.43 |
| image-classification-on-emnist-bymerge | WaveMixLite-128/16 | Accuracy: 91.80 |
| image-classification-on-emnist-digits | WaveMixLite-112/16 | Accuracy (%): 99.82 |
| image-classification-on-emnist-letters | WaveMixLite-112/16 | Accuracy: 95.96 |
| image-classification-on-fashion-mnist | WaveMixLite | Percentage error: 5.68 |
| image-classification-on-galaxy10-decals | WaveMix | Params (M): 28; Top-1 Accuracy (%): 95.42 |
| image-classification-on-imagenet | WaveMix-192/16 (level 3) | Top 1 Accuracy: 74.93% |
| image-classification-on-inat2021-mini | WaveMix-256/16 (level 2) | Top 1 Accuracy: 61.75 |
| image-classification-on-mnist-1 | WaveMixLite | Percentage error: 0.25 |
| image-classification-on-places365-standard | WaveMix-240/12 (level 4) | Top 1 Accuracy: 56.45 |
| image-classification-on-stl-10 | WaveMixLite-256/7 | Percentage correct: 70.88 |
| image-classification-on-svhn | WaveMixLite-144/15 | Percentage error: 1.27 |
| image-classification-on-tiny-imagenet-1 | WaveMixLite-144/7 | Validation Acc: 77.47% |
| scene-classification-on-places365-standard | WaveMix | Top 1 Error: 43.55 |
| semantic-segmentation-on-cityscapes-val | WaveMix-256/16 (Level-4) | mIoU: 82.60 |
| semantic-segmentation-on-cityscapes-val | WaveMix | mIoU: 82.7 |