Abstract
Nonlinear activation functions are among the core building blocks of deep neural network architectures. The choice of activation function significantly affects a model's speed, performance, and ability to converge. The most commonly used activation functions today typically contain no trainable parameters and remain fixed during training. This paper proposes a family of novel activation functions, both with and without trainable parameters, each with its own advantages and drawbacks. We evaluate the performance of these activation functions and compare the results against the widely used ReLU activation function. We hypothesize that activation functions with trainable parameters may outperform their parameter-free counterparts, because trainable parameters allow the model to "choose" the type of activation used at each layer. However, this advantage depends heavily on the specific architecture of the deep neural network and on the properties of the activation function itself.
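The abstract does not specify the exact functional form of the proposed activations, but the idea of a per-layer trainable activation can be sketched as a learnable blend of fixed nonlinearities. The blend form `a*relu(x) + b*tanh(x)` and the class name below are illustrative assumptions, not the paper's actual method:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

class TrainableActivation:
    """Hypothetical trainable activation: f(x) = a*relu(x) + b*tanh(x).

    The scalars a and b are learned per layer, so gradient descent can
    shift each layer toward whichever nonlinearity helps most.
    """
    def __init__(self, a=1.0, b=0.0):
        self.a = a  # weight on the ReLU branch
        self.b = b  # weight on the tanh branch

    def __call__(self, x):
        return self.a * relu(x) + self.b * np.tanh(x)

    def grads(self, x, upstream):
        # Gradients of the loss w.r.t. a and b, given the upstream
        # gradient flowing into this activation.
        return (upstream * relu(x)).sum(), (upstream * np.tanh(x)).sum()

# With a=1, b=0 the activation reduces to a plain ReLU.
act = TrainableActivation(a=1.0, b=0.0)
x = np.array([-1.0, 0.5, 2.0])
print(act(x))  # → [0.  0.5 2. ]
```

During training, `a` and `b` would be updated alongside the layer weights, which is what lets the model "choose" its activation shape per layer.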
Benchmarks
| Benchmark | Method | Metric |
|---|---|---|
| image-classification-on-cifar-10 | ResNet-44 (Trainable Activations) | Percentage correct: 90.5 |
| image-classification-on-cifar-10 | ResNet-56 (Trainable Activations) | Percentage correct: 88.8 |
| image-classification-on-cifar-10 | ResNet-8 (Trainable Activations) | Percentage correct: 86.5 |
| image-classification-on-cifar-10 | ResNet-32 (Trainable Activations) | Percentage correct: 90.9 |
| image-classification-on-cifar-10 | ResNet-14 (Trainable Activations) | Percentage correct: 89.0 |
| image-classification-on-cifar-10 | ResNet-26 (Trainable Activations) | Percentage correct: 91.1 |
| image-classification-on-cifar-10 | ResNet-20 (Trainable Activations) | Percentage correct: 90.4 |
| image-classification-on-mnist | DNN-3 (Trainable Activations) | Accuracy: 97.0, Percentage error: 3.0, Trainable Parameters: 386719 |
| image-classification-on-mnist | DNN-2 (Trainable Activations) | Accuracy: 96.4, Percentage error: 3.6, Trainable Parameters: 311651 |
| image-classification-on-mnist | DNN-5 (Trainable Activations) | Accuracy: 97.2, Percentage error: 2.8, Trainable Parameters: 575051 |