How does topology of neural architectures impact gradient propagation and model performance?

Radu Marculescu, Guihong Li, Kartikeya Bhardwaj


Abstract

DenseNets introduce concatenation-type skip connections that achieve state-of-the-art accuracy in several computer vision tasks. In this paper, we reveal that the topology of these concatenation-type skip connections is closely related to gradient propagation, which, in turn, enables a predictable behavior of DNNs’ test performance. To this end, we introduce a new metric called NN-Mass to quantify how effectively information flows through DNNs. Moreover, we empirically show that NN-Mass also works for other types of skip connections, e.g., for ResNets, Wide-ResNets (WRNs), and MobileNets, which contain addition-type skip connections (i.e., residuals or inverted residuals). As such, for both DenseNet-like CNNs and ResNets/WRNs/MobileNets, our theoretically grounded NN-Mass can identify models with similar accuracy despite having significantly different size/compute requirements. Detailed experiments on both synthetic and real datasets (e.g., MNIST, CIFAR-10, CIFAR-100, ImageNet) provide extensive evidence for our insights. Finally, the closed-form equation of our NN-Mass enables us to design significantly compressed DenseNets (for CIFAR-10) and MobileNets (for ImageNet) directly at initialization, without time-consuming training and/or searching.
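The abstract does not reproduce the closed-form NN-Mass equation itself, so the following is only a hedged, minimal sketch of the underlying idea: scoring a DenseNet-style cell purely from its topology by counting how many concatenation-type skip-connection links each layer can receive. The function name `skip_connection_score` and the parameters `depth`, `width`, and `max_skip_candidates` are illustrative assumptions, and the normalization is a stand-in rather than the paper's actual NN-Mass definition.

```python
# Hedged sketch: a topology-only score for a DenseNet-style cell.
# NOTE: this is NOT the paper's exact NN-Mass formula (the abstract only
# states that a closed-form expression exists); it is an illustrative proxy
# that, like NN-Mass, depends only on depth, width, and how many
# concatenation-type skip connections each layer can receive.

def skip_connection_score(depth: int, width: int, max_skip_candidates: int) -> float:
    """Normalized count of skip-connection links available in one cell.

    depth:               number of layers in the cell
    width:               channels (activations) produced per layer
    max_skip_candidates: cap on how many earlier activations a layer may
                         receive via concatenation-type skip connections
    """
    total_links = 0
    for layer in range(2, depth):
        # Activations from all layers strictly before the immediately
        # preceding one are candidates for long-range concatenation.
        available = width * (layer - 1)
        total_links += min(max_skip_candidates, available)
    # Normalize by the cell's size so the score reflects topology rather
    # than raw parameter count.
    return total_links / max(1, depth * width)


if __name__ == "__main__":
    # Two cells of very different width can share the same score, mirroring
    # the paper's observation that differently sized models with similar
    # NN-Mass reach similar accuracy.
    print(skip_connection_score(depth=16, width=8,  max_skip_candidates=20))   # 2.0625
    print(skip_connection_score(depth=16, width=16, max_skip_candidates=40))   # 2.0625
```

Note how doubling both the width and the skip budget leaves the normalized score unchanged, which mirrors the abstract's claim that models with very different size/compute requirements can reach similar accuracy when their topological score matches.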

Benchmarks

Benchmark: neural-architecture-search-on-cifar-10

Methodology: NN-MASS-CIFAR-C
FLOPs: 1.2G
Parameters: 3.82M
Search Time (GPU days): 0
Top-1 Error Rate: 3.18%

Methodology: NN-MASS-CIFAR-A
FLOPs: 1.95G
Parameters: 5.02M
Search Time (GPU days): 0
Top-1 Error Rate: 3.0%

Benchmark: neural-architecture-search-on-imagenet

Methodology: NN-MASS-B
Accuracy: 73.3%
FLOPs: 393M
MACs: 393M
Params: 3.7M
Top-1 Error Rate: 26.7%

Methodology: NN-MASS-A
Accuracy: 72.9%
FLOPs: 200M
MACs: 200M
Params: 2.3M
Top-1 Error Rate: 27.1%
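
The "Search Time (GPU days): 0" entries above reflect the abstract's claim that compressed models can be designed directly at initialization from the closed-form NN-Mass, with no training or architecture search. Below is a hedged sketch of that workflow using the same illustrative proxy as in the previous block; the candidate configurations, the target score, and the helper names (`parameter_proxy`, `pick_compressed_cell`) are made-up examples, not the paper's actual procedure or numbers.

```python
# Hedged sketch of "design at initialization": rank candidate cell
# configurations by a topology score and keep the cheapest one whose score
# matches a target, with zero training and zero GPU search time.
# The score below is the same illustrative proxy as in the previous sketch,
# NOT the paper's actual NN-Mass equation.

def skip_connection_score(depth: int, width: int, max_skips: int) -> float:
    links = sum(min(max_skips, width * (i - 1)) for i in range(2, depth))
    return links / max(1, depth * width)

def parameter_proxy(depth: int, width: int) -> int:
    # Rough stand-in for model size (ignores kernel sizes, strides, etc.).
    return depth * width * width

# Hypothetical candidate cells: (depth, width, max_skips).
CANDIDATES = [
    (16, 32, 80),
    (16, 16, 40),
    (16, 8, 20),
]

def pick_compressed_cell(target_score: float, tolerance: float = 0.1):
    """Return the smallest candidate whose topology score is near the target."""
    feasible = [cfg for cfg in CANDIDATES
                if abs(skip_connection_score(*cfg) - target_score) <= tolerance]
    return min(feasible, key=lambda cfg: parameter_proxy(cfg[0], cfg[1]), default=None)

if __name__ == "__main__":
    # All three candidates share (roughly) the same topology score, so the
    # selection keeps the cheapest one; no training, no GPU-days of search.
    print(pick_compressed_cell(target_score=2.06))  # (16, 8, 20)
```

This is only meant to illustrate why the reported search time can be zero: with a closed-form, topology-only score, candidate architectures can be compared before any weights are trained.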
