Hanxiao Liu; Karen Simonyan; Yiming Yang

Abstract
This paper addresses the scalability challenge of architecture search by formulating the task in a differentiable manner. Unlike conventional approaches of applying evolution or reinforcement learning over a discrete and non-differentiable search space, our method is based on the continuous relaxation of the architecture representation, allowing efficient search of the architecture using gradient descent. Extensive experiments on CIFAR-10, ImageNet, Penn Treebank and WikiText-2 show that our algorithm excels in discovering high-performance convolutional architectures for image classification and recurrent architectures for language modeling, while being orders of magnitude faster than state-of-the-art non-differentiable techniques. Our implementation has been made publicly available to facilitate further research on efficient architecture search algorithms.
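The continuous relaxation described in the abstract can be illustrated with a toy example. The sketch below assumes the relaxation takes the form of a softmax-weighted mixture over candidate operations, with one architecture parameter per operation; the abstract itself does not spell out this form. The operation set, the scalar input, the squared-error loss, and all function names are hypothetical and chosen only for illustration. It also omits network weights entirely, so it should be read as a minimal sketch of the idea rather than the authors' implementation.

```python
import math

# Hypothetical candidate operations acting on a scalar input (illustrative only).
CANDIDATE_OPS = {
    "identity": lambda x: x,
    "double": lambda x: 2.0 * x,
    "zero": lambda x: 0.0,
}


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_op(x, alphas):
    """Relaxed choice: a softmax-weighted sum of every candidate's output."""
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, CANDIDATE_OPS.values()))


def grad_alphas(x, alphas, target):
    """Analytic gradient of 0.5 * (mixed_op(x) - target)^2 w.r.t. alphas."""
    weights = softmax(alphas)
    outputs = [op(x) for op in CANDIDATE_OPS.values()]
    y = sum(w * o for w, o in zip(weights, outputs))
    error = y - target
    # d(mixed_op)/d(alpha_i) = w_i * (o_i - y) for a softmax mixture.
    return [error * w * (o - y) for w, o in zip(weights, outputs)]


def search(x, target, steps=200, lr=0.5):
    """Plain gradient descent on the architecture parameters."""
    alphas = [0.0] * len(CANDIDATE_OPS)
    for _ in range(steps):
        grads = grad_alphas(x, alphas, target)
        alphas = [a - lr * g for a, g in zip(alphas, grads)]
    return alphas


def discretize(alphas):
    """Recover a discrete choice by keeping the highest-weighted operation."""
    names = list(CANDIDATE_OPS)
    best = max(range(len(alphas)), key=lambda i: alphas[i])
    return names[best]


if __name__ == "__main__":
    learned = search(x=1.0, target=2.0)
    print(discretize(learned), softmax(learned))
```

Because the mixture is differentiable in the architecture parameters, the discrete choice among operations becomes an ordinary gradient-descent problem, and a discrete architecture is read off at the end by keeping the operation with the largest parameter.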
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| architecture-search-on-cifar-10-image | DARTS + c/o | Params: 3.4M; Percentage error: 2.83; Search time (GPU days): 4 |
| language-modelling-on-penn-treebank-word | Differentiable NAS | Params: 23M; Test perplexity: 56.1; Validation perplexity: 58.3 |
| neural-architecture-search-on-cifar-10 | DARTS (second order) | Params: 3.3M; Search time (GPU days): 4; Top-1 error rate: 2.76% |
| neural-architecture-search-on-cifar-10 | DARTS (first order) | Params: 3.3M; Search time (GPU days): 1.5; Top-1 error rate: 3% |
| neural-architecture-search-on-imagenet | DARTS | Accuracy: 73.3%; MACs: 595M; Params: 4.9M; Top-1 error rate: 26.7% |
| neural-architecture-search-on-nas-bench-201 | DARTS (second order) | Accuracy (test): 16.43%; Search time (s): 29902 |
| neural-architecture-search-on-nas-bench-201 | DARTS (first order) | Accuracy (test): 16.43%; Search time (s): 10890 |