DARTS: Differentiable Architecture Search. This paper addresses the scalability challenge of architecture search by formulating the task in a differentiable manner. Unlike conventional approaches of applying evolution or reinforcement learning over a discrete and non-differentiable search space, our method is based on the continuous relaxation of the architecture representation, allowing efficient search of the architecture using gradient descent. Extensive experiments on CIFAR-10, ImageNet, Penn Treebank and WikiText-2 show that our algorithm excels in discovering high-performance convolutional architectures for image classification and recurrent architectures for language modeling, while being orders of magnitude faster than state-of-the-art non-differentiable techniques. Our implementation has been made publicly available to facilitate further research on efficient architecture search algorithms.
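The core idea of the continuous relaxation can be sketched in a few lines of code. The snippet below assumes PyTorch and uses illustrative names (CANDIDATE_OPS, MixedOp, search_step) rather than the authors' released implementation: it replaces the discrete choice of operation on an edge with a softmax-weighted mixture parameterized by architecture variables alpha, and alternates gradient steps on alpha (validation loss) and the ordinary weights (training loss), in the spirit of the first-order variant of the method.

```python
# Minimal sketch of the continuous relaxation behind differentiable architecture
# search, assuming PyTorch. All names here are illustrative, not the released code.
import torch
import torch.nn as nn
import torch.nn.functional as F

# A small, illustrative set of candidate operations for one edge.
CANDIDATE_OPS = [
    lambda C: nn.Conv2d(C, C, 3, padding=1, bias=False),
    lambda C: nn.MaxPool2d(3, stride=1, padding=1),
    lambda C: nn.Identity(),
]

class MixedOp(nn.Module):
    """Edge whose output is a softmax-weighted sum over all candidate ops."""
    def __init__(self, channels):
        super().__init__()
        self.ops = nn.ModuleList([make(channels) for make in CANDIDATE_OPS])
        # Architecture parameters (alpha): one logit per candidate operation.
        self.alpha = nn.Parameter(1e-3 * torch.randn(len(self.ops)))

    def forward(self, x):
        weights = F.softmax(self.alpha, dim=0)  # continuous relaxation of the discrete choice
        return sum(w * op(x) for w, op in zip(weights, self.ops))

def search_step(model, w_opt, a_opt, train_batch, val_batch, loss_fn):
    """One alternating (first-order) update: alpha on validation data, w on training data."""
    x_tr, y_tr = train_batch
    x_va, y_va = val_batch
    # 1) Update architecture parameters alpha to reduce the validation loss.
    a_opt.zero_grad()
    loss_fn(model(x_va), y_va).backward()
    a_opt.step()
    # 2) Update the ordinary network weights w to reduce the training loss.
    w_opt.zero_grad()
    loss_fn(model(x_tr), y_tr).backward()
    w_opt.step()
```

In such a sketch, the two optimizers would be built over disjoint parameter groups (one over the alpha tensors, one over everything else); after the search, each edge keeps only the operation with the largest alpha, recovering a discrete architecture.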
References in zbMATH (referenced in 4 articles)
- Chen, Yiming; Pan, Tianci; He, Cheng; Cheng, Ran: Efficient evolutionary deep neural architecture search (NAS) by noisy network morphism mutation (2020)
- Gu, Xue; Meng, Ziyao; Liang, Yanchun; Xu, Dong; Huang, Han; Han, Xiaosong; et al.: ESAE: evolutionary strategy-based architecture evolution (2020)
- Kandasamy, Kirthevasan; Vysyaraju, Karun Raju; Neiswanger, Willie; Paria, Biswajit; Collins, Christopher R.; Schneider, Jeff; Poczos, Barnabas; Xing, Eric P.: Tuning hyperparameters without grad students: scalable and robust Bayesian optimisation with Dragonfly (2020)
- Dai, Zihang; Yang, Zhilin; Yang, Yiming; Carbonell, Jaime; Le, Quoc V.; Salakhutdinov, Ruslan: Transformer-XL: attentive language models beyond a fixed-length context (2019) arXiv