Aditya Malik, Nalini Ratha, et al.
CAI 2024
In neural architecture search (NAS), training every sampled architecture is very time-consuming and should be avoided. Weight-sharing is a promising solution to speed up the evaluation process. However, training the supernetwork incurs many discrepancies between the actual ranking and the predicted one. Additionally, efficient deep-learning engineering processes require incorporating realistic hardware-performance metrics into the NAS evaluation process, also known as hardware-aware NAS (HW-NAS). In HW-NAS, estimating task-specific performance and hardware efficiency are both required. This paper proposes a supernetwork training methodology that preserves the Pareto ranking between its different subnetworks resulting in more efficient and accurate neural networks for a variety of hardware platforms. The results show a 97% near Pareto front approximation in less than 2 GPU days of search, which provides 2x speed up compared to state-of-the-art methods. We validate our methodology on NAS-Bench-201, DARTS, and ImageNet. Our optimal model achieves 77.2% accuracy (+1.7% compared to baseline) with an inference time of 3.68ms on Edge GPU for ImageNet, which yields a 2.3x speedup. Training implementation can be found at https://github.com/IHIaadj/PRP-NAS.
Aditya Malik, Nalini Ratha, et al.
CAI 2024
Leonid Karlinsky, Joseph Shtok, et al.
CVPR 2019
Erik Altman, Jovan Blanusa, et al.
NeurIPS 2023
Pavel Klavík, A. Cristiano I. Malossi, et al.
Philos. Trans. R. Soc. A