
ASP: Automatic Selection of Proxy dataset for efficient AutoML (2310.11478v1)

Published 17 Oct 2023 in cs.LG, cs.AI, and cs.CV

Abstract: Deep neural networks have achieved great success thanks to growing amounts of data and diverse, effective network designs. However, this also brings a heavy computing burden, since training time grows in proportion to the amount of training data. In addition, obtaining a well-performing model requires repeated trials of different architectures and hyper-parameters, which may take a large amount of time even with state-of-the-art (SOTA) hyper-parameter optimization (HPO) and neural architecture search (NAS) algorithms. In this paper, we propose an Automatic Selection of Proxy dataset framework (ASP) that aims to dynamically find informative proxy subsets of the training data at each epoch, reducing the training data size and saving AutoML processing time. We verify the effectiveness and generalization of ASP on CIFAR10, CIFAR100, ImageNet16-120, and ImageNet-1k, across various public model benchmarks. The experimental results show that ASP obtains better results than other data selection methods at all selection ratios. ASP also enables much more efficient AutoML processing, with a speedup of 2x-20x, while obtaining better architectures and better hyper-parameters than using the entire dataset.
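
The abstract describes re-selecting a proxy subset of the training data at every epoch, at a given selection ratio, but does not state how samples are scored. The sketch below only illustrates that general pattern and is not the authors' method: the names `select_proxy_indices` and `run_epochs` are hypothetical, and the random per-sample scores are a stand-in assumption for whatever importance measure ASP actually uses.

```python
import math
import random


def select_proxy_indices(scores, ratio):
    """Return the sorted indices of the ceil(ratio * n) highest-scoring samples.

    `scores` is a hypothetical per-sample importance value; the abstract
    does not say how ASP computes it.
    """
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    keep = max(1, math.ceil(ratio * len(scores)))
    # Rank sample indices from most to least informative.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep])


def run_epochs(num_samples, ratio, num_epochs, seed=0):
    """Re-select a proxy subset at every epoch and return the chosen indices."""
    rng = random.Random(seed)
    subsets = []
    for _ in range(num_epochs):
        # Assumption: random scores stand in for a real importance measure
        # (for example, one derived from the model's current state).
        scores = [rng.random() for _ in range(num_samples)]
        subset = select_proxy_indices(scores, ratio)
        subsets.append(subset)
        # A real training loop would now train on only the samples in `subset`.
    return subsets
```

With a selection ratio of 0.5, for example, each epoch touches half of the samples, which is where the per-epoch training cost is saved in this kind of scheme.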

Authors (7)
  1. Peng Yao (16 papers)
  2. Chao Liao (13 papers)
  3. Jiyuan Jia (4 papers)
  4. Jianchao Tan (24 papers)
  5. Bin Chen (547 papers)
  6. Chengru Song (14 papers)
  7. Di Zhang (231 papers)
Citations (1)
