Enabling On-Device Learning via Experience Replay with Efficient Dataset Condensation (2405.16113v1)
Abstract: Upon deployment to edge devices, it is often desirable for a model to keep learning from streaming data to improve accuracy. However, extracting representative features from such data is challenging because it is typically unlabeled, not independent and identically distributed (non-i.i.d.), and seen only once. To mitigate this issue, a common strategy is to maintain a small data buffer on the edge device that holds the most representative data for further learning. Because most data is either never stored or quickly discarded, identifying the most representative samples and avoiding significant information loss becomes critical. In this paper, we propose an on-device framework that addresses this issue by condensing incoming data into more informative samples. Specifically, to handle unlabeled incoming data, we propose a pseudo-labeling technique designed for on-device learning environments. Additionally, we develop a dataset condensation technique that requires only modest computational resources. To counteract the effects of noisy labels during the condensation process, we further employ a contrastive learning objective to improve the purity of class data within the buffer. Our empirical results show substantial improvements over existing methods, particularly when buffer capacity is severely restricted. For instance, with a buffer capacity of just one sample per class, our method outperforms the best existing baseline in accuracy by 58.4% on the CIFAR-10 dataset.
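To make the described pipeline concrete, below is a minimal PyTorch sketch of the on-device replay loop under assumptions of my own: a generic classifier producing logits, confidence-thresholded pseudo-labeling, and a one-exemplar-per-class buffer whose "condensation" is simplified to a running average. The names (`TinyClassBuffer`, `pseudo_label`, `replay_step`) and the 0.9 threshold are hypothetical; the paper's actual condensation objective and contrastive purity term are not reproduced here.

```python
# Illustrative sketch only, not the paper's algorithm.
import torch
import torch.nn.functional as F

class TinyClassBuffer:
    """Holds one condensed exemplar per class (severely restricted capacity)."""
    def __init__(self, num_classes, sample_shape):
        self.exemplars = torch.zeros(num_classes, *sample_shape)
        self.counts = torch.zeros(num_classes)

    def condense(self, x, y):
        # Toy condensation: running average of samples assigned to class y.
        # The paper instead optimizes the exemplar with a lightweight
        # condensation objective; an average keeps this sketch cheap.
        c = self.counts[y]
        self.exemplars[y] = (self.exemplars[y] * c + x) / (c + 1)
        self.counts[y] += 1

def pseudo_label(model, x, threshold=0.9):
    """Assign a label from the current model only when it is confident."""
    with torch.no_grad():
        probs = F.softmax(model(x.unsqueeze(0)), dim=1).squeeze(0)
    conf, label = probs.max(dim=0)
    return (int(label), float(conf)) if conf >= threshold else (None, float(conf))

def replay_step(model, optimizer, buffer, new_x, new_y):
    """One on-device update mixing the new sample with the condensed buffer."""
    present = buffer.counts > 0
    xs = torch.cat([new_x.unsqueeze(0), buffer.exemplars[present]])
    ys = torch.cat([torch.tensor([new_y]), torch.nonzero(present).squeeze(1)])
    optimizer.zero_grad()
    loss = F.cross_entropy(model(xs), ys)
    loss.backward()
    optimizer.step()
    return loss.item()
```

A streaming sample would first pass through `pseudo_label`; if the model is confident, the sample is folded into its class slot via `condense` and the model is updated with `replay_step`, which replays the condensed exemplars alongside the new sample. A supervised-contrastive loss over buffer features, as the abstract describes for improving class purity, could be added to `replay_step` but is omitted here for brevity.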