Delta: A Cloud-assisted Data Enrichment Framework for On-Device Continual Learning (2410.18378v1)
Abstract: In modern mobile applications, users frequently encounter new contexts, necessitating on-device continual learning (CL) to ensure consistent model performance. While existing research has predominantly focused on developing lightweight CL frameworks, we identify data scarcity as a critical bottleneck for on-device CL. In this work, we explore the potential of leveraging abundant cloud-side data to enrich scarce on-device data, and propose Delta, a private, efficient and effective data enrichment framework. Specifically, Delta first introduces a directory dataset to decompose the data enrichment problem into device-side and cloud-side sub-problems without sharing sensitive data. Next, Delta proposes a soft data matching strategy to effectively solve the device-side sub-problem with sparse user data, and an optimal data sampling scheme for the cloud server to retrieve the most suitable dataset for enrichment with low computational complexity. Further, Delta refines the data sampling scheme by jointly considering the impact of enriched data on both new and past contexts, mitigating catastrophic forgetting from a new angle. Comprehensive experiments across four typical mobile computing tasks with varied data modalities demonstrate that Delta improves overall model accuracy by an average of 15.1%, 12.4%, 1.1% and 5.6% on visual, IMU, audio and textual tasks, respectively, compared with few-shot CL, and consistently reduces communication costs by over 90% compared with federated CL.
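The device/cloud decomposition described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the matching or sampling rules, so everything below is an assumption made for illustration: the directory dataset is taken to be a set of feature vectors (one per cloud-side cluster), soft matching is approximated by a softmax over negative squared distances, and cloud sampling is approximated by weight-proportional sampling. The names `soft_match_weights`, `sample_cloud_indices`, and `temperature` are hypothetical and do not come from the paper.

```python
import numpy as np

def soft_match_weights(device_feats, directory_feats, temperature=1.0):
    """Device side (assumed form): softly assign each local sample to
    directory entries, then average the assignments into one weight vector."""
    # Squared Euclidean distances, shape (n_device, n_directory).
    diff = device_feats[:, None, :] - directory_feats[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)
    # Softmax over directory entries; subtract the row max for stability.
    logits = -dist2 / temperature
    logits -= logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    # Only this aggregate vector would leave the device, not the raw data.
    return probs.mean(axis=0)

def sample_cloud_indices(weights, cluster_ids, budget, rng):
    """Cloud side (assumed form): draw `budget` distinct cloud samples,
    favouring clusters with larger directory weights."""
    counts = np.bincount(cluster_ids, minlength=len(weights))
    # Spread each cluster's weight evenly over its members.
    per_sample = weights[cluster_ids] / counts[cluster_ids]
    per_sample = per_sample / per_sample.sum()
    return rng.choice(len(cluster_ids), size=budget, replace=False, p=per_sample)

# Toy usage: two directory entries, two local samples near the first entry.
directory = np.array([[0.0, 0.0], [4.0, 4.0]])
local = np.array([[0.1, 0.0], [0.0, 0.2]])
weights = soft_match_weights(local, directory)
cloud_cluster_ids = np.array([0, 0, 0, 1, 1, 1])
picked = sample_cloud_indices(weights, cloud_cluster_ids, budget=2,
                              rng=np.random.default_rng(0))
```

In this sketch the device shares only a weight vector over directory entries, which mirrors the abstract's claim that sensitive data is not shared; the paper's actual optimal sampling scheme and its forgetting-aware refinement are not reproduced here.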