Deep Learning Model Reuse in the HuggingFace Community: Challenges, Benefit and Trends (2401.13177v1)
Abstract: Large-scale Pre-Trained Models (PTMs) are becoming increasingly ubiquitous, sparking interest in model hubs, dedicated platforms for hosting PTMs. Despite this trend, a comprehensive exploration of the challenges users encounter and of how the community leverages PTMs remains lacking. To address this gap, we conducted an extensive mixed-methods empirical study focusing on the discussion forums and the model hub of HuggingFace, the largest public model hub. Based on our qualitative analysis, we present a taxonomy of the challenges and benefits associated with PTM reuse within this community. We then conduct a quantitative study to track model-type trends and the evolution of model documentation over time. Our findings highlight prevalent challenges such as limited guidance for beginner users, struggles with comprehending model outputs during training or inference, and a lack of model understanding. We also identified interesting trends, including models that maintain high upload rates despite a decline in discussion topics related to them. Additionally, we found that despite the introduction of model documentation tools, the quantity of documentation has not increased over time, leading to difficulties in model comprehension and selection among users. Our study sheds light on previously unreported challenges in reusing PTMs, and we provide recommendations for the various stakeholders involved in PTM reuse.
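As a hedged illustration of the kind of quantitative analysis the abstract describes, the sketch below aggregates model-hub metadata by task type and by upload year. The sample records and field names (`pipeline_tag`, `created_at`) are assumptions modeled loosely on the Hub's model metadata, not the authors' actual pipeline; in practice such records could be retrieved via `huggingface_hub.HfApi().list_models()`.

```python
from collections import Counter
from datetime import date

# Hypothetical model-hub metadata records (field names assumed);
# real data could be fetched with huggingface_hub.HfApi().list_models().
models = [
    {"id": "bert-base-uncased", "pipeline_tag": "fill-mask", "created_at": date(2020, 3, 1)},
    {"id": "gpt2", "pipeline_tag": "text-generation", "created_at": date(2020, 5, 1)},
    {"id": "t5-small", "pipeline_tag": "translation", "created_at": date(2021, 1, 1)},
    {"id": "llama-7b", "pipeline_tag": "text-generation", "created_at": date(2023, 2, 1)},
]

def uploads_per_task(records):
    """Count model uploads per task type (pipeline tag)."""
    return Counter(r["pipeline_tag"] for r in records)

def uploads_per_year(records):
    """Count model uploads per calendar year to track trends over time."""
    return Counter(r["created_at"].year for r in records)

print(uploads_per_task(models))
print(uploads_per_year(models))
```

Grouping by both task and year in this way is what lets a study contrast upload rates against the volume of forum discussion for the same model families.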
Authors: Mina Taraghi, Gianolli Dorcelus, Armstrong Foundjem, Florian Tambon, Foutse Khomh