
Deep Learning Model Reuse in the HuggingFace Community: Challenges, Benefit and Trends (2401.13177v1)

Published 24 Jan 2024 in cs.SE, cs.CY, and cs.LG

Abstract: The ubiquity of large-scale Pre-Trained Models (PTMs) is on the rise, sparking interest in model hubs, dedicated platforms for hosting PTMs. Despite this trend, a comprehensive exploration of the challenges that users encounter and of how the community leverages PTMs remains lacking. To address this gap, we conducted an extensive mixed-methods empirical study focusing on the discussion forums and the model hub of HuggingFace, the largest public model hub. Based on our qualitative analysis, we present a taxonomy of the challenges and benefits associated with PTM reuse within this community. We then conduct a quantitative study to track model-type trends and the evolution of model documentation over time. Our findings highlight prevalent challenges such as limited guidance for beginner users, struggles with the comprehensibility of model outputs during training or inference, and a lack of model understanding. We also identified interesting trends, such as models that maintain high upload rates despite a decline in the discussion topics related to them. Additionally, we found that despite the introduction of model documentation tools, the amount of documentation has not increased over time, leading to difficulties in model comprehension and selection among users. Our study sheds light on new challenges in reusing PTMs that were not reported before, and we provide recommendations for various stakeholders involved in PTM reuse.
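
The PTM-reuse workflow the study examines, discovering a model on the hub and reading its documentation (model card) before reusing it, can be sketched with HuggingFace's public huggingface_hub client. This is a minimal sketch, not code from the paper; the task filter and model ID below are illustrative examples only:

```python
# Sketch of the PTM-reuse steps studied in the paper: discover candidate
# models on the HuggingFace Hub, then inspect their documentation.
# Assumes `pip install huggingface_hub`; the filter and model ID are examples.
from huggingface_hub import list_models, ModelCard

# Step 1: discover candidate PTMs, e.g. the most-downloaded models for a task.
for model in list_models(filter="text-classification",
                         sort="downloads", direction=-1, limit=5):
    # `downloads` may be None depending on what the listing endpoint returns.
    print(model.id, model.downloads)

# Step 2: read a model's card (its documentation) before deciding to reuse it.
card = ModelCard.load("distilbert-base-uncased-finetuned-sst-2-english")
print(card.data.to_dict())  # structured metadata: license, tags, metrics, ...
print(card.text[:500])      # free-text documentation, often sparse in practice
```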

Authors (5)
  1. Mina Taraghi (2 papers)
  2. Gianolli Dorcelus (1 paper)
  3. Armstrong Foundjem (3 papers)
  4. Florian Tambon (13 papers)
  5. Foutse Khomh (140 papers)
Citations (12)
