How to Sustain a Scientific Open-Source Software Ecosystem: Learning from the Astropy Project (2402.15081v1)
Abstract: Scientific open-source software (OSS) has greatly benefited research communities through its transparent and collaborative nature. Given its critical role in scientific research, ensuring the sustainability of such software has become vital. Earlier studies have proposed sustainability strategies for conventional scientific software and open-source communities. However, it remains unclear whether these solutions can be easily adapted to the integrated framework of scientific OSS and its larger ecosystem. This study examines the challenges and opportunities to enhance the sustainability of scientific OSS in the context of interdisciplinary collaboration, open-source community, and multi-project ecosystem. We conducted a case study on a widely used software ecosystem in the astrophysics domain, the Astropy Project, using a mixed-methods design. This design includes an interview with core contributors regarding their participation in an interdisciplinary team; a survey of disengaged contributors about their motivations for contribution, reasons for disengagement, and suggestions for sustaining the communities; and an analysis of cross-referenced issues and pull requests to understand best practices for collaboration at the ecosystem level. Our study reveals the implications of major challenges for sustaining scientific OSS and proposes concrete suggestions for tackling these challenges.