Data Science Education in Undergraduate Physics: Lessons Learned from a Community of Practice (2403.00961v2)
Abstract: It is becoming increasingly important that physics educators equip their students with the skills to work with data effectively. However, many educators may lack the necessary training and expertise in data science to teach these skills. To address this gap, we created the Data Science Education Community of Practice (DSECOP), bringing together graduate students and physics educators from different institutions and backgrounds to share best practices and lessons learned from integrating data science into undergraduate physics education. In this article we present insights and experiences from this community of practice, highlighting key strategies and challenges in incorporating data science into the introductory physics curriculum. Our goal is to provide guidance and inspiration to educators who seek to integrate data science into their teaching, helping to prepare the next generation of physicists for a data-driven world.