Human-in-the-loop Reinforcement Learning for Data Quality Monitoring in Particle Physics Experiments (2405.15508v1)

Published 24 May 2024 in hep-ex and cs.LG

Abstract: Data Quality Monitoring (DQM) is a crucial task in large particle physics experiments, since detector malfunctions can compromise the data. DQM is currently performed by human shifters, a process that is costly and of limited accuracy. In this work, we provide a proof of concept for applying human-in-the-loop Reinforcement Learning (RL) to automate the DQM process while adapting to operating conditions that change over time. We implement a prototype based on the Proximal Policy Optimization (PPO) algorithm and validate it on a simplified synthetic dataset. We demonstrate how a multi-agent system can be trained for continuous automated monitoring during data collection, with human intervention actively requested only when relevant. We show that random, unbiased noise in the human classifications can be reduced, leading to improved accuracy over the baseline. Additionally, we propose data augmentation techniques to cope with scarce data and to accelerate learning. Finally, we discuss the further steps needed to deploy the approach in the real world, including protocols for periodic control of the algorithm's outputs.
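The abstract describes an interaction pattern in which a PPO-trained agent monitors incoming detector data and either labels it autonomously or escalates to a human shifter. The snippet below is a minimal sketch of that decision step in PyTorch (the framework the prototype is built on); the observation size, action names, and network shape are illustrative assumptions rather than the paper's actual configuration, and a real implementation would train the actor with PPO instead of using it untrained as here.

```python
import torch
import torch.nn as nn

# Illustrative constants: these names and sizes are assumptions for the sketch,
# not values taken from the paper.
HISTOGRAM_BINS = 32                      # flattened per-interval detector histogram
ACTIONS = ["GOOD", "BAD", "ASK_HUMAN"]   # label the data, or defer to a shifter


class MonitorPolicy(nn.Module):
    """Small actor network; in the paper's setup this role is played by the PPO actor."""

    def __init__(self, n_obs: int = HISTOGRAM_BINS, n_actions: int = len(ACTIONS)):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_obs, 64),
            nn.Tanh(),
            nn.Linear(64, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.distributions.Categorical:
        # Return a categorical action distribution over {GOOD, BAD, ASK_HUMAN}.
        return torch.distributions.Categorical(logits=self.net(obs))


def monitoring_step(policy: MonitorPolicy, histogram, human_label=None):
    """One data-taking interval: act autonomously, or defer to the human.

    Returns (label, asked_human). When the agent asks, the (possibly noisy)
    human label is used for that interval and can feed back into training.
    """
    obs = torch.as_tensor(histogram, dtype=torch.float32)
    action = ACTIONS[policy(obs).sample().item()]
    if action == "ASK_HUMAN":
        return human_label, True
    return action, False


if __name__ == "__main__":
    policy = MonitorPolicy()
    fake_histogram = torch.rand(HISTOGRAM_BINS)  # stand-in for a real DQM input
    label, asked = monitoring_step(policy, fake_histogram, human_label="GOOD")
    print(label, asked)
```

The explicit escalation action mirrors the abstract's point that human intervention is requested only when relevant: the trained policy is meant to ask rarely, and the shifter's possibly noisy answer serves both as the label for that interval and as additional training signal.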

