Laboratory Experiments of Model-based Reinforcement Learning for Adaptive Optics Control (2401.00242v1)
Abstract: Direct imaging of Earth-like exoplanets is one of the most prominent scientific drivers of the next generation of ground-based telescopes. Typically, Earth-like exoplanets are located at small angular separations from their host stars, making their detection difficult. Consequently, the adaptive optics (AO) system's control algorithm must be carefully designed to distinguish the exoplanet from the residual light produced by the host star. A promising new avenue of research to improve AO control builds on data-driven control methods such as Reinforcement Learning (RL). RL is an active branch of machine learning research in which control of a system is learned through interaction with the environment. Thus, RL can be seen as an automated approach to AO control whose usage is entirely a turnkey operation. In particular, model-based reinforcement learning (MBRL) has been shown to cope with both temporal and misregistration errors. It has also been shown to adapt to non-linear wavefront sensing while being efficient in training and execution. In this work, we implement and adapt an RL method called Policy Optimization for AO (PO4AO) to the GHOST test bench at ESO headquarters, where we demonstrate strong performance of the method in a laboratory environment. Our implementation allows training to run in parallel with inference, which is crucial for on-sky operation. In particular, we study the predictive and self-calibrating aspects of the method. The new implementation on GHOST, running PyTorch, adds only around 700 microseconds on top of hardware, pipeline, and Python interface latency. We open-source well-documented code for the implementation and specify the requirements for the RTC pipeline. We also discuss the important hyperparameters of the method, the source of the latency, and the possible paths for a lower latency implementation.
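To illustrate what training in parallel with inference can look like in general, here is a minimal sketch that uses only the Python standard library. It is not the PO4AO implementation or the GHOST code: the scalar plant, `TRUE_GAIN`, `SharedPolicy`, `trainer`, and `run_demo` are all hypothetical names and assumptions made for this example. The control loop keeps acting with whatever policy is currently published, while a background thread fits a one-parameter dynamics model from logged data and publishes an updated policy.

```python
import threading
import time
from collections import deque

# Assumption: toy scalar plant where a command changes the residual by TRUE_GAIN * action.
TRUE_GAIN = 0.5


class SharedPolicy:
    """Holds the current policy gain; the trainer swaps it in under a lock."""

    def __init__(self, gain):
        self._gain = gain
        self._lock = threading.Lock()
        self.version = 0  # number of policies published by the trainer

    def act(self, residual):
        # Inference path: read the latest gain and compute a command.
        with self._lock:
            gain = self._gain
        return -gain * residual

    def publish(self, gain):
        # Training path: hand a newly derived gain to the control loop.
        with self._lock:
            self._gain = gain
            self.version += 1


def trainer(policy, buffer, buffer_lock, stop):
    """Background loop: fit the plant gain by least squares, then update the policy."""
    while not stop.is_set():
        with buffer_lock:
            data = list(buffer)  # snapshot so fitting does not block the control loop
        num = sum(a * (x_next - x) for x, a, x_next in data)
        den = sum(a * a for _, a, _ in data)
        if den > 0.0:
            model_gain = num / den  # learned one-parameter dynamics model
            if model_gain > 0.0:
                policy.publish(1.0 / model_gain)  # gain that cancels the residual
        stop.wait(0.01)


def run_demo(steps=200):
    policy = SharedPolicy(gain=0.1)  # deliberately poor initial controller
    buffer = deque(maxlen=1000)      # logged (residual, action, next_residual) tuples
    buffer_lock = threading.Lock()
    stop = threading.Event()
    worker = threading.Thread(
        target=trainer, args=(policy, buffer, buffer_lock, stop)
    )
    worker.start()

    residual = 1.0
    for _ in range(steps):
        action = policy.act(residual)
        next_residual = residual + TRUE_GAIN * action
        with buffer_lock:
            buffer.append((residual, action, next_residual))
        residual = next_residual
        time.sleep(0.001)  # stand-in for the frame period of the loop

    stop.set()
    worker.join()
    return residual, policy.version


if __name__ == "__main__":
    final_residual, updates = run_demo()
    print(f"final residual: {final_residual:.3e}, policy updates: {updates}")
```

The design point is that the control loop never waits for training to finish: it takes a lock only long enough to read the gain or append one sample. A real system would swap neural network weights rather than a single gain, and would have to account for loop delay and measurement noise, which this toy ignores.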