
Formal and Practical Elements for the Certification of Machine Learning Systems (2310.03217v1)

Published 5 Oct 2023 in cs.LG

Abstract: Over the past decade, machine learning has demonstrated impressive results, often surpassing human capabilities in sensing tasks relevant to autonomous flight. Unlike traditional aerospace software, the parameters of machine learning models are neither hand-coded nor derived from physics; they are learned from data. They are automatically adjusted during a training phase, and their values do not usually correspond to physical requirements. As a result, requirements cannot be directly traced to lines of code, which hinders the current bottom-up aerospace certification paradigm. This paper attempts to address this gap by 1) demystifying the inner workings of machine learning models and the processes used to build them, 2) formally establishing the theoretical guarantees provided by those processes, and 3) complementing these formal elements with practical considerations to develop a complete certification argument for safety-critical machine learning systems. Based on a scalable statistical verifier, our proposed framework is model-agnostic and tool-independent, making it adaptable to many use cases in the industry. We demonstrate results on a widespread application in autonomous flight: vision-based landing.
