
The Fourth International Verification of Neural Networks Competition (VNN-COMP 2023): Summary and Results (2312.16760v1)

Published 28 Dec 2023 in cs.LG, cs.AI, and cs.SE

Abstract: This report summarizes the 4th International Verification of Neural Networks Competition (VNN-COMP 2023), held as a part of the 6th Workshop on Formal Methods for ML-Enabled Autonomous Systems (FoMLAS), that was collocated with the 35th International Conference on Computer-Aided Verification (CAV). VNN-COMP is held annually to facilitate the fair and objective comparison of state-of-the-art neural network verification tools, encourage the standardization of tool interfaces, and bring together the neural network verification community. To this end, standardized formats for networks (ONNX) and specification (VNN-LIB) were defined, tools were evaluated on equal-cost hardware (using an automatic evaluation pipeline based on AWS instances), and tool parameters were chosen by the participants before the final test sets were made public. In the 2023 iteration, 7 teams participated on a diverse set of 10 scored and 4 unscored benchmarks. This report summarizes the rules, benchmarks, participating tools, results, and lessons learned from this iteration of this competition.


Summary

  • The paper summarizes the evaluation of neural network verification tools from 7 teams across 10 scored and 4 unscored benchmarks.
  • The paper employs a standardized evaluation pipeline using ONNX and VNN-LIB formats on cost-equivalent AWS instances to ensure fairness.
  • The paper outlines future directions, including batch processing and minimized tool tuning, to better reflect real-world testing scenarios.

Overview of the Neural Network Verification Competition

The fourth International Verification of Neural Networks Competition (VNN-COMP 2023) serves as an annual event for researchers and developers to compare their neural network verification tools in a rigorous setting. This competition aims to assess the capabilities of state-of-the-art verification tools in ensuring the reliability and safety of neural network-based systems, which are particularly crucial in safety-critical applications such as autonomous driving and robotics.

Competition Structure and Evaluation

Participants are provided with standardized formats for both neural networks (ONNX) and specifications (VNN-LIB), ensuring a level playing field. Tools are evaluated on diverse benchmarks that reflect real-world problems, with each tool run on Amazon Web Services (AWS) instances of equivalent cost so that differences in available hardware do not bias the comparison. A uniform evaluation pipeline with standardized tool interfaces allowed each participating tool to be assessed automatically and consistently.
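
For concreteness, the sketch below (not taken from the report) shows the two standardized ingredients in use: a VNN-LIB property, which is an SMT-LIB-style text file over input variables X_i and output variables Y_j, and an ONNX network executed on a single candidate input. The model file name, bounds, and property are hypothetical placeholders.

```python
# Illustrative sketch (assumptions: a hypothetical "network.onnx" with one input
# and one output; the bounds and property below are invented, not from a benchmark).
import numpy as np
import onnxruntime as ort

# A VNN-LIB specification is SMT-LIB-style text, roughly:
#   (declare-const X_0 Real)
#   (declare-const Y_0 Real)
#   (assert (>= X_0 -1.0))
#   (assert (<= X_0 1.0))
#   (assert (>= Y_0 0.5))   ; negated property: any satisfying point is a counterexample

session = ort.InferenceSession("network.onnx")   # hypothetical ONNX model file
input_name = session.get_inputs()[0].name

x = np.array([[0.25]], dtype=np.float32)         # candidate point inside the input box
y = session.run(None, {input_name: x})[0]

# If the network output satisfies the asserted (negated) property,
# x is a concrete counterexample; otherwise this single point proves nothing.
print("counterexample" if y[0, 0] >= 0.5 else "no violation at this point")
```

In the competition pipeline, each tool reads such an ONNX/VNN-LIB pair and must return a verdict (property holds, violated with a counterexample, or unknown/timeout) within the instance's time budget.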

Participants and Benchmarks

The 2023 iteration featured 7 teams competing across 10 scored benchmarks, each consisting of multiple instances that had to be resolved within specific time constraints. The tools were tasked with either proving properties correct (verification) or finding counterexamples (falsification). The benchmarks covered a wide array of applications, from power system management to image generation with conditional generative adversarial networks (cGANs), and included advanced network architectures such as Vision Transformers (ViTs).
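
To make the falsification side concrete, here is a minimal sketch (not a competition tool) that searches an input box for a counterexample by random sampling over a toy two-layer ReLU network; the weights, bounds, and property are invented for illustration and are far simpler than the competition benchmarks.

```python
# Hedged sketch of falsification by random sampling over a toy ReLU network.
# All weights, bounds, and the property are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-layer ReLU network standing in for an ONNX model.
W1 = np.array([[1.0, -2.0], [0.5, 1.5]]); b1 = np.array([0.1, -0.2])
W2 = np.array([[1.0, -1.0]]);             b2 = np.array([0.0])

def net(x):
    h = np.maximum(W1 @ x + b1, 0.0)      # ReLU hidden layer
    return W2 @ h + b2

lo, hi = np.array([-1.0, -1.0]), np.array([1.0, 1.0])   # input box from the spec

# Property to verify: the output stays below 1.0 everywhere in the box.
# A falsifier only needs one point where the property fails.
for _ in range(10_000):
    x = rng.uniform(lo, hi)
    if net(x)[0] >= 1.0:
        print("counterexample found:", x)
        break
else:
    print("no counterexample found (property not proven, merely not falsified)")
```

Real falsifiers typically use gradient-based adversarial attacks rather than pure random search, but the success criterion is the same: exhibit one concrete input that violates the property.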

Results and Observations

The results demonstrated significant advancements in verification tool performance, with tools largely converging towards GPU-enabled linear bound propagation methods, augmented with branch-and-bound frameworks for enhanced efficiency and scalability. Despite the complexity of the tasks and the growing diversity of network architectures, the leading tools achieved high accuracy scores, with several successfully verifying hundreds of instances across various benchmarks.
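
As a rough illustration of the bound-propagation idea, the sketch below uses interval bound propagation, which is simpler and looser than the GPU-batched linear relaxations used by the leading tools; it propagates an input box through the same toy network as above and checks whether the resulting output bounds already prove the property.

```python
# Hedged sketch: interval bound propagation (IBP), the simplest bound-propagation
# scheme. Leading VNN-COMP tools use tighter linear relaxations plus
# branch-and-bound; this toy version only illustrates the core idea.
import numpy as np

def ibp_linear(W, b, lo, hi):
    """Propagate an input box [lo, hi] through the affine map y = W x + b."""
    center, radius = (lo + hi) / 2.0, (hi - lo) / 2.0
    y_center = W @ center + b
    y_radius = np.abs(W) @ radius
    return y_center - y_radius, y_center + y_radius

def ibp_relu(lo, hi):
    """Propagate bounds through ReLU (monotone, so clamp both ends at zero)."""
    return np.maximum(lo, 0.0), np.maximum(hi, 0.0)

# Same toy network as in the falsification sketch (made-up weights).
W1 = np.array([[1.0, -2.0], [0.5, 1.5]]); b1 = np.array([0.1, -0.2])
W2 = np.array([[1.0, -1.0]]);             b2 = np.array([0.0])

lo, hi = np.array([-1.0, -1.0]), np.array([1.0, 1.0])
lo, hi = ibp_relu(*ibp_linear(W1, b1, lo, hi))
lo, hi = ibp_linear(W2, b2, lo, hi)

# If the upper bound is below the threshold, the property "output < 1.0" is
# proven over the whole box; otherwise the result is inconclusive and a
# branch-and-bound tool would split the input region (or ReLU cases) and repeat.
print("output bounds:", lo, hi)
```

When the bounds are too loose to decide, branch-and-bound splits the input box or individual ReLU neurons and propagates bounds on each piece, which is where the GPU parallelism of the leading tools pays off.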

Concluding Remarks and Future Directions

The VNN-COMP 2023 has not only provided valuable insights into the current state of neural network verification technologies but also established benchmarks for future research. Considerations for future competitions include retaining benchmarks across iterations to track progress over time, minimizing per-benchmark tool tuning to better reflect realistic deployment, and introducing batch processing modes. Ensuring the soundness of tools will remain a priority to foster trust and broader adoption of neural network verification methods in critical applications.
