
The Fourth International Verification of Neural Networks Competition (VNN-COMP 2023): Summary and Results (2312.16760v1)

Published 28 Dec 2023 in cs.LG, cs.AI, and cs.SE

Abstract: This report summarizes the 4th International Verification of Neural Networks Competition (VNN-COMP 2023), held as a part of the 6th Workshop on Formal Methods for ML-Enabled Autonomous Systems (FoMLAS), that was collocated with the 35th International Conference on Computer-Aided Verification (CAV). VNN-COMP is held annually to facilitate the fair and objective comparison of state-of-the-art neural network verification tools, encourage the standardization of tool interfaces, and bring together the neural network verification community. To this end, standardized formats for networks (ONNX) and specification (VNN-LIB) were defined, tools were evaluated on equal-cost hardware (using an automatic evaluation pipeline based on AWS instances), and tool parameters were chosen by the participants before the final test sets were made public. In the 2023 iteration, 7 teams participated on a diverse set of 10 scored and 4 unscored benchmarks. This report summarizes the rules, benchmarks, participating tools, results, and lessons learned from this iteration of this competition.


Summary

  • The paper summarizes the evaluation of neural network verification tools from 7 teams across 10 scored and 4 unscored benchmarks.
  • The paper employs a standardized evaluation pipeline using ONNX and VNN-LIB formats on cost-equivalent AWS instances to ensure fairness.
  • The paper outlines future directions, including batch processing and minimized tool tuning, to better reflect real-world testing scenarios.

Overview of the Neural Network Verification Competition

The fourth International Verification of Neural Networks Competition (VNN-COMP 2023) serves as an annual event for researchers and developers to compare their neural network verification tools in a rigorous setting. This competition aims to assess the capabilities of state-of-the-art verification tools in ensuring the reliability and safety of neural network-based systems, which are particularly crucial in safety-critical applications such as autonomous driving and robotics.

Competition Structure and Evaluation

Participants are provided with standardized formats for both neural networks (ONNX) and specifications (VNN-LIB), ensuring a level playing field. Tools are evaluated on diverse benchmarks that reflect real-world problems, with each tool run on Amazon Web Services (AWS) instances of equivalent cost so that differences in available hardware do not bias the comparison. A uniform evaluation pipeline with standardized tool interfaces allowed each participating tool to be assessed automatically and consistently.
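
For concreteness, the sketch below (not taken from the report) shows the two standardized ingredients in use: a VNN-LIB property, which is an SMT-LIB-style text file over input variables X_i and output variables Y_j, and an ONNX network executed on a single candidate input. The model file name, bounds, and property are hypothetical placeholders.

```python
# Illustrative sketch (assumptions: a hypothetical "network.onnx" with one input
# and one output; the bounds and property below are invented, not from a benchmark).
import numpy as np
import onnxruntime as ort

# A VNN-LIB specification is SMT-LIB-style text, roughly:
#   (declare-const X_0 Real)
#   (declare-const Y_0 Real)
#   (assert (>= X_0 -1.0))
#   (assert (<= X_0 1.0))
#   (assert (>= Y_0 0.5))   ; negated property: any satisfying point is a counterexample

session = ort.InferenceSession("network.onnx")   # hypothetical ONNX model file
input_name = session.get_inputs()[0].name

x = np.array([[0.25]], dtype=np.float32)         # candidate point inside the input box
y = session.run(None, {input_name: x})[0]

# If the network output satisfies the asserted (negated) property,
# x is a concrete counterexample; otherwise this single point proves nothing.
print("counterexample" if y[0, 0] >= 0.5 else "no violation at this point")
```

In the competition pipeline, each tool reads such an ONNX/VNN-LIB pair and must return a verdict (property holds, violated with a counterexample, or unknown/timeout) within the instance's time budget.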

Participants and Benchmarks

The 2023 iteration featured 7 teams competing across 10 scored benchmarks, each consisting of multiple instances that had to be resolved within specific time constraints. The tools were tasked with either proving properties correct (verification) or finding counterexamples (falsification). The benchmarks covered a wide array of applications, from power system management to image generation with conditional generative adversarial networks (cGANs), and included advanced network architectures such as Vision Transformers (ViTs).
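
To make the falsification side concrete, here is a minimal sketch (not a competition tool) that searches an input box for a counterexample by random sampling over a toy two-layer ReLU network; the weights, bounds, and property are invented for illustration and are far simpler than the competition benchmarks.

```python
# Hedged sketch of falsification by random sampling over a toy ReLU network.
# All weights, bounds, and the property are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-layer ReLU network standing in for an ONNX model.
W1 = np.array([[1.0, -2.0], [0.5, 1.5]]); b1 = np.array([0.1, -0.2])
W2 = np.array([[1.0, -1.0]]);             b2 = np.array([0.0])

def net(x):
    h = np.maximum(W1 @ x + b1, 0.0)      # ReLU hidden layer
    return W2 @ h + b2

lo, hi = np.array([-1.0, -1.0]), np.array([1.0, 1.0])   # input box from the spec

# Property to verify: the output stays below 1.0 everywhere in the box.
# A falsifier only needs one point where the property fails.
for _ in range(10_000):
    x = rng.uniform(lo, hi)
    if net(x)[0] >= 1.0:
        print("counterexample found:", x)
        break
else:
    print("no counterexample found (property not proven, merely not falsified)")
```

Real falsifiers typically use gradient-based adversarial attacks rather than pure random search, but the success criterion is the same: exhibit one concrete input that violates the property.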

Results and Observations

The results demonstrated significant advancements in verification tool performance, with tools largely converging towards GPU-enabled linear bound propagation methods, augmented with branch-and-bound frameworks for enhanced efficiency and scalability. Despite the complexity of the tasks and the growing diversity of network architectures, the leading tools achieved high accuracy scores, with several successfully verifying hundreds of instances across various benchmarks.
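
As a rough illustration of the bound-propagation idea, the sketch below uses interval bound propagation, which is simpler and looser than the GPU-batched linear relaxations used by the leading tools; it propagates an input box through the same toy network as above and checks whether the resulting output bounds already prove the property.

```python
# Hedged sketch: interval bound propagation (IBP), the simplest bound-propagation
# scheme. Leading VNN-COMP tools use tighter linear relaxations plus
# branch-and-bound; this toy version only illustrates the core idea.
import numpy as np

def ibp_linear(W, b, lo, hi):
    """Propagate an input box [lo, hi] through the affine map y = W x + b."""
    center, radius = (lo + hi) / 2.0, (hi - lo) / 2.0
    y_center = W @ center + b
    y_radius = np.abs(W) @ radius
    return y_center - y_radius, y_center + y_radius

def ibp_relu(lo, hi):
    """Propagate bounds through ReLU (monotone, so clamp both ends at zero)."""
    return np.maximum(lo, 0.0), np.maximum(hi, 0.0)

# Same toy network as in the falsification sketch (made-up weights).
W1 = np.array([[1.0, -2.0], [0.5, 1.5]]); b1 = np.array([0.1, -0.2])
W2 = np.array([[1.0, -1.0]]);             b2 = np.array([0.0])

lo, hi = np.array([-1.0, -1.0]), np.array([1.0, 1.0])
lo, hi = ibp_relu(*ibp_linear(W1, b1, lo, hi))
lo, hi = ibp_linear(W2, b2, lo, hi)

# If the upper bound is below the threshold, the property "output < 1.0" is
# proven over the whole box; otherwise the result is inconclusive and a
# branch-and-bound tool would split the input region (or ReLU cases) and repeat.
print("output bounds:", lo, hi)
```

When the bounds are too loose to decide, branch-and-bound splits the input box or individual ReLU neurons and propagates bounds on each piece, which is where the GPU parallelism of the leading tools pays off.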

Concluding Remarks and Future Directions

The VNN-COMP 2023 has not only provided valuable insights into the current state of neural network verification technologies but also established benchmarks for future research. Considerations for future competitions include retaining benchmarks across iterations to track progress over time, minimizing per-benchmark tool tuning to better reflect realistic deployment, and introducing batch processing modes. Ensuring the soundness of tools will remain a priority to foster trust and broader adoption of neural network verification methods in critical applications.
