Abstract Neural Networks (2009.05660v1)

Published 11 Sep 2020 in cs.LG, cs.PL, and stat.ML

Abstract: Deep Neural Networks (DNNs) are rapidly being applied to safety-critical domains such as drone and airplane control, motivating techniques for verifying the safety of their behavior. Unfortunately, DNN verification is NP-hard, with current algorithms slowing exponentially with the number of nodes in the DNN. This paper introduces the notion of Abstract Neural Networks (ANNs), which can be used to soundly overapproximate DNNs while using fewer nodes. An ANN is like a DNN except weight matrices are replaced by values in a given abstract domain. We present a framework parameterized by the abstract domain and activation functions used in the DNN that can be used to construct a corresponding ANN. We present necessary and sufficient conditions on the DNN activation functions for the constructed ANN to soundly over-approximate the given DNN. Prior work on DNN abstraction was restricted to the interval domain and ReLU activation function. Our framework can be instantiated with other abstract domains such as octagons and polyhedra, as well as other activation functions such as Leaky ReLU, Sigmoid, and Hyperbolic Tangent.

Authors (2)

Matthew Sotoudeh
Aditya V. Thakur

Summary

Abstract Neural Networks: An Overview

Deep Neural Networks (DNNs) are increasingly deployed in safety-critical domains such as drone and airplane control, which makes reliable verification of their behavior necessary. DNN verification is NP-hard, however, and current algorithms slow down exponentially as the number of nodes in the network grows. This paper introduces Abstract Neural Networks (ANNs), which address this cost by soundly over-approximating a DNN using fewer nodes.

Key Contributions

Abstract Neural Networks (ANNs): The paper introduces ANNs, which are like DNNs except that each weight matrix is replaced by a value from an abstract domain. Where a DNN has concrete real-valued weight matrices, an ANN has abstract values that each stand for a set of weight matrices, so a single ANN with fewer nodes can over-approximate the original DNN.
Framework Parameters: The proposed framework is parameterized by the abstract domain and by the activation functions used in the DNN, and it constructs a corresponding ANN. The paper gives necessary and sufficient conditions on the activation functions for the constructed ANN to soundly over-approximate the DNN.
Generalizing Prior Work: Prior work on DNN abstraction was restricted to the interval domain and the ReLU activation function. The framework can also be instantiated with other abstract domains, such as octagons and polyhedra, and with other activation functions, such as Leaky ReLU, Sigmoid, and Hyperbolic Tangent.
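To make "weights replaced by abstract values" concrete, the following is a minimal sketch for the interval domain only. The names `Interval`, `IntervalMatrix`, and `contains` are hypothetical and do not come from the paper; the sketch simply checks whether a concrete weight matrix is one of the matrices that an interval-valued abstract weight stands for.

```python
from typing import List, Tuple

# Hypothetical types for illustration: one closed interval per weight entry.
Interval = Tuple[float, float]          # (lower, upper), with lower <= upper
IntervalMatrix = List[List[Interval]]


def contains(abstract_w: IntervalMatrix, concrete_w: List[List[float]]) -> bool:
    """Return True if every concrete weight lies inside the matching interval."""
    if len(abstract_w) != len(concrete_w):
        return False
    for a_row, c_row in zip(abstract_w, concrete_w):
        if len(a_row) != len(c_row):
            return False
        for (lo, hi), w in zip(a_row, c_row):
            if not lo <= w <= hi:
                return False
    return True


# One abstract layer with a single output node and two inputs.
abstract_layer = [[(0.0, 1.0), (-2.0, -1.0)]]
print(contains(abstract_layer, [[0.5, -1.5]]))  # True: both weights are in range
print(contains(abstract_layer, [[0.5, 0.0]]))   # False: 0.0 is outside [-2, -1]
```

Other domains mentioned in the paper, such as octagons and polyhedra, would need a richer representation than a per-entry interval; this sketch does not attempt them.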

Theoretical Foundations and Soundness

The paper establishes the soundness of the construction under two conditions on the activation functions:

Activation functions output non-negative values.
Activation functions satisfy the Weakened Intermediate Value Property (WIVP).

The soundness theorem guarantees that the constructed ANN over-approximates the original DNN when these conditions are met. Because the ANN's behaviors include every behavior of the DNN, a safety property verified on the ANN also holds for the DNN, which makes the ANN a usable proxy for verification.
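The two conditions can be probed numerically. The sketch below assumes that WIVP is read as follows: for any points a_1 <= ... <= a_n there is some b in [a_1, a_n] whose activation equals the average of the activations at the a_i. That reading, the helper name `wivp_witness`, and the restriction to continuous, non-decreasing activations (needed for the bisection) are assumptions made for this illustration, not statements taken from the summary above.

```python
import math
from typing import Callable, Sequence, Tuple


def relu(z: float) -> float:
    return max(0.0, z)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def wivp_witness(f: Callable[[float], float],
                 points: Sequence[float],
                 steps: int = 200) -> Tuple[float, float]:
    """Search for b in [min(points), max(points)] with f(b) == mean of f(points).

    Assumes f is continuous and non-decreasing, so f(lo) <= target <= f(hi)
    holds at the start and is preserved by every bisection step.
    """
    lo, hi = min(points), max(points)
    target = sum(f(a) for a in points) / len(points)
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return hi, target


samples = [-2.0, 1.0, 3.0]
for name, f in [("relu", relu), ("sigmoid", sigmoid)]:
    b, target = wivp_witness(f, samples)
    non_negative = all(f(a) >= 0.0 for a in samples)
    print(name, "non-negative on samples:", non_negative,
          "| witness error:", abs(f(b) - target))
```

A check on sample points is only a sanity test; it does not prove that an activation function satisfies either condition everywhere.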

Algorithm for Constructing ANNs

The paper introduces a layer-wise abstraction algorithm that merges groups of nodes in a DNN into corresponding abstract nodes in an ANN. For convex abstract domains, the construction stays computationally feasible because only binary combinations need to be considered when computing the abstract weights.
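As a rough illustration of node merging, the sketch below collapses the whole hidden layer of a small ReLU network into one abstract node using the interval domain. It is not the paper's general algorithm. It assumes the usual interval-domain recipe: incoming weights become the per-input interval hull of the merged nodes' weights, and outgoing weights become the interval hull of the merged nodes' outgoing weights scaled by the group size. All function names are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (lower, upper)


def relu(z: float) -> float:
    return max(0.0, z)


def hull(values: List[float]) -> Interval:
    """Smallest interval containing all the given values."""
    return (min(values), max(values))


def scale(iv: Interval, c: float) -> Interval:
    """Multiply an interval by a scalar of either sign."""
    a, b = iv[0] * c, iv[1] * c
    return (min(a, b), max(a, b))


def mul(p: Interval, q: Interval) -> Interval:
    """Interval product: hull of the four endpoint products."""
    prods = [p[0] * q[0], p[0] * q[1], p[1] * q[0], p[1] * q[1]]
    return (min(prods), max(prods))


def concrete_forward(W1, W2, x):
    """Two-layer ReLU network: y = W2 @ relu(W1 @ x)."""
    hidden = [relu(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    return [sum(u * h for u, h in zip(row, hidden)) for row in W2]


def merge_hidden_layer(W1, W2):
    """Merge all k hidden nodes into one abstract node (assumed recipe)."""
    k = len(W1)
    incoming = [hull([row[j] for row in W1]) for j in range(len(W1[0]))]
    outgoing = [scale(hull(row), float(k)) for row in W2]
    return incoming, outgoing


def abstract_forward(incoming, outgoing, x):
    """Interval bounds on the outputs of the merged network at input x."""
    lo = hi = 0.0
    for iv, xi in zip(incoming, x):
        s = scale(iv, xi)
        lo, hi = lo + s[0], hi + s[1]
    h = (relu(lo), relu(hi))  # ReLU is monotone, so endpoints map to endpoints
    return [mul(u, h) for u in outgoing]


W1 = [[1.0, -1.0], [0.5, 2.0], [-1.0, 0.5]]  # 3 hidden nodes, 2 inputs
W2 = [[1.0, -2.0, 0.5]]                      # 1 output
x = [1.0, 2.0]

incoming, outgoing = merge_hidden_layer(W1, W2)
y = concrete_forward(W1, W2, x)
bounds = abstract_forward(incoming, outgoing, x)
print("concrete output:", y)
print("abstract bounds:", bounds)
```

The scaling by the group size is where non-negativity of the activation matters: the sum of k non-negative hidden values equals k times their average, and that average lies between the smallest and largest hidden value, which an intermediate choice of incoming weights can reproduce.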

Handling Activation Functions and Negative Values

While most activation functions used in practice conform to the defined conditions, the paper also handles edge cases:

DNNs with activation functions yielding negative values are adjusted by shifting lower-bound values.
Non-continuous activation functions are handled by replacing scalar-valued activation functions with set-valued ones, which preserves sound over-approximation.

Practical and Theoretical Implications

This research has implications for both practice and theory:

Practical: The framework can improve the scalability of DNN verification. Because the abstractions are sound and computationally feasible, it could be integrated into existing DNN verification tools to handle larger networks with diverse activation functions and abstract domains.
Theoretical: Future research can explore completeness of abstract domains for ANNs and extend the approach to accommodate non-convex abstract domains or symbolic abstractions. Additionally, the methodology could be adapted for more complex neural network architectures, including convolutional and recursive neural networks.

Conclusion

This paper's Abstract Neural Networks generalize DNN abstraction to a range of abstract domains and activation functions, offering a framework for the computational challenges of DNN verification. The soundness conditions guarantee that the abstraction over-approximates the original network, so properties established on the abstraction carry over to it, which supports use in safety-critical applications.
