- The paper introduces a generic MIP framework that constructs sharp and ideal formulations for high-dimensional piecewise linear functions in neural networks.
- The paper demonstrates improved computational performance by tailoring formulations for nonlinearities such as ReLU and max pooling, enhancing robustness verification.
- The paper offers practical value through MIP models with provable guarantees, supporting AI reliability in critical tasks like image classification and decision optimization.
The paper presents a detailed investigation into constructing strong mixed-integer programming (MIP) formulations for the high-dimensional piecewise linear functions that arise in neural networks. The significance of these formulations is underscored by their utility in crucial applications such as verifying the robustness of image classification networks against adversarial attacks, or optimizing decision models whose objectives are determined by machine learning outputs.
Key Contributions
The authors develop a generic framework for constructing both sharp and ideal MIP formulations for the maximum of d affine functions over arbitrary polyhedral domains. These formulations are stronger than traditional approaches such as big-M formulations for nonlinear operations like ReLU and max pooling, as evidenced by improved computational performance on verification tasks involving image classification.
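The traditional baseline that these stronger formulations are measured against is the standard big-M model of y = max_k(a_k·x + b_k). As a minimal sketch of that baseline (function and variable names are ours, not the paper's), valid big-M constants over a box input domain can be derived by simple interval arithmetic:

```python
def bigM_coeffs(A, b, L, U):
    """For y = max_k (A[k].x + b[k]) over the box [L, U], compute valid
    big-M constants for the standard disjunctive formulation
        y >= A[k].x + b[k]                     for all k
        y <= A[k].x + b[k] + M[k] * (1 - z[k]) for all k
        sum_k z[k] = 1,  z binary.
    Each M[k] upper-bounds max_j (A[j] - A[k]).x + (b[j] - b[k]) on the box."""
    def interval_max(c, c0):
        # Upper bound of c.x + c0 over the box [L, U].
        return c0 + sum(ci * (U[i] if ci >= 0 else L[i])
                        for i, ci in enumerate(c))

    M = []
    for k in range(len(A)):
        M.append(max(interval_max([aj - ak for aj, ak in zip(A[j], A[k])],
                                  b[j] - b[k])
                     for j in range(len(A))))
    return M
```

For instance, ReLU(x1 + x2) over [0, 1]^2 is the maximum of A = [[1, 1], [0, 0]] with b = [0, 0], and the sketch returns M = [0, 2]: the affine piece x1 + x2 dominates the zero piece everywhere on this box, while the zero piece can sit up to 2 below the maximum. The paper's formulations are designed to be tighter than the LP relaxation such big-M constants induce.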
Significant contributions are as follows:
- Generic Framework Development: The paper proposes multiple recipes for constructing stronger MIP formulations. The authors derive both primal and dual characterizations for:
- Ideal formulations using the Cayley embedding.
- Hereditarily sharp formulations, obtained by relaxing certain constraints of the ideal formulation and thereby yielding simpler expressions.
- Special Case Simplifications:
- When the number of affine functions is two (as in ReLU), the formulations simplify considerably and, under certain conditions, coincide, being simultaneously sharp and ideal.
- When the input domain is a product of simplices, the formulation can be expressed with explicit inequalities, making computation more straightforward.
- Application to Neural Network Operations:
- Theoretical advances are applied to derive stronger MIP formulations for prevalent nonlinearities such as ReLU and max pooling. In particular, the paper gives an explicit, ideal, non-extended formulation for ReLU neurons over a box input domain; although it involves a large family of inequalities, they can be separated efficiently in linear time.
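A sketch of how such a linear-time separation can work, assuming a family of cuts of the form described in the paper, y <= sum_{i in I} w_i (x_i - Lb_i (1 - z)) + z (b + sum_{i not in I} w_i Ub_i), indexed by subsets I of the input coordinates (the function name and interface below are ours): each coordinate independently joins I exactly when that choice lowers the right-hand side, so the most violated cut is found in one pass.

```python
def separate_relu_cut(w, b, L, U, x, y, z, tol=1e-9):
    """Given a candidate LP-relaxation point (x, y, z) for a ReLU neuron
    y = max(0, w.x + b) over the box [L, U], find the most violated member
    of the subset-indexed cut family in a single linear-time pass."""
    I, rhs = [], z * b
    for i, wi in enumerate(w):
        Lb = L[i] if wi >= 0 else U[i]   # bound minimizing wi * x_i
        Ub = U[i] if wi >= 0 else L[i]   # bound maximizing wi * x_i
        in_term = wi * (x[i] - Lb * (1 - z))   # contribution if i in I
        out_term = wi * Ub * z                 # contribution if i not in I
        if in_term <= out_term:
            I.append(i)
            rhs += in_term
        else:
            rhs += out_term
    return (y > rhs + tol), I, rhs
```

As a usage example, for ReLU(x1 + x2) over [0, 1]^2, the fractional point x = (1, 0), y = 1, z = 0.5 satisfies the big-M relaxation (y <= x1 + x2 and y <= 2z) yet is cut off: the routine returns I = {1} with right-hand side 0.5 < y.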
- Computational Experiments:
- Verification tasks on image classification networks trained on the MNIST dataset demonstrate the efficacy of the proposed formulations. The paper reports substantial improvements in solve time, underscoring their practical utility.
Theoretical and Practical Implications
The practical implications of this work are extensive, providing methodologies that potentially elevate efficiency and reliability in optimization tasks involving neural networks. On a theoretical level, the dual characterizations of the formulations and their separation routines indicate broader applicability beyond neural network robustness.
Furthermore, the authors address limitations of heuristic methods through a MIP framework that offers provable guarantees, ensuring rigor in robustness certification. The convergence of MIP techniques with neural network modeling marks a step toward more transparent and dependable AI systems, particularly relevant as machine learning models are increasingly deployed in critical domains like autonomous driving and medical diagnostics.
Prospects for Future Developments
Future work might extend these frameworks to include other non-linear transformations within neural networks. Investigations into scalability for even larger networks or different architectures could further refine these methodologies. Additionally, the development of software packages that integrate these formulations within existing deep learning libraries would bridge the gap between research and application.
Overall, the paper offers substantive advancements in leveraging MIP to augment the potential of neural networks, delivering not just improvements in specific applications but also contributing broadly to the field of AI safety and reliability.