An Ambiguity Measure for Recognizing the Unknowns in Deep Learning (2312.06077v1)
Abstract: We study the understanding of deep neural networks with respect to the scope of the data they are trained on. While the aggregate accuracy of these models is usually impressive, they still make mistakes, sometimes on cases that appear trivial. Moreover, these models are not reliable at recognizing what they do not know, which leads to failures such as adversarial vulnerability and out-of-distribution errors. Here, we propose a measure that quantifies the ambiguity of an input for any given model with respect to the scope of its training. We define ambiguity based on the geometric arrangement of the decision boundaries and the convex hull of the training set in the feature space learned by the trained model, and demonstrate that a single ambiguity measure can detect a considerable portion of a model's mistakes on in-distribution samples, adversarial inputs, and out-of-distribution inputs. Using our ambiguity measure, a model may abstain from classification when it encounters ambiguous inputs, leading to better accuracy not just on a given test set but on the inputs it may encounter in the world at large. In pursuit of this measure, we develop a theoretical framework that can identify the unknowns of a model in relation to its scope. We put this in perspective with the confidence of the model and develop formulations to identify regions of the domain that are unknown to the model, yet on which the model is guaranteed to have high confidence.
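The abstract describes ambiguity in terms of where an input's feature vector falls relative to the convex hull of the training set in the learned feature space. The sketch below is a minimal illustration of that idea, not the paper's implementation: it tests convex-hull membership with a linear-programming feasibility check and abstains on inputs that fall outside the hull. The feature matrices, the abstention rule, and all names here are illustrative assumptions.

```python
# Minimal sketch: hull-membership check in a learned feature space.
# Assumes features have already been extracted by a trained model;
# the decision-boundary side of the paper's measure is not shown.
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(point, hull_points):
    """Return True if `point` is a convex combination of the rows of `hull_points`.

    Solves the LP feasibility problem:
        find lambda >= 0 with sum_i lambda_i = 1
        and hull_points.T @ lambda = point.
    """
    n, d = hull_points.shape
    # Equality constraints: hull_points.T @ lambda = point and 1^T lambda = 1.
    A_eq = np.vstack([hull_points.T, np.ones((1, n))])
    b_eq = np.concatenate([point, [1.0]])
    res = linprog(c=np.zeros(n), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * n, method="highs")
    return res.success

# Toy usage: abstain when the input's feature vector lies outside the hull.
rng = np.random.default_rng(0)
train_features = rng.normal(size=(200, 8))   # stand-in for training-set features
test_feature = rng.normal(size=8)            # stand-in for a new input's features

if in_convex_hull(test_feature, train_features):
    print("Inside the training hull: classify as usual.")
else:
    print("Outside the training hull: flag as ambiguous and abstain.")
```

In practice one would combine such a hull test with distances to the model's decision boundaries, as the abstract indicates, and tune an abstention threshold on held-out data.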