DOS: Diverse Outlier Sampling for Out-of-Distribution Detection (2306.02031v2)

Published 3 Jun 2023 in cs.LG

Abstract: Modern neural networks are known to give overconfident predictions for out-of-distribution inputs when deployed in the open world. It is common practice to leverage a surrogate outlier dataset to regularize the model during training, and recent studies emphasize the role of uncertainty in designing the sampling strategy for the outlier dataset. However, OOD samples selected solely on the basis of predictive uncertainty can be biased towards certain types, which may fail to capture the full outlier distribution. In this work, we empirically show that diversity is critical to sampling outliers for OOD detection performance. Motivated by this observation, we propose a straightforward and novel sampling strategy named DOS (Diverse Outlier Sampling) to select diverse and informative outliers. Specifically, we cluster the normalized features at each iteration, and the most informative outlier from each cluster is selected for model training with the absent category loss. With DOS, the sampled outliers efficiently shape a globally compact decision boundary between ID and OOD data. Extensive experiments demonstrate the superiority of DOS, reducing the average FPR95 by up to 25.79% on CIFAR-100 with TI-300K.
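To make the sampling step concrete, below is a minimal sketch of one DOS iteration as the abstract describes it: cluster the L2-normalized features of candidate outliers, then keep the most informative sample from each cluster. The inputs `outlier_feats` and `informativeness` (a per-sample uncertainty score) and the use of scikit-learn's k-means are assumptions for illustration, not the authors' reference implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def dos_sample(outlier_feats: np.ndarray,
               informativeness: np.ndarray,
               k: int) -> np.ndarray:
    """Return indices of k diverse, informative outliers (one per cluster).

    outlier_feats: (N, D) features of candidate outliers (hypothetical input).
    informativeness: (N,) per-sample uncertainty score (hypothetical input).
    """
    # Normalize features so clustering reflects angular similarity,
    # matching the abstract's "cluster the normalized features".
    normed = outlier_feats / np.linalg.norm(outlier_feats, axis=1, keepdims=True)
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(normed)
    picked = []
    for c in range(k):
        members = np.where(labels == c)[0]
        # Keep the most informative member of each cluster; the exact
        # informativeness criterion here is an assumption.
        picked.append(members[int(np.argmax(informativeness[members]))])
    return np.asarray(picked)
```

Per the abstract, the selected outliers are then used for training with the absent category loss, i.e., the model is regularized to push them away from the in-distribution classes (commonly realized as an extra K+1-th "absent" class in the classifier head).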

Authors (5)
  1. Wenyu Jiang
  2. Hao Cheng
  3. Mingcai Chen
  4. Chongjun Wang
  5. Hongxin Wei
